1.0.27 • Published 2 years ago

wikiparse v1.0.27

License: MIT
Repository: GitHub

wikiparse — Wikipedia/MediaWiki syntax parser

Converts wiki markup into a JSON abstract syntax tree or plain text.

Installation

npm i wikiparse

Basic usage

import {parse} from 'wikiparse';

const ast = parse(`'''Cats''', also called '''domestic cats''' (''Felis catus''), are small, [[carnivore|carnivorous]] [[mammal]]s`);
console.log(ast);

Output:

[
  {"type": "bold", "content": ["Cats"]},
  ", also called ",
  {"type": "bold", "content": ["domestic cats"]},
  " (",
  {"type": "italics", "content": ["Felis catus"]},
  "), are small, ",
  {"type": "link", "to": "carnivore", "content": ["carnivorous"]},
  " ",
  {"type": "link", "to": "mammal", "content": ["mammals"]}
]
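
Because the AST is a plain array of strings and objects, it can be walked with ordinary recursion. The sketch below collects link targets from the AST shown above. It assumes only the node shape visible in that output (a `type`, a `content` array, and `to` on links); `collectLinks` is a hypothetical helper, not part of wikiparse, and other node types may carry fields not handled here.

```javascript
// AST literal copied from the output above; in real use it would come from parse().
const ast = [
  {type: 'bold', content: ['Cats']},
  ', also called ',
  {type: 'bold', content: ['domestic cats']},
  ' (',
  {type: 'italics', content: ['Felis catus']},
  '), are small, ',
  {type: 'link', to: 'carnivore', content: ['carnivorous']},
  ' ',
  {type: 'link', to: 'mammal', content: ['mammals']},
];

// Hypothetical helper (not part of wikiparse): collect every link target,
// descending into nested content arrays.
function collectLinks(nodes, found = []) {
  for (const node of nodes) {
    if (typeof node === 'string') continue; // plain text has no children
    if (node.type === 'link') found.push(node.to);
    if (Array.isArray(node.content)) collectLinks(node.content, found);
  }
  return found;
}

console.log(collectLinks(ast)); // [ 'carnivore', 'mammal' ]
```

The same traversal pattern works for any per-node processing, such as counting node types or rewriting link targets.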

Getting plain text content

import {astToText} from 'wikiparse';

console.log(astToText(ast));

Output:

Cats, also called domestic cats (Felis catus), are small, carnivorous mammals
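
To make the relationship between the AST and its text form concrete, here is a minimal sketch of a flattening function. It is an illustration only, not the library's `astToText` implementation, which may treat node types such as templates or tables differently. `flattenText` is a hypothetical name, and the sketch assumes every object node keeps its children in `content`.

```javascript
// AST literal copied from the basic usage output; normally produced by parse().
const ast = [
  {type: 'bold', content: ['Cats']},
  ', also called ',
  {type: 'bold', content: ['domestic cats']},
  ' (',
  {type: 'italics', content: ['Felis catus']},
  '), are small, ',
  {type: 'link', to: 'carnivore', content: ['carnivorous']},
  ' ',
  {type: 'link', to: 'mammal', content: ['mammals']},
];

// Hypothetical helper: keep string nodes as-is and replace each object node
// with the flattened text of its children, dropping the markup type.
function flattenText(nodes) {
  return nodes
    .map((node) => (typeof node === 'string' ? node : flattenText(node.content ?? [])))
    .join('');
}

console.log(flattenText(ast));
// Cats, also called domestic cats (Felis catus), are small, carnivorous mammals
```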

Parsing a Wikipedia article

import fetch from 'node-fetch';
import WikiParser, {astToText} from 'wikiparse';

const wiki = 'simple';
const page = 'Cat';
const url = `https://${wiki}.wikipedia.org/w/api.php?action=query&titles=${encodeURIComponent(page)}&prop=revisions&rvprop=content&format=json`;
// If you need lots of pages, use XML dumps from https://dumps.wikimedia.org/
const json = await (await fetch(url)).json();
// The response keys pages by page ID; take the first (only) page and its latest revision text.
const source = Object.values(json.query.pages)[0].revisions[0]['*'];

const parser = new WikiParser();
const ast = parser.parse(source);
console.log(JSON.stringify(ast, null, 2));
console.log(astToText(ast));

Output: the AST as JSON, followed by the plain text.
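
The request above indexes straight into the response, which throws an unhelpful TypeError when the page does not exist. The sketch below builds the same query with `URLSearchParams` and extracts the wikitext defensively. `buildQueryUrl` and `extractSource` are hypothetical helpers, not part of wikiparse, and the sample response is a shortened stand-in with a placeholder page ID.

```javascript
// Hypothetical helper: build the same API query as above, letting
// URLSearchParams handle the encoding of the page title.
function buildQueryUrl(wiki, page) {
  const params = new URLSearchParams({
    action: 'query',
    titles: page,
    prop: 'revisions',
    rvprop: 'content',
    format: 'json',
  });
  return `https://${wiki}.wikipedia.org/w/api.php?${params}`;
}

// Hypothetical helper: pull the wikitext of the first page out of the
// response, failing with a clear message if there is no revision content.
function extractSource(json) {
  const [first] = Object.values(json?.query?.pages ?? {});
  const source = first?.revisions?.[0]?.['*'];
  if (typeof source !== 'string') {
    throw new Error('No revision content in response (missing page?)');
  }
  return source;
}

// Shortened sample response; the page ID '123' is a placeholder.
const sample = {
  query: {pages: {'123': {title: 'Cat', revisions: [{'*': "'''Cats''' are [[mammal]]s"}]}}},
};

console.log(buildQueryUrl('simple', 'Cat'));
console.log(extractSource(sample));
```

The string returned by `extractSource` can then be passed to the parser as in the example above.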

Importing Wikipedia dumps

You can use wiki-import to parse and import a whole Wikipedia dump into LevelDB (or into another store with minor code modifications).
