1.0.27 • Published 3 years ago
wikiparse — Wikipedia/MediaWiki syntax parser
Converts wiki markup into a JSON abstract syntax tree or plain text.
Installation
npm i wikiparse
Basic usage
import {parse} from 'wikiparse';
const ast = parse(`'''Cats''', also called '''domestic cats''' (''Felis catus''), are small, [[carnivore|carnivorous]] [[mammal]]s`);
console.log(ast);
[
{"type": "bold", "content": ["Cats"]},
", also called ",
{"type": "bold", "content": ["domestic cats"]},
" (",
{"type": "italics", "content": ["Felis catus"]},
"), are small, ",
{"type": "link", "to": "carnivore", "content": ["carnivorous"]},
" ",
{"type": "link", "to": "mammal", "content": ["mammals"]}
]
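Because the AST is plain JSON (strings mixed with objects that carry a `type` and a `content` array), it can be walked with an ordinary recursive function. The sketch below collects link targets from the output shown above. `collectLinks` is a hypothetical helper written for illustration, not part of wikiparse, and it assumes every non-string node keeps its children in `content`, as the nodes in this example do.

```javascript
// AST copied from the example output above.
const sampleAst = [
  {type: 'bold', content: ['Cats']},
  ', also called ',
  {type: 'bold', content: ['domestic cats']},
  ' (',
  {type: 'italics', content: ['Felis catus']},
  '), are small, ',
  {type: 'link', to: 'carnivore', content: ['carnivorous']},
  ' ',
  {type: 'link', to: 'mammal', content: ['mammals']},
];

// Hypothetical helper (not a wikiparse export): gather the `to` field
// of every link node, descending into nested `content` arrays.
function collectLinks(nodes, out = []) {
  for (const node of nodes) {
    if (typeof node === 'string') continue; // plain text has no children
    if (node.type === 'link') out.push(node.to);
    if (Array.isArray(node.content)) collectLinks(node.content, out);
  }
  return out;
}

console.log(collectLinks(sampleAst)); // [ 'carnivore', 'mammal' ]
```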
Getting plain text content
import {astToText} from 'wikiparse';
console.log(astToText(ast));
Cats, also called domestic cats (Felis catus), are small, carnivorous mammals
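To make the relationship between the AST and the plain text concrete, here is a minimal sketch of a flattening function. It is not the implementation of `astToText`, which may handle node types and edge cases that this ignores. `flattenText` is a hypothetical name, and the sketch assumes that strings are emitted as-is and that every other node contributes only the text of its `content`.

```javascript
// Hypothetical re-implementation for illustration only; use the
// library's own astToText in real code.
function flattenText(nodes) {
  return nodes
    .map((node) => {
      if (typeof node === 'string') return node; // text leaf
      if (Array.isArray(node.content)) return flattenText(node.content);
      return ''; // node without content contributes nothing
    })
    .join('');
}

// A shortened version of the AST from the basic usage example.
const shortAst = [
  {type: 'bold', content: ['Cats']},
  ' are small, ',
  {type: 'link', to: 'carnivore', content: ['carnivorous']},
  ' ',
  {type: 'link', to: 'mammal', content: ['mammals']},
];

console.log(flattenText(shortAst)); // Cats are small, carnivorous mammals
```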
Parsing a Wikipedia article
import fetch from 'node-fetch';
import WikiParser, {astToText} from 'wikiparse';
const wiki = 'simple';
const page = 'Cat';
const url = `https://${wiki}.wikipedia.org/w/api.php?action=query&titles=${encodeURIComponent(page)}&prop=revisions&rvprop=content&format=json`;
// If you need lots of pages, use XML dumps from https://dumps.wikimedia.org/
const json = await (await fetch(url)).json();
// The response keys pages by page ID, so take the first (and only) entry.
const source = Object.values(json.query.pages)[0].revisions[0]['*'];
const parser = new WikiParser();
const ast = parser.parse(source);
console.log(JSON.stringify(ast, null, 2));
console.log(astToText(ast));
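The example above assumes the page exists. With `format=json`, the MediaWiki API reports a missing title as a page entry that carries a `missing` key and no `revisions`, so indexing into `revisions[0]` would throw a less helpful error. The sketch below wraps the extraction step with explicit checks. `extractWikitext` is a hypothetical helper, not part of wikiparse, and it assumes the legacy JSON response shape used in the example (pages keyed by ID, revision text under `'*'`).

```javascript
// Hypothetical helper: pull the wikitext out of an action=query
// response, failing with a clear message when the page is absent.
function extractWikitext(json) {
  const pages = json?.query?.pages;
  if (!pages) {
    throw new Error('Unexpected API response: no query.pages');
  }
  const page = Object.values(pages)[0];
  if (!page || 'missing' in page) {
    throw new Error(`Page not found: ${page?.title ?? '(unknown)'}`);
  }
  const source = page.revisions?.[0]?.['*'];
  if (typeof source !== 'string') {
    throw new Error('No revision content in response');
  }
  return source;
}

// Usage with a response shaped like the one the API returns.
const exampleResponse = {
  query: {
    pages: {
      '123': {pageid: 123, title: 'Cat', revisions: [{'*': "'''Cats''' are small"}]},
    },
  },
};
console.log(extractWikitext(exampleResponse));
```

The returned string can then be passed to the parser exactly as `source` is in the example above.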
Importing Wikipedia dumps
You can use wiki-import to parse and import an entire Wikipedia dump into LevelDB, or into another store with minor code modifications.