1.9.4-1563180279660 • Published 6 years ago

mwoffliner v1.9.4-1563180279660

Weekly downloads: 406
License: GPL-3.0
Repository: GitHub
Last release: 6 years ago

mwoffliner

mwoffliner is a tool for making a local HTML snapshot of any recent online MediaWiki instance. It goes through all articles (or a selection, if specified) and writes the HTML and images to a local directory. It has mainly been tested against Wikimedia projects such as Wikipedia and Wiktionary, but it should also work with any recent MediaWiki.

Prerequisites

  • *NIX Operating System (Linux/macOS)
  • NodeJS
  • Redis
  • Libzim (on Linux, binaries are downloaded automatically)
  • Various build tools and libraries that are probably already installed on your machine (gcc, libjpeg)

Setup

macOS

NodeJS

curl -o- https://raw.githubusercontent.com/creationix/nvm/v0.33.11/install.sh | bash && \
source ~/.bashrc && \
nvm install stable && \
node --version

Redis

> brew install redis

LibZim

See instructions here: https://github.com/openzim/libzim

Linux (Debian)

NodeJS

curl -o- https://raw.githubusercontent.com/creationix/nvm/v0.33.11/install.sh | bash && \
source ~/.bashrc && \
nvm install stable && \
node --version

Redis

> sudo apt-get install redis-server

Usage

Command Line

> npm i -g mwoffliner
> mwoffliner --help

# --format=nozim: won't make a final ZIM file
# --articleList: will download one article
> mwoffliner \
    --mwUrl=https://es.wikipedia.org \
    --adminEmail=foo@bar.net \
    --verbose \
    --format=nozim \
    --articleList=./dev/articleList

Programmatic API

const mwoffliner = require('mwoffliner');
const parameters = {
    mwUrl: "https://es.wikipedia.org",
    adminEmail: "foo@bar.net",
    verbose: true,
    format: "nozim",
    articleList: "./articleList"
};
mwoffliner.execute(parameters); // returns a Promise
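
The command-line flags and the programmatic parameters shown above share the same names. As a minimal sketch of that correspondence, the helper below turns a parameters object into the equivalent list of flags. `toCliArgs` is a hypothetical name and is not part of mwoffliner; the sketch assumes each key maps to a flag of the same name, which matches the five parameters shown here but has not been checked against every option.

```javascript
// Hypothetical helper (not part of mwoffliner): convert a parameters object
// into the equivalent command-line flags, assuming key names match flag names.
function toCliArgs(parameters) {
  const args = [];
  for (const [name, value] of Object.entries(parameters)) {
    if (value === true) {
      // Boolean switches are passed without a value, e.g. --verbose
      args.push(`--${name}`);
    } else if (value !== false && value !== undefined && value !== null) {
      // Everything else is passed as --name=value
      args.push(`--${name}=${value}`);
    }
    // false, undefined and null are skipped: the flag is simply omitted
  }
  return args;
}

const exampleParameters = {
  mwUrl: "https://es.wikipedia.org",
  adminEmail: "foo@bar.net",
  verbose: true,
  format: "nozim",
  articleList: "./articleList"
};

// Print the command line that corresponds to exampleParameters
console.log(["mwoffliner", ...toCliArgs(exampleParameters)].join(" "));
```

Keeping one parameters object and deriving the flags from it can help keep scripted and interactive runs in sync.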

Development

Please see CONTRIBUTING.md

To set up mwoffliner locally:

git clone https://github.com/openzim/mwoffliner.git
cd mwoffliner
npm i

To run it (this is only an example):
```bash
./node_modules/.bin/ts-node ./src/cli.ts --mwUrl=https://bm.wikipedia.org --adminEmail=XXX
```

or

npm start -- --mwUrl=https://bm.wikipedia.org --adminEmail=XXX

Tests

There are two test suites:

  • Unit
    npm run test:unit # (or just npm test)
  • End to end (e2e)
    npm run test:e2e

Code Style

We follow the tslint:recommended rules almost exactly; see ./tslint.json for details.

It's best to use TSLint to check your code as you develop. This project is pre-configured for development with VSCode and the TSLint plugin.

Debugging

There is a pre-configured debug configuration for VSCode; open the debugging tab to use it.

Make sure you read CONTRIBUTING.md for tips on how to best debug and submit issues.

Publishing

To publish, it's best to use a clean clone of the project:

git clone https://github.com/openzim/mwoffliner.git
cd mwoffliner
npm i
./dev/build.sh
npm publish  # you must be logged in already (npm login)

Background

There are two Wikitext parsers. mwoffliner uses Parsoid.

  • Wikitext is the name of the markup language that Wikipedia uses.
  • MediaWiki is a PHP package that runs a wiki, including Wikipedia.
  • MediaWiki includes a parser that converts Wikitext into HTML; this parser currently renders Wikipedia.
  • There is another Wikitext parser, called Parsoid, implemented in JavaScript (Node.js).
  • Parsoid is planned to eventually become the main parser for Wikipedia.
  • mwoffliner calls Parsoid and then post-processes the results into an offline format.
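
To give a feel for what "post-processing for offline format" can mean, here is a purely illustrative sketch that rewrites relative wiki links so they point at local HTML files. It is not mwoffliner's actual implementation: the function name `rewriteLinksForOffline`, the assumption that internal links look like `href="./Title"`, and the `Title.html` naming scheme are all hypothetical.

```javascript
// Illustrative only, not mwoffliner's real code.
// Assumption: internal links in the parser output look like href="./Title".
// Assumption: each article is saved locally as "<Title>.html".
function rewriteLinksForOffline(html) {
  return html.replace(
    /href="\.\/([^"#?]+)"/g,
    (match, title) => `href="${title}.html"`
  );
}

const online = '<p>See <a href="./Bamako">Bamako</a> or <a href="https://example.org">an external site</a>.</p>';

// Internal links are rewritten; absolute URLs are left untouched
console.log(rewriteLinksForOffline(online));
```

A real pipeline would more likely work on a parsed DOM than on regular expressions, and would also need to handle images, stylesheets and redirects.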