1.13.0 • Published 2 years ago

mwoffliner v1.13.0

MWoffliner

MWoffliner is a tool for making a local offline HTML snapshot of any online MediaWiki instance. It goes through all online articles (or a selection, if specified) and creates the corresponding ZIM file. It has mainly been tested against Wikimedia projects like Wikipedia and Wiktionary, but it should also work with any recent MediaWiki.

Read CONTRIBUTING.md to learn more about MWoffliner development.

Features

  • Scrape with or without image thumbnails
  • Scrape with or without audio/video multimedia content
  • S3 cache (optional)
  • Image size optimiser / WebP converter
  • Scrape all articles in the selected namespaces, or only the articles in a title list
  • Specify additional/non-main namespaces to scrape

Run mwoffliner --help to get all the possible options.

Prerequisites

  • *NIX Operating System (GNU/Linux, macOS, ...)
  • Redis
  • NodeJS version 16 or greater
  • Libzim (downloaded automatically on GNU/Linux and macOS)
  • Various build tools which are probably already installed on your machine (packages libjpeg-dev, libglu1, autoconf, automake, gcc on Debian/Ubuntu)

... and an online MediaWiki with its API available.
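
The NodeJS requirement above can be checked before installing. The following is a minimal sketch, not part of MWoffliner: the helper name `meetsMinimumNode` and the constant `MIN_NODE_MAJOR` are hypothetical, and only the threshold of 16 comes from the prerequisites list.

```javascript
// Hypothetical pre-flight check; only the minimum major version (16)
// is taken from the prerequisites above.
const MIN_NODE_MAJOR = 16;

// Returns true when a version string such as "18.17.0" or "v18.17.0"
// has a major version of at least minMajor.
function meetsMinimumNode(version, minMajor = MIN_NODE_MAJOR) {
  const major = Number.parseInt(version.replace(/^v/, '').split('.')[0], 10);
  return Number.isInteger(major) && major >= minMajor;
}

// process.versions.node holds the running Node.js version without a "v" prefix.
if (meetsMinimumNode(process.versions.node)) {
  console.log(`Node.js ${process.versions.node} satisfies the requirement.`);
} else {
  console.warn(`Node.js ${process.versions.node} is older than ${MIN_NODE_MAJOR}.`);
}
```

Only the major version is compared, which is enough for a "version 16 or greater" rule.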

Usage

To install MWoffliner globally:

npm i -g mwoffliner

You might need to run this command with sudo, depending on how your npm is configured.

npm permission checking can be a bit annoying for a newcomer. Please read the documentation carefully if you hit problems: https://docs.npmjs.com/cli/v7/using-npm/scripts#user

Then to run it:

mwoffliner --help

To install and run it locally:

npm i
npm run mwoffliner -- --help

To use MWoffliner with an S3 cache, provide an S3 URL like this:

--optimisationCacheUrl="https://wasabisys.com/?bucketName=my-bucket&keyId=my-key-id&secretAccessKey=my-sac"
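
Real access keys often contain characters such as `+`, `/` or `=`, which need percent-encoding inside a query string. As a minimal sketch, the URL above can be assembled with the standard `URL` class. The helper name `buildCacheUrl` is hypothetical; the three query parameter names are the ones shown in the example URL, and it is an assumption that MWoffliner decodes percent-encoded values in the usual way.

```javascript
// Hypothetical helper: assembles the value for --optimisationCacheUrl.
// The query parameter names come from the example URL above.
function buildCacheUrl({ endpoint, bucketName, keyId, secretAccessKey }) {
  const url = new URL(endpoint);
  // searchParams.set() percent-encodes reserved characters for us.
  url.searchParams.set('bucketName', bucketName);
  url.searchParams.set('keyId', keyId);
  url.searchParams.set('secretAccessKey', secretAccessKey);
  return url.toString();
}

const cacheUrl = buildCacheUrl({
  endpoint: 'https://wasabisys.com/',
  bucketName: 'my-bucket',
  keyId: 'my-key-id',
  secretAccessKey: 'my-sac',
});

console.log(`--optimisationCacheUrl="${cacheUrl}"`);
```

With the placeholder values used here, the result is the same URL as in the example above.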

API

MWoffliner also provides an API and can therefore be used as a NodeJS library. Here is a stub example:

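const mwoffliner = require('mwoffliner');

const parameters = {
    mwUrl: "https://es.wikipedia.org", // base URL of the MediaWiki to scrape
    adminEmail: "foo@bar.net",         // contact email of the person running the scrape
    verbose: true,                     // print progress information
    format: "nopic",                   // "nopic" flavour: leave out pictures
    articleList: "./articleList"       // file listing the article titles to scrape
};

mwoffliner.execute(parameters) // returns a Promise
    .then(() => console.log("Scrape finished"))
    .catch((err) => {
        console.error("Scrape failed:", err);
        process.exitCode = 1;
    });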
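
Because a scrape can run for a long time, it can help to reject obviously broken options before starting. The sketch below is not part of the MWoffliner API: `checkParameters` and `REQUIRED_OPTIONS` are hypothetical names, and treating `mwUrl` and `adminEmail` as the mandatory options is an assumption that should be checked against `mwoffliner --help`.

```javascript
// Hypothetical helper (not part of mwoffliner): fail fast on bad options.
// Assumption: mwUrl and adminEmail are the options that must always be set.
const REQUIRED_OPTIONS = ['mwUrl', 'adminEmail'];

function checkParameters(parameters) {
  const missing = REQUIRED_OPTIONS.filter((key) => !parameters[key]);
  if (missing.length > 0) {
    throw new Error(`Missing required option(s): ${missing.join(', ')}`);
  }
  // new URL() throws a TypeError when mwUrl is not an absolute URL.
  const { protocol } = new URL(parameters.mwUrl);
  if (protocol !== 'http:' && protocol !== 'https:') {
    throw new Error(`mwUrl must use http or https, got ${protocol}`);
  }
  return parameters;
}

const checked = checkParameters({
  mwUrl: 'https://es.wikipedia.org',
  adminEmail: 'foo@bar.net',
  format: 'nopic',
});
console.log(checked.mwUrl);
```

The checked object can then be passed on unchanged, for example `mwoffliner.execute(checkParameters(parameters))`.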

Background

Complementary information about MWoffliner:

  • MediaWiki software is used by thousands of wikis, the most famous ones being the Wikimedia ones, including Wikipedia.
  • MediaWiki is a PHP wiki runtime engine.
  • Wikitext is the name of the markup language that MediaWiki uses.
  • MediaWiki includes a parser that converts wikitext into HTML; this parser creates the HTML pages displayed in your browser.
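
To see the parser output for a single page, a MediaWiki's Action API can be asked for the rendered HTML with `action=parse`. The sketch below only builds the request URL. The helper name `buildParseUrl` is hypothetical, and `/w/api.php` is the path used by Wikimedia wikis, which other installations may not share. This illustrates the wikitext-to-HTML step only and is not necessarily how MWoffliner itself retrieves article HTML.

```javascript
// Hypothetical helper: builds an Action API request that returns the
// parser's HTML for one page. The default apiPath is an assumption that
// holds for Wikimedia wikis; other MediaWiki sites may use another path.
function buildParseUrl(mwUrl, title, apiPath = '/w/api.php') {
  const url = new URL(apiPath, mwUrl);
  url.searchParams.set('action', 'parse'); // run the parser on a page
  url.searchParams.set('page', title);     // page title to parse
  url.searchParams.set('prop', 'text');    // only return the rendered HTML
  url.searchParams.set('format', 'json');
  return url.toString();
}

const parseUrl = buildParseUrl('https://es.wikipedia.org', 'Madrid');
console.log(parseUrl);
```

Opening the printed URL in a browser shows a JSON document that contains the rendered HTML of the page.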

GNU/Linux - Debian-based distributions

Install NodeJS: Read https://nodejs.org/en/download/current/

Install Redis:

sudo apt-get install redis-server

Troubleshooting

Older GNU/Linux distributions and/or versions of Node.js might ship with a deprecated version of npm. Older versions of npm have incompatibilities with certain versions of Node.js and might simply fail to install the mwoffliner package.

We recommend using a recent version of npm. Recent npm versions work fine even with an older Node.js such as version 10. Install the packaged version of npm first, then use it to install a newer version:

sudo npm install --unsafe-perm -g npm

Don't forget to remove the packaged version of npm afterward.

License

GPLv3 or later, see LICENSE for more details.
