0.2.2 • Published 3 years ago

vanilla-clipper v0.2.2

License: MIT
Repository: GitHub

📃 Vanilla Clipper

Japanese (Qiita)

Vanilla Clipper is a Node.js library that uses Puppeteer to save a complete webpage locally. A single command saves all of the page's content, including images, videos, CSS, web fonts, iframes, and Shadow DOM.

Dependencies

  • Node.js (>= 8.10)
  • Chrome or Chromium (Latest version)

Installation

yarn global add vanilla-clipper
# or
npm i -g vanilla-clipper

Usage

CLI

Note: If it fails to launch, try adding the --no-sandbox (-n) option.

  • Save https://example.com:

    vanilla-clipper https://example.com
  • Save the .timeline element of https://example.com to the tech directory, with the browser language set to Japanese:

    vanilla-clipper -d tech -s .timeline -l ja-JP https://example.com
  • Log in with the sub account defined in the config file:

    vanilla-clipper -a sub https://example.com

See here for details of the options.
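
To clip several pages from a script, the CLI can be driven from Node.js. The following is a rough sketch that assumes the `vanilla-clipper` command is on the PATH of a Unix-like system. `buildClipperArgs`, `clipAll`, and the option names (`directory`, `selector`, `language`, `account`, `noSandbox`) are hypothetical names chosen for illustration. Only the flags -d, -s, -l, -a, and -n come from the examples above.

```javascript
const { spawnSync } = require('child_process')

// Hypothetical helper: maps an options object onto the CLI flags shown in
// the examples above. The option names are illustrative, not a library API.
function buildClipperArgs(url, options = {}) {
    const args = []
    if (options.directory) args.push('-d', options.directory)
    if (options.selector) args.push('-s', options.selector)
    if (options.language) args.push('-l', options.language)
    if (options.account) args.push('-a', options.account)
    if (options.noSandbox) args.push('-n')
    args.push(url)
    return args
}

// Hypothetical wrapper: runs the globally installed CLI once per URL and
// throws if the process could not be started.
function clipAll(urls, options) {
    for (const url of urls) {
        const args = buildClipperArgs(url, options)
        const result = spawnSync('vanilla-clipper', args, { stdio: 'inherit' })
        if (result.error) throw result.error
    }
}
```

Calling `clipAll(['https://example.com'], { directory: 'tech', selector: '.timeline', language: 'ja-JP' })` would run the same command as the second example above.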

📂 Directory structure in ~/.vanilla-clipper

📂 .vanilla-clipper
   📂 pages
      📂 main
         📃 20190213-page1.html
         ︙
      📂 {SOME_FOLDER}
         📃 20190213-page2.html
         📃 20190214-page3.html
         ︙

   📂 resources
      📂 20190213
         📎 {ulid}.jpg
         📎 {ulid}.svg
         ︙
      📂 20190214
         📎 {ulid}.woff2
         ︙

   💎 resources.json
   💎 config.json
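
The layout above suggests that saved pages are named with a YYYYMMDD prefix and grouped by folder under pages. The following sketch builds such a path. `dateStamp` and `expectedPagePath` are hypothetical helpers. The file name pattern is inferred from the tree rather than confirmed by the library, and the use of local time for the date is an assumption.

```javascript
const os = require('os')
const path = require('path')

// Formats a Date as YYYYMMDD in local time (assumed; the tool may differ).
function dateStamp(date) {
    const y = date.getFullYear()
    const m = String(date.getMonth() + 1).padStart(2, '0')
    const d = String(date.getDate()).padStart(2, '0')
    return `${y}${m}${d}`
}

// Hypothetical helper: the path where a page saved on `date` into `folder`
// would be expected, following the pattern inferred from the tree above.
function expectedPagePath(name, date, folder = 'main', home = os.homedir()) {
    const fileName = `${dateStamp(date)}-${name}.html`
    return path.join(home, '.vanilla-clipper', 'pages', folder, fileName)
}

console.log(expectedPagePath('page1', new Date(2019, 1, 13)))
```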

⚙️ Config file example

{YOUR_HOME_DIRECTORY}/.vanilla-clipper/config.js

module.exports = {
    resource: { maxSize: 50 * 1024 * 1024 },
    sites: [
        {
            url: 'example.com', // site URL
            accounts: {
                default: {
                    // ↑ account label
                    username: 'main', // or () => 'main'
                    password: 'password1',
                },
                sub: {
                    // ↑ account label
                    username: 'sub_account',
                    password: 'password2',
                },
            },
            login: [
                // [action, arg1, arg2, ...]
                [
                    'goto',
                    'https://example.com/login', // URL
                ],
                [
                    'input',
                    'input[name="session[username_or_email]"]', // selector
                    '$username', // -> accounts.{ACCOUNT_LABEL}.username
                ],
                [
                    'input',
                    'input[name="session[password]"]', // selector
                    '$password', // -> accounts.{ACCOUNT_LABEL}.password
                ],
                [
                    'submit',
                    '[role=button]', // selector
                ],
            ],
        },
    ],
}
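
The comments in the config indicate that `$username` and `$password` stand for the credentials of the account selected with -a. The following sketch shows how such placeholders could be resolved. `resolveLoginSteps` is a hypothetical function and is not vanilla-clipper's actual implementation. Accepting a function for the password as well as the username is an assumption.

```javascript
// Hypothetical resolver, not the library's own code: replaces '$username'
// and '$password' in each login step with the chosen account's values.
function resolveLoginSteps(site, accountLabel = 'default') {
    const account = site.accounts[accountLabel]
    if (!account) throw new Error(`Unknown account label: ${accountLabel}`)
    // A credential may be a plain string or a function returning one.
    const read = (value) => (typeof value === 'function' ? value() : value)
    const placeholders = {
        $username: () => read(account.username),
        $password: () => read(account.password),
    }
    const isPlaceholder = (arg) =>
        Object.prototype.hasOwnProperty.call(placeholders, arg)
    return site.login.map((step) =>
        step.map((arg) => (isPlaceholder(arg) ? placeholders[arg]() : arg))
    )
}

// A trimmed copy of the site entry from the config example above.
const site = {
    accounts: {
        default: { username: () => 'main', password: 'password1' },
        sub: { username: 'sub_account', password: 'password2' },
    },
    login: [
        ['goto', 'https://example.com/login'],
        ['input', 'input[name="session[username_or_email]"]', '$username'],
        ['input', 'input[name="session[password]"]', '$password'],
        ['submit', '[role=button]'],
    ],
}

console.log(resolveLoginSteps(site, 'sub'))
```

With the label `sub`, the two input steps receive `sub_account` and `password2`. Without a label, the `default` account is used.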