@xapp/arachne-cli v1.9.0
A command-line crawler based on Puppeteer.
Example Usage
To crawl a site and save the pages to a local ./temp directory:
$ arachne crawl http://www.thecoffeefaq.com/ -d ./temp
To also save Markdown and schema.org FAQs:
$ arachne crawl http://www.thecoffeefaq.com/ -a -t markdown -d ./temp
With a file of whitelisted patterns:
$ arachne crawl http://www.thecoffeefaq.com/ -a -t markdown -d ./temp -w ./temp/whitelist.md
With a settling period:
$ arachne crawl http://www.thecoffeefaq.com/ -d ./temp -b 5000 -o 9000
Windows & WSL2 Notes
Follow the setup instructions here: https://github.com/puppeteer/puppeteer/issues/1837#issuecomment-689006806
Start XLaunch before running the CLI: select multiple windows, choose to start no client, and turn off access control.
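For the browser window to reach the X server started by XLaunch, the WSL2 shell usually needs DISPLAY pointing at the Windows host. The sketch below is one common way to do that and is not taken from this package: it assumes default WSL2 NAT networking, where the first nameserver entry in /etc/resolv.conf is the Windows host address. The helper name wsl_host_ip is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of arachne): print the first nameserver
# address from a resolv.conf-style file. Under default WSL2 NAT networking
# this is assumed to be the Windows host that runs the X server.
wsl_host_ip() {
  awk '/^nameserver/ {print $2; exit}' "${1:-/etc/resolv.conf}"
}

# Point X clients (such as the browser Puppeteer launches) at display 0 on that host.
export DISPLAY="$(wsl_host_ip):0"
```

If the address is wrong for your setup (for example with a different WSL networking mode), set DISPLAY by hand instead.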
Another option is to add the -h flag to run headless (no browser application is launched).
If the normal commands don't work, you might need to pass the executablePath (-e) and run headless (-h):
$ arachne crawl http://www.thecoffeefaq.com/ -e /usr/bin/google-chrome -h
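The fallback described above can be wrapped in a small shell function. This is a minimal sketch, not something shipped with the CLI: the function name crawl_with_fallback and the CHROME_PATH and ARACHNE_BIN variables are hypothetical, and it assumes that arachne exits with a non-zero status when a crawl fails and that -d can be combined with -e and -h.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: try a normal crawl first, then retry headless
# with an explicit Chrome executable path if the first attempt fails.
crawl_with_fallback() {
  local url="$1"
  local out_dir="$2"
  # Assumed defaults; override them through the environment if needed.
  local chrome="${CHROME_PATH:-/usr/bin/google-chrome}"
  local arachne="${ARACHNE_BIN:-arachne}"

  # First attempt: same form as the basic example (-d output directory).
  if "$arachne" crawl "$url" -d "$out_dir"; then
    return 0
  fi

  echo "Normal crawl failed; retrying headless with $chrome" >&2
  # Second attempt: executablePath (-e) plus headless (-h).
  "$arachne" crawl "$url" -d "$out_dir" -e "$chrome" -h
}
```

For example, after sourcing the file: crawl_with_fallback http://www.thecoffeefaq.com/ ./temp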