@xapp/arachne-cli v1.9.0
A command-line crawler based on Puppeteer.
Example Usage
To crawl a site and save the pages to a local ./temp directory:
$ arachne crawl http://www.thecoffeefaq.com/ -d ./temp
To also save markdown and schema.org FAQs:
$ arachne crawl http://www.thecoffeefaq.com/ -a -t markdown -d ./temp
With a whitelisted patterns file:
$ arachne crawl http://www.thecoffeefaq.com/ -a -t markdown -d ./temp -w ./temp/whitelist.md
With a settling period:
$ arachne crawl http://www.thecoffeefaq.com/ -d ./temp -b 5000 -o 9000
Windows & WSL2 Notes
Follow the instructions here to set up: https://github.com/puppeteer/puppeteer/issues/1837#issuecomment-689006806
Before running the CLI, start XLaunch and select multiple windows, no client, and turn off access control.
Another option is to add the -h flag to run headless (no browser application is launched).
If the normal commands don't work, you might need to pass the executablePath (-e) and run headless (-h).
$ arachne crawl http://www.thecoffeefaq.com/ -e /usr/bin/google-chrome -h
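The WSL fallback above could be wrapped in a small script so the same call works on both WSL and other systems. The following is only a sketch: the helper names is_wsl and arachne_cmd are hypothetical, the WSL check (looking for "microsoft" in the kernel release string) is a common heuristic rather than anything arachne provides, and it assumes that -d can be combined with -e and -h. It prints the command instead of running it, so the result can be inspected first.

```shell
#!/usr/bin/env bash
# Sketch only: builds an arachne command line, adding -e and -h under WSL.
# Uses only flags that appear in this README (-d, -e, -h).

# Hypothetical helper: succeeds when the kernel release string looks like WSL.
is_wsl() {
  case "$1" in
    *[Mm]icrosoft*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: prints the command that would be run.
# Arguments: URL, output directory, kernel release string.
arachne_cmd() {
  local url="$1"
  local out_dir="$2"
  local kernel="$3"
  local cmd="arachne crawl $url -d $out_dir"
  if is_wsl "$kernel"; then
    # Assumption: Chrome is installed at the path used in the example above.
    cmd="$cmd -e /usr/bin/google-chrome -h"
  fi
  printf '%s\n' "$cmd"
}

# Print the command for the current machine.
arachne_cmd "http://www.thecoffeefaq.com/" "./temp" "$(uname -r)"
```

Once the printed command looks right, it can be copied and run directly. Adjust the Chrome path if the browser is installed elsewhere.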