1.0.3 • Published 8 years ago
WWWEB
An autonomous webcrawler for indexing robots.txt files.
Requirements
- Node.js ^6.0.0
Usage
wwweb -d <domain> [-s <interval>] -o <directory> [--rest <seconds>] [[-v] -v] [-t <timeout>]
Options
Flag | Alias | Description | Info |
---|---|---|---|
--domain | -d | Initial domain | required |
--save-interval | -s | Interval in seconds for outputting reports | default: 30 |
--output | -o | Name of the output directory | required |
--help | -h | Show help | |
--rest | -r | Seconds to rest between requests | default: 0 |
--timeout | -t | Milliseconds before a request times out | default: 15000 |
--verbose | -v | Verbose output of what is going on | -vv for debug output |
--no-color | | Disable colorful output | |
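The defaults and units in the table are easy to mix up, since the save interval and rest period are given in seconds while the timeout is given in milliseconds. The sketch below restates them as a plain object. It is illustrative only: `DEFAULTS`, `withDefaults`, and the property names are hypothetical and not part of wwweb, and the verbosity default of 0 is an assumption, since the table does not state one.

```javascript
// Hypothetical helper, not part of wwweb: restates the documented defaults
// and rejects option sets that lack the two required flags.
const DEFAULTS = {
  saveInterval: 30, // --save-interval, in seconds
  rest: 0,          // --rest, in seconds
  timeout: 15000,   // --timeout, in milliseconds
  verbose: 0        // number of -v flags; default of 0 is an assumption
};

function withDefaults(options) {
  if (!options.domain) throw new Error('--domain is required');
  if (!options.output) throw new Error('--output is required');
  // Later sources win, so caller-supplied values override the defaults.
  return Object.assign({}, DEFAULTS, options);
}

const settings = withDefaults({
  domain: 'example.org',
  output: 'reports/',
  timeout: 8000
});
console.log(settings);
```

Here only the timeout is overridden, so the save interval and rest period keep their documented defaults.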
Examples
Crawl from example.org and output files to the current working directory:
wwweb -d example.org -o .
Crawl from example.org, output files to ./reports/, print verbose output, time out requests after eight seconds, and save a report every minute:
wwweb -d example.org -o reports/ -v -t 8000 -s 60
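If the crawler needs to be started from another Node script, the same command line can be assembled programmatically. This is a minimal sketch, assuming `wwweb` is installed and available on the `PATH`; `buildArgs` and `runCrawler` are hypothetical helper names, and only the short flags from the options table are used.

```javascript
const spawn = require('child_process').spawn;

// Hypothetical helper: turns an options object into the documented CLI flags.
function buildArgs(options) {
  const args = ['-d', options.domain, '-o', options.output];
  // Each -v raises verbosity by one level (-v -v is equivalent to -vv).
  for (let i = 0; i < (options.verbose || 0); i++) args.push('-v');
  if (options.timeout !== undefined) args.push('-t', String(options.timeout));
  if (options.saveInterval !== undefined) args.push('-s', String(options.saveInterval));
  if (options.rest !== undefined) args.push('-r', String(options.rest));
  return args;
}

// Hypothetical helper: starts wwweb as a child process that shares this
// process's stdout and stderr. It is defined here but not called.
function runCrawler(options) {
  return spawn('wwweb', buildArgs(options), { stdio: 'inherit' });
}

const args = buildArgs({
  domain: 'example.org',
  output: 'reports/',
  verbose: 1,
  timeout: 8000,
  saveInterval: 60
});
console.log(['wwweb'].concat(args).join(' '));
// prints "wwweb -d example.org -o reports/ -v -t 8000 -s 60"
```

Passing the arguments as an array to `spawn` avoids shell quoting issues with domain or directory names.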