0.0.9 • Published 3 months ago

@themaximalist/scrape.js v0.0.9

License
MIT

scrape.js

Scrape.js is an easy-to-use web scraping library for Node.js:

  • Extremely Fast
  • Scrape nearly any website
  • Auto-retries with increasing sophistication
  • Auto proxy rotation
  • ...it just works

const data = await scrape("https://example.com");
// { url, html, original_url, options }

You can specify additional options to scrape() for more control:

const data = await scrape("https://example.com", { headless: true, proxy: true });
// { url, html }
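
The "auto-retries with increasing sophistication" idea can be pictured as a ladder of option sets, tried in order until one returns HTML. The following is only a minimal sketch of that concept, not the library's internal logic: the helper name scrapeWithEscalation, the DEFAULT_STEPS ordering, and the "empty html means failure" rule are all assumptions. The only names taken from this README are the headless and proxy options and the html field of the result. The scrape function is passed in as a parameter so the sketch works with anything shaped like scrape(url, options).

```javascript
// Hypothetical escalation ladder: each entry is an options object for scrape().
// The option names come from the README; the ordering is an assumption.
const DEFAULT_STEPS = [
  {},                              // plain request
  { headless: true },              // headless browser
  { headless: true, proxy: true }, // headless browser behind a proxy
];

// scrapeFn is injected so this works with any function shaped like scrape(url, options).
async function scrapeWithEscalation(scrapeFn, url, steps = DEFAULT_STEPS) {
  let lastError = new Error("no attempts were made");
  for (const options of steps) {
    try {
      const data = await scrapeFn(url, options);
      // Treat a missing or empty html field as a failed attempt (an assumption).
      if (data && data.html) return data;
      lastError = new Error(`Empty response for ${url}`);
    } catch (err) {
      lastError = err;
    }
  }
  // Every step failed: surface the most recent error.
  throw lastError;
}
```

With the package installed, a call might look like scrapeWithEscalation(require("@themaximalist/scrape.js"), "https://example.com"). Since scrape() already retries on its own, a wrapper like this is mainly useful for seeing the idea spelled out.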

Installation

npm install @themaximalist/scrape.js

Usage

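const scrape = require("@themaximalist/scrape.js");

// await is not available at the top level of a CommonJS module, so use the returned promise.
scrape("http://example.com").then((data) => console.log(data.url));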
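
Once scrape() resolves, the html field is a plain string that can be processed further. As a small illustration, the sketch below pulls the page title out of that string. The helper name extractTitle is hypothetical and not part of scrape.js, and a regular expression is only adequate for simple cases like this one; a real HTML parser is the safer choice for anything more involved.

```javascript
// Hypothetical helper (not part of scrape.js): extract the <title> text
// from an HTML string such as the html field returned by scrape().
function extractTitle(html) {
  const match = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html);
  // Return null when the document has no <title> element.
  return match ? match[1].trim() : null;
}

// Usage, assuming the package is installed:
// const scrape = require("@themaximalist/scrape.js");
// scrape("http://example.com").then((data) => console.log(extractTitle(data.html)));
```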

Configuration

scrape.js uses ZenRows for proxy rotation. To use it, acquire a ZenRows API key and set the ZENROWS_API_KEY environment variable. scrape.js can be used without proxies, but it is less effective.

ZENROWS_API_KEY=abcxyz123
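
Because the proxy depends on that environment variable, it can be convenient to request it only when a key is actually present. This is a minimal sketch under that assumption: buildScrapeOptions is a hypothetical helper, not part of scrape.js, and how the library behaves when proxy is requested without a key is not documented here.

```javascript
// Hypothetical helper: request the proxy option only when a ZenRows key is configured.
// env defaults to process.env but can be replaced with a plain object for testing.
function buildScrapeOptions(env = process.env) {
  const key = env.ZENROWS_API_KEY;
  const hasKey = typeof key === "string" && key.trim() !== "";
  return { proxy: hasKey };
}

// Usage, assuming the package is installed:
// const scrape = require("@themaximalist/scrape.js");
// scrape("https://example.com", buildScrapeOptions()).then((data) => console.log(data.url));
```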

Examples

See the tests for examples of how to use scrape.js.

Projects

scrape.js is currently used in the following projects:

  • News Score: score the news, rewrite the headlines

Author

License

MIT