
scrape.js

Scrape.js is an easy-to-use web scraping library for Node.js:

Extremely fast
Scrape nearly any website
Auto-retries with increasing sophistication
Auto proxy rotation
...it just works

const data = await scrape("https://example.com");
// { url, html, original_url, options }
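As a small illustration of working with the result, the sketch below pulls the page title out of the returned `html` string. `extractTitle` is a hypothetical helper, not part of scrape.js, and the regular expression is only adequate for simple pages; a real project would more likely use an HTML parser.

```javascript
// Hypothetical helper (not part of scrape.js): read the <title> from an HTML string.
function extractTitle(html) {
  const match = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html);
  // Collapse whitespace so a multi-line title comes back on one line.
  return match ? match[1].replace(/\s+/g, " ").trim() : null;
}

// Stand-in for the `html` field of a scrape() result.
const sampleHtml = "<html><head><title>Example Domain</title></head><body></body></html>";
console.log(extractTitle(sampleHtml)); // prints: Example Domain
```

With a real result, this would be called as `extractTitle(data.html)`.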

You can specify additional options to scrape() for more control:

const data = await scrape("https://example.com", { headless: true, proxy: true });
// { url, html }
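scrape.js already retries on its own, so the following is only a sketch for callers who want explicit control over which options are tried and in what order. It uses just the `headless` and `proxy` options shown above. The helper name `scrapeWithEscalation`, the ordering of the attempts, and the rule that a non-empty `html` counts as success are all assumptions made for illustration, not the library's internal behavior.

```javascript
// Option sets to try in order, from cheapest to most thorough (an assumed ordering).
const ATTEMPTS = [
  {},
  { headless: true },
  { headless: true, proxy: true },
];

// `scrape` is passed in so the helper works with the real library or a test stub.
async function scrapeWithEscalation(scrape, url, attempts = ATTEMPTS) {
  let lastError;
  for (const options of attempts) {
    try {
      const data = await scrape(url, options);
      if (data && data.html) return data; // assumption: non-empty html means success
    } catch (err) {
      lastError = err; // remember the failure and move on to the next option set
    }
  }
  throw lastError || new Error(`No attempt returned html for ${url}`);
}
```

Inside an async function it could be called as `await scrapeWithEscalation(require("@themaximalist/scrape.js"), "https://example.com")`.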

Installation

npm install @themaximalist/scrape.js

Usage

const scrape = require("@themaximalist/scrape.js");
(async () => {
  const data = await scrape("http://example.com"); // await needs an async function in CommonJS
})();

Configuration

scrape.js uses ZenRows for proxy rotation. To enable it, obtain a ZenRows API key and set it in the ZENROWS_API_KEY environment variable. scrape.js can be used without proxies, but is less effective.

ZENROWS_API_KEY=abcxyz123
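One way to tie the environment variable to the `proxy` option is sketched below: request proxying only when a key is actually present. `optionsFromEnv` is a hypothetical helper, and the choice to disable `proxy` when the key is missing is an assumption about sensible caller behavior, not something scrape.js requires.

```javascript
// Hypothetical helper: build scrape() options from the environment.
function optionsFromEnv(env = process.env) {
  const key = env.ZENROWS_API_KEY;
  const hasKey = typeof key === "string" && key.trim() !== "";
  // Only ask for proxying when a key is set; otherwise scrape without proxies.
  return { headless: true, proxy: hasKey };
}

console.log(optionsFromEnv({ ZENROWS_API_KEY: "abcxyz123" }).proxy); // prints: true
console.log(optionsFromEnv({}).proxy);                               // prints: false
```

The returned object can be passed as the second argument to `scrape()`.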

Examples

See the tests for examples of how to use scrape.js.

Projects

scrape.js is currently used in the following projects:

News Score: score the news, rewrite the headlines

Author

License

MIT

