0.1.0 • Published 7 years ago

License
GPL-3.0

scraptor

!!This library is a work in progress. The API will most likely change.!!

This library is my attempt to wrap puppeteer and cheerio in a way that makes web scrapers easy to construct. A DSL implements common scraping patterns, while still allowing you to break out into the underlying libraries when necessary.

Synopsis

import {browse, once, fillForm, click, html, usingHeadlessBrowser} from "scraptor";
import {flowP} from "combinators-p";

// A JavaScript expression (as a string) that is true once the spinner is hidden.
const spinnerDone = "document.querySelector('.spinner').classList.contains('hide')";
const waitForSpinner = once(spinnerDone);
// Compose the scraping steps; flowP runs them in order, starting from url.
const search = (url, term) =>
  flowP([
    browse,
    waitForSpinner,
    fillForm("#search"),
    click("button.search"),
    waitForSpinner,
    html("body"),
  ], url);

usingHeadlessBrowser(search("https://example.org", "Keith Johnstone"))
  .then(console.log); // Prints full HTML
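
The synopsis chains asynchronous steps from left to right. As a rough illustration of that pattern only, the sketch below composes promise-returning functions with a small helper. `pipeAsync`, `visit`, `wait` and `extract` are hypothetical names invented for this example; they are not part of scraptor or combinators-p, and the real `flowP` may behave differently.

```javascript
// Hypothetical helper (not scraptor's API): run async steps left to right,
// passing the resolved result of each step to the next one.
const pipeAsync = (steps, initial) =>
  steps.reduce((acc, step) => acc.then(step), Promise.resolve(initial));

// Stand-in steps. A real scraper would drive a browser page here instead.
const visit = async (url) => ({url, log: ["visit"]});
const wait = async (page) => ({...page, log: [...page.log, "wait"]});
const extract = async (page) => `${page.url}: ${page.log.join(" > ")}`;

// Build a pipeline the same way the synopsis builds `search`.
const run = (url) => pipeAsync([visit, wait, extract], url);

run("https://example.org").then(console.log);
```

Because every step receives the previous step's result, a step such as `wait` can be reused at several points in the list, much as `waitForSpinner` appears twice in the synopsis.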

API

usingBrowser

usingHeadlessBrowser

browse

html

fillForm

click

once

onceLoaded

onceMs

doUntil