0.1.0 • Published 7 years ago


Weekly downloads: 3
License: MIT
Repository: github
Last release: 7 years ago

puppeteer-spider

A web crawler powered by puppeteer

APIs

spider.fetchPage(url, options)

Fetches the page at the given URL and resolves with its title and content.

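const spider = require('puppeteer-spider')

async function main() {
  // fetchPage resolves with an object exposing `title` and `content`
  const result = await spider.fetchPage('https://www.example.com/')
  console.log(result.title, result.content)
}

main().catch(console.error)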

Options:

timeout: Maximum navigation time in milliseconds. Defaults to 30000 (30 seconds).
userAgent: User agent string to use for this page.
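
The two options above can be passed per call. The sketch below wraps fetchPage in a hypothetical helper, fetchPageSafe, which is not part of puppeteer-spider. It assumes that a failed or timed-out navigation rejects the returned promise, which this README does not state. The spider object is passed in as a parameter so the helper can be tried with a stand-in object.

```javascript
// Sketch only: `fetchPageSafe` is a hypothetical helper, not part of puppeteer-spider.
// `spider` is expected to be the object returned by require('puppeteer-spider').
async function fetchPageSafe(spider, url, options = {}) {
  try {
    // Per-call options as documented above: timeout (ms) and userAgent.
    const result = await spider.fetchPage(url, options)
    return { ok: true, title: result.title, content: result.content }
  } catch (err) {
    // Assumption: a failed or timed-out navigation rejects the promise.
    return { ok: false, error: err.message }
  }
}

// Usage (requires the package to be installed):
// const spider = require('puppeteer-spider')
// fetchPageSafe(spider, 'https://www.example.com/', { timeout: 10000, userAgent: 'my-crawler/1.0' })
//   .then(page => console.log(page))
```

Returning an { ok, ... } object instead of throwing is one possible design; callers that prefer exceptions can call fetchPage directly.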

spider.debug

Enable or disable debug mode. Disabled by default.

spider.debug = true

spider.timeout

Set the default timeout, in milliseconds.

spider.timeout = 20000

spider.userAgent

Set the default user agent.

spider.userAgent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36'
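
As a rough illustration of how the defaults above combine with fetchPage, the sketch below sets the module-level defaults once and then fetches a list of URLs one at a time. The crawlAll helper and its defaults parameter are hypothetical names introduced here, not part of puppeteer-spider. The spider object is passed in so the function can be exercised with a stand-in.

```javascript
// Sketch only: `crawlAll` is a hypothetical helper, not part of puppeteer-spider.
// `spider` is expected to be the object returned by require('puppeteer-spider').
async function crawlAll(spider, urls, defaults = {}) {
  // Module-level defaults, set through the documented properties.
  if (defaults.timeout !== undefined) spider.timeout = defaults.timeout
  if (defaults.userAgent !== undefined) spider.userAgent = defaults.userAgent

  const pages = []
  for (const url of urls) {
    // Sequential on purpose: each page is fetched only after the previous one finishes.
    const { title, content } = await spider.fetchPage(url)
    pages.push({ url, title, content })
  }
  return pages
}

// Usage (requires the package to be installed):
// const spider = require('puppeteer-spider')
// crawlAll(spider, ['https://www.example.com/'], { timeout: 20000 })
//   .then(pages => pages.forEach(p => console.log(p.url, p.title)))
```

Any rejection from fetchPage propagates to the caller here, so one failing URL stops the loop; wrap the call in try/catch if the crawl should continue past failures.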