2.0.1 • Published 7 years ago

@dschnare/chai v2.0.1

License: MIT

Chai

Chai is a simple web crawler that scrapes relevant SEO data from each page it visits.

Usage

npm install @dschnare/chai -g
chai http://mywebsite.com > crawl.json

Scraping

Chai scrapes the following data from each page it visits:

  • Page title
  • All H1 values
  • All H2 values

The scraped data is written to stdout as a JSON array of objects with the following shape:

{ title, url, headings: { h1: [], h2: [] } }
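As an illustration of consuming this shape, the sketch below flags pages with no H1 or with more than one H1. The helper name findHeadingIssues and the sample records are hypothetical and not part of Chai; the code only assumes the documented { title, url, headings: { h1, h2 } } shape.

```javascript
// Hypothetical helper, not part of Chai: flags pages whose H1 usage may need review.
// Assumes `records` is the parsed JSON array that Chai wrote to stdout.
function findHeadingIssues(records) {
  const issues = [];
  for (const record of records) {
    // Skip anything without a headings object (for example, error records).
    if (!record.headings) continue;
    const h1 = record.headings.h1 || [];
    if (h1.length === 0) {
      issues.push({ url: record.url, issue: "missing h1" });
    } else if (h1.length > 1) {
      issues.push({ url: record.url, issue: "multiple h1" });
    }
  }
  return issues;
}

// Sample data in the documented shape; the titles and URLs are made up.
const samplePages = [
  { title: "Home", url: "http://mywebsite.com/", headings: { h1: ["Welcome"], h2: ["News"] } },
  { title: "About", url: "http://mywebsite.com/about", headings: { h1: [], h2: [] } },
  { title: "Blog", url: "http://mywebsite.com/blog", headings: { h1: ["Blog", "Latest"], h2: [] } },
];

console.log(findHeadingIssues(samplePages));
```

To run this against a real crawl, the array could be loaded with JSON.parse(require("fs").readFileSync("crawl.json", "utf8")) in place of the sample data.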

For URLs that respond with an error, the scrape object has this shape instead:

{ url, statusCode, error }

Here, error is the error object returned by Superagent.
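Because both shapes appear in the same array, a consumer has to tell them apart. The sketch below shows one possible way: it treats any record carrying an error property as a failure and then picks out the URLs for a given status code, such as 404. The helper names splitCrawl and urlsWithStatus are hypothetical, the sample data is made up, and the code assumes statusCode is serialized as a number.

```javascript
// Hypothetical helpers, not part of Chai.
// Assumption: a record is an error record exactly when it has an `error` property.
function splitCrawl(records) {
  const pages = [];
  const failures = [];
  for (const record of records) {
    if (record && "error" in record) {
      failures.push(record);
    } else {
      pages.push(record);
    }
  }
  return { pages, failures };
}

// Assumption: `statusCode` is a number in the JSON output.
function urlsWithStatus(failures, statusCode) {
  return failures
    .filter((failure) => failure.statusCode === statusCode)
    .map((failure) => failure.url);
}

// Made-up sample mixing both documented shapes.
const sampleCrawl = [
  { title: "Home", url: "http://mywebsite.com/", headings: { h1: ["Welcome"], h2: [] } },
  { url: "http://mywebsite.com/old-page", statusCode: 404, error: { message: "Not Found" } },
  { url: "http://mywebsite.com/broken", statusCode: 500, error: { message: "Internal Server Error" } },
];

const { pages, failures } = splitCrawl(sampleCrawl);
console.log(pages.length, "pages scraped,", failures.length, "failed");
console.log("404s:", urlsWithStatus(failures, 404));
```

Keeping the split separate from the status filter means the same failures list can be reused for other status codes without re-reading the crawl.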

Roadmap

  • Expose a way to filter out URLs to be crawled
  • Expose a way to customize the scraper
  • Make it easier to identify 404 URLs
  • Add an option to control verbosity

Versions

  • 2.0.1 (7 years ago)
  • 2.0.0 (7 years ago)
  • 1.0.3 (9 years ago)
  • 1.0.2 (9 years ago)
  • 1.0.1 (9 years ago)
  • 1.0.0 (9 years ago)