scrap v0.1.0
Node.js - scrap
A simple screen scraper module that uses jQuery style semantics.
Why?
In every screen scraper program that I wrote, I had to include request and cheerio. I would then have to check the response error object and the response code. It became a bit annoying. Hence this package.
Installation
npm install scrap
Quick and Dirty
var scrap = require('scrap');
scrap('http://google.com', function(err, $) {
    console.log($('title').text().trim()); // Google
});
API
scrap(options, callback)
options: Can either be a string url or an object containing options as key,value pair.
Options include:
  url: The URL to parse.
  timeout: The number of milliseconds to wait before aborting the request.
  proxy: The proxy string, e.g. 245.12.19.145:8080.
callback: The callback function for a response. The function can include the following parameters:
  err: The error object, if one exists. It is also set when the response code is not 200. This may be a poor design choice; time will tell.
  $: jQuery-style object to use on the page.
  code: HTTP response status code.
  html: HTML or response body text.
  resp: The actual response object.
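The options-object form and the full callback signature can be sketched as follows. This is a minimal illustration, not part of the module: `fetchTitle`, the 5000 ms timeout, and the shape of the `done` callback are hypothetical, and the `scrap` function is passed in as an argument so the sketch can be tried with a stub. Whether `code` is populated when `err` is set is an assumption here.

```javascript
// Sketch only: `fetchTitle` is a hypothetical helper, not part of scrap.
// `scrap` is passed in so the function can be exercised without network access.
function fetchTitle(scrap, url, done) {
  // Options object form: url plus a timeout in milliseconds (value chosen for illustration).
  var options = { url: url, timeout: 5000 };

  scrap(options, function (err, $, code, html, resp) {
    // Per the API notes, err is set on failures and on non-200 responses.
    if (err) {
      return done(err, null, code);
    }
    // $ behaves like jQuery, so the usual selector calls apply.
    done(null, $('title').text().trim(), code);
  });
}

// Usage with the real module (needs network access):
// fetchTitle(require('scrap'), 'http://google.com', function (err, title, code) {
//   if (err) return console.error('scrape failed:', err, code);
//   console.log(code, title);
// });
```

Because `err` covers both transport errors and non-200 statuses, a single `if (err)` check is enough before touching `$`.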
Credits
This would not be possible without the great Node.js modules request and cheerio.
Author
This module was written by JP Richardson. You should follow him on Twitter @jprichardson. Also read his coding blog, Procbits. If you write software with others, you should check out Gitpilot to make collaboration with Git simple.
License
(MIT License)
Copyright 2012, JP Richardson jprichardson@gmail.com