0.2.1 • Published 8 years ago

yan-crawler v0.2.1


Overview

A simple module that polls websites at regular intervals and lets you extract whatever information you want from each response. Strictly speaking, it is not a crawler; if you need one, consider a popular alternative such as node-crawler.

Installation

npm install yan-crawler

Usage

var Crawler = require('yan-crawler').Crawler;
var crawler = Crawler.getInstance();

var amazonTemplate = {
    name: 'Amazon',
    url: 'https://www.amazon.com/',
    interval: 3000, // polling interval in milliseconds
    callback: function(body, $) {
        // $ is cheerio - https://github.com/cheeriojs/cheerio
        console.log('Grabbed Amazon.');
    }
};

var IMDBTemplate = {
    name: 'IMDB',
    interval: 2000,
    url: 'http://www.imdb.com',
    callback: function(body, $) {
        console.log('Grabbed IMDB.');
    }
};

crawler.addEntry(amazonTemplate);
crawler.addEntry(IMDBTemplate);
crawler.start();

The code above requests www.amazon.com every 3000 ms and www.imdb.com every 2000 ms, and calls each entry's callback when its response arrives.
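The callbacks in the usage above only log a message. As a rough sketch of extracting something from the response, the entry below reads the page title through `$`. The helper `makeTitleEntry` and its `onTitle` parameter are hypothetical names introduced for illustration and are not part of yan-crawler. The sketch assumes `$` is a cheerio instance already loaded with the response body, as the comment in the usage example suggests.

```javascript
// Hypothetical helper, not part of yan-crawler: builds an entry whose
// callback extracts the page title and passes it to onTitle.
function makeTitleEntry(name, url, interval, onTitle) {
    return {
        name: name,
        url: url,
        interval: interval, // polling interval in milliseconds
        callback: function(body, $) {
            // Assumption: $ is a cheerio instance loaded with the response body,
            // so $('title').text() returns the text of the <title> element.
            var title = $('title').text().trim();
            onTitle(name, title);
        }
    };
}

var imdbTitleEntry = makeTitleEntry('IMDB', 'http://www.imdb.com', 2000, function(name, title) {
    console.log(name + ' title: ' + title);
});

// With yan-crawler installed, the entry would be registered as in the usage above:
// crawler.addEntry(imdbTitleEntry);
```

Keeping the extraction logic in a small factory like this makes it possible to reuse the same callback for several sites and to exercise it without making network requests.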

License

MIT

0.2.1 (8 years ago)
0.2.0 (8 years ago)
0.1.0 (8 years ago)
0.0.1 (9 years ago)