0.2.1 • Published 10 years ago

tarantula v0.2.1

node-tarantula

A Node.js crawler/spider that provides a simple interface for crawling the web. Its API is inspired by crawler4j.

Quick Examples

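// Assumption: the package exports the Tarantula constructor directly.
var Tarantula = require('tarantula');

// The "brain" holds the crawler's configuration and callbacks.
var brain = {

    legs: 8,

    // Called with a candidate URI; return true to crawl it.
    shouldVisit: function (uri) {
        return true;
    }

};

var tarantula = new Tarantula(brain);

// Log each URI reported through the 'data' event.
tarantula.on('data', function (uri) {
    console.info('200', uri);
});

// Fired once the crawl has finished.
tarantula.on('done', function () {
    console.log('done');
});

// Start crawling from a list of seed URLs.
tarantula.start(["http://stackoverflow.com"]);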
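
In the example above, shouldVisit returns true for every URI, so the crawl is not limited to the seed site. The following is a minimal sketch of a more restrictive callback. It assumes shouldVisit receives the candidate URI as a string, as the example suggests; sameHostOnly is a hypothetical helper and not part of tarantula.

```javascript
// Hypothetical helper (not part of tarantula): builds a shouldVisit callback
// that accepts only http(s) URIs on the same host as the seed URL.
// Assumption: shouldVisit is called with the candidate URI as a string.
function sameHostOnly(seed) {
    var seedHost = new URL(seed).hostname;

    return function shouldVisit(uri) {
        try {
            var candidate = new URL(uri);
            var isHttp = candidate.protocol === 'http:' ||
                         candidate.protocol === 'https:';
            return isHttp && candidate.hostname === seedHost;
        } catch (err) {
            // Relative or otherwise unparseable URIs are skipped.
            return false;
        }
    };
}

var brain = {
    legs: 8,
    shouldVisit: sameHostOnly('http://stackoverflow.com')
};
```

A brain built this way can be passed to new Tarantula(brain) exactly as in the quick example. Subdomains count as different hosts here; compare on a suffix of the hostname instead if they should be included.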

Phantom Usage

If you would like to use the included PhantomJS plugin, you'll need to install the PhantomJS app (it is not an npm module).

1. You can download PhantomJS from its website.
2. It's also available through popular OS package managers: brew install phantomjs, apt-get install phantomjs
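
Because PhantomJS is installed separately from npm, it can help to check that the binary is reachable before starting a crawl that relies on the plugin. This is a rough sketch using only Node's child_process module. It does not touch tarantula's plugin API, and isOnPath is a hypothetical helper name.

```javascript
var spawnSync = require('child_process').spawnSync;

// Hypothetical helper: returns true if `command --version` can be run
// and exits with status 0, false otherwise (for example, when the binary
// is not installed or not on PATH).
function isOnPath(command) {
    var result = spawnSync(command, ['--version'], { encoding: 'utf8' });
    return !result.error && result.status === 0;
}

if (!isOnPath('phantomjs')) {
    console.warn('phantomjs not found on PATH; install it before using the PhantomJS plugin');
}
```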

0.2.1    10 years ago
0.2.0    10 years ago
0.1.0    10 years ago
0.0.6    10 years ago
0.0.5    11 years ago
0.0.4    11 years ago
0.0.3    11 years ago
0.0.2    11 years ago
0.0.1    12 years ago