0.0.7 • Published 8 years ago

gretel v0.0.7

Weekly downloads: 3
License: MIT
Repository: github
Last release: 8 years ago

Gretel

Follows and collects breadcrumbs across the web.

Heavily relies on Christopher Giffard's node-simplecrawler

Usage

### CLI

gretel options

Options:

  -h, --help                  output usage information
  -V, --version               output the version number
  -s, --startUri [uri]        Uri to start crawling from
  -q, --queuePath [filePath]  File path to load / save queue from

### Module

var gretel = require('gretel')('www.example.com');

gretel.start();

Optionally load / save breadcrumb queue state

gretel.load('./breadcrumbs.json', function(error){
    if(error){
        return console.log(error.stack || error);
    }

    gretel.start();
});

gretel.queue.freeze("./breadcrumbs.json", function(error){
    if(error){
        console.log(error.stack || error);
    }
});
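
One way to persist progress is to freeze the queue once the crawl has finished. The sketch below assumes the crawler emits node-simplecrawler's `complete` event and exposes `queue.freeze(path, callback)` as in the snippet above. `freezeOnComplete` and the stand-in crawler are hypothetical names, used only so the example runs without the module or network access.

```javascript
var EventEmitter = require('events').EventEmitter;

// Hypothetical helper: save the queue to `filePath` when the crawl completes.
function freezeOnComplete(crawler, filePath, done) {
    crawler.on('complete', function() {
        crawler.queue.freeze(filePath, function(error) {
            if (error) {
                console.log(error.stack || error);
            }
            if (done) {
                done(error || null);
            }
        });
    });
}

// Stand-in for a gretel instance (an assumption for illustration only);
// with the real module this would be require('gretel')('www.example.com').
var fakeCrawler = new EventEmitter();
var frozenTo = [];
fakeCrawler.queue = {
    freeze: function(filePath, callback) {
        frozenTo.push(filePath);
        callback();
    }
};

var finished = false;
freezeOnComplete(fakeCrawler, './breadcrumbs.json', function(error) {
    finished = !error;
});

// Simulate the end of a crawl.
fakeCrawler.emit('complete');
```

With a real instance, the helper would be registered before calling `gretel.start()`, so the queue is written out as soon as the crawl ends.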

Other settings on gretel are the same as in node-simplecrawler (gretel is actually an instance of Crawler). For more information and examples, see the node-simplecrawler readme.
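
As a rough illustration, settings are plain properties assigned before the crawl starts. The property names below (`interval`, `maxConcurrency`, `userAgent`) are taken from node-simplecrawler and should be checked against the version gretel depends on. The values and the stand-in object are assumptions made so the sketch runs without the module installed.

```javascript
// Stand-in for a gretel instance (illustration only); with the real module:
// var gretel = require('gretel')('www.example.com');
var crawlerLike = {};

// Example values, not recommendations.
crawlerLike.interval = 500;              // ms between spooling up new requests
crawlerLike.maxConcurrency = 2;          // maximum simultaneous requests
crawlerLike.userAgent = 'gretel-example'; // user agent string sent with requests
```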

// sync processing
gretel.on('fetchcomplete', function(queueItem, data, response) {
    console.log(queueItem.url);
});

// async processing
gretel.on("fetchcomplete", function(queueItem, data, response) {
    // "continue" is a reserved word in JavaScript, so the callback needs another name
    var resume = this.wait();
    doSomethingAsync(data, function(){
        console.log(queueItem.url);
        resume();
    });
    });
});
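
To illustrate why the callback returned by `this.wait()` should always be called, here is a minimal stand-in for that kind of bookkeeping. It is not node-simplecrawler's implementation. `PendingTracker` and its members are hypothetical names, and the sketch only assumes that the crawler counts outstanding `wait()` calls and holds back completion until each returned callback has run.

```javascript
// Hypothetical stand-in, not node-simplecrawler's actual code.
function PendingTracker(onIdle) {
    this.pending = 0;
    this.onIdle = onIdle;
}

// Register one unit of outstanding async work and return
// the callback that marks it as finished.
PendingTracker.prototype.wait = function() {
    var self = this;
    var called = false;
    self.pending += 1;
    return function resume() {
        if (called) {
            return; // ignore repeated calls
        }
        called = true;
        self.pending -= 1;
        if (self.pending === 0) {
            self.onIdle();
        }
    };
};

var idleCount = 0;
var tracker = new PendingTracker(function() {
    idleCount += 1;
});

var resumeA = tracker.wait();
var resumeB = tracker.wait();

resumeA(); // one handler is still pending, so onIdle has not run
var idleAfterFirst = idleCount;

resumeB(); // last handler finished, so onIdle runs once
```

If a handler never calls its callback, the pending count in a scheme like this never reaches zero, which is why every code path in an async listener should end by calling it.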

0.0.7: 8 years ago
0.0.6: 11 years ago
0.0.5: 11 years ago
0.0.4: 11 years ago
0.0.3: 11 years ago
0.0.2: 11 years ago
0.0.1: 11 years ago
0.0.0: 11 years ago