0.3.5 • Published 11 years ago

crawlstream

A website crawler that gives a readable stream of request streams.

Development of this module has been sponsored by Knowit.

Installation

$ npm install crawlstream

Running the tests

$ npm test

Examples

Printing out the paths of all the pages found.

Streaming API

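var crawlstream = require('crawlstream');

// Crawl mysite.com with a concurrency of 10 and print the path of each page.
crawlstream('mysite.com', 10)
	.on('data', function(req) {
		// Each data event carries the request stream for one page.
		console.log(req.uri.path);
	});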
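
If a de-duplicated list of paths is more useful than a running log, the streaming form can be wrapped in a small collector. The following is only a sketch: collectPaths is a hypothetical helper, not part of crawlstream, and it assumes the stream emits the standard 'end' and 'error' events of a readable stream. An EventEmitter stands in for crawlstream('mysite.com', 10) so the sketch runs without network access.

```javascript
var EventEmitter = require('events').EventEmitter;

// Hypothetical helper (not part of crawlstream): gathers the unique paths
// emitted by any stream whose 'data' events carry objects with `uri.path`.
function collectPaths(stream, done) {
	var seen = {};
	var paths = [];
	var finished = false;

	// Make sure done() is called only once, whether on 'error' or 'end'.
	function finish(err) {
		if (finished) return;
		finished = true;
		done(err, paths);
	}

	stream.on('data', function(req) {
		var path = req.uri.path;
		if (!Object.prototype.hasOwnProperty.call(seen, path)) {
			seen[path] = true;
			paths.push(path);
		}
	});
	stream.on('error', finish);
	stream.on('end', function() { finish(null); });
}

// Stand-in for crawlstream('mysite.com', 10); assumed to behave alike.
var fake = new EventEmitter();
var result = null;
collectPaths(fake, function(err, paths) {
	if (err) throw err;
	result = paths;
});
fake.emit('data', { uri: { path: '/' } });
fake.emit('data', { uri: { path: '/about' } });
fake.emit('data', { uri: { path: '/' } });
fake.emit('end');
console.log(result);
```

With the real module, the call would presumably read collectPaths(crawlstream('mysite.com', 10), callback).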

Callback API

var crawlstream = require('crawlstream');

crawlstream('mysite.com', 10, function(err, req) {
	// Handle errors before touching req.
	if (err) {
		console.error(err);
		return;
	}
	console.log(req.uri.path);
});
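
To show how the two forms relate, here is one way a callback form could be layered over the streaming form. This is a sketch and is not taken from crawlstream's source: toCallback is a hypothetical adapter, and it assumes failures surface as 'error' events on the stream. An EventEmitter again stands in for the crawler.

```javascript
var EventEmitter = require('events').EventEmitter;

// Hypothetical adapter (not crawlstream's own code): forwards stream events
// to a Node-style callback(err, req).
function toCallback(stream, callback) {
	stream.on('data', function(req) { callback(null, req); });
	stream.on('error', function(err) { callback(err); });
	return stream;
}

// Stand-in for the stream returned by crawlstream(baseUrl, concurrency).
var fake = new EventEmitter();
var calls = [];
toCallback(fake, function(err, req) {
	calls.push(err ? 'error: ' + err.message : req.uri.path);
});
fake.emit('data', { uri: { path: '/contact' } });
fake.emit('error', new Error('timeout'));
console.log(calls);
```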

Methods

var crawlstream = require('crawlstream')

crawlstream(baseUrl, concurrency, callback)

Crawl all pages under baseUrl.

Optionally supply a callback(err, req), which is called with the request stream (not a buffered response) for each page found.
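
Because req is a stream, the page body has to be consumed from it. The sketch below totals the bytes of one page. measureBody is a hypothetical helper, and it assumes that req emits 'data' chunks followed by 'end', as Node readable streams do; the README itself only documents req.uri.path. A stubbed request is used so the sketch runs offline.

```javascript
var EventEmitter = require('events').EventEmitter;

// Hypothetical helper (not part of crawlstream): totals the bytes emitted
// by one request stream and reports them together with the page path.
function measureBody(req, done) {
	var bytes = 0;
	req.on('data', function(chunk) {
		// Chunks may arrive as Buffers or strings; byteLength handles both.
		bytes += Buffer.byteLength(chunk);
	});
	req.on('error', done);
	req.on('end', function() { done(null, req.uri.path, bytes); });
}

// Stand-in for a req object handed to the crawlstream callback.
var fakeReq = new EventEmitter();
fakeReq.uri = { path: '/about' };
var report = null;
measureBody(fakeReq, function(err, path, bytes) {
	if (err) throw err;
	report = path + ' ' + bytes;
});
fakeReq.emit('data', Buffer.from('<html>'));
fakeReq.emit('data', '</html>');
fakeReq.emit('end');
console.log(report);
```

In a real crawl, measureBody would be called from inside the callback(err, req) after checking err.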

License

Copyright 2012 Knowit

MIT
