SPIDERS

Crawl web pages efficiently

Features

Persistence
Optimization
Lightweight

Installation

npm install spiders

yarn add spiders

Simple Usage Demo

ES6 syntax:

let Spider = require('spiders');
let spidy = new Spider();
// Crawl a page; the promise resolves with $ for the fetched document
spidy.crawl( 'http://urltoscrape' )
	.then( $ => {
		let title = $("title").text(); // jQuery functions
		console.log(title);
	});

Options

Options can be passed as an argument when the object is initialized.

The following options are supported:

{
	persist : './fileToStore',
	toStore : (params, url) => {
	},
	fromStore : (obj, params, url) => {
	}
}

persist - path of the file used for persistence; described briefly in the next section
toStore - returns an object that tells the spider how to store the given url and params
fromStore - specifies the condition under which a stored object matches the given url and params
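
As a rough illustration of how the two callbacks relate, the sketch below keeps records in a plain array. `createStore`, `save` and `has` are hypothetical names and are not part of spiders; the (url, params) argument order follows the Demo at the end of this document and is an assumption.

```javascript
// Hypothetical in-memory store; not the actual internals of spiders.
function createStore({ toStore, fromStore }) {
	const records = [];
	return {
		// Save the record that toStore builds for this request.
		save(url, params) {
			records.push(toStore(url, params));
		},
		// A request counts as already seen when fromStore matches a saved record.
		has(url, params) {
			return records.some(obj => fromStore(obj, url, params));
		}
	};
}

const store = createStore({
	toStore: (url, params) => ({ url, lng: params.lng }),
	fromStore: (obj, url, params) => obj.url === url && obj.lng === params.lng
});

store.save('pathtoSong', { lng: 'en' });
console.log(store.has('pathtoSong', { lng: 'en' })); // true
console.log(store.has('pathtoSong', { lng: 'fr' })); // false
```

Here toStore decides which fields are kept, and fromStore decides which of those fields must agree for a later request to count as a match.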

Persistence

let spider = new Spider({persist:'./songs'});

spider.persist().then(()=>{
	// The spider is now loaded with previously scraped details
	// Call your scrape function here.
});
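
The idea behind persistence can be sketched with a small JSON file. This is only an assumption about the approach: `loadRecords` and `saveRecords` are hypothetical helpers, and the on-disk format spiders actually uses may differ. Synchronous file calls are used to keep the sketch short.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helpers; the file format used by spiders itself is an assumption.
function loadRecords(file) {
	// A missing file simply means nothing has been scraped yet.
	if (!fs.existsSync(file)) return [];
	return JSON.parse(fs.readFileSync(file, 'utf8'));
}

function saveRecords(file, records) {
	fs.writeFileSync(file, JSON.stringify(records));
}

// Use a fresh temporary directory so the sketch does not touch real data.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'spiders-'));
const file = path.join(dir, 'songs.json');

const firstRun = loadRecords(file); // empty, nothing stored yet
saveRecords(file, [...firstRun, { url: 'pathtoSong' }]);

const secondRun = loadRecords(file); // previously stored records are back
console.log(secondRun.length); // 1

fs.rmSync(dir, { recursive: true, force: true });
```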

Methods

crawl( url , params)

Demo

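	let Spider = require('spiders');
	let songSpidy = new Spider({
		persist: "./persist/song",
		// Store only the url of each crawled page
		toStore: (url, params) => {
			return {url};
		},
		// A stored object matches when its url equals the requested url
		fromStore: (obj, url, params) => {
			return obj.url == url;
		}
	});
	// Load previously scraped details, then start scraping
	songSpidy.persist().then(scrape);

	function scrape(){
		songSpidy.crawl('pathtoSong', {lng:'en'}).then($ => {
			let title = $("title").text();
		});
	}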