promise-parser v0.0.4
#promise-parser
Promise-based HTML/XML parser and web scraper for NodeJS.
##Features
- Fast: uses libxml C bindings
 - Lightweight: no dependencies like jQuery, cheerio, or jsdom
 - Clean: promise based interface- no more nested callbacks
 - Flexible: supports both CSS and XPath selectors
 
##Example
var pp = require('promise-parser');
var parser = new pp();
// scrape all craigslist listings
parser
.get('www.craigslist.org/about/sites') 
.find('h1 + div a')
.set('location')
.follow('@href')
.find('header + table a')
.set('category')
.follow('@href')
.find('p > a')
.follow('@href', { next: '.button.next' })
.set({
    'title':        'section > h2',
    'description':  '#postingbody',
    'subcategory':  'div.breadbox > span[4]',
    'time':         'time@datetime',
    'latitude':     '#map@data-latitude',
    'longitude':    '#map@data-longitude',
    'images[]':     'img@src'
})
.get(function(listing) {
    // do something with listing data
})##Install
npm install promise-parser##Usage
new promise-parser([opts])###opts [object]
- opts.http object - HTTP options given to needle instance
 - opts.http.timeout int - Timeout in milliseconds
 - opts.http.proxy string - Forward requests through HTTP(s) proxy
 - opts.http.concurrency int - Number of simultaneous HTTP requests
 - opts.http.tries int - Number of tries before giving up on a request
 
##Promises
####.parse(string)
Parse an HTML or XML string
HTTP GET request
HTTP POST request
####.find(selector, opts)
Find elements based on selector within the current context
Follow URLs found within the element text or attr
####.set(args)
Find and set values for context.data
// set 'title' to current element text
pp.set('title')
// set 'title' to text of 'a.title'
pp.set('title', 'a.title')
// set multiple
pp.set({
	// set 'title' to text of 'a.title'
	'title':  'a.title',
	// set 'description' to text of 'p.description'
	'description': 'p.description',
	// set 'url' to 'a.permalink' href attribute
	'url': 'a.permalink @href',
	// set 'images[]' to the 'src' attribute of each '<img>'
	'images[]': 'img @src',
});####.then(callback(next))
Calls callback from the context of the current element.
To continue, the callback must call next([context]) at least once.
The context argument can optionally be a new context.
pp.then(function(next) {
	var links = this.find('a');
	this.log('found '+links.length+' links');
	links.forEach(function(link) {
		next(link);
	});
})#####context
The this value of .then callback function is set to the current context.
The context is a libxmljs Element object representing the current HTML/XML element.
In addition to all of the libxmljs Element functions,
each context also supports these functions:
- context.request(url, data, callback(context))
 - context.post(url, data, callback(context))
 - context.log(msg)
 - context.debug(msg)
 - context.error(msg)
 - context.data object
 
####.data(callback(data))
Get data stored in context.data
####.done(callback)
Calls callback when parsing has completely finished
####.log(callback(msg))
Call callback when any log messages are received
####.error(callback(msg))
Call callback when any error messages are received
####.debug(callback(msg))
Call callback when any debug messages are received
##CSS helpers
These CSS helper selectors are provided to simplify complex CSS expressions and to add jQuery-like functionality.
####:contains(string)
Select elements whose contents contain string
####:starts-with(string)
Select elements whose contents start with string
####:ends-with(string)
Select elements whose contents end with string
####:first
Select first element  (shortcut for :first-of-type)
####:first(n), :limit(n)
Select first n elements
####:last
Select last element (shortcut for :last-of-type)
####:last(n)
Select last n elements
####:even
Select even elements
####:odd
Select odd elements
####:skip(n), skip-first(n)
Skip first n elements
####:skip-last(n)
Skip last n elements
####:range(n1, n2)
Select n1 through n2 elements inclusive
####.exampleSelectorn
Select nth element (shortcut for :nth-of-type)
####@attribute
Select attribute
##Dependencies