1.1.10 • Published 6 years ago
osmosis v1.1.10
Osmosis
HTML/XML parser and web scraper for NodeJS.
Features
- Uses native libxml C bindings
- Clean promise-like interface
- Supports CSS 3.0 and XPath 1.0 selector hybrids
- Sizzle selectors, Slick selectors, and more
- No large dependencies like jQuery, cheerio, or jsdom
- Compose deep and complex data structures
HTML parser features
- Fast parsing
- Very fast searching
- Small memory footprint
HTML DOM features
- Load and search ajax content
- DOM interaction and events
- Execute embedded and remote scripts
- Execute code in the DOM
HTTP request features
- Logs urls, redirects, and errors
- Cookie jar and custom cookies/headers/user agent
- Login/form submission, session cookies, and basic auth
- Single proxy or multiple proxies and handles proxy failure
- Retries and redirect limits
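The request features above can be combined on a single chain. The following is a minimal sketch, not the library's canonical usage: `startScrape` and its `options` fields are hypothetical names, and the `tries` and `user_agent` option names and the `.config()`, `.header()`, `.cookie()`, and `.proxy()` commands are assumptions that should be checked against the Osmosis documentation. The `osmosis` module is passed in as an argument so the function can be tried with a stub.

```javascript
// Sketch: build a configured request chain.
// `osmosis` is injected; pass require('osmosis') in real use.
// ASSUMPTION: option names and the header/cookie/proxy commands below
// should be verified against the Osmosis documentation.
function startScrape(osmosis, url, options) {
    options = options || {};

    var chain = osmosis
        .get(url)
        // number of attempts per request (assumed option name)
        .config('tries', options.tries || 3)
        // custom user agent (assumed to be forwarded to the HTTP layer)
        .config('user_agent', options.userAgent || 'example-scraper/1.0')
        // custom header and cookie sent with requests on this chain
        .header('Accept-Language', 'en')
        .cookie('visited', '1');

    // A proxy is only added when the caller supplies one.
    if (options.proxy) {
        chain = chain.proxy(options.proxy);
    }

    // Report request errors.
    return chain.error(console.error);
}
```

In real use this would be called as `startScrape(require('osmosis'), url, { proxy: '...' })`, followed by the usual `.find()`, `.set()`, and `.data()` calls.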
Example
    var osmosis = require('osmosis');

    osmosis
        .get('www.craigslist.org/about/sites')
        .find('h1 + div a')
        .set('location')                  // store the link text as `location`
        .follow('@href')                  // open the linked page
        .find('header + div + div li > a')
        .set('category')
        .follow('@href')
        .paginate('.totallink + a.button.next:first') // repeat for each "next" page
        .find('p > a')
        .follow('@href')
        .set({
            'title':       'section > h2',
            'description': '#postingbody',
            'subcategory': 'div.breadbox > span[4]',
            'date':        'time@datetime',
            'latitude':    '#map@data-latitude',
            'longitude':   '#map@data-longitude',
            'images':      ['img@src']
        })
        .data(function (listing) {
            // do something with listing data
        })
        .log(console.log)
        .error(console.log)
        .debug(console.log);
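As one possible way to fill in the `.data()` callback from the example, the sketch below tidies each listing before it is stored. `normalizeListing`, `collect`, and `listings` are hypothetical names, not part of Osmosis. It assumes the extracted values arrive as strings, that a field may be missing when its selector matches nothing, and that `images` may arrive as a single string or as an array.

```javascript
// Hypothetical helper: tidy one object received by the .data() callback.
// Field names mirror the .set({...}) call in the example.
function normalizeListing(listing) {
    var lat = parseFloat(listing.latitude);
    var lon = parseFloat(listing.longitude);

    return {
        title: (listing.title || '').trim(),
        description: (listing.description || '').trim(),
        date: listing.date || null,
        // Attribute values are assumed to be strings; convert to numbers,
        // falling back to null when missing or not numeric.
        latitude: isNaN(lat) ? null : lat,
        longitude: isNaN(lon) ? null : lon,
        // Always return an array, whether one image or many were found.
        images: [].concat(listing.images || [])
    };
}

// Collect normalized listings as they arrive.
var listings = [];
function collect(listing) {
    listings.push(normalizeListing(listing));
}
```

With this in place, the callback in the example could be written as `.data(collect)`.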
Documentation
For documentation and examples, see https://rchipka.github.io/node-osmosis/global.html
Dependencies
- libxmljs-dom - DOM wrapper for libxmljs C bindings
- needle - Lightweight HTTP wrapper
Donate
Please consider a donation if you depend on web scraping and Osmosis makes your job a bit easier. Your contribution allows me to spend more time making this the best web scraper for Node.
Version history
- 1.1.10: 6 years ago
- 1.1.9: 6 years ago
- 1.1.8: 7 years ago
- 1.1.7: 7 years ago
- 1.1.6: 7 years ago
- 1.1.5: 8 years ago
- 1.1.4: 8 years ago
- 1.1.3: 8 years ago
- 1.1.2: 9 years ago
- 1.1.1: 9 years ago
- 1.1.0: 9 years ago
- 1.0.2: 9 years ago
- 1.0.1: 9 years ago
- 1.0.0: 9 years ago
- 0.1.3: 9 years ago
- 0.1.2: 9 years ago
- 0.1.1: 10 years ago
- 0.1.0: 10 years ago
- 0.0.9: 10 years ago
- 0.0.8: 10 years ago
- 0.0.7: 10 years ago
- 0.0.6: 10 years ago
- 0.0.5: 10 years ago
- 0.0.4: 10 years ago
- 0.0.3: 10 years ago
- 0.0.2: 10 years ago
- 0.0.1: 10 years ago