node-phantom-simple v2.2.4
This module is API-compatible with node-phantom but doesn't rely on WebSockets / socket.io. In essence, the communication between Node and Phantom / Slimer has been simplified significantly. It has the following advantages over node-phantom:
- Fewer dependencies/layers.
- Doesn't use the unreliable and huge socket.io.
- Works under cluster (node-phantom does not, due to how server.listen(0) works in cluster).
- Supports SlimerJS.
Migrating 1.x -> 2.x
Your software should work without changes, but it may show deprecation warnings about outdated signatures. You need to update:
- options.phantomPath -> options.path
- in .create(), .evaluate() & .waitForSelector() -> move callback to the last position of the arguments list.
That's all!
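As a rough illustration of the two changes above, the sketch below renames the option and shows the callback move in comments. `migrateCreateOptions` is a hypothetical helper, not part of node-phantom-simple, and the 1.x call shapes in the comments are assumptions based on the node-phantom convention of passing the callback before any extra arguments.

```javascript
// Hypothetical helper (not part of the library): converts 1.x-style
// create() options to the 2.x naming.
function migrateCreateOptions(oldOptions) {
  var options = Object.assign({}, oldOptions);
  if (options.phantomPath !== undefined) {
    // Keep an explicitly given 2.x `path` if both names are present.
    if (options.path === undefined) {
      options.path = options.phantomPath;
    }
    delete options.phantomPath;
  }
  return options;
}

// Callback position (1.x shapes are assumed; 2.x shapes follow the rule above):
//   1.x: driver.create(callback, options);
//   2.x: driver.create(options, callback);
//   1.x: page.evaluate(fn, callback, arg1, arg2);
//   2.x: page.evaluate(fn, arg1, arg2, callback);
```

Running old options through such a helper once, at the place where they are loaded, should avoid the deprecation warning without touching the rest of the configuration.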
Installing
npm install node-phantom-simple
# Also need phantomjs OR slimerjs:
npm install phantomjs
# OR
npm install slimerjs
Note. SlimerJS is not headless and requires a windowing environment.
Under Linux/FreeBSD/OSX, xvfb can be used to run it headlessly. For example, if you wish
to run SlimerJS on Travis-CI, add these lines to your .travis.yml config:
before_script:
- export DISPLAY=:99.0
- "sh -e /etc/init.d/xvfb start"
Usage
You can use it exactly like node-phantom, and the entire API of PhantomJS should work, with the exception that every method call takes a callback (always as the last parameter), instead of returning values.
For example, this is an adaptation of a web scraping example:
var driver = require('node-phantom-simple');

driver.create({ path: require('phantomjs').path }, function (err, browser) {
  return browser.createPage(function (err, page) {
    return page.open("http://tilomitra.com/repository/screenscrape/ajax.html", function (err, status) {
      console.log("opened site? ", status);
      page.includeJs('http://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js', function (err) {
        // jQuery Loaded.
        // Wait for a bit for AJAX content to load on the page. Here, we are waiting 5 seconds.
        setTimeout(function () {
          return page.evaluate(function () {
            // Get what you want from the page using jQuery. A good way is to populate an object
            // with all the jQuery commands that you need and then return the object.
            var h2Arr = [],
                pArr = [];

            $('h2').each(function () { h2Arr.push($(this).html()); });
            $('p').each(function () { pArr.push($(this).html()); });

            return {
              h2: h2Arr,
              p: pArr
            };
          }, function (err, result) {
            console.log(result);
            browser.exit();
          });
        }, 5000);
      });
    });
  });
});
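The example above ignores every `err` argument for brevity. A minimal sketch of the same flow with the errors checked might look like the following. The `scrape` function and its parameter names are illustrative and not part of the library; the driver is passed in as an argument so that the function does not depend on how the module was loaded. It assumes `page.open` reports `'success'` on a successful load, as PhantomJS does.

```javascript
// Sketch: create an engine, open a page, evaluate a function, always exit.
// `scrape` is an illustrative name, not a library function.
function scrape(driver, options, url, pageFn, done) {
  driver.create(options, function (err, browser) {
    if (err) { return done(err); }

    // Shut the engine down before reporting the outcome, success or not.
    function finish(err, result) {
      browser.exit();
      done(err, result);
    }

    browser.createPage(function (err, page) {
      if (err) { return finish(err); }

      page.open(url, function (err, status) {
        if (err) { return finish(err); }
        if (status !== 'success') {
          return finish(new Error('Failed to open ' + url + ' (status: ' + status + ')'));
        }
        // pageFn runs inside the page; its return value arrives as `result`.
        page.evaluate(pageFn, finish);
      });
    });
  });
}
```

It would be called as `scrape(require('node-phantom-simple'), { path: require('phantomjs').path }, url, fn, callback)`, with `callback` receiving `(err, result)`.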
.create(options, callback)
options (not mandatory):
- path (String) - path to phantomjs/slimerjs, if not set - will search in $PATH
- parameters (Object|Array) - CLI params for the executed engine, as { name: value }. You can also pass an array to use verbatim names and values.
- ignoreErrorPattern (RegExp) - a regular expression that can be used to silence spurious warnings in the console, generated by Qt and PhantomJS. On Mavericks, you can use /CoreText/ to suppress some common annoying font-related warnings.
For example:
driver.create({ parameters: { 'ignore-ssl-errors': 'yes' } }, callback)
driver.create({ parameters: ['-jsconsole', '-P', 'myVal'] }, callback)
The first call will start phantom as:
phantomjs --ignore-ssl-errors=yes
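The way the parameters option turns into a command line can be pictured with a small sketch. This is an illustration of the behaviour described above, not the library's actual code, and `toCliArgs` is a made-up name: an object becomes `--name=value` pairs, while an array is passed through verbatim.

```javascript
// Illustrative only: mimics the documented mapping, not the real implementation.
function toCliArgs(parameters) {
  if (!parameters) {
    return [];
  }
  if (Array.isArray(parameters)) {
    // Arrays are used verbatim, so any flag style the engine accepts will do.
    return parameters.slice();
  }
  // Objects are expanded to --name=value.
  return Object.keys(parameters).map(function (name) {
    return '--' + name + '=' + parameters[name];
  });
}
```

With the two calls above, this sketch yields `['--ignore-ssl-errors=yes']` and `['-jsconsole', '-P', 'myVal']` respectively.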
You can rely on globally installed engines, but we recommend passing the path explicitly:
driver.create({ path: require('phantomjs').path }, callback)
// or for slimer
driver.create({ path: require('slimerjs').path }, callback)
You can also have a look at the test directory to see some examples of using the API; however, the de facto reference is the PhantomJS documentation. Just mentally substitute all return values for callbacks.
WebPage Callbacks
All of the WebPage callbacks have been implemented, including onCallback, and are set the same way as with the core phantomjs library:
page.onResourceReceived = function(response) {
  console.log('Response (#' + response.id + ', stage "' + response.stage + '"): ' + JSON.stringify(response));
};
This includes the onPageCreated callback, which receives a new page object.
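Since onPageCreated hands over a new page object, handlers usually need to be attached to that child page as well. A minimal sketch, assuming the page object accepts plain property assignment as shown above; `attachLogging` and the `log` parameter are illustrative names, not library functions.

```javascript
// Sketch: attach the same handlers to a page and to every page it spawns.
function attachLogging(page, log) {
  // Fired for console.log() calls made inside the page.
  page.onConsoleMessage = function (msg) {
    log('console: ' + msg);
  };

  // Fired when the page opens a child page (for example via window.open).
  page.onPageCreated = function (newPage) {
    log('child page created');
    attachLogging(newPage, log);
  };

  return page;
}
```

A typical call would be `attachLogging(page, console.log)` right after `createPage` returns the page.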
Properties
Properties on the WebPage and Phantom objects are accessed via the get() / set() method calls:
page.get('content', function (err, html) {
  console.log("Page HTML is: " + html);
});
page.set('zoomFactor', 0.25, function () {
  page.render('capture.png');
});
// You can get/set nested values easily!
page.set('settings.userAgent', 'PhAnToSlImEr', callback);
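Because each set() call completes through its callback, applying several properties in a fixed order means chaining the calls. A minimal sketch of that, assuming the set() callback receives an error as its first argument like the other methods; `setAll` is a hypothetical helper, not part of the library.

```javascript
// Sketch: apply several properties one after another, stopping on the first error.
function setAll(page, props, done) {
  var names = Object.keys(props);

  function next(i) {
    if (i === names.length) {
      return done(null);
    }
    page.set(names[i], props[names[i]], function (err) {
      if (err) { return done(err); }
      next(i + 1);
    });
  }

  next(0);
}
```

For example, `setAll(page, { 'settings.userAgent': 'PhAnToSlImEr', zoomFactor: 0.25 }, callback)` would set both values before `callback` runs.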
Known issues
Engines are buggy. Here are some cases you should know about.
- .evaluate can return a corrupted result:
  - SlimerJS: undefined -> null.
  - PhantomJS:
    - undefined -> null
    - null -> '' (empty string)
    - [1, undefined, 2] -> null
- A page.onConfirm() handler cannot return a value due to the async nature of the driver. Use .setFn() instead: page.setFn('onConfirm', function () { return true; }).
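One possible way to sidestep the .evaluate corruption listed above is to serialize the result to a JSON string inside the page and parse it on the Node side. This is a sketch under the assumption that plain strings cross the bridge unchanged, which is not verified here for every engine version. `evaluateJson` and `pageSide` are illustrative names, not library functions.

```javascript
// Runs inside the page, so it must be self-contained (no outer variables).
function pageSide() {
  var values = [1, undefined, 2];
  // JSON.stringify encodes undefined array items as null and returns a string.
  return JSON.stringify(values);
}

// Node side: evaluate a function that returns JSON text, then parse it.
function evaluateJson(page, fn, done) {
  page.evaluate(fn, function (err, json) {
    if (err) { return done(err); }
    var value;
    try {
      value = JSON.parse(json);
    } catch (e) {
      return done(e);
    }
    done(null, value);
  });
}
```

With this approach the array from the list above would come back as `[1, null, 2]` rather than as a bare `null`, so the position of the missing item is preserved.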
License
Other
Made by Matt Sergeant for Hubdoc Inc.