pageparser v1.2.2
Pageparser is a small CLI tool and library for easy access to HTML/XML elements on local or remote pages.
Installation
$ npm install pageparser
or
$ npm install -g pageparser
Script example
Import Parser
JavaScript:
var Parser = require('pageparser').Parser;
TypeScript:
import {Parser} from "pageparser"
// The calls below use await, so run them inside an async function or an ES module
var parser = new Parser('http://example.com'); // argument may be a ReadStream or a String (URL or file path)
var $ = await parser.load(); // Do you love jQuery? <3
var element = $('h1');
console.log(element.html()); // Example Domain
or
var data = await Parser.process('http://example.com', 'h1', ':html');
console.log(data); // Example Domain
Cheerio Docs
Pageparser uses cheerio.
You can get additional info about it in the cheerio documentation.
Writing custom processors
Call this from the directory where you want the config:
$ pageparser --init-config
to place a .parserconfig.js file in it.
Write your own processor function in the processors section of that file.
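As a rough illustration of what a processor body might look like, here is a minimal sketch. It assumes a processor receives the matched cheerio selection and returns the value to print; that signature and the name trimmedText are assumptions, not taken from pageparser itself. The generated .parserconfig.js is the authority on the real shape and on how a function is registered in the processors section.

```javascript
// Hypothetical processor: takes a cheerio selection and returns its text
// with whitespace collapsed. The (selection) => value signature is an
// assumption; check the generated .parserconfig.js for the real one.
function trimmedText(selection) {
  // .text() is the standard cheerio/jQuery method for combined text content
  return selection.text().replace(/\s+/g, ' ').trim();
}
```

Because the function only relies on .text(), it can be tried out with any object that provides that method before wiring it into the config.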
Running from command line
$ pageparser http://example.com/ "h1" :html
Example Domain
$ cat tests\testpage.html | pageparser "h1" :html
Example Page
$ pageparser "h1" :html < tests\testpage.html
Example Page
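The command-line calls above take the same three pieces (source, selector, processor) as Parser.process in the script example. A minimal sketch of running several selectors against one source from a script follows; extractAll is a hypothetical helper name, and the process function is passed in as an argument so the helper does not hard-code pageparser.

```javascript
// Hypothetical helper: runs several selectors against one source.
// `processFn` is expected to behave like Parser.process(source, selector, processor)
// and to return a promise.
async function extractAll(processFn, source, selectors, processor = ':html') {
  const results = {};
  for (const selector of selectors) {
    // Sequential on purpose: each call may load the source again
    results[selector] = await processFn(source, selector, processor);
  }
  return results;
}
```

With pageparser installed, it could be called inside an async function as extractAll((...args) => Parser.process(...args), 'http://example.com', ['h1', 'p']), which resolves to an object keyed by selector.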
Running tests
$ npm test