pageparser v1.2.2

Weekly downloads: 3
License: MIT
Repository: github
Last release: 7 years ago

Pageparser is a small CLI tool for easy access to HTML/XML elements on local or remote pages.

Installation

$ npm install pageparser

or

$ npm install -g pageparser

Script example

Import Parser

JavaScript:

var Parser = require('pageparser').Parser;

TypeScript:

import { Parser } from "pageparser";

Load a page and query it (inside an async function):

var parser = new Parser('http://example.com'); // argument may be a ReadStream or a string (URL or file path)
var $ = await parser.load(); // Do you love jQuery? <3
var element = $('h1');
console.log(element.html()); // Example Domain

or

var data = await Parser.process('http://example.com', 'h1', ':html');
console.log(data); // Example Domain
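
The snippets above use `await`, which in a CommonJS script is only valid inside an async function. Here is a minimal sketch of one way to wrap the call, assuming the `Parser.process(source, selector, ':html')` form shown above. `fetchHeading` is a hypothetical helper name, and the `Parser` class is passed in as an argument so the sketch does not depend on how the package is imported.

```javascript
// Hypothetical helper (not part of pageparser): wraps the documented
// Parser.process call in an async function and adds context to failures.
async function fetchHeading(Parser, url) {
  try {
    // Same selector and third argument as in the example above
    return await Parser.process(url, 'h1', ':html');
  } catch (err) {
    // Any rejection from Parser.process ends up here; rethrow with context
    throw new Error(`Could not read <h1> from ${url}: ${err.message}`);
  }
}

// Usage (assumes pageparser is installed):
// const { Parser } = require('pageparser');
// fetchHeading(Parser, 'http://example.com').then(console.log);
```

Passing `Parser` in also makes the helper easy to exercise with a stub, without touching the network.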

Cheerio Docs

Pageparser uses cheerio. See the cheerio documentation for additional information about it.

Writing custom processors

  1. Run $ pageparser --init-config from the target directory to create a .parserconfig.js file there

  2. Write your own processor function in the processors section of that file
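
The layout of the generated .parserconfig.js is not shown in this README, so the following is only a guess at what step 2 might look like. It assumes the file exports an object with a `processors` map and that each processor receives the matched cheerio element. The `:upper` name and the function signature are hypothetical; check the generated file for the real shape before copying this.

```javascript
// .parserconfig.js (hypothetical layout; compare with the file that
// `pageparser --init-config` actually generates)
const config = {
  processors: {
    // Assumed signature: receives the matched cheerio element and
    // returns the string to print
    ':upper': (element) => element.text().trim().toUpperCase(),
  },
};

module.exports = config;
```

If the assumptions hold, such a processor would be selected the same way `:html` is in the command-line examples.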

Running from command line

$ pageparser http://example.com/ "h1" :html

Example Domain

$ cat tests/testpage.html | pageparser "h1" :html

Example Page

$ pageparser "h1" :html < tests/testpage.html

Example Page
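
The piped invocations have a script-side counterpart, since the `Parser` constructor is documented as accepting a ReadStream. This is a minimal sketch, assuming `load()` behaves for streams as it does for URLs. `headingFromStream` is a hypothetical helper name, and `Parser` is passed in rather than imported.

```javascript
// Hypothetical helper (not part of pageparser): loads a page from a
// readable stream and returns the inner HTML of the first <h1>.
async function headingFromStream(Parser, stream) {
  const parser = new Parser(stream);
  const $ = await parser.load();
  return $('h1').html();
}

// Usage (assumes pageparser is installed):
// const fs = require('fs');
// const { Parser } = require('pageparser');
// headingFromStream(Parser, fs.createReadStream('tests/testpage.html')).then(console.log);
// headingFromStream(Parser, process.stdin).then(console.log);
```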

Running tests

$ npm test
