html-aggregator v0.0.12
Aggregate html snippets from other pages.
Usage
Install with npm install -g html-aggregator.
Run with html-aggregator --templateDir=<directory> --output=<file> --maxLen=<number> <input files...>.
templateDir contains JSON files that define how to extract data from HTML files:
{
"selectors": {
"title": "header.post-header h1",
"content": "article.post-content"
},
"static": {
"name": "My Name"
}
}
The values in selectors are CSS selectors that are applied to the input HTML files. static contains static strings.
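As a rough illustration of how such a template might turn into variables, here is a minimal sketch. It is not the tool's actual code: buildVariables and queryHtml are hypothetical names, the DOM lookup is replaced by a stand-in object, and letting static values win on a name clash is an assumption.

```javascript
// Hypothetical sketch: turn a template definition into a variable map.
// `queryHtml` is an assumed callback that returns the HTML matched by a CSS
// selector; a real implementation would use a DOM library for this.
function buildVariables(template, queryHtml) {
  const variables = {};
  for (const [name, selector] of Object.entries(template.selectors || {})) {
    variables[name] = queryHtml(selector);
  }
  // Assumption: static values are copied as-is and win on a name clash.
  return Object.assign(variables, template.static || {});
}

const template = {
  selectors: {
    title: "header.post-header h1",
    content: "article.post-content",
  },
  static: { name: "My Name" },
};

// Stand-in for a real DOM query, for illustration only.
const fakePage = {
  "header.post-header h1": "Hello",
  "article.post-content": "<p>First post</p>",
};

const variables = buildVariables(template, (selector) => fakePage[selector]);
console.log(variables);
```

Passing the lookup in as a callback keeps the sketch free of any particular HTML parsing library.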
output is a file defining how to render the scraped data:
<h1>%title%</h1>
<div>By %name%</div>
<div>%content%</div>
The variables defined in a template are referenced with the expression %var%.
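The %var% substitution can be pictured with a short sketch. The render function below is a hypothetical helper, not part of html-aggregator, and leaving unknown variables untouched is an assumption about behaviour the README does not specify.

```javascript
// Hypothetical helper: substitute %var% placeholders in an output template.
// Assumption: placeholders without a matching variable are left unchanged.
function render(outputTemplate, variables) {
  return outputTemplate.replace(/%(\w+)%/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(variables, name)
      ? String(variables[name])
      : match
  );
}

const outputTemplate = [
  "<h1>%title%</h1>",
  "<div>By %name%</div>",
  "<div>%content%</div>",
].join("\n");

const rendered = render(outputTemplate, {
  title: "Hello",
  name: "My Name",
  content: "<p>First post</p>",
});
console.log(rendered);
```

A replacer function is used instead of a replacement string so that characters such as $ in the scraped content are inserted literally.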
For every occurrence of <aggregate url="..." template="..."></aggregate> in every input file
- the content of the given URL is fetched
- the content is parsed with the given template
- if the input file name has the form <name>.html.<ext>, a new file <name>.html is created in which every <aggregate> is replaced by the output file with its variables substituted. Otherwise, the child nodes of each <aggregate> are replaced with the output file with its variables substituted.
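The file naming rule can be sketched as follows. The function name derivedOutputPath is hypothetical, and the exact pattern the tool uses to recognise <name>.html.<ext> is an assumption based on the description above.

```javascript
const path = require("path");

// Hypothetical helper: decide whether an input file yields a new file.
// Returns the path of the new <name>.html file when the input is named
// <name>.html.<ext>, and null otherwise (the case in which the child nodes
// of each <aggregate> are replaced instead).
function derivedOutputPath(inputPath) {
  const base = path.basename(inputPath);
  // Assumption: <ext> is a single extension without further dots.
  const match = /^(.+\.html)\.[^.]+$/.exec(base);
  return match ? path.join(path.dirname(inputPath), match[1]) : null;
}

console.log(derivedOutputPath("posts/index.html.tpl"));
console.log(derivedOutputPath("posts/index.html"));
```

The file names used here are made up for illustration.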