canner-extract

An HTML extractor for Canner.

Install

npm install -g canner-extract

Usage

  Usage: canner-extract <html_file, default value: "index.html">

  Options:

    -h, --help     output usage information
    -V, --version  output the version number
    -m, --manually  manually naming text node

Auto-extracting

Extract an HTML file into canner.json and a layout automatically.

canner-extract index.html

Before

A sample HTML file:

<html>
  <title>
  </title>

  <body>
    This is text 1

    <p> This is text 2 </p>

    <span> This is text 3 </span>

    <p> This is text 4 <span> This is text 5 </span> </p>

    This is text 6
  </body>

</html>

After

test_after.html:

<html>

<head>
    <title>
    </title>

</head>

<body>{{0}}
    <p>{{1}}</p>

    <span>{{2}}</span>

    <p>{{3}}<span>{{4}}</span> </p>{{5}}</body>

</html>

test_output.json:

{
    "0": "This is text 1",
    "1": "This is text 2",
    "2": "This is text 3",
    "3": "This is text 4",
    "4": "This is text 5",
    "5": "This is text 6"
}
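The Before/After pair above can be approximated with a few lines of JavaScript. This is a toy sketch, not canner-extract's implementation: the function name `extractText` is hypothetical, and it uses a regular expression where a real tool would walk a parsed DOM, so it will not cope with comments, scripts, or `<` inside text.

```javascript
// Toy extractor (hypothetical helper, NOT the package's code): replaces each
// non-empty run of text between tags with a numbered {{n}} placeholder and
// collects the trimmed text under the same number.
function extractText(html) {
  const json = {};
  let index = 0;
  // Match whatever sits between one tag's '>' and the next tag's '<'.
  const layout = html.replace(/>([^<]*)</g, (match, text) => {
    const trimmed = text.trim();
    if (trimmed === '') return match; // whitespace only: leave untouched
    json[String(index)] = trimmed;
    return '>{{' + index++ + '}}<';
  });
  return { html: layout, json: json };
}

const sample = '<body>\n  This is text 1\n  <p> This is text 2 </p>\n</body>';
const result = extractText(sample);
console.log(result.html);
console.log(JSON.stringify(result.json, null, 4));
```

The returned object mirrors the `html` and `json` fields that the API section logs, which is an illustrative choice rather than a guarantee about the real return value.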

Manually-extracting

Extract an HTML file into canner.json and a layout, naming each text node manually.

canner-extract -m index.html

This prompts you for input for each text node; your answers are used to fill in canner.json.

Result

canner.json:

{
    "layout": "layout.hbs",
    "filename": "output.html",
    "data": {
        "name": "Willis Corto",
        "side-title1": "I got reprogrammed by a rogue AI",
        "side-title2": "and now I'm totally cray",
        "side-tab1": "About",
        "side-tab2": "Things I Can Do",
        "side-tab3": "A Few Accomplishments",
        "side-tab4": "Contact",
        "side-twitter": "Twitter",
        "side-facebook": "Facebook",
        "side-instagram": "Instagram",
        "side-github": "Github",
        "side-email": "Email",
        "main-title1": "Read Only",
        "main-title2": "Things I Can Do",
        "main-subtitle1": "Just an incredibly simple responsive site",
        "main-subtitle2": "template freebie by",
        "main-content1": "Faucibus sed lobortis aliquam lorem blandit. Lorem eu nunc metus col. Commodo id in arcu ante lorem ipsum sed accumsan erat praesent faucibus commodo ac mi lacus
        ....
        ...
    }
}

layout.hbs:

...
...

<body>
    <div id="wrapper">

        <!-- Header -->
        <section id="header" class="skel-layers-fixed">
            <header>
                <span class="image avatar"><img src="images/avatar.jpg" alt=""></span>
                <h1 id="logo"><a href="#">{{name}}</a></h1>
                <p>{{side-title1}}
                    <br>{{side-title2}}</p>
            </header>
            <nav id="nav">
                <ul>
                    <li><a href="#one" class="active">{{side-tab1}}</a>
                    </li>
                    <li><a href="#two">{{side-tab2}}</a>
                    </li>
                    <li><a href="#three">{{side-tab3}}</a>
                    </li>
                    <li><a href="#four">{{side-tab4}}</a>
                    </li>
                </ul>
            </nav>

...
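To see how canner.json and layout.hbs fit together, the sketch below fills the placeholders of a layout fragment with values from the `data` object. It assumes plain `{{key}}` substitution only. The helper name `renderLayout` is hypothetical, and unlike a real Handlebars renderer it does no HTML escaping and supports no helpers or blocks.

```javascript
// Toy renderer (hypothetical helper): swaps each {{key}} for data[key].
// Unknown keys are left in place so missing data is easy to spot.
function renderLayout(layout, data) {
  return layout.replace(/\{\{\s*([\w-]+)\s*\}\}/g, (placeholder, key) =>
    Object.prototype.hasOwnProperty.call(data, key) ? String(data[key]) : placeholder
  );
}

// Fragment and values taken from the layout.hbs and canner.json shown above.
const layout =
  '<h1 id="logo"><a href="#">{{name}}</a></h1>\n' +
  '<p>{{side-title1}}<br>{{side-title2}}</p>';
const data = {
  'name': 'Willis Corto',
  'side-title1': 'I got reprogrammed by a rogue AI',
  'side-title2': "and now I'm totally cray"
};

console.log(renderLayout(layout, data));
```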

API

autoParse(html, opt)

html: the absolute path to your HTML file.

Returns a promise that resolves to a result object with html and json properties.

var canner_extract = require('canner-extract');

canner_extract.autoParse(html, opt)
  .then(function(result) {
    // console.log(result.html);
    // console.log(result.json);
  });
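Because the path must be absolute, a small wrapper can resolve a relative file name before calling `autoParse`. The following is a sketch under assumptions: the helper name `autoParseFile` is hypothetical, the extractor object is passed in so the helper can be tried with a stub, and passing `{}` as `opt` is a guess, since the option keys are not documented here.

```javascript
const path = require('path');

// `extractor` is expected to be the object returned by
// require('canner-extract') (assumed module name). Passing {} as `opt`
// is an assumption: the accepted option keys are not documented.
function autoParseFile(extractor, file) {
  // Resolve against the current working directory to get an absolute path.
  const absolutePath = path.resolve(file);
  return extractor.autoParse(absolutePath, {});
}

// Usage, assuming the package is installed:
// autoParseFile(require('canner-extract'), 'index.html')
//   .then(function (result) {
//     console.log(result.html);
//     console.log(result.json);
//   });
```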

manuallyParse(html, opt)

html: the absolute path to your HTML file.

Returns a promise.

canner_extract.manuallyParse(html, opt)
  .then(function(result) {
    // console.log(result.html);
    // console.log(result.json);
  });

Example

https://github.com/Canner/readonly-can

License

MIT
