0.0.8 • Published 4 years ago

@calba1114/autoscrape v0.0.8

Weekly downloads: 9
License: ISC
Repository: github
Last release: 4 years ago

Purpose

@calba1114/autoscrape was created to simplify scraping and analyzing any HTML or XHTML page. You can use powerful Array methods such as reduce, map, and filter to sort and search through a list of objects that hold HTML data as key-value pairs.

Future Features

Filtering through parent, sibling, and child elements is planned for the next release.

Example Code

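const { DataCrawler } = require("@calba1114/autoscrape");

(async () => {
    const crawler = DataCrawler();
    // Download the page source as a string.
    const rawHTML = await crawler.fetchPageText("https://alligator.io/js/filter-array-method/");
    // Turn the markup into an array of element objects.
    const elements = crawler.unlinkAll(rawHTML);
    elements
        // Keep elements whose text mentions "Filter" ...
        .filter(e => e.textContent.includes('Filter'))
        // ... that are DIVs ...
        .filter(e => e.tagName === "DIV")
        // ... and that carry the class "article-content".
        .filter(e => e.attributes.get('class') === 'article-content')
        .forEach(e => console.log(e));
})();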
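
The example above only uses filter. As a minimal sketch of how reduce, map, and sort could be applied to the same kind of list, the snippet below works on hand-written stand-in objects instead of a live page. The object shape (tagName, textContent, and attributes as a Map) is inferred from the example above and is an assumption, not the library's documented API; sampleElements and the other variable names are hypothetical.

```javascript
// Hypothetical stand-ins for the objects returned by crawler.unlinkAll().
// Assumed shape: { tagName, textContent, attributes: Map }.
const sampleElements = [
    { tagName: "DIV", textContent: "Filter basics", attributes: new Map([["class", "article-content"]]) },
    { tagName: "P", textContent: "Filter returns a new array", attributes: new Map() },
    { tagName: "A", textContent: "Home", attributes: new Map([["href", "/"]]) },
    { tagName: "P", textContent: "Map and reduce", attributes: new Map() },
];

// reduce: count how many elements exist per tag name.
const countsByTag = sampleElements.reduce((counts, e) => {
    counts[e.tagName] = (counts[e.tagName] || 0) + 1;
    return counts;
}, {});

// map + sort: collect every element's text in alphabetical order.
const sortedText = sampleElements
    .map(e => e.textContent)
    .sort((a, b) => a.localeCompare(b));

// filter + map: collect the href of every anchor that has one.
const links = sampleElements
    .filter(e => e.tagName === "A" && e.attributes.has("href"))
    .map(e => e.attributes.get("href"));

console.log(countsByTag);
console.log(sortedText);
console.log(links);
```

With real data, sampleElements would be replaced by the array returned from the crawler, provided its objects expose the same fields.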

New unlinkQuery Function (Uses Native Query Strings)

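const { DataCrawler } = require("@calba1114/autoscrape");

(async () => {
    const crawler = DataCrawler();
    // Download the page source as a string.
    const rawHTML = await crawler.fetchPageText("https://alligator.io/js/filter-array-method/");
    // Select elements with a query string. Assuming standard CSS selector
    // semantics, the space in 'div .article-content' makes this a descendant
    // selector: elements with class "article-content" nested inside a div.
    const elements = crawler.unlinkQuery(rawHTML, 'div .article-content');
    elements
        // Keep only the matches whose text mentions "Filter".
        .filter(e => e.textContent.includes('Filter'))
        .forEach(e => console.log(e));
})();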

Versions

0.0.8 • 4 years ago
0.0.7 • 5 years ago
0.0.6 • 5 years ago
0.0.5 • 5 years ago
0.0.4 • 5 years ago
0.0.3 • 5 years ago
0.0.2 • 5 years ago
0.0.1 • 5 years ago