0.3.18 • Published 4 months ago

@qualweb/crawler v0.3.18

QualWeb Crawler

Crawler mechanism for QualWeb, implemented with puppeteer.

How to install

  $ npm i @qualweb/crawler --save

How to run

  'use strict';

  const puppeteer = require('puppeteer');
  const { Crawler } = require('@qualweb/crawler');


  (async () => {
    const browser = await puppeteer.launch();

    const viewport = {
      // viewport settings, in the format accepted by puppeteer's page.setViewport(). See
      // https://github.com/puppeteer/puppeteer/blob/v8.0.0/docs/api.md#pagesetviewportviewport
    };

    const crawler = new Crawler(browser, 'https://ciencias.ulisboa.pt', viewport);

    // Every option is optional; omit a key to use its default value.
    const options = {
      maxDepth: 2, // max depth to search, 0 to search only the given domain. Default value = -1 (search everything)
      maxUrls: 100, // max number of urls to find. Default value = -1 (search everything)
      timeout: 60, // how many seconds the domain should be crawled before stopping. Default value = -1 (never stops)
      maxParallelCrawls: 10, // max urls to crawl at the same time. Default value = 5
      logging: true // logs domain, current depth, urls found and time passed to the terminal
    };

    await crawler.crawl(options);

    await browser.close();

    const urls = crawler.getResults();

    console.log(urls);
  })();
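
The option defaults listed in the comments above can be made explicit before calling crawl(). The following is only a sketch: withDefaults and DEFAULT_OPTIONS are hypothetical names, not part of @qualweb/crawler, and the logging default of false is an assumption, since this README does not state it.

```javascript
'use strict';

// Defaults taken from the option comments in "How to run".
// ASSUMPTION: `logging: false` is a guess; the README gives no default for it.
const DEFAULT_OPTIONS = Object.freeze({
  maxDepth: -1,          // -1 = no depth limit
  maxUrls: -1,           // -1 = no limit on urls found
  timeout: -1,           // -1 = never stops
  maxParallelCrawls: 5,
  logging: false
});

// Hypothetical helper: merges user options over the documented defaults
// and rejects keys that are not documented crawl options.
function withDefaults(options = {}) {
  for (const key of Object.keys(options)) {
    if (!Object.prototype.hasOwnProperty.call(DEFAULT_OPTIONS, key)) {
      throw new TypeError(`Unknown crawl option: ${key}`);
    }
  }
  return { ...DEFAULT_OPTIONS, ...options };
}

console.log(withDefaults({ maxDepth: 2, maxUrls: 100 }));
```

The result can be passed as crawler.crawl(withDefaults({ maxDepth: 2 })). Rejecting unknown keys catches typos such as maxDepht, which would otherwise leave the crawl running with the default value.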

License

ISC

Version history

  0.3.18   4 months ago
  0.3.17   5 months ago
  0.3.15   1 year ago
  0.3.14   2 years ago
  0.3.13   2 years ago
  0.3.12   3 years ago
  0.3.11   3 years ago
  0.3.10   3 years ago
  0.3.9    3 years ago
  0.3.8    3 years ago
  0.3.7    3 years ago
  0.3.6    3 years ago
  0.3.5    3 years ago
  0.3.4    3 years ago
  0.3.3    3 years ago
  0.3.2    3 years ago
  0.3.0    3 years ago
  0.3.1    3 years ago
  0.2.2    4 years ago
  0.2.1    4 years ago
  0.2.0    4 years ago
  0.1.1    4 years ago
  0.1.0    4 years ago
  0.0.1    5 years ago