0.2.3 • Published 5 years ago

pardosa v0.2.3

Weekly downloads: 2
License: MIT
Repository: github
Last release: 5 years ago

Pardosa

A spider (web crawler) framework with a Koa-like API, written in TypeScript.

PS: The repository is still under development, and the APIs may change significantly before version 1.0. Suggestions are welcome.

Features

  • Koa-like API: configure page processing with middleware.
  • Supports scheduled requests, based on node-schedule.
  • Built-in middleware:
    • guard: Prints each request and its processing time.
    • fetch: Uses node-fetch to request pages.
      • ctx.res: node-fetch's Response.
      • ctx.response: Pardosa's Response
          • Exposes three xSelector interfaces: .css(), .xpath(), .re().
          • .$: Equivalent to Cheerio.load(ctx.response.body).
    • Router: Koa Router-like API.
    • schema: Uses XPath to extract data into ctx.state.
    • storage
      • file(): Use after fetch and before router.
        • If ctx.state.file exists, saves ctx.response.body to the path ctx.req.file.
        • If ctx.state.files exists, saves each ctx.state.files[].content to ctx.state.files[].file.
    • inspect: Prints a field of ctx selected by a JSON path, such as state.file.
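
The inspect middleware is described as selecting a field of ctx by a path such as state.file. As a rough illustration of that idea only, the sketch below resolves a dotted path against a plain object. The getByPath helper and the stand-in ctx object are hypothetical and are not part of Pardosa; the real middleware may parse paths differently.

```typescript
// Hypothetical helper, not part of Pardosa: resolve a dotted path such as
// "state.file" against a context-like object.
function getByPath(target: unknown, path: string): unknown {
    let current: unknown = target;
    for (const key of path.split(".")) {
        // Stop as soon as the path walks into a non-object value.
        if (current === null || typeof current !== "object") {
            return undefined;
        }
        current = (current as Record<string, unknown>)[key];
    }
    return current;
}

// A stand-in for the ctx object; the real one is created by Pardosa.
const ctx = { state: { file: "out/index.html" } };

console.log(getByPath(ctx, "state.file"));    // out/index.html
console.log(getByPath(ctx, "state.missing")); // undefined
```

Returning undefined for a missing segment, rather than throwing, keeps a debugging helper like this safe to call on pages where the field was never set.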

Usage

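import * as Pardosa from "pardosa";
import * as fetch from "pardosa/middlewares/fetch";

const spider = new Pardosa({ exitOnIdle: true })
    // Download each queued page with the built-in fetch middleware.
    .use(fetch())
    // Print the HTML matched by the XPath expression //article.
    .use(async function (ctx, next) {
        console.log(ctx.response.xpath('//article').html());
    });

// Queue the first URL, then start the spider.
spider.source.enqueue('https://github.com/plylrnsdy/pardosa');
spider.start();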
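
For readers new to Koa-style middleware, the following self-contained sketch shows how a chain of (ctx, next) functions can run in "onion" order, which is the pattern the .use() calls above rely on. Everything in it is an assumption made for illustration: the Context shape, the compose function (written in the style of koa-compose), and the timer and handler middleware are hypothetical and do not reproduce Pardosa's internals.

```typescript
// Minimal stand-ins; Pardosa's real context and middleware types may differ.
interface Context {
    url: string;
    state: Record<string, unknown>;
}
type Next = () => Promise<void>;
type Middleware = (ctx: Context, next: Next) => Promise<void>;

// Hypothetical composer in the style of koa-compose: each middleware receives
// a `next` function that runs the rest of the chain.
function compose(middlewares: Middleware[]): (ctx: Context) => Promise<void> {
    return (ctx) => {
        const dispatch = (index: number): Promise<void> => {
            if (index === middlewares.length) return Promise.resolve();
            return middlewares[index](ctx, () => dispatch(index + 1));
        };
        return dispatch(0);
    };
}

const order: string[] = [];

// Wraps the downstream middleware and records how long it took.
const timer: Middleware = async (ctx, next) => {
    const started = Date.now();
    order.push("timer:before");
    await next(); // run downstream middleware
    order.push("timer:after");
    ctx.state.elapsedMs = Date.now() - started;
};

// Last middleware in the chain, so it does not need to call next().
const handler: Middleware = async (ctx) => {
    order.push("handler");
    ctx.state.title = `page at ${ctx.url}`;
};

const run = compose([timer, handler]);
const ctx: Context = { url: "https://github.com/plylrnsdy/pardosa", state: {} };
const done = run(ctx).then(() => console.log(order.join(" -> ")));
// prints "timer:before -> handler -> timer:after"
```

A middleware that should wrap later steps, such as a timer, awaits next() in the middle of its body; a final handler, like the one in the usage example, can simply omit the call.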

More examples.

Install

npm i -P pardosa

If you build a spider with Pardosa in TypeScript, also install these type declaration dependencies:

npm i -D @types/node-schedule @types/node-fetch @types/cheerio

Contribution

Submit an issue if you find a bug or have a suggestion.

Or fork the repo and submit a pull request.

About

Author: plylrnsdy

GitHub: pardosa

  • 0.2.3: 5 years ago
  • 0.2.2: 5 years ago
  • 0.2.0: 5 years ago
  • 0.1.9: 5 years ago
  • 0.1.8: 5 years ago
  • 0.1.6: 5 years ago
  • 0.1.5: 5 years ago
  • 0.1.4: 5 years ago
  • 0.1.3: 5 years ago
  • 0.1.2: 5 years ago
  • 0.1.1: 5 years ago
  • 0.1.0: 5 years ago