0.0.44 • Published 3 months ago

scrap-ai v0.0.44

Web Scraping Helper

A lightweight TypeScript library for asynchronous web scraping with customizable prompts and callback support.

Installation

# NPM
npm install scrap-ai

# Yarn
yarn add scrap-ai

# Deno
import { ScrapeClient } from "https://deno.land/x/scrap_ai/mod.ts";

For NPM or Yarn installs, import the client:

// ESM/TypeScript
import { ScrapeClient } from "scrap-ai";

// CommonJS
const { ScrapeClient } = require("scrap-ai");

Features

  • 🤖 AI-powered data extraction
  • 🔄 Asynchronous processing with callback support
  • 🔒 Secure webhook verification
  • 📦 TypeScript support
  • 🌐 Cross-platform (Node.js and Deno)

Usage

The library provides a ScrapeClient class for initiating web scraping operations:

import { ScrapeClient } from "scrap-ai";

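// Initialize the client with your API key.
// process.env values may be undefined, so fail early if the key is missing.
const apiKey = process.env.SCRAP_API_KEY;
if (!apiKey) throw new Error("SCRAP_API_KEY is not set");
const scrapeClient = new ScrapeClient(apiKey);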

// Basic scraping
await scrapeClient.scrape(
  "https://example.com",
  "Extract all product titles and prices",
  "https://your-api.com/webhook"
);

// Scraping with custom ID
await scrapeClient.scrape(
  "https://example.com",
  "Extract product information",
  "https://your-api.com/webhook",
  "optional-custom-id"
);

API Reference

new ScrapeClient(apiKey)

Creates a new scraping client instance.

Parameters

Parameter | Type | Description
apiKey | string | Your API key for authentication

scrapeClient.scrape(url, prompt, callbackUrl, id?)

Initiates a scraping operation and sends results to the specified callback URL upon completion.

Parameters

Parameter | Type | Description
url | string | The URL of the webpage to scrape
prompt | string | Instructions for what data to extract
callbackUrl | string | URL where results will be sent via POST
id? | string | Optional custom identifier for the scraping request
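
Because results arrive later at the callback URL, the optional id is a natural way to match a webhook delivery to the request that triggered it. The following is a minimal sketch of one such pattern, not part of scrap-ai: Scraper, PendingJobs, and startTracked are hypothetical names, Scraper only mirrors the scrape signature documented above, and it is an assumption that the id is echoed back in the webhook payload.

```typescript
import { randomUUID } from "node:crypto";

// Minimal structural type mirroring the documented scrape() signature.
// The return type is not documented, so it is left as unknown.
interface Scraper {
  scrape(url: string, prompt: string, callbackUrl: string, id?: string): Promise<unknown>;
}

interface JobInfo {
  url: string;
  prompt: string;
  startedAt: number;
}

// In-memory registry of requests awaiting a webhook (hypothetical helper).
class PendingJobs {
  private jobs = new Map<string, JobInfo>();

  add(id: string, info: JobInfo): void {
    this.jobs.set(id, info);
  }

  // Returns and removes the job for this id, or undefined if it is unknown.
  take(id: string): JobInfo | undefined {
    const info = this.jobs.get(id);
    this.jobs.delete(id);
    return info;
  }

  get size(): number {
    return this.jobs.size;
  }
}

// Generates an id, records the job, and starts the scrape with that id.
async function startTracked(
  client: Scraper,
  pending: PendingJobs,
  url: string,
  prompt: string,
  callbackUrl: string
): Promise<string> {
  const id = randomUUID();
  pending.add(id, { url, prompt, startedAt: Date.now() });
  try {
    await client.scrape(url, prompt, callbackUrl, id);
  } catch (err) {
    pending.take(id); // do not keep entries for requests that never started
    throw err;
  }
  return id;
}
```

In a webhook handler, pending.take(id) would then return the original URL and prompt for a delivery, or undefined for an id that was never issued. An in-memory map is lost on restart, so a persistent store may be a better fit for long-running jobs.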

Webhook Verification

The library provides webhook verification to ensure the authenticity of incoming webhook requests:

const isValid = scrapeClient.verifyWebhook({
  body: req.body,
  signature: req.headers["x-webhook-signature"],
  timestamp: req.headers["x-webhook-timestamp"],
});

scrapeClient.verifyWebhook(options)

Verifies that a webhook request is authentic using timing-safe signature comparison.

Parameters

Parameter | Type | Description
options.body | Object | The raw request body as an object
options.signature | string | The signature from the x-webhook-signature header
options.timestamp | string | The timestamp from the x-webhook-timestamp header
options.maxAge? | number | Maximum age of the webhook in milliseconds (default: 300000, i.e. 5 minutes)
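
This README does not document how signatures are computed. As a rough illustration of what timing-safe verification with a maximum age typically involves, the sketch below assumes a hex-encoded HMAC-SHA256 over "timestamp.body" and a timestamp in milliseconds since the Unix epoch. The scheme, the shared secret, and the names sign and verifySignature are assumptions for illustration, not scrap-ai's implementation.

```typescript
import { Buffer } from "node:buffer";
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical signing scheme (an assumption, not scrap-ai's documented one):
// hex-encoded HMAC-SHA256 over "<timestamp>.<body>".
function sign(secret: string, timestamp: string, body: string): string {
  return createHmac("sha256", secret).update(`${timestamp}.${body}`).digest("hex");
}

interface VerifyInput {
  secret: string;
  body: string; // raw request body
  signature: string; // value of the signature header
  timestamp: string; // milliseconds since the Unix epoch, as a string
  maxAge?: number; // milliseconds; defaults to 5 minutes
  now?: number; // injectable clock, useful for testing
}

function verifySignature(input: VerifyInput): boolean {
  const { secret, body, signature, timestamp } = input;
  const maxAge = input.maxAge ?? 5 * 60 * 1000;
  const now = input.now ?? Date.now();

  // Reject malformed or stale timestamps to limit replay attacks.
  const sentAt = Number(timestamp);
  if (!Number.isFinite(sentAt) || Math.abs(now - sentAt) > maxAge) return false;

  const expected = Buffer.from(sign(secret, timestamp, body), "utf8");
  const received = Buffer.from(signature, "utf8");

  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (expected.length !== received.length) return false;
  return timingSafeEqual(expected, received);
}
```

A plain === comparison can leak how many leading characters matched through response timing, which is why a constant-time comparison is used here. Including the timestamp in the signed material prevents an attacker from replaying an old body with a fresh timestamp.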

scrapeClient.parseWebhookBody(body)

Parses and validates the webhook body.

Parameters

Parameter | Type | Description
body | string | The raw webhook body as a string

Returns the parsed and validated webhook event.
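
The shape of the webhook event is not documented here. To illustrate what "parse and validate" can mean in practice, the sketch below assumes a hypothetical event with id, status, and data fields; WebhookEvent, isWebhookEvent, and parseEvent are illustrative names, and the real scrap-ai payload may differ.

```typescript
// Hypothetical event shape; the real scrap-ai payload may differ.
interface WebhookEvent {
  id: string;
  status: "completed" | "failed";
  data?: unknown;
}

// Type guard: checks the fields this sketch relies on at runtime.
function isWebhookEvent(value: unknown): value is WebhookEvent {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v.id === "string" && (v.status === "completed" || v.status === "failed");
}

// Parses a raw JSON string and throws if it is not a well-formed event.
function parseEvent(body: string): WebhookEvent {
  let parsed: unknown;
  try {
    parsed = JSON.parse(body);
  } catch {
    throw new Error("Webhook body is not valid JSON");
  }
  if (!isWebhookEvent(parsed)) {
    throw new Error("Webhook body does not match the expected event shape");
  }
  return parsed;
}
```

Validating at the boundary means the rest of the handler can rely on the typed event instead of re-checking fields.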

Example Usage with Express

Here's a complete example of how to use the scraping client with webhook verification in an Express application:

import { ScrapeClient } from "scrap-ai";
import express from "express";

const app = express();
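// process.env values may be undefined, so fail early if the key is missing
const apiKey = process.env.SCRAP_API_KEY;
if (!apiKey) throw new Error("SCRAP_API_KEY is not set");
const scrapeClient = new ScrapeClient(apiKey);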

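// Webhook endpoint
app.post("/webhook", express.json(), (req, res) => {
  const signature = req.headers["x-webhook-signature"];
  const timestamp = req.headers["x-webhook-timestamp"];

  // Reject requests whose verification headers are missing or repeated
  if (typeof signature !== "string" || typeof timestamp !== "string") {
    return res.status(400).send("Missing webhook signature headers");
  }

  const isValid = scrapeClient.verifyWebhook({
    body: req.body,
    signature,
    timestamp,
  });

  if (!isValid) {
    return res.status(400).send("Invalid webhook signature");
  }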

  const event = scrapeClient.parseWebhookBody(JSON.stringify(req.body));
  console.log("Received verified webhook:", event);

  res.status(200).send("OK");
});

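// Start scraping
app.post("/start-scrape", async (req, res) => {
  try {
    const result = await scrapeClient.scrape(
      "https://example.com",
      "Extract product information",
      "https://your-api.com/webhook"
    );
    res.json(result);
  } catch (error) {
    // Log the underlying error so failures can be diagnosed
    console.error("Scraping failed:", error);
    res.status(500).json({ error: "Scraping failed" });
  }
});

// Start the server (port 3000 is an arbitrary choice for this example)
app.listen(3000);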

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
