Crawler | npm.io

Web crawler that builds a directed graph of links among connected sites. Runs on Node.js and stores data in Redis.

node redis webcrawler javascript crawler scraper web-crawler

1.0.0 • Published 4 years ago

Fetch the pre-rendered content, meta, links, and Open Graph data of a webpage, especially a Single-Page Application (SPA).

puppeteer-prerender-next puppeteer prerender crawler seo open graph

0.15.0 • Published 10 months ago

A Node.js package that provides a convenient wrapper around Puppeteer for handling browser automation tasks. This package simplifies common browser operations like navigation, downloads management, screenshots, and page interactions.

puppeteer browser automation chrome headless scraping testing screenshots web-automation browser-automation

1.0.3 • Published 10 months ago

Stop website fingerprinting techniques

npm puppeteer security nodejs stealth detection-evasion anti-fingerprint scraping crawler chrome

1.1.11 • Published 1 year ago

Crawl a website to generate a knowledge file for RAG

crawler llm RAG website

1.5.0 • Published 1 year ago

Crawl YouTube without an API key (search videos and channels, or get all of a channel's or playlist's videos)

youtube crawler scraper api no key simple javascript youtube api youtube search

3.3.3 • Published 2 years ago

Download README files from GitHub repository links

node github readme webcrawler javascript crawler scraper nodejs web-crawler

1.0.5 • Published 4 years ago

Collection of patches for puppeteer and playwright to avoid automation detection and leaks. Helps to avoid Cloudflare and DataDome CAPTCHA pages. Easy to patch/unpatch, can be enabled/disabled on demand.

automation bot bot-detection crawler crawling chromedriver webdriver headless headless-chrome stealth

1.0.17 • Published 9 months ago

Collection of patches for puppeteer and playwright to avoid automation detection and leaks. Helps to avoid Cloudflare and DataDome CAPTCHA pages. Easy to patch/unpatch, can be enabled/disabled on demand.

automation bot bot-detection crawler crawling chromedriver webdriver headless headless-chrome stealth

1.0.18 • Published 7 months ago

An easy, lightweight scraper for humans with many built-in features.

scraper decoding iconv utf-8 quick spider crawler

1.7.0 • Published 4 years ago

A simple node module to crawl a domain and generate a page list.

crawler node web tools

0.3.10 • Published 1 year ago

Generate comprehensive PDFs of entire websites, ideal for RAG.

crawler PDF

0.1.8 • Published 7 months ago

A snazzy light Node.js image crawler laced with TypeScript goodness! 🕵️🦾

web-crawler crawler image-crawler

1.2.8 • Published 2 years ago

The scalable web crawling and scraping library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.

apify headless chrome puppeteer crawler scraper

0.1.0 • Published 4 years ago

The elite unit of sitemap.xml generation—precise, efficient, dominating. If RobotsForce1 is your air defense, this is your recon mission.

sitemap sitemap.xml seo search-engines crawler fluent-api javascript web lydio automation

1.0.0 • Published 6 months ago

Generate a sitemap from any link you provide.

sitemap crawler href

1.0.1 • Published 3 years ago

Package for querying some of the student information available in the FATEC SIGA system

siga fatec crawler

0.0.4 • Published 10 months ago

GNewsScraper is a TypeScript package that scrapes article data from Google News based on a keyword or phrase. It returns the results as an array of JSON objects, making the scraped information convenient to access and use.

gnews-scraper Google News scraper gnews google news gnews scraper news scraper scraping web scraping Google News API web crawling

1.2.3 • Published 2 years ago

Simple web crawler for warming the CDN cache after a deploy.

glob crawler

0.1.1 • Published 4 years ago

A CLI tool to crawl GitHub repositories and pull all names and email addresses from commit histories.

osint github cli email crawler commits

1.0.7 • Published 1 year ago

Crawler Packages

redis-web-crawler

puppeteer-prerender-next

puppet-browser-handler

puppeteer-afp-with-vendor

rag-crawler

rifqisyndra

readme-crawler

rebrowser-patches

rebrowser-patches-fadi-patch

quick-scraper

simple-node-site-crawler

site2pdf-cli

snazzy-crawler

skaut

sitemapteam6

sitemap-crawler2

siga-crawler-node

gnews-scraper

glob-cache-warmer

git-oh-shit