algolia-crawl v1.2.16
🕷️🔍 Algolia Crawl
Crawl your website, sync all pages to an Algolia search index, and automatically generate a sitemap from that index.
⭐️ Features
- Crawl your website using Puppeteer
- Sync all pages to an Algolia search index
- Generate sitemap.xml from the index
💻 Getting started
Install from npm:
npm install algolia-crawl
Use the API in Node.js:
import { algoliaCrawl, generateSitemap } from "algolia-crawl";
await algoliaCrawl(); // Crawl all pages and sync index
await generateSitemap("sitemap.xml"); // Generate a sitemap.xml file
CLI usage:
npx algolia-crawl crawl # Crawl all pages and sync index
npx algolia-crawl sitemap sitemap.xml # Generate a sitemap.xml file
Configuration
You can either create a .algoliacrawlrc.json configuration file with the following keys:
{
"algoliaCrawlAppId": "2UFBBTMSYW",
"algoliaCrawlIndex": "dev_KOJ",
"algoliaCrawlStartUrl": "https://koj.co",
"algoliaCrawlBaseUrl": "https://koj.co"
}
algoliaCrawlAppId is your Algolia application ID and algoliaCrawlIndex is the name of the index. algoliaCrawlStartUrl is the first page to crawl (it can also be an array of strings), and only pages whose URL starts with algoliaCrawlBaseUrl will be indexed.
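The start URL and base URL rules described above can be illustrated with a minimal sketch. The helpers `normalizeStartUrls` and `isIndexable` are hypothetical names made up for this example and are not part of the algolia-crawl API; the sketch assumes the base URL check is a plain string-prefix match, which the package may implement differently. The `/about` and `example.com` URLs are invented sample data.

```typescript
// Hypothetical helpers for illustration only; not exported by algolia-crawl.

/** Accept a single start URL or an array of them and always return an array. */
function normalizeStartUrls(startUrl: string | string[]): string[] {
  return Array.isArray(startUrl) ? startUrl : [startUrl];
}

/** Assumption: a page qualifies for indexing when its URL starts with the base URL. */
function isIndexable(pageUrl: string, baseUrl: string): boolean {
  return pageUrl.startsWith(baseUrl);
}

const baseUrl = "https://koj.co";
const startUrls = normalizeStartUrls("https://koj.co");

// Invented sample links, as a crawler might discover them on a page.
const discovered = ["https://koj.co/about", "https://example.com/about"];
const indexable = discovered.filter((url) => isIndexable(url, baseUrl));

console.log(startUrls, indexable);
```

Under a plain prefix match, a base URL without a trailing slash also matches any other host that begins with the same characters, so a base URL ending in `/` may be the safer choice if the crawler behaves this way.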
Alternately, you can provide these values as environment variables instead of the configuration file:
| Environment variable | Description |
|---|---|
| ALGOLIA_CRAWL_APP_ID | Algolia search application ID |
| ALGOLIA_CRAWL_INDEX | Algolia search index |
| ALGOLIA_CRAWL_START_URL | First page to crawl |
| ALGOLIA_CRAWL_BASE_URL | Index pages with this base URL |
The following environment variable is also required:
| Environment variable | Description |
|---|---|
| ALGOLIA_CRAWL_API_KEY | Algolia search API key |
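When the crawl is configured entirely through environment variables (for example in CI), a small preflight check can report which of the variables listed above are unset before the crawl starts. This is only a sketch: `missingVariables` is a hypothetical helper, not something algolia-crawl provides, and it assumes no .algoliacrawlrc.json file is present to supply the first four values.

```typescript
// Hypothetical preflight check; not part of the algolia-crawl package.

// The five variable names documented in the tables above.
const REQUIRED_VARIABLES: string[] = [
  "ALGOLIA_CRAWL_APP_ID",
  "ALGOLIA_CRAWL_INDEX",
  "ALGOLIA_CRAWL_START_URL",
  "ALGOLIA_CRAWL_BASE_URL",
  "ALGOLIA_CRAWL_API_KEY",
];

/** Return the names of required variables that are unset or empty. */
function missingVariables(env: Record<string, string | undefined>): string[] {
  return REQUIRED_VARIABLES.filter((name) => !env[name]);
}

// In a real script you would pass process.env; a plain object keeps
// this example self-contained. The API key is deliberately left out.
const exampleEnv: Record<string, string | undefined> = {
  ALGOLIA_CRAWL_APP_ID: "2UFBBTMSYW",
  ALGOLIA_CRAWL_INDEX: "dev_KOJ",
  ALGOLIA_CRAWL_START_URL: "https://koj.co",
  ALGOLIA_CRAWL_BASE_URL: "https://koj.co",
};

const missing = missingVariables(exampleEnv);
if (missing.length > 0) {
  console.log("Missing environment variables: " + missing.join(", "));
}
```

Failing early with the list of missing names tends to be easier to debug than an authentication error raised partway through a crawl.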
📄 License