Licence: MIT; Version: 2.1.0; Deps: 1; Size: 11 kB; Vulns: 0; Weekly: 0
sitemap2array
Fetch a sitemap.xml URL and return its URLs as an array. Automatically resolves sitemap index files.
Install
npm install sitemap2array
Usage
const sitemap2array = require('sitemap2array');
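// The calls below use await, so run them inside an async function (CommonJS has no top-level await).
// Regular sitemap: returns page URLs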
const urls = await sitemap2array('https://example.com/sitemap.xml');
// ['https://example.com/page1', 'https://example.com/page2', ...]
// Sitemap index — automatically fetches all child sitemaps and returns all page URLs
const allUrls = await sitemap2array('https://example.com/sitemap-index.xml');
// ['https://example.com/page1', ..., 'https://example.com/page500']
Options
followIndex
When true (default), sitemap index files are resolved recursively — each child sitemap is fetched in parallel and all page URLs are flattened into a single array.
Set to false to get just the child sitemap URLs without following them:
const sitemapUrls = await sitemap2array('https://example.com/sitemap-index.xml', {
  followIndex: false,
});
// ['https://example.com/sitemap-1.xml', 'https://example.com/sitemap-2.xml']
API
sitemap2array(url, [options])
Returns a Promise<string[]>.
| Parameter | Type | Default | Description |
|---|---|---|---|
| url | string | none | Full URL to a sitemap.xml (must include http:// or https://) |
| options.followIndex | boolean | true | Recursively fetch child sitemaps from sitemap index files |
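Because url must carry an explicit http:// or https:// scheme, it can help to check user-supplied values before calling the library. The following is a minimal sketch using the standard URL constructor; isAbsoluteHttpUrl is a hypothetical helper written for illustration and is not part of sitemap2array.

```javascript
// Hypothetical helper (not exported by sitemap2array): returns true only for
// absolute URLs whose scheme is http or https.
function isAbsoluteHttpUrl(value) {
  let parsed;
  try {
    // The URL constructor throws a TypeError when the string has no scheme.
    parsed = new URL(value);
  } catch {
    return false;
  }
  return parsed.protocol === 'http:' || parsed.protocol === 'https:';
}
```

A caller could run this check first and report a clear error for values such as example.com/sitemap.xml, which lack a scheme.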
Supports both <urlset> (standard sitemaps) and <sitemapindex> (sitemap index files) per the sitemaps.org protocol.
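To illustrate the difference between the two document types, here is a rough sketch that classifies a sitemap by its root element and collects its <loc> entries. It uses regular expressions for brevity and is an assumption-laden illustration, not the parser sitemap2array actually uses; readSitemap is a hypothetical name, and the sketch does not decode XML entities or handle CDATA sections.

```javascript
// Illustrative only: a regex-based reader for small, well-formed sitemaps.
// readSitemap is a hypothetical name, not part of the sitemap2array API.
function readSitemap(xml) {
  // A sitemap index has <sitemapindex> as its root; a standard sitemap has <urlset>.
  let kind = 'unknown';
  if (/<sitemapindex[\s>]/.test(xml)) {
    kind = 'index';
  } else if (/<urlset[\s>]/.test(xml)) {
    kind = 'urlset';
  }
  // Both document types list their entries in <loc> elements.
  const locs = [...xml.matchAll(/<loc>\s*([^<]+?)\s*<\/loc>/g)].map((m) => m[1]);
  return { kind, locs };
}
```

For an index, locs holds child sitemap URLs; for a urlset, it holds page URLs.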
Recursive depth is capped at 3 levels to prevent infinite loops.
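The combination of parallel fetching, flattening, and a depth cap can be sketched roughly as follows. This is not the library's source: collectUrls, fetchXml, and MAX_DEPTH are hypothetical names, the fetcher is injected so the sketch runs without network access, and the exact way the library counts levels is an assumption.

```javascript
// Assumed to mirror the documented cap of 3 levels; how the library counts
// levels internally is not stated, so treat this as an approximation.
const MAX_DEPTH = 3;

// Hypothetical sketch. fetchXml is an injected async function that maps a
// URL to the XML text of that sitemap.
async function collectUrls(url, fetchXml, depth = 0) {
  const xml = await fetchXml(url);
  const locs = [...xml.matchAll(/<loc>\s*([^<]+?)\s*<\/loc>/g)].map((m) => m[1]);

  // Standard sitemap: the <loc> values are already page URLs.
  if (!/<sitemapindex[\s>]/.test(xml)) {
    return locs;
  }

  // Sitemap index at the cap: stop here so a self-referencing index terminates.
  if (depth >= MAX_DEPTH) {
    return [];
  }

  // Fetch all child sitemaps in parallel, then flatten into one array.
  const nested = await Promise.all(
    locs.map((child) => collectUrls(child, fetchXml, depth + 1))
  );
  return nested.flat();
}
```

On Node.js >= 18, a real fetcher could be as simple as (u) => fetch(u).then((r) => r.text()). Without the depth check, an index that lists itself would recurse forever.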
Requirements
Node.js >= 18 (uses native fetch).
License
MIT