stream-sitemap-parser v4.0.3
sitemap-parser
Stream a sitemap file and get back a stream of URLs or any error found while parsing the file.
Usage
const fs = require('fs');
const { fetch, verify, getRules } = require('stream-sitemap-parser');

fs.createReadStream(file)
  .pipe(fetch())
  .on('data', function (url) {
    // each chunk contains a URL and all of its given attributes, for example:
    // {
    //   loc: 'www.google.com',
    //   lastmod: '2017-01-01T00:00:00.000Z',
    //   changefreq: 'monthly',
    //   priority: '0.8',
    //   alternate: [
    //     {
    //       href: 'https://www.google.com/es/',
    //       hreflang: 'es'
    //     }
    //   ]
    // }
  });
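When the whole list of entries is needed at once, the 'data' events can be gathered into an array. The following is a minimal sketch, not part of the package: `collectEntries` is a hypothetical helper, it assumes that parse errors surface as ordinary stream 'error' events, and it uses `Readable.from` as a stand-in for `fs.createReadStream(file).pipe(fetch())` so that it runs without the package installed.

```javascript
const { Readable } = require('stream');

// Hypothetical helper: collect every object emitted by an object-mode
// stream into an array. Resolves on 'end', rejects on the first 'error'.
function collectEntries(urlStream) {
  return new Promise((resolve, reject) => {
    const entries = [];
    urlStream
      .on('data', (entry) => entries.push(entry))
      .on('error', reject)
      .on('end', () => resolve(entries));
  });
}

// Stand-in for fs.createReadStream(file).pipe(fetch()); the entries are
// sample data in the shape shown above.
const demoStream = Readable.from([
  { loc: 'https://www.google.com', priority: '0.8' },
  { loc: 'https://www.google.com/es/', priority: '0.5' },
]);

collectEntries(demoStream).then((entries) => {
  console.log(entries.map((entry) => entry.loc));
});
```

Collecting everything in memory gives up the main benefit of streaming, so this only suits sitemaps that are known to be small.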
verify(fs.createReadStream(file))
  .then(result => {
    // result is an object describing every warning or error found while parsing the sitemap, for example:
    // {
    //   messages: [
    //     {
    //       type: 'tooManyTags',
    //       details: {
    //         parent: 'url',
    //         tag: 'loc'
    //       }
    //     }
    //   ],
    //   alternates: [
    //     {
    //       loc: 'https://www.google.com',
    //       alternate: [
    //         {
    //           href: 'https://www.google.com/es/',
    //           hreflang: 'es'
    //         }
    //       ]
    //     }
    //   ]
    // }
  });
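A result in this shape is easy to condense into a short report. The sketch below is not part of the package: `countMessagesByType` is a hypothetical helper that relies only on the `messages[].type` field shown above, and the sample result is made-up data.

```javascript
// Hypothetical helper: count the entries of result.messages by `type`.
function countMessagesByType(result) {
  const counts = {};
  for (const message of result.messages || []) {
    counts[message.type] = (counts[message.type] || 0) + 1;
  }
  return counts;
}

// Made-up sample in the shape of a verify() result.
const sampleResult = {
  messages: [
    { type: 'tooManyTags', details: { parent: 'url', tag: 'loc' } },
    { type: 'tooManyTags', details: { parent: 'url', tag: 'lastmod' } },
  ],
  alternates: [],
};

console.log(countMessagesByType(sampleResult)); // { tooManyTags: 2 }
```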
getRules();
// returns an object of all loaded rules of the parser
fetch and verify can take several options.
fetch({ contentType, domain, maxSize, maxUrls })
verify(sitemapStream, { contentType, domain, maxSize, maxUrls })
contentType defaults to xml. Set it to txt when streaming a plain-text sitemap.
domain defaults to null. Set it to a domain to make sure that every parsed URL belongs to that domain.
maxSize defaults to 50MB. Set it to cap the size of the stream.
maxUrls defaults to 50000. Set it to cap the number of URLs that will be parsed.
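To illustrate what the domain and maxUrls options describe, here is a minimal sketch that applies the same kind of limits to a plain array of URLs. It is not the library's implementation: `withDefaults` and `matchesDomain` are hypothetical helpers, exact hostname equality is an assumption about what "same domain" means, and maxSize is left out because the unit the library expects is not stated here.

```javascript
// Defaults as described above (maxSize omitted, see the note before this block).
const DEFAULT_OPTIONS = { contentType: 'xml', domain: null, maxUrls: 50000 };

// Hypothetical helper: merge caller options over the documented defaults.
function withDefaults(options = {}) {
  return { ...DEFAULT_OPTIONS, ...options };
}

// Hypothetical helper: true when `loc` has exactly the hostname `domain`.
// A null domain disables the check; URLs that cannot be parsed fail it.
function matchesDomain(loc, domain) {
  if (domain === null) return true;
  try {
    return new URL(loc).hostname === domain;
  } catch (err) {
    return false;
  }
}

const options = withDefaults({ domain: 'www.google.com', maxUrls: 2 });
const locs = [
  'https://www.google.com/',
  'https://example.com/',
  'https://www.google.com/es/',
  'https://www.google.com/fr/',
];

// Keep same-domain URLs, then stop at maxUrls.
const accepted = locs
  .filter((loc) => matchesDomain(loc, options.domain))
  .slice(0, options.maxUrls);

console.log(accepted);
```

Note that a loc without a scheme, such as 'www.google.com', cannot be parsed by `new URL` and is therefore rejected by this sketch whenever a domain is set.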