2.2.1 • Published 8 years ago

psi-report v2.2.1

Weekly downloads
9
License
MIT
Repository
github
Last release
8 years ago

Version Downloads Dependencies devDependencies

psi-report

Crawls a website or get URLs from a sitemap.xml or a file, gets PageSpeed Insights data for each page, and exports an HTML report.

npm.io


Installation

Install with npm:

$ npm install psi-report --global
# --global isn't required if you plan to use the node module

CLI usage

$ psi-report [options] <url> <dest_path>

Options:

    -V, --version               output the version number
    --urls-from-sitemap [name]  Get the list of URLs from sitemap.xml (don't crawl)
    --urls-from-file [name]     Get the list of URLs from a file, one url per line (don't crawl)
    -h, --help                  output usage information

Example:

$ psi-report daringfireball.net/projects/markdown /Users/johan/Desktop/report.html

Programmatic usage

// Basic usage

var PSIReport = require('psi-report');
var psi_report = new PSIReport({baseurl: 'http://domain.org'}, onComplete);
reporter.start();

function onComplete(baseurl, data, html)
{
    console.log('Report for: ' + baseurl);
    console.log(data); // An array of pages with their PSI results
    console.log(html); // The HTML report (as a string)
}

// The "fetch_url" and "fetch_psi" events allow to monitor the crawling process

psi_report.on('fetch_url', onFetchURL);
function onFetchURL(error, url)
{
    console.log((error ? 'Error with URL: ' : 'Fetched URL: ') + url);
}

psi_report.on('fetch_psi', onFetchPSI);
function onFetchPSI(error, url, strategy)
{
    console.log((error ? 'Error with PSI for ' : 'PSI data (' + strategy + ') fetched for ') + url);
}

Crawler behavior

The base URL is used as a root when crawling the pages.

For instance, using the URL https://daringfireball.net/ will crawl the entire website.

However, https://daringfireball.net/projects/markdown/ will crawl only:

  • https://daringfireball.net/projects/markdown/
  • https://daringfireball.net/projects/markdown/basics
  • https://daringfireball.net/projects/markdown/syntax
  • https://daringfireball.net/projects/markdown/license
  • And so on

This may be useful to crawl only one part of a website: everything starting with /en, for instance.

URLs from a sitemap.xml or a file

Instead of crawling the website, you can set the URL list with a sitemap.xml or a file.

  • --urls-from-sitemap https://example.com/sitemap.xml
  • --urls-from-file /path/to/urls.txt

Only the URLs inside this file will be processed.

Changelog

This project uses semver.

VersionDateNotes
2.2.12018-01-19Fix missing source files on NPM (@blaryjp)
2.2.02017-11-27Prepend baseurl if not present, for each urls in file (@blaryjp)
2.1.02017-11-19Add --urls-from-sitemap and --urls-from-file (@blaryjp)
2.0.02016-04-02Deep module rewrite (New module API, updated CLI usage)
1.0.12016-01-15Fix call on obsolete package
1.0.02015-12-01Initial version

License

This project is released under the MIT License.

Credits

2.2.1

8 years ago

2.2.0

8 years ago

2.1.0

8 years ago

2.0.0

10 years ago

1.0.1

10 years ago

1.0.0

10 years ago