1.0.0 • Published 6 years ago

ejws-metadata v1.0.0

License
MIT

url-metadata

Requests an HTTP URL and scrapes its metadata. So far, most of the metadata fields returned come from the Open Graph protocol (og:).

Usage

Install via npm to use in a Node.js project:

npm install url-metadata --save

Then in your project file (from example/basic.js):

const urlMetadata = require('url-metadata')
urlMetadata('http://bit.ly/2ePIrDy').then(
  function (metadata) { // success handler
    console.log(metadata)
  },
  function (error) { // failure handler
    console.log(error)
  })
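
Because urlMetadata returns a promise, the same call can also be written with async/await. The following is a minimal sketch, not part of the package: fetchMetadataSafely and fakeScrape are hypothetical names, and the scraping function is passed in as an argument so the snippet runs without network access.

```javascript
// Hypothetical helper: `scrape` is any function shaped like
// urlMetadata(url, options), i.e. it returns a promise of a metadata object.
async function fetchMetadataSafely (scrape, url, options) {
  try {
    const metadata = await scrape(url, options)
    return { ok: true, metadata }
  } catch (error) {
    // Plays the role of the failure handler in the promise-based example
    return { ok: false, error }
  }
}

// Stand-in for urlMetadata so the sketch runs offline
const fakeScrape = async (url) => ({ url: url, title: 'Example' })

fetchMetadataSafely(fakeScrape, 'http://example.com').then(function (result) {
  console.log(result.ok, result.metadata.title)
})
```

In a real project you would pass require('url-metadata') as the first argument instead of the stand-in.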

If you'd like to override the default options (see below), pass in a second argument:

const urlMetadata = require('url-metadata')
urlMetadata('http://bit.ly/2ePIrDy', {fromEmail: 'me@myexample.com'}).then(...)

Options

These are the default option values, any of which you can override:

{
  userAgent: 'MetadataScraper', // name the bot that will make url request
  fromEmail: 'example@example.com', // your email
  maxRedirects: 10,
  timeout: 10000, // 10 seconds
  descriptionLength: 750, // number of chars to truncate description to
  ensureSecureImageRequest: true,
  sourceMap: {}, // example: https://github.com/LevelNewsOrg/source-map
  encode: undefined // a function to encode metadata with, see example/encoding.js
}
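
To illustrate what overriding a default means in practice, here is a minimal sketch of a shallow merge. It is an assumption that the package combines options this way internally; resolveOptions and DEFAULTS are hypothetical names used only for this example.

```javascript
// Default values copied from the list above
const DEFAULTS = {
  userAgent: 'MetadataScraper',
  fromEmail: 'example@example.com',
  maxRedirects: 10,
  timeout: 10000,
  descriptionLength: 750,
  ensureSecureImageRequest: true,
  sourceMap: {},
  encode: undefined
}

// Hypothetical helper: shallow-merge the caller's overrides onto the defaults.
// Keys the caller does not supply keep their default value.
function resolveOptions (overrides) {
  return Object.assign({}, DEFAULTS, overrides)
}

const opts = resolveOptions({ fromEmail: 'me@myexample.com', timeout: 5000 })
console.log(opts.fromEmail, opts.timeout, opts.maxRedirects)
```

Only the keys you pass change; everything else, such as maxRedirects here, stays at its default.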

Returns

Returns a promise that resolves with the following URL metadata if the request succeeds. Note that the url field returned below will be the last hop in the request chain. If you pass in a URL that was generated by a link shortener, for example, you get back the final destination of the link as the url.

{
  'url'                  : '',
  'canonical'            : '',
  'title'                : '',
  'image'                : '',
  'author'               : '',
  'description'          : '',
  'keywords'             : '',
  'source'               : '',
  'og:url'               : '',
  'og:locale'            : '',
  'og:locale:alternate'  : '',
  'og:title'             : '',
  'og:type'              : '',
  'og:description'       : '',
  'og:determiner'        : '',
  'og:site_name'         : '',
  'og:image'             : '',
  'og:image:secure_url'  : '',
  'og:image:type'        : '',
  'og:image:width'       : '',
  'og:image:height'      : ''
}
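
Since a page may fill in only some of these fields, a consumer typically falls back from one key to another. The sketch below builds a small link-preview object from a metadata result. It assumes missing fields come back as empty strings, as in the shape above; firstNonEmpty and toPreview are hypothetical helpers, and the fallback order is one reasonable choice rather than anything the package prescribes.

```javascript
// Hypothetical helper: return the first non-empty string among the given keys
function firstNonEmpty (metadata, keys) {
  for (const key of keys) {
    const value = metadata[key]
    if (typeof value === 'string' && value.trim() !== '') return value
  }
  return ''
}

// Build a link preview, preferring Open Graph fields over the generic ones
function toPreview (metadata) {
  return {
    url: firstNonEmpty(metadata, ['og:url', 'canonical', 'url']),
    title: firstNonEmpty(metadata, ['og:title', 'title']),
    description: firstNonEmpty(metadata, ['og:description', 'description']),
    image: firstNonEmpty(metadata, ['og:image:secure_url', 'og:image', 'image']),
    siteName: firstNonEmpty(metadata, ['og:site_name', 'source'])
  }
}

// Sample data for illustration only, not real scraper output
const sample = {
  url: 'https://example.com/post',
  title: 'Plain title',
  'og:title': '',
  description: 'Short summary',
  'og:image': 'https://example.com/a.png'
}
console.log(toPreview(sample))
```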

If the page has its og:type set to article, these additional fields are also returned:

{
  'article:published_time'     : '',
  'article:modified_time'      : '',
  'article:expiration_time'    : '',
  'article:author'             : '',
  'article:section'            : '',
  'article:tag'                : '',
  'og:article:published_time'  : '',
  'og:article:modified_time'   : '',
  'og:article:expiration_time' : '',
  'og:article:author'          : '',
  'og:article:section'         : '',
  'og:article:tag'             : ''
}
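
Because each article field can appear under either a bare article: key or an og:article: key, reading them usually means checking both. The following is a minimal sketch under two assumptions: absent fields are empty strings, and published_time is an ISO 8601 string as the Open Graph protocol describes. articleField and getArticleInfo are hypothetical helpers, not part of the package.

```javascript
// Hypothetical helper: try the bare key first, then the og:-prefixed variant
function articleField (metadata, name) {
  return metadata['article:' + name] || metadata['og:article:' + name] || ''
}

// Return article details, or null when the page is not an article
function getArticleInfo (metadata) {
  if (metadata['og:type'] !== 'article') return null
  const published = articleField(metadata, 'published_time')
  const date = published ? new Date(published) : null
  return {
    author: articleField(metadata, 'author'),
    section: articleField(metadata, 'section'),
    // null when the timestamp is missing or cannot be parsed
    publishedAt: date && !isNaN(date.getTime()) ? date : null
  }
}

// Sample data for illustration only, not real scraper output
const article = {
  'og:type': 'article',
  'og:article:author': 'Jane Doe',
  'article:published_time': '2016-11-08T12:00:00Z'
}
console.log(getArticleInfo(article))
```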