0.3.1 • Published 5 years ago

@webalytics/metadata v0.3.1

@webalytics/metadata

Given an HTML document, extracts common metadata fields. Part of the Webalytics Toolbox.

Installation

npm install --save @webalytics/metadata

Extracted Properties (if possible)

interface Metadata {
  title: string
  description: string
  url: string
  image: string
  feeds: string[]
  favicon: string
  keywords: string[]
  author: string
}
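
The "(if possible)" qualifier suggests that fields which cannot be extracted may be missing from the result. As a minimal sketch under that assumption, the helper below fills absent fields with empty values so that downstream code can rely on every key. `withDefaults` is a hypothetical name and is not part of this package; the interface is repeated only to keep the snippet self-contained.

```typescript
// Copy of the Metadata interface shown above.
interface Metadata {
  title: string
  description: string
  url: string
  image: string
  feeds: string[]
  favicon: string
  keywords: string[]
  author: string
}

// Hypothetical helper (not provided by @webalytics/metadata):
// assumes missing fields are simply absent and replaces them with
// empty strings or empty arrays.
function withDefaults(partial: Partial<Metadata>): Metadata {
  return {
    title: partial.title ?? '',
    description: partial.description ?? '',
    url: partial.url ?? '',
    image: partial.image ?? '',
    feeds: partial.feeds ?? [],
    favicon: partial.favicon ?? '',
    keywords: partial.keywords ?? [],
    author: partial.author ?? ''
  }
}

// Only the title was found; every other field falls back to its default.
const complete = withDefaults({ title: 'abc' })
```

Whether empty values or `undefined` are the better default depends on how the result is consumed; empty values are used here purely for illustration.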

Usage (convenience)

This package works out-of-the-box with any HTML document without further configuration:

import metadata from '@webalytics/metadata'

const html = '<html><head><title>abc</title></head><body></body></html>'
const data = metadata(html) // { title: 'abc' }

Usage (convenience, with url aid)

When you also pass the base URL of the HTML document as a hint, relative URLs can be resolved to absolute ones:

import metadata from '@webalytics/metadata'

const url = 'http://example.com'
const html = '<html><head><link rel="icon" href="/fav" /></head><body /></html>'
const data = metadata(html, url) // { favicon: 'http://example.com/fav' }
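
To illustrate what resolving a relative URL against a base means, here is a small sketch using the standard WHATWG `URL` class available in Node.js and browsers. It is not necessarily how this package resolves URLs internally, and `resolveHref` is a hypothetical helper name.

```typescript
// Hypothetical helper: resolve a possibly relative href against a base URL
// using the standard URL constructor.
function resolveHref(href: string, base: string): string {
  return new URL(href, base).href
}

// A root-relative path is joined with the origin of the base URL.
const favicon = resolveHref('/fav', 'http://example.com')
// 'http://example.com/fav'

// An href that is already absolute is returned unchanged.
const absolute = resolveHref('https://cdn.example.org/icon.png', 'http://example.com')
// 'https://cdn.example.org/icon.png'
```

Without a base URL, a value such as `/fav` cannot be turned into an absolute address, which is why the hint matters for fields like `favicon`, `image`, and `feeds`.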

Usage (selective)

If you only need some fields from an HTML document or snippet, use the underlying Parser class directly and select just the fields you want:

import { Parser } from '@webalytics/metadata'

const url = 'http://example.com'
const html = '<html><head><title>abc</title></head><body></body></html>'
const document = new Parser(html, url) // same arguments as the default export
const title = document.selectTitle() // 'abc'

Usage (fully customized)

If the predefined selectors do not fully suit your needs, you can also hook directly into the underlying cheerio DOM selector engine. It's like jQuery, but in Node:

import { Parser } from '@webalytics/metadata'

const url = 'http://example.com'
const html = '<html><head><title>abc</title></head><body></body></html>'
const document = new Parser(html, url) // same arguments as the default export
const title = document.$('title').text() // 'abc'

License

LGPL v3. You can use this code any way you want without restrictions, but I want bugfixes and improvements to flow back to this repository to benefit everyone.
