1.0.1 • Published 3 years ago

Rehype Extract Article

A rehype plugin that extracts the clean article content from an HTML page, removing classes and IDs and flattening nested children.
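
To give a rough idea of what removing classes and IDs and flattening nested children can look like on a hast tree, here is a minimal, dependency-free sketch. It is not the plugin's actual implementation: the names sketchCleanup and cleanChildren are hypothetical, and the list of wrapper tags to unwrap is an assumption made for illustration only.

```javascript
// Illustrative only: NOT the source of rehype-extract-article.
// The set of wrapper tags to unwrap is an assumption made for this sketch.
const WRAPPERS = new Set(['div', 'span', 'section'])

// Recursively clean a list of hast child nodes.
function cleanChildren(children) {
  const out = []
  for (const node of children) {
    if (node.type !== 'element') {
      out.push(node) // text, comments, etc. pass through untouched
      continue
    }
    const kids = cleanChildren(node.children || [])
    if (WRAPPERS.has(node.tagName)) {
      out.push(...kids) // flatten: replace the wrapper with its children
      continue
    }
    // Drop class and id, keep every other property (href, src, ...)
    const { className, id, ...properties } = node.properties || {}
    out.push({ ...node, properties, children: kids })
  }
  return out
}

// unified plugin shape: a function that returns a tree transformer
function sketchCleanup() {
  return (tree) => {
    tree.children = cleanChildren(tree.children || [])
  }
}

// Hand-built hast tree standing in for parsed HTML
const tree = {
  type: 'root',
  children: [{
    type: 'element',
    tagName: 'div',
    properties: { id: 'wrap' },
    children: [{
      type: 'element',
      tagName: 'p',
      properties: { className: ['lead'], id: 'intro' },
      children: [{ type: 'text', value: 'Hello' }]
    }]
  }]
}

sketchCleanup()(tree)
console.log(JSON.stringify(tree.children))
```

After the transformer runs, the outer div is gone and the paragraph sits directly under the root with an empty properties object. The real plugin may choose which elements to keep or unwrap quite differently.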

Installation

npm install rehype-extract-article

Usage

In your script:

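import { unified } from 'unified'
import rehypeRemark from 'rehype-remark'
import rehypeParse from 'rehype-parse'
import remarkStringify from 'remark-stringify'
import rehypeExtractArticle from 'rehype-extract-article'
// axios is only used to fetch the page; it must be installed separately
import axios from 'axios'

const processor = unified()
  .use(rehypeParse) // HTML string -> hast tree
  .use(rehypeExtractArticle) // keep only the cleaned article content
  .use(rehypeRemark) // hast tree -> mdast tree
  .use(remarkStringify) // mdast tree -> Markdown string

// axios.get returns a promise; the HTML body is in the response's `data` field.
// Top-level await requires running this file as an ES module.
const { data: htmlString } = await axios.get('http://some-blog.com/article')
const result = processor.processSync(htmlString)
console.log(result.value)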

Given valid HTML in htmlString, the code above prints clean Markdown containing the content extracted from the original page.

Tests

Run npm test to run tests.

Run npm run coverage to produce a test coverage report.

License

MIT © Goran Spasojevic