markdown-read v3.8.0
Markdown Read
Convert any URL to Markdown.
Try it online: HTML To Markdown
Tech Stack
- @mozilla/readability: extracts the readable content from HTML
- turndown: converts HTML to Markdown
- jsdom: parses HTML
Usage
You will need Node.js installed on your system. Then install the package globally:
$ npm i -g markdown-read
# Convert a web page to Markdown
$ markdown https://example.com
## Example Domain
This domain is for use in illustrative examples in documents. You may use this domain in literature without prior coordination or asking for permission.
[More information...](https://www.iana.org/domains/example)
Options
--header
: Add custom headers to the request. This can be useful for setting user-agent strings or other HTTP headers required by the target website.
Example:
$ markdown https://httpbin.org/get --header 'User-Agent: Markdown Reader'
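The flag value above has the shape `Name: Value`. As a rough sketch of how such strings could be turned into a headers object, the snippet below splits on the first colon only, so values that contain colons (such as URLs) stay intact. `parseHeader` and `headersFromFlags` are hypothetical helper names used for illustration; they are not part of markdown-read, and the CLI's own parsing may differ.

```javascript
// Hypothetical helper: split one 'Name: Value' string into a [name, value] pair.
function parseHeader(line) {
  const index = line.indexOf(':');
  if (index === -1) {
    throw new Error(`Invalid header (expected "Name: Value"): ${line}`);
  }
  const name = line.slice(0, index).trim();
  const value = line.slice(index + 1).trim();
  if (name === '') {
    throw new Error(`Invalid header (empty name): ${line}`);
  }
  return [name, value];
}

// Hypothetical helper: combine several header strings into one plain object.
function headersFromFlags(flags) {
  return Object.fromEntries(flags.map(parseHeader));
}

console.log(headersFromFlags(['User-Agent: Markdown Reader']));
```

Splitting on the first colon rather than on every colon is the main design point here: a header such as `Referer: https://example.com` would otherwise lose part of its value.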
API Reference
markdown(url: string, options?: MarkdownOptions): Promise<MarkdownContent | null>
Converts a web page to Markdown format.
- url: The URL of the web page to convert
- options: Optional settings for document retrieval and Markdown conversion
  - headers: Additional headers to include in the request
  - fetcher: Custom function to fetch the HTML content
  - All options from TurndownOptions are also supported
Returns a Promise that resolves to a MarkdownContent object, or null if conversion fails.
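Because the promise resolves to null on failure, callers probably want to check the result before reading from it. The sketch below shows one way to do that. The converter is passed in as a parameter so the example runs offline. `pageToMarkdown` and `fakeConvert` are hypothetical names, and the plain-object shape of `headers` is an assumption, since the exact type is not spelled out above.

```javascript
// Hypothetical wrapper around a converter with the documented signature:
// (url, options?) => Promise<MarkdownContent | null>.
async function pageToMarkdown(convert, url) {
  const result = await convert(url, {
    // Assumption: headers are given as a plain object of name/value pairs.
    headers: { 'User-Agent': 'Markdown Reader' },
  });
  if (result === null) {
    // Documented failure mode: the promise resolves to null.
    return `Could not convert ${url}`;
  }
  return result.markdown;
}

// Stand-in converter so the sketch runs without network access.
const fakeConvert = async (url) =>
  url.includes('example.com') ? { markdown: '## Example Domain', url } : null;

pageToMarkdown(fakeConvert, 'https://example.com').then(console.log);
```

With the package installed, the real `markdown` function could be passed in place of `fakeConvert`, assuming it is exported by name in the same way as `turndown`.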
MarkdownContent
The MarkdownContent object extends ReadabilityContent and includes:
- markdown: The converted Markdown content
- length: The length of the Markdown content
- url: The original URL of the web page
turndown(html: string, options?: TurndownOptions): string
Converts HTML content to Markdown.
- html: The HTML string to convert
- options: Optional settings for Turndown conversion. These options override the default settings.
Returns the Markdown representation of the input HTML.
Default Options
{
  emDelimiter: '*',
  codeBlockStyle: 'fenced',
  fence: '```',
  headingStyle: 'atx',
  bulletListMarker: '+'
}
Example
import { turndown } from 'markdown-read';
const html = '<h1>Hello</h1><em>World</em>';
const options = {
  headingStyle: 'setext',
  emDelimiter: '_'
};
const markdown = turndown(html, options);
console.log(markdown);
// Output:
// Hello
// =====
//
// _World_
For a full list of available options, please refer to the Turndown Options documentation.
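One way to picture how caller options override the defaults listed under Default Options is a shallow merge in which later keys win. The sketch below assumes that behaviour; the library's actual merge logic is not shown in this document, and `resolveOptions` is a hypothetical name.

```javascript
// Copy of the documented defaults. The fence value is three backticks,
// built with repeat() so it does not clash with this code block's own fence.
const defaultOptions = {
  emDelimiter: '*',
  codeBlockStyle: 'fenced',
  fence: '`'.repeat(3),
  headingStyle: 'atx',
  bulletListMarker: '+',
};

// Hypothetical helper: caller-supplied keys replace defaults, the rest are kept.
function resolveOptions(overrides = {}) {
  return { ...defaultOptions, ...overrides };
}

const resolved = resolveOptions({ headingStyle: 'setext', emDelimiter: '_' });
console.log(resolved.headingStyle);     // 'setext' (overridden)
console.log(resolved.bulletListMarker); // '+' (default kept)
```

Under this assumption, passing only the keys you want to change is enough; everything else falls back to the defaults.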
Advanced Features
- Handles lazy-loaded images by setting their src attribute.
- Extracts byline information from meta tags.
- Supports platform-specific processing for various websites.
- Uses Mozilla's Readability for content extraction.
- Allows custom fetching logic through the fetcher option.
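As an illustration of the kind of logic a custom fetcher might carry, here is a minimal sketch of a caching wrapper. It assumes a fetcher receives the URL and resolves to the page's HTML string; the exact signature expected by the fetcher option is not stated above, so treat that as an assumption. `cachingFetcher` and `fetchHtml` are hypothetical names.

```javascript
// Assumption: a fetcher is (url) => Promise<string> resolving to HTML.
// Wraps any such function so each URL is fetched at most once on success.
function cachingFetcher(baseFetcher) {
  const cache = new Map();
  return async function fetcher(url) {
    if (!cache.has(url)) {
      cache.set(url, await baseFetcher(url));
    }
    return cache.get(url);
  };
}

// Example base fetcher using the global fetch available in Node.js 18+.
async function fetchHtml(url) {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.text();
}

// A fetcher that could be passed as the fetcher option, under the assumption above.
const cachedFetchHtml = cachingFetcher(fetchHtml);
```

A failed request is not cached, because the rejected promise is thrown before the result is stored, so a later call for the same URL will try again.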