1.0.2 • Published 1 year ago

htmlarkdown v1.0.2

Weekly downloads
-
License
-
Repository
github
Last release
1 year ago

HTMLarkdown is a HTML-to-Markdown converter that's able to output HTML-syntax when required.
Like when center-aligning, or resizing images:

How is this different?

Switching to HTML-syntax

Whenever elements cannot be represented in markdown-syntax, HTMLarkdown will switch to HTML-syntax:

Note: The HTML-switching is controlled by the rules' Rule.toUseHtmlPredicate.

But HTMLarkdown tries to use as little HTML-syntax as possible. Mixing markdown and HTML if needed:

Depending on the situation, HTMLarkdown will switch between markdown's backslash-escaping or HTML-escaping:

Handling of edge cases

Adding separators in-between adjacent lists to prevent them from being combined by markdown-renderers:

And more!
But this section is getting too long so...

Installation

npm install htmlarkdown

Usage

Markdown conversion (either from Element or string)

import { HTMLarkdown } from 'htmlarkdown'

/** Convert an element! */
const htmlarkdown = new HTMLarkdown()
const container = document.getElementById('container')
console.log(container.outerHTML)
// => '<div id="container"><h1>Heading</h1></div>'
htmlarkdown.convert(container)
// => '# Heading'


/** 
 * Or a HTML string! 
 * Whichever u prefer. It's 2022, I don't judge :^)
 */
const htmlString = `
<h1>Heading</h1>
<p>Paragraph</p>
`
const htmlStrWithContainer = `<div>${htmlString}</div>`
htmlarkdown.convert(htmlString)
// Set 2nd param 'hasContainer' to true, for container-wrapped string.
htmlarkdown.convert(htmlStrWithContainer, true)
// Both output => '# Heading\n\nParagraph'

Note: If an element is given to convert, it's deep-cloned before any processing/conversion.
Thus, you don't have to worry about it mutating the original element :)

Configuring

/** Configure when creating an instance. */
const htmlarkdown = new HTMLarkdown({
    htmlEscapingMode: '&<>',
    maxPrettyTableWidth: Number.POSITIVE_INFINITY,
    addTrailingLinebreak: true
})

/** Or on an existing instance. */
htmlarkdown.options.maxPrettyTableWidth = -1

Plugins

Plugins are of type (htmlarkdown: HTMLarkdown): void.
They take in a HTMLarkdown instance and configure it by mutating it.

There's 2 plugin-options available in the options object: preloadPlugins and plugins.
The difference is:

  • preloadPlugins loads the plugins first, before your other options. (likes "presets")
    Allowing you to overwrite the plugins' changes:
    const enableTrailingLinebreak: Plugin = (htmlarkdown) => {
        htmlarkdown.options.addTrailingLinebreak = true
    }
    const htmlarkdown = new HTMLarkdown({
        addTrailingLinebreak: false,
        preloadPlugins: [enableTrailingLinebreak],
    })
    htmlarkdown.options.preloadPlugins // false
  • plugins loads the plugins after your other options.
    Meaning, plugins can overwrite your options.
    const enableTrailingLinebreak: Plugin = (htmlarkdown) => {
        htmlarkdown.options.addTrailingLinebreak = true
    }
    const htmlarkdown = new HTMLarkdown({
        addTrailingLinebreak: false,
        plugins: [enableTrailingLinebreak],
    })
    htmlarkdown.options.preloadPlugins // true

You can also load plugins on existing instances:

htmlarkdown.loadPlugins([myPlugin])

Making a copy of an instance

The conversion of a HTMLarkdown instance solely depends on its options property.
Meaning, you create a copy of an instance like this:

const htmlarkdown = new HTMLarkdown()
const copy = new HTMLarkdown(htmlarkdown.options)

Configuring rules/processes

See this section for info on what the rules/processes do.

/**
 * Overwriting default rules/processes.
 * (does NOT include the defaults)
 */
const htmlarkdown = new HTMLarkdown({
    preProcesses: [myPreProcess1, myPreProcess2],
    rules: [myRule1, myRule2],
    textProcesses: [myTextProcess1, myTextProcess2],
    postProcesses: [myPostProcess1, myPostProcess2]
})

/**
 * Adding on to default rules/processes.
 * (includes the defaults)
 */
const htmlarkdown = new HTMLarkdown()
htmlarkdown.addPreProcess(myPreProcess)
htmlarkdown.addRule(myRule)
htmlarkdown.addTextProcess(myTextProcess)
htmlarkdown.addPostProcess(myPostProcess)

How it works

HTMLarkdown has 3 distinct phases:

  1. Pre-processing
    The container-element that's received (and deep-cloned) by the convert method is passed consecutively to each PreProcess in options.preProcesses.

  2. Conversion
    The pre-processed container-element is then recursively converted to markdown.
    Elements are converted by Rule in options.rules.
    Text-nodes are converted by TextProcess in options.textProcesses.
    The rule/text-process outputs strings are then appended to each other, to give the raw markdown.

  3. Post-processing
    The raw markdown string is then passed consecutively to each PostProcess in options.postProcess, to give the final markdown.

Contributing

Bugs

HTMLarkdown is still under-development, so there'll likely be bugs.

So the easiest way to contribute is submit an issue (with the bug label), especially for any incorrect markdown-conversions :)

For any incorrect markdown-conversions, state the:

  • input HTML
  • current incorrect markdown output
  • expected markdown output

New conversions, ideas, features, tests

If you have any new elements-conversions / ideas / features / tests that you think should be added, leave an issue with feature or improve label!

  • feature label is for new features
  • improve label is for improvements on existing features

Understandably, there are gray areas on what is a "feature" and what is an "improvement". So just go with whichever seems more appropriate :)

Other markdown specs

Currently, HTMLarkdown has been designed to output markdown for GitHub specifically (ie. GFM).
BUT, if there's another markdown spec. that you'd like to design for (maybe as a plugin?), do leave an issue/discussion :D

Coding-related stuff

Code-formatting is handled by Prettier, so no need to worry bout it :)

Any new feature should

  • be documented via TSDoc
  • come with new unit-tests for them
  • and should pass all new/existing tests

As for which merging method to use, check out the discussion.

Contributors

So far it's just me, so pls send help! :^)

Roadmap

If you've any new ideas / features, check out the Contributing section for it!

Element conversions

Block-elements:

  • Headings (For now, only ATX-style)
  • Paragraph
  • Codeblock
  • Blockquote
  • Lists
    (ordered, unordered, tight and loose)
  • (GFM) Table
  • (GFM) Task-list (Below are some planned block-elements that don't have markdown-equivalent)
  • <span> (handled by a noop-rule)
  • <div> (For now, handled by a noop-rule)
  • Definition list (ie. <dl>, <dt>, <dd>)
  • Collapsible section (ie. <details>)

Text-formattings:

  • Bold (For now, only outputs in asterisks **BOLD**)
  • Italic (For now, only outputs in asterisks *ITALIC*)
  • (GFM) Strikethrough
  • Code
  • Link (For now, only inline links)
  • Superscript (ie. <sup>)
  • Subscript (ie. <sub>)
  • Underline (ie. <u>, <ins>)
    (didn't know underlines possible till recently)

Misc:

  • Images (For now, only inline links)
  • Horizontal-rule (ie. <hr>)
  • Linebreaks (ie. <brr>)
  • Preserved HTML comments (Issue #25) (eg. <!-- COMMENT -->)

Features to be added:

  • Custom id attributes

    Go to [section with id](#my-section)
    
    <p id="my-section">
      My section
    </p>
  • Reversing GitHub's Issue/PR autolinks

  • Ability to customise how codeblock's syntax-highlighting langauge is obtained from the <pre><code> elements

License

The MIT License (MIT).
So it's freeeeeee