1.0.2 • Published 1 year ago

License
ISC

grab-xml

A simple XML parser.

Installation

Use npm (or yarn, or pnpm) to add grab-xml to your project:

npm install grab-xml

grab-xml contains CommonJS and ESM modules for use in Node, in the browser and at the edge.

Usage

grabXml

Parses XML into a root XmlNode object with all of the XML's nodes as children.

import { grabXml } from 'grab-xml';
const xml = '<xml></xml>';
const doc = grabXml(xml);

XmlNode

Each XmlNode object has the following properties and functions:

| Property | Type | Description |
| --- | --- | --- |
| type | XmlNodeType | The type of node |
| parent | XmlNode | The parent node |
| tag | string | The tag name of the node, if applicable for the node type |
| attributes | object | Any attributes that were set on the node, if applicable for the node type |
| children | XmlNode[] | The child nodes, if applicable for the node type |
| text | string | The text content, if applicable for the node type |
| selfClosing | boolean or undefined | Whether this node is self-closing |

| Function | Description |
| --- | --- |
| content | Returns a string containing the text content of the node and its children |
| outerXml | Returns a string containing the XML of the node and its children, including the node itself |
| innerXml | Returns a string containing the XML of the node's children |
| json | Returns a string containing a JSON representation of the node and its children, excluding the circular parent references |
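
To make the node shape concrete, here is a minimal sketch that builds a tiny tree by hand and walks it. It does not import grab-xml: NodeLike, append, collectText and toJson are hypothetical stand-ins written for this example, and the 'element' and 'text' type labels are assumptions, since the actual XmlNodeType values are not listed here. It illustrates roughly what content and json are described as returning, and why a JSON representation has to leave out the parent references.

```typescript
// Local stand-in for the documented node shape; NOT imported from grab-xml.
// The 'element' and 'text' labels are assumptions made for illustration.
interface NodeLike {
  type: 'element' | 'text';
  parent?: NodeLike;
  tag?: string;
  attributes?: Record<string, string>;
  children?: NodeLike[];
  text?: string;
}

// Attach a child and set its parent reference, which creates a circular link.
function append(parentNode: NodeLike, child: NodeLike): NodeLike {
  child.parent = parentNode;
  if (!parentNode.children) {
    parentNode.children = [];
  }
  parentNode.children.push(child);
  return child;
}

// Concatenate the text of a node and all of its descendants, depth first.
function collectText(node: NodeLike): string {
  const own = node.text ?? '';
  const nested = (node.children ?? []).map(collectText).join('');
  return own + nested;
}

// Serialize a tree while dropping 'parent'; without the replacer,
// JSON.stringify would throw on the circular structure.
function toJson(node: NodeLike): string {
  return JSON.stringify(node, (key, value) => (key === 'parent' ? undefined : value));
}

// Hand-built equivalent of <greeting lang="en">Hello, <name>world</name></greeting>
const greeting: NodeLike = { type: 'element', tag: 'greeting', attributes: { lang: 'en' } };
append(greeting, { type: 'text', text: 'Hello, ' });
const nameNode = append(greeting, { type: 'element', tag: 'name' });
append(nameNode, { type: 'text', text: 'world' });

const textContent = collectText(greeting);
const serialized = toJson(greeting);
```

The same traversal pattern should apply to a tree returned by grabXml, as long as the nodes expose the properties listed in the table above.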

Options

You can pass an options object into the grabXml function with the following optional properties:

| Property | Type | Description |
| --- | --- | --- |
| trimWhitespace | boolean | Whether to trim whitespace from text elements and omit text elements that contain only whitespace |
| ignoreComments | boolean | Whether to ignore comment nodes |
| ignoreInstructions | boolean | Whether to ignore processing instruction nodes |
| voidElements | string[] | The tags of elements that do not have any children, such as <input> and <br> in HTML documents |
| literalElements | string[] | The tags of elements that should be extracted with their unprocessed text content, such as <script> and <style> in HTML documents |
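
As an illustration, an options object covering these properties might look like the sketch below. ParseOptionsSketch is a hypothetical local type that mirrors the table, not a type exported by grab-xml, and the tag lists are example values only. The commented-out call assumes the options object is passed as the second argument, which this document does not state explicitly.

```typescript
// Local mirror of the documented options; NOT imported from grab-xml.
interface ParseOptionsSketch {
  trimWhitespace?: boolean;
  ignoreComments?: boolean;
  ignoreInstructions?: boolean;
  voidElements?: string[];
  literalElements?: string[];
}

// Example values only, chosen to suit an HTML-like document.
const htmlLikeOptions: ParseOptionsSketch = {
  trimWhitespace: true,      // drop whitespace-only text nodes
  ignoreComments: true,      // skip comment nodes
  ignoreInstructions: false, // keep processing instructions
  voidElements: ['br', 'input'],
  literalElements: ['script', 'style'],
};

// With the package installed, the call would presumably look like this
// (the argument position is an assumption):
//   const doc = grabXml(source, htmlLikeOptions);
```

All properties are optional, so an object that sets only one or two of them is equally valid.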

grabHtml

Parses HTML into a root XmlNode object with all of the HTML's nodes as children. In effect, it calls grabXml with the voidElements and literalElements options set to values suited to HTML.

import { grabHtml } from 'grab-xml';
const html = '<html></html>';
const doc = grabHtml(html);

Benchmarks

There is a benchmark available in the bench directory that compares grab-xml with some other options. Use the following commands to run it:

cd bench
npm install
npm run bench

Note that grab-xml does a lot less than some of the slower options, so the timings are not a like-for-like comparison.