grab-xml v1.0.2
grab-xml
A simple XML parser.
Installation
Use npm (or yarn, or pnpm) to add grab-xml to your project:
npm install grab-xmlgrab-xml contains CommonJS and ESM modules for use in Node, in the browser and at the edge.
Usage
grabXml
Parses XML into a root XmlNode object with all of the XML's nodes as children.
import { grabXml } from 'grab-xml';
const xml = '<xml></xml>';
const doc = grabXml(xml);XmlNode
Each XmlNode object has the following properties and functions:
| Property | Type | Description | |
|---|---|---|---|
| type | XmlNodeType | The type of node | |
| parent | XmlNode | The parent node | |
| tag | string | The tag name of the node, if applicable for the node type | |
| attributes | object | Any attributes that were set on the node, if applicable for the node type | |
| children | XmlNode[] | The child nodes, if applicable for the node type | |
| text | string | The text content, if applicable for the node type | |
| selfClosing | boolean | undefined | Whether this node is self-closing |
| Function | Description |
|---|---|
| content | Returns a string containing the text content of the node and its children |
| outerXml | Returns a string containing the XML of the node and its children, including the node itself |
| innerXml | Returns a string containing the XML of the node's children |
| json | Returns a string containing a JSON representation of the node and its children, excluding the circular parent references |
Options
You can pass an options object into the grabXml function with the following optional properties:
| Property | Type | Description |
|---|---|---|
| trimWhitespace | boolean | Whether to trim whitespace from text elements and omit text elements that contain only whitespace |
| ignoreComments | boolean | Whether to ignore comment nodes |
| ignoreInstructions | boolean | Whether to ignore processing instruction nodes |
| voidElements | string[] | The tags of elements that do not have any children, such as <input> and <br> in HTML documents |
| literalElements | string[] | The tags of elements that should be extracted with their unprocessed text content, such as <script> and <style> in HTML documents |
grabHtml
Parses HTML into a root XmlNode object with all of the HTML's nodes as children. Basically, it calls grabXml with the voidElements and literalElements options set to values that work for HTML.
import { grabHtml } from 'grab-xml';
const html = '<html></html>';
const doc = grabHtml(html);Benchmarks
There is a benchmark available in the bench directory that compares grab-xml with some other options. Use the following commands to run it:
cd bench
npm install
npm run benchNote that grab-xml does a lot less than some of the slower options.