1.0.0 • Published 11 months ago
arraybuffer-xml-parser
This code is based on a copy of fast-xml-parser.
We wanted to parse large XML files (over 1 GB), but the current implementation of fast-xml-parser takes a string as input. In the current V8 implementation of JavaScript, this limits the input size to 512 MB.
This library parses a Uint8Array (or an ArrayBuffer) directly, which raises the limit to 4 GB.
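To illustrate what accepting either input type involves, here is a minimal sketch using only standard JavaScript globals. The helper `toUint8Array` is hypothetical and is not part of this library's API; it only shows that an ArrayBuffer can be wrapped in a Uint8Array view without copying, and that sizes are counted in bytes rather than characters.

```javascript
// Hypothetical helper (not part of arraybuffer-xml-parser):
// normalize either accepted input type to a Uint8Array.
function toUint8Array(input) {
  if (input instanceof Uint8Array) return input;
  // Wrapping an ArrayBuffer creates a view over the same memory, not a copy.
  if (input instanceof ArrayBuffer) return new Uint8Array(input);
  throw new TypeError('Expected a Uint8Array or an ArrayBuffer');
}

// Encode a small XML document to UTF-8 bytes.
const xmlText = '<root>é</root>';
const bytes = new TextEncoder().encode(xmlText);

// Take an exactly sized ArrayBuffer holding the same bytes.
const buffer = bytes.buffer.slice(
  bytes.byteOffset,
  bytes.byteOffset + bytes.byteLength,
);
const view = toUint8Array(buffer);

console.log(view.buffer === buffer); // true: the view shares the buffer
console.log(xmlText.length, view.length); // 14 characters, 15 bytes ('é' takes 2 bytes)
```

In Node.js, a `Buffer` returned by the file system API is also a Uint8Array, so it would pass the first check unchanged.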
Installation
$ npm i arraybuffer-xml-parser
Usage
XML to JSON
import { parse } from 'arraybuffer-xml-parser';

// To keep the example self-contained, we encode a string to obtain the bytes.
const encoder = new TextEncoder();
const xmlData = encoder.encode(
  `<rootNode><tag>value</tag><boolean>true</boolean><intTag>045</intTag><floatTag>65.34</floatTag></rootNode>`,
);
// Assumption: dynamic typing of node values is disabled here, so that the
// values stay strings as in the output shown below.
const options = { dynamicTypingNodeValue: false };
const object = parse(xmlData, options);
/*
object = {
  rootNode: {
    tag: 'value',
    boolean: 'true',
    intTag: '045',
    floatTag: '65.34',
  },
}
*/
Options
| Option | Description | Default value |
|---|---|---|
| trimValues | Trim the string values of attributes and nodes. | true |
| attributeNamePrefix | Prepend the given string to attribute names to identify them. | '$' |
| attributesNodeName | Group all attributes as properties under the given name (must be a valid name). | false |
| ignoreAttributes | Skip attributes when parsing. | false |
| ignoreNameSpace | Remove the namespace string from tag and attribute names. | false |
| allowBooleanAttributes | Allow a tag to have attributes without a value. | false |
| textNodeName | Name of the property containing text nodes | '#text' |
| dynamicTypingAttributeValue | Parse the value of an attribute to float, integer, or boolean. | true |
| dynamicTypingNodeValue | Parse the value of text node to float, integer, or boolean. | true |
| cdataTagName | If specified, the parser parses CDATA as a nested tag instead of adding its value to the parent tag. | false |
| arrayMode | When false, a tag with a single occurrence is parsed as an object, and as an array when it occurs several times. When true, a tag is always parsed as an array, except for leaf nodes. When 'strict', all tags are parsed as arrays. When an instance of RegExp, only tags matching the regex are parsed as arrays. When a function, the tag name is passed to the callback so that it can be checked. | false |
| tagValueProcessor | Process tag values during transformation, for example HTML decoding or word capitalization. Applies to string values only. | (value) => decoder.decode(value).replace(/\r/g, '') |
| attributeValueProcessor | Process attribute values during transformation, for example HTML decoding or word capitalization. | (value) => value |
| stopNodes | An array of tag names whose content does not need to be parsed. It is kept as a Uint8Array. | [] |
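As a sketch of how these options might be combined, the following builds an options object intended for the second argument of parse(). It assumes, as the documented default suggests, that tagValueProcessor receives the raw bytes of the tag value as a Uint8Array; the `item` tag name and the upper-casing step are purely illustrative. The processor is exercised on its own here, without calling the parser.

```javascript
// Shared decoder, mirroring the documented default tagValueProcessor.
const decoder = new TextDecoder();

// Illustrative options object; the tag name 'item' is a made-up example.
const options = {
  trimValues: true,
  ignoreAttributes: false,
  attributeNamePrefix: '$',
  // RegExp form of arrayMode: only tags named 'item' are parsed as arrays.
  arrayMode: /^item$/,
  // Assumed to receive a Uint8Array: decode it, drop carriage returns
  // (as the default does), then upper-case the text.
  tagValueProcessor: (value) =>
    decoder.decode(value).replace(/\r/g, '').toUpperCase(),
};

// Exercise the processor alone on bytes containing Windows line endings.
const raw = new TextEncoder().encode('line one\r\nline two');
const processed = options.tagValueProcessor(raw);
// processed is 'LINE ONE\nLINE TWO'
```

Keeping the decoding step inside a custom tagValueProcessor matters because, under this assumption, replacing the default also replaces the byte-to-string conversion.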