This code is based on a copy of fast-xml-parser.
We wanted to parse large XML files (over 1 GB), but fast-xml-parser takes a string as input, and the current V8 implementation of JavaScript limits the size of a string to 512 MB.
This code parses a Uint8Array (or an ArrayBuffer) directly, which raises the limit to 4 GB.
$ npm i arraybuffer-xml-parser
import { parse } from 'arraybuffer-xml-parser';
// For this example we encode a string to get a Uint8Array.
const encoder = new TextEncoder();
const xmlData = encoder.encode(
  `<rootNode><tag>value</tag><boolean>true</boolean><intTag>045</intTag><floatTag>65.34</floatTag></rootNode>`,
);
// Optional settings, see the table of options. Dynamic typing of node values
// is turned off here so that the values stay strings, as in the result shown.
const options = { dynamicTypingNodeValue: false };
const object = parse(xmlData, options);
/*
object = {
  rootNode: {
    tag: 'value',
    boolean: 'true',
    intTag: '045',
    floatTag: '65.34',
  },
}
*/
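Since the parser accepts either a Uint8Array or an ArrayBuffer, it may help to see how the two relate. The sketch below is only an illustration: the `arrayBuffer` constant is a stand-in for bytes obtained elsewhere (for example from `await response.arrayBuffer()`), and the commented `parse` calls assume the signature shown in the example above.

```javascript
// Stand-in for bytes obtained elsewhere, e.g. from
// `await response.arrayBuffer()` or `await file.arrayBuffer()`.
const arrayBuffer = new TextEncoder().encode('<root><a>1</a></root>').buffer;

// Wrapping an ArrayBuffer in a Uint8Array creates a view on the same memory,
// not a copy, so the XML bytes are not duplicated.
const bytes = new Uint8Array(arrayBuffer);

console.log(bytes.buffer === arrayBuffer); // true
console.log(bytes.byteLength === arrayBuffer.byteLength); // true

// Either form could then be handed to the parser (assumed usage):
// const fromView = parse(bytes);
// const fromBuffer = parse(arrayBuffer);
```

The point of the design is that the data never has to pass through a JavaScript string, which is where the 512 MB limit comes from.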
Option | Description | Default value |
---|---|---|
trimValues | Trim string values of an attribute or node. | true |
attributeNamePrefix | Prepend the given string to attribute names for identification. | '$' |
attributesNodeName | (Valid name) Group all the attributes as properties of given name. | false |
ignoreAttributes | Ignore attributes to be parsed. | false |
ignoreNameSpace | Remove namespace string from tag and attribute names. | false |
allowBooleanAttributes | Allow a tag to have attributes without any value. | false |
textNodeName | Name of the property containing text nodes | '#text' |
dynamicTypingAttributeValue | Parse the value of an attribute to float, integer, or boolean. | true |
dynamicTypingNodeValue | Parse the value of text node to float, integer, or boolean. | true |
cdataTagName | If specified, the parser stores CDATA as a nested tag instead of adding its value to the parent tag. | false |
arrayMode | When false, a tag with a single occurrence is parsed as an object, and as an array when it occurs multiple times. When true, a tag is always parsed as an array, except for leaf nodes. When 'strict', all tags are parsed as arrays. When an instance of RegExp, only tags whose name matches the regular expression are parsed as arrays. When a function, the tag name is passed to the callback, which can check it. | false |
tagValueProcessor | Process tag value during transformation. Like HTML decoding, word capitalization, etc. Applicable in case of string only. | (value) => decoder.decode(value).replace(/\r/g, '') |
attributeValueProcessor | Process attribute value during transformation. Like HTML decoding, word capitalization, etc. | (value) => value |
stopNodes | An array of tag names whose content is not parsed. Their content is kept as a Uint8Array. | [] |
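
The default `tagValueProcessor` in the table suggests that processors receive the raw bytes of a value as a Uint8Array. The following sketch works under that assumption and does not call the library itself: `upperCaseProcessor` and `rawValue` are hypothetical names used only for illustration, and the commented `parse` call is an assumed usage.

```javascript
const encoder = new TextEncoder();
const decoder = new TextDecoder();

// Default processor as listed in the table: decode the bytes and drop carriage returns.
const defaultTagValueProcessor = (value) =>
  decoder.decode(value).replace(/\r/g, '');

// Hypothetical custom processor: same decoding, then upper-case the text.
const upperCaseProcessor = (value) =>
  defaultTagValueProcessor(value).toUpperCase();

// Stand-in for the raw bytes the parser would pass to a processor.
const rawValue = encoder.encode('line one\r\nline two');

console.log(JSON.stringify(defaultTagValueProcessor(rawValue))); // "line one\nline two"
console.log(JSON.stringify(upperCaseProcessor(rawValue))); // "LINE ONE\nLINE TWO"

// Assumed usage with the parser:
// const object = parse(xmlData, {
//   tagValueProcessor: upperCaseProcessor,
//   stopNodes: ['raw'],
// });
// Values of stop nodes stay as Uint8Array and can be decoded when needed:
// const text = decoder.decode(object.rootNode.raw);
```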