// mxj - A collection of map[string]interface{} and associated XML and JSON utilities.
// Copyright 2012-2019, Charles Banning. All rights reserved.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.

/*
Marshal/Unmarshal XML to/from map[string]interface{} values (and JSON); extract/modify values from maps by key or key-path, including wildcards.

mxj supplants the legacy x2j and j2x packages. The subpackage x2j-wrapper is provided to facilitate migrating from the x2j package. The x2j and j2x subpackages provide functionality similar to the old packages but are not function-name compatible with them.

Note: this library was designed for processing ad hoc anonymous messages. Bulk processing of large data sets may be performed much more efficiently using the encoding/xml or encoding/json packages from Go's standard library directly.

Related Packages:
	checkxml: github.com/clbanning/checkxml provides functions for validating XML data.

Notes:
	2022.11.28: v2.7 - add SetGlobalKeyMapPrefix to change default prefix, '#', for default keys
	2022.11.20: v2.6 - add NewMapFormattedXmlSeq for XML docs formatted with whitespace characters
	2021.02.02: v2.5 - add XmlCheckIsValid toggle to force checking that the encoded XML is valid
	2020.12.14: v2.4 - add XMLEscapeCharsDecoder to preserve XML escaped characters in Map values
	2020.10.28: v2.3 - add TrimWhiteSpace option
	2020.05.01: v2.2 - optimize map to XML encoding for large XML docs.
	2019.07.04: v2.0 - remove unnecessary methods - mv.XmlWriterRaw, mv.XmlIndentWriterRaw - for Map and MapSeq.
	2019.07.04: Add MapSeq type and move associated functions and methods from Map to MapSeq.
	2019.01.21: DecodeSimpleValuesAsMap - decode to map[<tag>:map["#text":<value>]] rather than map[<tag>:<value>].
	2018.04.18: mv.Xml/mv.XmlIndent encodes non-map[string]interface{} map values - map[string]string, map[int]uint, etc.
	2018.03.29: mv.Gob/NewMapGob support gob encoding/decoding of Maps.
	2018.03.26: Added mxj/x2j-wrapper sub-package for migrating from legacy x2j package.
	2017.02.22: LeafNode paths can use ".N" syntax rather than "[N]" for list member indexing.
	2017.02.21: github.com/clbanning/checkxml provides functions for validating XML data.
	2017.02.10: SetFieldSeparator changes field separator for args in UpdateValuesForPath, ValuesFor... methods.
	2017.02.06: Support XMPP stream processing - HandleXMPPStreamTag().
	2016.11.07: Preserve name space prefix syntax in XmlSeq parser - NewMapXmlSeq(), etc.
	2016.06.25: Support overriding default XML attribute prefix, "-", in Map keys - SetAttrPrefix().
	2016.05.26: Support customization of xml.Decoder by exposing CustomDecoder variable.
	2016.03.19: Escape invalid chars when encoding XML attribute and element values - XMLEscapeChars().
	2016.03.02: By default decoding XML with float64 and bool value casting will not cast "NaN", "Inf", and "-Inf".
	            To cast them to float64, first set flag with CastNanInf(true).
	2016.02.22: New mv.Root(), mv.Elements(), mv.Attributes() methods let you examine XML document structure.
	2016.02.16: Add CoerceKeysToLower() option to handle tags with mixed capitalization.
	2016.02.12: Seek for first xml.StartElement token; only return error if io.EOF is reached first (handles BOM).
	2015-12-02: NewMapXmlSeq() with mv.XmlSeq() & co. will try to preserve structure of XML doc when re-encoding.
	2014-08-02: AnyXml() and AnyXmlIndent() will try to marshal arbitrary values to XML.

SUMMARY

	type Map map[string]interface{}

	Create a Map value, 'mv', from any map[string]interface{} value, 'v':
		mv := Map(v)

	Unmarshal / marshal XML as a Map value, 'mv':
		mv, err := NewMapXml(xmlValue) // unmarshal
		xmlValue, err := mv.Xml()      // marshal

	Unmarshal XML from an io.Reader as a Map value, 'mv':
		mv, err := NewMapXmlReader(xmlReader)         // repeated calls, as with an os.File Reader, will process stream
		mv, raw, err := NewMapXmlReaderRaw(xmlReader) // 'raw' is the raw XML that was decoded

	Marshal Map value, 'mv', to an XML Writer (io.Writer):
		err := mv.XmlWriter(xmlWriter)
		raw, err := mv.XmlWriterRaw(xmlWriter) // 'raw' is the raw XML that was written on xmlWriter; removed in v2.0 (see Notes)

	Also, for prettified output:
		xmlValue, err := mv.XmlIndent(prefix, indent, ...)
		err := mv.XmlIndentWriter(xmlWriter, prefix, indent, ...)
		raw, err := mv.XmlIndentWriterRaw(xmlWriter, prefix, indent, ...) // removed in v2.0 (see Notes)

	Bulk process XML with error handling (note: handlers must return a boolean value):
		err := HandleXmlReader(xmlReader, mapHandler(Map), errHandler(error))
		err := HandleXmlReaderRaw(xmlReader, mapHandler(Map, []byte), errHandler(error, []byte))

	Converting XML to JSON: see Examples for NewMapXml and HandleXmlReader.

	There are comparable functions and methods for JSON processing.

	Arbitrary structure values can be decoded to / encoded from Map values:
		mv, err := NewMapStruct(structVal)
		err := mv.Struct(structPointer)

	To work with XML tag values, JSON or Map key values, or structure field values, decode the XML, JSON,
	or structure to a Map value, 'mv', or cast a map[string]interface{} value to a Map value, 'mv', then:
		paths := mv.PathsForKey(key)
		path := mv.PathForKeyShortest(key)
		values, err := mv.ValuesForKey(key, subkeys)
		values, err := mv.ValuesForPath(path, subkeys) // 'path' can be dot-notation with wildcards and indexed arrays.
		count, err := mv.UpdateValuesForPath(newVal, path, subkeys)

	Get everything at once, irrespective of path depth:
		leafnodes := mv.LeafNodes()
		leafvalues := mv.LeafValues()

	A new Map with whatever keys are desired can be created from the current Map and then encoded in XML
	or JSON. (Note: keys can use dot-notation. 'oldKey' can also use wildcards and indexed arrays.)
		newMap, err := mv.NewMap("oldKey_1:newKey_1", "oldKey_2:newKey_2", ..., "oldKey_N:newKey_N")
		newMap, err := mv.NewMap("oldKey1", "oldKey3", "oldKey5") // a subset of 'mv'; see "examples/partial.go"
		newXml, err := newMap.Xml()   // for example
		newJson, err := newMap.Json() // ditto

XML PARSING CONVENTIONS

Using NewMapXml()

	- Attributes are parsed to `map[string]interface{}` values by prefixing a hyphen, `-`,
	  to the attribute label. (Unless overridden by `PrependAttrWithHyphen(false)` or
	  `SetAttrPrefix()`.)
	- If the element is a simple element and has attributes, the element value
	  is given the key `#text` for its `map[string]interface{}` representation. (See
	  the 'atomFeedString.xml' test data, below.)
	- XML comments, directives, and processing instructions are ignored.
	- If CoerceKeysToLower() has been called, then the resultant keys will be lower case.

Using NewMapXmlSeq()

	- Attributes are parsed to `map["#attr"]map[<attr_label>]map[string]interface{}` values
	  where the `<attr_label>` value has "#text" and "#seq" keys - the "#text" key holds the
	  value for `<attr_label>`.
	- All elements, except for the root, have a "#seq" key.
	- Comments, directives, and processing instructions are unmarshalled into the Map using the
	  keys "#comment", "#directive", and "#procinst", respectively. (See documentation for more
	  specifics.)
	- Name space syntax is preserved:
		- <ns:key>something</ns:key> parses to map["ns:key"]interface{}{"something"}
		- xmlns:ns="http://myns.com/ns" parses to map["xmlns:ns"]interface{}{"http://myns.com/ns"}

Both

	- By default, "NaN", "Inf", and "-Inf" values are not cast to float64. If you want them
	  to be cast, set a flag to cast them using CastNanInf(true).

XML ENCODING CONVENTIONS

	- 'nil' Map values, which may represent 'null' JSON values, are encoded as "<tag/>".
	  NOTE: the operation is not symmetric as "<tag/>" elements are decoded as 'tag:""' Map values,
	  which, then, encode in JSON as '"tag":""' values.
	- ALSO: there is no guarantee that the encoded XML doc will be the same as the decoded one. (Go
	  randomizes the walk through map[string]interface{} values.) If you plan to re-encode the
	  Map value to XML and want the same sequencing of elements, look at NewMapXmlSeq() and
	  mv.XmlSeq() - these try to preserve the element sequencing but with added complexity when
	  working with the Map representation.

*/
package mxj
