Implement XML data support
author Paul van Brouwershaven <vanbroup@users.noreply.github.com>
Thu, 2 Dec 2021 16:30:36 +0000 (17:30 +0100)
committer GitHub <noreply@github.com>
Thu, 2 Dec 2021 16:30:36 +0000 (17:30 +0100)
Example:

```
{{ with resources.Get "https://example.com/rss.xml" | transform.Unmarshal }}
    {{ range .channel.item }}
        <strong>{{ .title | plainify | htmlUnescape }}</strong><br />
        <p>{{ .description | plainify | htmlUnescape }}</p>
        {{ $link := .link | plainify | htmlUnescape }}
        <a href="{{ $link }}">{{ $link }}</a><br />
        <hr>
    {{ end }}
{{ end }}
```

Closes #4470

12 files changed:
docs/content/en/functions/transform.Unmarshal.md
docs/content/en/templates/data-templates.md
go.mod
go.sum
hugolib/resource_chain_test.go
parser/frontmatter.go
parser/metadecoders/decoder.go
parser/metadecoders/decoder_test.go
parser/metadecoders/format.go
parser/metadecoders/format_test.go
tpl/transform/remarshal_test.go
tpl/transform/unmarshal_test.go

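diff --git a/docs/content/en/functions/transform.Unmarshal.md b/docs/content/en/functions/transform.Unmarshal.md
index 973779c4c1daf3e5223b547ec5ab45031b4f742f..9b380dc578c74aeb8a59da142b392650c31c82e4 100644
--- a/docs/content/en/functions/transform.Unmarshal.md
+++ b/docs/content/en/functions/transform.Unmarshal.md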
@@ -1,6 +1,6 @@
 ---
 title: "transform.Unmarshal"
-description: "`transform.Unmarshal` (alias `unmarshal`) parses the input and converts it into a map or an array. Supported formats are JSON, TOML, YAML and CSV."
+description: "`transform.Unmarshal` (alias `unmarshal`) parses the input and converts it into a map or an array. Supported formats are JSON, TOML, YAML, XML and CSV."
 date: 2018-12-23
 categories: [functions]
 menu:
@@ -45,3 +45,32 @@ Example:
 ```go-html-template
 {{ $csv := "a;b;c" | transform.Unmarshal (dict "delimiter" ";") }}
 ```
+
+## XML data
+
+As a convenience, Hugo allows you to access XML data in the same way that you access JSON, TOML, and YAML: you do not need to specify the root node when accessing the data.
+
+To get the contents of `<title>` in the document below, you use `{{ .message.title }}`:
+
+```
+<root>
+    <message>
+        <title>Hugo rocks!</title>
+        <description>Thanks for using Hugo</description>
+    </message>
+</root>
+```
+
+The following example lists the items of an RSS feed:
+
+```
+{{ with resources.Get "https://example.com/rss.xml" | transform.Unmarshal }}
+    {{ range .channel.item }}
+        <strong>{{ .title | plainify | htmlUnescape }}</strong><br />
+        <p>{{ .description | plainify | htmlUnescape }}</p>
+        {{ $link := .link | plainify | htmlUnescape }}
+        <a href="{{ $link }}">{{ $link }}</a><br />
+        <hr>
+    {{ end }}
+{{ end }}
+```
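Reviewer note: in the `TestUnmarshalXML` fixture of this commit, a feed with a single `<item>` decodes to a map, not a slice, so the shape that `range .channel.item` receives may depend on how many items the feed contains. The following is a minimal Go sketch of one way to normalize that shape before iterating. `asSlice` is a hypothetical helper introduced for illustration only; it is not part of Hugo or mxj.

```go
package main

import "fmt"

// asSlice is a hypothetical helper (not part of Hugo or mxj).
// It returns a slice unchanged and wraps any other non-nil value
// in a one-element slice, so callers can always iterate.
func asSlice(v interface{}) []interface{} {
	switch t := v.(type) {
	case nil:
		return nil
	case []interface{}:
		return t
	default:
		return []interface{}{t}
	}
}

func main() {
	// Shape produced for a feed with exactly one <item>.
	one := map[string]interface{}{"title": "Example title"}
	// Shape assumed for a feed with several <item> elements.
	many := []interface{}{
		map[string]interface{}{"title": "First"},
		map[string]interface{}{"title": "Second"},
	}

	for _, item := range asSlice(one) {
		fmt.Println(item.(map[string]interface{})["title"])
	}
	fmt.Println(len(asSlice(one)), len(asSlice(many)), len(asSlice(nil)))
}
```

Whether Hugo templates need such normalization depends on the feed; the sketch only illustrates the map-versus-slice difference visible in the test fixture.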
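diff --git a/docs/content/en/templates/data-templates.md b/docs/content/en/templates/data-templates.md
index c36344776be44e5835169948a55b36e2769c65d1..441cc2f10150f58c38dd3e3a11221458be620406 100644
--- a/docs/content/en/templates/data-templates.md
+++ b/docs/content/en/templates/data-templates.md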
@@ -6,7 +6,7 @@ date: 2017-02-01
 publishdate: 2017-02-01
 lastmod: 2017-03-12
 categories: [templates]
-keywords: [data,dynamic,csv,json,toml,yaml]
+keywords: [data,dynamic,csv,json,toml,yaml,xml]
 menu:
   docs:
     parent: "templates"
@@ -20,7 +20,7 @@ toc: true
 
 <!-- begin data files -->
 
-Hugo supports loading data from YAML, JSON, and TOML files located in the `data` directory in the root of your Hugo project.
+Hugo supports loading data from YAML, JSON, XML, and TOML files located in the `data` directory in the root of your Hugo project.
 
 {{< youtube FyPgSuwIMWQ >}}
 
@@ -28,7 +28,7 @@ Hugo supports loading data from YAML, JSON, and TOML files located in the `data`
 
 The `data` folder is where you can store additional data for Hugo to use when generating your site. Data files aren't used to generate standalone pages; rather, they're meant to be supplemental to content files. This feature can extend the content in case your front matter fields grow out of control. Or perhaps you want to show a larger dataset in a template (see example below). In both cases, it's a good idea to outsource the data in their own files.
 
-These files must be YAML, JSON, or TOML files (using the `.yml`, `.yaml`, `.json`, or `.toml` extension). The data will be accessible as a `map` in the `.Site.Data` variable.
+These files must be YAML, JSON, XML, or TOML files (using the `.yml`, `.yaml`, `.json`, `.xml`, or `.toml` extension). The data will be accessible as a `map` in the `.Site.Data` variable.
 
 ## Data Files in Themes
 
@@ -95,7 +95,7 @@ Discover a new favorite bass player? Just add another `.toml` file in the same d
 
 ## Example: Accessing Named Values in a Data File
 
-Assume you have the following data structure in your `User0123.[yml|toml|json]` data file located directly in `data/`:
+Assume you have the following data structure in your `User0123.[yml|toml|xml|json]` data file located directly in `data/`:
 
 {{< code-toggle file="User0123" >}}
 Name: User0123
@@ -232,6 +232,7 @@ If you change any local file and the LiveReload is triggered, Hugo will read the
 * [YAML Spec][yaml]
 * [JSON Spec][json]
 * [CSV Spec][csv]
+* [XML Spec][xml]
 
 [config]: /getting-started/configuration/
 [csv]: https://tools.ietf.org/html/rfc4180
@@ -247,3 +248,4 @@ If you change any local file and the LiveReload is triggered, Hugo will read the
 [variadic]: https://en.wikipedia.org/wiki/Variadic_function
 [vars]: /variables/
 [yaml]: https://yaml.org/spec/
+[xml]: https://www.w3.org/XML/
diff --git a/go.mod b/go.mod
index 0fbade221d3ca8f42569cbfa84e4c49b661ff88a..db14b0100d32e0824f1bd9a30ea92e347cb1d55e 100644
--- a/go.mod
+++ b/go.mod
@@ -13,6 +13,7 @@ require (
        github.com/bep/golibsass v1.0.0
        github.com/bep/gowebp v0.1.0
        github.com/bep/tmc v0.5.1
+       github.com/clbanning/mxj/v2 v2.5.5
        github.com/cli/safeexec v1.0.0
        github.com/disintegration/gift v1.2.1
        github.com/dustin/go-humanize v1.0.0
diff --git a/go.sum b/go.sum
index 4093bf6958fdd53685606fd30dd060442cc3478c..952d02793aadb7c7344f5c1700a2d1517b662d42 100644
--- a/go.sum
+++ b/go.sum
@@ -144,6 +144,8 @@ github.com/cheekybits/is v0.0.0-20150225183255-68e9c0620927/go.mod h1:h/aW8ynjgk
 github.com/chzyer/logex v1.1.10/go.mod h1:+Ywpsq7O8HXn0nuIou7OrIPyXbp3wmkHB+jjWRnGsAI=
 github.com/chzyer/readline v0.0.0-20180603132655-2972be24d48e/go.mod h1:nSuG5e5PlCu98SY8svDHJxuZscDgtXS6KTTbou5AhLI=
 github.com/chzyer/test v0.0.0-20180213035817-a1ea475d72b1/go.mod h1:Q3SI9o4m/ZMnBNeIyt5eFwwo7qiLfzFZmjNmxjkiQlU=
+github.com/clbanning/mxj/v2 v2.5.5 h1:oT81vUeEiQQ/DcHbzSytRngP6Ky9O+L+0Bw0zSJag9E=
+github.com/clbanning/mxj/v2 v2.5.5/go.mod h1:hNiWqW14h+kc+MdF9C6/YoRfjEJoR3ou6tn/Qo+ve2s=
 github.com/cli/safeexec v1.0.0 h1:0VngyaIyqACHdcMNWfo6+KdUYnqEr2Sg+bSP1pdF+dI=
 github.com/cli/safeexec v1.0.0/go.mod h1:Z/D4tTN8Vs5gXYHDCbaM1S/anmEDnJb1iW0+EJ5zx3Q=
 github.com/client9/misspell v0.3.4/go.mod h1:qj6jICC3Q7zFZvVWo7KLAzC3yx5G7kyvSDkc90ppPyw=
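diff --git a/hugolib/resource_chain_test.go b/hugolib/resource_chain_test.go
index 0a997cda35fbd92f524ff2e48b848ac4deef6ab5..05e8a9d009979161744e4a69c6940703d529dacf 100644
--- a/hugolib/resource_chain_test.go
+++ b/hugolib/resource_chain_test.go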
@@ -591,9 +591,9 @@ func TestResourceChains(t *testing.T) {
 
                case "/mydata/xml1.xml":
                        w.Write([]byte(`
-                               <hello>
-                                       <world>Hugo Rocks!</<world>
-                               </hello>`))
+                                       <hello>
+                                               <world>Hugo Rocks!</<world>
+                                       </hello>`))
                        return
 
                case "/mydata/svg1.svg":
@@ -872,16 +872,19 @@ Publish 2: {{ $cssPublish2.Permalink }}
 {{ $toml := "slogan = \"Hugo Rocks!\"" | resources.FromString "slogan.toml" | transform.Unmarshal }}
 {{ $csv1 := "\"Hugo Rocks\",\"Hugo is Fast!\"" | resources.FromString "slogans.csv" | transform.Unmarshal }}
 {{ $csv2 := "a;b;c" | transform.Unmarshal (dict "delimiter" ";") }}
+{{ $xml := "<?xml version=\"1.0\" encoding=\"UTF-8\"?><note><to>You</to><from>Me</from><heading>Reminder</heading><body>Do not forget XML</body></note>" | transform.Unmarshal }}
 
 Slogan: {{ $toml.slogan }}
 CSV1: {{ $csv1 }} {{ len (index $csv1 0)  }}
 CSV2: {{ $csv2 }}              
+XML: {{ $xml.body }}
 `)
                }, func(b *sitesBuilder) {
                        b.AssertFileContent("public/index.html",
                                `Slogan: Hugo Rocks!`,
                                `[[Hugo Rocks Hugo is Fast!]] 2`,
                                `CSV2: [[a b c]]`,
+                               `XML: Do not forget XML`,
                        )
                }},
                {"resources.Get", func() bool { return true }, func(b *sitesBuilder) {
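diff --git a/parser/frontmatter.go b/parser/frontmatter.go
index 79701a0fcd2d22e4a9533fb9c0e09f3612904314..a998295217eba8bcc9b828c42896909ad528ff33 100644
--- a/parser/frontmatter.go
+++ b/parser/frontmatter.go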
@@ -23,6 +23,8 @@ import (
        toml "github.com/pelletier/go-toml/v2"
 
        yaml "gopkg.in/yaml.v2"
+
+       xml "github.com/clbanning/mxj/v2"
 )
 
 const (
@@ -62,7 +64,14 @@ func InterfaceToConfig(in interface{}, format metadecoders.Format, w io.Writer)
 
                _, err = w.Write([]byte{'\n'})
                return err
+       case metadecoders.XML:
+               b, err := xml.AnyXmlIndent(in, "", "\t", "root")
+               if err != nil {
+                       return err
+               }
 
+               _, err = w.Write(b)
+               return err
        default:
                return errors.New("unsupported Format provided")
        }
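diff --git a/parser/metadecoders/decoder.go b/parser/metadecoders/decoder.go
index 168c130ed9076df8d69708c9f6120f1b3e8b02c0..f0dcb08560d897688fb36a6e36c851fa1246dfc4 100644
--- a/parser/metadecoders/decoder.go
+++ b/parser/metadecoders/decoder.go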
@@ -24,6 +24,7 @@ import (
        "github.com/gohugoio/hugo/common/herrors"
        "github.com/niklasfasching/go-org/org"
 
+       xml "github.com/clbanning/mxj/v2"
        toml "github.com/pelletier/go-toml/v2"
        "github.com/pkg/errors"
        "github.com/spf13/afero"
@@ -135,6 +136,25 @@ func (d Decoder) UnmarshalTo(data []byte, f Format, v interface{}) error {
                err = d.unmarshalORG(data, v)
        case JSON:
                err = json.Unmarshal(data, v)
+       case XML:
+               var xmlRoot xml.Map
+               xmlRoot, err = xml.NewMapXml(data)
+
+               var xmlValue map[string]interface{}
+               if err == nil {
+                       xmlRootName, err := xmlRoot.Root()
+                       if err != nil {
+                               return toFileError(f, errors.Wrap(err, "failed to unmarshal XML"))
+                       }
+                       xmlValue = xmlRoot[xmlRootName].(map[string]interface{})
+               }
+
+               switch v := v.(type) {
+               case *map[string]interface{}:
+                       *v = xmlValue
+               case *interface{}:
+                       *v = xmlValue
+               }
        case TOML:
                err = toml.Unmarshal(data, v)
        case YAML:
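Reviewer note: the XML case in the hunk above takes the value stored under the root key with an unchecked type assertion. Below is a minimal sketch of a checked variant, assuming the decoded document is a plain `map[string]interface{}` with exactly one top-level key. `unwrapRoot` is a hypothetical name, and the sketch deliberately avoids mxj so that it runs with the standard library alone.

```go
package main

import (
	"errors"
	"fmt"
)

// unwrapRoot is a hypothetical helper (not Hugo's implementation).
// It returns the children of the single root element, or an error
// when there is not exactly one root or the root holds no child map.
func unwrapRoot(doc map[string]interface{}) (map[string]interface{}, error) {
	if len(doc) != 1 {
		return nil, errors.New("expected exactly one root element")
	}
	for name, v := range doc {
		children, ok := v.(map[string]interface{})
		if !ok {
			return nil, fmt.Errorf("root element %q holds %T, not a map", name, v)
		}
		return children, nil
	}
	return nil, errors.New("unreachable: map with one key yielded no entry")
}

func main() {
	doc := map[string]interface{}{
		"root": map[string]interface{}{
			"message": map[string]interface{}{"title": "Hugo rocks!"},
		},
	}
	children, err := unwrapRoot(doc)
	fmt.Println(children["message"], err)

	// A root that holds only text is reported as an error, not a panic.
	_, err = unwrapRoot(map[string]interface{}{"root": "just text"})
	fmt.Println(err)
}
```

Returning an error here is a design choice for the sketch; whether text-only roots should be rejected or passed through is left open.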
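diff --git a/parser/metadecoders/decoder_test.go b/parser/metadecoders/decoder_test.go
index e0990a5f7a9b415d4d1190df7a828f06bdbe2eb2..8cd5513b2c6ad746592ab88af37201637951ed05 100644
--- a/parser/metadecoders/decoder_test.go
+++ b/parser/metadecoders/decoder_test.go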
@@ -20,6 +20,59 @@ import (
        qt "github.com/frankban/quicktest"
 )
 
+func TestUnmarshalXML(t *testing.T) {
+       c := qt.New(t)
+
+       xmlDoc := `<?xml version="1.0" encoding="utf-8" standalone="yes"?>
+       <rss version="2.0"
+               xmlns:atom="http://www.w3.org/2005/Atom">
+               <channel>
+                       <title>Example feed</title>
+                       <link>https://example.com/</link>
+                       <description>Example feed</description>
+                       <generator>Hugo -- gohugo.io</generator>
+                       <language>en-us</language>
+                       <copyright>Example</copyright>
+                       <lastBuildDate>Fri, 08 Jan 2021 14:44:10 +0000</lastBuildDate>
+                       <atom:link href="https://example.com/feed.xml" rel="self" type="application/rss+xml"/>
+                       <item>
+                               <title>Example title</title>
+                               <link>https://example.com/2021/11/30/example-title/</link>
+                               <pubDate>Tue, 30 Nov 2021 15:00:00 +0000</pubDate>
+                               <guid>https://example.com/2021/11/30/example-title/</guid>
+                               <description>Example description</description>
+                       </item>
+               </channel>
+       </rss>`
+
+       expect := map[string]interface{}{
+               "-atom": "http://www.w3.org/2005/Atom", "-version": "2.0",
+               "channel": map[string]interface{}{
+                       "copyright":   "Example",
+                       "description": "Example feed",
+                       "generator":   "Hugo -- gohugo.io",
+                       "item": map[string]interface{}{
+                               "description": "Example description",
+                               "guid":        "https://example.com/2021/11/30/example-title/",
+                               "link":        "https://example.com/2021/11/30/example-title/",
+                               "pubDate":     "Tue, 30 Nov 2021 15:00:00 +0000",
+                               "title":       "Example title"},
+                       "language":      "en-us",
+                       "lastBuildDate": "Fri, 08 Jan 2021 14:44:10 +0000",
+                       "link": []interface{}{"https://example.com/", map[string]interface{}{
+                               "-href": "https://example.com/feed.xml",
+                               "-rel":  "self",
+                               "-type": "application/rss+xml"}},
+                       "title": "Example feed",
+               }}
+
+       d := Default
+
+       m, err := d.Unmarshal([]byte(xmlDoc), XML)
+       c.Assert(err, qt.IsNil)
+       c.Assert(m, qt.DeepEquals, expect)
+
+}
 func TestUnmarshalToMap(t *testing.T) {
        c := qt.New(t)
 
@@ -38,6 +91,7 @@ func TestUnmarshalToMap(t *testing.T) {
                {"a: Easy!\nb:\n  c: 2\n  d: [3, 4]", YAML, map[string]interface{}{"a": "Easy!", "b": map[string]interface{}{"c": 2, "d": []interface{}{3, 4}}}},
                {"a:\n  true: 1\n  false: 2", YAML, map[string]interface{}{"a": map[string]interface{}{"true": 1, "false": 2}}},
                {`{ "a": "b" }`, JSON, expect},
+               {`<root><a>b</a></root>`, XML, expect},
                {`#+a: b`, ORG, expect},
                // errors
                {`a = b`, TOML, false},
@@ -72,6 +126,7 @@ func TestUnmarshalToInterface(t *testing.T) {
                {`#+DATE: <2020-06-26 Fri>`, ORG, map[string]interface{}{"date": "2020-06-26"}},
                {`a = "b"`, TOML, expect},
                {`a: "b"`, YAML, expect},
+               {`<root><a>b</a></root>`, XML, expect},
                {`a,b,c`, CSV, [][]string{{"a", "b", "c"}}},
                {"a: Easy!\nb:\n  c: 2\n  d: [3, 4]", YAML, map[string]interface{}{"a": "Easy!", "b": map[string]interface{}{"c": 2, "d": []interface{}{3, 4}}}},
                // errors
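diff --git a/parser/metadecoders/format.go b/parser/metadecoders/format.go
index bba89dbea93ed9fa8c5861eed45182737e1bc06e..d34a261bf10ee0468ab58f3f52008284cec11e32 100644
--- a/parser/metadecoders/format.go
+++ b/parser/metadecoders/format.go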
@@ -30,6 +30,7 @@ const (
        TOML Format = "toml"
        YAML Format = "yaml"
        CSV  Format = "csv"
+       XML  Format = "xml"
 )
 
 // FormatFromString turns formatStr, typically a file extension without any ".",
@@ -51,6 +52,8 @@ func FormatFromString(formatStr string) Format {
                return ORG
        case "csv":
                return CSV
+       case "xml":
+               return XML
        }
 
        return ""
@@ -68,27 +71,32 @@ func FormatFromMediaType(m media.Type) Format {
        return ""
 }
 
-// FormatFromContentString tries to detect the format (JSON, YAML or TOML)
+// FormatFromContentString tries to detect the format (JSON, YAML, TOML or XML)
 // in the given string.
 // It return an empty string if no format could be detected.
 func (d Decoder) FormatFromContentString(data string) Format {
        csvIdx := strings.IndexRune(data, d.Delimiter)
        jsonIdx := strings.Index(data, "{")
        yamlIdx := strings.Index(data, ":")
+       xmlIdx := strings.Index(data, "<")
        tomlIdx := strings.Index(data, "=")
 
-       if isLowerIndexThan(csvIdx, jsonIdx, yamlIdx, tomlIdx) {
+       if isLowerIndexThan(csvIdx, jsonIdx, yamlIdx, xmlIdx, tomlIdx) {
                return CSV
        }
 
-       if isLowerIndexThan(jsonIdx, yamlIdx, tomlIdx) {
+       if isLowerIndexThan(jsonIdx, yamlIdx, xmlIdx, tomlIdx) {
                return JSON
        }
 
-       if isLowerIndexThan(yamlIdx, tomlIdx) {
+       if isLowerIndexThan(yamlIdx, xmlIdx, tomlIdx) {
                return YAML
        }
 
+       if isLowerIndexThan(xmlIdx, tomlIdx) {
+               return XML
+       }
+
        if tomlIdx != -1 {
                return TOML
        }
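Reviewer note: the detection order in `FormatFromContentString` can be exercised in isolation with the sketch below. It mirrors the index comparisons shown in the hunk, but `isLowerIndexThan` does not appear in this diff, so the version here is an assumption about its behavior (true when the first index was found and no other found index comes before it). `detectFormat` is a hypothetical stand-in that returns plain strings in place of the `Format` type.

```go
package main

import (
	"fmt"
	"strings"
)

// isLowerIndexThan is an assumed reimplementation: it reports whether
// first was found (not -1) and no other found index precedes it.
func isLowerIndexThan(first int, others ...int) bool {
	if first == -1 {
		return false
	}
	for _, other := range others {
		if other != -1 && other < first {
			return false
		}
	}
	return true
}

// detectFormat is a hypothetical stand-in for FormatFromContentString.
// The format whose marker character appears earliest wins.
func detectFormat(data string, delimiter rune) string {
	csvIdx := strings.IndexRune(data, delimiter)
	jsonIdx := strings.Index(data, "{")
	yamlIdx := strings.Index(data, ":")
	xmlIdx := strings.Index(data, "<")
	tomlIdx := strings.Index(data, "=")

	switch {
	case isLowerIndexThan(csvIdx, jsonIdx, yamlIdx, xmlIdx, tomlIdx):
		return "csv"
	case isLowerIndexThan(jsonIdx, yamlIdx, xmlIdx, tomlIdx):
		return "json"
	case isLowerIndexThan(yamlIdx, xmlIdx, tomlIdx):
		return "yaml"
	case isLowerIndexThan(xmlIdx, tomlIdx):
		return "xml"
	case tomlIdx != -1:
		return "toml"
	}
	return ""
}

func main() {
	inputs := []string{`<foo>bar</foo>`, `foo:"bar"`, `{ "foo": "bar"`, `a,b,c`, `a = "<b>"`, `asdfasdf`}
	for _, in := range inputs {
		fmt.Printf("%q -> %q\n", in, detectFormat(in, ','))
	}
}
```

Because the check is positional, a TOML value that contains `<` after the `=` is still classified as TOML, while any input whose first marker is `<` is treated as XML.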
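diff --git a/parser/metadecoders/format_test.go b/parser/metadecoders/format_test.go
index 2f625935e07dd63740746e9b0d08af67fcf74341..0d94cf67e8767792fb1f605a879bf2a03c002b6a 100644
--- a/parser/metadecoders/format_test.go
+++ b/parser/metadecoders/format_test.go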
@@ -30,6 +30,7 @@ func TestFormatFromString(t *testing.T) {
                {"json", JSON},
                {"yaml", YAML},
                {"yml", YAML},
+               {"xml", XML},
                {"toml", TOML},
                {"config.toml", TOML},
                {"tOMl", TOML},
@@ -48,6 +49,7 @@ func TestFormatFromMediaType(t *testing.T) {
        }{
                {media.JSONType, JSON},
                {media.YAMLType, YAML},
+               {media.XMLType, XML},
                {media.TOMLType, TOML},
                {media.CalendarType, ""},
        } {
@@ -70,6 +72,7 @@ func TestFormatFromContentString(t *testing.T) {
                {`foo:"bar"`, YAML},
                {`{ "foo": "bar"`, JSON},
                {`a,b,c"`, CSV},
+               {`<foo>bar</foo>"`, XML},
                {`asdfasdf`, Format("")},
                {``, Format("")},
        } {
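diff --git a/tpl/transform/remarshal_test.go b/tpl/transform/remarshal_test.go
index 2cb4c3a2fe624c219aef1afb99e150a63660f9c1..8e94ef6bf143347ed6d1ad0ca79f64dcddd65791 100644
--- a/tpl/transform/remarshal_test.go
+++ b/tpl/transform/remarshal_test.go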
@@ -82,6 +82,25 @@ title: Test Metadata
    "title": "Test Metadata"
 }
 `
+               xmlExample := `<root>
+                 <resources>
+                       <params>
+                         <byline>picasso</byline>
+                       </params>
+                       <src>**image-4.png</src>
+                       <title>The Fourth Image!</title>
+                 </resources>
+                 <resources>
+                       <name>my-cool-image-:counter</name>
+                       <params>
+                         <byline>bep</byline>
+                       </params>
+                       <src>**.png</src>
+                       <title>TOML: The Image #:counter</title>
+                 </resources>
+                 <title>Test Metadata</title>
+               </root>
+               `
 
                variants := []struct {
                        format string
@@ -93,6 +112,7 @@ title: Test Metadata
                        {"TOML", tomlExample},
                        {"Toml", tomlExample},
                        {" TOML ", tomlExample},
+                       {"XML", xmlExample},
                }
 
                for _, v1 := range variants {
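diff --git a/tpl/transform/unmarshal_test.go b/tpl/transform/unmarshal_test.go
index 85e3610d155ec7c3d53a738f822c7563f0c6b6bd..fb0e446c338ee7c64b15fec8f35c8368be5281bb 100644
--- a/tpl/transform/unmarshal_test.go
+++ b/tpl/transform/unmarshal_test.go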
@@ -111,6 +111,9 @@ func TestUnmarshal(t *testing.T) {
                {testContentResource{key: "r1", content: `slogan = "Hugo Rocks!"`, mime: media.TOMLType}, nil, func(m map[string]interface{}) {
                        assertSlogan(m)
                }},
+               {testContentResource{key: "r1", content: `<root><slogan>Hugo Rocks!</slogan></root>"`, mime: media.XMLType}, nil, func(m map[string]interface{}) {
+                       assertSlogan(m)
+               }},
                {testContentResource{key: "r1", content: `1997,Ford,E350,"ac, abs, moon",3000.00
 1999,Chevy,"Venture ""Extended Edition""","",4900.00`, mime: media.CSVType}, nil, func(r [][]string) {
                        c.Assert(len(r), qt.Equals, 2)