Add XML as a support data format #4470

Closed
bzerangue opened this issue Mar 4, 2018 · 15 comments · Fixed by #9044

Comments

@bzerangue

bzerangue commented Mar 4, 2018

It would be nice to provide XML as an available data type to load.

Currently, Hugo accepts TOML, YAML, JSON, and CSV. Is there any reason why XML is not among the available formats?

It would be great to add XML as an available data type for the /data folder.

@bep bep added the Enhancement label Mar 4, 2018
@rdwatters
Contributor

@bzerangue This would be fantastic. Not sure if something like https://github.com/beevik/etree would help said efforts...

@stale

stale bot commented Jul 3, 2018

This issue has been automatically marked as stale because it has not had recent activity. The resources of the Hugo team are limited, and so we are asking for your help.
If this is a bug and you can still reproduce this error on the master branch, please reply with all of the information you have about it in order to keep the issue open.
If this is a feature request, and you feel that it is still relevant and valuable, please tell us why.
This issue will automatically be closed in the near future if no further activity occurs. Thank you for all your contributions.

@stale stale bot added the Stale label Jul 3, 2018
@stale stale bot closed this as completed Aug 2, 2018
@bep bep reopened this Aug 2, 2018
@stale stale bot removed the Stale label Aug 2, 2018
@bep bep added the Keep label Aug 2, 2018
@fbaube

fbaube commented Aug 5, 2018

+1 on this. Type "XML" could then act as a flag to invoke DOCTYPE parsing and optional validation.

@kaushalmodi
Contributor

kaushalmodi commented Nov 6, 2018

Having a getXML similar to getJSON would be useful in creating "planets" from feeds. Ref: https://discourse.gohugo.io/t/anyone-here-interested-in-a-hugo-planet/15092.

> It would be great to add XML as an available data type for the /data folder.

Not just the data/ folder: my proposal is to also allow remote XML fetching. Once RSS/Atom feeds can be fetched, we can maintain theme components that parse the fetched XML feeds.
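As a rough illustration of the fetch-then-parse idea, here is a minimal Go sketch using only the standard library. It assumes an RSS 2.0 feed; the names `feedItem`, `rssFeed`, `parseFeed`, and `fetchFeed` are hypothetical and are not part of Hugo or of any proposal in this thread.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"net/http"
)

// feedItem holds the RSS 2.0 item fields used in this sketch (hypothetical type).
type feedItem struct {
	Title       string `xml:"title"`
	Link        string `xml:"link"`
	Description string `xml:"description"`
}

// rssFeed matches a document whose root element is <rss> (hypothetical type).
type rssFeed struct {
	XMLName xml.Name   `xml:"rss"`
	Items   []feedItem `xml:"channel>item"`
}

// parseFeed decodes RSS 2.0 bytes into items. It returns an error when the
// XML is malformed or the root element is not <rss>.
func parseFeed(data []byte) ([]feedItem, error) {
	var feed rssFeed
	if err := xml.Unmarshal(data, &feed); err != nil {
		return nil, err
	}
	return feed.Items, nil
}

// fetchFeed downloads a feed over HTTP and hands the body to parseFeed.
func fetchFeed(url string) ([]feedItem, error) {
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("fetching %s: unexpected status %s", url, resp.Status)
	}
	data, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return parseFeed(data)
}

func main() {
	// An inline sample keeps the demo offline; fetchFeed would be used for a real URL.
	sample := []byte(`<rss version="2.0"><channel><title>Demo</title>
<item><title>First post</title><link>https://example.com/1</link></item>
<item><title>Second post</title><link>https://example.com/2</link></item>
</channel></rss>`)
	items, err := parseFeed(sample)
	if err != nil {
		panic(err)
	}
	for _, it := range items {
		fmt.Println(it.Title, it.Link)
	}
}
```

Typed structs like these only work when the feed schema is known in advance, which is why a generic data format for arbitrary XML is the harder problem.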

@bep bep changed the title from "Add XML as a Data File Type" to "Add XML as a support data format" Dec 22, 2018
@bep bep modified the milestones: v0.53, v0.54 Dec 22, 2018
@bep
Member

bep commented Dec 22, 2018

This looks promising: https://github.com/clbanning/mxj

But note that mapping arbitrary XML into a map isn't trivial.
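To illustrate where the difficulty comes from, here is a minimal Go sketch that decodes arbitrary XML into nested maps with the standard `encoding/xml` tokenizer. The function names and the key conventions (`-name` for attributes, `#text` for text next to attributes or children) are assumptions chosen for this example; this is not how Hugo or mxj implements it.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// xmlToMap decodes a document into a map keyed by the root element name.
// Hypothetical helper for illustration only.
func xmlToMap(doc string) (map[string]any, error) {
	dec := xml.NewDecoder(strings.NewReader(doc))
	for {
		tok, err := dec.Token()
		if err != nil {
			return nil, err // io.EOF here means no root element was found
		}
		if start, ok := tok.(xml.StartElement); ok {
			val, err := decodeElement(dec, start)
			if err != nil {
				return nil, err
			}
			return map[string]any{start.Name.Local: val}, nil
		}
	}
}

// decodeElement reads tokens up to the matching end tag. It returns a plain
// string for text-only elements and a map[string]any otherwise.
func decodeElement(dec *xml.Decoder, start xml.StartElement) (any, error) {
	node := map[string]any{}
	for _, a := range start.Attr {
		node["-"+a.Name.Local] = a.Value // assumed convention for attributes
	}
	var text strings.Builder
	for {
		tok, err := dec.Token()
		if err != nil {
			return nil, err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			child, err := decodeElement(dec, t)
			if err != nil {
				return nil, err
			}
			name := t.Name.Local
			switch prev := node[name].(type) {
			case nil: // first occurrence: store the value directly
				node[name] = child
			case []any: // third or later occurrence: extend the list
				node[name] = append(prev, child)
			default: // second occurrence: promote the single value to a list
				node[name] = []any{prev, child}
			}
		case xml.CharData:
			text.Write(t)
		case xml.EndElement:
			s := strings.TrimSpace(text.String())
			if len(node) == 0 {
				return s, nil
			}
			if s != "" {
				node["#text"] = s // assumed convention for mixed content
			}
			return node, nil
		}
	}
}

func main() {
	doc := `<rss version="2.0"><channel><title>Demo</title>
<item><title>A</title></item><item><title>B</title></item></channel></rss>`
	m, err := xmlToMap(doc)
	if err != nil {
		panic(err)
	}
	fmt.Println(m)
}
```

Even this small sketch shows the ambiguities: a feed with one `item` yields a map while a feed with two yields a list, text interleaved with child elements loses its ordering, and namespaces are dropped. Each of these needs a policy decision before templates can rely on the result.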

@bep bep modified the milestones: v0.54, v0.55, v0.56 Jan 26, 2019
@stephlocke

This'll be great for surfacing RSS from other sources like Medium

@raybellis

+1 from me, too - I'd like to be able to include references from the RFC index, but it's only published in XML format.

@fbaube

fbaube commented Apr 29, 2019

I'm working on a Go library to parse mixed content. If anyone can propose an API (i.e. interfaces) that would be useful for Hugo, I could give it a whack.

@stephlocke

On the RSS-specific side, I currently use a service that converts RSS to JSON for consumption:

{{ $rssJ := getJSON "https://api.rss2json.com/v1/api.json?rss_url=https%3A%2F%2Fmedium.com%2Ffeed%2Fnightingale-hq" }}


{{ range $rssJ.items }}
    {{ $post := . }}
    <div class="col-12 col-md-6 col-lg-4 mb-2 mr-2 pb-8 blog" style="background-image: url({{ $post.thumbnail }});">
    <a href="{{ $post.link }}" target="_blank" rel="noopener" title="{{ $post.title }}">
    <h2>{{ $post.title }}</h2>
    </a>
    <h3>{{ $post.author }}</h3>
    </div>
{{ end }}

@bep bep modified the milestones: v0.56, v0.57 Jun 14, 2019
@bep bep modified the milestones: v0.57, v0.58 Jul 31, 2019
@bep bep modified the milestones: v0.58, v0.59 Aug 15, 2019
@bep bep removed this from the v0.59 milestone Sep 6, 2019
@bep bep modified the milestones: v0.76, v0.77 Oct 6, 2020
@bep bep modified the milestones: v0.77, v0.78 Oct 30, 2020
@bep bep modified the milestones: v0.78, v0.83 Apr 23, 2021
@bep bep modified the milestones: v0.83, v0.84 May 1, 2021
@bep bep modified the milestones: v0.84, v0.85 Jun 18, 2021
@bep bep modified the milestones: v0.85, v0.86 Jul 5, 2021
@bep bep modified the milestones: v0.86, v0.87, v0.88 Jul 26, 2021
@bep bep modified the milestones: v0.88, v0.89 Sep 2, 2021
@vanbroup
Contributor

https://github.com/antchfx/xmlquery seems to be a good fit to add XML data source support.

xmlquery is an XPath query package for XML documents. It lets you extract data from XML documents, or evaluate expressions against them, using XPath.

xmlquery has a built-in query object cache that stores recently used XPath query strings. Enabling caching avoids recompiling the XPath expression for each query.

I created a simple PR with a working implementation: #9031

@vanbroup
Contributor

#9044 is another attempt to get XML data support.

While this implementation provides less flexibility than my previous implementation using xmlquery (#9031), it doesn't come with its own API and is much easier to use (it works the same way as getJSON):

Sample usage:

{{ with getXML "https://www.w3schools.com/xml/note.xml" }}
{{ .note.body }}
{{ end }}

A common usage:

{{ with getXML "https://example.com/feed.rss" }}
{{ range .rss.channel.item }}
    <strong>{{ .title | plainify | htmlUnescape }}</strong><br />
    <p>{{ .description | plainify | htmlUnescape }}</p>
    {{ $link := .link | plainify | htmlUnescape }}
    <a href="{{ $link }}">{{ $link }}</a><br />
    <hr>
{{ end }}
{{ end }}

@bep bep modified the milestones: v0.89, v0.90 Nov 2, 2021
@swamidass

It would also be good for reading RSS feeds.

@bep bep closed this as completed in #9044 Dec 2, 2021
bep pushed a commit that referenced this issue Dec 2, 2021
Example:

```
{{ with resources.Get "https://example.com/rss.xml" | transform.Unmarshal }}
    {{ range .channel.item }}
        <strong>{{ .title | plainify | htmlUnescape }}</strong><br />
        <p>{{ .description | plainify | htmlUnescape }}</p>
        {{ $link := .link | plainify | htmlUnescape }}
        <a href="{{ $link }}">{{ $link }}</a><br />
        <hr>
    {{ end }}
{{ end }}
```

Closes #4470
@github-actions

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 14, 2022