|
- // SPDX-FileCopyrightText: Copyright The Miniflux Authors. All rights reserved.
- // SPDX-License-Identifier: Apache-2.0
- package atom // import "miniflux.app/v2/internal/reader/atom"
- import (
- "encoding/xml"
- "html"
- "strings"
- "miniflux.app/v2/internal/reader/media"
- )
- // The "atom:feed" element is the document (i.e., top-level) element of
- // an Atom Feed Document, acting as a container for metadata and data
- // associated with the feed. Its element children consist of metadata
- // elements followed by zero or more atom:entry child elements.
- //
- // Specs:
- // https://tools.ietf.org/html/rfc4287
- // https://validator.w3.org/feed/docs/atom.html
- type atom10Feed struct {
- XMLName xml.Name `xml:"http://www.w3.org/2005/Atom feed"`
- // The "atom:id" element conveys a permanent, universally unique
- // identifier for an entry or feed.
- //
- // Its content MUST be an IRI, as defined by [RFC3987]. Note that the
- // definition of "IRI" excludes relative references. Though the IRI
- // might use a dereferencable scheme, Atom Processors MUST NOT assume it
- // can be dereferenced.
- //
- // atom:feed elements MUST contain exactly one atom:id element.
- ID string `xml:"http://www.w3.org/2005/Atom id"`
- // The "atom:title" element is a Text construct that conveys a human-
- // readable title for an entry or feed.
- //
- // atom:feed elements MUST contain exactly one atom:title element.
- Title atom10Text `xml:"http://www.w3.org/2005/Atom title"`
- // The "atom:subtitle" element is a Text construct that
- // contains a human-readable description or subtitle for the feed.
- Subtitle atom10Text `xml:"http://www.w3.org/2005/Atom subtitle"`
- // The "atom:author" element is a Person construct that indicates the
- // author of the entry or feed.
- //
- // atom:feed elements MUST contain one or more atom:author elements,
- // unless all of the atom:feed element's child atom:entry elements
- // contain at least one atom:author element.
- Authors atomPersons `xml:"http://www.w3.org/2005/Atom author"`
- // The "atom:icon" element's content is an IRI reference [RFC3987] that
- // identifies an image that provides iconic visual identification for a
- // feed.
- //
- // atom:feed elements MUST NOT contain more than one atom:icon element.
- Icon string `xml:"http://www.w3.org/2005/Atom icon"`
- // The "atom:logo" element's content is an IRI reference [RFC3987] that
- // identifies an image that provides visual identification for a feed.
- //
- // atom:feed elements MUST NOT contain more than one atom:logo element.
- Logo string `xml:"http://www.w3.org/2005/Atom logo"`
- // atom:feed elements SHOULD contain one atom:link element with a rel
- // attribute value of "self". This is the preferred URI for
- // retrieving Atom Feed Documents representing this Atom feed.
- //
- // atom:feed elements MUST NOT contain more than one atom:link
- // element with a rel attribute value of "alternate" that has the
- // same combination of type and hreflang attribute values.
- Links atomLinks `xml:"http://www.w3.org/2005/Atom link"`
- // The "atom:category" element conveys information about a category
- // associated with an entry or feed. This specification assigns no
- // meaning to the content (if any) of this element.
- //
- // atom:feed elements MAY contain any number of atom:category
- // elements.
- Categories atomCategories `xml:"http://www.w3.org/2005/Atom category"`
- // Zero or more atom:entry child elements follow the feed metadata.
- Entries []atom10Entry `xml:"http://www.w3.org/2005/Atom entry"`
- }
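- // The "atom:entry" element represents an individual entry, acting as a
- // container for metadata and data associated with the entry.
- //
- // Specs: https://datatracker.ietf.org/doc/html/rfc4287#section-4.1.2
- type atom10Entry struct {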
- // The "atom:id" element conveys a permanent, universally unique
- // identifier for an entry or feed.
- //
- // Its content MUST be an IRI, as defined by [RFC3987]. Note that the
- // definition of "IRI" excludes relative references. Though the IRI
- // might use a dereferencable scheme, Atom Processors MUST NOT assume it
- // can be dereferenced.
- //
- // atom:entry elements MUST contain exactly one atom:id element.
- ID string `xml:"http://www.w3.org/2005/Atom id"`
- // The "atom:title" element is a Text construct that conveys a human-
- // readable title for an entry or feed.
- //
- // atom:entry elements MUST contain exactly one atom:title element.
- Title atom10Text `xml:"http://www.w3.org/2005/Atom title"`
- // The "atom:published" element is a Date construct indicating an
- // instant in time associated with an event early in the life cycle of
- // the entry.
- Published string `xml:"http://www.w3.org/2005/Atom published"`
- // The "atom:updated" element is a Date construct indicating the most
- // recent instant in time when an entry or feed was modified in a way
- // the publisher considers significant. Therefore, not all
- // modifications necessarily result in a changed atom:updated value.
- //
- // atom:entry elements MUST contain exactly one atom:updated element.
- Updated string `xml:"http://www.w3.org/2005/Atom updated"`
- // atom:entry elements MUST NOT contain more than one atom:link
- // element with a rel attribute value of "alternate" that has the
- // same combination of type and hreflang attribute values.
- Links atomLinks `xml:"http://www.w3.org/2005/Atom link"`
- // atom:entry elements MUST contain an atom:summary element in either
- // of the following cases:
- // * the atom:entry contains an atom:content that has a "src"
- // attribute (and is thus empty).
- // * the atom:entry contains content that is encoded in Base64;
- // i.e., the "type" attribute of atom:content is a MIME media type
- // [MIMEREG], but is not an XML media type [RFC3023], does not
- // begin with "text/", and does not end with "/xml" or "+xml".
- //
- // atom:entry elements MUST NOT contain more than one atom:summary
- // element.
- Summary atom10Text `xml:"http://www.w3.org/2005/Atom summary"`
- // atom:entry elements MUST NOT contain more than one atom:content
- // element.
- Content atom10Text `xml:"http://www.w3.org/2005/Atom content"`
- // The "atom:author" element is a Person construct that indicates the
- // author of the entry or feed.
- //
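- // atom:entry elements MUST contain one or more atom:author elements,
- // unless the atom:entry contains an atom:source element that
- // contains an atom:author element or, in an Atom Feed Document, the
- // atom:feed element contains an atom:author element itself.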
- Authors atomPersons `xml:"http://www.w3.org/2005/Atom author"`
- // The "atom:category" element conveys information about a category
- // associated with an entry or feed. This specification assigns no
- // meaning to the content (if any) of this element.
- //
- // atom:entry elements MAY contain any number of atom:category
- // elements.
- Categories atomCategories `xml:"http://www.w3.org/2005/Atom category"`
- media.MediaItemElement
- }
- // A Text construct contains human-readable text, usually in small
- // quantities. The content of Text constructs is Language-Sensitive.
- // Specs: https://datatracker.ietf.org/doc/html/rfc4287#section-3.1
- // Text: https://datatracker.ietf.org/doc/html/rfc4287#section-3.1.1.1
- // HTML: https://datatracker.ietf.org/doc/html/rfc4287#section-3.1.1.2
- // XHTML: https://datatracker.ietf.org/doc/html/rfc4287#section-3.1.1.3
- type atom10Text struct {
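- // Type is the "type" attribute: "text", "html", or "xhtml".
- Type string `xml:"type,attr"`
- // CharData is the character data found directly inside the element,
- // as decoded by the XML parser (entities resolved, CDATA unwrapped).
- CharData string `xml:",chardata"`
- // InnerXML is the raw, unparsed XML nested inside the element.
- InnerXML string `xml:",innerxml"`
- // XHTMLRootElement is the xhtml:div wrapper used when Type is "xhtml".
- XHTMLRootElement atomXHTMLRootElement `xml:"http://www.w3.org/1999/xhtml div"`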
- }
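- // body returns the XHTML content when the type is "xhtml" and the
- // character data otherwise, with surrounding whitespace trimmed.
- func (a *atom10Text) body() string {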
- var content string
- if strings.EqualFold(a.Type, "xhtml") {
- content = a.xhtmlContent()
- } else {
- content = a.CharData
- }
- return strings.TrimSpace(content)
- }
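- // title behaves like body, except that character data coming from a
- // CDATA section is also HTML-unescaped, because the XML parser does not
- // decode entities inside CDATA.
- func (a *atom10Text) title() string {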
- var content string
- switch {
- case strings.EqualFold(a.Type, "xhtml"):
- content = a.xhtmlContent()
- case strings.Contains(a.InnerXML, "<![CDATA["):
- content = html.UnescapeString(a.CharData)
- default:
- content = a.CharData
- }
- return strings.TrimSpace(content)
- }
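- // xhtmlContent returns the inner XML of the xhtml:div wrapper when one
- // was decoded, and falls back to the raw inner XML of the construct.
- func (a *atom10Text) xhtmlContent() string {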
- if a.XHTMLRootElement.XMLName.Local == "div" {
- return a.XHTMLRootElement.InnerXML
- }
- return a.InnerXML
- }
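- // atomXHTMLRootElement captures the single xhtml:div that wraps XHTML
- // content in a Text construct. The div itself is not part of the content.
- // Specs: https://datatracker.ietf.org/doc/html/rfc4287#section-3.1.1.3
- type atomXHTMLRootElement struct {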
- XMLName xml.Name `xml:"div"`
- InnerXML string `xml:",innerxml"`
- }
|
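
The three branches of `title()` are easier to follow with concrete input. The following is a minimal standalone sketch, not part of the package: `sketchText`, `sketchDiv`, `sketchEntry`, and `decodeTitle` are hypothetical names that mirror `atom10Text` and `atomXHTMLRootElement` so the example can run on its own with only the standard library.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"html"
	"strings"
)

// sketchDiv mirrors atomXHTMLRootElement (hypothetical name).
type sketchDiv struct {
	XMLName  xml.Name `xml:"div"`
	InnerXML string   `xml:",innerxml"`
}

// sketchText mirrors atom10Text (hypothetical name).
type sketchText struct {
	Type     string    `xml:"type,attr"`
	CharData string    `xml:",chardata"`
	InnerXML string    `xml:",innerxml"`
	Div      sketchDiv `xml:"http://www.w3.org/1999/xhtml div"`
}

// title follows the same three branches as atom10Text.title().
func (t *sketchText) title() string {
	var content string
	switch {
	case strings.EqualFold(t.Type, "xhtml"):
		// Prefer the content of the xhtml:div wrapper when one was decoded.
		if t.Div.XMLName.Local == "div" {
			content = t.Div.InnerXML
		} else {
			content = t.InnerXML
		}
	case strings.Contains(t.InnerXML, "<![CDATA["):
		// Entities inside CDATA reach CharData undecoded, so unescape them here.
		content = html.UnescapeString(t.CharData)
	default:
		content = t.CharData
	}
	return strings.TrimSpace(content)
}

// sketchEntry is a cut-down entry holding only a title (hypothetical name).
type sketchEntry struct {
	XMLName xml.Name   `xml:"http://www.w3.org/2005/Atom entry"`
	Title   sketchText `xml:"http://www.w3.org/2005/Atom title"`
}

// decodeTitle parses a single atom:entry document and returns its title.
func decodeTitle(doc string) (string, error) {
	var e sketchEntry
	if err := xml.Unmarshal([]byte(doc), &e); err != nil {
		return "", err
	}
	return e.Title.title(), nil
}

func main() {
	docs := []string{
		// Plain text: the XML parser already decodes &amp;.
		`<entry xmlns="http://www.w3.org/2005/Atom"><title>Hello &amp; World</title></entry>`,
		// CDATA: the entity survives parsing and is unescaped by title().
		`<entry xmlns="http://www.w3.org/2005/Atom"><title type="html"><![CDATA[Tom &amp; Jerry]]></title></entry>`,
		// XHTML: the markup inside the wrapper div is returned as-is.
		`<entry xmlns="http://www.w3.org/2005/Atom"><title type="xhtml"><div xmlns="http://www.w3.org/1999/xhtml">Some <b>bold</b> text</div></title></entry>`,
	}
	for _, doc := range docs {
		got, err := decodeTitle(doc)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%q\n", got)
	}
}
```

Keeping both `,chardata` and `,innerxml` on the same struct is what makes the CDATA check possible: the decoded character data alone no longer shows whether the text came from a CDATA section, while the raw inner XML does.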