Some notes to bring us one step closer to a real RFC for Atom autodiscovery, suitable for submission to a real standards body like the IETF (or wherever Atom ends up). Because, you know, it would be nice to have real specs that don’t only exist on sites with cat pictures on them.
One of the things I’ve learned recently is that the rel and type attribute values in the <link> element are case-insensitive. I have some examples that illustrate a surprising number of variations that conform to the HTML and XHTML specifications. I’m not just making this up because I like things to be complicated; this is how (X)HTML works.
The purpose of Atom autodiscovery is for clients to find the location of a web site’s Atom feed, knowing only the location of the site. For example, an end user wishes to subscribe to the Atom feed of a site. Their Atom-aware aggregator client prompts them to enter the home page of the site. The client retrieves the home page, finds the Atom autodiscovery <link> element, and then retrieves the Atom feed. If the client plans to retrieve the Atom feed more than once, the client SHOULD cache the URI of the Atom feed.
Clients could also set up local proxies that monitor the web sites the end user visits and notify the end user when an Atom feed is discovered. Atom autodiscovery support could also be built into future versions of web browsers.
If clients wish to support Atom autodiscovery, they MUST support all of the variations listed here.
Location of an autodiscovery <link> element
- MAY appear within the <head> element of an HTML or XHTML document, but MUST NOT appear within the <body> [HTML4, 12.3]
Structure of an autodiscovery <link> element
rel attribute MUST be present
- value is a space-separated list of keywords [HTML4, 6.12]
- value list MUST contain the keyword “alternate” or some case-variation of it
- value is case-insensitive [HTML4, 6.12]
type attribute MUST be present
- value MUST be “application/atom+xml” or some case-variation of it
- value is case-insensitive [HTML4, 6.7]
href attribute MUST be present
- value is URI for Atom feed [AtomFormat02]
- value is case-sensitive [HTML4, 6.4]
- value MUST be a URI
- value MAY be a relative URI [HTML4, 6.4]
- before dereferencing the URI, clients MUST resolve it to a full URI [RFC 1808 section 3] using the current document’s base URI [HTML4, 12.4]
title attribute MAY be present
- governed by [HTML4, 7.4.3]
- if present, clients MAY present the title to the end user
other attributes MAY be present, as specified in [HTML4, 12.3], but they are not used by this specification and clients MAY ignore them
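The attribute rules above can be sketched as a small Python predicate. This is only an illustration, not part of any specification: the function name `is_autodiscovery_link` and the idea of passing the attributes as a dict keyed by lowercased attribute name are assumptions made for the example.

```python
ATOM_TYPE = "application/atom+xml"


def is_autodiscovery_link(attrs):
    """Return True if a <link> element's attributes satisfy the rules above.

    `attrs` is a hypothetical dict mapping lowercased attribute names
    to their raw string values.
    """
    rel = attrs.get("rel")
    type_ = attrs.get("type")
    href = attrs.get("href")
    # rel, type and href MUST all be present.
    if rel is None or type_ is None or href is None:
        return False
    # rel is a space-separated, case-insensitive list of keywords
    # that must contain "alternate".
    if "alternate" not in rel.lower().split():
        return False
    # type is compared case-insensitively; href is left untouched
    # because URIs are case-sensitive.
    return type_.strip().lower() == ATOM_TYPE
```

Lowercasing only the rel and type values, and never the href, mirrors the case-sensitivity rules in the list.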
Multiple autodiscovery <link> elements
- An HTML document MAY contain one or more Atom autodiscovery <link> elements
- Each autodiscovery <link> element SHOULD point to a different Atom feed
- If multiple autodiscovery <link> elements are present, each element SHOULD include a descriptive title in the title attribute. Clients MAY use these titles to present a list of available Atom feeds to the end user.
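As a rough sketch of how a client might gather several feeds and their titles, the following uses Python's standard `html.parser` module. The class name `FeedLinkCollector` and the function `list_feeds` are hypothetical names chosen for this example. It also assumes the document has an explicit `<body>` start tag, which HTML does not actually require, so a real client would need something more careful.

```python
from html.parser import HTMLParser


class FeedLinkCollector(HTMLParser):
    """Collect (title, href) pairs for Atom autodiscovery links outside <body>."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <LINK REL=...>
        # and <link rel=...> arrive here looking the same.
        if tag == "body":
            self.in_body = True
            return
        if tag != "link" or self.in_body:
            return
        a = dict(attrs)
        rel = (a.get("rel") or "").lower().split()
        type_ = (a.get("type") or "").strip().lower()
        href = a.get("href")
        if "alternate" in rel and type_ == "application/atom+xml" and href is not None:
            # title is optional, so it may be None here.
            self.feeds.append((a.get("title"), href))


def list_feeds(html_text):
    """Return the autodiscovery links found in html_text, in document order."""
    parser = FeedLinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.feeds
```

A client could show the returned titles as a pick list and fall back to the href when a title is missing.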
Compatibility with different versions of HTML and XHTML
HTML 2/3.2/4.0/4.01
- use <link> (empty element, no end tag)
- element names are not case-sensitive
- attribute names are not case-sensitive
- quotes around attribute values are optional
XHTML 1.0/1.1
- use <link … /> (empty element, Appendix-C-compatible end slash with leading space [XHTML1, C.2])
- element and attribute names are case-sensitive and MUST be lowercase [XHTML1, 4.2]
- attribute values MUST be quoted [XHTML1, 4.4]
Examples
Each of the following examples assumes an autodiscovery <link> element in an HTML document located at http://www.example.com/. Each references an Atom feed at http://www.example.com/xml/atom.xml.
<link rel="alternate" type="application/atom+xml" href="http://www.example.com/xml/atom.xml">
<link rel="alternate" type="application/atom+xml" href="xml/atom.xml">
<link rel="alternate" type="application/atom+xml" href="/xml/atom.xml">
<link rel=alternate type=application/atom+xml href=http://www.example.com/xml/atom.xml>
<link rel="AlTeRnAtE" type="application/atom+xml" href="http://www.example.com/xml/atom.xml">
<link rel="alternate" type="APPLICATION/ATOM+XML" href="http://www.example.com/xml/atom.xml">
<LINK REL="alternate" TYPE="application/atom+xml" HREF="http://www.example.com/xml/atom.xml">
<link href="http://www.example.com/xml/atom.xml" type="APPLICATION/ATOM+XML" rel="alternate">
<link rel="alternate foo" type="application/atom+xml" href="http://www.example.com/xml/atom.xml">
<link rel="foo alternate" type="application/atom+xml" href="http://www.example.com/xml/atom.xml">
<link rel="foo alternate bar" type="application/atom+xml" href="http://www.example.com/xml/atom.xml">
Each of the following examples assumes an autodiscovery <link> element in an XHTML document located at http://www.example.com/. Each references an Atom feed at http://www.example.com/xml/atom.xml.
<link rel="alternate" type="application/atom+xml" href="http://www.example.com/xml/atom.xml" />
<link rel="alternate" type="application/atom+xml" href="xml/atom.xml" />
<link rel="alternate" type="application/atom+xml" href="/xml/atom.xml" />
<link rel="AlTeRnAtE" type="application/atom+xml" href="http://www.example.com/xml/atom.xml" />
<link rel="alternate" type="APPLICATION/ATOM+XML" href="http://www.example.com/xml/atom.xml" />
<link href="http://www.example.com/xml/atom.xml" type="APPLICATION/ATOM+XML" rel="alternate" />
<link rel="alternate foo" type="application/atom+xml" href="http://www.example.com/xml/atom.xml" />
<link rel="foo alternate" type="application/atom+xml" href="http://www.example.com/xml/atom.xml" />
<link rel="foo alternate bar" type="application/atom+xml" href="http://www.example.com/xml/atom.xml" />
The following example is a complete HTML document located at http://www.example.com/. It references an Atom feed at http://www.example.com/?format=atom. It uses a relative URI with a query string.
<html>
<head>
<link rel="alternate" type="application/atom+xml" href="?format=atom">
</head>
<body>
</body>
</html>

The following example is a complete HTML document located at http://www.example.com/. It references an Atom feed at http://example.org/atom.xml. It uses a relative URI that is resolved against the base URI specified in the <base> element.
<html>
<head>
<base href="http://example.org/">
<link rel="alternate" type="application/atom+xml" href="atom.xml">
</head>
<body>
</body>
</html>

The following example is a complete HTML document located at http://www.example.com/. It references multiple Atom feeds, located at http://www.example.com/xml/atom.xml, http://www.example.com/xml/comments.xml, and http://example.org/atom.xml respectively. It uses the optional title attribute to label each feed.
<html>
<head>
<link rel="alternate" type="application/atom+xml" title="Main Atom feed" href="/xml/atom.xml">
<link rel="alternate" type="application/atom+xml" title="Recent comments feed" href="/xml/comments.xml">
<link rel="alternate" type="application/atom+xml" title="Atom feed (mirror)" href="http://example.org/atom.xml">
</head>
<body>
</body>
</html>
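The URI resolution used in the complete-document examples can be sketched with Python's standard `urllib.parse.urljoin`. The helper name `resolve_feed_uri` and its parameters are assumptions made for this illustration. Note also that `urljoin` follows Python's own implementation of relative-reference resolution, not necessarily RFC 1808 to the letter. It does give the results stated in the examples.

```python
from urllib.parse import urljoin


def resolve_feed_uri(document_uri, href, base_href=None):
    """Resolve an autodiscovery href to a full URI.

    document_uri: where the (X)HTML document was retrieved from.
    href:         the raw href value of the <link> element.
    base_href:    the href of a <base> element, if the document has one.
    """
    # A <base> element, when present, replaces the document's own URI
    # as the base for resolving relative references.
    base = urljoin(document_uri, base_href) if base_href else document_uri
    return urljoin(base, href)


# Relative URI consisting only of a query string.
print(resolve_feed_uri("http://www.example.com/", "?format=atom"))
# Relative URI resolved against a <base> on another host.
print(resolve_feed_uri("http://www.example.com/", "atom.xml",
                       base_href="http://example.org/"))
# Absolute-path reference, as in the multiple-feed example.
print(resolve_feed_uri("http://www.example.com/", "/xml/comments.xml"))
```

The three calls correspond to the query-string example, the <base> example, and one of the feeds in the multiple-feed example.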


You MAY think that was too many examples, but you SHOULD not whine about it.
…Well, it’s too late now, isn’t it? :)
Great to have this from the beginning.
Comment by Jesper — Friday, December 19, 2003 @ 2:53 am
Small question - here you’ve got autodiscovery only using rel=”alternate”, where presumably what’s being referred to is an alternate representation of this_page. Might it not be useful to allow the other available values, e.g. rel=”Index” (with type=”application/atom+xml”), to take you to the Atom format site index?
Apart from this, it looks like you’ve nailed down the syntax (good man!), but what isn’t clear is how Atom tools are expected to interpret it - i.e. clients MUST support the variants, but support how? - clients MAY use titles to present a list is all that’s here I think.
Another aspect that might be worth considering while you’re around here is that this is for discovery from (X)HTML, how about other XML languages, e.g. OpenOffice, RSS. My guess is that a namespace-qualified version of what you have here for XHTML would do the trick.
Comment by Danny Ayers — Friday, December 19, 2003 @ 7:37 am
I second #2, Danny Ayers’s point. For example, you could have <link rel="alternate" type="application/attom+xml" href="http://example.org/item?itemId=1&type=atom" /> pointing to the Atom version of a weblog entry, say, while <link rel="contents" type="application/attom+xml" href="http://example.org/?type=atom" /> points to a “table of contents” or feed for the site.
Thinking about it some more, I think that you should allow multiple rel values (in a space-separated list) and change the wording from saying the rel attribute’s value ‘MUST be “alternate”’ to ‘MUST contain “alternate”’. Then we could use <link rel="alternate contents" type="application/attom+xml" href="http://example.org/?type=atom" />, which would preserve the use of “alternate” while allowing it to be supplemented with additional information that could be used by aggregators etc.
Comment by Ben Meadowcroft — Friday, December 19, 2003 @ 8:05 am
Before anyone calls me on “attom” instead of “atom”: yes, I know, it’s a mistake, get over it ;-). That’s why I like the editable comments over at http://bitworking.org
Comment by Ben Meadowcroft — Friday, December 19, 2003 @ 8:07 am
The “contains ‘alternate’” is a good point, because the HTML spec states that @rel is a space-separated list of values. I will make that correction.
I am not interested in documenting other uses of the LINK tag at this time, such as pointing to an index of feeds. That’s a good idea, and I am inclined to agree with your syntax proposal, but you should bring it up in atom-syntax. This is just a formalization of what has already been decided.
Comment by Mark — Friday, December 19, 2003 @ 9:45 am
atomautodiscovery.py is missing a couple of lines:
    def getLinks(data, baseuri):
        return p.links
should be:
    def getLinks(data, baseuri):
        p = AutodiscoveryLinkParser(baseuri)
        p.feed(data)
        return p.links
Comment by Kevin Marks — Saturday, December 20, 2003 @ 5:39 am
Kevin: fixed, thanks.
Comment by Mark — Saturday, December 20, 2003 @ 4:57 pm
Yet another reason to make Atom the native model for my next version of NewsDesk. You’ve done a great job with Atom and autodiscovery. Thanks!
Comment by David Peckham — Sunday, December 21, 2003 @ 1:02 am
Hey there, Dave! I was watching my access logs all day yesterday with “tail -f access.log | grep /tests” and saw you finally pass them around midnight. Congratulations! I think you’re the first.
Boy, that #52 is a bitch, isn’t it? ;)
Comment by Mark — Sunday, December 21, 2003 @ 6:18 am
Yep :) And I started at 2am Saturday!
Are you planning to do a test suite for other Atom features?
Comment by David Peckham — Sunday, December 21, 2003 @ 9:28 pm
David: absolutely!
We ♥ unit testing.
Comment by Mark — Sunday, December 21, 2003 @ 9:43 pm