Subscribing to a journal page
This document describes a convention for subscribing to a journal written in HTML without a full-fledged syndication technology such as Atom or RSS. It is a lightweight alternative to the hAtom microformat. The convention is less powerful than Atom and is not well suited to use cases other than maintaining a journal. Nothing prevents authors from also publishing an Atom feed if they wish, and this convention can make generating such a feed easier.
The remainder of this document describes how to interpret a single text/html document as if it were an Atom feed with all required elements present. This demonstrates how simply an Atom feed can be generated automatically.
Feed elements
The URL from which the text/html document is fetched serves as the feed's "id" element and the recommended "link" element.
The contents of the first h1 tag in the page serve as the feed's required "title" element. Authors are encouraged to use titles which provide their own context, e.g. "m15o's journal" rather than "My journal".
A feed's required "updated" element should be set equal to the most recent value among its entries' required "updated" elements. If no entries can be extracted from the document, then the feed is empty, and the feed's "updated" element should be set equal to the time the document was fetched.
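The "updated" rule above can be sketched in a few lines of Python. This is only an illustration: the function name `feed_updated` and its parameters are hypothetical, and the sketch assumes all timestamps are timezone-aware datetimes.

```python
from datetime import datetime, timezone


def feed_updated(entry_updates, fetched_at):
    """Return the feed-level "updated" value.

    entry_updates: iterable of timezone-aware datetimes, one per entry.
    fetched_at: timezone-aware datetime at which the document was fetched.
    """
    entry_updates = list(entry_updates)
    # Empty feed: fall back to the fetch time.
    if not entry_updates:
        return fetched_at
    # Otherwise the most recent entry timestamp wins.
    return max(entry_updates)
```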
Entry elements
A feed's entry elements are derived from a subset of the journal's article tags, if any are present.
Each article tag with a child h2 whose first 10 characters correspond to a date in ISO 8601 format (i.e. YYYY-MM-DD) represents a single entry. article tags which do not meet these criteria are ignored.
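One way to test whether an h2 qualifies is to check the shape of its first 10 characters and then confirm they name a real calendar day. The helper name `entry_date` is hypothetical, and rejecting impossible dates such as 2022-13-40 is an interpretation rather than something the convention spells out.

```python
import re
from datetime import date

# Exactly four digits, a hyphen, two digits, a hyphen, two digits.
DATE_SHAPE = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def entry_date(h2_text):
    """Return the date in the first 10 characters of h2_text, or None."""
    match = DATE_SHAPE.fullmatch(h2_text[:10])
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        return date(year, month, day)
    except ValueError:
        # Right shape, but not a real calendar day.
        return None
```

An article whose h2 yields `None` here would be ignored when building the feed.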
An entry's required "title" element is equal to the value of its h2 tag.
An entry's required "id" element is equal to the concatenation of the feed's "id", a # character, and the entry's title (e.g. feed_id#title). Any space character should be converted to "-".
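As a small illustration of the "id" rule, the sketch below joins the two parts and replaces spaces. The function name `entry_id` and the URL used in the usage note are hypothetical.

```python
def entry_id(feed_id, title):
    """Build an entry "id" from the feed "id" and the entry title."""
    # Concatenate feed id, "#", and title, then turn every space into "-".
    return (feed_id + "#" + title).replace(" ", "-")
```

For a feed fetched from `https://example.org/journal.html`, an entry titled `2022-06-09 New stack page` would get the id `https://example.org/journal.html#2022-06-09-New-stack-page`.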
An entry's required "updated" element is noon UTC on the day indicated by the 10-character date stamp at the beginning of the corresponding h2 tag.
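The noon-UTC rule maps directly onto Python's standard datetime module. The function name `entry_updated` is hypothetical; the sketch assumes the date stamp has already been parsed into a `date`.

```python
from datetime import date, datetime, time, timezone


def entry_updated(day):
    """Return noon UTC on the given day as a timezone-aware datetime."""
    return datetime.combine(day, time(12, 0), tzinfo=timezone.utc)
```

Calling `entry_updated(date(2022, 6, 9)).isoformat()` gives `2022-06-09T12:00:00+00:00`, a timestamp form that Atom's date constructs accept.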
An entry's "content" element is equal to the inner HTML of the enclosing article tag, from which the h2 node has been removed. It should be of type "html".
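The entry rules above can be sketched with Python's standard `html.parser` module. This is a rough illustration under stated assumptions, not a reference implementation: the names `JournalParser` and `extract` are hypothetical, the first h2 found anywhere inside an outermost article is treated as its heading (the sketch does not verify that it is a direct child), heading text has surrounding whitespace stripped, HTML comments are dropped from the content, and only the shape of the date prefix is checked.

```python
import html
import re
from html.parser import HTMLParser

# Shape check only (YYYY-MM-DD); calendar validity is not verified here.
DATE_PREFIX = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


class JournalParser(HTMLParser):
    """Collect the first h1 and an (h2 text, content HTML) pair per article."""

    def __init__(self):
        # Leave character references unconverted so content is re-emitted as written.
        super().__init__(convert_charrefs=False)
        self.feed_title = None
        self.entries = []     # (h2_text, content_html) for each article with an h2
        self._depth = 0       # nesting depth of article tags
        self._h1 = None       # text pieces while inside the first h1
        self._h2 = None       # text pieces while inside the article's h2
        self._h2_text = None  # finished h2 text of the current article
        self._content = []    # raw HTML pieces of the current article

    def _emit(self, raw, text=""):
        """Route a piece of markup to heading text and/or article content."""
        if self._h1 is not None:
            self._h1.append(text)
        if self._h2 is not None:
            self._h2.append(text)  # inside the h2: kept out of the content
        elif self._depth:
            self._content.append(raw)

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self._depth += 1
            if self._depth == 1:  # outermost article starts a new entry
                self._h2, self._h2_text, self._content = None, None, []
                return
        if tag == "h1" and self.feed_title is None and self._h1 is None:
            self._h1 = []
        if tag == "h2" and self._depth == 1 and self._h2_text is None and self._h2 is None:
            self._h2 = []
            return
        self._emit(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "h1" and self._h1 is not None:
            self.feed_title, self._h1 = "".join(self._h1).strip(), None
        if tag == "h2" and self._h2 is not None:
            self._h2_text, self._h2 = "".join(self._h2).strip(), None
            return
        if tag == "article" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                if self._h2_text is not None:
                    self.entries.append((self._h2_text, "".join(self._content)))
                return
        self._emit(f"</{tag}>")

    def handle_startendtag(self, tag, attrs):
        self._emit(self.get_starttag_text())  # e.g. <br/>, kept verbatim

    def handle_data(self, data):
        self._emit(data, data)

    def handle_entityref(self, name):
        self._emit(f"&{name};", html.unescape(f"&{name};"))

    def handle_charref(self, name):
        self._emit(f"&#{name};", html.unescape(f"&#{name};"))


def extract(html_text):
    """Return (feed title, [(entry title, content HTML), ...]) for a journal page."""
    parser = JournalParser()
    parser.feed(html_text)
    parser.close()
    dated = [(t, c) for t, c in parser.entries if DATE_PREFIX.match(t)]
    return parser.feed_title, dated
```

The returned content strings are the article's inner HTML minus the h2, so they would be placed in an Atom "content" element of type "html".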
Example
Here's an extract from this site's journal:
<h1>m15o's Journal</h1>
<article>
  <h2>2022-06-09</h2>
  <p>Just added a page about the technical <a href="stack.html">stack</a> I'm using to build my <a href="projects.html">projects</a>.</p>
</article>
<article>
  <h2>2022-06-08</h2>
  <p>Wrote a page about <a href="small-net.html">the small net</a>.</p>
</article>
Credits
This spec was heavily inspired by gmisub.