So you want to add a web feed
You’ve decided you want a web feed for your website’s content.
Some popular CMSes have you covered with built-in feeds. For WordPress, an RSS feed is available by default at your-website.com/feed, so it’s just a matter of making the feed visible in your HTML templates and you’re good to go.
But if you have to implement one from scratch, this guide goes through the basics. To keep it short, it is opinionated and leaves little room for alternatives and nuance.
Use the Atom format
The two most popular web feed formats are RSS and Atom, both XML-based. The best format for new feeds is Atom. Here’s a minimal feed with one entry to illustrate the elements you need to use:
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Dan Burzo: Posts</title>
  <link href="https://danburzo.ro/posts.xml" rel="self" />
  <link href="https://danburzo.ro/" />
  <updated>2023-10-07T00:00:00Z</updated>
  <id>https://danburzo.ro/</id>
  <author>
    <name>Dan Burzo</name>
  </author>
  <entry>
    <id>https://danburzo.ro/add-a-web-feed/</id>
    <title>So you want to add a web feed for your website</title>
    <link href="https://danburzo.ro/add-a-web-feed/" />
    <updated>2023-10-07T00:00:00Z</updated>
    <content type="html">
      <!-- entry content goes here -->
    </content>
  </entry>
</feed>
The feed has a set of fields at the top level, and another set of fields for each <entry>. The guidelines below apply to both.
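If you are generating the feed from code rather than a template, the structure above can be sketched with Python’s standard xml.etree.ElementTree module. This is only an illustration: the build_feed function, its parameters, and the shape of the entries dictionaries are assumptions made for the example, not part of any particular site generator.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def build_feed(feed_id, title, feed_url, site_url, updated, author, entries):
    """Return an Atom document as a string.

    `entries` is a list of dicts with the (hypothetical) keys
    "url", "title", "updated" and "html".
    """
    # Serialize Atom elements without a namespace prefix.
    ET.register_namespace("", ATOM_NS)

    def add(parent, tag, text=None, **attrs):
        element = ET.SubElement(parent, f"{{{ATOM_NS}}}{tag}", attrs)
        element.text = text
        return element

    feed = ET.Element(f"{{{ATOM_NS}}}feed")
    add(feed, "title", title)
    add(feed, "link", href=feed_url, rel="self")
    add(feed, "link", href=site_url)
    add(feed, "updated", updated)
    add(feed, "id", feed_id)
    add(add(feed, "author"), "name", author)

    for item in entries:
        entry = add(feed, "entry")
        add(entry, "id", item["url"])
        add(entry, "title", item["title"])
        add(entry, "link", href=item["url"])
        add(entry, "updated", item["updated"])
        # ElementTree escapes <, > and & in text when serializing,
        # so the HTML string can be passed in as-is.
        add(entry, "content", item["html"], type="html")

    body = ET.tostring(feed, encoding="unicode")
    return '<?xml version="1.0" encoding="utf-8"?>\n' + body
```

Because the declaration says utf-8, the returned string should be written to disk with that encoding, for example with open("posts.xml", "w", encoding="utf-8").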
Good IDs
IDs are used to identify the feed and the entries. Feed readers use the entry ID to remember when you’ve read or starred a post. Use the post’s URL as the ID for the corresponding entry, and the website’s URL as the feed ID, and then try to never change them.
Date format
There’s a dizzying array of standard date/time formats, as shown in this diagram that illustrates RFC 3339 vs. ISO 8601 vs. W3C formats. Atom narrows it down significantly: all dates in the feed must be in RFC 3339 date-time format with uppercase delimiters. That means we need all these strung together:
- The date in YYYY-MM-DD format, e.g. 2023-10-07
- The uppercase T separator
- The time in HH:MM:SS format, e.g. 21:30:00
- Either the timezone offset, e.g. +03:00, or uppercase Z for UTC time.
For example: October 7th, 2023 at 21:30, UTC time is expressed as 2023-10-07T21:30:00Z.
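As a rough sketch of how the pieces listed above fit together, here is one way to produce such a timestamp with Python’s standard datetime module. The rfc3339 helper is a name made up for this example, and it assumes you already have timezone-aware datetime values for your posts.

```python
from datetime import datetime, timedelta, timezone


def rfc3339(dt):
    """Format a timezone-aware datetime as an RFC 3339 date-time string."""
    if dt.tzinfo is None or dt.utcoffset() is None:
        raise ValueError("naive datetime: attach a timezone first")
    # Second precision is enough for a feed; drop any fractional part.
    dt = dt.replace(microsecond=0)
    if dt.utcoffset() == timedelta(0):
        # Use the uppercase Z shorthand for UTC.
        return dt.strftime("%Y-%m-%dT%H:%M:%SZ")
    # isoformat() emits the uppercase T and a +HH:MM / -HH:MM offset.
    return dt.isoformat()
```

With this helper, rfc3339(datetime(2023, 10, 7, 21, 30, tzinfo=timezone.utc)) gives 2023-10-07T21:30:00Z, and the same wall-clock time with timezone(timedelta(hours=3)) gives 2023-10-07T21:30:00+03:00.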
The one date element that’s required for an entry is <updated>. You can distinguish the initial publish date from the last update date by mixing in a <published> element. On the top-level feed, <updated> is the date of the latest change to any of the entries.
HTML content
Since XML and HTML share some of their syntax, adding raw HTML to the <content> element will trip up the XML parser. To prevent it, you can escape the XML-sensitive characters <, >, and & to their named entities &lt;, &gt;, and &amp; respectively.
Or, skip the text processing and wrap the HTML content as-is in a character data (CDATA) section:
<content type="html"><![CDATA[ html content goes here ]]></content>
Careful: even ‘plain-text’ fields such as <title> can break XML parsing if an unaccounted-for ampersand or < ends up in it, so it’s a good idea to handle it similarly to <content>.
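Both approaches fit in a few lines if you assemble the feed as text. The sketch below uses escape from Python’s standard xml.sax.saxutils module; the escape_text and cdata helper names are made up for the example. One detail worth handling with CDATA: the section ends at the first ]]> it encounters, so any ]]> inside the HTML itself has to be split across two sections.

```python
from xml.sax.saxutils import escape


def escape_text(value):
    """Escape &, < and > so the value is safe as XML element text."""
    return escape(value)


def cdata(value):
    """Wrap a value in a CDATA section.

    A literal "]]>" would end the section early, so it is split
    into two adjacent CDATA sections instead.
    """
    return "<![CDATA[" + value.replace("]]>", "]]]]><![CDATA[>") + "]]>"
```

For instance, escape_text("Q&A <live>") returns Q&amp;A &lt;live&gt;, which is safe to place inside <title>.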
Absolute links
All links to resources must use absolute URLs, or the feed must contain attributes that help feed readers resolve any relative URLs they encounter.
Technically, the Atom specification allows the xml:base attribute on any feed element to define the base URL against which to resolve relative URLs within the scope of that element:
<content type="html" xml:base="https://danburzo.ro/add-a-web-feed/">
<![CDATA[
html content goes here
]]>
</content>
How well xml:base works in practice depends on the feed reader. For best compatibility, put absolute URLs everywhere, to the extent that your setup permits: the links to entries and the feed itself, and for all resources in the HTML content.
Replacing relative URLs with absolute counterparts in HTML content is not straightforward: it’s not just the hrefs and the srcs, but things like srcset that have their own little syntax going on. So it’s worth noting that feed readers are free to look at the entry’s <link>, along with the xml:base attribute, to resolve relative URLs. As long as everything higher level uses absolute URLs, you’re probably fine to ship relative URLs inside the <content>.
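If you do want to rewrite URLs yourself, the resolution step is the easy part. Here is a minimal sketch using urljoin from Python’s standard urllib.parse module. The helper names are made up for the example, and the srcset handling deliberately takes a shortcut: it assumes no URL in the list contains a comma, which real srcset values are allowed to do. Finding the attributes in the HTML in the first place is left out.

```python
from urllib.parse import urljoin


def absolutize(url, base):
    """Resolve a possibly relative URL against the entry's URL."""
    return urljoin(base, url)


def absolutize_srcset(srcset, base):
    """Resolve each URL in a srcset value, keeping its descriptor.

    Simplifying assumption: candidates are separated by commas and
    no URL contains a comma itself.
    """
    candidates = []
    for candidate in srcset.split(","):
        parts = candidate.split()
        if not parts:
            continue  # tolerate a stray or trailing comma
        parts[0] = urljoin(base, parts[0])
        candidates.append(" ".join(parts))
    return ", ".join(candidates)
```

With the entry URL https://danburzo.ro/add-a-web-feed/ as the base, absolutize("../img/a.png", base) resolves to https://danburzo.ro/img/a.png, and URLs that are already absolute come back unchanged.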
Check that the feed is valid
You can paste your Atom feed content into the W3C Feed Validator to check that everything has been generated correctly. If it runs into any errors, it offers very good guidance on fixing them.
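For a quick check before you reach for the validator, for example in a build script, something like the sketch below can catch the most common breakage. It is not a substitute for the validator: it only confirms that the document is well-formed XML and that the feed and each entry carry the <id>, <title> and <updated> elements that RFC 4287 requires. The check_feed name is made up for the example.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
REQUIRED = ("id", "title", "updated")


def check_feed(data):
    """Return a list of problems; an empty list means none were spotted.

    `data` is the feed document as bytes, e.g. read with open(path, "rb").
    """
    try:
        root = ET.fromstring(data)
    except ET.ParseError as err:
        return [f"not well-formed XML: {err}"]
    if root.tag != ATOM + "feed":
        return [f"root element is {root.tag}, expected an Atom feed"]

    scopes = [("feed", root)]
    for number, entry in enumerate(root.findall(ATOM + "entry"), start=1):
        scopes.append((f"entry {number}", entry))

    problems = []
    for label, element in scopes:
        for name in REQUIRED:
            # find() only looks at direct children of the element.
            if element.find(ATOM + name) is None:
                problems.append(f"{label}: missing <{name}>")
    return problems
```

An unescaped ampersand in a <title> shows up here as a “not well-formed XML” problem, which is usually the first thing worth ruling out.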
After you’ve published your feed, add it to your feed reader to keep an eye on it. I use NetNewsWire on Mac and iOS because it aligns with my idea of good software, but also because it’s relatively free of heuristics and hand-holding, so it tends to surface the quirks of an artisanal feed pretty quickly.
Put the feed on your server
Depending on your setup, the feed may be generated on the fly on a dedicated URL as with WordPress’s /feed/, or stored as a physical file such as posts.xml.
The correct media type for Atom feeds is application/atom+xml, and you may have your server set the appropriate Content-Type response header. Doing so provokes both useful and somewhat annoying behaviors for visitors depending on their browser, so it’s not a wholehearted recommendation.
In the case of a physical XML file such as posts.xml, I find it works fine to serve it as a regular application/xml file. Feed readers will figure it out.
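If you want to see how your browser and feed reader react to the Atom media type before touching your real server configuration, a local preview server is enough. The sketch below uses Python’s standard http.server module; FeedHandler and serve are names made up for the example, and note the blunt assumption that every .xml file in the directory is an Atom feed.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class FeedHandler(SimpleHTTPRequestHandler):
    # Copy the default extension map and override only .xml.
    # Assumption: every .xml file served from here is an Atom feed.
    extensions_map = {
        **SimpleHTTPRequestHandler.extensions_map,
        ".xml": "application/atom+xml",
    }


def serve(directory=".", port=8000):
    """Serve `directory` on localhost until interrupted."""
    handler = partial(FeedHandler, directory=directory)
    with ThreadingHTTPServer(("127.0.0.1", port), handler) as httpd:
        httpd.serve_forever()
```

Calling serve("public") and opening http://127.0.0.1:8000/posts.xml then shows what the Content-Type header does in practice; "public" stands in for wherever your generated site lives.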
Link to the feed in HTML
Now that the feed exists, all that’s left is to link to it. You can add a <link rel=alternate> element in <head>, with the appropriate media type. This enables feed readers to extract the feed URL from the web page, making it easier for visitors to subscribe.
<!doctype html>
<html>
  <head>
    <title>Dan Burzo</title>
    <link rel='alternate' type='application/atom+xml' href='https://danburzo.ro/posts.xml'>
  </head>
</html>
Although you can use more than one feed <link>, some readers will only pick the first they find and silently ignore the rest, so make sure the most important feed is first in line.
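To make the discovery step concrete, here is roughly what a feed reader might do with your page, sketched with Python’s standard html.parser module. Real readers differ in the details; the FeedLinkFinder class, the find_feeds function, and the choice of accepted media types are assumptions made for the example.

```python
from html.parser import HTMLParser

FEED_TYPES = {"application/atom+xml", "application/rss+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect the href of every <link rel="alternate"> that points to a feed."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        media_type = (attributes.get("type") or "").lower()
        href = attributes.get("href")
        if "alternate" in rels and media_type in FEED_TYPES and href:
            self.feeds.append(href)


def find_feeds(html):
    """Return feed URLs in document order."""
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds
```

A reader that only supports one feed per site would simply take the first item of find_feeds(html), which is why the order of the <link> elements matters.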
For better discoverability, even if the feed refers to the /blog section of your website, put the <link> on all pages of your website, most importantly your homepage, which is how most users will try to add your website to their feed readers.
Also include visible links to feeds in your site’s footer, labeled as such to make them findable with the browser’s search function:
<a href='https://danburzo.ro/posts.xml'>Feed (Atom)</a>
With these in place, man and machine alike can find your feeds.
Further reading
- RFC 4287: The Atom Syndication Format
- RSS Feed Best Practises by Kevin Cox
- Introduction to Atom from the W3C Feed Validator