Problems with RSS as it is deployed

I have some longstanding issues with RSS, for example the method for RSS autodiscovery. However, the two most important problems with RSS are:

  1. Entity encoding in the <description> element.
  2. Resolving relative URLs.

As I use a decent news aggregator, I don’t suffer from the second problem. The first problem, however, should interest us all. As Tim Bray notes, entity-encoding markup in the description element and then expecting the reader to decode it again is error-prone. This is due to the underspecified nature of the various RSS branches, and to people doing it anyway in an effort to crowbar HTML (not necessarily well-formed XHTML fragments) into the early RSS deployments.
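
To make the ambiguity concrete, here is a minimal Python sketch. The helper names (make_description, read_description) and the two sample strings are my own illustrations, not part of any RSS library. It shows that once the XML parser has undone the escaping, an aggregator receives a bare string and has nothing in the feed to say whether it was meant as plain text or as HTML.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def make_description(payload: str) -> str:
    """Escape the payload once and wrap it in a <description> element."""
    return "<description>" + escape(payload) + "</description>"


def read_description(xml_text: str) -> str:
    """Return what an aggregator sees after XML parsing."""
    return ET.fromstring(xml_text).text


# Publisher A means plain text: the reader should see the characters "<b>".
plain = make_description("Wrap the word in <b> to make it bold")

# Publisher B means HTML: the reader should see the word "bold" in bold.
html = make_description("This word is <b>bold</b>")

# Both come back as strings containing "<b>"; the feed carries no hint
# about which one should be rendered as markup.
for feed_fragment in (plain, html):
    print(read_description(feed_fragment))
```

An aggregator that renders both as HTML mangles publisher A's text, and one that shows both as plain text exposes raw tags from publisher B.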

How to do it right! To include HTML in your RSS feed, there are a few steps you need to take.

Step 1: Convert to RSS 1.0 or RSS 2.0; the earlier 0.9x versions do not support what I am proposing here.
Step 2: Include the <encoded> element from the RSS 1.0 content module namespace. Using the namespace prefix “content”, as in <content:encoded>, will work in more readers.
Step 3: Wrap your content in a CDATA section and put the result into the <content:encoded> element.
Step 4: Ensure the result is well-formed XML.
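
The four steps can be sketched in Python roughly as follows. This is an illustration under assumptions rather than the code behind my feed: the function names (cdata, build_feed), the example.org URLs and the sample body are all made up, and the feed is reduced to a single item. The namespace URI is the one defined by the RSS 1.0 content module.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Namespace of the RSS 1.0 content module (Step 2).
CONTENT_NS = "http://purl.org/rss/1.0/modules/content/"


def cdata(text: str) -> str:
    """Wrap text in a CDATA section (Step 3).

    A literal "]]>" in the text would end the section early, so it is
    split across two adjacent CDATA sections.
    """
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def build_feed(title: str, link: str, html_body: str) -> str:
    """Build a one-item RSS 2.0 feed (Step 1) carrying HTML in <content:encoded>."""
    return (
        f'<rss version="2.0" xmlns:content="{CONTENT_NS}">\n'
        "  <channel>\n"
        "    <title>Example feed</title>\n"
        "    <link>http://example.org/</link>\n"
        "    <description>Hypothetical feed for illustration</description>\n"
        "    <item>\n"
        f"      <title>{escape(title)}</title>\n"
        f"      <link>{escape(link)}</link>\n"
        f"      <content:encoded>{cdata(html_body)}</content:encoded>\n"
        "    </item>\n"
        "  </channel>\n"
        "</rss>\n"
    )


body = '<p>Fish &amp; chips, see the <a href="/menu">menu</a>. A stray ]]> is fine.</p>'
feed = build_feed("Lunch & dinner", "http://example.org/lunch", body)

# Step 4: parsing raises ParseError if the feed is not well-formed XML.
root = ET.fromstring(feed)
encoded = root.find("channel/item/{%s}encoded" % CONTENT_NS)
print(encoded.text == body)
```

Because the HTML sits inside CDATA, it is written into the feed exactly as authored; only the title and link, which are plain text, need entity escaping.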

This solution ensures that the content sits in an element recognised as holding encoded data, rather than in the much-abused description element. This is the method I use for my own feed, which you can take a look at to get some ideas.