Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

April 7, 2013

RSSOwl and Feed Validation

Filed under: RSS,XML — Patrick Durusau @ 6:17 pm

I rather hate to end the day on a practical note, ;-), but after going off Google Reader, I started using RSSOwl.

I have been adding feeds to RSSOwl but there were two that simply refused to load.

Feed Validator reported the feed was:

not well-formed (invalid token)

with a pointer to the letter “f” in the word “find.”

Helpful but not a bunch.

Captured the feed as XML and loaded it into oXygen.

A form feed character was immediately in front of the “f” in “find” but of course was not displaying.

The culprit in one case was a form feed character (0x0c) and in the other an end-of-text character (0x03).
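If you don't have an XML editor handy, a short script can point at invisible characters like these. What follows is only a sketch, assuming the feed has been saved as raw bytes in an ASCII-compatible encoding such as UTF-8. The function name `find_illegal_controls` is made up for illustration.

```python
import re

# Control bytes below 0x20 that XML 1.0 does not allow:
# everything except tab (0x09), line feed (0x0a) and carriage return (0x0d).
ILLEGAL_C0 = re.compile(rb"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def find_illegal_controls(data: bytes):
    """Return a (line, column, byte value) tuple for each disallowed byte.

    Hypothetical helper; line and column are 1-based.
    """
    hits = []
    for match in ILLEGAL_C0.finditer(data):
        offset = match.start()
        line = data.count(b"\n", 0, offset) + 1
        last_newline = data.rfind(b"\n", 0, offset)  # -1 when on the first line
        column = offset - last_newline
        hits.append((line, column, data[offset]))
    return hits


if __name__ == "__main__":
    sample = b"<a>\n<b>to \x0cfind</b>\n</a>"
    for line, column, value in find_illegal_controls(sample):
        print(f"line {line}, column {column}: 0x{value:02x}")
```

The reported position is that of the control character itself, which sits just before the visible character a validator is likely to point at.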

ASCII characters 0 through 31 are non-printing control characters called C0 controls. (Character 127, DEL, is also non-printing but is not part of the C0 set.)

Of the C0 control characters, only carriage return (0x0d), linefeed (0x0a) and horizontal tab (0x09) can appear in an XML 1.0 feed.

For loading and parsing RSS feeds into a topic map, you may want to filter for C0 controls that should not appear in the XML feed.
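A minimal sketch of such a filter, assuming the feed is already in hand as a decoded string. The name `strip_illegal_controls` and the sample feed are invented for illustration, and silently deleting the characters is only one possible policy (you might prefer to log them or replace them with a space).

```python
import re
import xml.etree.ElementTree as ET

# C0 controls other than tab (0x09), line feed (0x0a) and carriage return (0x0d).
_DISALLOWED_C0 = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def strip_illegal_controls(text: str) -> str:
    """Drop C0 control characters that XML 1.0 does not allow (hypothetical helper)."""
    return _DISALLOWED_C0.sub("", text)


if __name__ == "__main__":
    # Invented sample: a form feed hides in front of the "f" in "find".
    dirty = "<rss><channel><title>Hard to \x0cfind</title></channel></rss>"
    root = ET.fromstring(strip_illegal_controls(dirty))
    print(root.findtext("channel/title"))
```

Run the filter before handing the text to the parser; once the parser has rejected the feed there is nothing left to clean.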

PS: I suspect in both cases the control characters were introduced by copy-n-paste operations.

