<p><b>Dave Johnson (</b><b><aclass="externalLink"href="http://www.rollerweblogger.org/">The Roller Weblogger</a></b><b>) has written a very nice blog</b><b><aclass="externalLink"href="http://www.rollerweblogger.org/page/roller/20040808#how_rome_works">How Rome Works</a></b><b>describing (as the title says) how Rome works. With his permission we are adding it to Rome documentation.</b></p>
<p>I spent some time exploring the new <aclass="externalLink"href="http://rome.dev.java.net/">Rome</a> feed parser for Java and trying to understand how it works. Along the way, I put together the following class diagram and notes on the parsing process. I provide some pointers into the <aclass="externalLink"href="http://rome.dev.java.net/apidocs/0_4/overview-summary.html">Rome 0.4 Javadocs</a>.</p>
<p>You don't need to know this stuff to use Rome, but it you are interested in internals you might find it interesting.</p>
<divclass="section">
<h3>Notes on the Rome parsing process<aname="Notes_on_the_Rome_parsing_process"></a></h3>
<p>Rome is based around an idealized and abstract model of a Newsfeed or "Syndication Feed." Rome can parse any format of Newsfeed, including RSS variants and Atom, into this model. Rome can convert from model representation to any of the same Newfeed output formats.</p>
<p>Internally, Rome defines intermediate object models for specific Newsfeed formats, or "Wire Feed" formats, including both Atom and all RSS variants. For each format, there is a separate JDOM based parser class that parses XML into an intermediate model. Rome provides "converters" to convert between the intermediate Wire Feed models and the idealized Syndication Feed model.</p>
<p>Rome makes no attempt at <aclass="externalLink"href="http://www.xml.com/pub/a/2003/01/22/dive-into-xml.html">Pilgrim-style liberal XML parsing</a>. If a Newsfeed is not valid XML, then Rome will fail. Perhaps, as <aclass="externalLink"href="http://www.peerfear.org/rss/permalink/2003/01/23/1043368363-Smart_Parsing__Not_RSS_Parsing.shtml">Kevin Burton suggests</a>, parsing errors in Newsfeeds can and should be corrected. Kevin suggests that, when the parse fails, you can correct the problem and parse again. (BTW, I have some sample code that shows how to do this, but it only works with Xerces - Crimsom's SAXParserException does not have reliable error line and column numbers.)</p>
<p>Here is what happens during Rome Newsfeed parsing:</p><imgsrc="HowRomeWorks.png"alt=""/>
<olstyle="list-style-type: decimal">
<li>Your code calls <aclass="externalLink"href="http://rome.dev.java.net/apidocs/0_4/com/sun/syndication/io/SyndFeedInput.html">SyndFeedInput</a> to parse a Newsfeed, for example (see also <ahref="./RomeV0.4TutorialUsingRomeToReadASyndicationFeed.html">Using Rome to read a syndication feed</a>):
<divclass="source">
<pre>URL feedUrl = new URL("file:blogging-roller.rss");
<li>SyndFeedInput delegates to WireFeedInput to do the actual parsing.</li>
<li><aclass="externalLink"href="http://rome.dev.java.net/apidocs/0_4/com/sun/syndication/io/WireFeedInput.html">WireFeedInput</a> uses a PluginManager of class FeedParsers to pick the right parser to use to parse the feed and then calls that parser to parse the Newsfeed.</li>
<li>The appropriate parser parses the Newsfeed parses the feed, using <aclass="externalLink"href="http://www.jdom.org/">JDom</a>, into a <aclass="externalLink"href="http://rome.dev.java.net/apidocs/0_4/com/sun/syndication/feed/WireFeed.html">WireFeed</a>. If the Newsfeed is in an RSS format, the the WireFeed is of class <aclass="externalLink"href="http://rome.dev.java.net/apidocs/0_4/com/sun/syndication/feed/rss/Channel.html">Channel</a> and contains Items, Clouds, and other RSS things from the <aclass="externalLink"href="http://rome.dev.java.net/apidocs/0_4/com/sun/syndication/feed/rss/package-summary.html">com.rometools.rome.feed.rss</a> package. Or, on the other hand, if the Newsfeed is in Atom format, then the WireFeed is of class <aclass="externalLink"href="http://rome.dev.java.net/apidocs/0_4/com/sun/syndication/feed/atom/Feed.html">Feed</a> from the <aclass="externalLink"href="http://rome.dev.java.net/apidocs/0_4/com/sun/syndication/feed/atom/package-summary.html">com.rometools.rome.atom</a> package. In the end, WireFeedInput returns a WireFeed.</li>
<li>SyndFeedInput uses the returned WireFeedInput to create a SyndFeedImpl. Which implements SyndFeed. SyndFeed is an interface, the root of an abstraction that represents a format independent Newsfeed.</li>
<li><aclass="externalLink"href="http://rome.dev.java.net/apidocs/0_4/com/sun/syndication/feed/synd/SyndFeed.html">SyndFeedImpl</a> uses a Converter to convert between the format specific WireFeed representation and a format-independent SyndFeed.</li>
<li>SyndFeedInput returns to you a SyndFeed containing the parsed Newsfeed.</li></ol></div>
<p>Rome supports Newsfeed extension modules for all formats that also support modules: RSS 1.0, RSS 2.0, and Atom. Standard modules such as Dublic Core and Syndication are supported and you can <ahref="./RomeV0.4TutorialDefiningACustomModuleBeanParserAndGenerator.html">define your own custom modules</a> too.</p>
<p>Rome also supports <ahref="./RomeV0.4TutorialUsingRomeToCreateAndWriteASyndicationFeed.html">Newsfeed output</a> and for each Newsfeed format provides a "generator" class that can take a Syndication Feed model and produce from it Newsfeed XML.</p></div>
<divclass="section">
<h3>Learning more<aname="Learning_more"></a></h3>
<p>I've linked to a number of the Rome 0.4 Tutorials, here is the full list from the <ahref="../index.html">Rome Wiki</a>:</p>
<olstyle="list-style-type: decimal">
<li><ahref="./RomeV0.4TutorialUsingRomeToReadASyndicationFeed.html">Using Rome to read a syndication feed</a></li>
<li><ahref="./RomeV0.4TutorialUsingRomeToConvertASyndicationFeedFromOneTypeToAnother.html">Using Rome to convert a syndication feed from one type to another</a></li>
<li><ahref="./RomeV0.4TutorialUsingRomeToAggregateManySyndicationFeedsIntoASingleOne.html">Using Rome to aggregate many syndication feeds into a single one</a></li>
<li><ahref="./RomeV0.4TutorialUsingRomeToCreateAndWriteASyndicationFeed.html">Using Rome to create and write a feed</a></li>
<li><ahref="./RomeV0.4TutorialDefiningACustomModuleBeanParserAndGenerator.html">Defining a Custom Module bean, parser and generator</a></li></ol></div>
<divclass="section">
<h3>Conclusion<aname="Conclusion"></a></h3>
<p>Overall, Rome looks really good. It is obvious that a lot of thought has gone into design and a lot of work has been done on implementation (and docs). Rome is well on the way to "ending syndication feed confusion by supporting all of 'em" for us Java heads.</p></div></div>