<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!-- Generated by Apache Maven Doxia Site Renderer 1.4 at 2013-10-04 -->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>ROME - How Rome works</title>
<style type="text/css" media="all">
@import url("../css/maven-base.css");
@import url("../css/maven-theme.css");
@import url("../css/site.css");
</style>
<link rel="stylesheet" href="../css/print.css" type="text/css" media="print" />
<meta name="author" content="mkurz" />
<meta name="Date-Creation-yyyymmdd" content="20110816" />
<meta name="Date-Revision-yyyymmdd" content="20131004" />
<meta http-equiv="Content-Language" content="en" />
</head>
<body class="composite">
<div id="banner">
<a href="http://github.com/rometools/" id="bannerLeft">
<img src="../images/romelogo.png" alt="ROME" />
</a>
<div class="clear">
<hr/>
</div>
</div>
<div id="breadcrumbs">
<div class="xright">
<span id="publishDate">Last Published: 2013-10-04</span>
&nbsp;| <span id="projectVersion">Version: 2.0.0-SNAPSHOT</span>
</div>
<div class="clear">
<hr/>
</div>
</div>
<div id="leftColumn">
<div id="navcolumn">
<h5>Rome</h5>
<ul>
<li class="none">
<a href="../index.html" title="Overview">Overview</a>
</li>
<li class="expanded">
<strong>How Rome Works</strong>
<ul>
<li class="none">
<a href="../HowRomeWorks/RomeV0.4TutorialUsingRomeToReadASyndicationFeed.html" title="Read a syndication feed">Read a syndication feed</a>
</li>
<li class="none">
<a href="../HowRomeWorks/RomeV0.4TutorialUsingRomeToCreateAndWriteASyndicationFeed.html" title="Create and write a syndication feed">Create and write a syndication feed</a>
</li>
<li class="none">
<a href="../HowRomeWorks/RomeV0.4TutorialUsingRomeToConvertASyndicationFeedFromOneTypeToAnother.html" title="Convert a syndication feed">Convert a syndication feed</a>
</li>
<li class="none">
<a href="../HowRomeWorks/RomeV0.4TutorialUsingRomeToAggregateManySyndicationFeedsIntoASingleOne.html" title="Aggregate many syndication feeds">Aggregate many syndication feeds</a>
</li>
<li class="none">
<a href="../HowRomeWorks/UnderstandingTheRomeCommonClassesAndInterfaces.html" title="Common classes and interfaces">Common classes and interfaces</a>
</li>
<li class="none">
<a href="../HowRomeWorks/RomeV0.4TutorialDefiningACustomModuleBeanParserAndGenerator.html" title="Defining a Custom Module">Defining a Custom Module</a>
</li>
</ul>
</li>
<li class="none">
<a href="../RssAndAtOMUtilitiEsROMEV0.5AndAboveTutorialsAndArticles/index.html" title="Tutorials And Articles">Tutorials And Articles</a>
</li>
<li class="collapsed">
<a href="../ROMEReleases/index.html" title="Releases">Releases</a>
</li>
<li class="none">
<a href="../ROMEDevelopmentProposals/index.html" title="ROME Development Proposals">ROME Development Proposals</a>
</li>
</ul>
<h5>Project Documentation</h5>
<ul>
<li class="collapsed">
<a href="../project-info.html" title="Project Information">Project Information</a>
</li>
<li class="collapsed">
<a href="../project-reports.html" title="Project Reports">Project Reports</a>
</li>
</ul>
<a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy">
<img class="poweredBy" alt="Built by Maven" src="../images/logos/maven-feather.png" />
</a>
</div>
</div>
<div id="bodyColumn">
<div id="contentBox">
<div class="section">
<h2>How Rome works<a name="How_Rome_works"></a></h2>
<p><b>Dave Johnson (<a class="externalLink" href="http://www.rollerweblogger.org/">The Roller Weblogger</a>) has written a very nice blog post, <a class="externalLink" href="http://www.rollerweblogger.org/page/roller/20040808#how_rome_works">How Rome Works</a>, describing (as the title says) how Rome works. With his permission we are adding it to the Rome documentation.</b></p>
<p>I spent some time exploring the new <a class="externalLink" href="http://rome.dev.java.net/">Rome</a> feed parser for Java and trying to understand how it works. Along the way, I put together the following class diagram and notes on the parsing process. I provide some pointers into the <a class="externalLink" href="http://rome.dev.java.net/apidocs/0_4/overview-summary.html">Rome 0.4 Javadocs</a>.</p>
<p>You don't need to know this stuff to use Rome, but if you are interested in the internals you might find it interesting.</p>
<div class="section">
<h3>Notes on the Rome parsing process<a name="Notes_on_the_Rome_parsing_process"></a></h3>
<p>Rome is based around an idealized and abstract model of a Newsfeed or &quot;Syndication Feed.&quot; Rome can parse any format of Newsfeed, including RSS variants and Atom, into this model. Rome can also convert from the model representation to any of the same Newsfeed output formats.</p>
<p>Internally, Rome defines intermediate object models for specific Newsfeed formats, or &quot;Wire Feed&quot; formats, including both Atom and all RSS variants. For each format, there is a separate JDOM based parser class that parses XML into an intermediate model. Rome provides &quot;converters&quot; to convert between the intermediate Wire Feed models and the idealized Syndication Feed model.</p>
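<p>As a rough illustration of this two-layer design, the sketch below takes an RSS-style wire model, converts it into a format-independent model, and converts that into an Atom-style wire model. Every name in it (RssChannel, AtomFeed, SimpleSyndFeed, WireConverter and so on) is a hypothetical, heavily simplified stand-in, not Rome's actual API.</p>
<div class="source">
<pre>// Hypothetical, heavily simplified wire models; Rome's real ones carry many more fields.
class RssChannel {
    String title;
    String link;
}

class AtomFeed {
    String title;
    String alternateLink;
}

// Hypothetical format-independent model, standing in for the Syndication Feed.
class SimpleSyndFeed {
    String title;
    String link;
}

// One converter per wire format, working in both directions.
interface WireConverter {
    SimpleSyndFeed toSynd(Object wireFeed);
    Object toWire(SimpleSyndFeed feed);
}

class RssConverter implements WireConverter {
    public SimpleSyndFeed toSynd(Object wireFeed) {
        RssChannel channel = (RssChannel) wireFeed;
        SimpleSyndFeed feed = new SimpleSyndFeed();
        feed.title = channel.title;
        feed.link = channel.link;
        return feed;
    }

    public Object toWire(SimpleSyndFeed feed) {
        RssChannel channel = new RssChannel();
        channel.title = feed.title;
        channel.link = feed.link;
        return channel;
    }
}

class AtomConverter implements WireConverter {
    public SimpleSyndFeed toSynd(Object wireFeed) {
        AtomFeed atom = (AtomFeed) wireFeed;
        SimpleSyndFeed feed = new SimpleSyndFeed();
        feed.title = atom.title;
        feed.link = atom.alternateLink;
        return feed;
    }

    public Object toWire(SimpleSyndFeed feed) {
        AtomFeed atom = new AtomFeed();
        atom.title = feed.title;
        atom.alternateLink = feed.link;
        return atom;
    }
}

class ConvertDemo {
    // RSS wire model in, Atom wire model out, by way of the shared model.
    static AtomFeed rssToAtom(RssChannel channel) {
        SimpleSyndFeed feed = new RssConverter().toSynd(channel);
        return (AtomFeed) new AtomConverter().toWire(feed);
    }

    public static void main(String[] args) {
        RssChannel channel = new RssChannel();
        channel.title = "Example feed";
        channel.link = "http://example.com/";
        AtomFeed atom = rssToAtom(channel);
        System.out.println(atom.title + " at " + atom.alternateLink);
    }
}
</pre></div>
<p>The point of the shared model is that each format needs only one converter: adding a new wire format does not require writing a translation to every other format.</p>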
<p>Rome makes no attempt at <a class="externalLink" href="http://www.xml.com/pub/a/2003/01/22/dive-into-xml.html">Pilgrim-style liberal XML parsing</a>. If a Newsfeed is not valid XML, then Rome will fail. Perhaps, as <a class="externalLink" href="http://www.peerfear.org/rss/permalink/2003/01/23/1043368363-Smart_Parsing__Not_RSS_Parsing.shtml">Kevin Burton suggests</a>, parsing errors in Newsfeeds can and should be corrected. Kevin suggests that, when the parse fails, you can correct the problem and parse again. (BTW, I have some sample code that shows how to do this, but it only works with Xerces - Crimson's SAXParseException does not have reliable error line and column numbers.)</p>
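<p>The parse, correct, and parse again idea can be sketched with the XML parser bundled with the JDK. This is only a minimal illustration under stated assumptions, not the sample code mentioned above: Repairer and RetryingParser are hypothetical names, and the example repair (dropping form feed characters, which XML 1.0 does not allow) is just one possible correction.</p>
<div class="source">
<pre>import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical hook: receives the XML text and the reported error position,
// and returns a corrected copy of the text.
interface Repairer {
    String repair(String xml, int line, int column);
}

class RetryingParser {

    // Parses the text with the JDK's default DOM parser.
    static Document parseOnce(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Tries a strict parse first; on a well-formedness error, repairs and retries once.
    static Document parseWithRetry(String xml, Repairer repairer) throws Exception {
        try {
            return parseOnce(xml);
        } catch (SAXParseException e) {
            System.err.println("Parse failed at line " + e.getLineNumber()
                    + ", column " + e.getColumnNumber() + ": " + e.getMessage());
            String repaired = repairer.repair(xml, e.getLineNumber(), e.getColumnNumber());
            return parseOnce(repaired); // a second failure propagates to the caller
        }
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the path of a feed file, for example blogging-roller.rss
        String xml = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        // Example repair: drop form feed characters, which XML 1.0 does not allow.
        Repairer dropFormFeeds = (text, line, column) -> text.replace("\f", "");
        Document doc = parseWithRetry(xml, dropFormFeeds);
        System.out.println("Root element: " + doc.getDocumentElement().getNodeName());
    }
}
</pre></div>
<p>The example repair ignores the line and column it is given; a more targeted Repairer could use them to fix only the offending spot, which is why reliable position information from the parser matters.</p>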
<p>Here is what happens during Rome Newsfeed parsing:</p><img src="HowRomeWorks.png" alt="" />
<ol style="list-style-type: decimal">
<li>Your code calls <a class="externalLink" href="http://rome.dev.java.net/apidocs/0_4/com/sun/syndication/io/SyndFeedInput.html">SyndFeedInput</a> to parse a Newsfeed, for example (see also <a href="./RomeV0.4TutorialUsingRomeToReadASyndicationFeed.html">Using Rome to read a syndication feed</a>):
<div class="source">
<pre>URL feedUrl = new URL(&quot;file:blogging-roller.rss&quot;);
SyndFeedInput input = new SyndFeedInput();
SyndFeed feed = input.build(new InputStreamReader(feedUrl.openStream()));
</pre></div></li>
<li>SyndFeedInput delegates to WireFeedInput to do the actual parsing.</li>
<li><a class="externalLink" href="http://rome.dev.java.net/apidocs/0_4/com/sun/syndication/io/WireFeedInput.html">WireFeedInput</a> uses a PluginManager of class FeedParsers to pick the right parser to use to parse the feed and then calls that parser to parse the Newsfeed.</li>
<li>The appropriate parser parses the Newsfeed, using <a class="externalLink" href="http://www.jdom.org/">JDOM</a>, into a <a class="externalLink" href="http://rome.dev.java.net/apidocs/0_4/com/sun/syndication/feed/WireFeed.html">WireFeed</a>. If the Newsfeed is in an RSS format, then the WireFeed is of class <a class="externalLink" href="http://rome.dev.java.net/apidocs/0_4/com/sun/syndication/feed/rss/Channel.html">Channel</a> and contains Items, Clouds, and other RSS things from the <a class="externalLink" href="http://rome.dev.java.net/apidocs/0_4/com/sun/syndication/feed/rss/package-summary.html">com.sun.syndication.feed.rss</a> package. If, on the other hand, the Newsfeed is in Atom format, then the WireFeed is of class <a class="externalLink" href="http://rome.dev.java.net/apidocs/0_4/com/sun/syndication/feed/atom/Feed.html">Feed</a> from the <a class="externalLink" href="http://rome.dev.java.net/apidocs/0_4/com/sun/syndication/feed/atom/package-summary.html">com.sun.syndication.feed.atom</a> package. In the end, WireFeedInput returns a WireFeed.</li>
<li>SyndFeedInput uses the returned WireFeed to create a SyndFeedImpl, which implements SyndFeed. SyndFeed is an interface, the root of an abstraction that represents a format-independent Newsfeed.</li>
<li><a class="externalLink" href="http://rome.dev.java.net/apidocs/0_4/com/sun/syndication/feed/synd/SyndFeed.html">SyndFeedImpl</a> uses a Converter to convert between the format-specific WireFeed representation and a format-independent SyndFeed.</li>
<li>SyndFeedInput returns to you a SyndFeed containing the parsed Newsfeed.</li></ol></div>
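<p>The parser selection in step 3 can be pictured roughly as follows. This is a sketch that uses the JDK's DOM API rather than JDOM, and FeedParser, RssParser, AtomParser and ParserRegistry are hypothetical stand-ins, not Rome's own classes. It assumes each plug-in can decide from the root element whether a document is its type.</p>
<div class="source">
<pre>import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical stand-in for a format-specific parser plug-in.
interface FeedParser {
    String getType();
    boolean isMyType(Document doc);
}

class RssParser implements FeedParser {
    public String getType() { return "rss"; }

    public boolean isMyType(Document doc) {
        // RSS 0.91, 0.92 and 2.0 use a root element named "rss".
        // RSS 1.0 uses rdf:RDF and is not handled in this sketch.
        return "rss".equals(doc.getDocumentElement().getNodeName());
    }
}

class AtomParser implements FeedParser {
    public String getType() { return "atom"; }

    public boolean isMyType(Document doc) {
        // Atom uses a root element named "feed" (prefixed names are ignored here).
        return "feed".equals(doc.getDocumentElement().getNodeName());
    }
}

class ParserRegistry {
    private final FeedParser[] parsers = { new RssParser(), new AtomParser() };

    // Asks each registered parser in turn and returns the first one that claims the document.
    FeedParser getParserFor(Document doc) {
        for (FeedParser parser : parsers) {
            if (parser.isMyType(doc)) {
                return parser;
            }
        }
        throw new IllegalArgumentException("Unsupported feed format: "
                + doc.getDocumentElement().getNodeName());
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the path of a feed file, for example blogging-roller.rss
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new File(args[0]));
        FeedParser parser = new ParserRegistry().getParserFor(doc);
        System.out.println("Selected parser: " + parser.getType());
    }
}
</pre></div>
<p>Keeping the "is this my format?" check inside each parser means a new format can be supported by registering one more plug-in, without touching the code that drives the parse.</p>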
<div class="section">
<h3>Other Rome features<a name="Other_Rome_features"></a></h3>
<p>Rome supports Newsfeed extension modules for all formats that also support modules: RSS 1.0, RSS 2.0, and Atom. Standard modules such as Dublin Core and Syndication are supported, and you can <a href="./RomeV0.4TutorialDefiningACustomModuleBeanParserAndGenerator.html">define your own custom modules</a> too.</p>
<p>Rome also supports <a href="./RomeV0.4TutorialUsingRomeToCreateAndWriteASyndicationFeed.html">Newsfeed output</a> and for each Newsfeed format provides a &quot;generator&quot; class that can take a Syndication Feed model and produce from it Newsfeed XML.</p></div>
<div class="section">
<h3>Learning more<a name="Learning_more"></a></h3>
<p>I've linked to a number of the Rome 0.4 tutorials; here is the full list from the <a href="../index.html">Rome Wiki</a>:</p>
<ol style="list-style-type: decimal">
<li><a href="./RomeV0.4TutorialUsingRomeToReadASyndicationFeed.html">Using Rome to read a syndication feed</a></li>
<li><a href="./RomeV0.4TutorialUsingRomeToConvertASyndicationFeedFromOneTypeToAnother.html">Using Rome to convert a syndication feed from one type to another</a></li>
<li><a href="./RomeV0.4TutorialUsingRomeToAggregateManySyndicationFeedsIntoASingleOne.html">Using Rome to aggregate many syndication feeds into a single one</a></li>
<li><a href="./RomeV0.4TutorialUsingRomeToCreateAndWriteASyndicationFeed.html">Using Rome to create and write a feed</a></li>
<li><a href="./RomeV0.4TutorialDefiningACustomModuleBeanParserAndGenerator.html">Defining a Custom Module bean, parser and generator</a></li></ol></div>
<div class="section">
<h3>Conclusion<a name="Conclusion"></a></h3>
<p>Overall, Rome looks really good. It is obvious that a lot of thought has gone into design and a lot of work has been done on implementation (and docs). Rome is well on the way to &quot;ending syndication feed confusion by supporting all of 'em&quot; for us Java heads.</p></div></div>
</div>
</div>
<div class="clear">
<hr/>
</div>
<div id="footer">
<div class="xright">
Copyright &#169; 2004-2013
<a href="http://www.rometools.org">ROME Project</a>.
All Rights Reserved.
</div>
<div class="clear">
<hr/>
</div>
</div>
</body>
</html>