<!DOCTYPE html>
<!--
| Generated by Apache Maven Doxia at 2016-04-24
| Rendered using Apache Maven Fluido Skin 1.3.0
-->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta name="author" content="mkurz" />
<meta name="Date-Creation-yyyymmdd" content="20110816" />
<meta name="Date-Revision-yyyymmdd" content="20160424" />
<meta http-equiv="Content-Language" content="en" />
<title>ROME - How Rome works</title>
<link rel="stylesheet" href="../css/apache-maven-fluido-1.3.0.min.css" />
<link rel="stylesheet" href="../css/site.css" />
<link rel="stylesheet" href="../css/print.css" media="print" />
<script type="text/javascript" src="../js/apache-maven-fluido-1.3.0.min.js"></script>
</head>
<body class="topBarDisabled">
<a href="http://github.com/rometools/rome">
<img style="position: absolute; top: 0; right: 0; border: 0; z-index: 10000;"
src="https://s3.amazonaws.com/github/ribbons/forkme_right_darkblue_121621.png"
alt="Fork me on GitHub">
</a>
<div class="container-fluid">
<div id="banner">
<div class="pull-left">
<a href="../index.html" id="bannerLeft">
<img src="../images/romelogo.png" alt="ROME"/>
</a>
</div>
<div class="pull-right"> </div>
<div class="clear"><hr/></div>
</div>
<div id="breadcrumbs">
<ul class="breadcrumb">
<li id="publishDate" class="pull-right">Last Published: 2016-04-24</li> <li class="divider pull-right">|</li>
<li id="projectVersion" class="pull-right">Version: 1.7.0-SNAPSHOT</li>
</ul>
</div>
<div class="row-fluid">
<div id="leftColumn" class="span3">
<div class="well sidebar-nav">
<ul class="nav nav-list">
<li class="nav-header">Rome</li>
<li>
<a href="../index.html" title="Overview">
<i class="none"></i>
Overview</a>
</li>
<li class="active">
<a href="#"><i class="icon-chevron-down"></i>How Rome Works</a>
<ul class="nav nav-list">
<li>
<a href="../HowRomeWorks/RomeV0.4TutorialUsingRomeToReadASyndicationFeed.html" title="Read a syndication feed">
<i class="none"></i>
Read a syndication feed</a>
</li>
<li>
<a href="../HowRomeWorks/RomeV0.4TutorialUsingRomeToCreateAndWriteASyndicationFeed.html" title="Create and write a syndication feed">
<i class="none"></i>
Create and write a syndication feed</a>
</li>
<li>
<a href="../HowRomeWorks/RomeV0.4TutorialUsingRomeToConvertASyndicationFeedFromOneTypeToAnother.html" title="Convert a syndication feed">
<i class="none"></i>
Convert a syndication feed</a>
</li>
<li>
<a href="../HowRomeWorks/RomeV0.4TutorialUsingRomeToAggregateManySyndicationFeedsIntoASingleOne.html" title="Aggregate many syndication feeds">
<i class="none"></i>
Aggregate many syndication feeds</a>
</li>
<li>
<a href="../HowRomeWorks/UnderstandingTheRomeCommonClassesAndInterfaces.html" title="Common classes and interfaces">
<i class="none"></i>
Common classes and interfaces</a>
</li>
<li>
<a href="../HowRomeWorks/RomeV0.4TutorialDefiningACustomModuleBeanParserAndGenerator.html" title="Defining a Custom Module">
<i class="none"></i>
Defining a Custom Module</a>
</li>
</ul>
</li>
<li>
<a href="../RssAndAtOMUtilitiEsROMEV0.5AndAboveTutorialsAndArticles/index.html" title="Tutorials And Articles">
<i class="none"></i>
Tutorials And Articles</a>
</li>
<li>
<a href="../ROMEReleases/index.html" title="Releases">
<i class="icon-chevron-right"></i>
Releases</a>
</li>
<li>
<a href="../ROMEDevelopmentProposals/index.html" title="ROME Development Proposals">
<i class="none"></i>
ROME Development Proposals</a>
</li>
<li>
<a href="../Modules/index.html" title="Modules">
<i class="icon-chevron-right"></i>
Modules</a>
</li>
<li>
<a href="../Fetcher/index.html" title="Fetcher">
<i class="icon-chevron-right"></i>
Fetcher</a>
</li>
<li>
<a href="../Opml/index.html" title="OPML">
<i class="none"></i>
OPML</a>
</li>
<li>
<a href="../Propono/index.html" title="Propono">
<i class="none"></i>
Propono</a>
</li>
<li>
<a href="../Certiorem/index.html" title="Certiorem">
<i class="icon-chevron-right"></i>
Certiorem</a>
</li>
<li class="nav-header">Project Documentation</li>
<li>
<a href="../project-info.html" title="Project Information">
<i class="icon-chevron-right"></i>
Project Information</a>
</li>
</ul>
<hr class="divider" />
<div id="poweredBy">
<div class="clear"></div>
<div class="clear"></div>
<div class="clear"></div>
<a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy">
<img class="builtBy" alt="Built by Maven" src="../images/logos/maven-feather.png" />
</a>
</div>
</div>
</div>
<div id="bodyColumn" class="span9" >
<div class="section">
<h2>How Rome works<a name="How_Rome_works"></a></h2>
<p><b>Dave Johnson (</b><b><a class="externalLink" href="http://www.rollerweblogger.org/">The Roller Weblogger</a></b><b>) has written a very nice blog</b> <b><a class="externalLink" href="http://www.rollerweblogger.org/page/roller/20040808#how_rome_works">How Rome Works</a></b> <b>describing (as the title says) how Rome works. With his permission we are adding it to Rome documentation.</b></p>
<p>I spent some time exploring the new <a class="externalLink" href="http://rome.dev.java.net/">Rome</a> feed parser for Java and trying to understand how it works. Along the way, I put together the following class diagram and notes on the parsing process. I provide some pointers into the <a class="externalLink" href="http://rome.dev.java.net/apidocs/0_4/overview-summary.html">Rome 0.4 Javadocs</a>.</p>
<p>You don't need to know this stuff to use Rome, but if you are curious about the internals you might find it interesting.</p>
<div class="section">
<h3>Notes on the Rome parsing process<a name="Notes_on_the_Rome_parsing_process"></a></h3>
<p>Rome is based around an idealized and abstract model of a Newsfeed or &quot;Syndication Feed.&quot; Rome can parse any format of Newsfeed, including RSS variants and Atom, into this model. Rome can also convert from this model representation to any of the same Newsfeed output formats.</p>
<p>Internally, Rome defines intermediate object models for specific Newsfeed formats, or &quot;Wire Feed&quot; formats, including both Atom and all RSS variants. For each format, there is a separate JDOM-based parser class that parses XML into an intermediate model. Rome provides &quot;converters&quot; to convert between the intermediate Wire Feed models and the idealized Syndication Feed model.</p>
<p>Rome makes no attempt at <a class="externalLink" href="http://www.xml.com/pub/a/2003/01/22/dive-into-xml.html">Pilgrim-style liberal XML parsing</a>. If a Newsfeed is not valid XML, then Rome will fail. Perhaps, as <a class="externalLink" href="http://www.peerfear.org/rss/permalink/2003/01/23/1043368363-Smart_Parsing__Not_RSS_Parsing.shtml">Kevin Burton suggests</a>, parsing errors in Newsfeeds can and should be corrected. Kevin suggests that, when the parse fails, you can correct the problem and parse again. (BTW, I have some sample code that shows how to do this, but it only works with Xerces, because Crimson's SAXParseException does not have reliable error line and column numbers.)</p>
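<p>As a rough illustration of the first half of that retry idea, the sketch below reports where a strict parse failed. It is not the sample code mentioned above and it does not use Rome: it assumes only the XML parser bundled with the JDK, and the class name ParseErrorLocator and method name locate are made up for this example. How reliable the reported position is depends on the parser in use.</p>

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical helper, not part of Rome: reports where a strict XML parse failed.
public class ParseErrorLocator {

    // Returns "line:column" of the first fatal error, or null if the XML is well-formed.
    public static String locate(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            // A caller could try to repair the input near this position and parse again.
            return e.getLineNumber() + ":" + e.getColumnNumber();
        } catch (Exception e) {
            // Parser configuration or I/O problems are not feed errors.
            throw new IllegalStateException("Unexpected parser failure", e);
        }
    }
}
```

<p>A caller would only attempt a repair when locate returns a position, and would give up after a fixed number of attempts so that an unrepairable feed cannot loop forever.</p>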
<p>Here is what happens during Rome Newsfeed parsing:</p><img src="HowRomeWorks.png" alt="" />
<ol style="list-style-type: decimal">
<li>Your code calls <a class="externalLink" href="http://rome.dev.java.net/apidocs/0_4/com/sun/syndication/io/SyndFeedInput.html">SyndFeedInput</a> to parse a Newsfeed, for example (see also <a href="./RomeV0.4TutorialUsingRomeToReadASyndicationFeed.html">Using Rome to read a syndication feed</a>):
<div class="source">
<pre>URL feedUrl = new URL(&quot;file:blogging-roller.rss&quot;);
SyndFeedInput input = new SyndFeedInput();
SyndFeed feed = input.build(new InputStreamReader(feedUrl.openStream()));
</pre></div></li>
<li>SyndFeedInput delegates to WireFeedInput to do the actual parsing.</li>
<li><a class="externalLink" href="http://rome.dev.java.net/apidocs/0_4/com/sun/syndication/io/WireFeedInput.html">WireFeedInput</a> uses a PluginManager of class FeedParsers to pick the right parser for the feed and then calls that parser to parse the Newsfeed.</li>
<li>The appropriate parser parses the Newsfeed, using <a class="externalLink" href="http://www.jdom.org/">JDOM</a>, into a <a class="externalLink" href="http://rome.dev.java.net/apidocs/0_4/com/sun/syndication/feed/WireFeed.html">WireFeed</a>. If the Newsfeed is in an RSS format, then the WireFeed is of class <a class="externalLink" href="http://rome.dev.java.net/apidocs/0_4/com/sun/syndication/feed/rss/Channel.html">Channel</a> and contains Items, Clouds, and other RSS things from the <a class="externalLink" href="http://rome.dev.java.net/apidocs/0_4/com/sun/syndication/feed/rss/package-summary.html">com.rometools.rome.feed.rss</a> package. If, on the other hand, the Newsfeed is in Atom format, then the WireFeed is of class <a class="externalLink" href="http://rome.dev.java.net/apidocs/0_4/com/sun/syndication/feed/atom/Feed.html">Feed</a> from the <a class="externalLink" href="http://rome.dev.java.net/apidocs/0_4/com/sun/syndication/feed/atom/package-summary.html">com.rometools.rome.feed.atom</a> package. In the end, WireFeedInput returns a WireFeed.</li>
<li>SyndFeedInput uses the returned WireFeed to create a SyndFeedImpl, which implements SyndFeed. SyndFeed is an interface, the root of an abstraction that represents a format-independent Newsfeed.</li>
<li><a class="externalLink" href="http://rome.dev.java.net/apidocs/0_4/com/sun/syndication/feed/synd/SyndFeed.html">SyndFeedImpl</a> uses a Converter to convert between the format-specific WireFeed representation and a format-independent SyndFeed.</li>
<li>SyndFeedInput returns to you a SyndFeed containing the parsed Newsfeed.</li></ol></div>
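<p>The flow in the steps above can be sketched in plain Java. This is a simplified model, not Rome's implementation: it assumes the JDK's DOM parser in place of JDOM, recognizes only a root element named rss or feed, and carries nothing but the feed title. All class and method names in it (ParsePipelineSketch, WireModel, SyndModel, Parser, parseWire, build) are hypothetical stand-ins for the real classes.</p>

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Simplified stand-ins for the classes in the steps above; every name is hypothetical.
public class ParsePipelineSketch {

    // Format-specific model, playing the role of a WireFeed (Channel or Feed).
    public static class WireModel {
        public final String format;
        public final String title;
        WireModel(String format, String title) {
            this.format = format;
            this.title = title;
        }
    }

    // Format-independent model, playing the role of a SyndFeed.
    public static class SyndModel {
        public final String title;
        SyndModel(String title) {
            this.title = title;
        }
    }

    // One parser per format; each reports whether it recognizes the document.
    interface Parser {
        boolean isMyType(Element root);
        WireModel parse(Element root);
    }

    static class RssParser implements Parser {
        public boolean isMyType(Element root) {
            return "rss".equals(root.getNodeName());
        }
        public WireModel parse(Element root) {
            return new WireModel("rss", firstText(root, "title"));
        }
    }

    static class AtomParser implements Parser {
        public boolean isMyType(Element root) {
            return "feed".equals(root.getNodeName());
        }
        public WireModel parse(Element root) {
            return new WireModel("atom", firstText(root, "title"));
        }
    }

    private static final Parser[] PARSERS = { new RssParser(), new AtomParser() };

    // Text of the first descendant element with the given name, or null if there is none.
    static String firstText(Element parent, String name) {
        Node node = parent.getElementsByTagName(name).item(0);
        return node == null ? null : node.getTextContent();
    }

    // Steps 2 to 4: parse the XML strictly, then hand it to the first matching parser.
    public static WireModel parseWire(String xml) {
        Element root;
        try {
            root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            // No attempt is made to repair invalid XML.
            throw new IllegalArgumentException("Feed is not valid XML", e);
        }
        for (Parser parser : PARSERS) {
            if (parser.isMyType(root)) {
                return parser.parse(root);
            }
        }
        throw new IllegalArgumentException("Unsupported feed type: " + root.getNodeName());
    }

    // Steps 1 and 5 to 7: convert the wire model into the common model and return it.
    public static SyndModel build(String xml) {
        WireModel wire = parseWire(xml);
        return new SyndModel(wire.title);
    }
}
```

<p>The point of the two-stage design is that adding a new format means adding one more parser (and its conversion), while code that consumes the common model does not change.</p>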
<div class="section">
<h3>Other Rome features<a name="Other_Rome_features"></a></h3>
<p>Rome supports Newsfeed extension modules for all formats that also support modules: RSS 1.0, RSS 2.0, and Atom. Standard modules such as Dublin Core and Syndication are supported and you can <a href="./RomeV0.4TutorialDefiningACustomModuleBeanParserAndGenerator.html">define your own custom modules</a> too.</p>
<p>Rome also supports <a href="./RomeV0.4TutorialUsingRomeToCreateAndWriteASyndicationFeed.html">Newsfeed output</a> and for each Newsfeed format provides a &quot;generator&quot; class that can take a Syndication Feed model and produce from it Newsfeed XML.</p></div>
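<p>To make the generator idea concrete, here is a minimal sketch of a per-format generator that turns a few feed fields into RSS 2.0 XML. It is an illustration only and assumes the JDK's DOM and Transformer APIs in place of Rome and JDOM; the names GeneratorSketch, Generator, Rss20Generator and write are hypothetical, and a real feed would carry items and many more fields.</p>

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Simplified stand-in for a Rome generator; every name here is hypothetical.
public class GeneratorSketch {

    // One implementation per output format.
    public interface Generator {
        void generate(Document doc, String title, String link, String description);
    }

    // Builds a minimal RSS 2.0 document: rss/channel with title, link and description.
    public static class Rss20Generator implements Generator {
        public void generate(Document doc, String title, String link, String description) {
            Element rss = doc.createElement("rss");
            rss.setAttribute("version", "2.0");
            Element channel = doc.createElement("channel");
            channel.appendChild(textElement(doc, "title", title));
            channel.appendChild(textElement(doc, "link", link));
            channel.appendChild(textElement(doc, "description", description));
            rss.appendChild(channel);
            doc.appendChild(rss);
        }
    }

    // Creates an element that contains only text.
    static Element textElement(Document doc, String name, String text) {
        Element element = doc.createElement(name);
        element.setTextContent(text);
        return element;
    }

    // Runs the generator and serializes the resulting DOM tree to a string.
    public static String write(Generator generator, String title, String link, String description) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            generator.generate(doc, title, link, description);
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not generate feed XML", e);
        }
    }
}
```

<p>Supporting another output format in this sketch would mean adding another Generator implementation; the write method and its callers stay the same.</p>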
<div class="section">
<h3>Learning more<a name="Learning_more"></a></h3>
<p>I've linked to a number of the Rome 0.4 Tutorials; here is the full list from the <a href="../index.html">Rome Wiki</a>:</p>
<ol style="list-style-type: decimal">
<li><a href="./RomeV0.4TutorialUsingRomeToReadASyndicationFeed.html">Using Rome to read a syndication feed</a></li>
<li><a href="./RomeV0.4TutorialUsingRomeToConvertASyndicationFeedFromOneTypeToAnother.html">Using Rome to convert a syndication feed from one type to another</a></li>
<li><a href="./RomeV0.4TutorialUsingRomeToAggregateManySyndicationFeedsIntoASingleOne.html">Using Rome to aggregate many syndication feeds into a single one</a></li>
<li><a href="./RomeV0.4TutorialUsingRomeToCreateAndWriteASyndicationFeed.html">Using Rome to create and write a feed</a></li>
<li><a href="./RomeV0.4TutorialDefiningACustomModuleBeanParserAndGenerator.html">Defining a Custom Module bean, parser and generator</a></li></ol></div>
<div class="section">
<h3>Conclusion<a name="Conclusion"></a></h3>
<p>Overall, Rome looks really good. It is obvious that a lot of thought has gone into design and a lot of work has been done on implementation (and docs). Rome is well on the way to &quot;ending syndication feed confusion by supporting all of 'em&quot; for us Java heads.</p></div></div>
</div>
</div>
</div>
<hr/>
<footer>
<div class="container-fluid">
<div class="row span12">Copyright &copy; 2016.
All Rights Reserved.
</div>
</div>
</footer>
</body>
</html>