Important note: The next project will build directly on your solution to this project. It is essential that you get a decent, working solution for this project before you start working on the next one. Please do not delay work on this project, because failing this one is likely to result in failing the next one as well.
RSS (Really Simple Syndication) is an XML application for distributing web content that changes frequently. Many news-related sites, weblogs, and other online publishers syndicate their content as an RSS feed to anyone who wants it.
For this project, your task is to write a program that asks the user for the URL of an RSS 2.0 feed and for the name of an output file, including the .html extension. The program reads the RSS feed into an XMLTree object and then uses the information in that object to generate, in the output file, a nicely formatted HTML page with a table of links to all the news items in the original feed.
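To make the overall flow concrete, here is a minimal sketch of turning feed items into an HTML table of links. It is not the expected solution: it uses the standard Java DOM parser as a stand-in for XMLTree and takes the feed as a string instead of a URL. The class name FeedTableSketch, the helper names, and the "No title available" placeholder are all assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch: standard DOM is used here in place of XMLTree.
public class FeedTableSketch {

    /** Escapes the characters that are special in HTML text and attributes. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Text of the first direct child element named tag, or fallback if absent or empty. */
    static String childText(Element parent, String tag, String fallback) {
        NodeList kids = parent.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            Node kid = kids.item(i);
            if (kid.getNodeType() == Node.ELEMENT_NODE
                    && kid.getNodeName().equals(tag)) {
                String text = kid.getTextContent().trim();
                return text.isEmpty() ? fallback : text;
            }
        }
        return fallback;
    }

    /** Builds an HTML page with one table row per item in the feed. */
    static String toHtml(String rssXml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(rssXml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
        StringBuilder out = new StringBuilder("<html><body><table border=\"1\">\n");
        NodeList items = doc.getElementsByTagName("item");
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            // Placeholder text is an assumption, not part of the assignment text.
            String title = escape(childText(item, "title", "No title available"));
            String link = childText(item, "link", "");
            out.append("<tr><td>");
            if (link.isEmpty()) {
                out.append(title); // no link: plain text only
            } else {
                out.append("<a href=\"").append(escape(link)).append("\">")
                        .append(title).append("</a>");
            }
            out.append("</td></tr>\n");
        }
        out.append("</table></body></html>\n");
        return out.toString();
    }

    public static void main(String[] args) {
        String feed = "<rss version=\"2.0\"><channel><title>T</title>"
                + "<link>http://example.org</link><description>D</description>"
                + "<item><title>First</title><link>http://example.org/1</link></item>"
                + "</channel></rss>";
        System.out.print(toHtml(feed));
    }
}
```

In the real project the same steps would be carried out with XMLTree methods, and the result would be written to the output file named by the user rather than returned as a string.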
Here is a simplified description of the structure of an RSS 2.0 XML document. RSS 2.0 documents can contain a few other tags and features, but these are the ones you will need for the project.
Note the following properties of RSS 2.0 XML documents:
Note that when your program creates an XMLTree object from a given RSS 2.0 feed, if it is successful, all you know is that the input is a valid XML document. It is up to your program to check that the label of the root of the XMLTree is an <rss> tag and that it has a version attribute with value "2.0". After that, your program can assume that the input is indeed a valid RSS 2.0 feed and the XMLTree has the structure described above; in other words, you do not need to check for the existence of the <channel> tag, or the <title>, <link>, and <description> tags inside that. Make sure you do not make any assumptions that are not specified in the structure described above and, in particular, make sure to check that the channel's <title> and <description> tags and each item's <title> and <description> tags have a child before trying to access it. However, the <item>'s children <link> and <pubDate>, if present, are required to have a child with the needed information. (See slide #9 in RSS for a diagram capturing these requirements.)
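The two kinds of checks described above can be sketched as follows. This is an illustration under stated assumptions, not the required implementation: it uses the standard Java DOM API as a stand-in for XMLTree, and the names RssRootCheck, parseRoot, isRss20, and textOrDefault, as well as the fallback strings, are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical sketch: standard DOM is used here in place of XMLTree.
public class RssRootCheck {

    /** Parses an XML string and returns its root element. */
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    /** True only if the root is an rss tag with a version attribute equal to "2.0". */
    static boolean isRss20(Element root) {
        return root.getTagName().equals("rss")
                && root.hasAttribute("version")
                && root.getAttribute("version").equals("2.0");
    }

    /** Returns the tag's text if it has a child, otherwise the fallback. */
    static String textOrDefault(Element tag, String fallback) {
        // A tag such as <title></title> has no child, so check before reading it.
        return tag.hasChildNodes() ? tag.getTextContent() : fallback;
    }

    public static void main(String[] args) {
        Element good = parseRoot("<rss version=\"2.0\"><channel/></rss>");
        Element old = parseRoot("<rss version=\"0.91\"><channel/></rss>");
        System.out.println(isRss20(good));
        System.out.println(isRss20(old));
        System.out.println(textOrDefault(parseRoot("<title></title>"), "Empty title"));
    }
}
```

The attribute is tested for existence before its value is compared, which mirrors the order of checks the paragraph asks for: first the root label, then the presence of version, then its value.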
These are the minimum requirements for the generated web page. If you think you can produce better output or include more information, you should consult your instructor to make sure that any changes you want to implement are acceptable. This is what your output should include:
You can see an example of the output here (note that the links may be outdated and no longer available).
Here are some links that you may find useful/interesting if you want to know more.