Lab: XMLTree RSS Processing


The Problem

RSS (Really Simple Syndication) is an XML application for distributing web content that changes frequently. Many news-related sites, weblogs, and other online publishers syndicate their content as an RSS feed to whoever wants it. In this lab, you will write code that extracts information from an RSS (version 2.0) document loaded into an XMLTree object.

RSS 2.0 documents have the following format:
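
As a rough orientation only, here is a minimal sketch of that shape, restricted to the elements this lab mentions (channel, title, link, description, item, pubDate, source). The sample feed is hypothetical, with placeholder titles and URLs. The sketch parses it with the JDK's built-in org.w3c.dom parser as a stand-in, not with the course's XMLTree, and returns an indented outline of the tag nesting.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class RssShapeSketch {

    // Hypothetical sample feed; every title, date, and URL is a placeholder.
    static final String SAMPLE = "<rss version=\"2.0\">"
            + "<channel>"
            + "<title>Example Feed</title>"
            + "<link>http://example.com/</link>"
            + "<description>A made-up feed.</description>"
            + "<item>"
            + "<title>First story</title>"
            + "<link>http://example.com/1</link>"
            + "<pubDate>Mon, 01 Jan 2024 00:00:00 GMT</pubDate>"
            + "<source url=\"http://example.com/src\">Example Source</source>"
            + "</item>"
            + "<item><description>Story with no title</description></item>"
            + "</channel>"
            + "</rss>";

    /** Parses an XML string with the JDK DOM parser and returns its root element. */
    static Element parse(String xml) throws Exception {
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(bytes)).getDocumentElement();
    }

    /** Returns one line per element, indented two spaces per nesting level. */
    static String outline(Node node, int depth) {
        StringBuilder sb = new StringBuilder();
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            for (int i = 0; i < depth; i++) {
                sb.append("  ");
            }
            sb.append('<').append(node.getNodeName()).append(">\n");
            // Text nodes are skipped by the element check in the recursive call.
            for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
                sb.append(outline(c, depth + 1));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(outline(parse(SAMPLE), 0));
    }
}
```

In the sample, the second item has a description but no title, which is the case the lab's item-processing step has to handle.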

Note the following properties of RSS 2.0 XML documents:

Setup

Follow these steps to set up a project for this lab.

  1. Create a new Eclipse project by copying ProjectTemplate. Name the new project RSSProcessing.
  2. Open the src folder of this project and then open (default package). Pick any one of the Java files as a starting point, rename it RSSProcessing, and delete the other files from the project.
  3. Follow the link to RSSProcessing.java, select all the code on that page (click and hold the left mouse button at the start of the program and drag the mouse to the end of the program) and copy it to the clipboard (right-click the mouse on the selection and choose Copy from the contextual pop-up menu), then come back to this page and continue with these instructions.
  4. Finally, in Eclipse, open the RSSProcessing.java file, select all the code in the editor, right-click on it, and select Paste from the contextual pop-up menu to replace the existing code with the code you copied in the previous step. Save your file.

Method

  1. Implement the following static method that, given an XMLTree and a tag name (a String), searches the children of the XMLTree for the given tag and returns the index of the first occurrence of the tag or -1 if the tag does not exist.
  2. Review the main method skeleton and modify it to output the title, description, and link of the RSS channel. Each element in the output should be preceded by a descriptive label, e.g.,
    Title: Yahoo! News - Latest News & Headlines
    Description: The latest news and headlines from Yahoo! News.
    Link: http://news.yahoo.com/
    Run the program and test your implementation. As input you can use any URL of a valid RSS 2.0 feed, e.g., https://news.yahoo.com/rss/.
  3. Once you are confident that your implementations above are correct, implement the following static method that, given an XMLTree whose root is an <item> tag and an output stream, outputs the title (or the description, if the title is not available) and the link, if available. Here is an example of what the output might look like:
    Title: Tropical Storm Leslie churns northward in Atlantic
    Link: http://news.yahoo.com/storm-churns-northward-winds-buffeting-bermuda-144218080.html
  4. Back in the main method, add code so that it prints all items in the RSS channel by repeatedly calling processItem. Then run and test your code to make sure it works as intended.
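
The steps above can be sketched roughly as follows. This is not the lab solution and does not use the course's XMLTree: it swaps in the JDK's org.w3c.dom classes so the example runs on its own. The names firstChildIndex, childText, printFeed, and SAMPLE are hypothetical, the feed contents are placeholders, and labeling a fallback description with "Description:" is an assumption about the output format.

```java
import java.io.ByteArrayInputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class RssDomSketch {

    // Hypothetical sample feed; every title and URL is a placeholder.
    static final String SAMPLE = "<rss version=\"2.0\"><channel>"
            + "<title>Example Feed</title>"
            + "<link>http://example.com/</link>"
            + "<description>A made-up feed.</description>"
            + "<item><title>First story</title><link>http://example.com/1</link></item>"
            + "<item><description>Story with no title</description></item>"
            + "</channel></rss>";

    static Element parse(String xml) throws Exception {
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(bytes)).getDocumentElement();
    }

    /** Index of the first child element named tag, or -1 if there is none. */
    static int firstChildIndex(Node parent, String tag) {
        NodeList kids = parent.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            Node k = kids.item(i);
            if (k.getNodeType() == Node.ELEMENT_NODE && k.getNodeName().equals(tag)) {
                return i;
            }
        }
        return -1;
    }

    /** Trimmed text of the first child named tag, or null if it is absent. */
    static String childText(Node parent, String tag) {
        int i = firstChildIndex(parent, tag);
        return i < 0 ? null : parent.getChildNodes().item(i).getTextContent().trim();
    }

    /** Prints the title (or the description as a fallback) and the link, if present. */
    static void processItem(Node item, PrintStream out) {
        String title = childText(item, "title");
        String description = childText(item, "description");
        String link = childText(item, "link");
        if (title != null && !title.isEmpty()) {
            out.println("Title: " + title);
        } else if (description != null && !description.isEmpty()) {
            out.println("Description: " + description);
        }
        if (link != null) {
            out.println("Link: " + link);
        }
    }

    /** Prints the channel header, then every item. Assumes a valid RSS 2.0 feed. */
    static void printFeed(Element rss, PrintStream out) {
        Node channel = rss.getChildNodes().item(firstChildIndex(rss, "channel"));
        out.println("Title: " + childText(channel, "title"));
        out.println("Description: " + childText(channel, "description"));
        out.println("Link: " + childText(channel, "link"));
        NodeList kids = channel.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            Node k = kids.item(i);
            if (k.getNodeType() == Node.ELEMENT_NODE && k.getNodeName().equals("item")) {
                processItem(k, out);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        printFeed(parse(SAMPLE), System.out);
    }
}
```

The index-returning search mirrors the method described in step 1: callers check for -1 before fetching the child, which keeps the "tag is missing" case explicit at every use.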

Additional Activities

  1. Modify processItem (including updating the comments) so that, in addition to the title (or description) and link, it also outputs the publication date (tag pubDate) and the source (tag source) together with the source URL (the url attribute of the source tag). If any of these elements is not present, output <element> not present (where <element> is replaced by the name of the missing tag).
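
One possible shape for this extension is sketched below. It again uses the JDK's org.w3c.dom classes as a stand-in for XMLTree so that it runs on its own. The names firstChild, printExtras, and parse are hypothetical, and the "Publication date:" and "Source:" labels, along with printing the URL in parentheses, are assumptions about the output format.

```java
import java.io.ByteArrayInputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class RssItemExtrasSketch {

    static Element parse(String xml) throws Exception {
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(bytes)).getDocumentElement();
    }

    /** First child element named tag, or null if there is none. */
    static Element firstChild(Element parent, String tag) {
        for (Node k = parent.getFirstChild(); k != null; k = k.getNextSibling()) {
            if (k.getNodeType() == Node.ELEMENT_NODE && k.getNodeName().equals(tag)) {
                return (Element) k;
            }
        }
        return null;
    }

    /** Prints pubDate and source (with its url attribute), or a "not present" line. */
    static void printExtras(Element item, PrintStream out) {
        Element pubDate = firstChild(item, "pubDate");
        if (pubDate != null) {
            out.println("Publication date: " + pubDate.getTextContent().trim());
        } else {
            out.println("pubDate not present");
        }

        Element source = firstChild(item, "source");
        if (source != null) {
            String line = "Source: " + source.getTextContent().trim();
            // The url attribute is read only if the tag actually carries it.
            if (source.hasAttribute("url")) {
                line += " (" + source.getAttribute("url") + ")";
            }
            out.println(line);
        } else {
            out.println("source not present");
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical items; the date, name, and URL are placeholders.
        printExtras(parse("<item><pubDate>Mon, 01 Jan 2024 00:00:00 GMT</pubDate>"
                + "<source url=\"http://example.com/src\">Example Source</source></item>"),
                System.out);
        printExtras(parse("<item><title>No extras</title></item>"), System.out);
    }
}
```

Checking for the attribute before reading it matters because asking for a missing attribute is typically an error or yields an empty value, depending on the API.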