In previous iterations of this course (and CS349A The Intelligent Web), we have been making use of a dedicated RSS feed provided by the Wellesley CIO, Ravi Ravishanker, for the content of Wellesley 25 Live. Here is the URL of the feed: https://events.wellesley.edu/cs-rss.php.
An RSS feed returns data in the XML format. XML is similar to HTML, but its tags are not intended to format content for web pages. Instead, it is a format for storing any kind of data. If you peruse the page (see screenshot below), it looks like the HTML code we write in the text editor, with lots of tags and text between them. Usually, however, the content is a list of items, where every item has a set of fields that are the same across all items. XML and JSON are competing text formats for the same kind of data: the tag names in XML correspond to the property names in JSON (or in a JavaScript object), and the values enclosed by the tags correspond to the property values. Here is what one entry in the Wellesley Events feed looks like, showing the information for a single event:
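To make the correspondence between XML tags and object properties concrete, here is a minimal sketch. The item below is made up for illustration (its tag names and values are assumptions, not the actual fields of the Wellesley feed), and the regular expression only handles flat tags without attributes or nesting. In a browser one would more likely use DOMParser.

```javascript
// Hypothetical item: tag names and values are invented for illustration.
var itemXml =
  "<item>\n" +
  "  <title>Sample Lecture</title>\n" +
  "  <link>https://example.org/events/1</link>\n" +
  "  <pubDate>Mon, 02 Feb 2015 16:30:00 GMT</pubDate>\n" +
  "</item>";

// Turn every flat <tag>value</tag> pair into a property of one object.
function itemToObject(xml) {
  var obj = {};
  var pairRe = /<(\w+)>([^<]*)<\/\1>/g; // \1 forces the closing tag to match
  var m;
  while ((m = pairRe.exec(xml)) !== null) {
    obj[m[1]] = m[2].trim(); // tag name becomes the property name
  }
  return obj;
}

var eventObj = itemToObject(itemXml);
console.log(JSON.stringify(eventObj, null, 2));
```

Each tag name (title, link, pubDate) ends up as a property name, and the text between the tags ends up as the property's value.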
There are three examples of applications from previous years that have made use of this feed.
However, all of these apps used Python to retrieve the data from the feed and then stored the data in a JSON file for use with the app. Here is an example of the JSON file for events of this period (Winter 2015). This file contains only the events visible to the general public; if you want the whole feed of events that can be accessed from campus, execute the Python script while you are on campus.
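The previous apps did this conversion in Python. As a rough sketch of the same idea in JavaScript, the code below splits a feed into its items and serializes them as a JSON array. The two-item feed is made up, its tag names are assumptions, and the file name mentioned in the last comment is hypothetical. Fetching the real feed is left out.

```javascript
// A made-up two-item feed; the real feed's tag names may differ.
var sampleFeed =
  "<rss><channel>" +
  "<item><title>Event A</title><link>https://example.org/a</link></item>" +
  "<item><title>Event B</title><link>https://example.org/b</link></item>" +
  "</channel></rss>";

// Split the feed into <item> blocks, then read the flat fields of each one.
function feedToJson(xml) {
  var items = xml.match(/<item>[\s\S]*?<\/item>/g) || [];
  var events = items.map(function (item) {
    var event = {};
    var inner = item.replace(/^<item>|<\/item>$/g, "");
    // replace() is used here only to visit every <tag>value</tag> match
    inner.replace(/<(\w+)>([^<]*)<\/\1>/g, function (whole, tag, value) {
      event[tag] = value.trim();
      return "";
    });
    return event;
  });
  return JSON.stringify(events, null, 2);
}

var jsonText = feedToJson(sampleFeed);
console.log(jsonText);
// In Node, require("fs").writeFileSync("events.json", jsonText) would save it.
```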
Currently, there is no way for us to access the feed with JavaScript, because it is not a valid RSS feed and the Google Feed API cannot access it. Additionally, due to the "Same Origin Policy" restriction, we also cannot send an AJAX request to the server for this file. This holds even though our web apps (e.g. http://cs.wellesley.edu/~wendy/cs249/AM3/am3.html) reside in the wellesley.edu domain, the same as the feed: the policy compares the scheme, host, and port, and the host cs.wellesley.edu differs from events.wellesley.edu. I have asked LTS to make changes to it. Stay tuned for news on this.
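To see why the Same Origin Policy mentioned above applies even inside wellesley.edu, here is a small sketch that compares origins with the standard URL class. The helper name sameOrigin is ours, not part of any API. It only illustrates the comparison the browser makes and does not send any request.

```javascript
// Two URLs share an origin only if scheme, host, and port all match.
function sameOrigin(urlA, urlB) {
  return new URL(urlA).origin === new URL(urlB).origin;
}

var appPage = "http://cs.wellesley.edu/~wendy/cs249/AM3/am3.html";
var feedUrl = "https://events.wellesley.edu/cs-rss.php";

console.log(new URL(appPage).origin);      // http://cs.wellesley.edu
console.log(new URL(feedUrl).origin);      // https://events.wellesley.edu
console.log(sameOrigin(appPage, feedUrl)); // false
```

Sharing the parent domain wellesley.edu is not enough, because the comparison uses the full host name.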
For some reason, the feed page now shows less data on campus than off campus, making it even less appealing.
RSS feeds are very common on the web, especially for broadcasting recently changed information. News websites make extensive use of them. RSS feeds are great for mashups, because they allow one to aggregate content from multiple websites.
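As a sketch of what aggregating content for a mashup could look like, the code below merges the entries of two feeds into a single list, newest first. The feed objects are made up, and their shape (a title plus entries with title, link, and publishedDate) is an assumption chosen for illustration. The helper name mergeFeeds is ours.

```javascript
// Made-up data for illustration; real feeds would supply these objects.
var feedA = {
  title: "Site A",
  entries: [
    { title: "A1", link: "https://example.org/a1", publishedDate: "Mon, 02 Feb 2015 10:00:00 GMT" },
    { title: "A2", link: "https://example.org/a2", publishedDate: "Sun, 01 Feb 2015 09:00:00 GMT" }
  ]
};
var feedB = {
  title: "Site B",
  entries: [
    { title: "B1", link: "https://example.org/b1", publishedDate: "Mon, 02 Feb 2015 08:00:00 GMT" }
  ]
};

// Collect the entries of all feeds and sort them from newest to oldest.
function mergeFeeds(feeds) {
  var all = [];
  feeds.forEach(function (feed) {
    feed.entries.forEach(function (entry) {
      all.push({
        source: feed.title,
        title: entry.title,
        link: entry.link,
        date: new Date(entry.publishedDate)
      });
    });
  });
  all.sort(function (a, b) { return b.date - a.date; });
  return all;
}

var merged = mergeFeeds([feedA, feedB]);
merged.forEach(function (e) {
  console.log(e.date.toISOString(), e.source, e.title);
});
```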
We can use the Google Feeds API to access any well-formatted RSS feed on the web.
Let's see an example first:
[Interactive demo: choose a website (nytimes.com, cnn.com, bbc.co.uk, or buzzfeed.com), enter a query term (e.g., obama, war, oil, etc.), and click Search.]
For this example to work, we need to add the Google Feeds API by including a script tag that refers to the Google JavaScript API (https://www.google.com/jsapi) and then calling google.load("feeds", "1");
If we know the URL of a feed, we can create an object of type Feed and then load its content.
Try out the following code by copying it onto the console of this web page (so that the Google API is loaded):
var feed = new google.feeds.Feed("http://www.buzzfeed.com/index.xml");
feed.load(function(r) {
  console.log("Buzzfeed: ", r);
});
You should see the response from the Buzzfeed RSS feed in the console. Notice how we need to specify a callback function that will be invoked when the response from the server arrives. By default, the feed returns the four most recent entries. We can use the method .setNumEntries to ask for more or fewer entries.
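Here is a hedged sketch of how .setNumEntries fits together with the callback. The helper loadTitles and its parameter feedsApi are our own names: feedsApi is expected to be google.feeds once the API has loaded. The sketch assumes the response carries either an error property or a feed.entries array whose items have a title. The guard at the bottom means the request is only made on a page where the Google API is actually present.

```javascript
// Ask a feed for `count` entries and pass their titles to `done`.
function loadTitles(feedsApi, feedUrl, count, done) {
  var feed = new feedsApi.Feed(feedUrl);
  feed.setNumEntries(count); // without this call we get four entries
  feed.load(function (result) {
    if (result.error) {
      done([]); // report no titles if the feed could not be loaded
      return;
    }
    var titles = result.feed.entries.map(function (entry) {
      return entry.title;
    });
    done(titles);
  });
}

// Runs only where the Google API has been loaded (e.g. this page's console).
if (typeof google !== "undefined" && google.feeds) {
  loadTitles(google.feeds, "http://www.buzzfeed.com/index.xml", 10, function (titles) {
    console.log("Buzzfeed titles: ", titles);
  });
}
```

Passing the API object in as a parameter keeps the helper easy to try out with a stand-in object when the real API is not available.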
The code of the search example above is a bit different, because instead of connecting to a single feed, it searches all the feeds within a website. Big websites like the New York Times have multiple sections, each with its own feed.
If you are interested, here is the entire code for the example:
<!-- the html part of the code -->
<div id="box">
  <p>Choose a website: </p>
  <p>
    <input type="radio" name="news"><span>nytimes.com</span><br>
    <input type="radio" name="news"><span>cnn.com</span><br>
    <input type="radio" name="news"><span>bbc.co.uk</span><br>
    <input type="radio" name="news"><span>buzzfeed.com</span><br>
  </p>
  <p>Enter a query term (e.g., obama, war, oil, etc.):
    <input id="search">
    <button>Search</button>
  </p>
  <div id="content">Search results will appear here.</div>
</div>
// The Javascript code, goes in a separate JS file.
// Don't forget the script tag for the API.
google.load("feeds", "1");

// Which radio button was chosen, and with what search text
$("button").click(function(){
  var url = $("input[name='news']:checked").next().text();
  var search = $("#search").val();
  findFeed(url, search);
});

// Search all feeds of a site
function findFeed(siteURL, phrase) {
  var query = 'site:' + siteURL + ' ' + phrase;
  google.feeds.findFeeds(query, findDone);
}

var res; // global variable to inspect results

function findDone(result) {
  res = result;
  if (!result.error) {
    // Get content div
    var content = document.getElementById('content');
    var html = '';
    // Loop through the results and print out the title of the feed and link to
    // the url.
    for (var i = 0; i < result.entries.length; i++) {
      var entry = result.entries[i];
      html += '<p><a href="' + entry.url + '">' + entry.title + '</a></p>';
    }
    content.innerHTML = html;
  }
}
To learn more about the Google Feed API, read its documentation.
If you search Google for the name of a website and "rss feed", you will often find the URL to access the feed. Here are some examples:
Go out there and search for feeds and data that you can use in your applications.