Saturday, August 27, 2011
Data: Stock Quotes
Viz: Puzzle - This Shell by Gambit
Tool: Highcharts JS
Data: Internet Movie Database (IMDb)
This database, which started as a hobby project, was later acquired by Amazon.com and has since grown into a huge database with millions of users registering every month. It supports a number of features. The most popular is movie ratings: registered users rate the movies, and these ratings are linked to the list data, so a rating is available for every movie in the database. Another feature lists the entire cast and crew involved in every episode of a TV show; adding it almost doubled the number of titles in the database. The database also lets you search for a character from a movie or TV series and returns details such as who played the character and the character's popular quotes. It also provides filmographies for a large number of people involved in movies and television shows. One more popular feature is the IMDb Top 250 movie list. Its ranking is based on a formula that yields a Bayesian posterior mean. The formula is as follows:
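Ranking = (rating × number of votes + mean vote × m) / (number of votes + m)
where:
rating = average rating of the movie
number of votes = number of votes cast for the movie
mean vote = mean vote across the whole list
m = minimum votes required to be listed in the Top 250 (currently 3000).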
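The ranking formula above can be tried out with a few lines of Python. This is only a minimal sketch: the function name weighted_rating and the sample numbers (a mean vote of 7.0, ratings of 9.0) are made up for illustration and are not taken from IMDb.

```python
def weighted_rating(rating: float, votes: int, mean_vote: float, m: int = 3000) -> float:
    """Bayesian posterior mean: pull a movie's rating toward the overall mean vote.

    rating    -- average rating of the movie
    votes     -- number of votes cast for the movie
    mean_vote -- mean vote across the whole list (hypothetical value in the examples)
    m         -- minimum votes required to be listed (3000 in the post above)
    """
    return (rating * votes + mean_vote * m) / (votes + m)


# Hypothetical numbers: two movies both rated 9.0, overall mean vote 7.0.
few_votes = weighted_rating(9.0, 3000, 7.0)    # exactly m votes
many_votes = weighted_rating(9.0, 27000, 7.0)  # nine times m votes

print(few_votes)   # 8.0
print(many_votes)  # 8.8
```

With only 3000 votes the score lands halfway between the movie's own rating and the mean vote; with more votes it moves closer to the movie's own rating.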
The website is Perl-based. The data can be downloaded as compressed text files and extracted using command-line interface (CLI) tools. A Java-based GUI is used to search and display the information. The database also covers movies in other languages and items related to those movies. For easier access to the dataset, a Python package called IMDbPY was introduced.
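As an alternative to command-line extraction, the compressed text files could also be read directly from Python. The sketch below assumes the files are gzip-compressed and Latin-1 encoded; neither detail is stated in this post, and the helper name iter_lines and the file name in the comment are hypothetical.

```python
import gzip
from typing import Iterator


def iter_lines(path: str, encoding: str = "latin-1") -> Iterator[str]:
    """Yield the lines of a gzip-compressed text file without unpacking it on disk."""
    with gzip.open(path, mode="rt", encoding=encoding) as handle:
        for line in handle:
            yield line.rstrip("\n")


# Example use (hypothetical file name):
# for line in iter_lines("ratings.list.gz"):
#     print(line)
```

Reading line by line keeps memory use small even when the uncompressed file is large.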
The drawbacks of the website are that it does not provide an API for automated queries, and that the exact process used to compute the rankings is not disclosed, which leads to contradictory rankings for a few movies.
IMDb is now also available as a mobile application.
The online database can be viewed at: http://www.imdb.com/.
Tool: Dipity
Dipity is a free, interactive digital timeline tool. It lets users include text, links, pictures and video in their personal timelines.
The tool is intended for anyone who uses the internet. Users can customize the look of their embedded timeline and add as much data as they like, in any form. They can create, share, embed and collaborate on interactive, visually engaging timelines that integrate video, audio, images, text, links, social media, location and timestamps. Dipity also makes public timelines searchable, which can increase traffic and user engagement on one's website.
Dipity lets users gather real-time sources from social media, search engines and RSS, and converts them into a highly interactive visualization. It combines multimedia and social media content with timestamps, geolocation and real-time updates. One can zoom in and out of the timeline, from hours to years. This makes both historical and current data highly visual and attractive.
Data: UNdata - A World of Information
Tool: d3.js (Data-Driven Documents)
D3.js is a JavaScript library for building visualizations for websites and web applications. Like any other JavaScript code, it takes JSON data or DOM objects as input for generating visualizations. The author of this tool is Mike Bostock. The tool focuses on manipulating the Document Object Model (DOM) based on data, exploiting the full capabilities of HTML5 and CSS3. It uses SVG graphics with JavaScript very efficiently, which lets it render large datasets extremely fast, with animations and interactions. It is divided into modules, so a web application can stay light by including only the modules it needs.
Tool: Google Fusion Tables
Friday, August 26, 2011
Data: America's Children: Key National Indicators of Well-Being
The report identifies seven domains that characterize a child's well-being: family and social environment, economic circumstances, health care, physical environment and safety, behavior, education, and health. These domains are interrelated and can have a great effect on well-being.
The report provides tables for these seven domains, along with graphs that show how the indicators have changed over the years.
Here is a snapshot of a table with details on child care.
The tables can be viewed here: http://www.childstats.gov/americaschildren/tables.asp
The tables contain the available data from 1950 to 2010.
References:
1. http://www.childstats.gov
2. http://www.childstats.gov/americaschildren/tables.asp