Friday, December 16, 2011

Project: Express Yourself

About the project:

Every person has an opinion. With social networks so widely used these days, people like to express their opinions and learn others' opinions on various topics. Closely observing the patterns in people's opinions, how they change over time and how consistent they are, gives us a better understanding of what people perceive and what they want, whether in politics, in a work environment or just within a circle of friends. Our app, Express Yourself, facilitates this by presenting a real-time visualization of user opinions.

Express Yourself, as the name suggests, lets users express themselves by creating their own polls and voting on polls created by others. The app plots the results of every poll against a time series, and the visualization is updated every 10 seconds to capture changes in users' opinions. These polls may be about what users think of the current government or just a simple question about the previous day's basketball match!

The user only needs to log in via Facebook and is all set to go! He can create polls, respond to polls and view the visualizations. Clicking on any of the questions shows the real-time visualization of its responses, in the form of a line graph against a time series. Color legends and background labels make the visualization self-explanatory. Each vote is a value from 1 to 7 expressing how strongly he supports that option. The user has his own profile page that displays the posts he created, his responses and the percentage of support for his vote. The home page of the app displays the popular posts and the new polls. The user is also provided with a "Search" feature that lets him look through the archives. Also, when he creates a new post, the app pulls up similar questions so that he does not create duplicates.
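As a rough illustration of how votes on a 1-7 scale could be turned into the points of a line graph against a time series, here is a minimal Python sketch. It is not the app's actual code (the site is built in Ruby on Rails); the function name, the vote tuple layout and the choice of averaging per 10-second bucket are all assumptions made for the example.

```python
from collections import defaultdict

BUCKET_SECONDS = 10  # assumed to match the 10-second refresh interval


def bucket_votes(votes, bucket_seconds=BUCKET_SECONDS):
    """Group (timestamp, option, score) votes into fixed time buckets.

    Returns {option: [(bucket_start, mean_score), ...]} with each list
    sorted by time, which is the shape a line chart would consume.
    """
    sums = defaultdict(lambda: [0, 0])  # (option, bucket) -> [total, count]
    for timestamp, option, score in votes:
        if not 1 <= score <= 7:
            raise ValueError(f"score must be between 1 and 7, got {score}")
        bucket = int(timestamp // bucket_seconds) * bucket_seconds
        entry = sums[(option, bucket)]
        entry[0] += score
        entry[1] += 1

    series = defaultdict(list)
    for (option, bucket), (total, count) in sorted(sums.items()):
        series[option].append((bucket, total / count))
    return dict(series)


if __name__ == "__main__":
    # Hypothetical votes: (seconds since poll creation, option, score)
    sample = [(0, "yes", 7), (5, "yes", 5), (12, "yes", 1), (3, "no", 2)]
    print(bucket_votes(sample))
```

Each option becomes one line on the chart, and re-running the aggregation on every refresh would pick up newly cast votes.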


We built the website with the Ruby on Rails framework. The visualization was developed using the Highcharts JavaScript library and jQuery UI elements.

Further scope:
We aim to add new features to improve the user input. We plan to show sparklines next to the list of polls and to add weekly, monthly and yearly tabs that display the visualization over a chosen period of time. Our larger aim is to allow companies to use this app to learn about their employees' moods, thoughts and needs. We believe this will be of great help as a source of constant employee feedback to the company.

We would like to thank Dr. Watson for his valuable feedback, constant support and guidance throughout this project. We feel it has been a great learning opportunity. We would also like to thank all the guests for their feedback.

Team: Pradeep Kumar Ramaswamy (pramasw3), Gopikannan Venugopalsamy (gvenugo), Shishir Kakaraddi (smkakara), Sridevi Venugopal (sthirum), Preeti Muppalla (ppreeti)

Thursday, December 15, 2011

Project: Census of India

Census of India attempts to visualize the Indian population census of 2011. As of February 2011 the population of India is 1.2 billion, so the census data set is correspondingly large. In our project we present this data to the user in an interactive way rather than in plain textual form. The web visualization uses an interactive map of India and charts to represent the data. The charts cover some important categories such as population size, sex ratio and literacy. We have also provided a feature to compare five states, so as to quickly see the trends in each state and which state leads in which category.

We developed the site using HTML, CSS and JavaScript.
We made use of the Highcharts API for the charts and ammap for the map.
The editor used was Microsoft Expression Web 4.
The data set is an XML file, which is available for download on the website itself.

Website hosted on:

Source Code:

Screen cast on YouTube:

Aditya Sahasrabuddhe, Anuj Sharma and Pavan Gopal Bandla

We would like to thank Dr. Watson for his help and invaluable feedback throughout the studio sessions, which helped us in iteratively designing the visualization. We would also like to thank all the guests (present during the studio sessions and the final presentation) for their important feedback.

Project: Baseball Visualization

Baseball Visualization attempts to capture important statistics from the past 100 years of baseball history in the form of data sets, which are visualized using HTML, CSS, JavaScript and jQuery. Using this visualization tool, we have uncovered some important correlations in the history of baseball by analyzing the players' performance, U.S. statewide statistics and the performance of the teams in the American and National leagues. We have also shown how significant events in the history of baseball have had an impact on the game and on the different leagues. It is easy to navigate the homepage and find the visualization tools that we have built. We have also embedded a GitHub link on the homepage for easy navigation to the source code. Please feel free to leave comments that will help us in the future development of the project.

Here are the links:

Source code -

Visualization site -

Data set -

Screencast -

Who are we?

Vikhyath Reddy Marapadaga (vmarapa), Charan Chaudary Lekkalapudi (clekkal), Vinay K. Patnana (vkpatnan) & Ravi Teja Manda (vrmanda).

We express our sincere thanks to Dr. Watson for guiding us in the project and providing us valuable feedback that helped us stay on proper course in the project.

Reaction: Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods

Dealing with the subjectivity of visualization, the authors identified the need for a scientific foundation and they took some steps in that direction. The paper includes both theory and experimentation that backs up the theory.

The first part of the theory is the identification of a set of elementary perceptual tasks. The second part is an ordering of those tasks by how accurately people perform them. The experimentation consists of subjects recording their judgements of the quantitative information in a graph.

The theory and experiments support the conclusion that a complete change is needed in the graphs currently being used. The authors also propose alternative graphs: dot charts, dot charts with grouping and framed-rectangle charts. Even though I understand the problems presented in this paper, I don’t really find the new graphs better, or see them being used in the long term (which apparently happened).

Reaction: Imaging Vector Fields Using Line Integral Convolution

The authors present a new method for imaging vector fields using line integral convolution (LIC). A comparison is made between DDA convolution and LIC. The flaw in DDA convolution is that it renders vector fields unevenly, so small-scale features lose detail. LIC is the alternative presented in this paper that doesn’t have this flaw, because it convolves along the curved local streamline rather than a straight-line approximation of it.

The paper is a little complicated, and the authors go through the whole technique. They discuss its application, implementation and advantages. Its advantages include removing the aliasing error and its flexibility (the ability to interface with other techniques).
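To make the idea of convolving a texture along streamlines more concrete, here is a deliberately simplified Python sketch. It is not the paper's algorithm as published: it assumes fixed-step Euler tracing, a box filter and nearest-pixel sampling, and the function name and parameters (`lic`, `length`, `step`) are made up for the example.

```python
import math


def lic(vector_field, texture, length=10, step=0.5):
    """Simplified line integral convolution with a box filter.

    vector_field[y][x] is a (vx, vy) tuple and texture[y][x] is a float.
    For each pixel, a streamline is traced forward and backward with
    fixed-step Euler integration, and the texture samples met along the
    way are averaged together with the pixel's own value.
    """
    height, width = len(texture), len(texture[0])
    output = [[0.0] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            total, count = texture[y][x], 1
            for direction in (1.0, -1.0):
                px, py = x + 0.5, y + 0.5  # start at the pixel center
                for _ in range(length):
                    vx, vy = vector_field[int(py)][int(px)]
                    norm = math.hypot(vx, vy)
                    if norm == 0.0:
                        break  # no direction to follow at a zero vector
                    px += direction * step * vx / norm
                    py += direction * step * vy / norm
                    if not (0 <= px < width and 0 <= py < height):
                        break  # streamline left the image
                    total += texture[int(py)][int(px)]
                    count += 1
            output[y][x] = total / count
    return output
```

With a white-noise input texture, pixels along the same streamline end up sharing most of their samples, which is what produces the streaked look that follows the field.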

Reaction: Marching Cubes: A High Resolution 3D Surface Construction Algorithm

A new algorithm (at the time) for drawing isosurfaces (analogous to isolines) was presented in this paper: Marching Cubes. The basic idea is to break the volume into smaller volumes (divide and conquer). Then each cube is classified into one of 15 configurations (the 256 possible cases reduce to 14 patterns, plus the empty case, after two different symmetries). According to external sources I read, the original algorithm contains mistakes, and additional cases were added later.
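The 256 cases come from each of a cube's 8 corners being either inside or outside the surface (2^8 = 256). A minimal Python sketch of that classification step is shown below; it treats a corner as "inside" when its value is greater than or equal to the isovalue, and the function names and corner ordering are assumptions for illustration. The lookup table that maps an index to triangles is omitted.

```python
def cube_index(corner_values, isovalue):
    """Return the 8-bit configuration index (0..255) for one cube.

    Bit i is set when corner i has a value at or above the isovalue.
    The corner ordering is whatever order the caller passes in.
    """
    if len(corner_values) != 8:
        raise ValueError("a cube has exactly 8 corners")
    index = 0
    for bit, value in enumerate(corner_values):
        if value >= isovalue:
            index |= 1 << bit
    return index


def is_empty_case(index):
    """True when the surface does not cross the cube at all.

    Index 0 (all corners below) and 255 (all corners at or above)
    produce no triangles.
    """
    return index in (0, 255)


if __name__ == "__main__":
    values = [0.2, 0.9, 0.1, 0.4, 0.3, 0.8, 0.0, 0.5]
    idx = cube_index(values, isovalue=0.5)
    print(idx, is_empty_case(idx))
```

After computing the index for one cube, the algorithm would look up the matching pattern and then "march" to the next cube.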

The name comes from the concept that after processing a cube, we march (move on) to the next cube. The algorithm was implemented in C and demoed using 3D medical images. The results are impressive, even more so considering the year.

At the end of the paper, the authors comment on a new algorithm, dividing cubes. It draws points instead of triangles, which works because as the resolution increases the number of triangles approaches the number of pixels displayed.

Reaction: Task Taxonomy for Graph Visualization

Until now I thought that making visualizations was an artistic task. That is not entirely true, but I had never seen or thought about the procedure as a fixed list of tasks. The artistic component and the subjective appreciation of visualizations make them hard to measure and evaluate. The authors provide a taxonomy, benchmark datasets and specific tasks that would help both designers and evaluators.

The idea is to break these tasks into lower-level tasks, and for this the authors add two general tasks to those proposed by Amar et al. Tasks are organized into four groups: topology-based, attribute-based, browsing and overview tasks.

General descriptions and examples are given for each task.


Text analysis is a field that I hadn’t thought about before this class. For me it’s interesting to discover new information in data already available to us: text mining Twitter posts and Facebook comments could give an impressive amount of information to researchers and businesses alike.

This chapter provides a snapshot of visualization tools applied to text: visualizing text mining results, document concordance and word frequencies, and literature and citation relationships. For text mining, many tools are shown, including Jigsaw. Word concordance visualizations are by far my favorites: DocuBurst is impressive, and TextArc looks very useful but very hard to display due to its size; zooming controls would help, of course. Tag clouds are always fun, and the “Word Nebula” project done by my fellow classmates seems a step in the right direction.

Finally, the chapter shows visualization tools for relations in literature. One example is a graph showing relationships among scientific disciplines. Just thinking about a table showing the same data makes me see the importance of visualization in this and other cases.

Reaction: Jigsaw: Supporting Investigative Analysis through Interactive Visualization

Jigsaw is a visualization tool that helps investigative analysts who work with collections of text documents. It visualizes the documents and the entities (places, persons, things) in them. The input is a set of reports written in natural language, each about 1-5 paragraphs long. Output is presented in four different views: List View, Graph View, Scatterplot View and Text View.

The List View shows lists of types of entities, side by side, connected by a line if they appear in the same report. Graph View shows a graph with reports and entities as nodes and lines in between. Scatterplot View shows types of entities on the X and Y axes (I didn’t understand the different colors shown in the picture). Finally, the Text View is the original text from the reports with the entities highlighted and color-coded according to their type.
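The "connected if they appear in the same report" rule behind the List View can be sketched in a few lines of Python. This is only an illustration of the co-occurrence idea, not Jigsaw's implementation; the function name, the input layout and the sample entities are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def entity_connections(reports):
    """Map each pair of co-occurring entities to the reports they share.

    `reports` maps a report id to the set of entities found in it.
    Pairs are stored in sorted order so (a, b) and (b, a) coincide.
    """
    connections = defaultdict(set)
    for report_id, entities in reports.items():
        for a, b in combinations(sorted(entities), 2):
            connections[(a, b)].add(report_id)
    return dict(connections)


if __name__ == "__main__":
    # Made-up reports and entities, purely for demonstration
    sample = {
        "report-1": {"Alice", "Boston"},
        "report-2": {"Alice", "Boston", "Carl"},
        "report-3": {"Carl", "Denver"},
    }
    for pair, shared in sorted(entity_connections(sample).items()):
        print(pair, sorted(shared))
```

The number of pairs grows quadratically with the entities per report, which gives a feel for why the views may become cluttered on larger collections.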

The tool seems to be a good idea for dealing with this type of information, but I’d be worried about how it performs with large amounts of information. The Graph View looks very nice with three reports, but 100 could be tough to represent clearly. Other views will have problems too; for example, the tabs on the Text View will overflow, and the Scatterplot matrix will be too big to be displayed or too cluttered.

Project: Gmail Viz

GmailViz is a project aimed at providing a way to analyze your inbox using parameters like
  • last reply to a mail
  • importance based on frequency of mail
  • user importance in your inbox
  • average response rate
  • timeline of the mail: how frequently you reply and how often you receive mails from certain senders.

We used Java Servlets to talk to our database, which is built in Oracle. We query the database, generate the result and pass it as JSON via Ajax calls to JavaScript.

We analyze your inbox to find the importance of each mail and present it to you in graphical form. We also carry out other useful analyses like email overload: suggesting mails that ought to be deleted from your inbox because they have not been replied to in a very long time or because their importance is very low. We visualize such emails in the form of a scatter chart and present you with links to them, so that you can go ahead and delete them, thereby freeing up space in your mailbox.

We run a batch process to populate the tables in our database. Once this batch process is completed, you can start analyzing your inbox by clicking the Analyze button on the home page, shown in the figure below:
index page for GmailViz
The following are some other screenshots from the visualization that we produce based on the analysis of your inbox:

Chart showing timeline vs importance

Drill-down from the previous chart, zooming in on a particular time period and giving granular details about importance

Email Overload with mail preview
Graph showing when the user was last replied to

Graph showing keyword analysis

User importance in your inbox

We used Raphael.js and the Google Chart API for plotting the charts. The pages were developed using HTML5, CSS3 and JavaScript.


Data: We build our own dataset by analyzing the mail in your inbox

Team: Sarvesh Pai, Girish Pandit, Nidhi Pathak, Vartika Singh

Thanks to Prof. Dr. Watson, who provided us with vital feedback at various stages of our project.
Special thanks also to Clayton Coleman, Bill Houghteling and Prof. Patrick Fitzgerald for their valuable insight into our project during the demo.

Reaction: Effectively Communicating Numbers

This white paper discusses the best practices to present quantitative information from a BI perspective. The author explains the best means to graph quantitative data and to communicate a message effectively. The message is restated through the whole paper: a plot or chart can either communicate a message or not.

The author not only describes the advantages and disadvantages of point, line, box and bar plots, but also details the key points for formatting and sizing axes to, again, communicate a message effectively.

The article is full of recommendations and good practices, and as an added value it also briefly discusses quantitative analysis techniques like regression and how to communicate numbers effectively.

Reaction: The Eyes Have It

This paper does a good job explaining why visualization is both an art and a science. While it is true that it relies on basic principles and a structured approach, each visualization plot has inherent peculiarities and a particular design.

It describes the basic tasks as a checklist that should be carried out to design visualizations (overview, zoom, filter, details, relate, history, extract) and specifies correct techniques for particular analysis on various dimensions. It also relates it to why human perception responds differently to the particular stimuli from these techniques.

Since some of the above tasks and requirements are done intuitively to a certain extent, probably the most valuable discussion in the article is on querying and filtering. It briefly discusses the importance and usefulness of Boolean logic, Venn diagrams and decision tables as an optional filter before the exploratory plotting.

Reaction: Value of Information Visualization

Information visualization is tangibly valuable. However, it is very difficult to quantify this value, which is exactly where InfoVis comes as a valuable asset. The paper discusses a large number of topics that are relevant not only for visualization purposes, but also for data mining and unsupervised machine learning.

One of the biggest challenges of data mining is to explain patterns and associations in higher dimensions. While visualization can support and supervise data mining algorithms in a few dimensions, it is limited to explaining only a few dimensions at a time.

There is also a detailed discussion of some classical visualization examples like Napoleon’s march to Moscow and Snow’s illustration of the source of the cholera outbreak, two graphs we saw in class. The paper also does a very good job explaining basic principles and dimensions for InfoVis (proximity, similarity, continuity, symmetry, closure, relative size) and why they are all important. It includes relevant examples and illustrations that can be used for further reference. However, it lacks consistency when trying to quantify the value of InfoVis.

Reaction: Balancing Systematic and Flexible Exploration of Social Networks

In this article the authors present SocialAction, which seems to be a fairly customizable way to visualize, analyze and interact with networks and their components. Recently, Social Network Analysis (SNA) has become popular for its applicability to fraud detection (through outliers and their links) and to estimating the potential of a given population based on the behavior of some network elements. I believe that SocialAction is flexible enough to perform industry-standard network analysis for these and other industries.

SocialAction's flexibility is outstanding. Its features include ordering, ranking, coloring, querying and node rearranging. Nodes and links can be added, deleted or selected based on customizable parameters. The exploratory analysis is well complemented by a friendly way to create standard plots. As expected, it is more difficult to analyze networks as the dimensions increase, particularly if links overlap. The tool also includes a way to analyze based on dimension reduction and network reduction.

Should the authors provide a case study for trending Social Network Analysis problems, like fraud detection and profit network potential, SocialAction would increase its chances of being the next ultimate tool for this kind of analysis.

Reaction: TileBars: Visualization of Term Distribution Information in Full Text Information Access

In this article the author presents a tool designed to simultaneously visualize where a term appears along the length of a document, its frequency and its distribution across different documents. The user is given a lot of freedom: he can specify both the number of terms to analyze and which Boolean connectors to use.
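The core idea of showing how a term is distributed along a document can be sketched in Python as follows. This is only an approximation for illustration: it cuts the text into fixed-size word windows instead of the topical segments the paper's tool uses, and the function name and parameters are assumptions.

```python
def term_distribution(text, terms, segment_words=50):
    """Count occurrences of each term in consecutive segments of a text.

    Returns {term: [count_in_segment_0, count_in_segment_1, ...]}.
    Segments here are fixed-size word windows, which is a simplification.
    """
    words = [w.strip(".,;:!?()\"'").lower() for w in text.split()]
    segments = [
        words[i:i + segment_words]
        for i in range(0, len(words), segment_words)
    ]
    return {
        term: [segment.count(term.lower()) for segment in segments]
        for term in terms
    }


if __name__ == "__main__":
    sample = "Data is everywhere. " * 3 + "Charts show trends. " * 3
    for term, counts in term_distribution(sample, ["data", "charts"], 3).items():
        print(term, counts)
```

Each list of counts corresponds to one row of tiles for a document: darker tiles would be drawn where the count is higher, so overlapping terms show up as aligned dark columns.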

Today, searches within PDF and (Microsoft) Word documents return a term's matches in order of appearance. Search engines calculate relevance based on historic data and analysis algorithms. It would be interesting to compare and contrast the TileBars visualization tool with text documents as well as with search engines. This way the user could take advantage of a customizable, simultaneous way to visualize a given term and prioritize the output depending on more robust criteria.

TileBars seems a useful tool, even for visualization only. However, the TextTiling algorithm does not seem very robust for exploring document structure.

Reaction: Imaging Vector Fields Using Line Integral Convolution

This paper talks about blurring textures along a vector field. The comparisons made in this paper are good. The paper describes a technique called LIC, and in describing it the authors discuss several aspects of the technique. LIC has several advantages, which are discussed. Some of the example images given are very interesting.

However, how the technique helps in understanding vector fields was not very apparent to me from this paper initially.

Project: MoViz

MoViz is a novel visualization that provides a single platform for all your movie rating and review needs. MoViz brings together ratings and reviews from Rotten Tomatoes, Amazon and IMDb to depict differences and similarities in ratings and help you decide better. It also shows you the number of users who trust a certain source.
This visualization gives a lot of importance to the consolidated ratings obtained from different sources. We have loaded the top 50 movies (from the IMDb Top 250 list) into our data store. The following is a snapshot of the Ratings tab:

The following are the details of the project:
The team: Anil Kulkarni, Deepa Bantwal Baliga, Shilpa Srinivasan, Sreeja Ravikumar.
The source code:

The screen cast demo can be found here:

Project: Medipa

Team: Juhee Bae, David Gao, Eric Helms, Justin Sherrill

Description: Getting access to medical images is a bit troublesome with the tools we have today. Medipa aims to resolve this by providing the necessary tool for medical professionals to upload medical images such as CT or MRI scans to the internet. This opens the door for patients to see their own scans and allows other medical professionals to collaborate. Medipa also provides additional tools for users to interact with the rendered model.

Data: Uploaded to EC2 instance for viewing.

Note: The site itself has upload disabled, because we did not want to bog down the experience of users interacting with the images.

Project: A Better Tag Cloud

The Project Description
Word and Tag clouds have been around for a while, but they are often criticized for not offering enough information or enough valuable information. Our team has attempted to improve on a basic tag cloud by offering additional data gathering, text manipulation, and word associativity analysis features, while still keeping the tag cloud fun.

This screenshot shows the site in action with text scraped from the blog, with the word "data" clicked on:

The Details

The team: Pamela Ocampa, Juliane Foster, Michael Rountree, Andrea Villanes
The data: Can be gathered from blogs or Twitter, or copied and pasted directly into the site

Wednesday, December 14, 2011

Project: Demystifying Cricket

About the Project
Demystifying Cricket is a visualization that helps users visualize, and thus analyze, statistical data related to the game of cricket. Cricket is a popular sport in Asian and European countries, with a fan base in the millions. The game originated in England, and international Test records go back to 1877, so there is a huge amount of statistical data covering thousands of matches played worldwide! However, there has been no attempt at making this data analyzable using modern visualization tools. We have done just that.

Our aim is to take this huge data set of matches, series and players and construct visualizations that let an avid fan play around with the data while making it simpler to glean information from it.

For the data set, we took data from the site HowStat!, which has a collection of data going back to 1877. The amount of data on this site is humongous, and it is not available as a structured database. Thus, we created our own database using the data from this site.

The data-set or the database is in MySQL and can be found in the Github repository under the file-name 'Cricket.sql'.

For more information and a description of our project, we have provided a detailed two-part screencast hosted on YouTube.
Here are the links: Part 1 and Part 2.

Screen-shots of our application

 Fig 1. The HomePage -  World Cup Globe

Fig 2. The Cartogram Globe 

Fig 3. Team Performance Chart 

Fig 4. Head to Head Performance Chart 

Fig 5. Batting Statistics Charts

Future work on our visualization focuses on extending the database to incorporate batting statistics for Test matches as well as bowling statistics, to create a full-scale cricket visualization powerhouse. Furthermore, most of these charts could be paired with a slider-cum-timeline that users can use to detect significant patterns in their team's performance. We have so much data that we could also predict the probability of a team winning a game against a particular opponent at a particular venue!

Finally, here are the related links for our Project:

  1. The Live Application:
  2. The Source Code:
  3. The ScreenCast(s): Part 1 & Part 2
  4. The Team:  Mayur Awaghade, Ameya Gholkar, Micheal Matthews, Swathi Subramanian
Thanks to Dr. Watson for an excellent course with great insight into visualizations and their concepts, and also for guiding us throughout the course of the project. We would also like to thank Clayton Coleman, Bill Houghteling and Prof. Patrick Fitzgerald for their great and helpful feedback.

Project: LifeGraf – Visualizing personal finances with simple perspective

Himanshu Arora, Lavanya Mohanan, Vivek Dodeja
LifeGraf is a tool for visualizing and comparing the prices of various commodities across different countries. Financial information is characterised by many complexities, such as variation over time, exchange rates between countries, normalization of different data indices and relevance across various commodities. Hence, analysing financial data is quite a task for someone who doesn’t want to get into complicated calculations.
Visualizing financial data is important for supporting effective analysis. The scope of analysis provided by this tool is a comparison of an individual's personal finances across different countries in a much simplified way.

The dataset used for LifeGraf is derived from Numbeo, which has the world’s largest database about the cost of living in cities worldwide. However, it only shows standard visualizations of its data. Hence we set out to visualise this information in a totally new sort of interface, which would not only convey the information but also be fun to look at.
To extract the data from Numbeo, we implemented a data-scraping script in Java, which turned the site's data into usable content. As we studied the dataset, we realised that we needed a visual that could help the user understand the data in the simplest manner. Following feedback from discussions with the class and the professor, we decided on an interactive visualization with pictorial visuals. We used images of basic commodities for better understanding.
To use the tool, the user enters his home country and the country whose commodity prices he wants to compare. He is then shown a list of basic commodities for his country and a list of commodities for comparison in the foreign country. The user clicks on the images of commodities in both countries and sees visuals that present the comparison. The tool has two windows for comparison: one shows the image of the basic commodity replicated the equivalent number of times, and the second gives the equivalent numeric value of the commodity.
Financial data, in general, tends to be quite complex and difficult to understand. LifeGraf was conceptualized and developed to ease this perplexing view of financial data by providing a reference index for the common user. The tool has considerable potential to be used widely if properly extended and supported.
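The arithmetic behind such a commodity comparison can be sketched in Python as below. This is a guess at the kind of calculation involved, not LifeGraf's actual code; the function name, the parameters and the sample prices and exchange rate are all made up for illustration.

```python
def equivalent_units(home_price, foreign_price, home_to_foreign_rate):
    """How many foreign items cost the same as one home item.

    home_price is in the home currency, foreign_price is in the foreign
    currency, and home_to_foreign_rate converts home currency to foreign.
    """
    if foreign_price <= 0:
        raise ValueError("foreign_price must be positive")
    return home_price * home_to_foreign_rate / foreign_price


if __name__ == "__main__":
    # Hypothetical numbers: an item costing 2.0 at home, a comparable
    # item costing 25.0 abroad, and 1 home unit = 50 foreign units.
    units = equivalent_units(2.0, 25.0, 50.0)
    print(f"One home item buys about {units:.1f} foreign items")
```

The result could drive both comparison windows: rounded, it gives the number of times to repeat the commodity image, and unrounded it is the numeric equivalent.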
LifeGraf Visual:
About LifeGraf:
ScreenCast Access:
Dataset Access:
GitHub Access:

Word of Thanks...
We would like to thank Dr. Watson for providing us with proper guidance and feedback for the project. The presentation panel, consisting of Clayton Coleman, Bill Houghteling and Prof. Patrick Fitzgerald, also found our visualization very interesting and provided valuable feedback, which we have implemented.