Monday, August 29, 2011

Data: Netflix Prize Interactive Dataset

In October 2006, Netflix announced the Netflix Prize, a competition for the best collaborative filtering algorithm to predict user ratings for films based on the users' previous ratings. Netflix already had its own algorithm, called Cinematch.

Netflix provided a training data set of more than 100 million ratings that about 500K users gave to 17,770 movies, and prizes were awarded based on how a contestant's algorithm compared with Cinematch.

The visualization below shows all 17,770 movies that were part of the Netflix Prize competition. The movies are laid out so that similar movies are closer to one another. Similarity between movies is calculated from the likes and dislikes of multiple users.
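
As a rough illustration of how likes and dislikes can be turned into a movie-to-movie similarity score, here is a minimal Python sketch. The post does not say which measure the visualization actually uses, so cosine similarity is an assumption here, and the movie names, user IDs, and the +1/-1 encoding are all hypothetical toy data. The layout step that places movies on the screen is not shown.

```python
import math

# Hypothetical toy data: +1 = like, -1 = dislike; a missing user means "not rated".
ratings = {
    "Movie A": {"u1": 1, "u2": 1, "u3": -1},
    "Movie B": {"u1": 1, "u2": 1, "u3": -1, "u4": 1},
    "Movie C": {"u1": -1, "u2": -1, "u3": 1},
}


def cosine_similarity(a, b):
    """Cosine similarity between two sparse rating vectors (dicts of user -> rating)."""
    # Only users who rated both movies contribute to the dot product.
    shared = set(a) & set(b)
    dot = sum(a[u] * b[u] for u in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a movie with no ratings is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Movies A and B are liked and disliked by the same users; A and C by opposite ones.
sim_ab = cosine_similarity(ratings["Movie A"], ratings["Movie B"])
sim_ac = cosine_similarity(ratings["Movie A"], ratings["Movie C"])
```

In this toy example, `sim_ab` comes out high because the same users agree on both movies, while `sim_ac` is negative because users who like one dislike the other. A layout like the one described above would then place A near B and far from C.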

To activate the visualization below, click the Show Visualization button, then hover your mouse pointer over the points to see the movies.