Monday, July 9, 2012

Viz: Twitter data on soda or pop?

Aggregating Twitter data to answer the age-old question: soda or pop?


As more people move their conversations online and into public view on social networks, gathering information about regional dialects is becoming easier than ever. Twitter data scientist Edwin Chen collected and analyzed tweets from around the world to map one of the best-known markers of linguistic difference: what people call soft drinks. First, he sampled messages tagged with a location and filtered them for "soda," "pop," or "coke." Then he used the other words in the sentence to confirm that the term referred to a drink ("drink a pop," for example) and to filter out specific references to Coke as a brand. Finally, he aggregated the tweets by location and mapped them.
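For readers who want a feel for the filter-then-aggregate steps described above, here is a rough Python sketch. It is not Chen's actual code: the function names, the list of drinking verbs, and the sample tweets are all invented for illustration, and the context check is a deliberately crude stand-in for whatever he really used.

```python
import re
from collections import Counter, defaultdict

# Assumption: a term counts as a drink reference only when a drinking verb
# appears right before it, as in "drink a pop". The verb list is hypothetical.
DRINK_CONTEXT = re.compile(
    r"\b(?:drink|drinks|drinking|drank|sip|sipping)\s+"
    r"(?:(?:a|an|some|my|the)\s+)?(soda|pop|coke)\b",
    re.IGNORECASE,
)

def classify(text):
    """Return 'soda', 'pop', or 'coke' if the text uses it as a drink, else None."""
    match = DRINK_CONTEXT.search(text)
    return match.group(1).lower() if match else None

def aggregate(tweets):
    """Count drink terms per region from (region, text) pairs."""
    counts = defaultdict(Counter)
    for region, text in tweets:
        term = classify(text)
        if term is not None:
            counts[region][term] += 1
    return counts

def dominant_term(counts):
    """Pick the most frequent term for each region."""
    return {region: c.most_common(1)[0][0] for region, c in counts.items()}

# Invented sample data, not real tweets.
sample_tweets = [
    ("Ohio", "about to drink a pop"),
    ("Ohio", "pop music is great"),        # no drinking verb, so skipped
    ("Ohio", "drinking some pop"),
    ("Georgia", "I drank a coke with lunch"),
    ("California", "sipping my soda"),
    ("California", "Coke ad was funny"),   # brand mention, so skipped
]

counts = aggregate(sample_tweets)
winners = dominant_term(counts)
```

Requiring a nearby verb throws away a lot of valid tweets, but it keeps "pop music" and ad chatter out of the counts. A real version would also need to turn coordinates into regions before aggregating, which this sketch skips by assuming each tweet already carries a region label.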

The results are much the same as in similar, older work. In the US, "pop" is...