Friday, May 18, 2012

Spotted: How accurate are models of text corpora? -- Interpretation and trust: designing model-driven visualizations for text analysis

Interpretation and trust: designing model-driven visualizations for text analysis

Jason Chuang, Daniel Ramage, Christopher Manning, Jeffrey Heer

Statistical topic models can help analysts discover patterns in large text corpora by identifying recurring sets of words and enabling exploration by topical concepts. However, understanding and validating the output of these models can itself be a challenging analysis task. In this paper, we offer two design considerations, interpretation and trust, for designing visualizations based on data-driven models. Interpretation refers to the facility with which an analyst makes inferences about the data through the lens of a model abstraction. Trust refers to the actual and perceived accuracy of an analyst's inferences. These considerations derive from our experiences developing the Stanford Dissertation Browser, a tool for exploring over 9,000 Ph.D. theses.
