Thursday, December 15, 2011

Reaction: TileBars: Visualization of Term Distribution Information in Full Text Information Access

In this article the author presents a tool designed to visualize, simultaneously, where a term appears along the length of a document, how frequent it is, and how it is distributed across different documents. The user is given a lot of freedom: they can specify both the number of terms to analyze and which Boolean connectors to use.
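To make the idea concrete for myself, here is a minimal sketch of the kind of grid a TileBar encodes: one row per term set, one cell per document segment, darker where the terms occur more often. It is not the paper's implementation. The function names `tilebar_rows` and `render`, the `segment_size` parameter, and the fixed-size word windows (used here in place of real topical segmentation) are all my own assumptions.

```python
import re

def tilebar_rows(text, term_sets, segment_size=50):
    """Return one row of per-segment hit counts for each term set."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    # Assumption: fixed-size word windows stand in for topical segments.
    segments = [words[i:i + segment_size]
                for i in range(0, len(words), segment_size)]
    rows = []
    for terms in term_sets:
        wanted = {t.lower() for t in terms}
        rows.append([sum(1 for w in seg if w in wanted) for seg in segments])
    return rows

def render(rows, shades=" .:#"):
    """Map each count to a character; higher counts get darker characters."""
    return ["".join(shades[min(count, len(shades) - 1)] for count in row)
            for row in rows]

# Hypothetical toy document split into three 2-word segments.
doc = "cat dog cat bird fish dog"
for line in render(tilebar_rows(doc, [["cat"], ["dog"]], segment_size=2)):
    print("[" + line + "]")
```

Each printed row plays the role of one bar: reading left to right follows the document from start to end, and stacking the rows shows where the term sets overlap.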

Today, PDF and (Microsoft) Word viewers find a term by stepping through its occurrences in order of appearance, while search engines calculate relevance based on historical data and analysis algorithms. It would be interesting to compare and contrast the TileBars visualization with the search built into text documents as well as with search engines. That way the user could take advantage of a customizable, simultaneous view of the chosen terms and prioritize the output according to more robust criteria.

TileBars seems a useful tool, even if only for visualization. However, the TextTiling algorithm does not seem very robust for exploring document structure.