Tuesday, November 29, 2011

Reaction: TileBars: Visualization of Term Distribution Information in Full Text Information Access

Traditional information retrieval systems work well for titles and abstracts, which are compact and focused on the main topics. Full-text documents, however, can be organized in many different ways, so the traditional approaches may return irrelevant or incomplete results. The author therefore presents TileBars, a visualization technique that displays the relative frequency and distribution of the query terms in visual form.

In the discussion, the author describes ranking techniques and makes an important point: ranking should focus on providing the user with informative, compact results that are easy to comprehend. TileBars gives the user a visual representation in which the length of each bar indicates the relative length of the document. Shading within the bar then indicates the frequency and distribution of the search terms. The author stresses the importance of identifying the structure of full-text documents, which may contain subtopics or dropout topics along with the main topics. The figures in the paper could have been organized better, especially for the examples, where the reader has to scroll up and down too often.
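The length-and-shading idea described above can be sketched in a few lines of Python. This is only a rough illustration, not the paper's implementation: the function name `tilebar`, the fixed-size word segments, and the four-character shading scale are all my own assumptions, chosen to keep the example short.

```python
import re

# Hypothetical shading scale: "." = term absent, "#" = three or more hits.
SHADES = ".-+#"

def tilebar(text, terms, segment_size=50):
    """Return one row of shading characters per query term.

    Each character stands for one segment of the document, so the row
    length reflects the document length, and a darker character means
    the term occurs more often in that segment.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    # Assumption: fixed-size word windows stand in for real text segments.
    segments = [words[i:i + segment_size]
                for i in range(0, len(words), segment_size)]
    rows = {}
    for term in terms:
        key = term.lower()
        # Clamp the count so it always maps to a valid shade.
        row = [SHADES[min(seg.count(key), len(SHADES) - 1)]
               for seg in segments]
        rows[term] = "".join(row)
    return rows

if __name__ == "__main__":
    doc = "alpha beta alpha gamma beta beta beta beta"
    for term, row in tilebar(doc, ["alpha", "beta"], segment_size=4).items():
        print(f"{term:>6} |{row}|")
```

Reading the rows side by side shows at a glance whether the query terms overlap in the same part of a document or only appear far apart.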

This technique could be very useful for filtering a large body of research literature down to the relevant papers. Other techniques, such as ranking, can then be applied to the filtered list. However, the effectiveness of the technique needs to be studied with in-depth testing. Overall, this paper was a good read for anyone whose interests lie in information analysis or data mining.