Content-based text mapping using multi-dimensional projections for exploration of document collections

Research output: Chapter in Book/Report/Conference proceedings > Conference proceeding > peer-review

Abstract

This paper presents a technique for generating maps of documents that places similar documents in the same neighborhood. Besides grouping (and separating) documents by their content, it runs at a manageable computational cost. Based on multi-dimensional projection techniques and an algorithm for projection improvement, it produces a surface map that lets the user identify important relationships between documents and sub-groups of documents through visualization and interaction. Visual attributes such as height, color, isolines and glyphs, as well as aural attributes such as pitch, add dimensions for integrated visual analysis. A set of tools supports exploration and narrowing of focus. This novel text mapping technique, named IDMAP (Interactive Document Map), is fully described in this paper. Results are compared with dimensionality reduction and clustering techniques used for the same purpose. The maps are expected to support a large number of applications that rely on retrieval and examination of document collections, and to complement the information offered by current knowledge domain visualizations.

Original language: English
Title of host publication: Visualization and Data Analysis 2006 - Proceedings of SPIE-IS and T Electronic Imaging
Publication status: Published - 2006
Externally published: Yes
Event: Visualization and Data Analysis 2006 - San Jose, CA, United States
Duration: 16 Jan 2006 to 17 Jan 2006

Publication series

Name: Proceedings of SPIE - The International Society for Optical Engineering
Volume: 6060
ISSN (Print): 0277-786X

Conference

Conference: Visualization and Data Analysis 2006
Country/Territory: United States
City: San Jose, CA
Period: 16/01/06 to 17/01/06

Keywords

  • Document mapping
  • Domain knowledge visualization
  • IDMAP
  • Multi-dimensional projection
  • Text visualization
