Journal article
Topic signature language models for ad hoc retrieval
Drexel University. College of Engineering. Department of Mechanical Engineering and Mechanics. Faculty Research and Publications.
15 Jan 2008
Abstract
Semantic smoothing, which incorporates synonym and sense information into language models, is an effective and potentially significant way to improve retrieval performance. Previously implemented semantic smoothing models, such as the translation model, have shown good experimental results. However, these models are unable to incorporate contextual information. To overcome this limitation, we propose a novel context-sensitive semantic smoothing method that decomposes a document into a set of weighted context-sensitive topic signatures and then maps those topic signatures into query terms. The language model with such context-sensitive semantic smoothing is referred to as the topic signature language model. In detail, we implement two types of topic signatures, depending on whether an ontology exists in the application domain. One is the ontology-based concept and the other is the multiword phrase. The mapping probabilities from each topic signature to individual terms are estimated through the EM algorithm. Document models based on topic signature mapping are then derived. The new smoothing method is evaluated on the TREC 2004/2005 Genomics Track with ontology-based concepts, as well as on the TREC Ad Hoc Track (Disks 1, 2, and 3) with multiword phrases. Both experiments show significant improvements over the two-stage language model, as well as over the language model with context-insensitive semantic smoothing.
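The mapping step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy probabilities, and the linear interpolation with a baseline unigram model (controlled by a hypothetical coefficient `lam`) are assumptions made here for illustration. The sketch assumes the mapping probabilities p(w|t) have already been estimated (the abstract says this is done with EM) and that each document carries signature weights p(t|d).

```python
from collections import defaultdict


def topic_signature_model(signature_weights, mapping_probs):
    """Map weighted topic signatures to terms: p_t(w|d) = sum_k p(w|t_k) * p(t_k|d)."""
    model = defaultdict(float)
    for signature, weight in signature_weights.items():
        for term, prob in mapping_probs.get(signature, {}).items():
            model[term] += prob * weight
    return dict(model)


def interpolate(baseline, translated, lam=0.5):
    """Mix a baseline unigram model with the mapped model.

    `lam` is a hypothetical mixing coefficient, not a value taken from the record.
    """
    terms = set(baseline) | set(translated)
    return {
        w: (1.0 - lam) * baseline.get(w, 0.0) + lam * translated.get(w, 0.0)
        for w in terms
    }


# Toy values, invented for illustration only.
signature_weights = {"space program": 0.75, "star wars": 0.25}   # p(t|d)
mapping_probs = {                                                # p(w|t)
    "space program": {"space": 0.5, "nasa": 0.5},
    "star wars": {"space": 0.25, "film": 0.75},
}
baseline = {"space": 0.5, "program": 0.5}                        # unigram p(w|d)

translated = topic_signature_model(signature_weights, mapping_probs)
smoothed = interpolate(baseline, translated, lam=0.5)
```

In this toy case the term "nasa" never occurs in the baseline document model, yet it receives nonzero probability after smoothing because the phrase "space program" maps to it. That is the kind of context-sensitive effect the abstract attributes to topic signatures.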
Details
- Title
- Topic signature language models for ad hoc retrieval
- Creators
- Xiaohua Zhou (Author) - Drexel University (1970-); Xiaohua Hu 1960- (Author) - Drexel University (1970-); Xiaodan Zhang (Author) - Drexel University (1970-)
- Publication Details
- Drexel University. College of Engineering. Department of Mechanical Engineering and Mechanics. Faculty Research and Publications.
- Publisher
- The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
- Resource Type
- Journal article
- Language
- English
- Academic Unit
- Mechanical Engineering and Mechanics; College of Engineering; Drexel University (1970-)
- Web of Science ID
- WOS:000248117600010
- Scopus ID
- 2-s2.0-34547970247
- Other Identifier
- 991014632721304721
InCites Highlights
Data related to this publication, from the InCites Benchmarking & Analytics tool:
- Web of Science research areas
- Computer Science, Artificial Intelligence
- Computer Science, Information Systems
- Engineering, Electrical & Electronic