Conference proceeding
Text Retrieval based on Least Information Measurement
ICTIR'17: PROCEEDINGS OF THE 2017 ACM SIGIR INTERNATIONAL CONFERENCE THEORY OF INFORMATION RETRIEVAL, pp.125-132
01 Jan 2017
Featured in Collection: UN Sustainable Development Goals @ Drexel
Abstract
We developed a new information retrieval framework based on the Least Information (LI) metric. We derived multiple term weighting schemes and combined them with a vector space representation for ad hoc retrieval. Given probability distributions in a collection as prior knowledge, LI Binary (LIB) quantifies the least information due to the binary occurrence of a term in a document, whereas LI Frequency (LIF) measures the least information based on the probability of drawing a term from a bag of words. Experiments on four benchmark TREC collections for ad hoc retrieval showed that methods based on least information theory (LIT) outperformed classic TF*IDF and BM25, especially for verbose queries and hard search topics. LIT is a method for entropy-based information measurement and offers a novel approach to IR modeling.
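The abstract describes a pipeline in which a term weighting scheme feeds a vector space representation for ad hoc retrieval. This record does not give the LIB or LIF formulas, so the following is only a minimal sketch of that general pipeline using the classic TF*IDF baseline named in the abstract as a stand-in. The function names (`tfidf_weights`, `cosine`, `rank`), the whitespace tokenizer, and the toy documents are illustrative assumptions, not the paper's implementation.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return ({term: weight} per document, idf table) using classic TF*IDF.

    Stand-in weighting only: the LI-based weights from the paper are not
    reproduced here because this record does not state their formulas.
    """
    n_docs = len(docs)
    tokenized = [doc.lower().split() for doc in docs]  # naive tokenizer (assumption)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return document indices ordered by decreasing similarity to the query."""
    doc_vectors, idf = tfidf_weights(docs)
    query_tf = Counter(query.lower().split())
    # Terms unseen in the collection get zero weight.
    query_vector = {t: tf * idf.get(t, 0.0) for t, tf in query_tf.items()}
    scores = [(cosine(query_vector, d), i) for i, d in enumerate(doc_vectors)]
    # Sort by score descending, breaking ties by document index.
    return [i for _, i in sorted(scores, key=lambda s: (-s[0], s[1]))]


# Toy collection, purely illustrative.
docs = [
    "information retrieval with term weighting",
    "cooking pasta with tomato sauce",
    "entropy based information measurement",
]
ranking = rank("term weighting retrieval", docs)
```

In a design like this, swapping in a different weighting scheme would only require replacing `tfidf_weights` with a function that returns the same structure; the similarity and ranking steps stay unchanged.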
Details
- Title
- Text Retrieval based on Least Information Measurement
- Creators
- Weimao Ke - Drexel University
- Assoc Comp Machinery
- Publication Details
- ICTIR'17: PROCEEDINGS OF THE 2017 ACM SIGIR INTERNATIONAL CONFERENCE THEORY OF INFORMATION RETRIEVAL, pp.125-132
- Conference
- ICTIR'17: 2017 ACM SIGIR INTERNATIONAL CONFERENCE THEORY OF INFORMATION RETRIEVAL
- Publisher
- Assoc Computing Machinery
- Number of pages
- 8
- Grant note
- 1646955 / National Science Foundation; National Science Foundation (NSF)
- Resource Type
- Conference proceeding
- Language
- English
- Academic Unit
- Information Science (Informatics)
- Identifiers
- 991019167529204721
InCites Highlights
These are selected metrics from the InCites Benchmarking & Analytics tool related to this output:
- Web of Science research areas
- Computer Science, Information Systems
- Computer Science, Theory & Methods