Book chapter
Relation-Based Document Retrieval for Biomedical IR
Transactions on Computational Systems Biology V, pp 112-128
2006
Featured in Collection: UN Sustainable Development Goals @ Drexel
Abstract
In this paper, we explore the use of term relations in information retrieval for precision-focused biomedical literature search. A relation is defined as a pair of terms that are semantically and syntactically related to each other. Unlike the traditional “bag-of-words” model for documents, our model represents a document by a set of sense-disambiguated terms and their binary relations. Since document-level co-occurrence of two terms does not, in many cases, mean that the document addresses their relationship, the direct use of relations may improve the precision of very specific searches, e.g. searching for documents that mention genes regulated by Smad4. For this purpose, we develop a generic ontology-based approach to extract terms and their relations, and present a betweenness-centrality-based approach to rank retrieved documents. A prototype IR system supporting relation-based search is then built for Medline abstract search. We use this novel IR system to improve the retrieval results of all official runs in the TREC-2004 Genomics Track. The experiment shows promising performance of relation-based IR. The average P@100 (the precision of the top 100 documents) over 50 topics is significantly raised from 26.37% (the P@100 of the best run is 42.10%) to 53.69%, while the MAP (mean average precision) is kept at an above-average level of 26.59%. The experiment also shows the expressiveness of relations for the representation of information needs, especially in biomedical literature, which is full of various biological relations.
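The two evaluation measures named in the abstract, P@100 and MAP, can be illustrated with a minimal sketch based on their standard textbook definitions. This is not the authors' evaluation code; the function names and the document identifiers (`d1`, `d2`, ...) are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Sequence, Set, Tuple


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k ranked documents that are relevant.

    Divides by k even when fewer than k documents were retrieved,
    so missing results count against the score.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / k


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Sum of the precision at each rank holding a relevant document,
    divided by the total number of relevant documents for the topic."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(
    topics: Iterable[Tuple[Sequence[str], Set[str]]]
) -> float:
    """Mean of the per-topic average precision values."""
    scores = [average_precision(ranked, relevant) for ranked, relevant in topics]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy topic: four retrieved documents, three relevant ones,
# one of which (d5) was never retrieved.
ranked_example = ["d1", "d2", "d3", "d4"]
relevant_example = {"d1", "d3", "d5"}
p_at_2 = precision_at_k(ranked_example, relevant_example, 2)
ap = average_precision(ranked_example, relevant_example)
```

In this toy topic one of the top two documents is relevant, so P@2 is 0.5. A precision-focused system such as the one described here aims to raise P@k at the top of the ranking, while MAP also rewards finding relevant documents further down the list.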
Details
- Title
- Relation-Based Document Retrieval for Biomedical IR
- Creators
- Xiaohua Zhou - Drexel University; Xiaohua Hu - Drexel University; Guangren Li - Hunan University; Xia Lin - Drexel University; Xiaodan Zhang - Drexel University
- Publication Details
- Transactions on Computational Systems Biology V, pp 112-128
- Series
- Lecture Notes in Computer Science
- Publisher
- Springer Berlin Heidelberg; Berlin, Heidelberg
- Resource Type
- Book chapter
- Language
- English
- Academic Unit
- Information Science
- Web of Science ID
- WOS:000240081600009
- Scopus ID
- 2-s2.0-38549090589
- Other Identifier
- 991019170550204721
InCites Highlights
Data related to this publication, from InCites Benchmarking & Analytics tool:
- Collaboration types
- Domestic collaboration
- International collaboration
- Web of Science research areas
- Biochemical Research Methods
- Computer Science, Interdisciplinary Applications
- Computer Science, Theory & Methods