Conference proceeding
MaxMatcher: Biological Concept Extraction Using Approximate Dictionary Lookup
PRICAI 2006: Trends in Artificial Intelligence, pp 1145-1149
2006
Abstract
Dictionary-based biological concept extraction is still the state-of-the-art approach to large-scale biomedical literature annotation and indexing. Exact dictionary lookup is very simple, but it achieves low extraction recall because a biological term often has many variants and no dictionary can collect all of them. We propose a generic extraction approach, referred to as approximate dictionary lookup, to cope with term variation, and implement it as an extraction system called MaxMatcher. The basic idea of this approach is to match the significant words of a particular concept rather than all of its words. The new approach dramatically improves extraction recall while maintaining precision. In a comparative study on the GENIA corpus, the new approach reaches 57% recall, while exact dictionary lookup achieves only 26%.
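The contrast between exact and approximate lookup can be illustrated with a minimal sketch. This is not the MaxMatcher implementation: the toy dictionary, the concept IDs, the weighting scheme (a word counts for less when it appears in the names of more concepts), the 0.8 threshold, and all function names are assumptions made for illustration. The sketch scores a single candidate phrase and does not scan running text or penalize extra words in the phrase.

```python
from collections import defaultdict

# Hypothetical toy dictionary: concept ID -> known term variants.
DICTIONARY = {
    "C1": ["nuclear factor kappa B", "NF kappa B"],
    "C2": ["interleukin 2 receptor alpha chain"],
    "C3": ["nuclear receptor"],
}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_word_weights(dictionary):
    """Assumed weighting: 1 / (number of concepts whose names contain the word)."""
    concepts_per_word = defaultdict(set)
    for cid, names in dictionary.items():
        for name in names:
            for word in tokenize(name):
                concepts_per_word[word].add(cid)
    return {word: 1.0 / len(cids) for word, cids in concepts_per_word.items()}


def exact_lookup(phrase, dictionary):
    """Return concepts that have a variant identical to the phrase, token by token."""
    key = tokenize(phrase)
    return [cid for cid, names in dictionary.items()
            if any(tokenize(name) == key for name in names)]


def approximate_lookup(phrase, dictionary, weights, threshold=0.8):
    """Return concepts for which the phrase covers enough of a variant's word weight."""
    phrase_words = set(tokenize(phrase))
    hits = {}
    for cid, names in dictionary.items():
        for name in names:
            name_words = set(tokenize(name))
            total = sum(weights[w] for w in name_words)
            covered = sum(weights[w] for w in name_words if w in phrase_words)
            score = covered / total
            if score >= threshold:
                hits[cid] = max(hits.get(cid, 0.0), score)
    return hits


weights = build_word_weights(DICTIONARY)
print(exact_lookup("factor kappa B", DICTIONARY))                 # no exact variant
print(approximate_lookup("factor kappa B", DICTIONARY, weights))  # matches C1
```

In this toy setting the variant "factor kappa B" is missed by exact lookup but still maps to C1, because the dropped word "nuclear" is shared with another concept and so carries little weight. The single word "nuclear" on its own matches nothing, since it does not cover enough of any variant.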
Details
- Title
- MaxMatcher: Biological Concept Extraction Using Approximate Dictionary Lookup
- Creators
- Xiaohua Zhou - Drexel University; Xiaodan Zhang - Drexel University; Xiaohua Hu - Drexel University
- Publication Details
- PRICAI 2006: Trends in Artificial Intelligence, pp 1145-1149
- Series
- Lecture Notes in Computer Science
- Publisher
- Springer Berlin Heidelberg; Berlin, Heidelberg
- Resource Type
- Conference proceeding
- Language
- English
- Academic Unit
- Information Science
- Other Identifier
- 991019170317804721