MaxMatcher: Biological Concept Extraction Using Approximate Dictionary Lookup
Conference proceeding

Xiaohua Zhou, Xiaodan Zhang and Xiaohua Hu
PRICAI 2006: Trends in Artificial Intelligence, pp 1145-1149
2006

Abstract

Keywords: Approximate Match; Biological Concept; Biological Term; Boundary Word; Significance Score
Dictionary-based biological concept extraction is still the state-of-the-art approach to large-scale biomedical literature annotation and indexing. Exact dictionary lookup is very simple, but it achieves low extraction recall because a biological term often has many variants and no dictionary can collect all of them. We propose a generic extraction approach, referred to as approximate dictionary lookup, to cope with term variation, and implement it as an extraction system called MaxMatcher. The basic idea is to capture the significant words of a particular concept rather than all of its words. The new approach dramatically improves extraction recall while maintaining precision. In a comparative study on the GENIA corpus, the new approach reaches 57% recall, whereas exact dictionary lookup achieves only 26%.
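
The contrast between exact and approximate lookup can be illustrated with a minimal Python sketch. It is not the MaxMatcher implementation: the abstract does not define the significance score, so the weighting used here (a word counts for less the more concepts it appears in), the 0.7 threshold, the bag-of-words matching over the whole input, the toy dictionary, and the function names are all assumptions made for illustration.

```python
from collections import Counter

# Toy dictionary (hypothetical entries): concept ID -> known term variants.
DICTIONARY = {
    "C1": ["nuclear factor kappa b"],
    "C2": ["transcription factor ap 1"],
    "C3": ["nuclear receptor"],
}

def word_weights(dictionary):
    """Assumed significance score: 1 / (number of concepts containing the word)."""
    counts = Counter()
    for terms in dictionary.values():
        counts.update({w for term in terms for w in term.split()})
    return {w: 1.0 / n for w, n in counts.items()}

def exact_lookup(text, dictionary):
    """Return concepts with a variant that occurs verbatim in the text."""
    padded = " " + " ".join(text.lower().split()) + " "
    return {cid for cid, terms in dictionary.items()
            if any(" " + term + " " in padded for term in terms)}

def approximate_lookup(text, dictionary, threshold=0.7):
    """Return concepts whose significant words are mostly present in the text.

    Simplification: word order and term boundaries are ignored.
    """
    weights = word_weights(dictionary)
    tokens = set(text.lower().split())
    found = set()
    for cid, terms in dictionary.items():
        for term in terms:
            words = set(term.split())
            total = sum(weights[w] for w in words)
            covered = sum(weights[w] for w in words if w in tokens)
            if covered / total >= threshold:
                found.add(cid)
                break
    return found

text = "binding of kappa b factor to dna"
print(exact_lookup(text, DICTIONARY))        # no variant occurs verbatim
print(approximate_lookup(text, DICTIONARY))  # "kappa" and "b" carry most of C1's weight
```

In this toy setup the variant "kappa b factor" is missed by exact lookup but recovered by the approximate version, because the missing word "nuclear" is shared with another concept and so carries little weight. A real system would also need to locate term boundaries in running text, which this sketch does not attempt.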
