A mixture language model for class-attribute mining from biomedical literature digital library

Xiaohua Zhou; Xiaohua Hu; Xiaodan Zhang; Daniel D. Wu

doi:10.1109/BIBMW.2007.4425416

Conference proceeding

2007 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE WORKSHOPS, PROCEEDINGS, pp 174-182

01 Jan 2007

DOI: https://doi.org/10.1109/BIBMW.2007.4425416

Subject categories: Computer Science, Information Systems; Engineering, Electrical & Electronic; Science & Technology; Computer Science; Engineering; Technology

Abstract

We define and study a novel text mining problem for biomedical literature digital libraries, referred to as class-attribute mining. Given a collection of biomedical literature from a digital library addressing a set of objects (e.g., proteins) and their descriptions (e.g., protein functions), the tasks of class-attribute mining are: (1) to identify and summarize latent classes in the space of objects, (2) to discover latent attribute themes in the space of object descriptions, and (3) to summarize the commonalities and differences among the identified classes along each attribute theme. We approach this mining problem through a mixture language model and estimate the parameters of the model using the EM algorithm. We demonstrate the effectiveness of the model with an application called protein community identification and annotation from Medline, the largest biomedical literature digital library, with more than 16 million abstracts.
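
As a rough illustration of the estimation approach mentioned in the abstract, the Python sketch below fits a generic mixture of unigram language models with EM. It is not the authors' class-attribute model, whose details are not given in this record. The function name, the smoothing constant, and the toy documents are all assumptions made for illustration.

```python
import math
import random


def em_mixture_of_unigrams(docs, k, iters=50, seed=0, smooth=1e-2):
    """Fit a k-component mixture of unigram language models with EM.

    docs is a list of token lists. Returns (priors, word_probs, resp),
    where resp[d][c] is the posterior probability that document d was
    generated by component c. Generic textbook model, not the paper's.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})

    # Word counts per document.
    counts = []
    for doc in docs:
        cnt = {}
        for w in doc:
            cnt[w] = cnt.get(w, 0) + 1
        counts.append(cnt)

    # Random soft initialisation of the responsibilities.
    resp = []
    for _ in docs:
        row = [rng.random() + 1e-3 for _ in range(k)]
        total = sum(row)
        resp.append([r / total for r in row])

    priors = [1.0 / k] * k
    word_probs = [{} for _ in range(k)]

    for _ in range(iters):
        # M-step: re-estimate component priors and word distributions
        # (additive smoothing keeps every probability strictly positive).
        for c in range(k):
            mass = sum(r[c] for r in resp)
            priors[c] = (mass + smooth) / (len(docs) + k * smooth)
            totals = {w: smooth for w in vocab}
            for d, cnt in enumerate(counts):
                for w, n in cnt.items():
                    totals[w] += resp[d][c] * n
            z = sum(totals.values())
            word_probs[c] = {w: t / z for w, t in totals.items()}

        # E-step: posterior over components, computed in log space.
        for d, cnt in enumerate(counts):
            logp = [
                math.log(priors[c])
                + sum(n * math.log(word_probs[c][w]) for w, n in cnt.items())
                for c in range(k)
            ]
            m = max(logp)
            unnorm = [math.exp(lp - m) for lp in logp]
            s = sum(unnorm)
            resp[d] = [u / s for u in unnorm]

    return priors, word_probs, resp


# Hypothetical toy corpus: two documents about signalling, two about transport.
docs = [
    ["kinase", "phosphorylation", "signal", "kinase"],
    ["signal", "kinase", "receptor"],
    ["membrane", "transport", "channel", "membrane"],
    ["channel", "transport", "ion"],
]
priors, word_probs, resp = em_mixture_of_unigrams(docs, k=2)
for doc, r in zip(docs, resp):
    print(doc, [round(p, 3) for p in r])
```

Each component's highest-probability words can be read as a crude summary of a latent class, which is the general idea behind using a mixture language model for this kind of mining.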

Details

Title: A mixture language model for class-attribute mining from biomedical literature digital library
Creators:
  Xiaohua Zhou - Drexel University, College of Information Science and Technology, Data Mining & Bioinformatics Lab, Philadelphia, PA 19104, USA
  Xiaohua Hu - Drexel University, Information Science
  Xiaodan Zhang - Drexel University, College of Information Science and Technology, Data Mining & Bioinformatics Lab, Philadelphia, PA 19104, USA
  Daniel D. Wu - Drexel University, College of Information Science and Technology, Data Mining & Bioinformatics Lab, Philadelphia, PA 19104, USA
Publication Details: 2007 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE WORKSHOPS, PROCEEDINGS, pp 174-182
Series: IEEE International Conference on Bioinformatics and Biomedicine Workshop-BIBMW
Publisher: IEEE
Number of pages: 9
Resource Type: Conference proceeding
Language: English
Academic Unit: Information Science
Web of Science ID: WOS:000253368800024
Scopus ID: 2-s2.0-44949192814
Other Identifier: 991019173588804721

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Web of Science research areas: Computer Science, Information Systems; Engineering, Electrical & Electronic
