Book chapter
Clustering Large Collection of Biomedical Literature Based on Ontology-Enriched Bipartite Graph Representation and Mutual Refinement Strategy
Advances in Knowledge Discovery and Data Mining, pp 303-312
2006
Abstract
In this paper we introduce a novel document clustering approach that solves some major problems of traditional document clustering approaches. Instead of depending on traditional vector space model, this approach represents a set of documents as bipartite graphs using domain knowledge in ontology. In this representation, the concepts of the documents are classified according to their relationships with documents that are reflected on the bipartite graph. Using the concept groups, documents are clustered based on the concepts’ contribution to each document. Through the mutual-refinement relationship with concept groups and document groups, the two groups are recursively refined. Our experimental results on MEDLINE articles show that our approach outperforms two leading document clustering algorithms: BiSecting K-means and CLUTO. In addition to its decent performance, our approach provides a meaningful explanation for each document cluster by identifying its most contributing concepts, thus helps users to understand and interpret documents and clustering results.
Metrics
Details
- Title
- Clustering Large Collection of Biomedical Literature Based on Ontology-Enriched Bipartite Graph Representation and Mutual Refinement Strategy
- Creators
- Illhoi Yoo - Drexel UniversityXiaohua Hu - Drexel University
- Publication Details
- Advances in Knowledge Discovery and Data Mining, pp 303-312
- Series
- Lecture Notes in Computer Science
- Publisher
- Springer Berlin Heidelberg; Berlin, Heidelberg
- Resource Type
- Book chapter
- Language
- English
- Academic Unit
- Information Science
- Web of Science ID
- WOS:000237249600036
- Scopus ID
- 2-s2.0-33745868352
- Other Identifier
- 991019170552304721
InCites Highlights
Data related to this publication, from InCites Benchmarking & Analytics tool:
- Web of Science research areas
- Computer Science, Artificial Intelligence
- Computer Science, Information Systems