Conference proceeding
Clustering ontology-enriched graph representation for biomedical documents based on scale-free network theory
2006 3RD INTERNATIONAL IEEE CONFERENCE INTELLIGENT SYSTEMS, VOLS 1 AND 2, pp.834-841
01 Jan 2006
Abstract
In this paper we introduce a novel document clustering approach that solves some major problems of traditional document clustering approaches. Instead of depending on the traditional vector space model, this approach represents documents as graphs using domain knowledge in an ontology, because graphs can represent the semantic relationships among the concepts in documents. Based on scale-free network theory, our approach generates a model for each document cluster from the ontology-enriched graph representation by identifying k high-density subgraphs that capture the core semantic relationship information about each document cluster. Using these k high-density subgraphs, each document is assigned to a proper document cluster. Our extensive experimental results on MEDLINE articles show that our approach outperforms two leading document clustering algorithms, BiSecting K-means and CLUTO's vcluster. Moreover, our approach provides a meaningful explanation for document clustering through the generated models. This explanation helps users to understand clustering results and documents as a whole.
Details
- Title
- Clustering ontology-enriched graph representation for biomedical documents based on scale-free network theory
- Creators
- Illhoi Yoo - Drexel Univ, Coll Informat Sci & Technol, Philadelphia, PA 19104 USA; Xiaohua Hu - Drexel University; IEEE
- Publication Details
- 2006 3RD INTERNATIONAL IEEE CONFERENCE INTELLIGENT SYSTEMS, VOLS 1 AND 2, pp.834-841
- Conference
- 2006 3RD INTERNATIONAL IEEE CONFERENCE INTELLIGENT SYSTEMS, 3rd
- Publisher
- IEEE
- Number of pages
- 8
- Grant note
- 240205; 240196 / PA Dept of Health Tobacco Settlement Formula 0514679 / NSF CCF NSF IIS 0448023 / NSF Career; National Science Foundation (NSF); NSF - Office of the Director (OD)
- Resource Type
- Conference proceeding
- Language
- English
- Academic Unit
- Information Science (Informatics)
- Identifiers
- 991019170475504721
InCites Highlights
These are selected metrics from the InCites Benchmarking & Analytics tool related to this output
- Web of Science research areas
- Computer Science, Artificial Intelligence