Conference proceeding
Biomedical ontology MeSH improves document clustering qualify on MEDLINE articles: A comparison study
19TH IEEE INTERNATIONAL SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS, PROCEEDINGS, p577
IEEE International Symposium on Computer-Based Medical Systems
01 Jan 2006
Abstract
Document clustering has been used to improve document retrieval, document browsing, and text mining. In this paper, we investigate whether the biomedical ontology MeSH improves clustering quality for MEDLINE articles. To this end, we perform a comprehensive comparison of document clustering approaches, including hierarchical clustering methods (single-link, average-link, and complete-link), Bisecting K-means, K-means, and Suffix Tree Clustering (STC), in terms of efficiency, effectiveness, and scalability. According to our experimental results, the MeSH ontology significantly enhances clustering quality on biomedical documents. In addition, our results show that the better-performing clustering approaches, such as Bisecting K-means, K-means, and STC, gain some benefit from the MeSH ontology, while the hierarchical algorithms, which show the poorest clustering quality, do not benefit from it.
Details
- Title
- Biomedical ontology MeSH improves document clustering qualify on MEDLINE articles: A comparison study
- Creators
- Illhoi Yoo - Drexel Univ, Coll Informat Sci & Technol, Philadelphia, PA 19104 USA
- Xiaohua Hu - Drexel University
- Contributors
- D J Lee (Editor)
- B Nutter (Editor)
- S Antani (Editor)
- S Mitra (Editor)
- J Archibald (Editor)
- Publication Details
- 19TH IEEE INTERNATIONAL SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS, PROCEEDINGS, p577
- Series
- IEEE International Symposium on Computer-Based Medical Systems
- Publisher
- IEEE
- Number of pages
- 2
- Grant note
- 240205; 240196 / PA Dept of Health Tobacco Settlement Formula NSF IIS 0448023; NSF CCF 0514679 / NSF; National Science Foundation (NSF)
- Resource Type
- Conference proceeding
- Language
- English
- Academic Unit
- Information Science (Informatics)
- Identifiers
- 991019170334204721
InCites Highlights
These are selected metrics from InCites Benchmarking & Analytics tool, related to this output
- Web of Science research areas
- Computer Science, Artificial Intelligence
- Computer Science, Information Systems
- Computer Science, Interdisciplinary Applications
- Engineering, Biomedical
- Engineering, Electrical & Electronic