Conference proceeding
Utilization of global ranking information in graph-based biomedical literature clustering
DATA WAREHOUSING AND KNOWLEDGE DISCOVERY, PROCEEDINGS, Vol.4654
Lecture Notes in Computer Science
01 Jan 2007
Abstract
In this paper, we explore how a global ranking method, in conjunction with a local density method, helps identify meaningful term clusters from an ontology-enriched graph representation of a biomedical literature corpus. A major problem in document clustering is how to discount the effects of class-unspecific general terms and strengthen the effects of class-specific core terms. We claim that a well-constructed term graph can help improve the global ranking of class-specific core terms. We first apply PageRank and HITS to a directed abstract-title term graph to target class-specific core terms. Then k dense term clusters (graphs) are identified from these terms. Finally, each document is assigned to its closest core term graph. A series of experiments is conducted on a document corpus collected from PubMed. Experimental results show that our approach is very effective at identifying class-specific core terms and thus helps document clustering.
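The global ranking step mentioned in the abstract can be illustrated with a minimal sketch of PageRank by power iteration on a toy directed term graph. This is not the authors' implementation. The edge direction (abstract term pointing to title term), the toy corpus, the damping factor of 0.85, and the helper names `build_term_graph` and `pagerank` are all assumptions made for illustration. HITS, ontology enrichment, dense-cluster detection, and document assignment are not covered.

```python
from collections import defaultdict


def build_term_graph(docs):
    """Build a directed term graph from (abstract_terms, title_terms) pairs.

    Assumption for illustration: each abstract term points to each title
    term of the same document. The abstract does not specify the direction.
    """
    nodes = set()
    edges = defaultdict(set)
    for abstract_terms, title_terms in docs:
        nodes.update(abstract_terms)
        nodes.update(title_terms)
        for a in abstract_terms:
            for t in title_terms:
                if a != t:
                    edges[a].add(t)
    return nodes, edges


def pagerank(nodes, edges, damping=0.85, iterations=50):
    """Plain power-iteration PageRank; dangling mass is spread uniformly."""
    nodes = sorted(nodes)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is redistributed to all nodes.
        dangling = sum(rank[v] for v in nodes if not edges.get(v))
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            targets = edges.get(v)
            if targets:
                share = damping * rank[v] / len(targets)
                for t in targets:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical toy corpus: (abstract terms, title terms) per document.
docs = [
    ({"insulin", "glucose", "patient"}, {"diabetes"}),
    ({"glucose", "study", "patient"}, {"diabetes"}),
    ({"tumor", "study", "patient"}, {"cancer"}),
]

nodes, edges = build_term_graph(docs)
ranks = pagerank(nodes, edges)
for term, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{term:10s} {score:.4f}")
```

In this toy graph, the general terms "patient" and "study" spread their rank across both title terms, so they rank below the class-specific terms "diabetes" and "cancer", which loosely mirrors the motivation stated in the abstract.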
Details
- Title
- Utilization of global ranking information in graph-based biomedical literature clustering
- Creators
- Xiaodan Zhang - Drexel Univ, Coll Informat Sci & Technol, 3141 Chestnut St, Philadelphia, PA 19104 USA
- Xiaohua Hu - Drexel University
- Jiali Xia - Jiangxi Univ Finance & Econ, UFSoft Sch Software, Nanchang, Jiangxi, Peoples R China
- Xiaohua Zhou - Drexel Univ, Coll Informat Sci & Technol, 3141 Chestnut St, Philadelphia, PA 19104 USA
- Palakorn Achananuparp - Drexel Univ, Coll Informat Sci & Technol, 3141 Chestnut St, Philadelphia, PA 19104 USA
- Contributors
- I Y Song (Editor)
- T M Nguyen (Editor)
- Publication Details
- DATA WAREHOUSING AND KNOWLEDGE DISCOVERY, PROCEEDINGS, Vol.4654
- Series
- Lecture Notes in Computer Science
- Publisher
- Springer Nature
- Number of pages
- 2
- Grant note
- 240205; 240196 / PA Dept of Health Tobacco Settlement Formula IIS 0448023; CCF 0514679 / NSF; National Science Foundation (NSF) 239667 / PA Dept of Health
- Resource Type
- Conference proceeding
- Language
- English
- Academic Unit
- Information Science (Informatics)
- Identifiers
- 991019170151304721
InCites Highlights
These are selected metrics from the InCites Benchmarking & Analytics tool, related to this output
- Collaboration types
- Domestic collaboration
- International collaboration
- Web of Science research areas
- Computer Science, Artificial Intelligence
- Computer Science, Information Systems
- Computer Science, Theory & Methods