Conference proceeding
Web Clustering based on the Information of Sibling Pages
2008 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING, VOLS 1 AND 2, pp 480-485
01 Jan 2008
Abstract
This paper investigates the value of information from sibling pages for web page clustering. We use a link-based clustering algorithm to examine the usefulness of sibling links for improving clustering quality. The algorithm is extended with two types of edge weighting techniques. The results of experiments conducted on the WebKB4 dataset show that: (1) using information from sibling pages can significantly improve clustering quality; (2) sibling pages are more useful than parent and child pages in enhancing clustering performance; (3) weighting and pruning sibling links cannot improve the clustering quality. We also conducted an experiment on the citation dataset Cora7. The results indicate that sibling links are not more useful than direct citation links when used to cluster collections of research papers.
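To illustrate what a sibling link might look like in practice, here is a minimal Python sketch. It assumes that two pages count as siblings when the same parent page links to both, and it weights each pair by the number of parents they share. That definition, the function name `sibling_links`, and the toy page names are illustrative assumptions, not the authors' algorithm or data.

```python
from collections import defaultdict
from itertools import combinations


def sibling_links(out_links):
    """Return weighted sibling pairs derived from a link graph.

    out_links maps each page to the pages it links to.
    Assumption: two pages are siblings if one parent links to both;
    the weight of a pair is the number of parents they share.
    """
    weights = defaultdict(int)
    for parent, children in out_links.items():
        # Deduplicate and sort so each pair has one canonical ordering.
        for a, b in combinations(sorted(set(children)), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Hypothetical toy graph: a department page and a faculty page.
links = {
    "dept": ["course1", "course2", "faculty1"],
    "faculty1": ["course1", "course2"],
}
pairs = sibling_links(links)
```

A weighted pair list like this could then be fed to any graph clustering routine; pruning would correspond to dropping pairs whose weight falls below a chosen threshold.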
Details
- Title
- Web Clustering based on the Information of Sibling Pages
- Creators
- Caimei Lu - Drexel University; Xiaodan Zhang - Drexel University; Jung-ran Park - Drexel University; Xiaohua Hu - Drexel University; Tingting He - Central China Normal University
- Publication Details
- 2008 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING, VOLS 1 AND 2, pp 480-485
- Publisher
- IEEE
- Number of pages
- 2
- Resource Type
- Conference proceeding
- Language
- English
- Academic Unit
- Information Science
- Web of Science ID
- WOS:000263829500109
- Scopus ID
- 2-s2.0-57949107288
- Other Identifier
- 991019168574104721
InCites Highlights
Data related to this publication, from InCites Benchmarking & Analytics tool:
- Collaboration types
- Domestic collaboration
- International collaboration
- Web of Science research areas
- Computer Science, Artificial Intelligence
- Computer Science, Interdisciplinary Applications
- Computer Science, Theory & Methods