Conference proceeding
CatRelate: A New Hierarchical Document Category Integration Algorithm by Learning Category Relationships
Digital Libraries: International Collaboration and Cross-Fertilization, v 3334, pp 280-289
2004
Abstract
We address the problem of integrating documents from a source catalog into a master catalog. Existing approaches either treat it as a flat category integration problem, ignoring the useful hierarchy information in the catalogs, or handle it hierarchically but without a rigorous model. In contrast, our method is based on correctly identifying relationships among categories, namely Match, Disjoint, SubConcept, SuperConcept, and Overlap, which correspond to the relations between sets in set theory. Compared with the traditional Match/NotMatch relationship used in the literature, our approach is more expressive in defining category relationships. The relationships among categories are first learned probabilistically and then refined by considering the hierarchy context. Our preliminary experiments show that the method helps to identify category relationships correctly, and thus increases the accuracy of document integration.
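The five relationships named in the abstract can be illustrated with plain set operations. The following is a minimal sketch only, assuming each category is represented as a set of document identifiers. The function name `category_relation` and the reading of SubConcept as "the first category is strictly contained in the second" are assumptions made for illustration. The paper learns these relationships probabilistically and refines them using the hierarchy context; neither step is reproduced here.

```python
def category_relation(a, b):
    """Return the crisp set-theoretic relation of category a to category b.

    Hypothetical helper: categories are modeled as sets of document IDs.
    This shows only the set definitions, not the probabilistic learning
    or hierarchy-based refinement described in the abstract.
    """
    a, b = set(a), set(b)
    if a == b:
        return "Match"
    if a.isdisjoint(b):
        return "Disjoint"
    if a < b:  # a is a proper subset of b
        return "SubConcept"
    if a > b:  # a is a proper superset of b
        return "SuperConcept"
    return "Overlap"  # some shared documents, neither contains the other


# Example usage with made-up document IDs
print(category_relation({"d1", "d2"}, {"d1", "d2"}))        # Match
print(category_relation({"d1"}, {"d2"}))                    # Disjoint
print(category_relation({"d1"}, {"d1", "d2"}))              # SubConcept
print(category_relation({"d1", "d2"}, {"d1"}))              # SuperConcept
print(category_relation({"d1", "d2"}, {"d2", "d3"}))        # Overlap
```

A binary Match/NotMatch scheme would collapse the last four cases into a single NotMatch outcome, which is the loss of expressiveness the abstract points to.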
Metrics
2 citations in Scopus
Details
- Title
- CatRelate: A New Hierarchical Document Category Integration Algorithm by Learning Category Relationships
- Creators
- Shanfeng Zhu - Bioinformatics Center, Institute for Chemical Research, Kyoto University, Japan
- Christopher C Yang - Drexel University, Information Science (Informatics)
- Wai Lam - Chinese University of Hong Kong
- Publication Details
- Digital Libraries: International Collaboration and Cross-Fertilization, v 3334, pp 280-289
- Conference
- International Conference on Asian Digital Libraries (Shanghai, China, 13 Dec 2004–17 Dec 2004)
- Series
- Lecture Notes in Computer Science
- Publisher
- Springer Berlin Heidelberg; Berlin, Heidelberg
- Resource Type
- Conference proceeding
- Language
- English
- Academic Unit
- Information Science (Informatics)
- Web of Science ID
- WOS:000226741400030
- Scopus ID
- 2-s2.0-35048896545
- Other Identifier
- 991021861113004721
InCites Highlights
Data related to this publication, from the InCites Benchmarking & Analytics tool:
- Collaboration types
- Domestic collaboration
- International collaboration
- Web of Science research areas
- Computer Science, Information Systems
- Computer Science, Theory & Methods