Conference proceeding
Is There a Limit to the Utility of Analogy? A Case Study in Knowledge Graph Triple Representation
2022 International Conference on Computational Science and Computational Intelligence (CSCI), pp 349-355
Dec 2022
Abstract
Knowledge graph embedding methods are known to depend heavily on the locally closed world assumption (LCWA). This assumption is practical for training neural networks on the link prediction task, but is ill-posed for representing knowledge graph triples as first-class objects. In this paper, we explore an alternate sampling paradigm, namely pairwise triple similarity scoring (PTSS), and detail the impact of the sampling parameter on downstream predicate prediction tasks. We specifically seek the limit beyond which additional negative samples no longer provide a statistically significant lift in performance, giving insight into how to sample and train such models efficiently. Our main finding is that there is a point of diminishing returns on the number of analogous, pairwise samples selected; this point can be found prior to experimental hyperparameter sweeps. Additionally, our experiments show that there are two classes of models: those that benefit from additional sampling, and those that are less impacted. These differences are driven by changes in the distributions of similarity scores, which depend on the seed knowledge graph embeddings selected. Our work demonstrates the importance of selecting the correct seed embedding method, a choice that largely depends on the topology of the underlying knowledge graph.
Details
- Creators
- Alexander Kalinowski - Drexel University
- Yuan An - Drexel University
- Publisher
- IEEE
- Language
- English
- Academic Unit
- Information Science; Decision Sciences (and Management Information Systems)
- Scopus ID
- 2-s2.0-85172021078
- Other Identifier
- 991021043015704721