The Impact of Name Ambiguity on Properties of Coauthorship Networks
Journal article   Open access   Peer reviewed

Jinseok Kim, Heejun Kim and Jana Diesner
Journal of Information Science Theory and Practice, 2(2)
01 Jan 2014
https://doi.org/10.1633/JISTaP.2014.2.2.1
Published, Version of Record (VoR), CC BY 4.0, Open

Abstract

Initial-based disambiguation of author names is a common data pre-processing step in bibliometrics. It is widely accepted that this procedure can introduce errors into network data and any subsequent analytical results. What is not sufficiently understood is the precise impact of this step on the data and findings. We present an empirical answer to this question by comparing two commonly used initial-based disambiguation methods against a reasonable proxy for ground truth data. We use DBLP, a database covering major journals and conferences in computer science and information science, as a source. We find that initial-based disambiguation induces strong distortions in network metrics at the graph and node levels: authors become embedded in ties for which there is no empirical support, which inflates their sphere of influence and diversity of involvement. Consequently, networks generated with initial-based disambiguation are more coherent and interconnected than the actual underlying networks, and individual authors appear to be more productive and more strongly embedded than they actually are.
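The merging effect described in the abstract can be illustrated with a minimal sketch. It assumes the two initial-based methods are the first-initial and all-initials variants (surname plus the first given-name initial, or surname plus every given-name initial); the abstract does not name them. The author names, the `papers` list, and all function names are hypothetical toy choices, not data or code from the study, and names are assumed to be written as space-separated given names followed by a surname.

```python
from itertools import combinations


def first_initial_key(name):
    """Hypothetical key: surname plus first given-name initial, e.g. 'kim j'."""
    *given, surname = name.split()
    if not given:  # single-token name: nothing to abbreviate
        return surname.lower()
    return f"{surname.lower()} {given[0][0].lower()}"


def all_initials_key(name):
    """Hypothetical key: surname plus all given-name initials, e.g. 'kim ja'."""
    *given, surname = name.split()
    if not given:
        return surname.lower()
    initials = "".join(part[0].lower() for part in given)
    return f"{surname.lower()} {initials}"


def build_network(papers, key=lambda name: name):
    """Build an undirected coauthorship graph as {node: set(neighbors)}.

    Each paper is a list of author name strings; `key` maps a name to the
    node identifier used after disambiguation (identity = full names).
    """
    adjacency = {}
    for authors in papers:
        nodes = {key(author) for author in authors}
        for node in nodes:
            adjacency.setdefault(node, set())
        for a, b in combinations(sorted(nodes), 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


# Toy data (invented): two distinct authors share the surname Kim
# and the first initial J.
papers = [
    ["Jane A Kim", "Ann Lee"],
    ["John B Kim", "Bo Park"],
    ["Jane A Kim", "Cy Choi"],
]

truth = build_network(papers)                       # full names as a stand-in for ground truth
by_first = build_network(papers, first_initial_key)
by_all = build_network(papers, all_initials_key)

print(len(truth), len(by_first), len(by_all))       # 5 4 5
print(len(truth["Jane A Kim"]), len(by_first["kim j"]))  # 2 3
```

In this toy case the first-initial key merges two different people into one node, so the merged "kim j" gains a coauthor it never had and the graph has fewer nodes. The all-initials key happens to keep them apart here, which says nothing about how either method performs on real data such as DBLP.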
