Large Language Models (LLMs) excel at diverse tasks such as text generation,
data analysis, and software development, making them indispensable across
domains like education, business, and creative industries. However, the rapid
proliferation of LLMs (with over 560 companies developing or deploying them as
of 2024) has raised concerns about their originality and trustworthiness. A
notable issue, termed identity confusion, has emerged, where LLMs misrepresent
their origins or identities. This study systematically examines identity
confusion through three research questions: (1) How prevalent is identity
confusion among LLMs? (2) Does it arise from model reuse, plagiarism, or
hallucination? (3) What are the security and trust-related impacts of identity
confusion? To address these questions, we developed an automated tool combining
documentation analysis, self-identity recognition testing, and output
similarity comparisons (established methods for LLM fingerprinting), and
conducted a structured survey via Credamo to assess the impact of identity
confusion on user trust.
Our analysis of 27 LLMs revealed that 25.93% (7 of 27) exhibit identity confusion. Output
similarity analysis confirmed that these issues stem from hallucinations rather
than replication or reuse. Survey results further highlighted that identity
confusion significantly erodes trust, particularly in critical tasks like
education and professional use, with declines exceeding those caused by logical
errors or inconsistencies. Users attributed these failures to design flaws,
incorrect training data, and perceived plagiarism, underscoring the systemic
risks posed by identity confusion to LLM reliability and trustworthiness.
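
As a rough illustration of two of the techniques named above, self-identity recognition testing and output similarity comparison, the following minimal Python sketch may help. It is not the tool described in this study. The identity catalogue `KNOWN_IDENTITIES`, the function names, and the use of keyword matching and token-level Jaccard similarity are all assumptions made for illustration; the abstract does not specify how the actual tool implements either step.

```python
import re

# Hypothetical catalogue mapping a developer to names a model might use for
# itself. This list is an assumption for illustration only.
KNOWN_IDENTITIES = {
    "OpenAI": ["openai", "chatgpt", "gpt-4"],
    "Anthropic": ["anthropic", "claude"],
    "Google": ["google", "gemini"],
}


def claimed_identities(response):
    """Return the set of developers whose names or aliases appear in a response."""
    text = response.lower()
    found = set()
    for developer, aliases in KNOWN_IDENTITIES.items():
        for alias in aliases:
            # Whole-word match so that short aliases do not match inside other words.
            if re.search(r"\b" + re.escape(alias) + r"\b", text):
                found.add(developer)
                break
    return found


def is_identity_confused(response, true_developer):
    """Flag a response that names any developer other than the documented one."""
    return bool(claimed_identities(response) - {true_developer})


def jaccard_similarity(output_a, output_b):
    """Token-set Jaccard similarity between two outputs, in the range [0, 1]."""
    tokens_a = set(output_a.lower().split())
    tokens_b = set(output_b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # Two empty outputs are treated as identical.
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


if __name__ == "__main__":
    reply = "I am Claude, an assistant made by Anthropic."
    print(is_identity_confused(reply, "Google"))     # names a different developer
    print(is_identity_confused(reply, "Anthropic"))  # consistent with documentation
    print(jaccard_similarity("a b c", "b c d"))
```

In a pipeline of this shape, a flagged self-identification would be only a first signal. Consistently low output similarity between the flagged model and the model it claims to be would point toward hallucination, while high similarity would point toward reuse or replication. Any threshold separating the two would need to be calibrated empirically.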
Title
I'm Spartacus, No, I'm Spartacus: Measuring and Understanding LLM Identity Confusion