Evaluating the Effectiveness of Psychological Prompt Injection Attacks on Large Language Models for Social Engineering Artifact Generation
Conference proceeding   Open access

Eve Cohen and Thomas Heverin
Proceedings of the 24th European Conference on Cyber Warfare and Security, pp 879-883
01 Jun 2025
DOI: https://doi.org/10.34190/eccws.24.1.3515
Published, Version of Record (VoR), Open access, CC BY-NC-ND 4.0

Abstract

Keywords: Access control; Chatbots; Cohen's d; Cybercrime; Effectiveness; Evaluation; Investigations; Kruskal-Wallis test; Large language models; Mann-Whitney U test; Psychological effects; Ransomware; Security; Statistical methods; Success; Tactics; Variance analysis; Cybersecurity
This study explores the vulnerability of Large Language Models (LLMs) to prompt injection attacks, a critical security concern. We investigate the effectiveness of four psychological techniques (PTs) from social engineering (Impersonation, Incentive, Persuasion, and Quid Pro Quo) in facilitating these attacks. Prompt injection involves manipulating LLMs by embedding malicious instructions within user prompts, potentially causing the model to generate harmful content or compromise sensitive data. Understanding these mechanisms is crucial for developing effective defenses. Our research assesses how these PTs influence prompt injection success rates against the ChatGPT-4o mini and Gemma-7b-it LLMs, used for ChatGPT and Gemini respectively. We hypothesized that PTs significantly increase the likelihood of successful attacks, with some techniques being more effective than others. We conducted 220 prompt injection tests (110 per LLM), designed to elicit social-engineering artifacts such as phishing emails, fake login screens, and ransomware notes, to evaluate model susceptibility to diverse attack vectors. The four PTs were chosen for their relevance to manipulating human behavior in social engineering: Impersonation involves assuming a trusted identity, Incentive offers rewards, Persuasion uses manipulative tactics, and Quid Pro Quo involves reciprocal exchanges. These techniques were adapted for prompt injections to simulate real-world social engineering scenarios. Statistical methods, including ANOVA and Kruskal-Wallis tests, assessed the overall impact of PTs. Mann-Whitney U tests with Bonferroni correction compared individual techniques, and Cohen's d measured effect sizes. Results demonstrate a statistically significant impact of PTs on prompt injection success. Impersonation was the most effective technique across both LLMs, followed by Persuasion and Quid Pro Quo, with Incentive the least effective.
These findings align with social engineering principles, highlighting the power of impersonation and other manipulative tactics. Our research has significant implications for LLM security and AI-driven social engineering. LLM vulnerability to psychologically driven prompt injections necessitates proactive security measures. Future research should focus on robust defense mechanisms, explore the interplay of PTs, and investigate their impact on LLM security. This study contributes to understanding LLM vulnerabilities and developing more resilient AI systems.
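
The pairwise part of the analysis described in the abstract (Mann-Whitney U comparisons, a Bonferroni correction, and Cohen's d) can be illustrated with a minimal sketch. This is not the authors' code. The outcome lists below are made-up placeholders, not the study's data, and the 0/1 success coding, the function names, and the 0.05 significance level are all assumptions chosen for illustration. Only the U statistic is computed here; obtaining a p-value would need a statistics library.

```python
from itertools import combinations
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d using the pooled sample standard deviation."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    return (mean(a) - mean(b)) / pooled_var ** 0.5


def mann_whitney_u(a, b):
    """U statistic for sample a: pairs where a > b count 1, ties count 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance threshold under a Bonferroni correction."""
    return alpha / n_comparisons


# Hypothetical outcomes (1 = injection succeeded, 0 = model refused).
# These values are invented for illustration and are NOT from the paper.
outcomes = {
    "Impersonation": [1, 1, 1, 0, 1, 1, 1, 1],
    "Persuasion": [1, 0, 1, 1, 0, 1, 1, 0],
    "Quid Pro Quo": [1, 0, 0, 1, 1, 0, 1, 0],
    "Incentive": [0, 0, 1, 0, 0, 1, 0, 0],
}

# Four techniques give six pairwise comparisons.
pairs = list(combinations(outcomes, 2))
threshold = bonferroni_alpha(0.05, len(pairs))  # assumed overall alpha of 0.05

for first, second in pairs:
    u = mann_whitney_u(outcomes[first], outcomes[second])
    d = cohens_d(outcomes[first], outcomes[second])
    print(f"{first} vs {second}: U={u:.1f}, d={d:.2f}, per-test alpha={threshold:.4f}")
```

With four techniques there are six pairwise comparisons, so under these assumptions each individual test would be judged against 0.05 / 6 rather than 0.05, which is what keeps the family-wise error rate bounded.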
