Attention Is All You Need for LLM-based Code Vulnerability Localization

Yue Li; Xiao Li; Hao Wu; Yue Zhang; Xiuzhen Cheng; Sheng Zhong; Fengyuan Xu

doi:10.48550/arxiv.2410.15288

Preprint

Open access

IACAPAP ArXiv (Online)

20 Oct 2024

Files and links (1)

URL: https://arxiv.org/abs/2410.15288

Preprint (Author's original); arXiv.org - Non-exclusive license to distribute; Open

Abstract

Computer Science - Cryptography and Security

The rapid expansion of software systems and the growing number of reported vulnerabilities have emphasized the importance of accurately identifying vulnerable code segments. Traditional methods for vulnerability localization, such as manual code audits or rule-based tools, are often time-consuming and limited in scope, typically focusing on specific programming languages or types of vulnerabilities. In recent years, large language models (LLMs) such as GPT and LLaMA have opened new possibilities for automating vulnerability detection. However, while LLMs show promise in this area, they struggle to maintain accuracy over longer code contexts. This paper introduces LOVA, a novel framework that leverages the self-attention mechanisms inherent in LLMs to enhance vulnerability localization. Our key insight is that self-attention assigns varying importance to different parts of the input, making it possible to track how much attention the model pays to specific lines of code. For vulnerability localization, the hypothesis is that vulnerable lines of code will naturally attract higher attention weights because they have a greater influence on the model's output. By systematically tracking changes in attention weights and focusing on specific lines of code, LOVA improves the precision of identifying vulnerable lines across various programming languages. Through rigorous experimentation and evaluation, we demonstrate that LOVA significantly outperforms existing LLM-based approaches, achieving up to a 5.3x improvement in F1-scores. LOVA also demonstrates strong scalability across languages such as C, Python, Java, and Solidity, with up to a 14.6x improvement in smart contract vulnerability localization. Its robustness is shown by consistent performance across different LLM architectures.
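The core idea in the abstract, mapping token-level attention back to lines of code, can be illustrated with a minimal Python sketch. This is not LOVA's implementation, which the abstract does not specify. It assumes a per-token attention weight has already been extracted from some model (for example, averaged over heads and layers), and the names `line_attention_scores`, `rank_lines`, `token_lines`, and `token_attention` are hypothetical.

```python
from collections import defaultdict


def line_attention_scores(token_lines, token_attention):
    """Return the mean attention weight per source line.

    token_lines[i]     -- 1-based line number of token i (hypothetical input)
    token_attention[i] -- attention weight assigned to token i; how it is
                          extracted from the model is an assumption here
    """
    if len(token_lines) != len(token_attention):
        raise ValueError("token_lines and token_attention must have equal length")
    totals = defaultdict(float)
    counts = defaultdict(int)
    for line, weight in zip(token_lines, token_attention):
        totals[line] += weight
        counts[line] += 1
    # Averaging keeps long lines from dominating purely by token count.
    return {line: totals[line] / counts[line] for line in totals}


def rank_lines(scores, top_k=1):
    """Return the top_k line numbers, highest score first (ties: lower line first)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [line for line, _ in ranked[:top_k]]


# Toy example: six tokens spread over three lines.
token_lines = [1, 1, 2, 2, 2, 3]
token_attention = [0.05, 0.05, 0.30, 0.25, 0.20, 0.15]
scores = line_attention_scores(token_lines, token_attention)
print(rank_lines(scores, top_k=2))  # prints [2, 3]
```

Averaging rather than summing is one design choice among several. The abstract also mentions tracking changes in attention weights, which this static sketch does not attempt to reproduce.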

Details

Title: Attention Is All You Need for LLM-based Code Vulnerability Localization
Creators: Yue Li; Xiao Li; Hao Wu; Yue Zhang; Xiuzhen Cheng; Sheng Zhong; Fengyuan Xu
Publication Details: IACAPAP ArXiv (Online)
Resource Type: Preprint
Language: English
Academic Unit: Computer Science (Computing)
Other Identifier: 991021930421604721