Large Language Models (LLMs) demonstrate remarkable zero-shot performance
across various natural language processing tasks. The integration of multimodal
encoders extends their capabilities, enabling the development of Multimodal
Large Language Models that process vision, audio, and text. However, these
capabilities also raise significant security concerns, as these models can be
manipulated to generate harmful or inappropriate content through jailbreak attacks.
While extensive research explores how modality-specific input edits affect
jailbreaks of text-based LLMs and Large Vision-Language Models, the effects
of audio-specific edits on Large Audio-Language Models (LALMs) remain
underexplored. This paper addresses that gap by investigating how
audio-specific edits influence LALM inference under jailbreak attempts. We
introduce the Audio Editing Toolbox (AET), which enables audio-modality edits
such as tone adjustment, word emphasis, and noise injection, and the Edited
Audio Datasets (EADs), a comprehensive audio jailbreak benchmark. We also
conduct extensive evaluations of state-of-the-art LALMs to assess their
robustness under different audio edits. This work lays the groundwork for
future exploration of audio-modality interactions in LALM security.
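
To make the notion of an audio-specific edit concrete, the following is a minimal sketch of two such edits applied to a raw waveform: noise injection at a target signal-to-noise ratio, and a crude stand-in for word emphasis that amplifies a sample range. It is not the AET implementation, which this record does not describe; the function names `inject_noise` and `emphasize`, their parameters, and the synthetic test tone are all assumptions made for illustration.

```python
import math
import random


def inject_noise(samples, snr_db, seed=0):
    """Add white Gaussian noise so the result has roughly `snr_db` dB SNR.

    Hypothetical helper for illustration; not part of any published toolbox.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    rng = random.Random(seed)
    # Mean power of the clean signal.
    signal_power = sum(s * s for s in samples) / len(samples)
    # SNR(dB) = 10 * log10(P_signal / P_noise), solved for P_noise.
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]


def emphasize(samples, start, end, gain):
    """Scale samples with index in [start, end) by `gain`.

    A crude proxy for word emphasis, assuming the word's sample
    boundaries are already known.
    """
    return [s * gain if start <= i < end else s for i, s in enumerate(samples)]


# Synthetic input (assumption): one second of a 440 Hz tone at 16 kHz.
sample_rate = 16000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate)
        for n in range(sample_rate)]

noisy = inject_noise(tone, snr_db=10)
louder = emphasize(tone, start=4000, end=8000, gain=2.0)
```

In a real pipeline the waveform would come from recorded or synthesized speech rather than a sine tone, and the edited audio would then be passed to the model under evaluation.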
Details
Title
Tune In, Act Up: Exploring the Impact of Audio Modality-Specific Edits on Large Audio Language Models in Jailbreak
Creators
Erjia Xiao
Hao Cheng
Jing Shao
Jinhao Duan
Kaidi Xu
Le Yang
Jindong Gu
Renjing Xu
Resource Type
Preprint
Language
English
Academic Unit
Computer Science (Computing)
Other Identifier
991022020738404721