AXE: An Agentic eXploit Engine for Confirming Zero-Day Vulnerability Reports
Preprint   Open access

Amirali Sajadi, Tu Nguyen, Kostadin Damevski and Preetha Chatterjee
pp 1-17
15 Feb 2026
url
https://doi.org/10.48550/arxiv.2602.14345
Published, Version of Record (VoR)
arXiv.org - Non-exclusive license to distribute

Abstract

Computer Science - Artificial Intelligence; Computer Science - Cryptography and Security; Artificial Intelligence or Cybernetics
Vulnerability detection tools are widely adopted in software projects, yet they often overwhelm maintainers with false positives and non-actionable reports. Automated exploitation systems can help validate these reports; however, existing approaches typically operate in isolation from detection pipelines, failing to leverage readily available metadata such as vulnerability type and source-code location. In this paper, we investigate how reported security vulnerabilities can be assessed in a realistic grey-box exploitation setting that leverages minimal vulnerability metadata, specifically a CWE classification and a vulnerable code location. We introduce Agentic eXploit Engine (AXE), a multi-agent framework for Web application exploitation that maps lightweight detection metadata to concrete exploits through decoupled planning, code exploration, and dynamic execution feedback. Evaluated on the CVE-Bench dataset, AXE achieves a 30% exploitation success rate, a 3x improvement over state-of-the-art black-box baselines. Even in a single-agent configuration, grey-box metadata yields a 1.75x performance gain. Systematic error analysis shows that most failed attempts arise from specific reasoning gaps, including misinterpreted vulnerability semantics and unmet execution preconditions. For successful exploits, AXE produces actionable, reproducible proof-of-concept artifacts, demonstrating its utility in streamlining Web vulnerability triage and remediation. We further evaluate AXE's generalizability through a case study on a recent real-world vulnerability not included in CVE-Bench.
