Ubiquitous computing is now a reality for organizations worldwide, and the security of these systems is of paramount importance for operations across industry, academia, and government. However, the deluge of information given to security operations teams is challenging to respond to at scale and has led to significant interest in automation and autonomous cyber operations: the automated investigation and remediation of cybersecurity incidents. Although machine learning and artificial intelligence are widely adopted in the cybersecurity domain, reinforcement learning has, to date, not been widely leveraged in autonomous cyber operations despite its seemingly natural fit. In this work, we identify four "fatal flaws" in autonomous cyber operations research: state space observability, attacker knowledge of the environment, defender knowledge of the attacker, and the assumption of a fixed-size state space. Via game theory, decision theory, and experimentation with reinforcement learning agents, we address each of these limitations to bridge the gap between the current state of the art in research and what has been adopted in practice. In particular, we explore the theoretical value of partial observability and find that this result has significant practical implications for the design of both attacking and defending agents. Using reinforcement learning, we demonstrate that there is a "price of pessimism" for overestimating an attacker's capability. By modifying the YAWNING-TITAN and CybORG environments, we explore the impact of training agents on attackers with different objectives and compare standard proximal policy optimization against hierarchical proximal policy optimization, finding that there are trade-offs between the two. Finally, we introduce an agent architecture that incorporates a transformer for representation learning, allowing our agents to operate in dynamic state spaces.
Through these contributions, we bridge the gap between the current academic state of the art and the needs of cybersecurity practitioners, enabling important future work on autonomous cyber operations.
Details
Title
Strategic reinforcement learning agents for autonomous cyber defense
Creators
Erick Galinkin
Contributors
Spiros Mancoridis (Advisor)
Emmanouil Pountourakis (Advisor)
Awarding Institution
Drexel University
Degree Awarded
Doctor of Philosophy (Ph.D.)
Publisher
Drexel University
Number of pages
xiv, 104 pages
Resource Type
Dissertation
Language
English
Academic Unit
Computer Science (Computing) [Historical]; College of Computing and Informatics (2013-2026); Drexel University