Conference paper
contrastBERT: Behavioral Anomaly Detection for Malware Using Contrastive Learning
Game Theory and AI for Security, pp 167-186
2026
Abstract
Behavioral anomaly detection is a useful tool for finding malware attacks. Detection is possible using binary classification, where the behavior of benign software and malware is considered during the detector learning process. However, such classifiers are less likely to detect zero-day attacks, as the set of possible malware is enormous and changing. An alternative to binary classification is anomaly detection, which considers only the behavior of benign software to train a malware detector. Recent advances in Large Language Models (LLMs) in combination with Contrastive Learning have created a unique opportunity to develop anomaly detectors to expose malware threats. Our detector, contrastBERT, defines its language vocabulary as the set of calls to the operating system and the sequences of such calls as sentences. These sequences of system calls are logged during the execution of benign software on the system to be monitored and, separately, during malware execution. contrastBERT is evaluated using a similarity distribution between benign sample sentences fed to an isolation forest (IF), which provides an effective separation between benign and malicious behavior when trained on embeddings from contrastBERT. Our results show that using this approach, several types of common malware behavior patterns can be detected with a high detection rate and a low false positive rate.
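The abstract's framing of system calls as a vocabulary and call sequences as sentences can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `build_vocab` and `encode`, the BERT-style special tokens, the `max_len` value, and the example call names are all assumptions chosen for illustration. It shows only the tokenization step, not the contrastive model or the isolation forest.

```python
# Illustrative sketch only: names, special tokens, and max_len are assumptions,
# not details taken from the paper.
PAD, UNK, CLS, SEP = "[PAD]", "[UNK]", "[CLS]", "[SEP]"


def build_vocab(traces):
    """Map each distinct system call seen in the traces to an integer id."""
    vocab = {tok: i for i, tok in enumerate((PAD, UNK, CLS, SEP))}
    for trace in traces:
        for call in trace:
            # Assign the next free id the first time a call is seen.
            vocab.setdefault(call, len(vocab))
    return vocab


def encode(trace, vocab, max_len=8):
    """Turn one trace into a fixed-length 'sentence' of token ids."""
    # Calls absent from the vocabulary map to [UNK]; long traces are truncated.
    body = [vocab.get(call, vocab[UNK]) for call in trace][: max_len - 2]
    ids = [vocab[CLS]] + body + [vocab[SEP]]
    # Right-pad with [PAD] so every sentence has the same length.
    return ids + [vocab[PAD]] * (max_len - len(ids))


# The vocabulary is built from benign traces only, mirroring the
# anomaly-detection setting in which training uses benign behavior.
benign_traces = [
    ["open", "read", "close"],
    ["open", "mmap", "read", "write", "close"],
]
vocab = build_vocab(benign_traces)

# A trace containing a call never seen in benign logs ("ptrace" here).
sentence = encode(["open", "read", "ptrace", "close"], vocab)
print(sentence)
```

In this sketch a call that never appears in the benign logs becomes `[UNK]`. A trained model would receive such id sequences as input and produce the embeddings that a downstream detector scores.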
Details
- Title
- contrastBERT: Behavioral Anomaly Detection for Malware Using Contrastive Learning
- Creators
- John Carter - Drexel University; Spiros Mancoridis - Drexel University; Pavlos Protopapas - Harvard University; Brian Mitchell - Drexel University
- Contributors
- John S. Baras (Editor); Symeon Papavassiliou (Editor); Eirini Eleni Tsiropoulou (Editor); Muhammed O. Sayin (Editor)
- Publication Details
- Game Theory and AI for Security, pp 167-186
- Series
- Lecture Notes in Computer Science
- Publisher
- Springer Nature; Cham
- Resource Type
- Conference paper
- Language
- English
- Academic Unit
- Computer Science
- Scopus ID
- 2-s2.0-105020238843
- Other Identifier
- 991022121987404721