sysBERT: Improved Behavioral Malware Detection using BERT Trained on sys2vec Embeddings
Conference proceeding   Open access

John Carter, Spiros Mancoridis and Pavlos Protopapas
Proceedings of the Annual Hawaii International Conference on System Sciences, pp 7122-7131
2025
URL: https://doi.org/10.24251/HICSS.2025.851
Published, Version of Record (VoR); Open Access (License Unspecified)

Abstract

Keywords: behavioral malware detection, BERT, language models, machine learning
As malware becomes stealthier and harder to detect, behavioral malware detection, which uses representative run-time data from a device to determine whether an infection has occurred, has become the preferred detection method. In this work, we collected kernel-level system calls from a router serving IoT devices during periods of benign behavior and periods of known malware infection. The system calls were processed with our custom-trained sys2vec model, which produced a contextual embedding for each observed system call. We then classified the embedded data using a Gated Recurrent Unit (GRU) with an attention layer. Although this pipeline performed well on noisy, easy-to-detect malware, it struggled with stealthier malware. To address this, we trained a classifier that uses a custom-trained BERT encoder in place of the GRU and attention layers, which yields much better detection at a usable false positive rate (FPR) of ≤ 1 × 10⁻⁵.
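
The abstract reports detection at a fixed FPR bound of 1 × 10⁻⁵, but this record does not describe how that operating point was chosen. The following is a minimal sketch of one common approach, not the authors' procedure: pick the decision threshold from classifier scores on benign data so that the benign flag rate stays within the bound, then measure the detection rate on malware scores at that threshold. The function names and the synthetic Gaussian scores are hypothetical and are used only for illustration.

```python
import math
import random


def threshold_for_fpr(benign_scores, target_fpr):
    """Return a threshold t such that the fraction of benign scores
    strictly greater than t is at most target_fpr."""
    ordered = sorted(benign_scores, reverse=True)
    # Number of benign samples that may be flagged without exceeding the bound.
    allowed = math.floor(target_fpr * len(ordered))
    if allowed >= len(ordered):
        return float("-inf")  # every benign sample may be flagged
    # The rule is "score > t", so using the (allowed + 1)-th highest score
    # as t flags at most `allowed` benign samples.
    return ordered[allowed]


def rates_at_threshold(benign_scores, malware_scores, threshold):
    """Return (false positive rate, detection rate) for the rule score > threshold."""
    fpr = sum(s > threshold for s in benign_scores) / len(benign_scores)
    tpr = sum(s > threshold for s in malware_scores) / len(malware_scores)
    return fpr, tpr


# Synthetic scores (assumption): benign ~ N(0, 1), malware ~ N(6, 1).
# At least 1 / target_fpr benign samples are needed before any sample
# can be flagged without exceeding the bound.
random.seed(0)
benign = [random.gauss(0.0, 1.0) for _ in range(200_000)]
malware = [random.gauss(6.0, 1.0) for _ in range(2_000)]

t = threshold_for_fpr(benign, 1e-5)
fpr, tpr = rates_at_threshold(benign, malware, t)
print(f"threshold={t:.3f}  FPR={fpr:.2e}  detection rate={tpr:.3f}")
```

A bound this tight also constrains the evaluation set: with fewer than 100,000 benign samples, a measured FPR of zero cannot distinguish a true rate of 1 × 10⁻⁵ from a higher one.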
