PHANTOM-DGA: Multi-Task Transformer With Cached Phonotactics, Dictionary Segmentation, and Meta-Learning for Robust DGA Domain Detection

Muhammad Asshad

Authors

Muhammad Asshad, Faculty of Computer Studies, Arab Open University, Oman

Keywords:

Domain Generation Algorithm (DGA), Malicious Domain Detection, Transformer Encoder, Self-Supervised Pretraining, Contrastive Learning, Meta-Learning, Phonotactics, Calibration

Abstract

Domain generation algorithms (DGAs) enable malware to evade static blacklists by generating large volumes of pseudo-random candidate domains. While many published detectors achieve near-perfect performance under random i.i.d. splits, performance often degrades under temporal shift and on previously unseen DGA families. This paper presents PHANTOM-DGA, a GPU-optimized, checkpointable training pipeline and feature-gated multi-task transformer that combines byte-level sequence modeling with lightweight side features: (i) classical lexical statistics, (ii) dictionary-based segmentation features, and (iii) phonotactic features learned via an n-gram consonant–vowel language model trained only on benign training data to avoid leakage. PHANTOM-DGA also supports self-supervised pretraining (masked modeling plus contrastive alignment) and optional Reptile-style meta-learning to improve robustness in a held-out-family evaluation setting. Experiments on the public ‘Domain Generation Algorithm’ dataset (≈1.82M unique canonicalized domains; 52 DGA families) evaluate three protocols: random split, time-forward split, and unseen-family split. Under the random protocol, PHANTOM-DGA reaches ROC-AUC 0.9986 and F1 0.9916. Under the time-forward protocol, ROC-AUC decreases to 0.9674–0.9697, depending on the ablation. Under the unseen-family protocol, the best ROC-AUC is 0.9359, with reduced false positives at 95% TPR. The results highlight that protocol choice dominates headline performance and that robustness gains require explicit distribution-shift evaluation and training strategies.
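To illustrate the phonotactic side feature mentioned in the abstract, the following is a minimal sketch of a consonant–vowel bigram language model fitted on benign labels only. It is not the paper's implementation: the class name `CVBigramLM`, the four-symbol alphabet (C, V, D for digits, O for other), the bigram order, the add-one smoothing, and the toy training list are all assumptions chosen for illustration.

```python
import math
from collections import Counter

VOWELS = set("aeiou")  # assumption: ASCII vowels only; "y" counts as a consonant


def cv_pattern(label: str) -> str:
    """Map a domain label to a pattern over C (consonant), V (vowel), D (digit), O (other)."""
    out = []
    for ch in label.lower():
        if ch in VOWELS:
            out.append("V")
        elif ch.isalpha():
            out.append("C")
        elif ch.isdigit():
            out.append("D")
        else:
            out.append("O")
    return "".join(out)


class CVBigramLM:
    """Hypothetical bigram model over CV patterns with additive smoothing."""

    def __init__(self, alpha: float = 1.0):
        self.alpha = alpha
        self.bigrams = Counter()   # counts of (previous symbol, current symbol)
        self.contexts = Counter()  # counts of previous symbol
        self.vocab = ["C", "V", "D", "O", "$"]  # symbols that can follow a context; "$" marks end

    def fit(self, benign_labels):
        # Fit on benign TRAINING labels only, so no test or malicious data leaks in.
        for label in benign_labels:
            seq = "^" + cv_pattern(label) + "$"  # "^" marks start
            for prev, cur in zip(seq, seq[1:]):
                self.bigrams[(prev, cur)] += 1
                self.contexts[prev] += 1
        return self

    def avg_log_prob(self, label: str) -> float:
        """Length-normalized log-probability of the label's CV pattern."""
        seq = "^" + cv_pattern(label) + "$"
        total = 0.0
        for prev, cur in zip(seq, seq[1:]):
            num = self.bigrams[(prev, cur)] + self.alpha
            den = self.contexts[prev] + self.alpha * len(self.vocab)
            total += math.log(num / den)
        return total / (len(seq) - 1)


# Toy usage: a real pipeline would fit on the benign portion of the training split.
benign_train = ["google", "facebook", "wikipedia", "amazon", "youtube"]
lm = CVBigramLM().fit(benign_train)
pronounceable_score = lm.avg_log_prob("google")
random_like_score = lm.avg_log_prob("xkqzvbtrw")
```

In this sketch, a label with long consonant runs receives a lower average log-probability than a pronounceable one, and that score could serve as one scalar side feature alongside the lexical and segmentation features.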

Author Biography

Muhammad Asshad, Faculty of Computer Studies, Arab Open University, Oman

Dr. Muhammad Asshad is a researcher and academic in the field of computer engineering, specializing in next-generation 5G wireless networks. He earned his Ph.D. in Computer Engineering from Kocaeli University, Turkey, in 2019, and holds an M.S. in Telecommunication and Networks.

His notable contributions focus on developing secure and efficient frameworks for next-generation wireless communication networks. His research interests include cybersecurity, network security, and secure communication frameworks, with an emphasis on enhancing the reliability and protection of modern network infrastructures.

In addition to his research expertise, he holds professional certifications such as CCNA, Cisco IoT, and CC. Dr. Asshad is dedicated to academic excellence, innovative teaching, and promoting ethical and secure practices in the field of technology.
