AI-MTD: Zero-Trust Artificial Intelligence Model Security Based on Moving Target Defense
Daniel Gilkarov, Ran Dubin
DOI: http://dx.doi.org/10.15439/2025F9981
Citation: Proceedings of the 20th Conference on Computer Science and Intelligence Systems (FedCSIS), M. Bolanowski, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 43, pages 699–704 (2025)
Abstract. This paper examines the challenges in distributing AI models through file transfer mechanisms. Despite advancements in security measures, vulnerabilities persist, necessitating a multi-layered approach to mitigate risks effectively. The physical security of model files is critical, requiring stringent access controls and attack prevention solutions. This paper proposes a novel solution architecture that protects the model architecture and weights by using Moving Target Defense (MTD), which obfuscates the model, prevents unauthorized access, and enables detection of changes to the model. Our method is effective at detecting alterations to the model, such as steganography; it is faster than encryption (0.1 seconds to obfuscate vs. 18 seconds to encrypt for a 2500 MB model); and, unlike encryption, it preserves the accessibility of the original model file format. Finally, our code is available at https://github.com/ArielCyber/AI-model-MTD.git.
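To illustrate the idea, the sketch below shows one way an MTD-style obfuscation step could work, assuming a secret, per-deployment seed that drives a permutation of the flat weight buffer and a digest of the obfuscated bytes for tamper detection. The permutation scheme, function names, and digest choice are illustrative assumptions for this sketch, not the released implementation in the repository above.

```python
import hashlib
import numpy as np

# Hypothetical MTD-style obfuscation sketch: the model's flat weight buffer is
# permuted with a secret, per-deployment seed, so the file keeps its original
# format but is unusable without the seed. A digest of the obfuscated bytes
# lets the loader detect any later modification (e.g., an embedded payload)
# before de-obfuscation.

def obfuscate(weights: np.ndarray, seed: int) -> tuple[np.ndarray, str]:
    rng = np.random.default_rng(seed)
    perm = rng.permutation(weights.size)                      # moving-target transform
    shuffled = weights.ravel()[perm].reshape(weights.shape)
    digest = hashlib.sha256(shuffled.tobytes()).hexdigest()   # integrity tag
    return shuffled, digest

def restore(shuffled: np.ndarray, seed: int, expected_digest: str) -> np.ndarray:
    if hashlib.sha256(shuffled.tobytes()).hexdigest() != expected_digest:
        raise ValueError("model file was modified after obfuscation")
    rng = np.random.default_rng(seed)
    perm = rng.permutation(shuffled.size)
    inverse = np.argsort(perm)                                 # invert the permutation
    return shuffled.ravel()[inverse].reshape(shuffled.shape)

if __name__ == "__main__":
    w = np.random.rand(1024).astype(np.float32)        # stand-in for a weight tensor
    obf, tag = obfuscate(w, seed=42)                    # the seed acts as the moving-target secret
    assert np.array_equal(restore(obf, 42, tag), w)     # round trip recovers the original weights
```

Because the transformation only reorders existing values, the file remains a valid model container of the same size, which is consistent with the abstract's claim that the original file format stays accessible, unlike encryption.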