Adversarial examples defense method based on multi-dimensional feature maps knowledge distillation
Baolin QIU, Ping YI. Adversarial examples defense method based on multi-dimensional feature maps knowledge distillation[J]. Chinese Journal of Network and Information Security, 2022, 8(2): 88-99. DOI: 10.11959/j.issn.2096-109x.2022012
Authors: Baolin QIU  Ping YI
Affiliation: School of Cyber Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
Foundation item: National Key R&D Program of China (2019YFB1405000)
Abstract: Neural networks are commonly used in computer vision tasks, but adversarial examples can cause a network to produce false predictions. Adversarial training has been shown to be an effective defense against adversarial examples; however, it requires high computing power and long training time, which limits its application scenarios. An adversarial-example defense method based on knowledge distillation was proposed, reusing the defense experience learned from large datasets in new classification tasks. During distillation, the teacher model has the same structure as the student model, feature-map vectors are used as the medium for transferring experience, and only clean samples are used for training. Multi-dimensional feature maps are utilized to enhance the expression of semantic information. Furthermore, an attention mechanism based on feature maps was proposed, which boosts the distillation effect by assigning weights to features according to their importance. Experiments were conducted on the CIFAR-100 and CIFAR-10 open-source datasets, and white-box attack algorithms such as FGSM (fast gradient sign method), PGD (projected gradient descent) and C&W (Carlini-Wagner attack) were applied to test the results. The accuracy of the proposed method on CIFAR-10 clean samples exceeds that of adversarial training and is close to that of a model normally trained on clean samples. Under the L2-norm PGD attack, the performance of the proposed method is close to that of adversarial training and significantly higher than that of normal training. Moreover, the proposed method is a lightweight adversarial defense with low learning cost: even with optimizations such as the attention mechanism and multi-dimensional feature maps added, its computing-power requirement is far below that of adversarial training. As a neural network learning scheme, knowledge distillation can learn decision-making experience from normal samples and extract robust features. It uses a small amount of data to generate accurate and robust models, improves generalization, and reduces cost relative to adversarial training.
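To illustrate the kind of feature-map distillation the abstract describes, the following is a minimal NumPy sketch. It assumes a squared-activation spatial attention map and a sum of per-layer mean-squared differences as the distillation loss; the paper's exact attention mechanism, weighting scheme, and loss are not given in the abstract, so every function and shape here is a hypothetical illustration, not the authors' implementation.

```python
import numpy as np

def attention_map(fmap):
    # Collapse the channel axis into a spatial attention map by summing
    # squared activations, then L2-normalize. This is one common choice
    # for attention-based distillation; the paper's scheme may differ.
    a = (fmap ** 2).sum(axis=0)                 # (C, H, W) -> (H, W)
    return a / (np.linalg.norm(a) + 1e-8)

def distill_loss(teacher_maps, student_maps):
    # Multi-dimensional feature-map distillation: match the student's
    # attention maps to the teacher's at every chosen layer, so that
    # semantic information from several depths guides the student.
    return sum(
        float(((attention_map(t) - attention_map(s)) ** 2).sum())
        for t, s in zip(teacher_maps, student_maps)
    )

rng = np.random.default_rng(0)
# Toy feature maps from three layers of identically structured networks
# (the teacher and student share an architecture, as in the abstract).
teacher = [rng.normal(size=(16, 8, 8)) for _ in range(3)]
student = [rng.normal(size=(16, 8, 8)) for _ in range(3)]

print(distill_loss(teacher, teacher))  # → 0.0 (identical feature maps)
print(distill_loss(teacher, student) > 0)  # → True
```

In training, this loss would be added to the ordinary classification loss on clean samples, so the student inherits the teacher's robust feature representations without any adversarial-example generation.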
Keywords:deep learning  adversarial examples defense  knowledge distillation  multi-dimensional feature maps  