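Research on a Malicious Code Classification Algorithm Based on Stacked Autoencoders

Cite this article:	Luo Shiqi, Tian ShengWei, Sun Hua, Yu Long. Research on a malicious code classification algorithm based on stacked autoencoders [J]. Application Research of Computers, 2018, 35(1).

Authors:	Luo Shiqi, Tian ShengWei, Sun Hua, Yu Long

Affiliations:	School of Software, Xinjiang University; School of Software, Xinjiang University; School of Software, Xinjiang University; Network Center, Xinjiang University

Funding:	Xinjiang Autonomous Region Science and Technology Talent Training Project (QN2016YX0051); Autonomous Region Graduate Research Innovation Project (XJGRI2017007); Autonomous Region Graduate Education Innovation Program research innovation project "Research on Deep-Learning-Based Malicious Code Analysis and Detection" (No. 007); CERNET Next Generation Internet Technology Innovation Project "Research on Deep-Learning-Based Malicious Code Analysis and Detection for IPv6 Networks" (NGII2017XXXX).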

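Abstract:	Because traditional machine learning methods cannot effectively extract the latent features of malicious code, this paper proposes a malicious code classification algorithm based on the Stacked Auto Encoder (SAE). The method first learns and extracts, from a large number of training samples, the texture image features of malicious code and the implicit features of its instruction statements. On this basis, to increase the contribution of feature selection to classification accuracy, the texture features and the instruction-statement frequency features are fused and used to train a stacked autoencoder and a softmax classifier. The experimental results show that, with texture features and instruction frequency features as input, the SAE classification algorithm classifies malicious code well: its accuracy is higher than that of traditional shallow machine learning models (random forest, support vector machine), exceeding random forest by 2.474% and SVM by 1.235%.
Keywords:	stacked autoencoder; malicious code; classification
Received:	2016/9/18
Revised:	2017/11/16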
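
The feature fusion step described in the abstract can be illustrated with a minimal sketch. It assumes the common convention of mapping each byte of a binary to one grayscale pixel, and it assumes a small opcode vocabulary; neither detail is confirmed by this record. The names `OPCODES`, `texture_image`, `opcode_frequencies`, and `fused_features`, as well as the `width` and `rows` parameters, are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical opcode vocabulary (an assumption; the paper's instruction set is not listed here).
OPCODES = ["mov", "push", "pop", "call", "jmp", "add", "sub", "xor"]


def texture_image(data, width=16):
    """Map each byte to a grayscale pixel in [0, 1], row by row, zero-padding the last row."""
    padded = data + bytes(-len(data) % width)
    return [[b / 255.0 for b in padded[i:i + width]]
            for i in range(0, len(padded), width)]


def opcode_frequencies(instructions):
    """Relative frequency of each vocabulary opcode in a disassembly listing."""
    counts = Counter(line.split()[0].lower() for line in instructions if line.strip())
    total = sum(counts[op] for op in OPCODES) or 1  # avoid division by zero
    return [counts[op] / total for op in OPCODES]


def fused_features(data, instructions, width=16, rows=4):
    """Concatenate a fixed-size flattened texture image with the opcode frequencies."""
    image = texture_image(data, width)[:rows]          # truncate long files
    image += [[0.0] * width] * (rows - len(image))     # pad short files with blank rows
    pixels = [p for row in image for p in row]
    return pixels + opcode_frequencies(instructions)


# Usage: one fused vector per sample, with width * rows + len(OPCODES) entries.
vector = fused_features(bytes(range(64)), ["mov eax, 1", "push ebp", "call foo"])
```

Truncating and padding the image to a fixed number of rows is one simple way to give every sample the same input dimension; the original work may resize or sample images differently.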
Research on malicious code classification algorithm of Stacked Auto Encoder

Luo Shiqi, Tian ShengWei, Sun Hua and Yu Long. Research on malicious code classification algorithm of Stacked Auto Encoder[J]. Application Research of Computers, 2018, 35(1).

Authors:	Luo Shiqi, Tian ShengWei, Sun Hua and Yu Long

Affiliation:	School of Software, Xinjiang University

Abstract:	Because traditional methods cannot effectively extract the latent features of malicious code, this paper proposes a Stacked Auto Encoder (SAE) approach for classifying malicious code into families. First, the implicit features of the texture images and semantics of malicious code are learned and extracted from a large number of training samples. Then, to improve the contribution of feature selection to classification accuracy, the texture-image and semantic features are combined to train a Stacked Auto Encoder and a Softmax regression classifier. The experimental results show that the SAE method, based on the implicit texture-image and semantic features, classifies malicious code into families better than traditional machine learning methods such as Random Forest and SVM: its accuracy is 2.474% higher than that of Random Forest and 1.235% higher than that of SVM.

Keywords:	SAE (Stacked Auto Encoder); malicious code; classification
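
The training scheme named in the abstract, a stacked autoencoder followed by a softmax classifier, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses sigmoid units, squared reconstruction error, plain gradient descent, and greedy layer-wise pretraining with no end-to-end fine-tuning. The layer sizes, learning rates, epoch counts, function names, and the toy two-class data are all hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_autoencoder(X, hidden, epochs=500, lr=1.0, seed=0):
    """Train one sigmoid autoencoder layer; return encoder weights, bias, and loss history."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1, b1 = rng.normal(0, 0.1, (d, hidden)), np.zeros(hidden)
    W2, b2 = rng.normal(0, 0.1, (hidden, d)), np.zeros(d)
    losses = []
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)            # encode
        R = sigmoid(H @ W2 + b2)            # decode (reconstruction)
        losses.append(float(np.mean((R - X) ** 2)))
        dR = (R - X) * R * (1 - R) / n      # gradient of 0.5 * squared error / n at decoder input
        dH = (dR @ W2.T) * H * (1 - H)      # back-propagated to encoder input
        W2 -= lr * H.T @ dR
        b2 -= lr * dR.sum(axis=0)
        W1 -= lr * X.T @ dH
        b1 -= lr * dH.sum(axis=0)
    return W1, b1, losses


def train_softmax(H, y, classes, epochs=300, lr=0.5):
    """Multinomial logistic regression on the top-layer codes, trained by gradient descent."""
    n = H.shape[0]
    W, b = np.zeros((H.shape[1], classes)), np.zeros(classes)
    Y = np.eye(classes)[y]                  # one-hot labels
    for _ in range(epochs):
        logits = H @ W + b
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)
        G = (P - Y) / n                     # cross-entropy gradient at the logits
        W -= lr * H.T @ G
        b -= lr * G.sum(axis=0)
    return W, b


def fit_sae(X, y, layer_sizes, classes):
    """Greedy layer-wise pretraining, then a softmax classifier on the final codes."""
    layers, H = [], X
    for size in layer_sizes:
        W, b, _ = train_autoencoder(H, size)
        layers.append((W, b))
        H = sigmoid(H @ W + b)              # codes become the next layer's input
    return layers, train_softmax(H, y, classes)


def predict(X, layers, softmax):
    H = X
    for W, b in layers:
        H = sigmoid(H @ W + b)
    W, b = softmax
    return np.argmax(H @ W + b, axis=1)


# Toy data (an assumption): two noisy prototype vectors standing in for fused features.
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 20)
prototypes = np.array([[0.9] * 4 + [0.1] * 4, [0.1] * 4 + [0.9] * 4])
X = np.clip(prototypes[y] + rng.normal(0, 0.02, (40, 8)), 0, 1)

layers, softmax = fit_sae(X, y, layer_sizes=[6, 4], classes=2)
preds = predict(X, layers, softmax)
```

In practice the pretrained layers and the softmax layer are usually fine-tuned together with backpropagation, and accuracy is measured on held-out samples rather than on the training set as in this toy run.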
