首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.

When training a machine learning model, there is likely to be a tradeoff between accuracy and the diversity of the dataset. Previous research has shown that if we train a model to detect one specific malware family, we generally obtain stronger results as compared to a case where we train a single model on multiple diverse families. However, during the detection phase, it would be more efficient to have a single model that can reliably detect multiple families, rather than having to score each sample against multiple models. In this research, we conduct experiments based on byte n-gram features to quantify the relationship between the generality of the training dataset and the accuracy of the corresponding machine learning models, all within the context of the malware detection problem. We find that neighborhood-based algorithms generalize surprisingly well, far outperforming the other machine learning techniques considered.

  相似文献   

2.
With the rising adoption of the Internet of Things (IoT) across a variety of industries, malware is increasingly targeting the large number of IoT devices that lack adequate protection. Malware hunting is challenging in the IoT due to the variety of instruction set architectures of devices, as shown by the differences in the relevant characteristics of malware on different platforms. There are also serious concerns about resource utilization and privacy leaks in the development of conventional detection models. This study suggests a novel federated malware detection framework based on many-objective optimization (FMDMO) for the IoT to overcome the problems. First, the framework provides a cross-platform compatible basis with the federated mechanism as the backbone, while avoiding raw data sharing to improve privacy protection. Second, an intelligent optimization-based client selection method is designed for four objectives: learning performance, architectural selection deviation, time consumption, and training stability, which leads malware detection to retain a high degree of cross-architectural generalization while enhancing training efficiency. Based on a large IoT malware dataset we constructed, containing 62,515 malware samples across seven typical architectures, the FMDMO is evaluated comprehensively in three scenarios. The experimental results demonstrate the FMDMO substantially enhances the model's cross-platform detection performance while preserving effective training and flexibility.  相似文献   

3.
传统的机器学习算法无法有效地从海量的行为特征中选择出有本质的行为特征来对未知的Android恶意应用进行检测。为了解决这个问题,提出DBNSel,一种基于深度信念网络模型的Android恶意应用检测方法。为了实现该方法,首先通过静态分析方法从Android应用中提取5类不同的属性。其次,建立深度信念网络模型从提取到的属性中进行选择和学习。最后,使用学习到的属性来对未知类型的Android恶意应用进行检测。在实验阶段,使用一个由3 986个Android正常应用和3 986个Android恶意应用组成的数据集来验证DBNSel的有效性。实验结果表明,DBNSel的检测结果要优于其他几种已有的检测方法,并可以达到99.4%的检测准确率。此外,DBNSel具有较低的运行开销,可以适应于更大规模的真实环境下的Android恶意应用检测。  相似文献   

4.

A lot of malicious applications appears every day, threatening numerous users. Therefore, a surge of studies have been conducted to protect users from newly emerging malware by using machine learning algorithms. Albeit existing machine or deep learning-based Android malware detection approaches achieve high accuracy by using a combination of multiple features, it is not possible to employ them on our mobile devices due to the high cost for using them. In this paper, we propose MAPAS, a malware detection system, that achieves high accuracy and adaptable usages of computing resources. MAPAS analyzes behaviors of malicious applications based on API call graphs of them by using convolution neural networks (CNN). However, MAPAS does not use a classifier model generated by CNN, it only utilizes CNN for discovering common features of API call graphs of malware. For efficiently detecting malware, MAPAS employs a lightweight classifier that calculates a similarity between API call graphs used for malicious activities and API call graphs of applications that are going to be classified. To demonstrate the effectiveness and efficiency of MAPAS, we implement a prototype and thoroughly evaluate it. And, we compare MAPAS with a state-of-the-art Android malware detection approach, MaMaDroid. Our evaluation results demonstrate that MAPAS can classify applications 145.8% faster and uses memory around ten times lower than MaMaDroid. Also, MAPAS achieves higher accuracy (91.27%) than MaMaDroid (84.99%) for detecting unknown malware. In addition, MAPAS can generally detect any type of malware with high accuracy.

  相似文献   

5.

As Android-based mobile devices become increasingly popular, malware detection on Android is very crucial nowadays. In this paper, a novel detection method based on deep learning is proposed to distinguish malware from trusted applications. Considering there is some semantic information in system call sequences as the natural language, we treat one system call sequence as a sentence in the language and construct a classifier based on the Long Short-Term Memory (LSTM) language model. In the classifier, at first two LSTM models are trained respectively by the system call sequences from malware and those from benign applications. Then according to these models, two similarity scores are computed. Finally, the classifier determines whether the application under analysis is malicious or trusted by the greater score. Thorough experiments show that our approach can achieve high efficiency and reach high recall of 96.6% with low false positive rate of 9.3%, which is better than the other methods.

  相似文献   

6.

With the recognition of free apps, Android has become the most widely used smartphone operating system these days and it naturally invited cyber-criminals to build malware-infected apps that can steal vital information from these devices. The most critical problem is to detect malware-infected apps and keep them out of Google play store. The vulnerability lies in the underlying permission model of Android apps. Consequently, it has become the responsibility of the app developers to precisely specify the permissions which are going to be demanded by the apps during their installation and execution time. In this study, we examine the permission-induced risk which begins by giving unnecessary permissions to these Android apps. The experimental work done in this research paper includes the development of an effective malware detection system which helps to determine and investigate the detective influence of numerous well-known and broadly used set of features for malware detection. To select best features from our collected features data set we implement ten distinct feature selection approaches. Further, we developed the malware detection model by utilizing LSSVM (Least Square Support Vector Machine) learning approach connected through three distinct kernel functions i.e., linear, radial basis and polynomial. Experiments were performed by using 2,00,000 distinct Android apps. Empirical result reveals that the model build by utilizing LSSVM with RBF (i.e., radial basis kernel function) named as FSdroid is able to detect 98.8% of malware when compared to distinct anti-virus scanners and also achieved 3% higher detection rate when compared to different frameworks or approaches proposed in the literature.

  相似文献   

7.
With computers and the Internet being essential in everyday life, malware poses serious and evolving threats to their security, making the detection of malware of utmost concern. Accordingly, there have been many researches on intelligent malware detection by applying data mining and machine learning techniques. Though great results have been achieved with these methods, most of them are built on shallow learning architectures. Due to its superior ability in feature learning through multilayer deep architecture, deep learning is starting to be leveraged in industrial and academic research for different applications. In this paper, based on the Windows application programming interface calls extracted from the portable executable files, we study how a deep learning architecture can be designed for intelligent malware detection. We propose a heterogeneous deep learning framework composed of an AutoEncoder stacked up with multilayer restricted Boltzmann machines and a layer of associative memory to detect newly unknown malware. The proposed deep learning model performs as a greedy layer-wise training operation for unsupervised feature learning, followed by supervised parameter fine-tuning. Different from the existing works which only made use of the files with class labels (either malicious or benign) during the training phase, we utilize both labeled and unlabeled file samples to pre-train multiple layers in the heterogeneous deep learning framework from bottom to up for feature learning. A comprehensive experimental study on a real and large file collection from Comodo Cloud Security Center is performed to compare various malware detection approaches. Promising experimental results demonstrate that our proposed deep learning framework can further improve the overall performance in malware detection compared with traditional shallow learning methods, deep learning methods with homogeneous framework, and other existing anti-malware scanners. The proposed heterogeneous deep learning framework can also be readily applied to other malware detection tasks.  相似文献   

8.
田志成  张伟哲  乔延臣  刘洋 《软件学报》2023,34(4):1926-1943
深度学习已经逐渐应用于恶意代码检测并取得了不错的效果.然而,最近的研究表明:深度学习模型自身存在不安全因素,容易遭受对抗样本攻击.在不改变恶意代码原有功能的前提下,攻击者通过对恶意代码做少量修改,可以误导恶意代码检测器做出错误的决策,造成恶意代码的漏报.为防御对抗样本攻击,已有的研究工作中最常用的方法是对抗训练.然而对抗训练方法需要生成大量对抗样本加入训练集中重新训练模型,效率较低,并且防御效果受限于训练中所使用的对抗样本生成方法.为此,提出一种PE文件格式恶意代码对抗样本检测方法,针对在程序功能无关区域添加修改的一类对抗样本攻击,利用模型解释技术提取端到端恶意代码检测模型的决策依据作为特征,进而通过异常检测方法准确识别对抗样本.该方法作为恶意代码检测模型的附加模块,不需要对原有模型做修改,相较于对抗训练等其他防御方法效率更高,且具有更强的泛化能力,能够防御多种对抗样本攻击.在真实的恶意代码数据集上进行了实验,实验结果表明,该方法能够有效防御针对端到端PE文件恶意代码检测模型的对抗样本攻击.  相似文献   

9.
Recently, transforming windows files into images and its analysis using machine learning and deep learning have been considered as a state-of-the art works for malware detection and classification. This is mainly due to the fact that image-based malware detection and classification is platform independent, and the recent surge of success of deep learning model performance in image classification. Literature survey shows that convolutional neural network (CNN) deep learning methods are successfully employed for image-based windows malware classification. However, the malwares were embedded in a tiny portion in the overall image representation. Identifying and locating these affected tiny portions is important to achieve a good malware classification accuracy. In this work, a multi-headed attention based approach is integrated to a CNN to locate and identify the tiny infected regions in the overall image. A detailed investigation and analysis of the proposed method was done on a malware image dataset. The performance of the proposed multi-headed attention-based CNN approach was compared with various non-attention-CNN-based approaches on various data splits of training and testing malware image benchmark dataset. In all the data-splits, the attention-based CNN method outperformed non-attention-based CNN methods while ensuring computational efficiency. Most importantly, most of the methods show consistent performance on all the data splits of training and testing and that illuminates multi-headed attention with CNN model's generalizability to perform on the diverse datasets. With less number of trainable parameters, the proposed method has achieved an accuracy of 99% to classify the 25 malware families and performed better than the existing non-attention based methods. The proposed method can be applied on any operating system and it has the capability to detect packed malware, metamorphic malware, obfuscated malware, malware family variants, and polymorphic malware. In addition, the proposed method is malware file agnostic and avoids usual methods such as disassembly, de-compiling, de-obfuscation, or execution of the malware binary in a virtual environment in detecting malware and classifying malware into their malware family.  相似文献   

10.
With the increasing market share of Mac OS X operating system, there is a corresponding increase in the number of malicious programs (malware) designed to exploit vulnerabilities on Mac OS X platforms. However, existing manual and heuristic OS X malware detection techniques are not capable of coping with such a high rate of malware. While machine learning techniques offer promising results in automated detection of Windows and Android malware, there have been limited efforts in extending them to OS X malware detection. In this paper, we propose a supervised machine learning model. The model applies kernel base Support Vector Machine and a novel weighting measure based on application library calls to detect OS X malware. For training and evaluating the model, a dataset with a combination of 152 malware and 450 benign were created. Using common supervised Machine Learning algorithm on the dataset, we obtain over 91% detection accuracy with 3.9% false alarm rate. We also utilize Synthetic Minority Over-sampling Technique (SMOTE) to create three synthetic datasets with different distributions based on the refined version of collected dataset to investigate impact of different sample sizes on accuracy of malware detection. Using SMOTE datasets we could achieve over 96% detection accuracy and false alarm of less than 4%. All malware classification experiments are tested using cross validation technique. Our results reflect that increasing sample size in synthetic datasets has direct positive effect on detection accuracy while increases false alarm rate in compare to the original dataset.  相似文献   

11.
恶意代码分类是恶意代码分析和入侵检测领域中的核心问题.现有分类方法分析效率低,准确性差,主要原因在于行为分析原始资料规模大,噪声高,随机因素干扰.针对上述问题,以恶意代码行为序列报告作为基础,在分析随机因素及行为噪声对恶意代码行为特征和操作相似性的干扰之后,给出一个系统调用参数有效窗口模型,通过该模型加强行为序列的相似度描述能力,降低随机因素的干扰.在此基础上提出一种基于朴素贝叶斯机器学习模型和操作相似度窗口的恶意代码自动分类方法.设计并实现了一个自动恶意代码行为分类器原型MalwareFilter.使用真实恶意代码生成的行为序列报告对原型系统进行评估,通过实验证明了该方法的有效性,结果表明,该方法通过操作相似度窗口提高了训练和分类过程的性能和准确度.  相似文献   

12.
近年来,低级别微结构特征已被广泛应用于恶意软件检测。但是,微结构特征数据通常包含大量的冗余信息,且目前的检测方法并没有对输入微结构数据进行有效地预处理,这就造成恶意软件检测需要依赖于复杂的深度学习模型才能获得较高的检测性能。然而,深度学习检测模型参数量较大,难以在计算机底层得到实际应用。为了解决上述问题,本文提出了一种新颖的动态分析方法来检测恶意软件。首先,该方法创建了一个自动微结构特征收集系统,并从收集的通用寄存器(General-Purpose Registers, GPRs)数据中随机抽取子样本作为分类特征矩阵。相比于其他微结构特征, GPRs特征具有更丰富的行为特征信息,但也包含更多的噪声信息。因此,需要对GPRs数据进行特征区间分割,以降低数据复杂度并抑制噪声。本文随后采用词频-逆文档频率(Term Frequency-Inverse Document Frequency, TF-IDF)技术从抽取的特征矩阵中选择最具区分性的信息来进行恶意软件检测。TF-IDF技术可以有效降低特征矩阵的维度,从而提高检测效率。为了降低模型复杂度,并保证检测方法的性能,本文利用集成学习模型来识...  相似文献   

13.
随着安卓恶意软件数量的快速增长,传统的恶意软件检测与分类机制存在检测率低、训练模型复杂度高等问题。为解决上述问题,结合图像纹理特征提取技术和机器学习分类器,提出基于灰度图纹理特征的恶意软件分类方法。该方法首先将恶意软件样本生成灰度图,设计并集成了包含GIST和Tamura特征提取算法在内的4种特征提取方法;然后将所得纹理特征集合作为源数据,基于Caffe高性能处理架构构造了5种分类学习模型,最终实现对恶意软件的检测和分类。实验结果表明,基于图像纹理特征的恶意软件分类具有较高的准确率,且Caffe架构能有效缩短学习时间,降低复杂度。  相似文献   

14.
传统机器学习在恶意软件分析上需要复杂的特征工程,不适用于大规模的恶意软件分析。为提高在Android恶意软件上的检测效率,将Android恶意软件字节码文件映射成灰阶图像,综合利用深度可分离卷积(depthwise separable convolution,DSC)和注意力机制提出基于全局注意力模块(GCBAM)的Android恶意软件分类模型。从APK文件中提取字节码文件,将字节码文件转换为对应的灰阶图像,通过构建基于GCBAM的分类模型对图像数据集进行训练,使其具有Android恶意软件分类能力。实验表明,该模型对Android恶意软件家族能有效分类,在获取的7 630个样本上,分类准确率达到98.91%,相比机器学习算法在准确率、召回率等均具有较优效果。  相似文献   

15.

Each year, a huge number of malicious programs are released which causes malware detection to become a critical task in computer security. Antiviruses use various methods for detecting malware, such as signature-based and heuristic-based techniques. Polymorphic and metamorphic malwares employ obfuscation techniques to bypass traditional detection methods used by antiviruses. Recently, the number of these malware has increased dramatically. Most of the previously proposed methods to detect malware are based on high-level features such as opcodes, function calls or program’s control flow graph (CFG). Due to new obfuscation techniques, extracting high-level features is tough, fallible and time-consuming; hence approaches using program’s bytes are quicker and more accurate. In this paper, a novel byte-level method for detecting malware by audio signal processing techniques is presented. In our proposed method, program’s bytes are converted to a meaningful audio signal, then Music Information Retrieval (MIR) techniques are employed to construct a machine learning music classification model from audio signals to detect new and unseen instances. Experiments evaluate the influence of different strategies converting bytes to audio signals and the effectiveness of the method.

  相似文献   

16.
Zhu  Hui-Juan  Jiang  Tong-Hai  Ma  Bo  You  Zhu-Hong  Shi  Wei-Lei  Cheng  Li 《Neural computing & applications》2018,30(11):3353-3361

Mobile phones are rapidly becoming the most widespread and popular form of communication; thus, they are also the most important attack target of malware. The amount of malware in mobile phones is increasing exponentially and poses a serious security threat. Google’s Android is the most popular smart phone platforms in the world and the mechanisms of permission declaration access control cannot identify the malware. In this paper, we proposed an ensemble machine learning system for the detection of malware on Android devices. More specifically, four groups of features including permissions, monitoring system events, sensitive API and permission rate are extracted to characterize each Android application (app). Then an ensemble random forest classifier is learned to detect whether an app is potentially malicious or not. The performance of our proposed method is evaluated on the actual data set using tenfold cross-validation. The experimental results demonstrate that the proposed method can achieve a highly accuracy of 89.91%. For further assessing the performance of our method, we compared it with the state-of-the-art support vector machine classifier. Comparison results demonstrate that the proposed method is extremely promising and could provide a cost-effective alternative for Android malware detection.

  相似文献   

17.
对于传统的恶意程序检测方法存在的缺点,针对将数据挖掘和机器学习算法被应用在未知恶意程序的检测方法进行研究。当前使用单一特征的机器学习算法无法充分发挥其数据处理能力,检测效果不佳。文中将语音识别模型与随机森林算法相结合,首次提出了综和APK文件多类特征统一建立N-gram模型,并应用随机森林算法用于未知恶意程序检测。首先,采用多种方式提取可以反映Android恶意程序行为的3类特征,包括敏感权限、DVM函数调用序列以及OpCodes特征;然后,针对每类特征建立N-gram模型,每个模型可以独立评判恶意程序行为;最后,3类特征模型统一加入随机森林算法进行学习,从而对Android程序进行检测。基于该方法实现了Android恶意程序检测系统,并对811个非恶意程序及826个恶意程序进行检测,准确率较高。综合各个评价指标,与其他相关工作对比,实验结果表明该系统在恶意程序检测准确率和有效性上表现更优。  相似文献   

18.

The acceptance and widespread use of the Android operating system drew the attention of both legitimate developers and malware authors, which resulted in a significant number of benign and malicious applications available on various online markets. Since the signature-based methods fall short for detecting malicious software effectively considering the vast number of applications, machine learning techniques in this field have also become widespread. In this context, stating the acquired accuracy values in the contingency tables in malware detection studies has become a popular and efficient method and enabled researchers to evaluate their methodologies comparatively. In this study, we wanted to investigate and emphasize the factors that may affect the accuracy values of the models managed by researchers, particularly the disassembly method and the input data characteristics. Firstly, we developed a model that tackles the malware detection problem from a Natural Language Processing (NLP) perspective using Long Short-Term Memory (LSTM). Then, we experimented with different base units (instruction, basic block, method, and class) and representations of source code obtained from three commonly used disassembling tools (JEB, IDA, and Apktool) and examined the results. Our findings exhibit that the disassembly method and different input representations affect the model results. More specifically, the datasets collected by the Apktool achieved better results compared to the other two disassemblers.

  相似文献   

19.
恶意软件的爆炸性增长,以及对用户机和网络环境造成的严重威胁,逐渐成为了网络空间安全领域的主要矛盾。当前传统的基于特征码的静态扫描技术和基于软件行为的恶意软件检测技术容易产生误报和漏报,渐渐无法满足信息安全领域的新要求。为了解决这些问题,提出基于卷积神经网络CNN的恶意代码检测技术。利用Cuckoo沙箱系统来模拟运行环境并提取分析报告;通过编写Python脚本对分析报告进行预处理;搭建深度学习CNN训练模型来实现对恶意代码的检测,并将其与机器学习以及常见的杀毒软件进行比较。实验结果表明,该方法在相比之下更具有优势,并且取得了较好的检测效果,具有更高的可行性。  相似文献   

20.
首先介绍了安全传输层(TLS,transport layer security)协议的特点、流量识别方法;然后给出了一种基于机器学习的分布式自动化的恶意加密流量检测体系;进而从 TLS 特征、数据元特征、上下文数据特征3个方面分析了恶意加密流量的特征;最后,通过实验对几种常见机器学习算法的性能进行对比,实现了对恶意加密流量的高效检测。  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号