首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
Due to its damage to Internet security, malware (e.g., virus, worm, trojan) and its detection has caught the attention of both anti-malware industry and researchers for decades. To protect legitimate users from the attacks, the most significant line of defense against malware is anti-malware software products, which mainly use signature-based method for detection. However, this method fails to recognize new, unseen malicious executables. To solve this problem, in this paper, based on the instruction sequences extracted from the file sample set, we propose an effective sequence mining algorithm to discover malicious sequential patterns, and then All-Nearest-Neighbor (ANN) classifier is constructed for malware detection based on the discovered patterns. The developed data mining framework composed of the proposed sequential pattern mining method and ANN classifier can well characterize the malicious patterns from the collected file sample set to effectively detect newly unseen malware samples. A comprehensive experimental study on a real data collection is performed to evaluate our detection framework. Promising experimental results show that our framework outperforms other alternate data mining based detection methods in identifying new malicious executables.  相似文献   

2.
Malicious executables are programs designed to infiltrate or damage a computer system without the owner’s consent, which have become a serious threat to the security of computer systems. There is an urgent need for effective techniques to detect polymorphic, metamorphic and previously unseen malicious executables of which detection fails in most of the commercial anti-virus software. In this paper, we develop interpretable string based malware detection system (SBMDS), which is based on interpretable string analysis and uses support vector machine (SVM) ensemble with Bagging to classify the file samples and predict the exact types of the malware. Interpretable strings contain both application programming interface (API) execution calls and important semantic strings reflecting an attacker’s intent and goal. Our SBMDS is carried out with four major steps: (1) first constructing the interpretable strings by developing a feature parser; (2) performing feature selection to select informative strings related to different types of malware; (3) followed by using SVM ensemble with bagging to construct the classifier; (4) and finally conducting the malware detector, which not only can detect whether a program is malicious or not, but also can predict the exact type of the malware. Our case study on the large collection of file samples collected by Kingsoft Anti-virus lab illustrate that: (1) The accuracy and efficiency of our SBMDS outperform several popular anti-virus software; (2) Based on the signatures of interpretable strings, our SBMDS outperforms data mining based detection systems which employ single SVM, Naive Bayes with bagging, Decision Trees with bagging; (3) Compared with the IMDS which utilizes the objective-oriented association (OOA) based classification on API calls, our SBMDS achieves better performance. Our SBMDS system has already been incorporated into the scanning tool of a commercial anti-virus software.  相似文献   

3.
谢丽霞  李爽 《计算机应用》2018,38(3):818-823
针对Android恶意软件检测中数据不平衡导致检出率低的问题,提出一种基于Bagging-SVM(支持向量机)集成算法的Android恶意软件检测模型。首先,提取AndroidManifest.xml文件中的权限信息、意图信息和组件信息作为特征;然后,提出IG-ReliefF混合筛选算法用于数据集降维,采用bootstrap抽样构造多个平衡数据集;最后,采用平衡数据集训练基于Bagging算法的SVM集成分类器,通过该分类器完成Android恶意软件检测。在分类检测实验中,当良性样本和恶意样本数量平衡时,Bagging-SVM和随机森林算法检出率均高达99.4%;当良性样本和恶意样本的数量比为4:1时,相比随机森林和AdaBoost算法,Bagging-SVM算法在检测精度不降低的条件下,检出率提高了6.6%。实验结果表明所提模型在数据不平衡时仍具有较高的检出率和分类精度,可检测出绝大多数恶意软件。  相似文献   

4.

In this study we represent malware as opcode sequences and detect it using a deep belief network (DBN). Compared with traditional shallow neural networks, DBNs can use unlabeled data to pretrain a multi-layer generative model, which can better represent the characteristics of data samples. We compare the performance of DBNs with that of three baseline malware detection models, which use support vector machines, decision trees, and the k-nearest neighbor algorithm as classifiers. The experiments demonstrate that the DBN model provides more accurate detection than the baseline models. When additional unlabeled data are used for DBN pretraining, the DBNs perform better than the other detection models. We also use the DBNs as an autoencoder to extract the feature vectors of executables. The experiments indicate that the autoencoder can effectively model the underlying structure of input data and significantly reduce the dimensions of feature vectors.

  相似文献   

5.
针对现有恶意程序行为特征检测存在的不足,采用多轨迹检测方法,用文件操作、网络访问、内存资源访问的行为特征构建出三维恶意行为特征库。在构造投影数据库的过程中,结合AC自动机优化频繁序列查询,舍去不满足最小长度的频繁序列,得到改进的数据挖掘算法——Prefixspan-x,并将其应用于动态提取恶意软件行为特征库和阈值匹配,以克服静态反汇编方式获取软件行为轨迹时软件加壳、混淆带来的检测困难。实验结果表明,基于数据挖掘的多轨迹特征检测技术具有较高的准确率和较低的漏报率。  相似文献   

6.
荣俸萍  方勇  左政  刘亮 《计算机科学》2018,45(5):131-138
基于动态分析的恶意代码检测方法由于能有效对抗恶意代码的多态和代码混淆技术,而且可以检测新的未知恶意代码等,因此得到了研究者的青睐。在这种情况下,恶意代码的编写者通过在恶意代码中嵌入大量反检测功能来逃避现有恶意代码动态检测方法的检测。针对该问题,提出了基于恶意API调用序列模式挖掘的恶意代码检测方法MACSPMD。首先,使用真机模拟恶意代码的实际运行环境来获取文件的动态API调用序列;其次,引入面向目标关联挖掘的概念,以挖掘出能够代表潜在恶意行为模式的恶意API调用序列模式;最后,将挖掘到的恶意API调用序列模式作为异常行为特征进行恶意代码的检测。基于真实数据集的实验结果表明,MACSPMD对未知和逃避型恶意代码进行检测的准确率分别达到了94.55%和97.73%,比其他基于API调用数据的恶意代码检测方法 的准确率分别提高了2.47%和2.66%,且挖掘过程消耗的时间更少。因此,MACSPMD能有效检测包括逃避型在内的已知和未知恶意代码。  相似文献   

7.
This paper presents a supervised methodology that detects malware based on positive selection. Malware detection is a challenging problem due to the rapid growth of the number of malware and increasing complexity. Run-time monitoring of program execution behavior is widely used to discriminate between benign and malicious executables due to its effectiveness and robustness. This paper proposes a novel classification algorithm based on the idea of positive selection, which is one of the important algorithms in Artificial Immune Systems (AIS), inspired by positive selection of T-cells. The proposed algorithm is applied to learn and classify program behavior based on I/O Request Packets (IRP). In our experiments, the proposed algorithm outperforms ANSC, Na? ve Bayes, Bayesian Networks, Support Vector Machine, and C4.5 Decision Tree. This algorithm can also be used in general purpose classification problems not just two-class but multi-class problems.  相似文献   

8.
Behavior‐based detection and signature‐based detection are two popular approaches to malware (malicious software) analysis. The security industry, such as the sector selling antivirus tools, has been using signature and heuristic‐based technologies for years. However, this approach has been proven to be inefficient in identifying unknown malware strains. On the other hand, the behavior‐based malware detection approach has a greater potential in identifying previously unknown instances of malicious software. The accuracy of this approach relies on techniques to profile and recognize accurate behavior models. Unfortunately, with the increasing complexity of malicious software and limitations of existing automatic tools, the current behavior‐based approach cannot discover many newer forms of malware either. In this paper, we implement ‘holography platform’, a behavior‐based profiler on top of a virtual machine emulator that intercepts the system processes and analyzes the CPU instructions, CPU registers, and memory. The captured information is stored in a relational database, and data mining techniques are used to extract information. We demonstrate the breadth of the ‘holography platform’ by conducting two experiments: a packed binary behavior analysis and a malvertising (malicious advertising) incident tracing. Both tasks are known to be very difficult to do efficiently using existing methods and tools. We demonstrate how the precise behavior information can be easily obtained using the ‘holography platform’ tool. With these two experiments, we show that the ‘holography platform’ can provide security researchers and automatic malware detection systems with an efficient malicious software behavior analysis solution. Copyright © 2011 John Wiley & Sons, Ltd.  相似文献   

9.
由于智能手机使用率持续上升促使移动恶意软件在规模和复杂性方面发展更加迅速。作为免费和开源的系统,目前Android已经超越其他移动平台成为最流行的操作系统,使得针对Android平台的恶意软件数量也显著增加。针对Android平台应用软件安全问题,提出了一种基于多特征协作决策的Android恶意软件检测方法,该方法主要通过对Android 应用程序进行分析、提取特征属性以及根据机器学习模型和分类算法判断其是否为恶意软件。通过实验表明,使用该方法对Android应用软件数据集进行分类后,相比其他分类器或算法分类的结果,其各项评估指标均大幅提高。因此,提出的基于多特征协作决策的方式来对Android恶意软件进行检测的方法可以有效地用于对未知应用的恶意性进行检测,避免恶意应用对用户所造成的损害等。  相似文献   

10.
近些年来,层出不穷的恶意软件对系统安全构成了严重的威胁并造成巨大的经济损失,研究者提出了许多恶意软件检测方案。但恶意软件开发中常利用加壳和多态等混淆技术,这使得传统的静态检测方案如静态特征匹配不足以应对。而传统的应用层动态检测方法也存在易被恶意软件禁用或绕过的缺点。本文提出一种利用底层数据流关系进行恶意软件检测的方法,即在系统底层监视程序运行时的数据传递情况,生成数据流图,提取图的特征形成特征向量,使用特征向量衡量数据流图的相似性,评估程序行为的恶意倾向,以达到快速检测恶意软件的目的。该方法具有低复杂度与高检测效率的特点。实验结果表明本文提出的恶意软件检测方法可达到较高的检测精度以及较低的误报率,分别为98.50%及3.18%。  相似文献   

11.
Android操作系统是市场占有率最高的移动操作系统,基于Android平台的恶意软件也呈现爆发式的增长,而目前仍然没有有效的手段进行Android恶意行为的检测,通过分析Android恶意行为的特点,采用基于贝叶斯网络的机器学习算法进行Android恶意行为的检测,通过静态分析的方法进行Android文件静态特征的提取,将Android恶意应用的静态分析与贝叶斯网络相结合,最后通过使用提出的方法构建贝叶斯网络模型,通过实验验证了提出的Android恶意行为检测模型的有效性。  相似文献   

12.
针对基于特征码的恶意代码检测方法无法应对混淆变形技术的问题,提出基于关键应用编程接口(API)图的检测方法。通过提取恶意代码控制流图中含关键API调用的节点,将恶意行为抽象成关键API图,采用子图匹配的方法判定可疑程序的恶意度。实验结果证明,该方法能有效检测恶意代码变体,漏报率较低。  相似文献   

13.
Recently malicious code is spreading rapidly due to the use of P2P(peer to peer) file sharing. The malicious code distributed mostly transformed the infected PC as a botnet for various attacks by attackers. This can take important information from the computer and cause a large-scale DDos attack. Therefore it is extremely important to detect and block the malicious code in early stage. However a centralized security monitoring system widely used today cannot detect a sharing file on a P2P network. In this paper, to compensate the defect, P2P file sharing events are obtained and the behavior is analyzed. Based on the analysis a malicious file detecting system is proposed and synchronized with a security monitoring system on a virtual machine. In application result, it has been detected such as botnet malware using P2P. It is improved by 12 % performance than existing security monitoring system. The proposed system can detect suspicious P2P sharing files that were not possible by an existing system. The characteristics can be applied for security monitoring to block and respond to the distribution of malicious code through P2P.  相似文献   

14.
针对当前Android恶意程序检测方法对未知应用程序检测能力不足的问题,提出了一种基于textCNN神经网络模型的Android恶意程序检测方法.该方法使用多种触发机制从不同层面上诱导激发程序潜在的恶意行为;针对不同层面上的函数调用,采用特定的hook技术对程序行为进行采集;针对采集到的行为日志,使用fastText算...  相似文献   

15.
With computers and the Internet being essential in everyday life, malware poses serious and evolving threats to their security, making the detection of malware of utmost concern. Accordingly, there have been many researches on intelligent malware detection by applying data mining and machine learning techniques. Though great results have been achieved with these methods, most of them are built on shallow learning architectures. Due to its superior ability in feature learning through multilayer deep architecture, deep learning is starting to be leveraged in industrial and academic research for different applications. In this paper, based on the Windows application programming interface calls extracted from the portable executable files, we study how a deep learning architecture can be designed for intelligent malware detection. We propose a heterogeneous deep learning framework composed of an AutoEncoder stacked up with multilayer restricted Boltzmann machines and a layer of associative memory to detect newly unknown malware. The proposed deep learning model performs as a greedy layer-wise training operation for unsupervised feature learning, followed by supervised parameter fine-tuning. Different from the existing works which only made use of the files with class labels (either malicious or benign) during the training phase, we utilize both labeled and unlabeled file samples to pre-train multiple layers in the heterogeneous deep learning framework from bottom to up for feature learning. A comprehensive experimental study on a real and large file collection from Comodo Cloud Security Center is performed to compare various malware detection approaches. Promising experimental results demonstrate that our proposed deep learning framework can further improve the overall performance in malware detection compared with traditional shallow learning methods, deep learning methods with homogeneous framework, and other existing anti-malware scanners. The proposed heterogeneous deep learning framework can also be readily applied to other malware detection tasks.  相似文献   

16.
Dynamic Analysis of Malicious Code   总被引:2,自引:0,他引:2  
Malware analysis is the process of determining the purpose and functionality of a given malware sample (such as a virus, worm, or Trojan horse). This process is a necessary step to be able to develop effective detection techniques for malicious code. In addition, it is an important prerequisite for the development of removal tools that can thoroughly delete malware from an infected machine. Traditionally, malware analysis has been a manual process that is tedious and time-intensive. Unfortunately, the number of samples that need to be analyzed by security vendors on a daily basis is constantly increasing. This clearly reveals the need for tools that automate and simplify parts of the analysis process. In this paper, we present TTAnalyze, a tool for dynamically analyzing the behavior of Windows executables. To this end, the binary is run in an emulated operating system environment and its (security-relevant) actions are monitored. In particular, we record the Windows native system calls and Windows API functions that the program invokes. One important feature of our system is that it does not modify the program that it executes (e.g., through API call hooking or breakpoints), making it more difficult to detect by malicious code. Also, our tool runs binaries in an unmodified Windows environment, which leads to excellent emulation accuracy. These factors make TTAnalyze an ideal tool for quickly understanding the behavior of an unknown malware.  相似文献   

17.
In this paper, we propose two joint network-host based anomaly detection techniques that detect self-propagating malware in real-time by observing deviations from a behavioral model derived from a benign data profile. The proposed malware detection techniques employ perturbations in the distribution of keystrokes that are used to initiate network sessions. We show that the keystrokes’ entropy increases and the session-keystroke mutual information decreases when an endpoint is compromised by a self-propagating malware. These two types of perturbations are used for real-time malware detection. The proposed malware detection techniques are further compared with three prominent anomaly detectors, namely the maximum entropy detector, the rate limiting detector and the credit-based threshold random walk detector. We show that the proposed detectors provide considerably higher accuracy with almost 100% detection rates and very low false alarm rates.  相似文献   

18.
Malware classification based on call graph clustering   总被引:1,自引:0,他引:1  
Each day, anti-virus companies receive tens of thousands samples of potentially harmful executables. Many of the malicious samples are variations of previously encountered malware, created by their authors to evade pattern-based detection. Dealing with these large amounts of data requires robust, automatic detection approaches. This paper studies malware classification based on call graph clustering. By representing malware samples as call graphs, it is possible to abstract certain variations away, enabling the detection of structural similarities between samples. The ability to cluster similar samples together will make more generic detection techniques possible, thereby targeting the commonalities of the samples within a cluster. To compare call graphs mutually, we compute pairwise graph similarity scores via graph matchings which approximately minimize the graph edit distance. Next, to facilitate the discovery of similar malware samples, we employ several clustering algorithms, including k-medoids and Density-Based Spatial Clustering of Applications with Noise (DBSCAN). Clustering experiments are conducted on a collection of real malware samples, and the results are evaluated against manual classifications provided by human malware analysts. Experiments show that it is indeed possible to accurately detect malware families via call graph clustering. We anticipate that in the future, call graphs can be used to analyse the emergence of new malware families, and ultimately to automate implementation of generic detection schemes.  相似文献   

19.
Statistical detection of mass malware has been shown to be highly successful. However, this type of malware is less interesting to cyber security officers of larger organizations, who are more concerned with detecting malware indicative of a targeted attack. Here we investigate the potential of statistically based approaches to detect such malware using a malware family associated with a large number of targeted network intrusions. Our approach is complementary to the bulk of statistical based malware classifiers, which are typically based on measures of overall similarity between executable files. One problem with this approach is that a malicious executable that shares some, but limited, functionality with known malware is likely to be misclassified as benign. Here a new approach to malware classification is introduced that classifies programs based on their similarity with known malware subroutines. It is illustrated that malware and benign programs can share a substantial amount of code, implying that classification should be based on malicious subroutines that occur infrequently, or not at all in benign programs. Various approaches to accomplishing this task are investigated, and a particularly simple approach appears the most effective. This approach simply computes the fraction of subroutines of a program that are similar to malware subroutines whose likes have not been found in a larger benign set. If this fraction exceeds around 1.5 %, the corresponding program can be classified as malicious at a 1 in 1000 false alarm rate. It is further shown that combining a local and overall similarity based approach can lead to considerably better prediction due to the relatively low correlation of their predictions.  相似文献   

20.
We present a scalable and multi-level feature extraction technique to detect malicious executables. We propose a novel combination of three different kinds of features at different levels of abstraction. These are binary n-grams, assembly instruction sequences, and Dynamic Link Library (DLL) function calls; extracted from binary executables, disassembled executables, and executable headers, respectively. We also propose an efficient and scalable feature extraction technique, and apply this technique on a large corpus of real benign and malicious executables. The above mentioned features are extracted from the corpus data and a classifier is trained, which achieves high accuracy and low false positive rate in detecting malicious executables. Our approach is knowledge-based because of several reasons. First, we apply the knowledge obtained from the binary n-gram features to extract assembly instruction sequences using our Assembly Feature Retrieval algorithm. Second, we apply the statistical knowledge obtained during feature extraction to select the best features, and to build a classification model. Our model is compared against other feature-based approaches for malicious code detection, and found to be more efficient in terms of detection accuracy and false alarm rate.
Bhavani Thuraisingham (Corresponding author)Email:
  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号