Similar literature
 Found 20 similar documents; search took 15 ms
1.
It is proposed that the complexity of a program is inversely proportional to the average information content of its operators. An empirical probability distribution of the operators occurring in a program is constructed, and the classical entropy calculation is applied. The performance of the resulting metric is assessed in the analysis of two commercial applications totaling well over 130,000 lines of code. The results indicate that the new metric does a good job of associating modules with their error spans (the average number of tokens between error occurrences).
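
The "classical entropy calculation" over an empirical operator distribution can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `operator_entropy` is hypothetical, and the abstract does not say how operators are extracted from source code, so the sketch assumes the caller already has a list of operator tokens.

```python
import math
from collections import Counter


def operator_entropy(operators):
    """Shannon entropy, in bits per operator, of the empirical operator distribution.

    `operators` is any iterable of operator tokens (an assumption: the
    tokenization step is outside the scope of this sketch).
    """
    counts = Counter(operators)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # H = -sum(p_i * log2(p_i)), with p_i the relative frequency of operator i.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    print(operator_entropy(["+", "+", "-", "-"]))
    print(operator_entropy(["=", "=", "=", "="]))
```

A program that uses one operator throughout has zero entropy, while a program whose operators are evenly spread has the maximum entropy for that operator set.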

2.
Metamorphic malware is capable of changing its internal structure without altering its functionality. A common signature is nonexistent in highly metamorphic malware and, consequently, such malware can remain undetected under standard signature scanning. In this paper, we apply previous work on structural entropy to the metamorphic detection problem. This technique relies on an analysis of variations in the complexity of data within a file. The process consists of two stages, namely, file segmentation and sequence comparison. In the segmentation stage, we use entropy measurements and wavelet analysis to segment files. The second stage measures the similarity of file pairs by computing an edit distance between the sequences of segments obtained in the first stage. We apply this similarity measure to the metamorphic detection problem and show that we obtain strong results in certain challenging cases.
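
The second stage, comparing two sequences of segments by edit distance, can be illustrated with the textbook Levenshtein distance. This is only a sketch under assumptions: the abstract does not give the paper's cost function or segment representation, so unit costs are used here, segments are reduced to hypothetical labels such as "low" and "high", and the name `edit_distance` is illustrative.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences with unit costs.

    Works on any sequences whose items support ==, e.g. lists of
    segment labels (the labels themselves are an assumption).
    """
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if x == y else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    file_a = ["low", "high", "low"]
    file_b = ["low", "low"]
    print(edit_distance(file_a, file_b))
```

A smaller distance between two files' segment sequences indicates a more similar structure.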

3.
An entropy-based uncertainty measure of process models   Cited by: 1 (self-citations: 0, citations by others: 1)
In managing business processes, process uncertainty and variability are significant factors that make prediction and decision making difficult, which increases the need for process measures that support systematic analysis. We propose an entropy-based process measure to quantify the uncertainty of business process models. The proposed measure captures the dynamic behavior of processes, in contrast to previous work, which focused on measures for the static aspects of process models.

4.

Metamorphic malware changes its internal code structure by applying code obfuscation techniques while maintaining its malicious functionality during each infection. This changes its signature pattern from one infection to the next and makes signature-based detection particularly difficult. In this paper, through static analysis, we use a similarity score from a matrix factorization technique called Nonnegative Matrix Factorization to detect challenging metamorphic malware. We apply this technique using structural compression ratio and entropy features and compare our results with previous eigenvector-based techniques. Experimental results from three malware datasets show that this is a promising technique, as the detection accuracy is more than 95%.


5.
The Journal of Supercomputing - Malware uses a variety of anti-reverse engineering techniques, which makes its analysis difficult. Dynamic analysis tools, e.g., debuggers, DBI (Dynamic Binary...

6.
We present an overview of the latest developments in the detection of metamorphic and virtualization-based malware using an algebraic specification of the Intel 64 assembly programming language. After giving an overview of related work, we describe the development of a specification of a subset of the Intel 64 instruction set in Maude, an advanced formal algebraic specification tool. We develop the technique of metamorphic malware detection based on equivalence-in-context so that it is applicable to imperative programming languages in general, and we give two detailed examples of how this might be used in a practical setting to detect metamorphic malware. We discuss the application of these techniques within anti-virus software, and give a proof-of-concept system for defeating detection counter-measures used by virtualization-based malware, which is based on our Maude specification of Intel 64. Finally, we compare formal and informal approaches to malware detection, and give some directions for future research.

7.
Normalized Compression Distance (NCD) is a popular tool that uses compression algorithms to cluster and classify data in a wide range of applications. Existing discussions of NCD’s theoretical merit rely on certain theoretical properties of compression algorithms. However, we demonstrate that many popular compression algorithms do not seem to satisfy these theoretical properties. We explore the relationship between some of these properties and file size, demonstrate that this theoretical problem is actually a practical problem for classifying malware with large file sizes, and propose some variants of NCD that mitigate this problem.
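
For reference, the standard NCD of two byte strings x and y is (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length. The sketch below computes it with Python's `zlib`; the choice of compressor and the names `compressed_size` and `ncd` are assumptions made for illustration, and the variants proposed in the paper are not shown.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (compression level 9)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Standard Normalized Compression Distance, using zlib as the compressor."""
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)  # compress the concatenation of both inputs
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    repetitive = b"abc" * 300
    varied = bytes((i * 197 + 13) % 256 for i in range(2000))
    print(ncd(repetitive, repetitive))  # small: the second copy adds little
    print(ncd(repetitive, varied))      # larger: the inputs share little structure
```

One reason file size can matter: zlib's DEFLATE format looks back at most 32 KiB, so when the inputs are much larger than that, the compressor cannot exploit similarity between x and y in the concatenation.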

8.
Metamorphic malware changes its internal structure with each generation, while maintaining its original behavior. Current commercial antivirus software generally scans for known malware signatures; therefore, it is not able to detect metamorphic malware that sufficiently morphs its internal structure. Machine learning methods such as hidden Markov models (HMM) have shown promise for detecting hacker-produced metamorphic malware. However, previous research has shown that it is possible to evade HMM-based detection by carefully morphing with content from benign files. In this paper, we combine HMM detection with a statistical technique based on the chi-squared test to build an improved detection method. We discuss our technique in detail and provide experimental evidence to support our claim of improved detection.

9.
To evade signature-based detection, metamorphic viruses transform their code before each new infection. Software similarity measures are a potentially useful means of detecting such malware. We can compare a given file to a known sample of metamorphic malware and compute their similarity; if they are sufficiently similar, we classify the file as malware of the same family. In this paper, we analyze an opcode-based software similarity measure inspired by simple substitution cipher cryptanalysis. We show that the technique provides a useful means of classifying metamorphic malware.

10.
In this paper, we propose a modified version of the k-nearest neighbor (kNN) algorithm. We first introduce a new affinity function, based on local learning, for measuring the distance between a test point and a training point. A new similarity function using this affinity function is then proposed for classifying the test patterns. The widely used convention for k, i.e., k = [√N], is employed, where N is the number of data points used for training. The proposed modified kNN algorithm is applied to fifteen numerical datasets from the UCI machine learning data repository. Both 5-fold and 10-fold cross-validation are used. The average classification accuracy obtained with our method is found to exceed that of some well-known clustering algorithms.
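
The k = [√N] convention can be shown with a plain majority-vote kNN. This sketch is the standard algorithm with Euclidean distance, not the modified version proposed in the paper, whose affinity and similarity functions are not given in the abstract. Reading [√N] as the integer part of √N is an assumption, and `knn_predict` is a hypothetical name.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query):
    """Classify `query` by majority vote among its k nearest training points."""
    n = len(train_points)
    # Assumption: [sqrt(N)] is read as the integer part of sqrt(N), at least 1.
    k = max(1, math.isqrt(n))
    # Rank training indices by Euclidean distance to the query point.
    order = sorted(range(n), key=lambda i: math.dist(train_points[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5), (6, 6), (1, 1), (5, 4)]
    labels = ["a", "a", "a", "b", "b", "b", "b", "a", "b"]
    print(knn_predict(points, labels, (0.2, 0.2)))  # N = 9, so k = 3
```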

11.
A method that measures the distance between extended objects of nonregular shape is presented. The distance measure is an average of a set of minimal point-to-point distances between the borders of the objects. The set of points is collected with a well-defined criterion based on processing of distance values on a connected medial axis formed between the objects.

12.
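An entropy-based discretization algorithm for continuous attributes   Cited by: 6 (self-citations: 0, citations by others: 6)
He Yue (贺跃), Zheng Jianjun (郑建军), Zhu Lei (朱蕾). Journal of Computer Applications (《计算机应用》), 2005, 25(3): 637-638
The key to discretizing continuous attributes is to determine the number and positions of the cut points sensibly. To improve the efficiency of unsupervised discretization, an entropy-based method for discretizing continuous attributes is presented. The method exploits the properties of the information content (entropy) of a continuous attribute: by partitioning the attribute's own values, it minimizes both the loss of information entropy and the number of intervals, and seeks the best balance between entropy loss and a moderate number of intervals in order to obtain optimized discrete values. Experiments show that the algorithm is effective.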

13.
The face is considered to be one of the biometrics for automatic person identification. The non-intrusive nature of face recognition makes it an attractive choice. For a face recognition system to be practical, it should be robust to variations in illumination, pose and expression, as humans recognize faces irrespective of all these variations. In this paper, an attempt to address these issues is made using a new Hausdorff distance-based measure. The proposed measure represents the gray values of pixels in face images as vectors giving the neighborhood intensity distribution of the pixels. The transformation is expected to be less sensitive to illumination variations while preserving the appearance of the face embedded in the original gray image. While the existing Hausdorff distance-based measures are defined between the binary edge images of faces, which contain primarily structural information, the proposed measure gives the dissimilarity between the appearance of faces. An efficient method to compute the proposed measure is presented. The performance of the method on benchmark face databases shows that it is robust to considerable variations in pose, expression and illumination. Comparison with some of the existing Hausdorff distance-based methods shows that the proposed method performs better in many cases.
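
As background for the measure described above, the classical Hausdorff distance between two finite point sets A and B is max(h(A, B), h(B, A)), where h(A, B) is the largest distance from a point of A to its nearest point in B. The sketch below implements only this classical definition with Euclidean distance; the paper's gray-value-based measure is not specified in the abstract and is not reproduced. The function names are illustrative.

```python
import math


def directed_hausdorff(a, b):
    """h(A, B): the largest distance from a point in A to its nearest point in B."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Classical (symmetric) Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


if __name__ == "__main__":
    edge_points_1 = [(0, 0), (1, 0)]
    edge_points_2 = [(0, 0), (4, 0)]
    print(hausdorff(edge_points_1, edge_points_2))
```

This brute-force form costs O(|A| * |B|) distance evaluations, which is acceptable for small point sets but slow for full-resolution edge maps.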

14.
A self-adaptive differential evolution algorithm that incorporates Pareto dominance to solve multi-objective optimization problems is presented. The proposed approach adopts an external elitist archive to retain non-dominated solutions found during the evolutionary process. In order to preserve the diversity of Pareto optimality, a crowding entropy diversity measure tactic is proposed. The crowding entropy strategy is able to measure the crowding degree of the solutions more accurately. The experiments were performed using eighteen benchmark test functions. The experimental results show that, compared with three other multi-objective optimization evolutionary algorithms, the proposed MOSADE is able to find a better spread of solutions with better convergence to the Pareto front and to preserve the diversity of Pareto optimal solutions more efficiently.

15.
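After introducing the concepts of Vague sets and their distance measures, a single Vague value is treated as an interval. Using the idea of transforming Vague sets into Fuzzy sets, the interval is discretized into points, and the distance measure between two Vague values is obtained from the Euclidean distance between these points. The performance of the formula is analyzed, and its application to fuzzy recognition is discussed.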

16.
Malware classification using machine learning algorithms is a difficult task, in part due to the absence of strong natural features in raw executable binary files. Byte n-grams previously have been used as features, but little work has been done to explain their performance or to understand what concepts are actually being learned. In contrast to other work using n-gram features, in this work we use orders of magnitude more data, and we perform feature selection during model building using Elastic-Net regularized Logistic Regression. We compute a regularization path and analyze novel multi-byte identifiers. Through this process, we discover significant previously unreported issues with byte n-gram features that cause their benefits and practicality to be overestimated. Three primary issues emerged from our work. First, we discovered a flaw in how previous corpora were created that leads to an over-estimation of classification accuracy. Second, we discovered that most of the information contained in n-grams stems from string features that could be obtained in simpler ways. Finally, we demonstrate that n-gram features promote overfitting, even with linear models and extreme regularization.
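
To make the feature type concrete, the sketch below counts overlapping byte n-grams in a byte string. It covers only feature extraction under assumed conventions (a sliding window with stride 1, a hypothetical function name `byte_ngrams`); the feature selection and Elastic-Net model from the paper are not shown.

```python
from collections import Counter


def byte_ngrams(data: bytes, n: int) -> Counter:
    """Count every overlapping n-byte window in `data` (stride 1 is an assumption)."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    # When len(data) < n the range is empty, so the result is an empty Counter.
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


if __name__ == "__main__":
    sample = b"MZ\x90\x00MZ\x90\x00"
    for gram, count in byte_ngrams(sample, 2).most_common(3):
        print(gram, count)
```

Because every printable string embedded in a binary contributes its own n-grams, counts like these can end up acting as a proxy for string features, which is consistent with the second issue the abstract reports.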

17.
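This paper proposes a new technique, matrix-type infrared traffic flow recognition. Using a microprocessor, an economical, efficient and accurate traffic flow detection device is designed that can recognize complex situations on the monitored road section, including vehicle travel direction, vehicle type, multiple vehicles travelling side by side, and traffic flow counts. This helps in obtaining road traffic conditions and supports traffic management.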

18.
While being monitored, instrumented long-running parallel applications generate a huge amount of instrumentation data. Processing and storing this data incurs overhead and perturbs the execution. A technique that eliminates unnecessary instrumentation data and lowers the intrusion without losing any performance information is valuable for tool developers. This paper presents a new algorithm for software instrumentation that measures the information content of the instrumentation data to be collected. The algorithm is based on the entropy concept introduced in information theory, and it makes selective data collection possible for a time-driven software monitoring system.

19.
20.
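Fuzzy entropy, distance measure and similarity measure are three important measures of fuzzy sets, and many scholars have studied the relationships among them. Adopting stricter definitions and defining new operations between fuzzy sets, this paper studies the relationships among the three and gives formulas by which each can be induced from the others. Some of the formulas are illustrated with examples.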
