A boosting ensemble for the recognition of code sharing in malware |
| |
Authors: | Stanley J Barr, Samuel J Cardman, David M Martin Jr |
| |
Affiliation: | (1) Distributed Systems Center, The MITRE Corporation, Bedford, MA 01730, USA; (2) Computer Science Department, University of Massachusetts Lowell, Lowell, MA 01854, USA |
| |
Abstract: | Research and development efforts have recently started to compare malware variants, as it is believed that malware authors
are reusing code. A number of these projects have focused on identifying functions through the use of signature-based classifiers.
We introduce three new classifiers that characterize a function’s use of global data. Experiments on malware show that we
can meaningfully correlate functions on the basis of their global data references even when their functions share little code.
We also present an algorithm that combines existing classifiers and our new ones into an ensemble for correlating functions
in two binary programs. For testing, we developed a model for comparing our work to previous signature-based classifiers.
We then used that model to show how our new combined ensemble classifier dominates the previously reported classifiers. The
resulting ensemble can be used by malware analysts when they are comparing two binaries. This technique will allow them to
correlate both functions and global data references between the two and will lead to a quick identification of any sharing
that is occurring. |
| |
Keywords: | |
This article is indexed in SpringerLink and other databases. |
|