Identifier obfuscation method based on low level virtual machine
Cite this article: Dajiang TIAN, Chengyang LI, Tianbo HUANG, Weiping WEN. Identifier obfuscation method based on low level virtual machine[J]. Journal of Computer Applications, 2022, 42(8): 2540-2547. DOI: 10.11772/j.issn.1001-9081.2021071166
Authors: Dajiang TIAN (田大江), Chengyang LI (李成扬), Tianbo HUANG (黄天波), Weiping WEN (文伟平)
Affiliation: School of Software and Microelectronics, Peking University, Beijing 102600, China
Funding: Huawei-Peking University school-enterprise cooperation project (2020001763)
Abstract: Most existing code obfuscation solutions are limited to a specific programming language or platform and therefore lack generality. Moreover, control flow obfuscation and data obfuscation introduce additional overhead. To address these problems, an identifier obfuscation method based on Low Level Virtual Machine (LLVM) is proposed. The method implements four identifier obfuscation algorithms: a random identifier algorithm, an overload induction algorithm, an abnormal identifier algorithm, and a high-frequency word replacement algorithm. A new hybrid obfuscation algorithm is also designed by combining these algorithms. The method first selects, from the intermediate files produced by the compiler front-end, the function names that meet the obfuscation criteria. It then processes these function names with a specific obfuscation algorithm. Finally, a specific compiler back-end converts the obfuscated files into binary files. The method applies to the languages supported by LLVM and does not affect the normal functionality of the program. Across different programming languages, the time overhead is within 20% and the space overhead barely increases. The average obfuscation ratio of the programs is 77.5%, and theoretical analysis indicates that the proposed hybrid identifier algorithm provides stronger concealment than a single replacement algorithm or overload algorithm. Experimental results show that the proposed method has low performance overhead, strong concealment, and wide applicability.
Keywords: software protection; code obfuscation; identifier obfuscation; Low Level Virtual Machine (LLVM); obfuscation method
Received: 2021-07-07
Revised: 2021-09-14
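
The abstract describes a pipeline of selecting candidate function names in the intermediate files and then renaming them. The following Python sketch illustrates that idea for the random identifier algorithm only. It is not the authors' implementation: it works on textual LLVM IR with regular expressions rather than through LLVM tooling, and the names `KEEP`, `candidate_functions`, `random_identifier`, `obfuscate`, and `obfuscation_ratio` are hypothetical. The selection criteria (skip `main` and undefined external functions) and the ratio definition (renamed functions divided by defined functions) are assumptions, since the abstract does not state them.

```python
import re
import secrets

# Unquoted LLVM global identifier, e.g. "@add". Quoted names such as
# @"my func" are not handled in this sketch.
GLOBAL_RE = re.compile(r'@([-a-zA-Z$._][-a-zA-Z$._0-9]*)')

# Name of a function *definition* ("define ..."), not a declaration.
DEFINE_RE = re.compile(
    r'^define\b[^@\n]*@([-a-zA-Z$._][-a-zA-Z$._0-9]*)\s*\(', re.MULTILINE)

# Assumed exclusion list: entry points that must keep their names.
KEEP = {"main"}


def candidate_functions(ir_text):
    """Return the defined function names that this sketch allows renaming."""
    return [n for n in DEFINE_RE.findall(ir_text) if n not in KEEP]


def random_identifier(used):
    """Draw a random name that does not collide with any existing global."""
    while True:
        name = "f_" + secrets.token_hex(4)
        if name not in used:
            used.add(name)
            return name


def obfuscate(ir_text):
    """Rename each candidate function and every @reference to it."""
    used = set(GLOBAL_RE.findall(ir_text))
    mapping = {old: random_identifier(used)
               for old in candidate_functions(ir_text)}
    # Caveat: an "@name" inside a string constant would also be rewritten.
    new_ir = GLOBAL_RE.sub(
        lambda m: "@" + mapping.get(m.group(1), m.group(1)), ir_text)
    return new_ir, mapping


def obfuscation_ratio(ir_text, mapping):
    """Assumed metric: renamed functions / functions defined in the module."""
    defined = DEFINE_RE.findall(ir_text)
    return len(mapping) / len(defined) if defined else 0.0


SAMPLE_IR = """\
define i32 @add(i32 %a, i32 %b) {
  %s = add i32 %a, %b
  ret i32 %s
}

declare i32 @printf(i8*, ...)

define i32 @main() {
  %r = call i32 @add(i32 1, i32 2)
  ret i32 %r
}
"""

if __name__ == "__main__":
    new_ir, mapping = obfuscate(SAMPLE_IR)
    print(new_ir)
    print(mapping, obfuscation_ratio(SAMPLE_IR, mapping))
```

In the sample module only `@add` is renamed: `@main` is on the exclusion list and `@printf` is merely declared, so its name must stay visible to the linker. Because the renaming happens at the IR level, the same step would in principle apply to IR produced by any LLVM front-end, which is the generality argument the abstract makes.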