期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

李嘉楠韩林柴赟达《计算机工程》2022,48(1):142-148

作为SIMD扩展部件向量化的重要手段,自动向量化已在LLVM编译器中得到实现,但向量长度以及指令集功能的差异,导致国产平台在自动向量化过程中容易错失向量化机会以及向量化后产生倒加速的问题。为使SIMD得到充分应用,结合国产平台的指令集特征完善指令代价信息以提高收益分析精准度,使其在自动向量化后生成后端支持且简洁高效的向量指令。在此基础上,提出一种改进的控制流向量化方法,通过添加指令代价信息提高自动向量化的适配能力,从而形成一套面向国产平台的LLVM自动向量化系统。实验结果表明,相比自动向量化移植前,通过该方法进行移植优化后,SPEC测试的整体性能提升10.8%,TSVC测试集中的加速比提升16%,精准代价指导下的加速比提升42%,控制流向量化下的加速比提升51%。相似文献

2.

通用代码Shell化技术研究

陈涛舒辉熊小兵《计算机科学》2021,48(4):288-294

代码Shell化技术是一种实现程序从源码形态到二进制形态的程序变换技术。该技术可用于实现Shellcode生成,生成包括漏洞利用过程中的Shellcode及后渗透测试过程中的功能性Shellcode。文中形式化地描述了程序中代码与数据的关系,提出了一种基于LLVM(Low Level Virtual Machine)的通用程序变换方法,该方法可用于实现操作系统无关的代码Shell化。该技术通过构建代码内置全局数据表和添加动态重定位代码,将代码对数据的绝对内存地址访问转化为对代码内部全局数据表的相对地址访问,重构了代码与数据之间的引用关系,解决了代码执行过程中对操作系统重定位机制依赖的问题,使得生成的Shellcode代码具有位置无关特性。在验证实验中,使用适用于不同操作系统的不同规模的工程源码对基于该技术实现的Shellcode生成系统进行了功能测试,并对比了Shell化前后代码功能的一致性、文件大小、函数数量和运行时间,实验结果表明基于该技术的Shellcode生成系统功能正常,具有较好的兼容性和通用性。相似文献

3.

Static Dalvik Bytecode Optimization for Android Applications

下载免费PDF全文

Jeehong Kim Inhyeok Kim Changwoo Min Hyung Kook Jun Soo Hyung Lee Won‐Tae Kim Young Ik Eom 《ETRI Journal》2015,37(5):1001-1011

Since just‐in‐time (JIT) has considerable overhead to detect hot spots and compile them at runtime, using sophisticated optimization techniques for embedded devices means that any resulting performance improvements will be limited. In this paper, we introduce a novel static Dalvik bytecode optimization framework, as a complementary compilation of the Dalvik virtual machine, to improve the performance of Android applications. Our system generates optimized Dalvik bytecodes by using Low Level Virtual Machine (LLVM). A major obstacle in using LLVM for optimizing Dalvik bytecodes is determining how to handle the high‐level language features of the Dalvik bytecode in LLVM IR and how to optimize LLVM IR conforming to the language information of the Dalvik bytecode. To this end, we annotate the high‐level language features of Dalvik bytecode to LLVM IR and successfully optimize Dalvik bytecodes through instruction selection processes. Our experimental results show that our system with JIT improves the performance of Android applications by up to 6.08 times, and surpasses JIT by up to 4.34 times. 相似文献

4.

基于底层虚拟机的标识符混淆方法

田大江李成扬黄天波文伟平《计算机应用》2022,42(8):2540-2547

针对现有代码混淆仅限于某一特定编程语言或某一平台,并不具有广泛性和通用性,以及控制流混淆和数据混淆会引入额外开销的问题,提出一种基于底层虚拟机（LLVM）的标识符混淆方法。该方法实现了4种标识符混淆算法,包括随机标识符算法、重载归纳算法、异常标识符算法以及高频词替换算法,同时结合这些算法,设计新的混合混淆算法。所提混淆方法首先在前端编译得到的中间文件中候选出符合混淆条件的函数名,然后使用具体的混淆算法对这些函数名进行处理,最后使用具体的编译后端将混淆后的文件转换为二进制文件。基于LLVM的标识符混淆方法适用于LLVM支持的语言,不影响程序正常功能,且针对不同的编程语言,时间开销在20%内,空间开销几乎无增加;同时程序的平均混淆比率在77.5%,且相较于单一的替换算法和重载算法,提出的混合标识符算法理论分析上可以提供更强的隐蔽性。实验结果表明,所提方法具有性能开销小、隐蔽性强、通用性广的特点。相似文献

5.

基于融合编译的软件多样化保护方法

下载免费PDF全文

熊小兵舒辉康绯《网络与信息安全学报》2020,6(6):13-24

针对现有主流保护方法存在的特征明显、模式单一等问题,以 LLVM 开源编译框架为基础,提出了一种基于融合编译的软件多样化保护方法,该方法将目标软件进行随机化加密处理,并在编译层面与掩护软件进行深度融合,通过内存执行技术,将加密后的目标软件进行解密处理,进而在内存中以无进程的形式执行,利用掩护代码的多样性、融合策略的随机性来实现目标软件的多样化保护效果。选取了多款常用软件作为测试集,从资源开销、保护效果、对比实验等多个角度对所提方法进行了实例测试,从测试结果可以看出,所提方法资源开销较小,相较于混淆、加壳等传统方法,所提方法在抗静态分析、抗动态调试等方面具有较大优势,能有效对抗主流代码逆向分析和破解手段。相似文献

6.

Efficient and retargetable SIMD translation in a dynamic binary translator

下载免费PDF全文

Sheng‐Yu Fu Ding‐Yong Hong Yu‐Ping Liu Jan‐Jan Wu Wei‐Chung Hsu 《Software》2018,48(6):1312-1330

The single‐instruction multiple‐data (SIMD) computing capability of modern processors is continually improved to deliver ever better performance and power efficiency. For example, Intel has increased SIMD register lengths from 128 bits in streaming SIMD extension to 512 bits in AVX‐512. The ARM scalable vector extension supports SIMD register length up to 2048 bits and includes predicated instructions. However, SIMD instruction translation in dynamic binary translation has not received similar attention. For example, the widely used QEMU emulates guest SIMD instructions with a sequence of scalar instructions, even when the host machines have relevant SIMD instructions. This leaves significant potential for performance enhancement. We propose a newly designed SIMD translation framework for dynamic binary translation, which takes advantage of the host's SIMD capabilities. The proposed framework has been built in HQEMU, an enhanced QEMU with a separate thread for applying LLVM optimizations. The current prototype supports ARMv7, ARMv8, and IA32 guests on the X86‐64 AVX‐2 host. Compared with the scalar‐translation version HQEMU, our framework runs up to 1.84 times faster on Standard Performance Evaluation Corporation 2006 CFP benchmarks and up to 6.81 times faster on selected real applications. 相似文献

7.

基于LLVM的RISC-V向量扩展栈帧布局优化

陆旭凡胡海根邢明杰《计算机系统应用》2021,30(11):27-32

为了能够生成正确、优化的机器指令代码,需要在编译器后端代码的生成阶段,设计和使用合适的程序栈帧布局.由于RISC-V向量扩展架构具有可伸缩性、其向量寄存器的长度在编译时不可知,传统的栈帧布局无法适用.之前LLVM中针对向量扩展实现的栈帧布局虽然能够生成正确的机器指令,但存在访存指令较多,栈帧空间较大,以及预留寄存器较多等问题.我们对原有实现所存在的问题进行分析,在此基础上提出了新的布局方式以及向量对象地址计算方式,并通过巴塞罗那超算中心开发的测试集进行验证.实验表明新的栈帧布局能够有效减少访存指令数和栈空间大小. 相似文献

8.

面向多面体模型的静态控制块识别扩展方法

夏文博胡伟方郭浩然《计算机应用与软件》2022,39(3):19-24

在编译优化中,多面体模型可以对计算密集型程序中的耗时较多的循环代码进行并行性和数据局部性优化.但是,多面体建模过程中存在诸多限制,程序中只有少量代码可以被识别进而转换为多面体表示进行优化.基于LLVM编译框架提出一种分析方法,对多面体建模中的非规则因素进行了规范化处理,对非仿射因素提出一种定值扩展方法,消除了多面体建模... 相似文献

9.

非计数类循环的C2VHDL编译方法

下载免费PDF全文

杨杰吴艳霞顾国昌孙延腾《计算机工程与应用》2010,46(30):61-64

目前,大多数C2VHDL编译工具采用有穷状态机（FSM）的设计方法,该方法可以实现循环初值、终值以及步进值确定的计数类循环。由于非计数类循环每次执行循环时都要进行条件判断,程序执行前不能确定循环体执行次数,导致采用FSM方式对其进行C2VHDL编译很复杂,所以大多数C2VHDL编译工具不支持这类循环。以基于LLVM（Low Level Virtual Machine）的ASCRA（Application-Specific Compiler for Reconfigurable Architecture）编译架构为基础,采用一个周期高电平使能信号控制方式代替FSM,提出了一种支持嵌套格式的非计数类循环编译方法。实验结果证明该方法生成的控制结构简单,能够灵活地实现各种非计数类循环的C2VHDL转换,具有较强的可扩展性。相似文献

10.

MPEG Reconfigurable Video Coding: From specification to a reconfigurable implementation

Jérôme Gorin Mickaël Raulet Françoise Prêteux 《Signal Processing: Image Communication》2013,28(10):1224-1238

This paper demonstrates that it is possible to produce automatic, reconfigurable, and portable implementations of multimedia decoders onto platforms with the help of the MPEG Reconfigurable Video Coding (RVC) standard. MPEG RVC is a new formalism standardized by the MPEG consortium used to specify multimedia decoders. It produces visual representations of decoder reference software, with the help of graphs that connect several coding tools from MPEG standards. The approach developed in this paper draws on Dataflow Process Networks to produce a Minimal and Canonical Representation (MCR) of MPEG RVC specifications. The MCR makes it possible to form automatic and reconfigurable implementations of decoders which can match any actual platforms. The contribution is demonstrated on one case study where a generic decoder needs to process a multimedia content with the help of the RVC specification of the decoder required to process it. The overall approach is tested on two decoders from MPEG, namely MPEG-4 part 2 Simple Profile and MPEG-4 part 10 Constrained Baseline Profile. The results validate the following benefits on the MCR of decoders: compact representation, low overhead induced by its compilation, reconfiguration and multi-core abilities. 相似文献