Similar Literature
20 similar articles found.
1.
Microcomputer users sometimes want to encrypt certain files or subdirectories to keep others from viewing, invoking, executing, copying, or deleting them. Drawing on the author's own experience, this article summarizes methods for encrypting files and subdirectories, for the reader's reference. 1. File encryption. (1) The password method. The password method prevents unauthorized persons from running a given executable file on disk. A password and the corresponding input handling are set up at programming time; once the program is compiled and linked into an executable, outsiders cannot discover the password. When the file is run, it first prompts for a password: if the input matches the preset password, execution proceeds; if not, the prompt is repeated several times (the count can be set arbitrarily), and if the password is entered incorrectly several times in a row, then…
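The password method described above can be sketched as follows. This is a minimal illustration, not the article's own code; the password, the attempt limit, and the function names are all invented for the example.

```python
# Hypothetical sketch of the "password method": the expected password is
# compiled into the program, and the user gets a fixed number of attempts
# before the program refuses to run.
MAX_ATTEMPTS = 3          # the abstract notes this count can be set arbitrarily
SECRET = "s3cret"         # illustrative password, not from the source

def password_gate(read_input, secret=SECRET, max_attempts=MAX_ATTEMPTS):
    """Return True if the user supplies the correct password within the
    allowed number of attempts, False otherwise."""
    for _ in range(max_attempts):
        if read_input() == secret:
            return True
    return False
```

In a real program `read_input` would prompt the user; here it can be any callable returning one attempt per call.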

2.
In some larger microcomputer application systems, besides entering commands from the host keyboard in the central control room, it must also be possible to enter common or important commands from another location, i.e. near-console control. This article presents a technique for implementing near-console control whose main feature is that the near-control keys are completely transparent to the application software and, like the corresponding keys on the host keyboard, support both single keystrokes and repeated input.

3.
The PASCAL Computer
PASCAL is a parallel binary computer with a 42-bit word length and a clock pulse repetition rate of 660 kHz, executing 60,000 operations per second on average; it performs both fixed- and floating-point arithmetic. Besides core memory it has drum and tape storage, and it supports address modification by indexing and counting. Its special instructions include count instructions, repeat instructions, branch instructions conditioned on the result of the previous comparison, and two kinds of linkage instructions (for subroutines and interpreters). Transfer instructions can exchange data in both directions between drum and core, and between tape and core, concurrently with computation.

4.
This article describes a 16-bit single-chip microprocessor that performs fixed-point arithmetic in two's-complement representation. It executes the same instruction set as the NOVA series of minicomputers, offers the same performance, and is housed in a 40-pin dual in-line package. Figure 1 shows the processor's logic symbol; the data and control signals it denotes are explained as follows:

5.
Assembly language programmers often find that certain blocks of code must be repeated many times in the course of writing a program. Such a block might, for example, consist of code that saves or swaps the contents of groups of registers, or code that sets up linkage or performs a sequence of arithmetic operations. In such cases the programmer will find the macro facility useful. A macro instruction (usually just called a macro) is a single-line abbreviation for a group of instructions: in essence, the programmer defines a single "instruction" to stand for a block of code, and whenever that one-line macro appears in the program, the macro-processing assembler substitutes the entire block for it.
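The substitution a macro assembler performs can be modeled in a few lines. This is a sketch of the idea only; the macro names and register-save bodies below are invented examples, not part of any real assembler.

```python
# Minimal model of macro expansion: a macro is a single-line abbreviation
# for a block of instructions, and the assembler replaces each occurrence
# of the macro name with the whole block.
MACROS = {
    "SAVE_REGS": ["push ax", "push bx", "push cx"],
    "RESTORE_REGS": ["pop cx", "pop bx", "pop ax"],
}

def expand(program, macros=MACROS):
    """Replace every macro invocation with its body (one level deep)."""
    out = []
    for line in program:
        out.extend(macros.get(line, [line]))   # non-macros pass through
    return out
```

For example, `expand(["SAVE_REGS", "add ax, bx", "RESTORE_REGS"])` yields the seven underlying instructions in order.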

6.
The reverse-instruction technique is a fairly effective protection method; software protected this way has the advantage of being hard to trace dynamically. The basic principle: a CPU executes instructions sequentially, and debuggers likewise disassemble sequentially. If the CPU can be made to execute instructions in reverse order, a sequential disassembly yields garbage, defeating tracing. How can the CPU be made to execute instructions in reverse? By setting the single-step flag in the flags register: once it is set, the CPU executes an INT 1 after every instruction. If a new INT 1 handler replaces the old one, the new INT 1…
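The trick can be illustrated with a toy interpreter. This is a loose model only, under the assumption that the rewritten INT 1 handler's job is to step the instruction pointer backwards through code stored back-to-front; everything here is invented for illustration.

```python
# Toy model of the reverse-instruction trick: the protected code is stored
# back-to-front, and a per-instruction trap handler (standing in for the
# rewritten INT 1 single-step handler) moves the instruction pointer
# *backwards*, so a linear read of memory sees the instructions out of order
# while the CPU still executes the intended sequence.
def run_reversed(stored, state):
    """Execute `stored` from its last entry to its first; each entry is a
    function of `state` standing in for one machine instruction."""
    ip = len(stored) - 1
    while ip >= 0:
        stored[ip](state)   # execute one instruction
        ip -= 1             # "trap handler": step backwards, not forwards
    return state
```

A linear disassembler walking `stored` front-to-back would see the program in the wrong order, yet execution still appends 1 before 2 in the example below.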

7.
An Anti-Tracing Trick
In building a protection system, the quality of the encryption itself matters, but the ingenuity of the anti-tracing technique is even more critical. Here the author presents a fairly effective anti-tracing technique: the instruction-queue prefetch method. To improve execution speed, the CPU maintains a dedicated instruction-stream queue that holds upcoming instructions. Because instruction fetch is decoupled from instruction execution, one or more subsequent instructions can be fetched into the queue while the current instruction executes, so the next instruction can begin as soon as the current one completes, reducing the time the CPU waits on fetches. The size of the instruction queue varies by CPU: the 8086 has a 4-byte queue, …

8.
To address slow instruction retirement in superscalar processors, caused by instructions occupying the reorder buffer for long periods, a two-level retirement mechanism based on speculative execution is proposed. The scheme classifies instructions as risky or risk-free according to whether they can raise exceptions or lie on a mispredicted path, and makes the reorder buffer lightweight: only instructions with exception or misprediction risk are allowed into the reorder buffer, and they retire quickly once the risk is confirmed to be resolved. The rename registers are separated from the reorder buffer and handle register renaming and out-of-order result write-back. Experimental results show that with the same hardware resources, a processor based on this scheme outperforms a conventional in-order-retirement processor by more than 28.8% on average.
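The classification step at the heart of the scheme can be sketched as a simple partition. This is only an illustration of the idea; the instruction representation and the `risky` flag are invented, not taken from the paper.

```python
# Rough sketch of the two-level retirement idea: instructions are split into
# "risky" (may raise an exception or sit on a possibly mispredicted path)
# and "risk-free"; only risky instructions occupy reorder-buffer slots,
# while risk-free ones retire immediately.
def partition(instrs):
    """Return (fast_retire, needs_rob) from dicts carrying a 'risky' flag."""
    fast, rob = [], []
    for ins in instrs:
        (rob if ins["risky"] else fast).append(ins["op"])
    return fast, rob
```

With fewer instructions competing for reorder-buffer entries, the buffer can be smaller for the same throughput, which is the lightweighting the abstract describes.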

9.
The computing speed people usually quote for a computer is its average speed: the average number of instructions the machine can execute per second. Computing this average involves the mix of instruction types in the programs being run, because different instructions execute at different speeds. Different kinds of programs use instruction types in different proportions, so the computed speeds can differ greatly.
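The mix-weighted average described above amounts to a one-line formula. The instruction timings and proportions below are made-up figures for illustration only.

```python
# Worked example of the weighted average: if 60% of executed instructions
# take 0.1 microseconds, 30% take 0.2 us and 10% take 0.4 us, the average
# time per instruction is the mix-weighted sum, and the average speed is
# its reciprocal in instructions per second.
def average_speed(mix):
    """mix: list of (fraction, seconds_per_instruction) pairs summing to 1."""
    avg_time = sum(frac * t for frac, t in mix)
    return 1.0 / avg_time

mix = [(0.6, 1e-7), (0.3, 2e-7), (0.1, 4e-7)]
# average time = 0.6e-7 + 0.6e-7 + 0.4e-7 = 1.6e-7 s, i.e. 6.25 million
# instructions per second for this particular mix
```

A program with a heavier share of the slow 0.4 us instructions would yield a noticeably lower figure, which is exactly the mix-dependence the abstract points out.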

10.
1. Basic instruction concepts. The industrial control unit (ICU) of a one-bit microcomputer has an instruction register dedicated to holding the OBM instruction opcode OPC. It has four inputs, I_0 through I_3, allowing 16 (2^4 = 16) distinct encodings and hence 16 basic instructions. Each instruction has two parts: the first is the opcode, a 4-bit binary number occupying the instruction's high 4 bits; the second part of the instruction is the system…

11.
Speculative execution is the execution of instructions before it is known whether they should be executed. In speculative execution for instruction-level parallelism (ILP) processors, the shadow register concept provides a hardware solution that protects the semantics of a program from pollution by boosted instructions that are incorrectly predicted. In a recent study, Chang and Lai proposed a special register file based on the shadow register, named the conjugate register file (CRF), to support multilevel boosting in speculative execution, along with a scheduling heuristic named frequency-driven scheduling to work with the CRF. However, the ability to boost remains constrained, since the register-pair concept forces results produced speculatively to be stored in dedicated locations. Moreover, as hardware advances push the parallelism potential into the tens, the heavy demand on register usage and the complexity of the register file may well become a serious bottleneck for the exploitation of ILP. In this paper, the frequency-driven scheduling algorithm is modified by replacing the function of the hardware CRF with variable renaming during compilation. The new scheduling technique, named LESS, can exploit parallelism efficiently with a limited number of registers. Moreover, since the technique benefits ILP without any special hardware support, it can be incorporated into any other ILP architecture without changing its instruction set architecture (ISA). Simulation results show that the performance achievable by LESS is better than that of other existing methods. For example, under an ILP model with an issue rate of 8, speculative execution achieves a 34% increase in parallelism, compared to 18% for the CRF scheme.

12.
Neil Burroughs. Software, 2016, 46(11): 1499-1523.
The primary goal of the register allocation phase in a compiler is to minimize register spills to memory. Spill decisions by the allocator are often made based on the costs of spilling a virtual register and, therefore, on an assumed placement of spill instructions. However, because most allocators make these decisions incrementally, placement opportunities can change as allocation proceeds, calling into question the basis for the original spill decision. An alternative to placement-cost heuristics for spill decisions focuses on where program execution will lead. Spilling the virtual register with the furthest next use is known to lead to the minimum number of loads under certain conditions in straight-line code. While it has been implemented in register allocation in various forms, none of these implementations fully exploits profiling information. We present a register allocator that can adapt to improved profiling information, using branch probabilities to compute an expected distance to next use for making spill decisions, and block-frequency information to optimize post-allocation spill instruction placement. Spill placement is optimized after allocation using a novel method for minimizing spill instruction costs on the control flow graph. Compared with LLVM, our allocator achieves average reductions of more than 36% and 50% in the number of dynamically executed store and load instructions, respectively, when using statically derived profiling information. With dynamically gathered profiling, these improvements grow to average reductions of 50% and 60% for stores and loads, respectively. Copyright © 2016 John Wiley & Sons, Ltd.
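The furthest-next-use rule this allocator builds on can be stated in a few lines. This sketch applies Belady-style eviction to registers under simplified assumptions (known future use positions, straight-line code); the data layout is invented for illustration.

```python
# Sketch of the furthest-next-use spill heuristic: when a register is needed
# and none is free, evict the virtual register whose next use lies furthest
# in the future. A register with no remaining use is the ideal victim.
def choose_spill(in_regs, future_uses, pos):
    """Pick the resident virtual register whose next use after `pos` is
    furthest away. `future_uses` maps each virtual register to a sorted
    list of its use positions in the instruction stream."""
    def next_use(r):
        later = [u for u in future_uses.get(r, []) if u > pos]
        return later[0] if later else float("inf")   # never used again
    return max(in_regs, key=next_use)
```

The paper's contribution, per the abstract, is replacing the exact next-use distance with an *expected* distance computed from branch probabilities when the future path is not known.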

13.
SMA: A Speculative Multithreading Architecture
肖刚, 周兴铭, 徐明, 邓鹍. Chinese Journal of Computers (《计算机学报》), 1999, 22(6): 582-590.
A new ILP processor architecture is proposed: the speculative multithreading architecture, SMA for short. It combines speculative execution with multithreaded execution, speculating at the granularity of whole threads; multiple threads execute in parallel and share the processor's hardware resources. The processor thus forms a large dynamic instruction window by combining each thread's instruction window, exploiting more of the program's ILP, while also using multithreading to hide various long-latency operations and achieve high resource utilization. The SMA execution model is presented, and the implementation of an SMA processor and its key techniques are discussed.

14.
Register allocation and instruction scheduling are two important tasks in compiler optimization. Because the two phases are usually performed independently, register allocation often introduces unnecessary false dependences, reducing the effectiveness of instruction scheduling and limiting the final performance. This paper proposes a register-queue model and, on top of it, an algorithm that performs register allocation and instruction scheduling together. The algorithm uses the minimum number of registers while ensuring that each instruction executes at the earliest possible time. A further advantage is its linear time and space complexity and its suitability for hardware implementation.

15.
The speculated execution of threads in a multithreaded architecture, plus the branch prediction used in each thread execution unit, allows many instructions to be executed speculatively, that is, before it is known whether they are actually needed by the program. In this study, we examine how the load instructions executed on what turn out to be incorrectly executed program paths impact memory system performance. We find that incorrect speculation (wrong execution) at the instruction and thread level provides an indirect prefetching effect for the later correct execution paths and threads. By continuing to execute the mispredicted load instructions even after the instruction- or thread-level control speculation is known to be incorrect, the cache misses observed on the correctly executed paths can be reduced by 16 to 73 percent, with an average reduction of 45 percent. However, we also find that these extra loads can increase the amount of memory traffic and can pollute the cache. We introduce the small, fully associative wrong execution cache (WEC) to eliminate the potential pollution that can be caused by the execution of the mispredicted load instructions. Our simulation results show that the WEC can improve the performance of a concurrent multithreaded architecture by up to 18.5 percent on the benchmark programs tested, with an average improvement of 9.7 percent, due to the reductions in the number of cache misses.

16.
The Explicit Data Graph Execution (EDGE) ISA is an instruction set architecture designed for dataflow-driven tiled many-core processors. Unlike conventional control-flow-driven processors, an EDGE machine uses the hyperblock, rather than the individual instruction, as its unit of execution: execution is dataflow-driven inside a hyperblock and follows speculative control-flow order across hyperblocks, which helps exploit instruction-level parallelism. However, the EDGE compiler forms hyperblocks according to the program's serial execution order, and data dependences within and between hyperblocks weaken the program's potential data-level and thread-level parallelism at run time, failing to play to the strengths of the tiled EDGE design. By analyzing how the EDGE compiler organizes hyperblocks and exploiting EDGE's execution model, this paper proposes a general hyperblock-formation framework that emulates multithreaded execution on EDGE hardware, further exploiting instruction-level parallelism when running serial single-threaded programs. Using the TRIPS microprocessor as a concrete EDGE implementation, three experiments, including matrix multiplication, validate the feasibility of the proposed framework; the results show good performance gains for these applications on TRIPS.

17.
田祖伟, 孙光. Computer Science (《计算机科学》), 2010, 37(5): 130-133.
The large number of branch instructions in programs severely limits the ability of architectures and compilers to exploit parallelism, and a major challenge in extracting instruction-level parallelism is overcoming the constraints branches impose. Predicated execution can effectively eliminate branches by converting branch instructions into predicated code, enlarging the scope of instruction scheduling and removing the performance loss caused by branch mispredictions. This paper surveys compiler optimizations for predicated code, including instruction scheduling, software pipelining, register allocation, and instruction merging, and designs and implements an instruction scheduling algorithm for predicated code. Experiments show that optimizing predicated code effectively increases instruction-level parallelism, shortens execution time, and improves program performance.

18.
Matching an application to an architecture in structure and size is a way of achieving higher computation speed. This paper presents a combination of a compiler and a reconfigurable long instruction word (RLIW) architecture as an approach to the matching problem. Configurations suitable for the execution of different parts of a program are determined by a compiler, and code is generated both for reconfiguring the hardware and for performing the computation. The RLIW machine, consisting of multiple processing and global data memory modules, effectively utilizes the fine-grained parallelism detected in programs by a compiler. The long word instructions control the operation of processing and memory modules in the system. To reduce data transfer between processing modules and data memory modules, we provide reconfigurable interconnections among the processing modules which permit direct communication. The compiler uses new techniques, including region scheduling, generation of code for reconfiguration of the system, and memory allocation techniques, to achieve improved performance. Algorithms for packing operations into long word instructions, and techniques for effectively assigning memory modules to the operands required by an instruction, are developed. Results of the experiments to evaluate the system indicate that speedups of 60-300% can be obtained for both scientific and nonscientific programs. The reconfigurable architecture is responsible for much of the speedup. The results also indicate that the memory bottleneck, a major problem in designing parallel systems, is successfully attacked. This paper represents work done while the author was at the University of Pittsburgh.

19.
Parallel programs are commonly written using barriers to synchronize parallel processes. Upon reaching a barrier, a processor must stall until all participating processors reach the barrier. A software implementation of the barrier mechanism using shared variables has two major drawbacks. Firstly, the execution of the barrier may be slow, since it requires execution of several instructions. Secondly, processors that are stalled waiting for other processors to reach the barrier cannot do any useful work. In this paper, the notion of the fuzzy barrier is presented, which avoids these drawbacks. The first problem is avoided by implementing the mechanism in hardware. The second problem is solved by extending the barrier concept to include a region of statements that can be executed by a processor while it awaits synchronization. The barrier regions are constructed by a compiler and consist of instructions such that a processor is ready to synchronize upon reaching the first instruction and must synchronize before exiting the region. When synchronization does occur, the processors could be executing at any point in their respective barrier regions. The larger the barrier region, the more likely it is that none of the processors will have to stall. Hardware fuzzy barriers have been implemented as part of a RISC-based multiprocessor system. Results based on a software implementation of the fuzzy barrier on the Encore multiprocessor indicate that the synchronization overhead can be greatly reduced using the mechanism. A preliminary version of this paper appeared in ASPLOS '89. This work was done while the author was at Philips Laboratories.
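The stall reduction a fuzzy barrier buys can be estimated with a simple timing model. This is my own back-of-the-envelope interpretation of the mechanism, not the paper's analysis: each processor i enters its barrier region at time t[i], the region takes r[i] time units, and synchronization happens as soon as every processor has entered its region.

```python
# Toy model of fuzzy-barrier stalls. A processor stalls only if it finishes
# its whole barrier region before the moment all processors have become
# ready to synchronize; with a classic barrier (r[i] == 0 for all i), every
# processor but the slowest stalls.
def stall_times(t, r):
    """Per-processor stall time, given region entry times t and region
    lengths r, assuming the sync fires once all have entered their regions."""
    ready = max(t)   # earliest moment every processor can synchronize
    return [max(0, ready - (ti + ri)) for ti, ri in zip(t, r)]
```

In the test below, a 2-unit region absorbs half of a fast processor's 4-unit wait, and a region longer than the wait eliminates the stall entirely, illustrating the abstract's point that larger regions make stalls less likely.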

20.
Loop unrolling is a common compiler optimization that reduces loop overhead, improves instruction-level parallelism and data locality, and raises loop execution efficiency. However, excessive unrolling overflows the instruction cache and increases register pressure, while too little unrolling wastes potential performance gains, so finding the right unroll factor is the core of the loop-unrolling problem. Based on the open-source GCC compiler, this paper analyzes the influence of instruction-cache and register resources on loop unrolling, proposes a method for computing the unroll factor from instruction-cache capacity and register pressure, and implements the method in GCC. Experimental results on the Sunway (申威) and Hygon (海光) platforms show that, compared with the other factor-computation methods currently in GCC, the proposed method finds more effective unroll factors and improves program performance: average gains of 2.7% and 3.1% respectively on the SPEC CPU 2006 suite, and 5.4% and 6.1% on NPB-3.3.1.
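A factor bound of the kind described can be sketched as taking the minimum of the two resource caps. This is a hedged illustration of the general idea, not the paper's actual formula; all parameter names and numbers are invented.

```python
# Sketch of an unroll-factor bound driven by the two resources the abstract
# names: the factor is capped both by how many copies of the loop body fit
# in the instruction cache and by how many iterations' worth of live values
# fit in the free registers.
def unroll_factor(body_insns, icache_insns, regs_per_iter, free_regs,
                  max_factor=8):
    """Largest unroll factor that fits both the I-cache and register budget."""
    by_icache = max(1, icache_insns // body_insns)   # copies that fit in cache
    by_regs = max(1, free_regs // regs_per_iter)     # iterations regs can hold
    return max(1, min(max_factor, by_icache, by_regs))
```

For a 10-instruction body with a 60-instruction budget, 3 live values per iteration and 12 free registers, the register budget (4 iterations) is the binding constraint, so the factor would be 4 under this model.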

