Similar Literature
 18 similar documents found; search time: 171 ms
1.
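Research on compiler-directed multithreaded low-power techniques   Total citations: 7, self-citations: 0, citations by others: 7
Multithreading and low power consumption are among the key goals for next-generation microprocessor architectures. This paper proposes a computational model that reduces power consumption in an SMT architecture by dynamically adjusting the CPU execution frequency. It further analyzes how to identify, at compile time, the regions in which processing units can be expected to run at a reduced frequency, and presents computational models for frequency adjustment and energy analysis together with a compiler implementation strategy. The aim is to significantly reduce processor power/energy consumption without degrading, or without noticeably degrading, program performance. In theory the model also applies to superscalar and multiprocessor architectures.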
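
The frequency/energy trade-off described in this abstract can be illustrated with the textbook CMOS dynamic-power model. The sketch below is not the paper's model: it assumes P = C·V²·f, a supply voltage that scales linearly with frequency, and a region with enough slack to run slower and still meet its deadline. All function names and numbers are hypothetical.

```python
def dynamic_energy(cycles, freq_hz, volt, capacitance=1e-9):
    """Energy (J) to run `cycles` at `freq_hz`, using P = C * V^2 * f."""
    power = capacitance * volt ** 2 * freq_hz
    time_s = cycles / freq_hz
    return power * time_s  # algebraically C * V^2 * cycles


def scaled_frequency(cycles, deadline_s, f_max):
    """Lowest frequency that still finishes `cycles` within `deadline_s`."""
    return min(f_max, cycles / deadline_s)


def energy_saving(cycles, deadline_s, f_max, v_max):
    """Fraction of dynamic energy saved by slowing down to just meet the deadline.

    Assumption: voltage scales linearly with frequency (V = v_max * f / f_max).
    """
    f = scaled_frequency(cycles, deadline_s, f_max)
    v = v_max * f / f_max
    e_full = dynamic_energy(cycles, f_max, v_max)
    e_slow = dynamic_energy(cycles, f, v)
    return 1.0 - e_slow / e_full
```

Under these assumptions the energy depends only on V² and the cycle count, so halving the frequency (and with it the voltage) for a region with enough slack cuts its dynamic energy to one quarter.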

2.
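Multi-variant execution (MVX) is currently one of the most popular active defense techniques. Ideally, when not under attack, an MVX architecture provides normal program functionality. Unfortunately, when a multithreaded program runs under an MVX architecture, the threads operating on shared resources execute in different orders in different variants, so the variants reach inconsistent states and raise false attack alarms. This problem makes it difficult to run multithreaded programs under MVX. For the MVX environment, this paper proposes a compiler-supported synchronization model for multithreaded operations on shared resources. The model treats shared-resource operations as synchronization points, analyzes when and how the multithreaded program operates on shared resources, and guarantees that the threads of every variant operate on shared resources consistently at run time, eliminating the resulting false alarms. A prototype system based on this model was designed and implemented on the LLVM 12.0 compiler framework and evaluated in simulation experiments. The results show that multithreaded programs processed by the prototype have a significantly lower false-alarm rate under MVX, indicating that the model, as a general method, effectively eliminates false attack alarms for multithreaded programs under MVX and improves the usability of multi-variant execution.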
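
The idea of using shared-resource operations as synchronization points can be sketched as a record/replay scheme. This is only an illustration under assumptions: the abstract does not say how the order is agreed on, so the leader/follower split, the `SyncPoint` class, and `run_variant` are hypothetical, and the sketch runs both "variants" in one Python process rather than inserting synchronization at compile time.

```python
import threading


class SyncPoint:
    """Orders shared-resource operations identically across variants.

    A leader variant records which logical thread performed each operation;
    a follower variant replays that order, blocking threads until their turn.
    """

    def __init__(self, recorded_order=None):
        self._cond = threading.Condition()
        self._replay = recorded_order is not None
        self.order = list(recorded_order) if self._replay else []
        self._next = 0  # index of the next operation to replay

    def _my_turn(self, thread_id):
        return self._next < len(self.order) and self.order[self._next] == thread_id

    def run(self, thread_id, operation):
        with self._cond:
            if self._replay:
                # Wait until the leader's log says it is this thread's turn.
                self._cond.wait_for(lambda: self._my_turn(thread_id))
                operation()
                self._next += 1
                self._cond.notify_all()
            else:
                # Leader: log the thread, then perform the operation atomically.
                self.order.append(thread_id)
                operation()


def run_variant(sync, n_threads=4, ops_per_thread=3):
    """Run one variant and return the order in which shared state was updated."""
    shared = []

    def worker(tid):
        for i in range(ops_per_thread):
            sync.run(tid, lambda: shared.append((tid, i)))

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared
```

Running `run_variant(SyncPoint())` first and then `run_variant(SyncPoint(recorded_order=leader.order))` yields the same update sequence in both runs, which is the consistency property the abstract relies on to avoid false alarms.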

3.
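According to the textbook definition, a process is the smallest unit of resource management and a thread is the smallest unit of program execution. In operating system design, the main purpose of evolving threads out of processes is to better support SMP (Symmetric Multi-Processing) and to reduce (process/thread) context-switch overhead. In operating system implementation, modern computing requires the operating system to incorporate a new technique, the multithreaded process architecture; the Solaris kernel, re-architected around threads, is a model in this respect. This article discusses the Solaris process model, the process execution environment, and the multithreaded process architecture, covering processes, lightweight processes, and kernel threads, and finally process creation and termination.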

4.
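Speculative parallel multithreading based on transactional execution is a new parallel programming and compilation technique suited to future multicore microprocessor architectures. Program execution on this basis is more complex, however, and simulating it becomes a key problem. This paper proposes using binary-level dynamic instrumentation to functionally simulate speculative parallel multithreaded programs, and designs and implements a complete software platform that can accurately simulate and monitor the thread-level speculative execution of parallel programs and detect memory access conflicts, thereby implementing the semantics of speculative parallel multithreading. The platform can also serve as a basis for further study of the real execution of speculative multithreaded parallel programs, and effectively supports the design and analysis of speculative parallel multithreading compilers.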

5.
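The CPU/FPGA hybrid architecture is a common structure for reconfigurable computing. To simplify the use of the FPGA in a hybrid architecture, this paper proposes a hardware thread approach and designs an execution mechanism for hardware threads, so that reconfigurable resources are used in the form of hardware threads. Hardware and software threads can execute in parallel by sharing data storage: the compute-intensive parts of a program run as hardware threads on the FPGA, while the control-intensive parts run as software threads on the CPU. On a hybrid platform simulated with Simics, experiments with DES, MD5SUM, and merge sort converted to hardware/software multithreading show an average speedup of 2.30, effectively exploiting the computing performance of the CPU/FPGA hybrid architecture.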

6.
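Multicore processors, which offer greater computing power, will be widely used in safety-critical systems. However, the performance-enhancing mechanisms of modern processors, such as pipelining, out-of-order execution, dynamic branch prediction, and caches, together with resource sharing among cores, make worst-case execution time (WCET) analysis very difficult. The international research community has therefore proposed time-predictable system design to reduce the difficulty of WCET analysis. Existing work focuses mainly on adjusting and optimizing the hardware level and its compilation methods, and pays less attention to the software level, that is, how to construct time-predictable multithreaded code and map it onto multicore hardware platforms. This paper proposes a model-driven method, based on synchronous languages, for generating time-predictable multithreaded code, and proves that the code generator preserves semantics. It proposes a time-predictable multicore architecture model based on AADL (Architecture Analysis and Design Language) as the target platform. Finally, it gives a method for mapping the multithreaded code onto the multicore architecture model, together with a framework for analyzing system properties.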

7.
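According to the textbook definition, a process is the smallest unit of resource management and a thread is the smallest unit of program execution. In operating system design, the main purpose of evolving threads out of processes is to better support SMP (Symmetric Multi-Processing) and to reduce (process/thread) context-switch overhead. In operating system implementation, modern computing requires the operating system to incorporate a new technique, the multithreaded process architecture; the Solaris kernel, re-architected around threads, is a model in this respect. This article discusses the Solaris process model, the process execution environment, and the multithreaded process architecture, covering processes, lightweight processes, and kernel threads, and finally process creation and termination.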

8.
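SMA: Speculative Multithreaded Architecture   Total citations: 4, self-citations: 1, citations by others: 3
Xiao Gang, Zhou Xingming, Xu Ming, Deng Kun. Chinese Journal of Computers (《计算机学报》), 1999, 22(6): 582-590
Proposes a new ILP processor architecture, the speculative multithreaded architecture, SMA for short. It combines speculative execution with multithreaded execution: whole threads are the unit of speculative execution, and multiple threads execute in parallel while sharing the processor's hardware resources. The processor thus forms a large dynamic instruction window by combining the instruction windows of the individual threads, exposing more of the program's ILP, and uses multithreaded execution to hide long-latency operations, achieving high resource utilization. The paper introduces the SMA execution model and discusses the implementation of the SMA processor and the key tech…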

9.
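As fabrication processes improve, more and more transistors can be integrated on a chip, and multithreading is becoming a mainstream processor architecture technique. This paper proposes simultaneous multi-microthreading (SMMT), a new architecture that combines simultaneous multithreading with microthreading, and gives an implementation scheme for it. SMMT combines the low hardware cost of simultaneous multithreading with the ability of microthreading to accelerate single-process applications, using hardware/software co-design to exploit the microthread-level parallelism of single-process programs. Performance evaluation on the designed Godson-2 (龙芯2号) simultaneous multi-microthreading processor shows that the architecture effectively accelerates single-process programs and significantly improves processor performance at very small hardware cost.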

10.
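Multithreaded programming under QNX   Total citations: 2, self-citations: 2, citations by others: 2
Introduces the QNX real-time operating system and multithreaded programming techniques, including methods for synchronization between threads, the steps for analyzing a multithreaded program, the basic structure of a thread program, and practical compilation methods.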

11.
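Multithreaded computation models, architectures, and compilation techniques   Total citations: 3, self-citations: 0, citations by others: 3
1 Introduction. Over the past 30 years, computer architecture has advanced considerably. Ideas such as superscalar, superpipelined, and VLIW designs have significantly improved computer performance, but these single-threaded architectures face great difficulty in raising instruction-level parallelism. The multithreaded architecture is regarded as an effective model for increasing parallelism. It combines the dataflow architecture with the traditional von Neumann control-flow architecture, maintaining high instruction-execution performance while also achieving high processor …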

12.
Current high-end microprocessors achieve high performance by adding more features and therefore increasing complexity. This paper makes the case for a Chip-Multiprocessor based on the Data-Driven Multithreading (DDM-CMP) execution model in order to overcome the limitations of current design trends. Data-Driven Multithreading (DDM) is a multithreading model that effectively hides communication delay and synchronization overheads. DDM-CMP avoids the complexity of other designs by combining simple commodity microprocessors with a small hardware overhead for thread scheduling and an interconnection network. Preliminary experimental results show that a DDM-CMP chip with the same hardware budget as a high-end commercial microprocessor, clocked at the same frequency, achieves a speedup of up to 18.5 while consuming 78–81% of the power of the commercial chip. Overall, the estimated results for the proposed DDM-CMP architecture show a significant benefit in both speedup and power consumption, making it an attractive architecture for future processors.

13.
The speculative execution of threads in a multithreaded architecture, plus the branch prediction used in each thread execution unit, allows many instructions to be executed speculatively, that is, before it is known whether they are actually needed by the program. In this study, we examine how the load instructions executed on what turn out to be incorrectly executed program paths impact memory system performance. We find that incorrect speculation (wrong execution) at the instruction and thread levels provides an indirect prefetching effect for the later correct execution paths and threads. By continuing to execute the mispredicted load instructions even after the instruction or thread-level control speculation is known to be incorrect, the cache misses observed on the correctly executed paths can be reduced by 16 to 73 percent, with an average reduction of 45 percent. However, we also find that these extra loads can increase the amount of memory traffic and can pollute the cache. We introduce the small, fully associative wrong execution cache (WEC) to eliminate the potential pollution that can be caused by the execution of the mispredicted load instructions. Our simulation results show that the WEC can improve the performance of a concurrent multithreaded architecture by up to 18.5 percent on the benchmark programs tested, with an average improvement of 9.7 percent, due to the reduction in the number of cache misses.
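
How a small fully associative buffer can keep wrong-path loads from polluting the main cache may be easier to see in a toy model. The sketch below is an assumption-laden illustration, not the simulator or the exact WEC design from the paper: it models a direct-mapped L1 indexed by block number, an LRU-managed WEC, and a policy (assumed here) of promoting a block into L1 when a correct-path load finds it in the WEC. Class and method names are hypothetical.

```python
from collections import OrderedDict


class WrongExecutionCache:
    """Small fully associative LRU buffer for blocks fetched by wrong-path loads."""

    def __init__(self, entries=8):
        self.entries = entries
        self._blocks = OrderedDict()  # block number -> True, oldest first

    def fill(self, block):
        self._blocks[block] = True
        self._blocks.move_to_end(block)
        if len(self._blocks) > self.entries:
            self._blocks.popitem(last=False)  # evict the least recently used block

    def take(self, block):
        """Remove the block if present and report whether it was found."""
        return self._blocks.pop(block, None) is not None


class L1WithWEC:
    """Direct-mapped L1 cache backed by a WEC; addresses are block numbers."""

    def __init__(self, sets=64, wec_entries=8):
        self.sets = sets
        self.lines = [None] * sets
        self.wec = WrongExecutionCache(wec_entries)
        self.misses = 0  # correct-path misses that go to the next memory level

    def load(self, block, wrong_path=False):
        index = block % self.sets
        if self.lines[index] == block:
            return "l1_hit"
        if wrong_path:
            # Mispredicted load: fetch into the WEC only, leaving L1 untouched.
            self.wec.fill(block)
            return "wec_fill"
        if self.wec.take(block):
            self.lines[index] = block  # promote the indirectly prefetched block
            return "wec_hit"
        self.misses += 1
        self.lines[index] = block
        return "miss"
```

In this model a wrong-path load to a block that conflicts with a live L1 line leaves that line in place, and a later correct-path load to the same block is served from the WEC instead of counting as a miss.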

14.
Multithreading is the developing trend in high-performance processors. The memory consistency model is essential to the correctness, performance, and complexity of a multithreaded processor. This paper proposes a chip multithreaded consistency model suited to multithreaded processors. The restrictions that chip multithreaded consistency imposes on memory event ordering are presented and formalized. Using the idea of the critical cycle introduced by Wei-Wu Hu, we prove that the proposed chip multithreaded consistency model satisfies the criterion for correct execution under the sequential consistency model. The chip multithreaded consistency model offers a way to achieve higher performance than the sequential consistency model and ensures software compatibility, in that the execution result on a multithreaded processor is the same as the execution result on a uniprocessor. An implementation strategy for the chip multithreaded consistency model in the Godson-2 SMT processor is also proposed. The Godson-2 SMT processor supports the model correctly through an exception scheme based on the sequential memory access queue of each thread.

15.
The current trend in research on multithreading processors is toward chip multithreading (CMT), which exploits thread-level parallelism (TLP) and improves the performance of software built on traditional threading components, e.g., Pthread. There are commercially available processors that support simultaneous multithreading (SMT) on multicore processors, but they are essentially based on the conventional sequential execution model and execute multiple threads in parallel under the control of an OS that handles interrupts. Moreover, few languages or programming techniques exist to utilize multicore processors effectively. We are taking another approach and developing a multithreading processor dedicated to TLP. Our processor, named Fuce, is based on continuation-based multithreading. A thread is defined as a block of sequentially ordered instructions that are executed without interruption. Every thread execution is triggered only by an event called a continuation. This paper first introduces the continuation-based multithread execution model and its processor architecture, then presents multithreaded programming techniques and the continuation-based multithreading language system CML. Finally, the performance of the Fuce processor is evaluated by clock-level software simulation.

16.
In the ongoing quest for greater computational power, efficiently exploiting parallelism is of paramount importance. Architectural trends have shifted from improving single-threaded application performance, often achieved through instruction level parallelism (ILP), to improving multithreaded application performance by supporting thread level parallelism (TLP). Thus, multi-core processors incorporating two or more cores on a single die have become ubiquitous. To achieve concurrent execution on multi-core processors, applications must be explicitly restructured to exploit parallelism, either by programmers or compilers. However, multithreaded parallel programming may introduce overhead due to communication among threads. Though some resources are shared among processor cores, current multi-core processors provide no explicit communication support for multithreaded applications that takes advantage of the proximity between cores. Currently, inter-core communication depends on cache coherence, resulting in demand-based cache line transfers with their inherent latency and overhead. In this paper, we explore two approaches to improve communication support for multithreaded applications. Prepushing is a software controlled data forwarding technique that sends data to the destination's cache before it is needed, eliminating cache misses in the destination's cache as well as reducing the coherence traffic on the bus. Software Controlled Eviction (SCE) improves thread communication by placing shared data in shared caches so that it can be found in a much closer location than remote caches or main memory. Simulation results show significant performance improvement with the addition of these architecture optimizations to multi-core processors.

17.
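Jin Li, Zhu Hao. Computer Science (《计算机科学》), 2015, 42(12): 243-246, 282
The main purpose of a declassification policy is to ensure the safe release of sensitive information in a program. Current research on security conditions and enforcement mechanisms for declassification policies focuses mainly on sequential programming languages. These results cannot be ported directly to a multithreaded concurrent environment, because an attacker can exploit certain properties of thread scheduling to infer sensitive information. To address this, based on a multithreaded programming language model and a thread scheduling model, a two-dimensional declassification policy supporting multithreaded concurrency is established, which effectively ensures that the appropriate information is declassified at the appropriate program points. A dynamic monitoring mechanism for this policy in a multithreaded concurrent environment is also established, and the soundness of the enforcement mechanism is proved.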

18.
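Ma Mingli, Chen Gang, Dong Jinxiang. Computer Measurement & Control (《计算机测量与控制》), 2006, 14(11): 1551-1553, 1556
Introduces the design and implementation of NIXMalloc, a new multithreaded memory allocation technique, and proposes two efficient allocation strategies with adaptive tuning methods that effectively improve memory management performance for multithreaded applications. The Local allocation strategy makes the superblock object Span thread-private; garbage collection and memory layout adjustment performed per superblock give better multithreaded performance. The Global allocation strategy uses adaptive tuning, dynamically adjusting memory prefetching and the thread cache limit on the basis of run-time monitoring of the application's memory usage. Experiments show that NIXMalloc improves memory management performance and raises throughput while reducing memory usage, achieving good time and space efficiency in multithreaded application systems.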
