

Low-level implementation of the SISC protocol for thread-level speculation on a multi-core architecture
Affiliation: 1. School of Electrical Engineering, University of Belgrade, Bul. Kralja Aleksandra 73, Belgrade, Serbia; 2. Thales Research & Technologies, Augustin Fresnel 1, 91120 Palaiseau, France; 1. Department of Mechanical Engineering, University of Delaware, Newark, Delaware 19716-3140, USA; 2. Department of Electrical and Computer Engineering, University of Delaware, Newark, Delaware 19716-3140, USA; 3. State Key Laboratory of Coal Combustion, Huazhong University of Science and Technology, Wuhan, PR China; 1. Dept. of Computer Engineering and Informatics, University of West Attica, Athens, Greece; 2. Dept. of Electrical and Computer Engineering, University of Peloponnese, Patras, Greece
Abstract: Chip multiprocessors (CMPs) have emerged over the last decades as an attractive way to use the ever-increasing on-chip transistor count. However, classical parallelization techniques have failed to fully exploit the parallelism in existing sequential applications because of false data dependencies. This paper focuses on thread-level speculation (TLS), an alternative way to exploit the transistor budget of a CMP. With TLS, even possibly data-dependent threads can run in parallel as long as the semantics of sequential execution are preserved. Dedicated hardware support monitors the actual data dependencies between threads at run time and, if they are violated, undoes the effects of misspeculation, usually through replay. Such a system is known as a speculative CMP. However, the TLS mechanism requires complex protocols that integrate cache coherence and speculation to maintain program order among multiple versions of data. Current TLS protocol evaluations are usually inadequate because they are not performed at a sufficiently low level. A realistic evaluation of speculative CMPs must be performed either on real hardware or with very detailed cycle-accurate simulator models.
In this paper we focus on a low-level evaluation of the write-invalidate TLS protocol Speculation Integrated with Snoopy Coherence (SISC) proposed in [1]. The evaluation relies on a cycle-level simulation environment with detailed cycle-level models of the cache memories, cache controller and system bus. On top of this, a speculative four-core architecture is simulated, and three new modules (Scheduler, Squash Arbiter and Supplier Arbiter) are provided to support the low-level implementation of the SISC protocol. The overall cost of the SISC protocol is evaluated with the CACTI tool in three domains: access latency, area, and power.
The evaluation goal was to keep the cache access time below the cycle latency and the area and power overheads within an acceptable budget. The SISC protocol has been compared against a regular MESI-based architecture in both 32-bit and 64-bit versions. We kept the cache access time below the cycle latency, and we kept the data cache area and static power overheads below 32% and 35%, respectively.
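
The dependence monitoring and replay that the abstract attributes to TLS can be illustrated with a small software model. The sketch below is not the SISC protocol and does not model caches, the bus, or the Scheduler, Squash Arbiter and Supplier Arbiter modules. It is a toy, assuming each speculative thread keeps a set of exposed reads and a buffer of uncommitted writes, and that a store by an earlier thread to an address already read by a later thread squashes that thread and every thread after it. All names (`SpecThread`, `ToyTLS`, `load`, `store`, `squash`, `commit_all`) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SpecThread:
    """One speculative thread; a lower `order` is earlier in sequential order."""
    order: int
    read_set: set = field(default_factory=set)     # addresses read before being written locally
    write_buf: dict = field(default_factory=dict)  # speculative writes, not yet committed
    squashed: bool = False


class ToyTLS:
    """Toy model of TLS dependence tracking (an illustration, not SISC)."""

    def __init__(self, memory, n_threads):
        self.memory = dict(memory)
        self.threads = [SpecThread(order=i) for i in range(n_threads)]

    def load(self, tid, addr):
        t = self.threads[tid]
        if addr in t.write_buf:
            # Reading our own write is not an exposed read.
            return t.write_buf[addr]
        t.read_set.add(addr)
        # Take the newest version produced by an earlier thread, if any.
        for pred in reversed(self.threads[:tid]):
            if addr in pred.write_buf:
                return pred.write_buf[addr]
        return self.memory[addr]

    def store(self, tid, addr, value):
        self.threads[tid].write_buf[addr] = value
        # Read-after-write violation: a later thread already read this address.
        for succ in self.threads[tid + 1:]:
            if addr in succ.read_set:
                self.squash(succ.order)
                break

    def squash(self, tid):
        # Assumption: conservatively squash the violator and all later threads.
        for t in self.threads[tid:]:
            t.read_set.clear()
            t.write_buf.clear()
            t.squashed = True

    def commit_all(self):
        # Commit in sequential order so later versions overwrite earlier ones.
        for t in self.threads:
            self.memory.update(t.write_buf)


tls = ToyTLS({0x10: 1}, n_threads=2)
stale = tls.load(1, 0x10)        # thread 1 reads too early
tls.store(0, 0x10, 5)            # thread 0 writes later: thread 1 is squashed
tls.threads[1].squashed = False  # replay thread 1
fresh = tls.load(1, 0x10)        # the replayed read sees thread 0's version
tls.commit_all()
```

In this toy the premature read returns the old memory value, the store from the earlier thread flags the violation, and the replayed read picks up the forwarded version. A hardware protocol has to make the same decisions through cache states and bus transactions, which is why the abstract argues for a cycle-level evaluation.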
This article is indexed in ScienceDirect and other databases.