首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
随着继电保护设备的功能以及集成度的提升,单核处理器在性能上已捉襟见肘,多核处理器得到了较广泛应用.多核处理器在继电保护设备中通常工作于多核异构模式,该模式下划分出的各运行区之间通常基于共享内存实现数据交换.本文提出了一种在多核异构模式下去中心化的有管理的共享内存设计方法,可以实现各运行区对等的进行抽象化的共享内存的资源...  相似文献   

2.
随着多核技术的不断发展,多核CPU已经成为处理器市场的主流。如何充分利用多核的优势提高应用程序的性能是开发人员不得不面对的课题。多核系统为开发人员提供了一个实现并行计算的重要平台。文中探讨了基于双核系统的快速排序的效率,介绍了C#线程编程的相关知识,并在此基础上实现了基于双核系统的多线程的快速排序算法,实验结果表明该算法较传统快速排序算法而言,算法执行效率得到了很大的提升。  相似文献   

3.
多核数字信号处理器(DSP)的性能常常受限于共享存储的长延迟Cache一致性访问.数据前向(forwarding)技术是隐藏长延迟访问的一种有效手段.根据多核DSP应用的两类重要特征,提出了一种面向共享存储多核DSP结构的数据流分簇前向技术DSCF(data stream clustered forwarding).DSCF方法的主要特点是:兼容基本的共享存储Cache一致性协议;不污染目标Cache;数据的传输速度能够与消费速度相匹配;系统结构的可扩展性好.典型测试程序的模拟评测表明,采用DSCF方法能够将Cache一致性失效率平均降低44%,将系统总体性能提升30%~70%.  相似文献   

4.
Recent advances in computing architectures and networking are bringing parallel computing systems to the masses so increasing the number of potential users of these kinds of systems. In particular, two important technological evolutions are happening at the ends of the computing spectrum: at the “small” scale, processors now include an increasing number of independent execution units (cores), at the point that a mere CPU can be considered a parallel shared-memory computer; at the “large” scale, the Cloud Computing paradigm allows applications to scale by offering resources from a large pool on a pay-as-you-go model. Multi-core processors and Clouds both require applications to be suitably modified to take advantage of the features they provide. Despite laying at the extreme of the computing architecture spectrum – multi-core processors being at the small scale, and Clouds being at the large scale – they share an important common trait: both are specific forms of parallel/distributed architectures. As such, they present to the developers well known problems of synchronization, communication, workload distribution, and so on. Is parallel and distributed simulation ready for these challenges? In this paper, we analyze the state of the art of parallel and distributed simulation techniques, and assess their applicability to multi-core architectures or Clouds. It turns out that most of the current approaches exhibit limitations in terms of usability and adaptivity which may hinder their application to these new computing architectures. We propose an adaptive simulation mechanism, based on the multi-agent system paradigm, to partially address some of those limitations. While it is unlikely that a single approach will work well on both settings above, we argue that the proposed adaptive mechanism has useful features which make it attractive both in a multi-core processor and in a Cloud system. These features include the ability to reduce communication costs by migrating simulation components, and the support for adding (or removing) nodes to the execution architecture at runtime. We will also show that, with the help of an additional support layer, parallel and distributed simulations can be executed on top of unreliable resources.  相似文献   

5.
Multi-core processors are becoming prevalent rapidly in personal computing and embedded systems.Nevertheless,the programming environment for multi-core processor-based systems is still quite immature and lacks efficient tools.In this work,we present a new VERTAF/Multi-Core framework and show how software code can be automatically generated from SysML models of multi-core embedded systems.We illustrate how model-driven design based on SysML can be seamlessly integrated with Intel’s threading building blocks (TBB) and the quantum framework (QF) middleware.We use a digital video recording system to illustrate the benefits of the framework.Our experiments show how SysML/QF/TBB help in making multi-core embedded system programming model-driven,easy,and efficient.  相似文献   

6.
多核处理器,尤其是单芯片多处理器(chip multi-processor,CMP)能够提供强大的共享内存的并行资源,然而单核处理器上的程序和算法并不能充分利用多核架构提供的并行计算资源,因此必须针对多核体系架构特点,对算法进行改进优化,提高算法的执行性能。以优化程序局部性、减少cache访问冲突、提高线程并行度、充分利用单指令多数据流(single instruction multipledata,SIMD)并行和带宽优化等几方面为出发点,归纳和分析了多核处理器上数据处理算法的相关优化策略,并对多核算法进行了总结评述。最后阐述了该领域亟待解决的诸多问题,展望了未来的研究发展方向。  相似文献   

7.
In recent years, the need for high-performance network monitoring tools, which can cope with rapidly increasing network bandwidth, has become vital. A possible solution is to utilize the processing power of multi-core processors that nowadays are available as commercial-off-the-shelf (COTS) hardware. In this paper, we introduce a software solution for wire-speed packet capturing and transmission for TCP/IP networks under Linux operating system, called DashCap. The results of our experimental evaluations show that the proposed solution causes more than two times performance boost for packet capturing in comparison to the existing software solutions under Linux. We have proposed a scalable software architecture for network monitoring tools called DashNMon, which is based on DashCap. Multi-core awareness is a distinguished property of this architecture. Comparing to the existing cluster-based solutions, DashNMon can be used with COTS multi-core processors. In order to evaluate the proposed solutions, we have developed several prototype tools. The results of the experiments carried out using these tools show the scalability and high performance of the network monitoring tools that are based on the proposed architecture. Using the proposed architecture, it is possible to design and implement high-performance multi-threaded network intrusion detection systems (NIDSs) or application-layer firewalls, completely in the user space and with better utilization of the computational resources of multi-processor/multi-core systems.  相似文献   

8.
多核处理器及其对系统结构设计的影响   总被引:3,自引:0,他引:3       下载免费PDF全文
多核技术成为当今处理器技术发展的重要方向,已经是计算机系统设计者必须直面的现实。从计算机系统结构的角度探讨了同构与异构、通用与多用等多核处理器的类型,分析了典型多核处理器的微结构、工艺等结构特点,讨论了多核处理器对计算机系统结构设计带来的挑战。  相似文献   

9.
通用非对称多核方案设计   总被引:1,自引:0,他引:1  
多核处理器是目前处理器发展的主流方向, 但硬实时保证方面存在诸多挑战. 通过研究分析实时应用需求和多核处理器的应用现状, 提出一种基于通用处理器的非对称多核方案. 重点讨论了方案的软件总体设计、共享资源管理、非对称多核模式下的从核镜像加载、启动和核间通信的设计. 采用通用非对称多核方案研制的低压保护测控装置, 其现场运行情况表明方案满足电力二次设备的实时性能要求.  相似文献   

10.
随着安全关键系统对计算性能要求的日趋提高,能够提供更强计算能力而又减少电子设备的体积、重量和功耗的多核处理器将在安全关键领域得到广泛应用.同步语言能够表达确定性并发行为且具有精确时间语义等特性,适用于安全关键软件的建模和验证.目前,同步语言SIGNAL编译器主要支持串行代码生成,较少关注多线程代码生成.提出一种同步语言SIGNAL多线程代码生成工具.首先将SIGNAL程序转换为经过时钟演算的S-CGA中间程序;之后将S-CGA中间程序转换为时钟数据依赖图以分析依赖关系;然后对时钟数据依赖图进行拓扑排序划分,并针对划分结果提出优化算法和基于流水线方式的任务划分方法;最后将划分结果转换为虚拟多线程结构并进一步生成可执行多线程C/Java代码.通过在多核处理器上的实验,验证了所提方法的有效性.  相似文献   

11.
能够提供更强计算能力的多核处理器将在安全关键系统中得到广泛应用.但是,由于现代处理器所使用的流水线、乱序执行、动态分支预测、Cache等性能提高机制以及多核之间的资源共享,使得系统的最坏执行时间分析变得非常困难.为此,国际学术界提出时间可预测系统设计的思想,以降低系统的最坏执行时间分析难度.已有研究主要关注硬件层次及其编译方法的调整和优化,而较少关注软件层次,即时间可预测多线程代码的构造方法以及到多核硬件平台的映射.本文提出一种基于同步语言模型驱动的时间可预测多线程代码生成方法,并对代码生成器的语义保持进行证明;提出一种基于AADL(Architecture Analysis and Design Language)的时间可预测多核体系结构模型,作为本文研究的目标平台;最后,给出多线程代码到多核体系结构模型的映射方法,并给出系统性质的分析框架.  相似文献   

12.
集成电路制造工艺的飞速发展,使得集成电路的特征尺寸不断减少和集成度不断提高,造成集成电路对工作环境的影响越来越敏感,发生软错误的几率不断增加,对可靠性造成重要影响。随着微处理器进入了多核时代,丰富的片上资源给软错误加固带来了很好的机遇。本文针对多核处理器中I/O系统软错误,提出了一种基于多核处理器的软件Scrub方法对软错误进行加固。测试结果表明,我们提出的软错误容错方法可以大大提高I/O系统的可靠性。  相似文献   

13.
Real-time systems need time-predictable platforms to allow static analysis of the worst-case execution time (WCET). Standard multi-core processors are optimized for the average case and are hardly analyzable. Within the T-CREST project we propose novel solutions for time-predictable multi-core architectures that are optimized for the WCET instead of the average-case execution time. The resulting time-predictable resources (processors, interconnect, memory arbiter, and memory controller) and tools (compiler, WCET analysis) are designed to ease WCET analysis and to optimize WCET performance. Compared to other processors the WCET performance is outstanding.The T-CREST platform is evaluated with two industrial use cases. An application from the avionic domain demonstrates that tasks executing on different cores do not interfere with respect to their WCET. A signal processing application from the railway domain shows that the WCET can be reduced for computation-intensive tasks when distributing the tasks on several cores and using the network-on-chip for communication. With three cores the WCET is improved by a factor of 1.8 and with 15 cores by a factor of 5.7.The T-CREST project is the result of a collaborative research and development project executed by eight partners from academia and industry. The European Commission funded T-CREST.  相似文献   

14.
Identifying frequent items in high-speed network is important for a variety of network applications ranging from traffic engineering to anomaly detection such as detection of denial of service attacks. To deal with high packet arrival rate, it is desirable that such systems are able to support very high update throughput. The advent of multi-core processors calls for efficient parallel designs which can effectively utilize the parallelism of the multi-cores. In this paper, we address the problem of parallelizing weighted frequency counting in the context of multi-core processors. We discuss the challenges in designing an efficient parallel system. Our evaluation and analysis reveals that the naive fine-grained lock design results in excessive overhead and wait, which in turn leads to severe performance degradation in multi-core architectures. Based on our analysis, we propose a novel method: precision integrated method (PRIM). PRIM makes use of the temporal imprecision concept to significantly reduce the merge overhead at the cost of relatively large memory space used. Both the theoretical analysis and real traffic experiments demonstrate that PRIM delivers almost linear speedup.  相似文献   

15.
Multi-core architectures are widely used to enhance the microprocessor performance within a limited increase in time-to-market and power consumption of the chips.Toward the application of high-density data signal processing, this paper presents a novel heterogeneous multi-core architecture digital signal processor(DSP),YHFT-QDSP,with one RISC CPU core and 4 VLIW DSP cores.By three kinds of interconnection,YHFT-QDSP provides high efficiency message communication for inner-chip RISC core and DSP cores,inne...  相似文献   

16.
Decoding of an H.264 video stream is a computationally demanding multimedia application which poses serious challenges on current processor architectures. For processors with strongly limited computational resources, a natural way to tackle this problem is the use of multi-core systems. The contribution of this paper lies in a systematic overview and performance evaluation of parallel video decoding approaches. We focus on decoder splittings for strongly resource-restricted environments inherent to mobile devices. For the evaluation, we introduce a high-level methodology which can estimate the runtime behaviour of multi-core decoding architectures. We use this methodology to investigate six methods for accomplishing data-parallel splitting of an H.264 decoder. These methods are compared against each other in terms of runtime complexity, core usage, inter-communication and bus transfers. We present benchmark results using different numbers of processor cores. Our results shall aid in finding the splitting strategy that is best-suited for the targeted hardware-architecture.  相似文献   

17.
多核处理器环境日益普及,针对此体系结构的应用程序的并行性研究已成为焦点。轻量级Web服务器应用广泛,但传统实现不能充分利用多核优势且编程方法相对复杂。提出一种基于数据流结合虚拟机优化及Java扩展库的解决方案,给出了一种新的Web服务器实现方法。实验结果表明该方法有效提高了Web服务器的处理性能。  相似文献   

18.
We present a new method for the interactive rendering of isosurfaces using ray casting on multi-core processors. This method consists of a combination of an object-order traversal that coarsely identifies possible candidate 3D data blocks for each small set of contiguous pixels, and an isosurface ray casting strategy tailored for the resulting limited-size lists of candidate 3D data blocks. While static screen partitioning is widely used in the literature, our scheme performs dynamic allocation of groups of ray casting tasks to ensure almost equal loads among the different threads running on multi-cores while maintaining spatial locality. We also make careful use of memory management environment commonly present in multi-core processors. We test our system on a two-processor Clovertown platform, each consisting of a Quad-Core 1.86-GHz Intel Xeon Processor, for a number of widely different benchmarks. The detailed experimental results show that our system is efficient and scalable, and achieves high cache performance and excellent load balancing, resulting in an overall performance that is superior to any of the previous algorithms. In fact, we achieve an interactive isosurface rendering on a 1024(2) screen for all the datasets tested up to the maximum size of the main memory of our platform.  相似文献   

19.
Multi-core real-time scheduling for generalized parallel task models   总被引:1,自引:0,他引:1  
Multi-core processors offer a significant performance increase over single-core processors. They have the potential to enable computation-intensive real-time applications with stringent timing constraints that cannot be met on traditional single-core processors. However, most results in traditional multiprocessor real-time scheduling are limited to sequential programming models and ignore intra-task parallelism. In this paper, we address the problem of scheduling periodic parallel tasks with implicit deadlines on multi-core processors. We first consider a synchronous task model where each task consists of segments, each segment having an arbitrary number of parallel threads that synchronize at the end of the segment. We propose a new task decomposition method that decomposes each parallel task into a set of sequential tasks. We prove that our task decomposition achieves a resource augmentation bound of 4 and 5 when the decomposed tasks are scheduled using global EDF and partitioned deadline monotonic scheduling, respectively. Finally, we extend our analysis to a directed acyclic graph (DAG) task model where each node in the DAG has a unit execution requirement. We show how these tasks can be converted into synchronous tasks such that the same decomposition can be applied and the same augmentation bounds hold. Simulations based on synthetic workload demonstrate that the derived resource augmentation bounds are safe and sufficient.  相似文献   

20.
Single-instruction-set architecture (Single-ISA) heterogeneous multi-core processors (HMP) are superior to Symmetric Multi-core processors in performance per watt. They are popular in many aspects of the Internet of Things, including mobile multimedia cloud computing platforms. One Single-ISA HMP integrates both fast out-of-order cores and slow simpler cores, while all cores are sharing the same ISA. The quality of service (QoS) is most important for virtual machine (VM) resource management in multimedia mobile computing, particularly in Single-ISA heterogeneous multi-core cloud computing platforms. Therefore, in this paper, we propose a dynamic cloud resource management (DCRM) policy to improve the QoS in multimedia mobile computing. DCRM dynamically and optimally partitions shared resources according to service or application requirements. Moreover, DCRM combines resource-aware VM allocation to maximize the effectiveness of the heterogeneous multi-core cloud platform. The basic idea for this performance improvement is to balance the shared resource allocations with these resources requirements. The experimental results show that DCRM behaves better in both response time and QoS, thus proving that DCRM is good at shared resource management in mobile media cloud computing.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号