排序方式: 共有81条查询结果,搜索用时 0 毫秒
1.
Abstract Multi-core digital signal processors (DSPs) are widely used in wireless telecommunication, core network transcoding, industrial control, and audio/video processing technologies, among others. In comparison with general-purpose multi-processors, multi-core DSPs normally have a more complex memory hierarchy, such as on-chip core-local memory and non-cache-coherent shared memory. As a result, efficient multi-core DSP applications are very difficult to write. The current approach used to program multi-core DSPs is based on proprietary vendor software development kits (SDKs), which only provide low-level, non-portable primitives. While it is acceptable to write coarse-grained task-level parallel code with these SDKs, writing fine-grained data parallel code with SDKs is a very tedious and error-prone approach. We believe that it is desirable to possess a high-level and portable parallel programming model for multi-core DSPs. In this paper, we propose OpenMDSP, an extension of OpenMP designed for multi-core DSPs. The goal of OpenMDSP is to fill the gap between the OpenMP memory model and the memory hierarchy of multi-core DSPs. We propose three classes of directives in OpenMDSP, including 1) data placement directives that allow programmers to control the placement of global variables conveniently, 2) distributed array directives that divide a whole array into sections and promote the sections into core-local memory to improve performance, and 3) stream access directives that promote big arrays into core-local memory section by section during parallel loop processing while hiding the latency of data movement by the direct memory access (DMA) of a DSP. We implement the compiler and runtime system for OpenMDSP on PreeScale MSC8156. The benchmarking results show that seven of nine benchmarks achieve a speedup of more than a factor of 5 when using six threads. 相似文献
2.
一种支持多重循环软件流水的寄存器结构 总被引:1,自引:0,他引:1
寄存器结构及其分配是软件流水算法的关键之一.为支持多重循环的软件流水,该文提出一种新颖的寄存器结构:半共享跳跃式流水寄存器堆.它可以有效地解决多重循环软件流水下的特殊问题,即:同层次和跨层次的寄存器重命名问题以及断流问题;有效地消除外层循环的体间读写相关,提高程序的指令级并行度.它有3种分配方式可供灵活使用:单个寄存器、流水寄存器和寄存器组方式.流水寄存器方式对生存期确定的、局限于一个循环层次的寄存器重命名问题提供简单而有效的支持.寄存器组分配方式解决了多重循环软件流水时变量生存期不确定的情况.跳跃操作为 相似文献
3.
4.
5.
1 引言安腾(Itanium)处理器是HP/Intel公司推出的第一代基于IA-64体系结构的处理器。IA-64体系结构是一种64位的支持显式指令级并行计算(Explicit Parallel Instruction Computing,EPIC)的体系结构,它实现了一系列新特性,支持开发更大的指令级并行性(Instruction Level Parallelism,ILP),突破了传统体系结构的性能限制。这些新特性包括:猜 相似文献
6.
本首先提出一个能够支持多分支循环程序最优执行的VLIW体系结构模型,然后在这个模型的基础上设计了一个新的主要用于数字信号处理及图象处理应用领域的单片体系结构-URPR-2。在这个体系结构中,属于不同路径和不同循环体的多个分支操作可以在一个节拍内同时被执行,因此可以在更大范围内开发指讼级并行性,同时还提出了一个种叫作流水控制黑板的机制来支持条件分支操作。URPR-2不仅能够以很高的速度执行只含有基 相似文献
7.
提出了一套全新的微机接口实验仪。该实验仪利用PCI主模式控制芯片Plx9054在PCI总线上扩展出一个8位教学用微机实验总线(E-Bus),并在该总线上设计了一个EDA的开发平台和一套软件开发包。学生即可以利用该E-DA平台和软件开发包在E-Bus上设计各种PC机接口。该文首先描述了E-BUS的扩展方法和EDA开发平台的组成原理,然后给出了一个键盘接口的例子对实验步骤做了详细描述,最后与国内同类产品进行了比较。 相似文献
8.
9.
10.
本介绍一个采用VLIW超长指令字体系结构的高性能单片多处理机,在这个体系结构中采用流水寄存器堆来消除循环程序内的数据相关,从而使程序能够在指令级以极高的并行度并行运行。模拟实验结果表明这个体系结构具有很高的运算速度和很好的性能价格比。 相似文献