OpenMDSP: Extending OpenMP to Program Multi-Core DSPs

何江舟, 陈文光, 陈光日, 郑纬民, 汤志忠, 叶寒栋. 《计算机科学技术学报》 (Journal of Computer Science and Technology), 2014, 29(2): 316-331

Multi-core digital signal processors (DSPs) are widely used in wireless telecommunication, core network transcoding, industrial control, and audio/video processing technologies, among others. In comparison with general-purpose multi-processors, multi-core DSPs normally have a more complex memory hierarchy, such as on-chip core-local memory and non-cache-coherent shared memory. As a result, efficient multi-core DSP applications are very difficult to write. The current approach to programming multi-core DSPs is based on proprietary vendor software development kits (SDKs), which only provide low-level, non-portable primitives. While it is acceptable to write coarse-grained task-level parallel code with these SDKs, writing fine-grained data-parallel code with them is tedious and error-prone. We believe that a high-level and portable parallel programming model for multi-core DSPs is desirable. In this paper, we propose OpenMDSP, an extension of OpenMP designed for multi-core DSPs. The goal of OpenMDSP is to fill the gap between the OpenMP memory model and the memory hierarchy of multi-core DSPs. We propose three classes of directives in OpenMDSP: 1) data placement directives that allow programmers to control the placement of global variables conveniently, 2) distributed array directives that divide a whole array into sections and promote the sections into core-local memory to improve performance, and 3) stream access directives that promote big arrays into core-local memory section by section during parallel loop processing while hiding the latency of data movement by using the direct memory access (DMA) engine of a DSP. We implement the compiler and runtime system for OpenMDSP on the Freescale MSC8156. The benchmarking results show that seven of nine benchmarks achieve a speedup of more than a factor of 5 when using six threads.

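Research on Hyper-Threading Parallel Computing for Casting Numerical Simulation

陈邦乾, 陈立亮. 《铸造技术》 (Foundry Technology), 2007, 28(9): 1230-1234

OpenMP, as a shared-memory parallel programming standard, has become one of the mainstream models for parallel program design thanks to its ease of use and its support for incremental parallelization. This paper examines the parallel computation of the temperature field in casting numerical simulation under gravity-feeding and non-gravity-feeding conditions. Examples show that, on hardware that supports Hyper-Threading, OpenMP achieves good parallelism and greatly improves computational efficiency.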
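The abstract above does not give the solver itself, so the following is only a minimal sketch of how a temperature-field update might be parallelized incrementally with OpenMP. It assumes a uniform 2D grid and an explicit finite-difference step; the function name `heat_step` and the parameter `r` (standing for alpha*dt/h^2) are hypothetical and not taken from the paper.

```c
#include <assert.h>
#include <stddef.h>

/* One explicit finite-difference step of 2D heat conduction (illustrative only).
 * t_old and t_new are row-major nx-by-ny grids; boundary cells are copied unchanged.
 * r is a hypothetical diffusion number, alpha * dt / h^2. */
void heat_step(const double *t_old, double *t_new, int nx, int ny, double r)
{
    assert(nx > 0 && ny > 0);

    /* Each row is independent because the step only reads t_old,
     * so a single pragma parallelizes the existing serial loop. */
    #pragma omp parallel for
    for (int i = 0; i < nx; ++i) {
        for (int j = 0; j < ny; ++j) {
            size_t k = (size_t)i * (size_t)ny + (size_t)j;
            if (i == 0 || j == 0 || i == nx - 1 || j == ny - 1) {
                t_new[k] = t_old[k];  /* fixed boundary temperature */
            } else {
                t_new[k] = t_old[k] + r * (t_old[k - ny] + t_old[k + ny]
                                         + t_old[k - 1] + t_old[k + 1]
                                         - 4.0 * t_old[k]);
            }
        }
    }
}
```

Compiled without OpenMP support, the pragma is ignored and the loop runs serially with the same result, which is the incremental-parallelization property the abstract refers to.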

Performance Evaluation of a Multi-Zone Application in Different OpenMP Approaches

Haoqiang Jin, Barbara Chapman, Lei Huang, Dieter an Mey, Thomas Reichstein. 《International Journal of Parallel Programming》, 2008, 36(3): 312-325

We describe a performance study of a multi-zone application benchmark implemented in several OpenMP approaches that exploit multi-level parallelism and deal with unbalanced workload. The multi-zone application was derived from the well-known NAS Parallel Benchmarks (NPB) suite that involves flow solvers on collections of loosely coupled discretization meshes. Parallel versions of this application have been developed using the Subteam concept and Workqueuing model as extensions to the current OpenMP. We examine the performance impact of these extensions to OpenMP and compare with hybrid and nested OpenMP approaches on several large parallel systems.

Multi-core CPUs, Clusters, and Grid Computing: A Tutorial (cited by: 1; self-citations: 0; citations by others: 1)

Michael Creel, William L. Goffe. 《Computational Economics》, 2008, 32(4): 353-382

The nature of computing is changing, and this poses both challenges and opportunities for economists. Instead of increasing clock speed, future microprocessors will have “multi-cores” with separate execution units. “Threads” or other multi-processing techniques that are rarely used today are required to take full advantage of them. Beyond one machine, it has become easy to harness multiple computers to work in clusters. Besides dedicated clusters, they can be made up of unused lab computers or even your colleagues’ machines. Finally, grids of computers spanning the Internet are now becoming a reality.

OpenMP compiler for distributed memory architectures

WANG Jue, HU ChangJun, ZHANG JiLin & LI JianJiang (School of Information Engineering, University of Science and Technology Beijing, Beijing, China). 《中国科学:信息科学(英文版)》 (Science China Information Sciences), 2010, (5): 932-944

OpenMP is an emerging industry standard for shared memory architectures. While OpenMP has advantages in ease of use and incremental programming, message passing is still the most widely used programming model for distributed memory architectures. How to extend OpenMP effectively to distributed memory architectures has been an active research topic. This paper proposes an OpenMP system, called KLCoMP, for distributed memory architectures. Based on the partially replicating shared arrays memory model, we propose ...

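MPI+OpenMP Hybrid Programming Implementation of Multi-Color SSOR-PCG

林绍忠, 许合伟, 颉志强. 《计算机辅助工程》 (Computer Aided Engineering), 2013, 22(6): 79-83

Parallelizing the symmetric successive over-relaxation preconditioned conjugate gradient (SSOR-PCG) method is difficult because every iteration requires solving two triangular systems in parallel. To address this, a multi-color ordering technique is used to increase the degree of parallelism, and a parallel program suited to distributed shared-memory computers is developed on the MPI+OpenMP hybrid programming model. Effective MPI communication functions are selected through testing, and three measures for avoiding data races on shared data are given, to be chosen according to problem size and the memory capacity of the machine.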
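To illustrate why multi-color ordering raises the degree of parallelism, here is a minimal sketch using the simplest case: a two-color (red-black) ordering on the 1D model system -u[i-1] + 2u[i] - u[i+1] = f[i] with fixed end values. It shows only the OpenMP part of one symmetric relaxation sweep. The model problem and the names `relax_color` and `ssor_sweep` are assumptions for illustration; the MPI layer, the PCG iteration, and the paper's three race-avoidance measures are not shown.

```c
#include <assert.h>

/* Relax every interior point of one color (index parity).
 * Points of the same color never neighbor each other in this 1D stencil,
 * so all updates within a color are independent and can run in parallel. */
static void relax_color(double *u, const double *f, int n, int color, double omega)
{
    #pragma omp parallel for
    for (int i = 1 + color; i < n - 1; i += 2) {
        double gs = 0.5 * (f[i] + u[i - 1] + u[i + 1]);  /* Gauss-Seidel value */
        u[i] += omega * (gs - u[i]);                     /* over-relaxed update */
    }
}

/* One symmetric sweep: colors in forward order, then in reverse order.
 * u[0] and u[n-1] hold fixed boundary values and are never modified. */
void ssor_sweep(double *u, const double *f, int n, double omega)
{
    assert(n >= 3 && omega > 0.0 && omega < 2.0);
    relax_color(u, f, n, 0, omega);
    relax_color(u, f, n, 1, omega);
    relax_color(u, f, n, 1, omega);
    relax_color(u, f, n, 0, omega);
}
```

With a natural (lexicographic) ordering, each update depends on the one before it and the sweep is sequential; coloring trades that dependency for a small number of fully parallel phases.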

Efficient parallel implementation of Ewald summation in molecular dynamics simulations on multi-core platforms

Yali Liu, Changjun Hu, Chongchong Zhao. 《Computer Physics Communications》, 2011, (5): 1111-1119

We present a multi-step computation method to implement the Ewald summation for long-range electrostatic interactions in molecular dynamics simulations on a multi-core machine. Our methodology is based on the OpenMP programming model. It partitions computations of real-space summation among threads so that the global force of a single particle cannot be modified by more than one thread simultaneously. It requires neither a private copy of the force array for each thread nor an inspector at runtime. Compared with some other methods that can parallelise reduction operations on a force array, our method achieves relatively higher speedups and lower L2 cache miss and bus utilisation ratios.
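The race-freedom property described above can be shown with a much simpler scheme than the paper's multi-step method, which the abstract does not detail. The sketch below is an assumption-laden stand-in: it uses a plain 1D Coulomb-like pair force rather than the real-space Ewald term, and an "owner computes" loop in which each iteration writes only the force of its own particle. The function name `pair_forces` is hypothetical.

```c
#include <assert.h>

/* Owner-computes force accumulation (illustrative, not the paper's method).
 * Iteration i is the only writer of force[i], so no two threads ever modify
 * the same particle's force; no per-thread force copies or atomics are needed.
 * The cost is that every pair is evaluated twice (Newton's third law is not used). */
void pair_forces(const double *x, const double *q, double *force, int n)
{
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; ++i) {
        double fi = 0.0;                         /* thread-local accumulator */
        for (int j = 0; j < n; ++j) {
            if (j == i) continue;
            double d = x[i] - x[j];
            double r2 = d * d;
            assert(r2 > 0.0);                    /* positions assumed distinct */
            double dir = (d > 0.0) ? 1.0 : -1.0; /* like charges repel */
            fi += q[i] * q[j] * dir / r2;
        }
        force[i] = fi;                           /* single write per particle */
    }
}
```

The paper's contribution is to obtain the same single-writer guarantee without this doubling of pair evaluations; the sketch only makes the guarantee itself concrete.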

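An OpenMP Fault-Tolerance Mechanism Based on Parallel Recomputing (cited by: 1; self-citations: 0; citations by others: 1)

富弘毅, 丁滟, 宋伟, 杨学军. 《软件学报》 (Journal of Software), 2012, 23(2): 411-427

Fault recovery based on parallel recomputing distributes the recovery workload across the nodes that have not failed and executes it in parallel, which significantly shortens recomputation time, reduces recovery overhead, and improves the fault-tolerance performance of parallel programs. Building on this technique, the paper proposes PR-OMP, a fault-tolerance mechanism for OpenMP parallel programs that effectively solves problems such as segmented recomputation and redistribution of the recomputation load. It also extends traditional compiler data-flow analysis with a data-flow analysis technique for OpenMP parallel programs, and uses that technique to compute and optimize the state-saving overhead. A compiler tool supporting PR-OMP, called GiFT-OMP, is designed and implemented. Experiments demonstrate the effectiveness of the PR-OMP mechanism and its supporting tool, and evaluate and analyze their performance and scalability.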

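An Adaptive Task Granularity Strategy Supporting OpenMP 3.0 on Heterogeneous Multi-Core Processors

曹倩, 左敏. 《小型微型计算机系统》 (Journal of Chinese Computer Systems), 2012, 33(6): 1350-1357

Task granularity is a key factor in the performance of task-parallel programs. Because the optimal granularity can differ between applications, the paper proposes an adaptive task granularity strategy that supports OpenMP 3.0 on the heterogeneous multi-core Cell processor. The strategy first generates tasks breadth-first until all threads are saturated. After that, when a thread finishes its own tasks and becomes idle, the strategy backtracks to the earliest node in a busy thread's task tree from which a task can be spawned and generates a new task there for the idle thread to steal. This both maximizes the granularity of the generated tasks and effectively resolves load imbalance. In experiments on a single Cell processor, the strategy achieves speedups of 4.1 to 7.2 over sequential execution, outperforms the existing Tascell and AdaptiveTC schemes, and shows good scalability for the vast majority of applications.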

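Research on Parallel Edge Extraction Algorithms in Multi-Core Environments

张思乾, 程果, 陈荤, 熊伟. 《计算机科学》 (Computer Science), 2012, 39(1): 295-298

As processors shift from high-clock-rate single-core designs to chip multiprocessors (CMPs), the parallel processing capability of computers keeps growing. The paper analyzes the performance bottlenecks of serial GIS algorithms and exploits CMPs by processing raster data with thread-level parallelism. For the edge extraction algorithm, it analyzes and compares current mainstream parallel programming models such as MPI and OpenMP in depth, and proposes a parallel performance estimation model. Using the OpenMP programming model, it analyzes how the number of threads, the scheduling method, and the chunk size affect parallel performance, in order to find the best parallel configuration for edge extraction. Experiments show that the performance estimation model accurately predicts parallel performance in a CMP environment, and that the OpenMP-based parallel edge extraction algorithm improves the efficiency of image edge extraction.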
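The three tuning knobs named in the abstract (thread count, scheduling method, chunk size) map directly onto OpenMP clauses. The sketch below assumes a Sobel operator on an 8-bit grayscale raster, since the abstract does not say which edge operator the authors used; the function name `sobel_edges` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Sobel gradient magnitude (|gx| + |gy|) over a row-major grayscale image.
 * threads and chunk_rows expose the thread count and chunk size as parameters;
 * the scheduling kind (dynamic here) is fixed in the pragma. Border pixels are set to 0. */
void sobel_edges(const unsigned char *in, int *out, int width, int height,
                 int threads, int chunk_rows)
{
    assert(width >= 3 && height >= 3 && threads > 0 && chunk_rows > 0);

    /* Rows are handed out to threads in blocks of chunk_rows. */
    #pragma omp parallel for num_threads(threads) schedule(dynamic, chunk_rows)
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            int k = y * width + x;
            if (y == 0 || x == 0 || y == height - 1 || x == width - 1) {
                out[k] = 0;
                continue;
            }
            const unsigned char *p = in + k;
            int gx = (p[-width + 1] + 2 * p[1] + p[width + 1])
                   - (p[-width - 1] + 2 * p[-1] + p[width - 1]);
            int gy = (p[width - 1] + 2 * p[width] + p[width + 1])
                   - (p[-width - 1] + 2 * p[-width] + p[-width + 1]);
            out[k] = abs(gx) + abs(gy);
        }
    }
}
```

Swapping `dynamic` for `static` or `guided` and timing the call across several values of `threads` and `chunk_rows` would reproduce the kind of comparison the abstract describes, although the specific settings the authors tested are not given there.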