Similar literature
 20 similar documents found (search time: 15 ms)
1.
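A two-dimensional micro-motion stage was designed using a double parallel flexure hinge structure, and the stiffness of the stage in each direction was calculated. A theoretical stiffness model was established from the constraints; the sequential quadratic programming (SQP) algorithm was used for multi-objective optimization of the stage structure, and the mathematical model was solved in MATLAB; ANSYS was used to simulate the displacement, stiffness, and natural frequency of the optimized stage. Theoretical calculations and simulation results show that the micro-motion stage has good static and dynamic characteristics.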

2.
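To address the shortcomings of the traditional MPI-based parallel FDTD algorithm, an optimized two-level parallel FDTD algorithm is proposed. Combining the characteristics of the MPI and OpenMP programming models, two parallel FDTD algorithms are implemented with an MPI-OpenMP hybrid programming model on an SMP cluster platform. On an SMP cluster built in the laboratory, the hybrid algorithms are compared with the MPI-based parallel FDTD algorithm by analyzing scattering from a metallic cuboid. The results show that the hybrid parallel algorithms achieve better speedup and bandwidth utilization.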

3.
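Research on hierarchical parallel programming techniques for SMP clusters   Total citations: 2, self-citations: 0, citations by others: 2
祝永志, 张丹丹, 曹宝香, 禹继国. 《电子学报》, 2012, 40(11): 2206-2210
Targeting the architecture of multi-core SMP clusters, MPI+OpenMP hybrid parallel programming techniques are discussed. A new multi-level hybrid design method is proposed. A multi-level parallel algorithm for the N-body problem is designed and compared in performance with the traditional hybrid algorithm on the Dawning 5000A (曙光5000A) cluster. The results show that the hierarchical hybrid parallel algorithm has better scalability and speedup.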

4.
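The second-generation digital satellite broadcasting system DVB-S2 uses LDPC codes, which approach the Shannon limit, as its inner code. Among LDPC decoding methods, soft-decision sum-product decoding performs best, but because it requires a large number of floating-point operations, the decoder is difficult to implement in both software and hardware. To address this, a decoding implementation based on graphics processing unit (GPU) programming is proposed. The parallel processing capability of the GPU lets it meet the requirements of both high-precision floating-point computation and high-speed real-time decoding, offering a new approach for practical DVB-S2 applications. Using the programming environment of an NVIDIA GeForce 9600 graphics card in a PC, an information throughput sufficient for high-definition video was achieved.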

5.

Recent advances in general-purpose graphics processing units (GPGPUs) have resulted in massively parallel hardware that is widely available to achieve high performance in desktop, notebook, and even mobile computer systems. While multicore technology has become the norm in modern computers, programming such systems requires an understanding of the underlying hardware architecture and hence poses a great challenge for average programmers, who might be professionals in specific domains but not experts in parallel programming. This paper presents a GUI tool called GPUBlocks that can facilitate parallel programming on multicore computer systems. GPUBlocks is developed on the OpenBlocks framework, an extendable tool for graphical programming, to construct a GUI-based programming environment for the CUDA and OpenCL parallel computing platforms. Programmers simply need to drag and drop blocks, fill in the fields of the blocks, and connect them according to the array or matrix computations specified by their algorithms. GPUBlocks can then translate the block-based code to CUDA or OpenCL programs. Furthermore, a couple of optimization constructs are offered for rapid program optimization. Experimental results show that the generated CUDA and OpenCL programs can achieve reasonable speedups on GPUs. Consequently, GPUBlocks can be used as a tool for fast prototyping of GPU applications or as a platform for teaching parallel programming.


6.
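Many engineering problems can be modeled as structured nonlinear programs, whose coefficient matrices fall into two types: sparse and dense. Such problems can be solved efficiently by combining the primal-dual interior point method (PD-IPM) with distributed parallel techniques. The classic engineering problem of unit commitment (UC) is a structured nonlinear program with a sparse coefficient matrix. Following the PD-IPM principle, this paper preprocesses the UC model by continuous relaxation, decouples the Newton correction equations with a fast decoupling technique, and designs a CPU-GPU cooperative parallel algorithm to solve the subproblems. Finally, the results are compared and analyzed against those obtained for structured nonlinear programs with dense subproblems. Experimental results show that the proposed algorithm achieves good speedup on both types of structured nonlinear program.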

7.
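Design and implementation of a parallel-port JTAG emulator   Total citations: 1, self-citations: 0, citations by others: 1
许建荣, 姚国良, 胡晨. 《电子器件》, 2007, 30(1): 314-317
Based on a study of the JTAG protocol and the standard parallel port specification, an implementation scheme for a parallel-port JTAG emulator suited to embedded system debugging is proposed. The hardware circuit design is given, and low-level communication is implemented by simulating the parallel-port signal timing in software, which completes the JTAG protocol conversion. Finally, a driver software architecture with open interfaces is proposed; it supports online programming and debugging of the target machine, human-machine interaction, and interaction with third-party development tools. The basic functions are verified.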

8.
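Research on fine-grained parallel computing programming models   Total citations: 5, self-citations: 1, citations by others: 4
As the bridge between application software models and computer hardware, the programming model is of self-evident importance in computing. However, as multi-core microprocessors capable of fine-grained parallel computing enter the mainstream market, the development of matching programming models has lagged behind. This paper studies fine-grained parallel computing programming models. First, three typical multi-core microprocessor architectures are introduced; next, three existing fine-grained parallel computing programming models are introduced; finally, the necessary conditions for a parallel computing programming model are discussed.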

9.
The ParaScope parallel programming environment   Total citations: 1, self-citations: 0, citations by others: 1
The ParaScope parallel programming environment, developed to support scientific programming of shared-memory multiprocessors, is described. It includes a collection of tools that use global program analysis to help users develop and debug parallel programs. The focus is on ParaScope's compilation system, which extends the traditional single-procedure compiler by providing a mechanism for managing the compilation of complete programs. The ParaScope editor brings both compiler analysis and user expertise to bear on program parallelization. The debugging system detects and reports timing-dependent errors, called data races, in the execution of parallel programs. A project aimed at extending ParaScope to support programming in FORTRAN D, a machine-independent parallel programming language for use with both distributed-memory and shared-memory parallel computers, is also described.

10.
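A hybrid parallel programming model suited to multi-core clusters is constructed. The model combines two modes: task-oriented shared-memory programming with TBB and message-passing programming with MPI. Drawing on the strengths of both, it achieves two-level parallelism, mapping processes to processing nodes and the threads within a process to processor cores. Compared with program performance under a single programming approach, algorithms using this hybrid model not only reduce execution time and obtain better speedup and efficiency, but also noticeably improve cluster performance.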

11.
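刘民, 吴澄. 《电子学报》, 1999, 27(7): 132-134
With the development of CIMS technology, research on production-line scheduling has become increasingly important. The parallel machine scheduling problem of minimizing the number of tardy jobs is an important class of production-line scheduling problem, but to date many difficulties remain in solving large-scale parallel machine scheduling problems with many jobs and machines. Evolutionary programming, like the genetic algorithm, is an important evolutionary computation method. It is simple to describe, flexible to use, efficient, robust, and little constrained by initial conditions, which gives it high practical value. Compared with genetic algorithms, however, the application of evolutionary programming has only just begun, especially in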

12.
This paper presents a comparative design of comb decimators based on the non-recursive algorithm and the recursive algorithm. Compared with the recursive algorithm, the main advantage of the non-recursive algorithm is its ability to reduce power consumption and increase circuit speed, especially when the decimation ratio and filter order are high. Based on the non-recursive algorithm, a decimator with programmable filter orders (3rd, 4th and 5th), decimation ratios (8, 16, 32 and 64) and input bits (1 and 2 bits) has been implemented in a 0.6 μm 3.3 V CMOS process. Its measured core power consumption is 44 mW at the oversampling rate of 25 MHz and its highest input data rate is 110 MHz.
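
The two structures compared in this abstract can be sketched in software. The following is a minimal illustration, not the paper's circuit: it assumes the recursive form is the usual integrator/comb cascade and the non-recursive form is a cascade of (1 + z^-1)^N stages, each followed by decimation by 2 (so the ratio must be a power of two). All function names are hypothetical, and the sketch shows only that the two forms compute the same output; it says nothing about power or speed.

```python
from math import comb as binomial


def integrate(samples):
    """Running sum: y[n] = y[n-1] + x[n]."""
    out, acc = [], 0
    for value in samples:
        acc += value
        out.append(acc)
    return out


def difference(samples):
    """First difference: y[m] = x[m] - x[m-1], with x[-1] = 0."""
    out, previous = [], 0
    for value in samples:
        out.append(value - previous)
        previous = value
    return out


def recursive_comb_decimator(samples, ratio, order):
    """Recursive form: `order` integrators, downsample, `order` differences."""
    for _ in range(order):
        samples = integrate(samples)
    samples = samples[::ratio]
    for _ in range(order):
        samples = difference(samples)
    return samples


def nonrecursive_comb_decimator(samples, stages, order):
    """Non-recursive form: `stages` FIR sections (1 + z^-1)^order, each
    followed by decimation by 2, giving a total ratio of 2**stages."""
    taps = [binomial(order, k) for k in range(order + 1)]
    for _ in range(stages):
        filtered = [
            sum(tap * samples[n - k] for k, tap in enumerate(taps) if n - k >= 0)
            for n in range(len(samples))
        ]
        samples = filtered[::2]
    return samples
```

For a 1-bit style input such as `[1, -1, 1, 1, -1, 1, -1, -1] * 8`, calling `recursive_comb_decimator(x, 8, 3)` and `nonrecursive_comb_decimator(x, 3, 3)` should return the same list, since both realize the same transfer function for a ratio of 8 and a 3rd-order filter.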

13.
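A method for developing parallel (concurrent) programs based on the Apla-Java reusable component library is introduced, including the design strategy for Apla-Java reusable components and the implementation of the library's parallel and concurrent mechanisms. A parallel computing example verifies the correctness of applying the Apla-Java reusable component library to parallel (concurrent) program design.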

14.
A number of issues relating to the implementation of a parallel finite-element program are discussed, including the choice of programming language (principally FORTRAN90 versus C++), data-communication environment, matrix partitioner, and parallel preconditioner. Some results computed for an intricately shaped load in a microwave applicator are presented.

15.
Today, parallel programming is dominated by message passing libraries such as the message passing interface (MPI). This article aims to simplify parallel programming by generating parallel programs from parallelized algorithm design strategies. It uses skeletons to abstract parallelized algorithm design strategies as well as parallel architectures. Starting from the problem specification, an abstract parallel program in abstract programming language+ (Apla+) is generated from the parallelized algorithm design strategies and problem-specific function definitions. By combining it with parallel architectures, the implicit parallelism inside the parallelized algorithm design strategies is exploited. Through implementation and transformation, a C++ and parallel virtual machine (CPPVM) parallel program is finally generated. The parallelized branch and bound (B&B) and parallelized divide and conquer (D&C) algorithm design strategies are studied as examples, and the approach is illustrated with a case study.

16.
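The greatest challenge in getting multi-core processors to deliver their full parallel computing performance is the parallel programming model. At present, parallel threads use locks to ensure synchronization, but locks can lead to errors such as deadlock, and their performance is hard to optimize. The transactional memory model treats a series of shared-memory operations as a transaction and guarantees its atomicity, consistency, and isolation. It can replace lock constructs, simplify the programming model, and improve parallel computing performance. This paper introduces the structure of a software transactional memory model, Buffering Software Transactional Memory (BSTM), which mainly uses write buffering to simplify the design of the transaction model. Experimental results show that this model has certain advantages.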
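
The write-buffering idea mentioned in this abstract can be illustrated with a toy software transactional memory. This is a sketch under stated assumptions, not the BSTM design from the paper: writes are held in a per-transaction buffer, reads record the version they saw, and commit validates those versions before publishing the buffer. The names `TVar`, `Transaction`, and `atomically` are hypothetical, and a single global lock is used to keep the sketch short.

```python
import threading

# Assumption for this sketch: one global lock guards snapshots and commits.
_commit_lock = threading.Lock()


class TVar:
    """A shared variable with a version counter bumped on every commit."""

    def __init__(self, value):
        self.value = value
        self.version = 0


class Transaction:
    def __init__(self):
        self.read_versions = {}  # TVar -> version first observed
        self.write_buffer = {}   # TVar -> pending value, invisible to others

    def read(self, tvar):
        # A transaction sees its own buffered writes first.
        if tvar in self.write_buffer:
            return self.write_buffer[tvar]
        with _commit_lock:
            version, value = tvar.version, tvar.value
        self.read_versions.setdefault(tvar, version)
        return value

    def write(self, tvar, value):
        # Writes only touch the buffer until commit.
        self.write_buffer[tvar] = value

    def commit(self):
        with _commit_lock:
            # Validate: abort if anything we read has changed since.
            if any(t.version != v for t, v in self.read_versions.items()):
                return False
            for tvar, value in self.write_buffer.items():
                tvar.value = value
                tvar.version += 1
            return True


def atomically(body):
    """Run body(tx) repeatedly until its transaction commits."""
    while True:
        tx = Transaction()
        result = body(tx)
        if tx.commit():
            return result
```

A caller would wrap a read-modify-write such as `tx.write(counter, tx.read(counter) + 1)` in a function and pass it to `atomically`; a conflicting transaction is simply discarded and retried, because its buffered writes were never visible to other threads.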

17.
In order to realize high-capacity and low-cost flash memory, we have developed a 64-Mb flash memory with a multilevel cell operation scheme. The 64-Mb flash memory has been achieved in a 98 mm² die size by using a four-level-per-cell operation scheme, a NOR type cell array, and 0.4-μm CMOS technology. Using a Fowler-Nordheim (FN) type program/erase cell allows a single 3.3 V supply voltage. In order to establish fast programming operation using the FN-NOR type memory cell, we have developed a highly parallel multilevel programming technology. The drain voltage controlled multilevel programming (DCMP) scheme, the parallel multilevel verify (PMV) circuit, and the compact multilevel sense-amplifier (CMS) have been implemented to achieve 128 b parallel programming and 6.3 μs/Byte programming speed.

18.
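MapReduce is a data processing framework consisting of a parallel programming model and its supporting system. Through well-defined interfaces and a runtime support library, it can automatically execute large-scale computing tasks in parallel, and by hiding low-level implementation details it lowers the difficulty of parallel programming. Hadoop is currently the most popular open-source implementation of the MapReduce framework. This article first introduces the MapReduce parallel programming model and the operating principles and mechanisms of Hadoop, and then examines in depth how MapReduce computing tasks run in a Hadoop system.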
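
The programming model described in this abstract can be sketched as three phases: map, shuffle, and reduce. The following is a minimal single-process illustration using word counting; it does not use the Hadoop API, and the function names are hypothetical. In a real framework the map and reduce calls would be distributed across machines by the runtime.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def run_job(documents):
    """Run the three phases serially over a list of input records."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

The user supplies only `map_phase` and `reduce_phase`; grouping and scheduling belong to the framework, which is the sense in which the low-level details are hidden.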

19.
An efficient method for the simulation of EPROM programming, based on hydrodynamic calculations of electron energy within the device, is described. After the non-Maxwellian energy distribution is calculated, an expression for the injected gate current is integrated to find the total gate charge, and hence the threshold voltage shift, as a function of time. Comparison of theoretical and experimental results for actual EPROM programming validates this method.

20.
Parallel computing is rapidly entering mainstream computing, and multicore processors can now be found in the heart of supercomputers, desktop computers, and laptops. Consequently, applications will increasingly need to be parallelized to fully exploit the multicore processor throughput gains that are becoming available. Unfortunately, writing parallel code is more complex than writing serial code. An introductory parallel computing course aims to introduce students to this technology shift and to explain that parallelism calls for a different way of thinking and new programming skills. The course covers theoretical topics and offers practical experience in writing parallel algorithms on state-of-the-art parallel computers, parallel programming environments, and tools.
