1.
2.
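To meet the real-time processing demands of gigapixel-class frames, and to address the heavy computational load of the DCT and the insufficient parallelism of conventional processing, this paper proposes a data-parallel DCT implementation based on a SIMD PE array. Because the PE array itself can be scaled down, the method can be applied to embedded systems with different parallelism requirements. The paper proposes a data-parallel operating mode based on PE identifiers, which both solves the "PE autonomy" problem in local computation and eliminates the time overhead of data addressing. The operating mode is regular and concise, meets the strong regularity that SIMD operation requires, and is in line with the direction in which parallel processing technology is developing.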
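As a minimal sketch of why the DCT suits data-parallel hardware, the Python below computes an unnormalized 1-D DCT-II in which every output coefficient depends only on the shared input and on its own index `k`. The function names and the mapping of one coefficient to one hypothetical processing element are assumptions for illustration; this is not the PE-identifier scheme of the abstract above, whose details are not given.

```python
import math


def dct_coefficient(samples, k):
    """Unnormalized DCT-II coefficient k of a 1-D sequence.

    The index k plays the role a PE identifier might play: each
    coefficient can be computed independently of all the others.
    """
    n = len(samples)
    return sum(
        x * math.cos(math.pi / n * (i + 0.5) * k)
        for i, x in enumerate(samples)
    )


def dct_ii(samples):
    """Compute every coefficient; each call is independent, so the loop
    could in principle run in parallel across processing elements."""
    return [dct_coefficient(samples, k) for k in range(len(samples))]
```

For a constant input, only the first (DC) coefficient is nonzero, which gives a quick sanity check.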
3.
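A one-dimensional multimedia-processing SIMD array based on the PIM parallel computer architecture is proposed, and a controller for the array is implemented. The main components and instruction format of the controller are given, control of the PE array is described, and finally simulation results of instruction execution on this architecture are presented.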
4.
5.
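Most image-processing operations based on local linear filter functions can be expressed as the convolution of the image data with a weight template. For an N×N image and an M×M (M<N) template, the convolution algorithm implemented conventionally on a single processor requires O(N²M²) time, so it is a natural candidate for data-parallel processing. This paper discusses in some detail data-parallel implementations of the convolution algorithm on two-dimensional processing-element arrays, both with a limited and with an unlimited number of local registers, proposes a parallel convolution algorithm suited to one-dimensional processing-element arrays with a limited number of local registers, and analyzes the complexity of the algorithm.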
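The single-processor cost quoted above can be made concrete with a minimal sketch in Python. The function name `convolve2d_valid` and the choice to compute only the "valid" region (no border padding) are assumptions made for illustration; they are not taken from the paper.

```python
def convolve2d_valid(image, kernel):
    """Direct 2-D convolution of an N x N image with an M x M kernel.

    Only positions where the kernel fits entirely inside the image are
    computed. Returns (output, macs), where macs counts the
    multiply-accumulate operations performed.
    """
    n, m = len(image), len(kernel)
    out_size = n - m + 1
    # Flip the kernel along both axes; this is what distinguishes
    # convolution from correlation.
    flipped = [row[::-1] for row in kernel[::-1]]
    output = [[0] * out_size for _ in range(out_size)]
    macs = 0
    for r in range(out_size):
        for c in range(out_size):
            acc = 0
            for i in range(m):
                for j in range(m):
                    acc += image[r + i][c + j] * flipped[i][j]
                    macs += 1
            output[r][c] = acc
    return output, macs
```

The counter comes out as (N−M+1)²·M², which grows as N²M² when M is much smaller than N. Every output pixel is independent of the others, which is the property a processing-element array exploits.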
6.
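Exploiting the strengths of very long instruction word (VLIW) processors in handling single instruction, multiple data (SIMD) operations, the authors adopt a data-packing scheme that accelerates SIMD instruction computation, a multi-directional parallel search method, and an interpolated-image storage structure suited to image data reuse, and implement an optimized, efficient ME architecture that combines software and hardware. Search on H.264 QCIF image data and on H.263 CIF image data was implemented on the TMS320C64xx and on the self-designed LILY Processor, respectively. Tests show that ME search speed improved by a factor of 3 to 15.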
7.
8.
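Fast implementation of parallel processing for a real-time electronic digital image stabilization system (Total citations: 2; self-citations: 0; citations by others: 2)
An electronic digital image stabilization system must handle very large volumes of data and computation, and its real-time requirement demands a very high data-processing speed. Exploiting the highly parallel, repetitive nature of image and video data processing, the program uses parallel processing: single instruction, multiple data (SIMD) and pipelined SIMD techniques, together with multithreaded design. Motion is estimated by block matching, with the sum of absolute differences as the matching criterion and a combined search strategy that merges diamond search with three-step fast search, which reduces the amount of computation and further speeds up processing. In addition, Kalman low-pass filtering removes high-frequency image jitter while retaining the smooth global motion, ensuring the effectiveness and robustness of the system. With these measures, efficient real-time processing is achieved on an ordinary PC.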
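A minimal Python sketch of block matching with the sum of absolute differences (SAD) as the matching criterion follows. It uses an exhaustive search over a small window as a plainly named stand-in for the diamond and three-step strategies mentioned above, which are not reproduced here; the function names and parameters are hypothetical.

```python
def sad(frame_a, frame_b, ay, ax, by, bx, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for i in range(size):
        for j in range(size):
            total += abs(frame_a[ay + i][ax + j] - frame_b[by + i][bx + j])
    return total


def best_motion_vector(prev, curr, y, x, size, radius):
    """Exhaustively search prev for the block of curr anchored at (y, x).

    Returns ((dy, dx), cost) for the lowest-SAD candidate within
    +/- radius pixels; candidates falling outside the frame are skipped.
    """
    height, width = len(prev), len(prev[0])
    best_vec, best_cost = (0, 0), None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            py, px = y + dy, x + dx
            if py < 0 or px < 0 or py + size > height or px + size > width:
                continue
            cost = sad(curr, prev, y, x, py, px, size)
            if best_cost is None or cost < best_cost:
                best_vec, best_cost = (dy, dx), cost
    return best_vec, best_cost
```

Fast strategies such as three-step search evaluate far fewer candidates than this exhaustive loop, at the risk of settling on a local minimum. The inner SAD loop is the part that maps naturally onto SIMD instructions.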
9.
10.
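To meet the real-time requirements of infrared image processing in missile terminal guidance, parallel real-time infrared image processing algorithms and their implementation on a cluster computer are analyzed. An improved BSP model suited to real-time processing and a cluster parallel-processing software system are established, and an implementation method for parallel infrared image processing algorithms is given. A parallel image processing algorithm based on the lifting wavelet and its implementation are studied, the performance of the algorithm is analyzed, and a speedup model is derived. Experiments verify its effectiveness.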
11.
12.
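To make data acquisition and processing equipment smaller, more intelligent, and more integrated, to acquire and process large volumes of data in real time, and to carry out complex computations with special algorithms, this paper builds a signal-processing platform based on DSP+FPGA. The platform uses the FPGA to implement the FFT, the DSP to analyze and process frequency-domain signals and to communicate with the host computer, and a CPLD to control the timing of the whole system. Its main features are that the hardware circuits execute quickly in real time and that a low-power, low-cost DSP chip is used.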
13.
Gealow J.C., Herrmann F.P., Hsu L.T., Sodini C.G. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 1996, 4(1): 32-41
A system design for performing low-level image processing tasks in real time is presented. The design is based on large processor-per-pixel arrays implemented using integrated circuit technology. Two integrated circuit architectures are summarized: an associative parallel processor and a parallel processor employing DRAM cells. In both architectures, the layout pitch of one-bit-wide logic is matched to the pitch of memory cells to form high-density processing element arrays. The system design features an efficient control path implementation, providing high processing element array utilization without demanding complex controller hardware. Sequences of array instructions are generated by a host computer before processing begins, then stored in a simple controller. Once processing begins, the host computer initiates stored sequences to perform pixel-parallel operations. A programming framework implemented using the C++ programming language supports application development. A prototype system employs associative parallel processor devices, a controller, and the programming framework. Three sample applications, smoothing and segmentation, median filtering, and optical flow, establish the suitability of the system for real-time image processing.
14.
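Design of a high-speed data acquisition system based on dual-port RAM (Total citations: 8; self-citations: 0; citations by others: 8)
罗杰 (Luo Jie). 《微电子学与计算机》 (Microelectronics & Computer), 2001, 18(6): 52-54
This article presents the design of a high-speed data acquisition system based on dual-port RAM. By mapping the high-speed dual-port RAM into host memory and organizing it as a ring buffer, the system lets real-time acquisition of the high-speed DC data stream proceed in parallel with processing on the host.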
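The ring-buffer organization mentioned above can be sketched in a few lines of Python. This is a software illustration only, under the assumption of one producer and one consumer; the class name `RingBuffer` and its methods are hypothetical and do not model the dual-port RAM hardware or its memory mapping.

```python
class RingBuffer:
    """Fixed-capacity circular buffer with separate read and write indices."""

    def __init__(self, capacity):
        self._data = [None] * capacity
        self._capacity = capacity
        self._head = 0   # index of the next slot to write
        self._tail = 0   # index of the next slot to read
        self._count = 0  # number of unread samples

    def write(self, sample):
        """Store one sample; return False (dropping it) if the buffer is full."""
        if self._count == self._capacity:
            return False
        self._data[self._head] = sample
        self._head = (self._head + 1) % self._capacity
        self._count += 1
        return True

    def read(self):
        """Return the oldest unread sample, or None if the buffer is empty."""
        if self._count == 0:
            return None
        sample = self._data[self._tail]
        self._tail = (self._tail + 1) % self._capacity
        self._count -= 1
        return sample
```

Because the writer touches only the head index and the reader only the tail index, acquisition and processing can overlap as long as the reader keeps up on average.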
15.
16.
Video object segmentation is an important pre-processing task for many video analysis systems. To achieve the requirement of real-time video analysis, hardware acceleration is required. In this paper, after analyzing existing video object segmentation algorithms, it is found that most of the core operations can be implemented with simple morphology operations. Therefore, with the concepts of morphological image processing element array and stream processing, a reconfigurable morphological image processing accelerator is proposed, where by the proposed instruction set, the operation of each processing element can be controlled, and the interconnection between processing elements can also be reconfigured. Simulation results show that most of the core operations of video object segmentation can be supported by the accelerator by only changing the instructions. A prototype chip is designed to support real-time change-detection-and-background-registration based video object segmentation algorithm. This chip incorporates eight macro processing elements and can support a processing capacity of 6,200 9-bit morphological operations per second on a SIF image. Furthermore, with the proposed tiling and pipelined-parallel techniques, a real-time watershed transform can be achieved using 32 macro processing elements.
17.
18.
19.
This paper presents an improved fuzzy logic controller (FLC) for an interior permanent magnet synchronous motor (IPMSM) for high-performance industrial drive applications. In the proposed control scheme for high-speed operations above the rated speed, the operating limits of IPMSM are expanded by incorporating the maximum torque per ampere operation in constant torque region and the flux-weakening operation in constant power region. The power ratings of the motor and the inverter are considered in developing the control algorithm. A new and simple FLC is utilized as a speed controller. The FLC is developed to have less computational burden, which makes it suitable for real-time implementation, particularly at high-speed operating conditions. The complete drive is implemented in real-time using digital signal processor (DSP) controller board DS 1102 on a laboratory 1-hp IPM motor. The efficiency of the proposed control scheme is evaluated through both experimental and computer simulation results. The proposed controller is found to be robust for high-speed applications.
20.
Real-time image processing usually requires an enormous throughput rate and a huge number of operations. Parallel processing, in the form of specialized hardware or multiprocessing, is therefore indispensable. This paper describes a flexible programmable image processing system using the field programmable gate array (FPGA). The logic cell nature of currently available FPGAs is most suitable for performing real-time bit-level image processing operations using the bit-level systolic concept. Here, we propose a novel architecture, the programmable image processing system (PIPS), for the integration of this programmable hardware and digital signal processors (DSPs) to handle the bit-level as well as the arithmetic operations found in many image processing applications. The versatility of the system is demonstrated by the implementation of a 1-D median filter.
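For reference, the operation named in the last sentence above can be sketched in software. The Python below is a plain sort-based 1-D median filter, not the bit-level systolic design of the abstract; the function name, the default window of 3, and the edge-replication policy are assumptions made for illustration.

```python
def median_filter_1d(signal, window=3):
    """Sliding-window median of a 1-D sequence.

    The window must be a positive odd number. Borders are handled by
    repeating the first and last samples so the output has the same
    length as the input.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    samples = list(signal)
    if not samples:
        return []
    half = window // 2
    padded = [samples[0]] * half + samples + [samples[-1]] * half
    # The median of an odd-length window is its middle element once sorted.
    return [sorted(padded[i:i + window])[half] for i in range(len(samples))]
```

With a window of 3, an isolated spike such as the 100 in `[1, 9, 2, 3, 100, 4, 5]` is removed while the surrounding samples keep their ordering.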