Similar literature
1.
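Introduces the operating principle and performance advantages of interleaved-parallel (interleaving) technology, gives its design requirements, and summarizes the interleaving control methods in common use today together with their respective strengths and weaknesses.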

2.
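Switching power supplies built on a conventional Boost power factor correction (PFC) circuit suffer from significant switching losses, which raise cost and lower efficiency. Building on the conventional single-phase Boost converter, this paper adopts multi-channel interleaving as the main circuit topology for active power factor correction. Taking a three-phase interleaved Boost converter as an example, it analyzes the operating process and shows by simulation that a multiphase interleaved Boost PFC converter reduces the input current ripple and the required input inductance and improves converter efficiency.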
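The ripple-cancellation effect of interleaving can be illustrated with a small numerical sketch. It is not the paper's simulation: it assumes idealized triangular inductor currents in continuous conduction, normalizes each phase's ripple to a peak-to-peak value of 1, and uses hypothetical function names.

```python
def phase_ripple(t, duty):
    """Idealized triangular ripple of one phase (peak-to-peak 1, period 1)."""
    t %= 1.0
    if t < duty:
        return -0.5 + t / duty                    # switch on: current rises
    return 0.5 - (t - duty) / (1.0 - duty)        # switch off: current falls


def input_ripple(n_phases, duty, samples=6000):
    """Peak-to-peak ripple of the summed input current of n interleaved phases.

    Phase k is shifted by k / n_phases of a switching period (assumption).
    """
    total = [
        sum(phase_ripple(i / samples - k / n_phases, duty) for k in range(n_phases))
        for i in range(samples)
    ]
    return max(total) - min(total)


if __name__ == "__main__":
    for n, d in [(1, 0.4), (2, 0.25), (3, 1 / 3)]:
        print(n, round(d, 3), round(input_ripple(n, d), 4))
```

Under these assumptions the summed ripple is always smaller than that of a single phase and vanishes when the duty cycle is a multiple of 1/N, which is one way to see why the input inductance can be reduced.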

3.
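DSP architectures can be divided into fixed-point (FXP) and floating-point (FLP) types. Although an FXP DSP can only perform integer arithmetic, it is fast, uses few resources, and costs less than an FLP DSP. Running FLP algorithms on an FXP DSP can achieve higher precision and a wider dynamic range. Demand for FLP support on FXP DSP architectures keeps growing, mainly for the following reasons. First, algorithm code is usually written in C/C++ (using floating-point numbers), and converting an FLP algorithm to FXP format is tedious, whereas porting the floating-point algorithm to the DSP platform takes less time, so FLP reduces development cost. Second, commonly used algorithms benefit from the larger numeric range that floating-point arithmetic provides. Finally, in some cases an FXP algorithm cannot reach the desired precision and dynamic range.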
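The trade-off between fixed-point and floating-point arithmetic can be made concrete with a short sketch. It assumes the common 16-bit Q15 format, which the abstract does not name, and the helper names are hypothetical.

```python
Q = 15                 # number of fractional bits (assumed Q15 format)
SCALE = 1 << Q         # 32768


def to_q15(x):
    """Quantize a float in [-1, 1) to a saturated signed 16-bit Q15 integer."""
    v = int(round(x * SCALE))
    return max(-SCALE, min(SCALE - 1, v))


def q15_mul(a, b):
    """Multiply two Q15 integers, rounding and saturating the result to Q15."""
    p = (a * b + (1 << (Q - 1))) >> Q
    return max(-SCALE, min(SCALE - 1, p))


def from_q15(v):
    """Convert a Q15 integer back to a float."""
    return v / SCALE


if __name__ == "__main__":
    half = to_q15(0.5)
    print(from_q15(q15_mul(half, half)))   # product of 0.5 and 0.5 in Q15
    print(to_q15(1e-6))                    # small values underflow to zero
    print(to_q15(1.0))                     # +1.0 is not representable and saturates
```

The manual scaling, rounding, and saturation visible here is the conversion effort the abstract refers to; floating-point code avoids it at the cost of emulation on an FXP device.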

4.
This paper presents a technique for simulating processors based on the principle of compiled simulation. Unlike existing, commercially available instruction set simulators for DSPs, which are interpretive, the proposed technique performs instruction decoding and simulation scheduling at compile time. The technique offers up to three orders of magnitude faster simulation. The high speed allows the user to explore algorithms and hardware/software trade-offs before any hardware implementation. Moreover, the user can tailor the compiled simulation to trade speed for more accuracy. In this paper, the sources of the speedup and the limitations of the technique are analyzed and the realization of the simulation compiler is presented.

5.
The Joint Collaborative Team on Video Coding is developing a new standard named High Efficiency Video Coding (HEVC) that aims at reducing the bitrate of H.264/AVC by another 50%. In order to fulfill the computational demands of the new standard, in particular for high resolutions and at low power budgets, exploiting parallelism is no longer an option but a requirement. Therefore, HEVC includes several coding tools that allow each picture to be divided into several partitions that can be processed in parallel without degrading either the quality or the bitrate. In this paper we adapt one of these approaches, the Wavefront Parallel Processing (WPP) coding, and show how it can be implemented on multi- and many-core processors. Our approach, named Overlapped Wavefront (OWF), processes several partitions as well as several pictures in parallel. This has the advantage that the amount of (thread-level) parallelism stays constant during execution. In addition, performance and power results are provided for three platforms: a server Intel CPU with 8 cores, a laptop Intel CPU with 4 cores, and a TILE-Gx36 with 36 cores from Tilera. The results show that our parallel HEVC decoder is capable of achieving an average frame rate of 116 fps for 4k resolution on a standard multicore CPU. The results also demonstrate that exploiting more parallelism by increasing the number of cores can substantially improve the energy efficiency, measured in joules per frame.
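The wavefront dependency pattern can be sketched in a few lines. The sketch assumes that each coding tree unit (CTU) waits for its left neighbour and for the top-right neighbour in the row above, that every CTU takes one time step, and that threads are unlimited; it is not the authors' decoder, and the function names are hypothetical.

```python
from collections import Counter
from typing import List


def wpp_schedule(rows: int, cols: int) -> List[List[int]]:
    """Earliest time step at which each CTU can be decoded under WPP."""
    t = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            deps = []
            if c > 0:
                deps.append(t[r][c - 1])                      # left neighbour
            if r > 0:
                deps.append(t[r - 1][min(c + 1, cols - 1)])   # top-right neighbour
            t[r][c] = 1 + max(deps) if deps else 0
    return t


def parallelism_profile(t: List[List[int]]) -> List[int]:
    """Number of CTUs that can be decoded concurrently at each time step."""
    counts = Counter(step for row in t for step in row)
    return [counts[s] for s in range(max(counts) + 1)]


if __name__ == "__main__":
    schedule = wpp_schedule(3, 4)
    for row in schedule:
        print(row)
    print(parallelism_profile(schedule))
```

The profile ramps up at the start of a picture and down at its end. Starting the next picture while the current one drains, as the overlapped approach in the abstract does, is one way to keep that profile flat.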

6.
This paper introduces an efficient technique for parallel data addressing in FFT architectures performing in-place computations. The novel addressing organization provides parallel load and store of the data involved in radix-r butterfly computations and leads to an efficient architecture when r is a power of 2. The addressing scheme is based on a permutation of the FFT data, which leads to the improvement of the address generating circuit and the butterfly processor control. Moreover, the proposed technique is suitable for mixed-radix applications, especially for radixes that are powers of 2, and for straightforward continuous-flow implementation. The paper presents the technique and the resulting FFT architecture and shows the advantages of the architecture compared to previously published results. Implementations of the in-place radix-8 FFT architectures on a Xilinx Virtex-7 VC707 FPGA, with input sizes of 64 and 512 complex points, validate the results.
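What "parallel load and store" requires can be sketched with a classic digit-sum bank assignment. This is a well-known conflict-free scheme used here for illustration, not the permutation proposed in the paper, and all names are hypothetical. Each data index is written in base r, and its memory bank is the sum of its digits modulo r.

```python
def digits(x, r, n):
    """Base-r digits of x, least significant first, padded to n digits."""
    return [(x // r ** i) % r for i in range(n)]


def bank(x, r, n):
    """Memory bank of data index x: digit sum modulo r (assumed scheme)."""
    return sum(digits(x, r, n)) % r


def butterfly_groups(r, n, stage):
    """Yield the r indices combined by each radix-r butterfly at a stage."""
    stride = r ** stage
    for base in range(r ** n):
        if (base // stride) % r == 0:
            yield [base + k * stride for k in range(r)]


def conflict_free(r, n):
    """True if every butterfly at every stage touches r distinct banks."""
    return all(
        len({bank(i, r, n) for i in group}) == r
        for stage in range(n)
        for group in butterfly_groups(r, n, stage)
    )


if __name__ == "__main__":
    print(conflict_free(8, 2))   # 64-point radix-8 FFT
    print(conflict_free(8, 3))   # 512-point radix-8 FFT
```

The operands of one butterfly differ in exactly one base-r digit, so their digit sums are distinct modulo r and all r values can be read or written in the same cycle.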

7.
Scalable Parallel Memory Architectures for Video Coding   Cited by: 1 (self-citations: 0, citations by others: 1)
Current video compression standards, which process frames macroblock by macroblock, employ several processing functions to achieve the compression. These functions access the data memory address space in different ways. For example, motion estimation and motion compensation often require data accesses that are not aligned to word boundaries. On the other hand, the Discrete Cosine Transform (DCT) and its inverse (IDCT) for an 8 × 8 block can be performed first for rows and then for columns, so a transposition is needed between these two stages. Among other things, a parallel memory architecture can provide a solution for these tasks. In our other paper, we briefly surveyed parallel memory architectures and proposed parallel memory architecture designs for different data path widths for video coding applications. In this paper, we construct video coding function examples that use the proposed parallel data memory efficiently. Furthermore, the performance and implementation cost of the parallel memory architecture are estimated and compared to more conventional memory architectures. The examples are given for different data bus widths (16, 32, 64, and 128 bits). We show that the parallel memory can keep the data path fully utilized in many video coding function implementations. This ensures high-speed operation and full utilization of the processing resources.

8.
Long Bose–Chaudhuri–Hocquenghem (BCH) codes are used as the outer error-correcting codes in the second-generation Digital Video Broadcasting standard from the European Telecommunications Standards Institute. These codes can achieve around 0.6 dB additional coding gain over Reed–Solomon codes with similar code rate and codeword length in long-haul optical communication systems. BCH encoders are conventionally implemented by a linear feedback shift register architecture. High-speed applications of BCH codes require parallel implementation of the encoders. In addition, long BCH encoders suffer from the effect of large fanout. In this paper, three novel architectures are proposed to reduce the achievable minimum clock period for long BCH encoders after the fanout bottleneck has been eliminated. For an (8191, 7684) BCH code, compared to the original 32-parallel BCH encoder architecture without fanout bottleneck, the proposed architectures can achieve a speedup of over 100%.
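The conventional serial encoder that the parallel architectures start from can be sketched as a linear feedback shift register (LFSR). For brevity the sketch uses the small (15, 7) BCH code with generator polynomial g(x) = x^8 + x^7 + x^6 + x^4 + 1 instead of the long code in the abstract. It models one message bit per clock and is not one of the proposed architectures; the function names are hypothetical.

```python
G = 0b111010001      # g(x) = x^8 + x^7 + x^6 + x^4 + 1 for the (15, 7) BCH code
N, K = 15, 7
R = N - K            # 8 parity bits
MASK = (1 << R) - 1


def lfsr_parity(msg_bits):
    """Serial LFSR: parity of m(x) * x^R mod g(x), highest-degree bit first."""
    reg = 0
    for b in msg_bits:
        feedback = b ^ ((reg >> (R - 1)) & 1)   # input bit XOR register output
        reg = (reg << 1) & MASK                 # shift by one position
        if feedback:
            reg ^= G & MASK                     # add the low coefficients of g(x)
    return reg


def poly_mod(a, g):
    """Reference remainder of binary polynomial a modulo g (long division)."""
    dg = g.bit_length() - 1
    while a.bit_length() - 1 >= dg:
        a ^= g << (a.bit_length() - 1 - dg)
    return a


def encode(m):
    """Systematic codeword: message bits followed by the LFSR parity bits."""
    bits = [(m >> i) & 1 for i in range(K - 1, -1, -1)]
    return (m << R) | lfsr_parity(bits)


if __name__ == "__main__":
    print(format(encode(0b0000001), "015b"))
```

The feedback bit drives every tap of g(x) at once, which is the source of the fanout problem mentioned in the abstract once the register is hundreds of bits long.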

9.
10.
The stickers model is a model of DNA computation that is computationally complete and universal. Many NP-complete problems can be described by stickers programs that have polynomial runtime and are exponential in space. The stickers model can be viewed as a bit-vertically operating register machine. This makes it attractive for in silico implementation. This paper describes a stickers model for the maximum clique problem and its implementation by an FPGA architecture. The results show that the FPGA-based algorithm is comparable with existing software algorithms for moderate problem sizes. More generally, the stickers model seems to be a well-suited programming model for dedicated hardware.

11.
Speeding up fast Fourier transform (FFT) computations is critical for today's real-time systems targeting signal processing and telecommunication applications. Aiming at the performance improvement and the efficiency of FFT architectures, this paper presents an address generation technique which enables a radix-$b$ processor to access $b$ memory banks in parallel without conflicts during each stage's computations. Using $kb$ memory banks at each stage increases the speedup of the algorithm by a factor of $kb$. The address generation can be realized in each radix-$b$ stage by the use of lookup tables of size $O(kb^{2})$ bits. The proposed technique is cost efficient and leads to the design of FFT architectures of high speedup and high sustained throughput.

12.
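Zhang Sidong, Huang Lu, Lin Beiyuan. 《微电子学》 (Microelectronics), 2007, 37(5): 712-716
Proposes a design method for a 10-bit, 300 MHz sampling-rate, 4-channel parallel pipelined A/D converter based on an optimized time-overlapping technique; the method relaxes the requirements on the operational amplifier. Theoretical calculation and a design example demonstrate the marked effect of this low-power design method. An operational amplifier for the front end was designed. In a CSM 0.35 μm CMOS process with a 3.3 V supply, the amplifier has a gain of 106 dB, a unity-gain bandwidth of 402 MHz, and a settling time of 8.8 ns. With the optimized time-overlapping technique it meets the requirement of 4-channel parallel sampling at 300 MHz while consuming only 8.57 mW, which can greatly reduce the power consumption of the whole parallel pipelined A/D converter.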

13.
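Parallel Processing Methods for DSPs   Cited by: 3 (self-citations: 0, citations by others: 3)
TI's TMS320C6x and Analog Devices' ADSP2106x are digital signal processors (DSPs) widely used in industry today. This paper describes in detail how to build parallel DSP processing systems using the HPI and McBSP interfaces of the TMS320C6x and the Link ports of the ADSP2106x, and discusses the advantages and disadvantages of each method.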

14.
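As the performance demands of signal processing keep rising, multicore DSP software development is an important trend. Instruction-level parallelism, multicore parallel processing, and overlapping computation with data transfer are all ways to raise processing performance. In the multi-level memory hierarchy of a multicore DSP, the closer a memory is to the core, the smaller its capacity. Processing large data volumes needs correspondingly large memory, so tasks cannot be assigned directly to the individual cores. To address this problem, the paper discusses parallel task allocation on an 8-core processor and, based on the multicore DSP architecture, uses a two-level ping-pong scheme to implement a large-point FFT. The design uses DMA to overlap processing with data transfer, which improves processing performance.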
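The ping-pong idea can be sketched as a schedule over two buffers. The sketch shows a single level only, whereas the paper uses two levels. It assumes that a DMA transfer and a compute step each take one time slot, and the function name is hypothetical.

```python
def ping_pong_schedule(num_blocks):
    """Return (step, dma_block, dma_buffer, compute_block, compute_buffer) tuples.

    While the core processes block i out of one buffer, the DMA engine fills
    the other buffer with block i + 1. None marks an idle unit.
    """
    steps = []
    for step in range(num_blocks + 1):
        load = step if step < num_blocks else None    # block being transferred
        work = step - 1 if step >= 1 else None        # block being processed
        steps.append((
            step,
            load, load % 2 if load is not None else None,
            work, work % 2 if work is not None else None,
        ))
    return steps


if __name__ == "__main__":
    for entry in ping_pong_schedule(4):
        print(entry)
```

With transfers hidden behind computation, four blocks finish in five slots instead of the eight needed when each block is loaded and then processed in sequence.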

15.
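Parallel computing is an important direction for achieving high-performance computing. As demand for processing power in signal processing, communications, and other fields continues to grow, parallel development techniques for DSPs have also advanced quickly. Multi-device parallelism and on-chip multicore designs both effectively raise processing performance. Compared with a traditional single-core DSP, multicore parallel processing requires multi-task parallel design, which makes system design more complex. After examining the key techniques for signal processing development on an 8-core processor, the paper designs a multicore parallel signal processing mode based on Round-Robin scheduling and discusses multicore synchronization, cache coherence, and parallel task allocation.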
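Round-Robin allocation itself is simple to sketch: task i goes to core i mod 8. The abstract gives no implementation details, so the function name and the representation of tasks as integers are assumptions.

```python
def round_robin_assign(num_tasks, num_cores=8):
    """Map each core index to the list of task indices it receives in turn."""
    return {core: list(range(core, num_tasks, num_cores)) for core in range(num_cores)}


if __name__ == "__main__":
    for core, tasks in round_robin_assign(20).items():
        print(core, tasks)
```

The allocation is static and balances the load only when tasks have similar cost, which is typically the case for fixed-size signal processing frames.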

16.
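Introduces the principle, technical requirements, and characteristics of paralleling inverter power supplies, outlines the state of the art and significance of inverter paralleling technology, analyzes how circulating current arises, summarizes and classifies the inverter paralleling schemes currently in use, and points out the development trends of the technology. It briefly describes the features of the TMS320LF_2407A digital signal processor chip and explains the system's software and hardware structure and operating principle. Simulation results show that the system achieves fairly good control performance.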

17.
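Targeting the EMI source, a multi-output BUCK converter with interleaved control is designed. The paper describes the operation of interleaved control in detail, analyzes it in the time and frequency domains, compares it with in-phase control, and verifies it by simulation in Saber. Finally, an experimental prototype with a 28 V input and four 18 V/1.5 A outputs was built. Test results show that interleaved control effectively reduces the amplitude of the first harmonic of the interference source and raises the fundamental frequency, offering an effective route to LOW-EMI converters.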

18.
In this paper we propose a technique to implement, in a parallel fashion, a turbo decoder based on an arbitrary permutation, and to expand its interleaver in order to produce a family of prunable S-random interleavers suitable for parallel implementations. We show that the spread properties of the obtained interleavers are almost optimal, and we show by simulation that they are very competitive in terms of error floor performance. A few details on the decoder architecture are also provided.

19.
In this paper we consider the $\ell_1$-compressive sensing problem. We propose an algorithm specifically designed to take advantage of shared-memory, vectorized, parallel, and many-core microprocessors such as the Cell processor, new-generation Graphics Processing Units (GPUs), and standard vectorized multi-core processors (e.g. quad-core CPUs). Moreover, its implementation is easy. We also give evidence of the efficiency of our approach and compare the algorithm on the three platforms, thus exhibiting pros and cons for each of them.

20.
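This paper applies a coupled inductor to a Buck converter operating in discontinuous mode, analyzes the effect of the coupling coefficient on current ripple and dynamic performance, gives a parameter design method, and reports simulation and experimental studies. The results show that, with proper design, the coupled inductor improves the steady-state and dynamic performance of the circuit while lowering losses and raising efficiency.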
