Similar articles
1.
A 32-b RISC/DSP microprocessor with reduced complexity   Cited by: 2 (self-citations: 0, citations by others: 2)
This paper presents a new 32-b reduced instruction set computer/digital signal processor (RISC/DSP) architecture which can be used as a general purpose microprocessor and in parallel as a 16-/32-b fixed-point DSP. This has been achieved by using RISC design principles for the implementation of DSP functionality. A DSP unit operates in parallel to an arithmetic logic unit (ALU)/barrel shifter on the same register set. This architecture provides the fast loop processing, high data throughput, and deterministic program flow absolutely necessary in DSP applications. Besides offering a basis for general purpose and DSP processing, the RISC philosophy offers a higher degree of flexibility for the implementation of DSP algorithms and achieves higher clock frequencies compared to conventional DSP architectures. The integrated DSP unit provides instruction set support for highly specialized DSP algorithms. Subword processing optimized for DSP algorithms has been implemented to provide maximum performance for 16-b data types. While creating a unified base for both application areas, we also minimized transistor count and reduced complexity by using a short instruction pipeline. A parallelism concept based on a varying number of instruction latency cycles made superscalar instruction execution superfluous.

2.
Distributed arithmetic techniques are the key to efficient implementation of DSP algorithms in FPGAs. The distributed arithmetic process is briefly described. A representative DSP design application in the form of an 8-tap FIR filter is offered for the Xilinx XC3042 field-programmable gate array (FPGA). The design is presented in sufficient detail, from filter specifications via filter design software through detailed logic of salient data and control functions, to obtain a realistic placing and routing of configurable logic block (CLB) and input/output block (IOB) components for simulation verification and performance evaluation vis-a-vis commercially available dedicated 8-tap FIR filter chips.
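
The distributed arithmetic idea mentioned in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the XC3042 design from the paper: the function names `build_lut` and `da_dot`, the 8-bit two's-complement sample width, and the example coefficients are all hypothetical. The multiplications of an inner product are replaced by a lookup table of precomputed coefficient sums that is addressed with one bit from every sample at a time.

```python
def build_lut(coeffs):
    """Entry `addr` holds the sum of coeffs[k] for every bit k that is set in addr."""
    n = len(coeffs)
    return [
        sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
        for addr in range(1 << n)
    ]


def da_dot(coeffs, samples, bits=8):
    """Inner product of coeffs and signed `bits`-wide samples, computed bit-serially."""
    lut = build_lut(coeffs)
    acc = 0
    for b in range(bits):
        # Gather bit b of every sample into one table address.
        addr = 0
        for k, x in enumerate(samples):
            addr |= ((x >> b) & 1) << k
        term = lut[addr] << b
        # In two's complement the most significant bit carries negative weight.
        acc += -term if b == bits - 1 else term
    return acc


# Hypothetical 8-tap example with samples in the range -128..127.
coeffs = [1, -2, 3, 4, 4, 3, -2, 1]
samples = [5, -3, 127, -128, 0, 1, -1, 64]
result = da_dot(coeffs, samples)
```

In hardware the table would typically live in the logic blocks' lookup memory and the loop over `b` would be a shift-accumulate over clock cycles; the model above only checks the arithmetic.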

3.
A family of computational organizations for the solution of the Toeplitz systems appearing in the digital signal processing (DSP) techniques of linear prediction and optimal FIR filtering is presented. All these organizations are based on a structure called the superlattice, which governs the Toeplitz solving procedure and provides many possible implementations. Algorithmic schemes for the implementation of these organizations, suitable for single-processor and multiprocessor environments, are developed. Among them are order-recursive algorithms, parallel algorithms of O(p) complexity which use O(p) processing elements, and partitioned-parallel algorithms. The last can make full use of any number of available, parallel-working processors, independently of the system order. Superlattice-type algorithms are described for many Toeplitz-based problems.

4.
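QC-LDPC (quasi-cyclic low-density parity-check) codes are a class of semi-structured low-density parity-check codes whose block-structured matrices are convenient for very-large-scale integration (VLSI) implementation while retaining excellent error-correction performance. For the base matrix of a QC-LDPC code, this paper proposes a shift-factor search method and an improved version of it. Cycles are checked by expanding the Tanner graph of the expanded base matrix into a tree, which avoids the complex arithmetic operations of conventional algorithms and lowers complexity. Using the same base matrix as the rate-0.5 LDPC code scheme in IEEE 802.16e, the QC-LDPC codes constructed by the proposed algorithm have a better cycle-length distribution and also improved error-correction performance.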

5.
This paper describes a low-power programmable DSP architecture that targets audio signal processing. The architecture can be characterized as a heterogeneous multiprocessor consisting of small instruction set processors called mini-cores as well as standard DSP and CPU cores that communicate using message passing. The mini-cores are tailored for different classes of filtering algorithms (FIR, IIR, N-LMS etc.), and in a typical system the communication among processors occurs at the sampling rate only. The mini-cores are intended as soft-macros to be used in the implementation of system-on-chip solutions using a synthesis-based design flow targeting a standard-cell implementation. They are parameterized in word-size, memory-size, etc. and can be instantiated according to the needs of the application. To give an impression of the size of a mini-core, we mention that one of the FIR mini-cores in a prototype design has 16 instructions, a 32-word × 16-bit program memory, a 64-word × 16-bit data memory and a 25-word × 16-bit coefficient memory. Results obtained from the design of a prototype chip containing mini-cores for a hearing aid application demonstrate a power consumption that is only 1.5–1.6 times larger than a hardwired ASIC and more than 6–21 times lower than current state-of-the-art low-power DSP processors. This is due to: (1) the small size of the processors and (2) a smaller instruction count for a given task.

6.
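The instrument landing system (ILS) is the aircraft landing aid currently in general use internationally. In conventional ILS airborne receivers, the baseband signal processing section is implemented with analog circuits, which gives low measurement accuracy and complex circuitry. Here, based on a DSP device, the baseband signal processing is carried out entirely in the digital domain using fixed-length FIR filters and multirate signal processing algorithms, and the processing speed and memory footprint of the software are optimized for the hardware constraints. The software was simulated on a DSP development board; the computed results are stable and accurate, and overall performance is better than that of the baseband signal processing module of a conventional ILS airborne receiver.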

7.
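Yue Mengyun (岳梦云), Bai Bing (白冰). Acta Electronica Sinica (《电子学报》), 2000, 48(10): 2041-2046
This paper presents the microarchitecture definition of a digital signal processing system suited to motor vector-control algorithms, including its instruction set definition, memory model, and mode of interaction with the host CPU. The design has two notable advantages: fixing some of the operands of multi-operand instructions effectively shortens the instruction encoding and raises code density, and executing multi-cycle instructions in the background improves the parallel efficiency of the ALU. The paper gives the instruction cycle counts of a typical FOC control algorithm implemented on the DSP (digital signal processor) instruction set, as well as the circuit implementation of the corresponding architecture. Finally, taking the ARM Cortex-M0 and several mainstream DSPs as comparison baselines, measured experimental data demonstrate the high energy efficiency of the architecture, which greatly improves the operating efficiency of embedded systems with an integrated DSP at a fairly limited cost in circuit area.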

8.
Synthesis of Embedded Software from Synchronous Dataflow Specifications   Cited by: 3 (self-citations: 0, citations by others: 3)
The implementation of software for embedded digital signal processing (DSP) applications is an extremely complex process. The complexity arises from escalating functionality in the applications; intense time-to-market pressures; and stringent cost, power and speed constraints. To help cope with such complexity, DSP system designers have increasingly been employing high-level, graphical design environments in which system specification is based on hierarchical dataflow graphs. Consequently, a significant industry has emerged for the development of dataflow-based DSP design environments. Leading products in this industry include SPW from Cadence, COSSAP from Synopsys, ADS from Hewlett Packard, and DSP Station from Mentor Graphics. This paper reviews a set of algorithms for compiling dataflow programs for embedded DSP applications into efficient implementations on programmable digital signal processors. The algorithms focus primarily on the minimization of code size, and the minimization of the memory required for the buffers that implement the communication channels in the input dataflow graph. These are critical problems because programmable digital signal processors have very limited amounts of on-chip memory, and the speed, power, and cost penalties for using off-chip memory are often prohibitively high for embedded applications. Furthermore, memory demands of applications are increasing at a significantly higher rate than the rate of increase in on-chip memory capacity offered by improved integrated circuit technology.

9.
A method is presented to synthesize wideband linear-phase finite-impulse-response (FIR) filters with a piecewise-polynomial-sinusoidal impulse response. The method is based on merging the earlier synthesis scheme proposed by the authors to design piecewise-polynomial filters with the method proposed by Chu and Burrus. The method uses an arbitrary number of separately generated center coefficients instead of only one or none as used in the method by Chu–Burrus. The desired impulse response is created by using a parallel connection of several filter branches and by adding an arbitrary number of center coefficients to form it. This method is especially effective for designing Hilbert transformers by using Type 4 linear-phase FIR filters, where only real-valued multipliers are needed in the implementation. The arithmetic complexity is proportional to the number of branches, the common polynomial order for each branch, and the number of separate center coefficients. For other linear-phase FIR filter types the arithmetic complexity depends additionally on the number of complex multipliers. Examples are given to illustrate the benefits of this method compared to the frequency-response masking (FRM) technique with regard to reducing the number of coefficients as well as arithmetic complexity.

10.
Stochastic computing utilizes compact arithmetic circuits that can potentially lower the implementation cost in silicon area. In addition, stochastic computing provides inherent fault tolerance at the cost of a less efficient signal encoding. Finite impulse response (FIR) filters are key elements in digital signal processing (DSP) due to their linear phase-frequency response. In this article, we consider the problem of implementing FIR filters using the stochastic approach. Novel stochastic FIR filter designs based on multiplexers are proposed and compared to conventional binary designs implemented using Synopsys tools with a 28-nm cell library. Silicon area, power and maximum clock frequency are obtained to evaluate the throughput per area (TPA) and the energy per operation (EPO). For equivalent filtering performance, the stochastic FIR filters underperform in terms of TPA and EPO compared to the conventional binary design, although the stochastic design shows more graceful degradation in performance with a significant reduction in energy consumption. A detailed analysis is performed to evaluate the accuracy of stochastic FIR filters and to determine the required stochastic sequence length. The fault-tolerance of the stochastic design is compared with that of the binary circuit enhanced with triple modular redundancy (TMR). The stochastic designs are more reliable than the conventional binary design and its TMR implementation with unreliable voters, but they are less reliable than the binary TMR implementation when the voters are fault-free.
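
The multiplexer-based arithmetic mentioned in this abstract rests on a standard property of stochastic computing: a two-input multiplexer with a fair random select line outputs a bit stream whose mean is the average of the two input means. The sketch below models only that building block in software. It is not the filter design proposed in the article, and the function names, the unipolar encoding, the stream length, and the fixed seed are assumptions made for illustration.

```python
import random


def to_stream(p, length, rng):
    """Unipolar stochastic stream: each bit is 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(length)]


def mux_add(a, b, rng):
    """Two-input multiplexer with a fair random select: encodes (pa + pb) / 2."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def value(stream):
    """Decode a unipolar stream as the fraction of ones."""
    return sum(stream) / len(stream)


rng = random.Random(0)          # fixed seed so the run is repeatable
length = 100_000                # longer streams give a more accurate result
a = to_stream(0.8, length, rng)
b = to_stream(0.2, length, rng)
scaled_sum = value(mux_add(a, b, rng))   # close to (0.8 + 0.2) / 2
```

The accuracy depends on the stream length, which is why the abstract treats the required stochastic sequence length as a quantity to be analyzed.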

11.
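Implementation of a high-quality 0.6 kb/s vocoder on the TMS320VC55x   Cited by: 1 (self-citations: 0, citations by others: 1)
A high-quality vocoder algorithm with a coding rate of 600 b/s and its hardware implementation on a DSP chip are presented. The paper introduces the principles of the speech coding and decoding algorithm, the hardware structure and workflow of the vocoder system, and the software implementation and code optimization. Tailored to the architecture of the C55x DSP chip, mixed C and assembly programming together with assembly instruction optimization greatly reduce the storage and computational complexity of the algorithm and meet the real-time requirement.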

12.
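A program design approach that combines Matlab with DSP technology is proposed. Taking the design of an FIR low-pass filter as an example, the automatic code generation process and the parameter configuration are described in detail. The FIR low-pass filter implemented with this method runs successfully on the TMS320C6711 DSK development board. Experimental results show that the automatically generated DSP code filters effectively, and that the approach shortens the development cycle of DSP applications and improves programming efficiency.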

13.
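To meet the demands that terahertz wireless communication systems place on high-capacity baseband signal processing algorithms, this work starts from the conventional parallel filter implementation derived directly from polynomial decomposition and, through matrix transformations, derives a lower-complexity parallel implementation of the fast finite impulse response (FIR) filter. On this basis, conversion formulas and implementation architectures for the 2-parallel, 4-parallel, and 8-parallel cases are given using a tensor-product representation. A general implementation formula for the 2N-parallel fast FIR filter is then derived, and the complexity before and after optimization is compared. Finally, the derivation and concrete implementation architecture of a 64-parallel fast FIR filter are given, together with a comparison of hardware complexity before and after optimization; the 64-parallel fast FIR filter algorithm consumes fewer resources.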
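
The 2-parallel case can be sketched with the textbook fast FIR identity, which computes two outputs per step from three half-length sub-filters instead of four. This is an illustration under assumptions, not the matrix or tensor-product formulation derived in the paper: the function names are hypothetical, and the filter and input lengths are assumed to be even. With H = H0 + z^-1 H1 and X = X0 + z^-1 X1 (even and odd phases), the even outputs are H0 X0 plus a delayed H1 X1, and the odd outputs are (H0 + H1)(X0 + X1) - H0 X0 - H1 X1.

```python
def fir(h, x):
    """Direct-form FIR: y[n] = sum_k h[k] * x[n - k], same length as x."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]


def fast_fir_2parallel(h, x):
    """2-parallel fast FIR; assumes len(h) and len(x) are both even."""
    h0, h1 = h[0::2], h[1::2]          # even and odd coefficient phases
    x0, x1 = x[0::2], x[1::2]          # even and odd input phases
    a = fir(h0, x0)                                    # H0 * X0
    b = fir(h1, x1)                                    # H1 * X1
    c = fir([p + q for p, q in zip(h0, h1)],
            [p + q for p, q in zip(x0, x1)])           # (H0 + H1) * (X0 + X1)
    half = len(x0)
    # Even outputs: H0 X0 plus H1 X1 delayed by one half-rate sample.
    y0 = [a[m] + (b[m - 1] if m >= 1 else 0) for m in range(half)]
    # Odd outputs: cross terms recovered from the three products.
    y1 = [c[m] - a[m] - b[m] for m in range(half)]
    y = [0] * len(x)
    y[0::2], y[1::2] = y0, y1
    return y


h = [1, 2, 3, 4]
x = [1, -1, 2, 0, 3, 5, -2, 4]
y_fast = fast_fir_2parallel(h, x)
y_ref = fir(h, x)
```

For a length-N filter, a direct 2-parallel structure needs 2N multiplications per pair of outputs, while the three half-length sub-filters need 3N/2, at the price of a few extra additions.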

14.
This paper studies the problem of robust adaptive filtering in impulsive noise environments using a recursive least M-estimate (RLM) algorithm. The RLM algorithm minimizes a robust M-estimator-based cost function instead of the conventional mean square error (MSE) function. Previous work has shown that the RLM algorithm offers improved robustness to impulses over the conventional recursive least squares (RLS) algorithm. In this paper, the mean and mean square convergence behaviors of the RLM algorithm under the contaminated Gaussian impulsive noise model are analyzed. A lattice structure-based fast RLM algorithm, called the Huber Prior Error Feedback-Least Squares Lattice (H-PEF-LSL) algorithm, is derived. Part of the H-PEF-LSL algorithm was presented in ICASSP 2001. It has an order O(N) arithmetic complexity, where N is the length of the adaptive filter, and can be viewed as a fast implementation of the RLM algorithm based on the modified Huber M-estimate function and the conventional PEF-LSL adaptive filtering algorithm. Simulation results show that the transversal RLM and the H-PEF-LSL algorithms have better performance than the conventional RLS and other RLS-like robust adaptive algorithms tested when the desired and input signals are corrupted by impulsive noise. Furthermore, the theoretical and simulation results on the convergence behaviors agree very well with each other.

15.
Microelectronics Journal, 2002, 33(5-6): 501-508
This paper proposes the FPGA implementation of digit-serial canonical signed-digit (CSD) coefficient FIR filters which can be used as format conversion filters in place of the ones employed for the MPEG2 TM 5 (test model 5). Canonical signed-digit (CSD) representation is a method used to reduce cost by representing a signed number using the fewest nonzero digits, thereby reducing the number of multiply operations. As field-programmable gate arrays (FPGAs) have grown in capacity, improved in performance, and decreased in cost, they are becoming a viable solution for performing computationally intensive tasks, with the ability to tackle applications formerly reserved for custom chips and programmable digital signal processing (DSP) devices. A digit-serial CSD FIR filter design is realized and practical design guidelines are provided using FPGAs. A performance comparison of bit-serial, serial distributed arithmetic, and digit-serial CSD FIR filters on a Xilinx XC4000XL-series FPGA is described. The results show that the proposed digit-serial CSD FIR filter is a compact and efficient implementation of real-time DSP applications on FPGAs.
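
The CSD representation described in this abstract can be sketched in a few lines. The function below is an illustrative conversion, not code from the paper, and its name and digit ordering (least significant digit first) are assumptions. It rewrites an integer with digits from {-1, 0, 1} so that no two adjacent digits are nonzero, which is the property that keeps the nonzero digit count minimal.

```python
def to_csd(n: int) -> list[int]:
    """Return the CSD digits of n, least significant first, each in {-1, 0, 1}."""
    digits = []
    while n != 0:
        if n & 1:
            # Choose +1 when n % 4 == 1 and -1 when n % 4 == 3, so that the
            # next digit is guaranteed to be zero.
            d = 2 - (n & 3)
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits


def from_csd(digits: list[int]) -> int:
    """Rebuild the integer from its signed digits."""
    return sum(d << i for i, d in enumerate(digits))


csd_7 = to_csd(7)     # 7 = 8 - 1, two nonzero digits instead of three
csd_23 = to_csd(23)   # 23 = 32 - 8 - 1, three nonzero digits instead of four
```

Each nonzero digit of a coefficient corresponds to one shifted add or subtract in a shift-and-add multiplier, so fewer nonzero digits means less adder hardware per tap.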

16.
This paper presents a new two-dimensional (2-D) optimum block stochastic gradient (TDOBSG) algorithm for 2-D adaptive finite impulse response (FIR) filtering. The TDOBSG algorithm employs a space-varying convergence factor for all the filter coefficients, where the convergence factor at each block iteration is optimized in a least squares sense that the squared norm of the a posteriori estimation error vector is minimized. It has the same order of computational complexity as another 2-D optimum block adaptive (TDOBA) algorithm. Computer simulations for image restoration show that the TDOBSG algorithm outperforms the TDOBA algorithm and other related algorithms in terms of objective and/or subjective measures.

17.
A fast exact least mean square adaptive algorithm   Cited by: 1 (self-citations: 0, citations by others: 1)
A general block formulation of the least-mean-square (LMS) algorithm for adaptive filtering is presented. This formulation has an exact equivalence with the original LMS algorithm; hence it retains its convergence properties, while allowing a reduction in arithmetic complexity, even for very small block lengths. Working with small block lengths is interesting from an implementation point of view (large blocks result in large memory and large system delay) and allows a significant reduction in the number of operations. Tradeoffs between the number of operations and the convergence rate are obtainable by applying certain approximations to a matrix involved in the algorithm. Hence, the usual block LMS appears as a special case, which explains its convergence behavior according to the type of input signal (correlated or uncorrelated).

18.
This paper describes the implementation of a digital audio effect system-on-a-chip (SoC), which integrates an embedded digital signal processor (DSP) core, audio codec intellectual property, a number of peripheral blocks, and various audio effect algorithms. The audio effect SoC is developed using a software and hardware co-design method. In the design of the SoC, the embedded DSP and some dedicated hardware blocks are developed as a hardware design, while the audio effect algorithms are realized using a software-centric method. Most of the audio effect algorithms are implemented in C code with primitive functions that run on the embedded DSP, while the equalization effect, which requires a large amount of computation, is implemented using a dedicated hardware block with high flexibility. For the optimized implementation of audio effects, we exploit the primitive functions of the embedded DSP compiler, which is a very efficient way to reduce the code size and computation. The audio effect SoC was fabricated using a 0.18 μm CMOS process and evaluated successfully on a real-time test board.

19.
The most computationally intensive part of the wideband receiver of a software defined radio (SDR) is the intermediate frequency (IF) processing block. Digital filtering is the main task in IF processing. The computational complexity of finite impulse response (FIR) filters used in the IF processing block is dominated by the number of adders (subtracters) employed in the multipliers. This paper presents a method to implement FIR filters for SDR receivers using a minimum number of adders. We use an arithmetic scheme, known as pseudo floating-point (PFP) representation, to encode the filter coefficients. By employing a span reduction technique, we show that the filter coefficients can be coded using considerably fewer bits than conventional 24-bit and 16-bit fixed-point filters. Simulation results show that the magnitude responses of the filters coded in PFP meet the attenuation requirements of wireless communication standard specifications. The proposed method offers average reductions of 40% in the number of adders and 80% in the number of full adders needed for the coefficient multipliers over conventional FIR filter implementation methods.

20.
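FIR notch filters offer linear phase, high accuracy, good stability, and other advantages. However, when demanding notch performance is required, a high filter order is usually needed, which greatly increases the hardware implementation complexity of the FIR notch filter. Based on a sparse FIR filter design algorithm and the idea of common subexpression elimination, this paper proposes a low-complexity FIR notch filter design method. The method first uses the sparse filter design algorithm to obtain the coefficients of an original FIR notch filter that meets the frequency-domain design requirements, then applies CSD encoding to them and analyzes the sensitivity of all 2-term subexpressions and isolated digits in the CSD-quantized coefficient set, and finally selects suitable 2-term subexpressions or isolated digits in order of sensitivity to synthesize the filter coefficient set directly. Simulation results show that FIR notch filters designed with the new algorithm use up to 51% fewer adders than existing low-complexity design methods, effectively reducing hardware implementation complexity and saving considerable hardware resources.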
