Similar articles
Found 20 similar articles (search time: 15 ms)
1.
Reversible logic is an emerging field of study with applications in optical information processing, low-power CMOS design, DNA computing, bioinformatics, and nanotechnology. Low power consumption is a basic concern in today's VLSI circuits. To prevent errors from propagating through a quantum circuit, reversible logic gates must be converted into fault-tolerant quantum operations. Parity preservation is used to achieve fault tolerance in these circuits. This paper proposes a new parity-preserving reversible gate, named the NPPG gate. Its most significant feature is that it can be used to build a parity-preserving reversible full adder. The proposed parity-preserving reversible full adder based on the NPPG gate is more efficient than existing designs in terms of quantum cost, and it is optimized in terms of the number of constant inputs and garbage outputs. Compressors are important in VLSI and digital signal processing applications. Effective VLSI compressors reduce the impact of carry propagation in arithmetic operations, and they are built from full adder blocks. We also propose three new parity-preserving reversible 4:2 compressor circuits. The third design is better than the other two in terms of the evaluation parameters. Important contributions have been made in the literature toward the design of reversible 4:2 compressor circuits; however, there have been no efforts toward parity-preserving reversible 4:2 compressor circuits. All designs are evaluated at the nanometric scale.
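
The abstract does not give the truth table of the NPPG gate, so the following is only a minimal sketch of the two properties it relies on, checked on well-known stand-in gates. The Fredkin and Toffoli gates and the helper names `is_reversible` and `is_parity_preserving` are assumptions chosen for illustration; they are not taken from the paper.

```python
from itertools import product


def fredkin(a, b, c):
    """Fredkin (controlled-swap) gate: swaps b and c when a == 1."""
    return (a, c, b) if a else (a, b, c)


def toffoli(a, b, c):
    """Toffoli (controlled-controlled-NOT) gate: flips c when a == b == 1."""
    return (a, b, c ^ (a & b))


def is_reversible(gate, n):
    """A gate is reversible iff it maps the 2**n input tuples to 2**n distinct outputs."""
    outputs = {gate(*bits) for bits in product((0, 1), repeat=n)}
    return len(outputs) == 2 ** n


def is_parity_preserving(gate, n):
    """Parity preserving: XOR of the inputs equals XOR of the outputs for every input."""
    return all(
        sum(bits) % 2 == sum(gate(*bits)) % 2
        for bits in product((0, 1), repeat=n)
    )


if __name__ == "__main__":
    for name, gate in (("Fredkin", fredkin), ("Toffoli", toffoli)):
        print(name, is_reversible(gate, 3), is_parity_preserving(gate, 3))
```

Both stand-in gates are reversible, but only the Fredkin gate keeps the input and output parity equal, which is the property a parity-preserving design such as the one described above needs from every building block.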

2.
Variable digital filters (VDFs) play an essential role in communication and signal processing. The desired frequency response of any prototype filter can be obtained with an all-pass transformation (APT) based variable digital filter (APT-VDF), which maintains complete control over the cutoff frequency. The performance of the APT-VDF is limited by its speed and area utilization. In this paper, the pipelined APT-VDF is modified by developing a new Variable Block Sized Ternary Adder (VBS-TA) and a modified ternary multiplier for fast realization of the filter structure, because the fundamental arithmetic operations in the APT-VDF design are addition and multiplication. Ternary logic transmits more data per interconnection wire, and hence ternary arithmetic requires fewer components and interconnections. The proposed VBS-TA speeds up addition by skipping carry propagation with the help of ternary compound gates. The VBS-TA can also be used to speed up the multiplier circuit in the APT-VDF. Furthermore, the ternary multiplier is modified by introducing a divide-and-conquer approach in the partial product generation stage. Simulation results show that the proposed APT-VDF outperforms existing VDFs in terms of delay, power, and area utilization. It consumes only 0.289 W of power with a latency of 9.24 ns, and it achieves an operating frequency of 210.87 MHz, which is much better than the existing VDFs.

3.
Vision-based mobile robots on highways   Total citations: 1, self-citations: 0, citations by others: 1
《Advanced Robotics》2013,27(4):417-427
Intelligent vehicles are mobile robots on highways. They are expected to improve the safety, efficiency and environmental impacts of the current highway traffic systems. Vision systems will play an important role as sensors for the intelligent vehicles. This paper first compares the vision sensors with other sensing methods from an application point of view and then describes two vision systems, one which we have developed and another which we are developing. Two important features are required for the vision systems applied to intelligent vehicles: three-dimensional (3D) measurement capability and real-time operation. We chose a trinocular stereo vision scheme among a number of 3D vision processing methods because it is suitable for real-time operations with dedicated processor architectures. The trinocular stereo algorithm requires a large number of operations, but all the operations are relatively straightforward and, therefore, they are suitable for custom architecture implementation. The system takes three images simultaneously by using three TV cameras installed on a single horizontal line at the front grill of the test car. Vertical edges are extracted from these images and the spatial offsets (or disparities) among the images are calculated for measuring the distances to the objects. The first version was developed and installed in a car for highway testing. Two custom digital processors were developed: one for edge detection and the other for stereo matching. The test results were encouraging and the architectures based on ASIC (Application Specific Integrated Circuits) are 800 and 550 times more efficient, respectively, compared with conventional microprocessors for edge detection and stereo matching. The second version is currently being developed in order to further reduce the silicon area size. It uses hybrid analog/digital circuit technology while the first version uses only digital circuits. 
We are developing a hybrid analog/digital array processor chip that includes a large number of processing elements. Each processing element includes a digital memory unit, a data flow control switch unit, and an analog arithmetic/logic unit. The analog arithmetic/logic unit reduces the silicon area significantly compared with a digital one. Data flows among the processing elements in the array chip in the form of analog voltages, and the flow is controlled by the data flow switches. The digital memory unit controls the set-up of the data flow control switch and arithmetic/logic units.

4.
In this paper, high-performance VLSI architectures for lifting-based 1D and 2D discrete wavelet transforms (DWTs) are proposed. The area-efficient lifting-based DWT performs the whole operation with one processing element, whereas the delay-efficient lifting-based DWT performs the whole operation with multiple processing elements in parallel. In both cases, the processing element consists of one floating-point adder and one proposed fused multiply-add design. The proposed and existing lifting-based 1D and 2D DWTs are implemented in 45 nm technology. The results show that the proposed designs achieve significant improvement over existing architectures. For example, the proposed 9-point, 2-parallel (9, 7) single-level 1D-DWT achieves a 33.5% reduction in total cycle delay compared with the direct form. Similarly, the proposed 9-point, single-PE (9, 7) single-level 1D-DWT achieves 59.8% and 75.5% reductions in total area and net power, respectively, over the direct form.

5.
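Division is one of the basic operations commonly used in applications such as scientific computing, digital signal processing, communications, and image processing. Based on the SRT 8 division algorithm, an IEEE 754 floating-point divider with a SIMD structure is designed, which can perform either one double-precision floating-point division or two parallel single-precision floating-point divisions on the same hardware. By optimizing the SRT 8 iterative division structure, parallel processing of quotient selection and remainder addition is proposed, and a quotient-digit storage technique is adopted to reduce the computation delay of the iterative division and raise the operating frequency. A reuse strategy is also adopted to reduce hardware resource overhead and save area. Experiments show that, in a 40 nm process, the synthesized cell area of the design is 18601.9681 μm², the operating frequency reaches 2.5 GHz, and the critical-path delay is reduced by 23.81% relative to a conventional SRT 8 implementation.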

6.
Digital technology has developed rapidly in the recent past, especially in signal processing and image processing applications. High performance, high speed, compact size, low power, and low delay are essential requirements for devices used in applications such as signal processing, audio processing, and software-defined radio. Digital devices in particular tend to have large critical logic, high power consumption, and large area in VLSI implementation because of the arithmetic operations of their adder and multiplier designs. The architecture of the Discrete Wavelet Transform (DWT) is affected in this way, since it comprises a number of filter banks at each level, and every filter bank contains a number of adders and multipliers arising from the coefficient decompositions of the low-pass and high-pass filters. This repeated n-size filter logic increases logic size and power consumption. The proposed work presents a novel DWT approach that replaces conventional adders and multipliers with XOR-MUX adders and truncated multipliers, thereby reducing the 2n logic size to n-size logic. Finally, the proposed DWT architecture is designed in VHDL and implemented on an XC6SLX9-2TQG144 FPGA, where its performance in terms of delay, area, and power is demonstrated.

7.
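To meet the design requirements of modern wireless communication systems for high throughput and low resource consumption in signal detection, existing schemes are optimized in terms of combinational logic, data processing capability, and module coupling, and an efficient multi-antenna signal detection scheme is proposed. The scheme has a compact structure and is easy to pipeline. By exploiting the high-speed read/write capability of DDR3 and adopting AXI4-Stream interface encapsulation, it greatly improves the data processing efficiency of the detection stage. With the ZYNQ-7100 as the hardware platform, simulations verify the accuracy and superiority of the scheme. The scheme provides a solution for the core baseband processing part of existing LTE-A systems, and it also offers a useful reference for the IP design of other signal detection products.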

8.
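To meet the design requirements of high-speed FFT processors in Digital Video Broadcasting-Terrestrial (DVB-T) systems, a new radix-16/8 mixed-radix algorithm and its implementation structure are proposed. A single butterfly unit shared between radix-16 and radix-8 processes the data sequentially, and the number of multipliers is reduced, which effectively lowers hardware cost. Inside the arithmetic unit, a cascaded "radix-4 + radix-4/2" pipeline greatly increases computation speed. In addition, a symmetric ping-pong RAM structure improves the butterfly unit's ability to compute continuously, and an improved block floating-point overflow-prevention mechanism preserves computational accuracy. Simulation and implementation results show that the design performs well and fully meets the requirements of practical applications.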

9.
Digital signal processing algorithms often rely on a large number of multiplications, which consume both time and power. However, there are many practical ways to simplify multiplication, such as truncated and logarithmic multipliers. These methods consume less time and power but introduce errors. Nevertheless, they can be used where a shorter delay is more important than accuracy. In digital signal processing these conditions are often met, especially in video compression and tracking, where integer arithmetic gives satisfactory results. This paper presents a simple and efficient multiplier that can achieve arbitrary accuracy through an iterative procedure before reaching the exact result. The multiplier is based on the same number representation as Mitchell’s algorithm, but it uses different error correction circuits from those proposed by Mitchell. As a result, the error correction can be done almost in parallel with the basic multiplication (in practice, through pipelining). The hardware solution involves only adders and shifters, so it is economical in gates and power. The error summary for operands ranging from 8 to 16 bits indicates a very low relative error percentage with only two iterations. To assess the hardware implementation, the proposed multiplier is implemented on a Spartan 3 FPGA. For 16-bit operands, the delay estimate indicates that a multiplier with two iterations can run at a clock frequency above 150 MHz, with a maximum relative error of less than 2%.
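
The abstract does not spell out the correction circuit, so the following is a hedged sketch of one plausible reading of such a scheme. It assumes each operand is written as a leading power of two plus a residue, N = 2^k + r, so that N1·N2 = 2^(k1+k2) + r1·2^k2 + r2·2^k1 + r1·r2. The first three terms need only shifts and adds; the last term, r1·r2, is the error, and it can be approximated by applying the same step to the residues. The function names and the `corrections` parameter are illustrative assumptions.

```python
def leading_one(n):
    """Position k of the most significant set bit, so 2**k <= n < 2**(k + 1)."""
    return n.bit_length() - 1


def basic_block(n1, n2):
    """Shift-and-add approximation of n1 * n2 plus the residues whose product is the error."""
    k1, k2 = leading_one(n1), leading_one(n2)
    r1, r2 = n1 - (1 << k1), n2 - (1 << k2)
    approx = (1 << (k1 + k2)) + (r1 << k2) + (r2 << k1)
    return approx, r1, r2


def iterative_multiply(n1, n2, corrections):
    """Approximate n1 * n2 with one basic block plus up to `corrections` error-correction steps."""
    total = 0
    for _ in range(corrections + 1):
        if n1 == 0 or n2 == 0:
            break  # the remaining error term is exactly zero
        approx, n1, n2 = basic_block(n1, n2)
        total += approx
    return total


if __name__ == "__main__":
    a, b = 234, 198
    for steps in range(4):
        p = iterative_multiply(a, b, steps)
        print(steps, p, f"relative error {(a * b - p) / (a * b):.4%}")
```

Each step clears the leading one of both residues, so under this assumption the approximation never exceeds the true product and becomes exact after at most as many steps as the operand has set bits.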

10.
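Implementing the two-dimensional fast Fourier transform (FFT) on a conventional processor requires a chip with more logic resources. This paper presents a 2D FFT algorithm and its implementation on an FPGA-based custom computing machine (CCM). The 2D FFT is implemented on the Splash-2 CCM platform, reaching a computation speed of 180 Mflops, with a peak speed more than 23 times that of a Sparc-10 workstation. For an N×N image, this implementation also accommodates the O(N² log₂ N) floating-point arithmetic operations that the 2D FFT requires.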

11.
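The arctangent is one of the basic operations commonly used in scientific computing, digital signal processing, image processing, and many other applications. Based on the Piecewise algorithm, a single-precision floating-point arctangent unit compliant with the IEEE-754 standard is designed for a particular DSP chip. To reach the target accuracy, all errors in the computation are analyzed and bounded. To ensure that the output has the same monotonicity as the original function, a design method that guarantees the monotonicity of the algorithm is proposed. To reduce hardware cost, a two-level hierarchical segmentation method is used, and signal bit widths are designed and optimized on the basis of static and dynamic circuit analysis. FPGA-based simulation results show that the arctangent unit has a low hardware cost, that the error between its output and the reference result is less than 1 ulp, and that its output is monotonic.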

12.
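Implementation and optimization of an FPGA-based high-precision signal generator   Total citations: 2, self-citations: 0, citations by others: 2
To meet the needs of an online test system for Coriolis mass flowmeters, a high-precision signal generator based on a field programmable gate array (FPGA) and direct digital synthesis (DDS) is designed. By improving it, the design saves hardware resources while preserving the original resolution...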
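
The abstract is truncated and does not describe the generator's internals, so the following is only a minimal software sketch of the generic DDS principle: a phase accumulator advanced by a tuning word, whose top bits address a sine look-up table. The accumulator width, table size, amplitude width, and all function names are illustrative assumptions, not parameters from the paper.

```python
import math


def make_sine_lut(addr_bits, amp_bits):
    """One full sine period quantized to signed amp_bits-wide integers."""
    size = 1 << addr_bits
    peak = (1 << (amp_bits - 1)) - 1
    return [round(peak * math.sin(2 * math.pi * i / size)) for i in range(size)]


def dds_samples(tuning_word, n_samples, acc_bits=32, addr_bits=10, amp_bits=12):
    """Generate n_samples from a phase accumulator that wraps at 2**acc_bits."""
    lut = make_sine_lut(addr_bits, amp_bits)
    mask = (1 << acc_bits) - 1
    phase = 0
    out = []
    for _ in range(n_samples):
        # Only the top addr_bits of the accumulator address the table.
        out.append(lut[phase >> (acc_bits - addr_bits)])
        phase = (phase + tuning_word) & mask
    return out


def output_frequency(tuning_word, f_clk, acc_bits=32):
    """Standard DDS relation: f_out = tuning_word * f_clk / 2**acc_bits."""
    return tuning_word * f_clk / (1 << acc_bits)


if __name__ == "__main__":
    print(output_frequency(1 << 30, 100e6))  # a quarter of the clock frequency
    print(dds_samples(1 << 30, 8))
```

In this sketch the frequency resolution is f_clk / 2**acc_bits and is set by the accumulator width alone, which is why a design can shrink the look-up table to save hardware without giving up resolution.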

13.
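With the digitization of modern society, digital signal processing techniques are entering every area of social life. Digital signal processing (DSP) plays a major role in putting these techniques into practice. The TMS320 series produced by TI is currently the most commonly used family of digital signal processing chips. In this paper, we use TI's high-performance multiprocessor chip, the TMS320C80, to implement the Fourier transform, the most basic and most commonly used operation in digital signal processing.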

14.
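The efficient solution of the Boolean satisfiability problem is studied. First, through an analysis of the k-SAT problem and of a deterministic continuous-time dynamical system in the form of coupled ordinary differential equations, an improved continuous-time dynamical system equation based on a time-delay information form is proposed in order to preserve the focused-search property. Then, analog designs are proposed for the three main components that implement this system equation: the signal dynamics circuit, the auxiliary variable circuit, and the digital verification circuit. For the signal dynamics circuit, an analog hardware form with higher performance, smaller area, and lower power is designed. The proposed analog hardware designs of the auxiliary variable circuit and the digital verification circuit achieve the goals of keeping the gradient-descent search from getting trapped in non-solutions and of determining whether a solution to the given problem has been found. Two alternative auxiliary variable circuit designs that reduce area and power are also proposed. Simulation results show that the proposed analog SAT solver is not only effective but also achieves higher speedup and lower power than SAT solvers implemented purely as software algorithms and than other hardware SAT solvers.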

15.
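A hardware implementation of a Wallace tree compressor for the mantissa multiplication part of a 32-bit floating-point multiplier is designed. A new Wallace tree compressor is formed by combining 3-2 and 4-2 compression. The RTL code is written in the Verilog hardware description language, functional simulation is performed with VCS, and logic synthesis and optimization are then carried out with Synopsys DC in an SMIC 0.13 μm process. The results show that, during partial product compression, this compressor effectively increases computation speed and substantially reduces hardware area.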
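
The abstract does not give the exact arrangement of the compressors, so the following is a behavioral sketch of the general idea only. It assumes a 24-bit mantissa (the single-precision significand including the hidden bit), models a 3-2 compressor as a word-wide carry-save adder, builds a 4-2 compressor from two chained 3-2 stages, and uses a simple greedy grouping of rows. The grouping order and all names are illustrative assumptions.

```python
def compress_3_2(a, b, c):
    """Carry-save adder on whole words: three addends in, two out, same total."""
    s = a ^ b ^ c
    carry = ((a & b) | (a & c) | (b & c)) << 1
    return s, carry


def compress_4_2(a, b, c, d):
    """4-2 compressor modeled as two chained 3-2 compressors."""
    s1, c1 = compress_3_2(a, b, c)
    return compress_3_2(s1, c1, d)


def partial_products(x, y, width):
    """One shifted copy of x for every set bit of y (zero rows otherwise)."""
    return [(x << i) if (y >> i) & 1 else 0 for i in range(width)]


def wallace_multiply(x, y, width=24):
    """Reduce the partial products to two rows, then do one carry-propagate add."""
    rows = partial_products(x, y, width)
    while len(rows) > 2:
        nxt, i = [], 0
        while len(rows) - i >= 4:          # use 4-2 compression where four rows remain
            nxt.extend(compress_4_2(*rows[i:i + 4]))
            i += 4
        if len(rows) - i == 3:             # fall back to 3-2 compression for three rows
            nxt.extend(compress_3_2(*rows[i:i + 3]))
            i += 3
        nxt.extend(rows[i:])               # pass through any leftover rows
        rows = nxt
    return sum(rows)


if __name__ == "__main__":
    x, y = 0xC90FDB, 0xADF854
    print(hex(wallace_multiply(x, y)), hex(x * y))
```

With 24 rows, this grouping halves the row count per level (24, 12, 6, 4, 2), so only the final two-row addition needs carry propagation.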

16.
Key technologies and algorithms for quantum reversible logic synthesis   Total citations: 1, self-citations: 0, citations by others: 1
Li Zhiqiang, Li Wenqian, Chen Hanwu 《软件学报》 (Journal of Software) 2009,20(9):2332-2343
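The key to optimizing quantum reversible logic is to construct it automatically at minimum quantum cost. To improve the efficiency of automatic generation and optimization of reversible logic, a class-template technique and a fast algorithm are proposed. Templates are an effective optimization tool, and the class-template technique significantly improves template matching efficiency. The R-M algorithm is a good iterative method for reversible logic synthesis; building on its original idea, a hash function is constructed, and on that basis a fast algorithm for reversible logic synthesis is proposed. Experimental results show that, under the same experimental environment, the class-template technique combined with the fast algorithm achieves far better optimization quality and efficiency than other known algorithms.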

17.
This paper presents the implementation of the coarse-grained reconfigurable architecture (CGRA) DART with on-line error detection intended to increase fault tolerance. Most parts of the data paths and of the local memory of DART are protected using a residue code modulo 3, whereas only the logic unit is protected using duplication with comparison. These low-cost hardware techniques would allow temporary faults (including so-called soft errors caused by radiation) to be tolerated, provided that some technique based on re-execution of the last operation is used. Synthesis results obtained for a 90 nm CMOS technology confirm significant hardware and power savings of the proposed approach over the commonly used duplication with comparison. Introducing one extra pipeline stage in the self-checking version of the basic arithmetic blocks significantly reduces the delay overhead compared with our previous design.
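
The following is a minimal software sketch of how a residue code modulo 3 can detect errors in arithmetic, not a model of the DART hardware. It assumes each value travels with its residue, the residue of the result is predicted independently from the operand residues, and a mismatch flags an error. The `fault_mask` parameter, which injects bit flips into the result, and the function names are illustrative assumptions.

```python
def encode(value):
    """Attach the mod-3 check symbol to a non-negative integer."""
    return value, value % 3


def checked_add(x, y, fault_mask=0):
    """Add two encoded values; fault_mask models bit flips in the adder output."""
    (a, ra), (b, rb) = x, y
    result = (a + b) ^ fault_mask
    predicted = (ra + rb) % 3          # residue computed on the narrow check path
    return result, result % 3 == predicted


def checked_mul(x, y, fault_mask=0):
    """Multiply two encoded values; the check path multiplies the residues."""
    (a, ra), (b, rb) = x, y
    result = (a * b) ^ fault_mask
    predicted = (ra * rb) % 3
    return result, result % 3 == predicted


if __name__ == "__main__":
    x, y = encode(1234), encode(5678)
    print(checked_add(x, y))                     # fault-free: check passes
    print(checked_add(x, y, fault_mask=1 << 5))  # single bit flip: check fails
```

A single flipped bit changes the result by plus or minus a power of two, and no power of two is divisible by 3, so every single-bit error in the result is caught. Some multi-bit errors escape, for example flipping two adjacent bits from 0 to 1, which adds a multiple of 3.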

18.
There has been increasing concern about the security of multimedia transactions over real-time embedded systems. Partial and selective encryption schemes have been proposed in the research literature, but these schemes significantly increase the computation cost, leading to tradeoffs in system latency, throughput, hardware requirements, and power usage. In this paper, we propose a lightweight multimedia encryption strategy based on a modified discrete wavelet transform (DWT), which we refer to as the secure wavelet transform (SWT). The SWT provides joint multimedia encryption and compression through two modifications to traditional DWT implementations: (a) parameterized construction of the DWT and (b) subband re-orientation for the wavelet decomposition. The SWT has rational coefficients, which allow us to build a high-throughput hardware implementation on fixed-point arithmetic. We obtain a zero-overhead implementation on custom hardware. Furthermore, a look-up-table-based reconfigurable implementation allows us to load the encryption key into the hardware at run time. A direct implementation on a Xilinx Virtex FPGA gave a clock frequency of 60 MHz, while a reconfigurable multiplier-based design gave an improved clock frequency of 114 MHz. The pipelined implementation of the SWT achieved a clock frequency of 240 MHz on a Xilinx Virtex-4 FPGA and met the timing constraint of 500 MHz on a standard cell realization using 45 nm CMOS technology.

19.
Interval arithmetic, as standardized by the IEEE working group P1788, can be implemented using floating-point arithmetic units with directed rounding modes. The easiest way to represent an interval is by its two bounds, to which simple formulas for the arithmetic operations can be applied. Our goal is to perform interval operations as fast as their floating-point counterparts. Hence, we provide at least two units per operation. We also specify the operation for reverse multiplication (Neumaier in Vienna proposal for interval standardization, 2008), which can be implemented with the division unit. In this paper we do not address optimization. Our primary intention is to give an easily understandable specification of hardware for interval arithmetic.
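
The following is a minimal software sketch of the two-bound representation, not the hardware specified in the paper. Standard Python cannot switch the floating-point rounding mode, so as an assumption the sketch emulates directed rounding by widening each round-to-nearest result by one unit in the last place with `math.nextafter` (Python 3.9+). This keeps the enclosure valid for finite results but is slightly wider than true directed rounding. The `Interval` class and the helper names are illustrative.

```python
import math
from dataclasses import dataclass
from fractions import Fraction


def down(x):
    """Emulated round-toward-minus-infinity: step one float below x."""
    return math.nextafter(x, -math.inf)


def up(x):
    """Emulated round-toward-plus-infinity: step one float above x."""
    return math.nextafter(x, math.inf)


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __add__(self, other):
        # Lower bounds add and round down; upper bounds add and round up.
        return Interval(down(self.lo + other.lo), up(self.hi + other.hi))

    def __mul__(self, other):
        # The product range is bounded by the extreme endpoint products.
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(down(min(products)), up(max(products)))


def encloses(interval, exact):
    """Check an enclosure against an exact rational value."""
    return Fraction(interval.lo) <= exact <= Fraction(interval.hi)


if __name__ == "__main__":
    s = Interval(0.1, 0.1) + Interval(0.2, 0.2)
    print(s, encloses(s, Fraction(0.1) + Fraction(0.2)))
```

The lower and upper bounds are computed independently of each other, which is consistent with the abstract's choice of at least two arithmetic units per operation so that both bounds can be produced at the same time.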

20.
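Based on the characteristics of full-waveform acoustic signals in array acoustic logging, a downhole signal acquisition and processing system that meets the requirements of field operations is designed. The system uses an 80C186 as its control core, which mainly handles communication with the surface system and control of the entire downhole system. Multiple TMS320VC5416 chips form a multi-channel real-time signal processing circuit that performs digital filtering, waveform stacking, first-arrival time picking, and other processing. The first-arrival picking technique based on the short-window/long-window energy ratio algorithm and its implementation are also described.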
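
The abstract names the short-window/long-window energy ratio algorithm without giving its details, so the following is a hedged sketch of one common form of it. It assumes the long window covers the samples just before the candidate index (background noise), the short window covers the samples starting at it (possible signal), and the first index where the ratio of mean energies reaches a threshold is picked. The window placement, the threshold, and the function name are all assumptions.

```python
def first_arrival(samples, short_win, long_win, threshold):
    """Return the first index whose short/long mean-energy ratio reaches threshold, else None."""
    # Prefix sums of the squared samples make every window energy an O(1) lookup.
    prefix = [0.0]
    for s in samples:
        prefix.append(prefix[-1] + s * s)

    for i in range(long_win, len(samples) - short_win + 1):
        short_energy = (prefix[i + short_win] - prefix[i]) / short_win
        long_energy = (prefix[i] - prefix[i - long_win]) / long_win
        if long_energy > 0 and short_energy / long_energy >= threshold:
            return i
    return None


if __name__ == "__main__":
    # Synthetic trace: low-level noise followed by a strong arrival at index 100.
    trace = [0.01] * 100 + [1.0] * 50
    print(first_arrival(trace, short_win=5, long_win=20, threshold=10.0))
```

Because the short window looks ahead, the picked index can precede the true onset by up to `short_win - 1` samples; a real system would calibrate the window lengths and threshold against the tool's signal and noise levels.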
