Similar articles
Found 20 similar articles; search took 531 ms
1.
    
In this paper we investigate -bit serial addition in the context of feed-forward networks based on linear threshold gates. We show that two n-bit operands can be added in overall delay with a feed-forward network constructed with linear threshold gates and latches. The maximum weight value is and the maximum fan-in is . We also investigate the implications our scheme has for performance and cost under small-weight and small-fan-in requirements. We deduce that if the weight values are to be limited by a constant W, two n-bit operands can be added in overall delay with a feed-forward network that has the implementation cost [log W]+1, in terms of linear threshold gates, in terms of latches, and a maximum fan-in of 3[log W]+1. We also prove that, if the fan-in values are to be limited by a constant F+1, two n-bit operands can be added in overall delay with a feed-forward network that has the implementation cost , in terms of linear threshold gates, in terms of latches, and a maximum weight value of . An asymptotic bound of is derived for the overall addition delay in the case that the weight values have to be linearly bounded, i.e., of order O(n). The implementation cost in this case is of order O(log n), in terms of linear threshold gates, and of order O(log2 n), in terms of latches. The maximum fan-in is of order O(log n). Finally, we present a partition technique that substantially reduces the overall cost of the implementation for all the schemes in terms of delay, latches, weights, and fan-in, at the expense of a few additional threshold gates.
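As a rough illustration of serial addition with linear threshold gates and a latch, the sketch below handles the textbook one-bit-per-step case only. It is not the scheme of the abstract above, whose delay and cost formulas did not survive extraction. The names `threshold_gate`, `serial_add`, `to_bits` and `from_bits` are hypothetical.

```python
def threshold_gate(inputs, weights, threshold):
    """Linear threshold gate: output 1 iff the weighted input sum reaches the threshold."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0


def serial_add(a_bits, b_bits):
    """Add two equal-length bit lists (least significant bit first), one bit per step."""
    carry = 0  # plays the role of the latch that holds the carry between steps
    out = []
    for a, b in zip(a_bits, b_bits):
        # Carry-out is the majority of (a, b, carry): unit weights, threshold 2.
        next_carry = threshold_gate([a, b, carry], [1, 1, 1], 2)
        # Sum bit is 1 iff a + b + carry - 2 * next_carry >= 1.
        s = threshold_gate([a, b, carry, next_carry], [1, 1, 1, -2], 1)
        out.append(s)
        carry = next_carry
    out.append(carry)  # final carry becomes the most significant bit
    return out


def to_bits(value, width):
    """Little-endian bit list of a non-negative integer."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))
```

In this sketch the largest weight magnitude is 2 and the largest fan-in is 4. The abstract is concerned with how such bounds trade off against delay when several bits are processed per step.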

2.
The discrete wavelet transform (DWT) provides a new method for signal/image analysis where high frequency components are studied with finer time resolution and low frequency components with coarser time resolution. It decomposes a signal or an image into localized contributions for multiscale analysis. In this paper, we present a parallel pipelined VLSI array architecture for 2D dyadic separable DWT. The 2D data array is partitioned into non-overlapping groups of rows. All rows in a partition are processed in parallel, and consecutive partitions are pipelined. Moreover, multiple wavelet levels are computed in the same pipeline, and multiple DWT problems can also be pipelined. The whole computation requires a single scan of the image data array. Thus, it is suitable for on-line real-time applications. For an N×N image, an m-level DWT can be computed in time units on a processor costing no more than , where q is the partition size, p is the length of the corresponding 1D DWT filters, C_m and C_a are the costs of a parallel multiplier and a parallel adder respectively, and a time unit is the time for a multiplication and an addition. For q=N m, the computing time reduces to . When a large number of DWT problems are pipelined, the computing time is about per problem.
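The separable structure mentioned above (a 1D transform applied to rows, then to columns) can be sketched in software as follows. This is a minimal illustration that assumes the two-tap averaging Haar filter and a single decomposition level. The abstract covers general filters of length p and a pipelined hardware array, neither of which is modelled here. The function names are hypothetical.

```python
def haar_1d(signal):
    """One level of the averaging Haar transform on an even-length sequence."""
    half = len(signal) // 2
    approx = [(signal[2 * i] + signal[2 * i + 1]) / 2 for i in range(half)]
    detail = [(signal[2 * i] - signal[2 * i + 1]) / 2 for i in range(half)]
    return approx + detail  # low-pass half followed by high-pass half


def haar_2d(image):
    """One level of a separable 2D transform: filter every row, then every column."""
    rows = [haar_1d(row) for row in image]
    cols = [haar_1d(list(col)) for col in zip(*rows)]  # zip(*rows) transposes
    return [list(row) for row in zip(*cols)]           # transpose back
```

Because each row is filtered independently of the others, a group of rows can be processed in parallel, which is the property the row-partitioned architecture relies on.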

3.
New algorithms for the DFT and the 2-dimensional DFT are presented. The DFT and the 2-dimensional DFT matrices can be expressed as the Kronecker product of DFT matrices of smaller dimension. These algorithms are synthesized by combining the efficient factorization of the Kronecker product of matrices with the highly hardware-efficient recursive implementation of the smaller DFT matrices. The architectures of the processors implementing these algorithms consist of a 2-dimensional grid of processing elements and have temporal and spatial locality of connections. For computing the DFT of size N or the 2D DFT of size N=N_1 by N_1, these algorithms require 2N multipliers and adders, take approximately computational steps for computing a transform vector, and take approximately computation steps between the computation of two successive transform vectors.
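The Kronecker-product statement can be checked numerically for the 2D case: applying the Kronecker product of two N_1-point DFT matrices to a row-major flattened N_1 by N_1 array gives the 2D DFT. The sketch below uses dense matrices and only the standard library. It illustrates the identity and says nothing about the paper's factorization or hardware mapping. All function names are hypothetical.

```python
import cmath


def dft_matrix(n):
    """Dense n x n DFT matrix with entries exp(-2*pi*i*j*k/n)."""
    return [[cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n)] for j in range(n)]


def kron(a, b):
    """Kronecker product of two dense matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matvec(m, v):
    """Matrix-vector product for a dense matrix stored as a list of rows."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def dft2_direct(x):
    """2D DFT of a square array computed straight from the definition."""
    n = len(x)
    return [[sum(x[r][c] * cmath.exp(-2j * cmath.pi * (u * r + v * c) / n)
                 for r in range(n) for c in range(n))
             for v in range(n)] for u in range(n)]
```

For a square array `x` with row-major flattening `flat`, `matvec(kron(f, f), flat)` with `f = dft_matrix(len(x))` agrees with the flattened output of `dft2_direct(x)` up to rounding.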

4.
CORDIC-based algorithms to compute cos and are proposed. The implementation requires a standard CORDIC module plus a module to compute the direction of rotation, which is the same hardware required for the extended CORDIC vectoring recently proposed by the authors [T. Lang and E. Antelo, IEEE Transactions on Computers, vol. 47, no. 7, 1998, pp. 736–749]. Although these functions can be obtained as a special case of this extended vectoring, the specific algorithm we propose here presents two significant improvements: (1) it uses the same datapath width as the standard CORDIC, even when t has 2n bits (to achieve a granularity of 2^(-n) for the whole range), whereas the extended vectoring unit requires about 2n bits; (2) no repetitions of iterations are needed (the extended vectoring needs some repetitions). The proposed algorithm is compatible with the extended vectoring and, in contrast with previous implementations, the number of iterations and the delay of each iteration are the same as for the conventional CORDIC algorithm.
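For readers unfamiliar with the conventional CORDIC that the abstract uses as its baseline, here is a minimal sketch of the standard rotation mode. It uses floating point instead of a fixed-point datapath, and it is not the algorithm proposed in the abstract. The function name and the iteration count are assumptions made for illustration.

```python
import math


def cordic_rotate(angle, iterations=32):
    """Return (cos(angle), sin(angle)) for |angle| <= pi/2 using CORDIC micro-rotations."""
    # Elementary rotation angles atan(2^-i); in hardware these come from a small table.
    alphas = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Constant gain accumulated by the un-normalized micro-rotations.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    # Pre-scale x so the final vector has unit length.
    x, y, z = 1.0 / gain, 0.0, angle
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0  # direction of rotation, chosen from the residual angle
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * alphas[i]
    return x, y
```

Each iteration needs only shifts, additions and one table entry in a fixed-point implementation, which is why the iteration count and per-iteration delay are the natural cost measures quoted in the abstract.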

5.
This paper is the last in a two-part sequence which studies nonlinear networks containing capacitor-only cutsets and/or inductor-only loops from the geometric, coordinate-free point of view of the theory of differentiable manifolds. For such circuits, it is shown that (subject to certain assumptions) there is a naturally defined Lie group action of on the state space of N, where 0 is the sum of the number of independent capacitor-only cutsets and the number of independent inductor-only loops. Circuit-theoretic sufficient conditions on the reactive constitutive relations are derived for the circuit dynamics to be invariant under this Lie group action. This work was supported by the Natural Sciences and Engineering Research Council of Canada, under Grant Number A7113, and by scholarships from the Natural Sciences and Engineering Research Council of Canada and the Ontario Provincial Government.

6.
Multiplication-accumulation operations described by represent the fundamental computation involved in many digital signal processing algorithms. For high-speed signal processing, one obvious approach to realizing the above computation in VLSI is to employ m discrete multipliers working in parallel. A more area-efficient approach is offered by the merged multiplication technique [5], but its principal drawback is a longer latency than the discrete approach. This work proposes a hardware algorithm for merged array multiplication which eliminates this drawback and achieves a significant improvement in latency compared with the conventional scheme for merged multiplication. The proposed algorithm utilizes multiple wave front computation, as opposed to the traditional approach where computation in an array multiplier is carried out by a single wave front. The improvement in latency is greater than 40% (for m>2) compared with a conventional approach to merged multiplication. The consequent cost in additional VLSI area is found to be rather small. In this paper, we provide a thorough analytic discussion of the proposed algorithm and support it with experimental results.

7.
Let K be a field, k and n positive integers, and let matrices with coefficients in K. For any function there exists a unique solution of the system of difference equations defined by the matrix-k-tuple such that . The system is called a finite-memory system iff for every function g with finite support the values are 0 for sufficiently large . In the case , these systems and the corresponding matrix-k-tuples have been studied in [bis, fm, fmv, fv1, fv, fz]. In this paper I generalize these results to an arbitrary positive integer k and to an arbitrary field K.

8.
This paper considers the problem of constructing feedback stabilizing controllers for the wave operator on n (more generally, AR systems determined by a hyperbolic operator). To accomplish this, the paper must first clarify the notion of an input-output structure on a distributed system, as well as what it means to interconnect two such systems. Both notions are shown to be consequences of a structure which generalizes the standard causal structure of lumped systems determined by the flow of time. Given this apparatus, the paper then constructs feedback controllers which stabilize the wave equation along directions given by a proper cone in n.

9.
In this paper, we first present a novel concept of the 2-D basis interleaving array (also referred to as a basis array for short). An m × m interleaved array is said to be a basis array if the shortest distance among all pairs of elements in each of the so-called m-equivalent sets within the m × m array reaches the maximum. It is shown that this maximum is given by ${\lfloor \sqrt{2m} \rfloor}$ and that an m × m basis array can be constructed by using a simple cyclic translation method. The previously developed concept of successive packing is then generalized in the sense that it can be applied to any basis array to generate an interleaved array of larger size. Although optimality cannot be guaranteed, the concepts of basis arrays and successive packing are extended to M-D cases. It is shown that for any $M \geq 2$, the proposed technique can spread any error burst of block size ${m_{1}^{k} \times m_{2}^{k} \times \cdots \times m_{M}^{k}}$ within an ${ m_{1}^{n} \times m_{2}^{n} \times \cdots \times m_{M}^{n}}$ array ($1 \leq k \leq n-1$) so effectively that the error burst can be corrected with a simple random error-correcting code (provided the error-correcting code is available). It is shown that important prior results in M-D interleaving, such as the t-interleaved array based approach by Blaum et al. and the successive packing approach by Shi and Zhang, become special cases of the framework based on basis arrays and successive packing proposed in this paper.

10.
A novel figure of merit to describe the bandwidth power efficiency of CMOS transconductors— is proposed and optimized for cross-coupled differential pair transconductor structures. The optimization is done in two different ways: univariable unconstrained and multivariable constrained. It is revealed that not only dc biases but also ac input phases can affect the bandwidth power efficiency of the transconductor. The bias voltages which lead to the best ratio at different ac phase combinations are obtained and presented in the article. HSPICE simulations are conducted to verify the theoretical predictions. On the basis of the cross-coupled differential pair transconductor, a biquadratic transconductor-C filter configuration is implemented. The frequency vs. power characteristic of the filter is studied for both optimally and non-optimally biased transconductors. It is shown that optimizing the transconductor structure can improve the performance of the transconductor-C filter. The deviation of the optimal bias condition between the transconductor alone and the transconductor-C filter, caused by the peripheral circuitry included in the filter, is discussed in the article.

11.
In this paper, we propose a reduced-complexity and power-efficient System-on-Chip (SoC) architecture for adaptive interference suppression in CDMA systems. The adaptive Parallel-Residue-Compensation architecture leads to significant performance gain over conventional interference cancellation algorithms. The multi-code commonality is explored to avoid direct Interference Cancellation (IC), which reduces the IC complexity from to . The physical meaning of complete versus weighted IC is applied to clip the weights above a certain threshold so as to reduce the VLSI circuit activity rate. Novel scalable SoC architectures based on simple combinational logic are proposed to eliminate dedicated multipliers with at least saving in hardware resources. A Catapult C High Level Synthesis methodology is applied to explore the VLSI design space extensively and achieve at least speedup. Multi-stage Convergence-Masking-Vector combined with clock gating is proposed to reduce the VLSI dynamic power consumption by up to . This paper was presented in part at IEEE ISCAS in Vancouver, Canada, May 2004.

12.
Effects of nitrogen incorporation on suppression of electron charge traps in Hf-based high-$\kappa$ gate dielectrics have been studied by first-principles calculations, focusing on interactions between N atoms and electrons trapped at oxygen vacancies ($V_{\mathrm{O}}$'s). Our total energy calculations revealed that the formation energy of a doubly occupied state of $V_{\mathrm{O}}$ is significantly increased in $\mathrm{HfO}_{x}\mathrm{N}_{y}$ compared to that in $\mathrm{HfO}_{2}$. This clearly indicates that the electron charge traps at $V_{\mathrm{O}}$'s are considerably suppressed by N incorporation.

13.
A fundamental problem in the symbolic analysis of electric networks using the signal-flow graph (SFG) method is to find the common tree of the current and voltage graphs ( and , respectively). In this paper we introduce a novel method for determining a common tree of both graphs, which may be used to obtain the symbolic network transfer function when carrying out the small-signal analysis of linear(ised) circuits.

14.
The aim of this paper is to give an explicit computation for the potential generated by a dipole on a hexagonal grid. Such a computation will be expressed as the Fourier transform of a distribution on the bidimensional torus .

15.
Without sacrificing the on-current in the transfer characteristics, we have successfully reduced the off-current by an optimal $\mathrm{N}_{2}\mathrm{O}$ plasma treatment to improve the on–off-current ratio in n-type titanium oxide ($\mathrm{TiO}_{x}$) active-channel thin-film transistors. While the high-power (275 W) $\mathrm{N}_{2}\mathrm{O}$ plasma treatment oxidizes the whole $\mathrm{TiO}_{x}$ channel and reduces both on- and off-current, the optimized low-power (150 W) process selectively oxidizes the top portion of the channel and significantly reduces only the off-current. An increase in on–off ratio of almost five orders of magnitude is achieved without change in on-current by using the presented method.

16.
Bipolar transistors are attractive for low-noise front-end readout systems when high speed and low power consumption are required. This paper presents a fully integrated, low-noise front-end design for the future Large Hadron Collider (LHC) experiments using the radiation-hard SOI BiCMOS process. In the present prototype, an input-referred Equivalent Noise Charge (ENC) of 990 electrons (rms) for 12 pF detector capacitance, with a shaping time of 25 ns and power consumption of 1.4 mW/channel, has been measured. The gain of this front-end is 90 mV/MIP (Minimum Ionisation Particle: 1 fC) with non-linearity of less than 3%, and the linear input dynamic range is MIP. These results are obtained at room temperature and before irradiation. Measurements after irradiation by a high-intensity pion beam with an integrated flux of pions/cm² are also presented in this paper.

17.
A new frequency-domain algorithm, the planar Taylor expansion through the fast Fourier transform (FFT) method, has been developed to speed up the computation of Green's function related formulas in the half-space scenario for both the near field (NF) and the far field (FF). Two types of Taylor-FFT algorithms are presented in this paper: the spatial Taylor-FFT and the spectral Taylor-FFT. The former is for the computation of the NF and the latter is for the computation of the FF or the Fourier spectrum. The planar Taylor-FFT algorithm has a computational complexity of $O(N^{2} \log_{2} N^{2})$ for an $N \times N$ computational grid, comparable to the multilevel fast multipole method (MLFMM). More importantly, the narrowband property of many electromagnetic fields allows the Taylor-FFT algorithm to use larger sampling spacing, which is limited by the transverse wave number. In addition, the algorithm is free of singularities. An accuracy of $-50~\mathrm{dB}$ for the planar Taylor-FFT algorithm is easily obtained, and an accuracy of $-80~\mathrm{dB}$ is possible when the algorithm is optimized. The algorithm works particularly well for narrowband fields and quasi-planar geometries.

18.
In this work, a new direct digital frequency synthesizer (DDFS) is proposed, which is based on a new two-level table-lookup (TLTL) scheme combined with Taylor's expansion. This method only needs a lookup-table size of total bits, one multiplier, one n × 3n/4-bit multiplier and two additional smaller multipliers to generate both sine and cosine values (where n is the output precision). Compared with several notable DDFSs, the new design has a smaller lookup-table size and higher SFDR (Spurious Free Dynamic Range) for high-precision output cases, at comparable multiplier and adder complexities. The DDFS is verified by FPGA and EDA tools using Synopsys Design Analyzer and the UMC 0.25 μm cell library, assuming 16-bit output precision. The designed 16-bit DDFS has a small gate count of 2,797 and a high SFDR of 110 dBc.
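The general idea of combining a coarse lookup table with a Taylor correction can be sketched as follows. This is a generic single-table, first-order illustration and not the two-level TLTL scheme of the abstract, whose table sizes were lost in extraction. The function names and the 16-bit phase / 8-bit address split used in the note below are assumptions.

```python
import math


def make_table(address_bits):
    """Coarse sine and cosine samples over one full period, 2**address_bits entries each."""
    size = 1 << address_bits
    step = 2.0 * math.pi / size
    sines = [math.sin(k * step) for k in range(size)]
    cosines = [math.cos(k * step) for k in range(size)]
    return sines, cosines, step


def ddfs_sample(phase, phase_bits, address_bits, table):
    """Sine and cosine for a phase-accumulator word: table lookup plus first-order Taylor term."""
    sines, cosines, step = table
    fine_bits = phase_bits - address_bits
    coarse = phase >> fine_bits              # upper bits address the table
    fine = phase & ((1 << fine_bits) - 1)    # lower bits give the residual angle
    delta = fine * step / (1 << fine_bits)   # residual angle in radians
    s0, c0 = sines[coarse], cosines[coarse]
    # sin(a + d) ~ sin(a) + d*cos(a) and cos(a + d) ~ cos(a) - d*sin(a)
    return s0 + delta * c0, c0 - delta * s0
```

With `table = make_table(8)` and a 16-bit phase word, the truncation error of this first-order sketch is bounded by roughly half the squared table step. Higher-order terms or finer tables reduce it further at the cost of more multipliers or memory.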

19.
A charge-sensitive readout chain has been designed and fabricated in a commercially available 0.8 μm CMOS technology. The readout chain is optimized for pixel detectors measuring soft X-ray energies up to 20 keV. In the first mode an analog signal proportional to the input charge is generated and processed in real time. In the second mode a peak-and-hold operation is enabled and the relevant signal is processed at a later time. This dual mode of operation is controlled by an external digital signal. The readout chain consists of a charge amplifier, a shaper, an operational amplifier which can operate either as a voltage amplifier or as a peak detector, and an output buffer. Its area is . The gain at the shaper output is 378 mV/fC, and the ENC is 16 rms at 160 nsec shaping time. The overall gain is 557 mV/fC, and the ENC is rms with 240 nsec peaking time and 1.4 sec recovery time. The overall power dissipation is 1.5 mW with a load capacitance of 25 pF.

20.
Amorphous $\mathrm{Bi}_{5}\mathrm{Nb}_{3}\mathrm{O}_{15}$ ($\mathrm{B}_{5}\mathrm{N}_{3}$) film grown at 300 $^{\circ}\mathrm{C}$ showed a high $k$ value of 71 at 100 kHz, and a similar $k$ value was observed at 0.5–5.0 GHz. The 80-nm-thick film exhibited a high capacitance density of 7.8 fF/$\mu\mathrm{m}^{2}$ and a low dissipation factor of 0.95% at 100 kHz, with a low leakage-current density of 1.23 nA/$\mathrm{cm}^{2}$ at 1 V. The quadratic and linear voltage coefficients of capacitance of the $\mathrm{B}_{5}\mathrm{N}_{3}$ film were 438 ppm/$\mathrm{V}^{2}$ and 456 ppm/V, respectively, with a low temperature coefficient of capacitance of 309 ppm/$^{\circ}\mathrm{C}$ at 100 kHz. These results confirmed the potential of the amorphous $\mathrm{B}_{5}\mathrm{N}_{3}$ film as a good candidate material for high-performance metal–insulator–metal capacitors.
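The reported capacitance density can be cross-checked against the reported $k$ value and thickness with the ideal parallel-plate relation C/A = ε0·k/t. The sketch below assumes an ideal parallel-plate capacitor with no interfacial layers. The function name is hypothetical.

```python
EPSILON_0 = 8.8541878128e-12  # vacuum permittivity in F/m


def capacitance_density_fF_per_um2(k, thickness_nm):
    """Ideal parallel-plate capacitance per unit area, eps0 * k / t, in fF/um^2."""
    farads_per_m2 = EPSILON_0 * k / (thickness_nm * 1e-9)
    return farads_per_m2 * 1e15 / 1e12  # convert F to fF and m^2 to um^2
```

For k = 71 and an 80 nm film this gives about 7.86 fF/μm², in line with the 7.8 fF/μm² quoted in the abstract.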
