20 similar articles found (search time: 0 ms)
1.
Real-time signal processing requires fast computation of inner products. Distributed arithmetic is a method of inner product computation that uses table lookup and addition in place of multiplication. Distributed arithmetic has previously been shown to produce novel and seemingly efficient architectures for a variety of signal processing computations; however, the methods of design, analysis and comparison have been ad hoc. We propose a systematic method for synthesizing optimal VLSI architectures using distributed arithmetic. A partition of the inner product computation at the word and bit level produces a computation consisting of lookups and additions. We study two classes of algorithms to implement this computation, regular iterative algorithms and tree algorithms, each of which can be expressed in the form of a dependency graph. We use linear and nonlinear maps to assign computations to processors in space and time. Expressions are developed for the area, latency, period and arithmetic error for a particular partition and space/time map of the dependency graph. We use these expressions to formulate a constrained optimization problem over a large class of architectures. We compare distributed arithmetic with more conventional methods for inner product computation and show how area, latency and period may be traded off while maintaining constant error. This work was supported by Ball Aerospace, Boulder, CO and by the Office of Naval Research, Electronics Branch, Arlington, VA under contract ONR 89-J-1070.
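The core idea of distributed arithmetic mentioned in this abstract, replacing multiplications with table lookups and additions, can be illustrated with a minimal sketch. It is not the architecture synthesized in the paper: it assumes fixed integer coefficients, two's-complement inputs of a known word length, and a single full-size lookup table, and the names `build_lut` and `da_inner_product` are hypothetical.

```python
def build_lut(coeffs):
    """Precompute every partial sum of the fixed coefficients.

    Entry `addr` holds the sum of coeffs[i] for each bit i set in `addr`,
    so the table has 2**len(coeffs) entries.
    """
    k = len(coeffs)
    return [
        sum(c for i, c in enumerate(coeffs) if (addr >> i) & 1)
        for addr in range(1 << k)
    ]


def da_inner_product(lut, xs, bits):
    """Inner product of the fixed coefficients with signed `bits`-bit inputs.

    Only lookups, shifts, and additions are used. Inputs are interpreted
    as two's-complement words, so the most significant bit plane carries
    a negative weight.
    """
    mask = (1 << bits) - 1
    acc = 0
    for b in range(bits):
        # Gather bit b of every input word into one table address.
        addr = 0
        for i, x in enumerate(xs):
            addr |= (((x & mask) >> b) & 1) << i
        term = lut[addr] << b
        # The sign bit plane is subtracted; all other planes are added.
        acc += -term if b == bits - 1 else term
    return acc


coeffs = [3, -5, 7, 2]          # illustrative fixed coefficients
xs = [4, -3, 1, -8]             # illustrative 4-bit signed inputs
lut = build_lut(coeffs)
result = da_inner_product(lut, xs, bits=4)
reference = sum(c * x for c, x in zip(coeffs, xs))
```

The table grows as 2^K in the number of coefficients K, which is one reason a word-level and bit-level partition of the computation, as the abstract describes, matters in practice.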
2.
3.
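To address data sharing across heterogeneous sources and to provide a standard, unified, feature-rich interface, this work improves on traditional SDO data integration and adds server support on top of P2P. This raises module reuse and lowers communication complexity: users can fetch data from peers, from a CDN, or from other servers.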
4.
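The paper first analyzes why middleware technology and distributed systems emerged and what advantages they offer. Then, based on the requirements of new-generation expert systems and the shortcomings of traditional ones, it proposes introducing middleware technology into the traditional expert system architecture, turning the traditional expert system into a new kind of expert system: a middleware-based distributed expert system.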
5.
Calhoun B.H. Honore F.A. Chandrakasan A.P. 《Solid-State Circuits, IEEE Journal of》2004,39(5):818-826
Multithreshold CMOS (MTCMOS) circuits reduce standby leakage power with low delay overhead. Most MTCMOS designs cut off the power to large blocks of logic using large sleep transistors. Locally distributing sleep devices has remained less popular even though it has several advantages described in this paper. However, locally placed sleep devices are only feasible if sneak leakage currents are prevented. This paper makes two contributions to leakage reduction. First, we examine the causes of sneak leakage paths and propose a design methodology that enables local insertion of sleep devices for sequential and combinational circuits. A set of design rules allows designers to prevent most sneak leakage paths. A fabricated 0.13-μm, dual V_T test chip employs our methodology to implement a low-power FPGA architecture with gate-level sleep FETs and over 8× measured standby current reduction. Second, we describe the implementation and benefits of local sleep regions in our design and examine the interfacing issues for this technique. Local sleep regions reduce leakage in unused circuit components at a local level while the surrounding circuits remain active. Measured results show that local sleep regions reduce leakage in active configurable logic blocks (CLBs) on our chip by up to 2.2×, depending on configuration.
6.
A distributed approach to railway traffic control is described. The approach overcomes the upper bounds imposed on the size of controlled areas by the requirement for real-time processing when centralized methodologies are applied. The control problem is modeled in terms of resource allocation tasks, and the concept of priority is generalized to rule local control decisions. The analysis of a global network's behavior, as derived from the integration of local microdecisions, prefigures a depletion effect which will protect the system from traffic jam collapses. Simulation runs are reported to show the control system's overall operation.
7.
Pop P. Eles P. Zebo Peng Pop T. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2004,12(8):793-811
In this paper, we present an approach to mapping and scheduling of distributed embedded systems for hard real-time applications, aiming at a minimization of the system modification cost. We consider an incremental design process that starts from an already existing system running a set of applications. We are interested in implementing new functionality such that the timing requirements are fulfilled and the following two requirements are also satisfied: 1) the already running applications are disturbed as little as possible and 2) there is a good chance that later, new functionality can easily be added to the resulting system. Thus, we propose a heuristic that finds the set of already running applications which have to be remapped and rescheduled at the same time as the new application is mapped and scheduled, such that the disturbance on the running system (expressed as the total cost implied by the modifications) is minimized. Once this set of applications has been determined, we outline a mapping and scheduling algorithm aimed at fulfilling the requirements stated above. The approaches have been evaluated based on extensive experiments using a large number of generated benchmarks as well as a real-life example.
8.
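To meet the need for developing the professional skills of equipment support personnel, a virtual networked test system built on cloud computing concepts is proposed. It supports test training and teaching in both SaaS and IaaS modes, with the goal of making training available wherever a network connection exists. After analyzing the basic architecture of the virtual networked test system and the structure of its supporting environment, the paper focuses on the design of the distributed interaction middleware: a series of standard signal FOM modules is developed following the signal description method defined in IEEE 1641-2010; the system FOM is built dynamically using the modular FOM mechanism of IEEE 1516-2010; and Web services are invoked through the WSDL API to achieve distributed interaction across a wide area network. Finally, an actual system developed along this technical route is presented.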
9.
10.
Lu J.-C. Holton W.C. Fenner J.S. Williams S.C. Kim K.W. Hartford A.H. Chen D. Roze K. Littlejohn M.A. 《Electron Devices, IEEE Transactions on》1998,45(3):634-642
As future technology generations for integrated circuits continue to “shrink”, TCAD tools must be made more central to manufacturing issues; thus, yield optimization and design for manufacturing (DFM) should be addressed integrally with performance and reliability when using TCAD during the initial product design. This paper defines the goals for DFM in TCAD simulations and outlines a formal procedure for achieving an optimized result (ODFM). New design of experiments (DOE), weighted least squares modeling and multiple-objective mean-variance optimization methods are developed as significant parts of the new ODFM procedure. Examples of designing a 0.18-μm MOSFET device are given to show the impact of device design procedures on device performance distributions and sensitivity variance profiles.
11.
Chaudhuri S. Blythe S.A. Walker R.A. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1997,5(1):69-81
This paper describes an exact solution methodology, implemented in Rensselaer's Voyager design space exploration system, for solving the scheduling problem in a three-dimensional (3-D) design space: the usual two-dimensional (2-D) design space (which trades off area and schedule length), plus a third dimension representing clock length. Unlike design space exploration methodologies which rely on bounds or estimates, this methodology is guaranteed to find the globally optimal solution to a 3-D scheduling problem. Furthermore, this methodology efficiently prunes the search space, eliminating provably inferior design points through the following: 1) a careful selection of candidate clock lengths and 2) tight bounds on the number of functional units or on the schedule length. Both chaining and multicycle operations are supported.
12.
《Solid-State Circuits, IEEE Journal of》1984,19(1):37-39
A technology-updatable design methodology for three-dimensional CMOS circuits has been developed. Four levels of abstraction have been implemented with topographical congruence: (1) technology level, (2) mask level, (3) transistor level, and (4) logic level. A novel transistor-level symbolic representation is introduced which emphasizes the three-dimensional nature of the circuits. A number of design examples are presented.
13.
Dutta S. Wolf W. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1999,7(2):229-240
The programmable video signal processor (VSP) is an important category of processors for multimedia systems. Programmable video processors combine the flexibility of programmability with special architectural features that improve performance on video processing applications. VSPs are typically multiple processors with several processing elements (PEs) and a parallel memory system. This paper focuses on the architectural design of the PEs in a video processor and shows how technology and circuit parameters influence the structure of the datapath and, hence, the overall architecture of a programmable VSP. We emphasize the need to consider technological and circuit-level issues during the design of a system architecture and present a method whereby the conceptual organization of the PEs (the number of PEs, pipelining of the datapath, size of the register file, and number of register ports) can be evaluated in terms of a target set of applications before a detailed design is undertaken. We use motion estimation and the discrete cosine transform as example applications to illustrate how various technology parameters affect the architectural design choices. We show that the design of the register file and the datapath pipeline depth can drastically affect PE utilization and, therefore, the number of PEs required for different applications. Our results demonstrate that pursuing the fastest cycle time can greatly increase the silicon area which must be devoted to PEs, due to both increased pipeline latency and reduced register file bandwidth.
14.
Cherkauer B.S. Friedman E.G. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1995,3(1):99-111
In this paper, the various disparate approaches to CMOS tapered buffer design are unified into an integrated design methodology. Circuit speed, power dissipation, physical area, and system reliability are the four performance criteria of concern in tapered buffers, and each places a separate, often conflicting, constraint on the design of a tapered buffer. Enhanced short-channel tapered buffer design equations are presented for propagation delay and power dissipation, as well as a new split-capacitor model of hot-carrier reliability of tapered buffers and a two-component physical area model. Each performance criterion is individually investigated and analyzed with respect to the number of stages and tapering factor, and the interaction of the four criteria is examined to develop both a qualitative and a quantitative understanding of the various design tradeoffs. The creation of process dependent look-up tables for optimal buffer design is described, and a methodology to apply these look-up tables to application-specific tapered buffers for both unconstrained and constrained systems is developed.
15.
Davide Cartasegna Piero Malcovati Lorenzo Crespi Kyehyung Lee Andrea Baschirotto 《Analog Integrated Circuits and Signal Processing》2014,78(3):785-798
This paper presents a design methodology for high-order class-D amplifiers, based on their similarity with sigma–delta (ΣΔ) modulators, for which established theory and toolboxes are available. The proposed methodology, which covers the entire design flow, from specifications to component sizing, is validated with three design examples, namely a second-order, a third-order, and a fourth-order class-D amplifier. Moreover, the third-order class-D amplifier has been integrated on silicon and characterized, further confirming the validity of the whole design flow. The achieved results demonstrate that high-order class-D amplifiers can achieve total-harmonic-distortion (THD) performance compatible with the specifications of high-end audio applications (THD ≈ 90 dB), which would be unfeasible with conventional first-order class-D amplifiers.
16.
《Electron Devices, IEEE Transactions on》1984,31(2):171-173
A technology-updatable design methodology for three-dimensional (3-D) CMOS circuits has been developed. Four levels of abstraction have been implemented with topographical congruence: 1) technology level, 2) mask level, 3) transistor level, and 4) logic level. A novel transistor-level symbolic representation is introduced which emphasizes the three-dimensional nature of the circuits. A number of design examples are presented.
17.
Chi-Fang Li Yuan-Sun Chu Wern-Ho Sheen Fu-Chin Tian Ho J.-S. 《Solid-State Circuits, IEEE Journal of》2004,39(5):852-857
This paper presents a low-power ASIC design for cell search in the wideband code-division multiple-access (W-CDMA) system. A low-complexity algorithm that is able to work satisfactorily under the effect of large frequency and clock errors is designed first. Then, a set of low-power measures is employed in the design of hardware architecture and circuits. Finally, through power analysis, critical blocks are identified and redesigned so as to further reduce the power consumption. The final design shows that the power is reduced by 51% from the original design of 133.6 mW to 65.49 mW, and its core area is also reduced by 31.9% from 3.4 × 3.4 mm² to 2.8 × 2.8 mm². The design is implemented and verified in a 3.3-V 0.35-μm CMOS technology with a clock rate of 15.36 MHz.
18.
19.
Contradictory trends in the industrial design environment have increased uncertainty while decreasing the tolerance to uncertainty. Worst case design techniques, still widely used in industry, do not provide the accuracy required to design under these conditions. On the other hand, statistical design techniques do provide a significant improvement in accuracy, by virtue of their “circuit adaptive” behavior, but at a substantial cost in computational effort. One practical solution to improving the accuracy of worst case design without sacrificing efficiency is considered here. It integrates an efficient statistical circuit simulator with worst case design tools into a hierarchical performance design process. It employs two stages of worst case analysis, calibrated with statistical circuit simulation, serving as filters to screen out circuits that easily meet their performance requirements. This focuses the use of statistical circuit simulation on those circuits for which the improved accuracy provides significant benefit. This methodology has been applied with outstanding results in design and manufacturing.
20.
Stiller B. Class C. Waldvogel M. Caronni G. Bauer D. 《Selected Areas in Communications, IEEE Journal on》1999,17(9):1580-1598
Distributed multimedia applications require a variety of communication services. These services and different application requirements have to be provided and supported within: (1) end-systems in an efficient and integrated manner, combining the precise specification of quality-of-service (QoS) requirements, application interfaces, multicast support, and security features and (2) the network. The Da CaPo++ system presented in this paper provides an efficient end-system middleware for multimedia applications, capable of handling various types of applications in a modular fashion. Application needs and communication demands are specified by values in terms of QoS attributes and functional properties, such as encryption requirements or multicast support. Da CaPo++ automatically configures suitable communication protocols, provides efficient runtime support, and offers an easy-to-use, object-oriented application programming interface. While its applicability to real-life applications was shown by prototype implementations, performance evaluations have been carried out, yielding practical experience and numerical results.