A VLSI design methodology for distributed arithmetic |
| |
Authors: | Wayne P. Burleson and Louis L. Scharf |
| |
Affiliation: | (1) Department of Electrical and Computer Engineering, University of Massachusetts, 01003 Amherst, MA;(2) Department of Electrical and Computer Engineering, University of Colorado, 80309 Boulder, CO |
| |
Abstract: | Real-time signal processing requires fast computation of inner products. Distributed arithmetic is a method of inner product computation that uses table lookup and addition in place of multiplication. Distributed arithmetic has previously been shown to produce novel and seemingly efficient architectures for a variety of signal processing computations; however, the methods of design, analysis and comparison have been ad hoc. We propose a systematic method for synthesizing optimal VLSI architectures using distributed arithmetic. A partition of the inner product computation at the word and bit level produces a computation consisting of lookups and additions. We study two classes of algorithms to implement this computation, regular iterative algorithms and tree algorithms, each of which can be expressed in the form of a dependency graph. We use linear and nonlinear maps to assign computations to processors in space and time. Expressions are developed for the area, latency, period and arithmetic error for a particular partition and space/time map of the dependency graph. We use these expressions to formulate a constrained optimization problem over a large class of architectures. We compare distributed arithmetic with more conventional methods for inner product computation and show how area, latency and period may be traded off while maintaining constant error. This work was supported by Ball Aerospace, Boulder, CO and by the Office of Naval Research, Electronics Branch, Arlington, VA under contract ONR 89-J-1070. |
| |
Keywords: | |
This article is indexed in SpringerLink and other databases. |
|
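The abstract describes distributed arithmetic as computing an inner product with table lookups and additions instead of multiplications. The following is a minimal software sketch of that general idea, not the architecture or partitioning proposed in the paper. It assumes fixed integer coefficients, two's-complement inputs of a known word length, and a single lookup table processed one bit position at a time; the names `build_table` and `da_inner_product` are hypothetical.

```python
def build_table(coeffs):
    """Precompute, for every bit pattern, the sum of the selected coefficients.

    Bit k of the table address selects coeffs[k].
    """
    n = len(coeffs)
    table = [0] * (1 << n)
    for addr in range(1, 1 << n):
        low = addr & -addr            # lowest set bit of the address
        k = low.bit_length() - 1      # index of the coefficient it selects
        table[addr] = table[addr ^ low] + coeffs[k]
    return table


def da_inner_product(coeffs, xs, bits):
    """Inner product of coeffs and xs using only lookups, shifts and adds.

    Assumes each x is an integer in [-2**(bits-1), 2**(bits-1) - 1],
    interpreted as a two's-complement word of length `bits`.
    """
    table = build_table(coeffs)
    mask = (1 << bits) - 1
    words = [x & mask for x in xs]    # two's-complement encoding of each input
    acc = 0
    for b in range(bits):
        # Gather bit b of every input word into one table address.
        addr = 0
        for k, w in enumerate(words):
            addr |= ((w >> b) & 1) << k
        term = table[addr] << b       # weight the lookup by 2**b
        if b == bits - 1:
            acc -= term               # the sign bit carries negative weight
        else:
            acc += term
    return acc


coeffs = [3, -5, 7, 2]
xs = [10, -4, 0, -128]
print(da_inner_product(coeffs, xs, bits=8))
print(sum(c * x for c, x in zip(coeffs, xs)))  # direct computation for comparison
```

The table has 2^n entries for n coefficients, which is why splitting the computation into smaller tables becomes attractive as n grows; this sketch uses one undivided table for simplicity.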