Found 20 similar documents; search took 55 ms
1.
2.
A reconfigurable multifunction computing cache architecture  Cited by: 1 (self-citations: 0, citations by others: 1)
Huesung Kim Somani A.K. Tyagi A. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2001,9(4):509-523
A considerable portion of a microprocessor chip is dedicated to cache memory. However, not all applications need all the cache storage all the time, especially the computing bandwidth-limited applications. In addition, some applications have large embedded computations with a regular structure. Such applications may be able to use additional computing resources. If the unused portion of the cache could serve these computation needs, the on-chip resources would be utilized more efficiently. This presents an opportunity to explore the reconfiguration of a part of the cache memory for computing. Thus, we propose adaptive balanced computing (ABC): dynamic resource configuration, on demand from the application, between memory and computing resources. In this paper, we present a cache architecture to convert a cache into a computing unit for either of the following two structured computations: finite impulse response and discrete/inverse discrete cosine transform. In order to convert a cache memory to a function unit, we include additional logic to embed multibit output lookup tables into the cache structure. The experimental results show that the reconfigurable module improves the execution time of applications with a large number of data elements by a factor as high as 50 and 60.
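To make the idea of lookup tables standing in for multipliers concrete, here is a minimal sketch of one FIR output computed with distributed arithmetic, a common LUT-based technique. The abstract does not say the paper uses exactly this scheme; the function names, the unsigned 8-bit input width, and the example coefficients are all assumptions made for illustration.

```python
def build_lut(coeffs):
    """Precompute, for every bit pattern, the sum of the selected coefficients."""
    n = len(coeffs)
    return [sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
            for addr in range(1 << n)]


def fir_output_da(samples, lut, bits=8):
    """One FIR output from unsigned `bits`-bit samples, using only LUT reads,
    shifts, and adds. samples[k] is weighted by coeffs[k] (hypothetical layout)."""
    acc = 0
    for b in range(bits):
        # Gather bit b of every sample into a LUT address.
        addr = 0
        for k, x in enumerate(samples):
            addr |= ((x >> b) & 1) << k
        # The LUT entry is the partial sum for this bit plane; weight it by 2**b.
        acc += lut[addr] << b
    return acc


def fir_output_direct(samples, coeffs):
    """Reference result using ordinary multiply-accumulate."""
    return sum(c * x for c, x in zip(coeffs, samples))


coeffs = [3, -1, 4, 2]          # hypothetical filter taps
samples = [17, 200, 5, 255]     # hypothetical 8-bit inputs
lut = build_lut(coeffs)
result = fir_output_da(samples, lut)
```

The table has one entry per combination of tap bits, so it grows as 2^taps; practical designs usually split long filters into several small tables.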
3.
Noguera J. Badia R.M. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2006,14(7):730-739
In this paper, we propose a configuration-aware data-partitioning approach for reconfigurable computing. We show how the reconfiguration overhead impacts the data-partitioning process. Moreover, we explore the system-level power-performance tradeoffs available when implementing streaming embedded applications on fine-grained reconfigurable architectures. For a certain group of streaming applications, we show that an efficient hardware/software partitioning algorithm is required when targeting low power. However, if the application objective is performance, then we propose the use of dynamically reconfigurable architectures. We propose a design methodology that adapts the architecture and algorithms to the application requirements. The methodology has been proven to work on a real research platform based on Xilinx devices. Finally, we have applied our methodology and algorithms to the case study of image sharpening, which is required nowadays in digital cameras and mobile phones.
4.
5.
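Software radio technology and reconfigurable computing architecture  Cited by: 1 (self-citations: 0, citations by others: 1)
1. Technology trends. Mobile communication is the mainstay of modern wireless communication. According to ITU Recommendation M.1225, mobile communication operates in a complex and changing mobile environment, so the effects of severe time variation and multipath propagation must be taken into account. In modern wireless communication systems, particularly code division multiple access (CDMA) systems, smart antennas and joint detection are generally desired in order to increase system capacity, improve system sensitivity, and achieve a longer communication range at lower transmit power.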
6.
Khan J. Vemuri R. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2006,14(2):135-147
We define portable reconfigurable computing platforms as those which have some form of configurable logic coupled with other on-chip or off-chip processing units such as soft processors, embedded processors, and voltage-scalable processors. In the first part of this paper, we present a unique methodology in which we dynamically change the active area of a field programmable gate array (FPGA) to vary the battery usage and lifetime of the system. We test it on several different task-graph structures and report that it uses 14% less battery capacity on average, and up to 21% less, than nonoptimal execution. In the second part of this paper, we integrate the above methodology with more traditional voltage and frequency scaling techniques for portable systems and present a heuristic iterative algorithm for single and multiple processing units. The iterative heuristic algorithm finds a sequence of tasks along with an appropriate design point (implementation option) for each task, such that a deadline is met and the amount of battery energy used is as small as possible. We have used several real-world benchmarks to test the effectiveness of this methodology and we will present the results.
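The selection problem described here (one design point per task, deadline met, energy minimized) can be illustrated with a small exhaustive search. This is not the paper's iterative heuristic, which the abstract does not detail; it is a brute-force baseline that ignores task ordering, and the task data and the name `pick_design_points` are hypothetical.

```python
from itertools import product


def pick_design_points(tasks, deadline):
    """tasks[i] is a list of (time, energy) design points for task i.
    Returns (energy, time, choice) with minimal energy under the deadline,
    or None if no combination meets the deadline."""
    best = None
    for choice in product(*[range(len(opts)) for opts in tasks]):
        time = sum(tasks[i][c][0] for i, c in enumerate(choice))
        energy = sum(tasks[i][c][1] for i, c in enumerate(choice))
        if time <= deadline and (best is None or energy < best[0]):
            best = (energy, time, choice)
    return best


# Hypothetical tasks: a fast, energy-hungry option and a slow, frugal option each.
tasks = [
    [(2, 10), (4, 6)],
    [(3, 9), (5, 4)],
    [(1, 5), (2, 3)],
]
best = pick_design_points(tasks, deadline=9)
```

Exhaustive search grows exponentially with the number of tasks, which is one reason a heuristic is attractive for realistic task graphs.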
7.
Fallah F. Liao S. Devadas S. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2000,8(1):9-17
Unate and binate covering problems are a subclass of general integer linear programming problems with which several problems in logic synthesis, such as two-level logic minimization and technology mapping, are formulated. Previous branch-and-bound methods for solving these problems exactly use lower bounding techniques based on finding maximal independent sets. In this paper, we examine lower bounding techniques based on linear programming relaxation (LPR) for the covering problem. We show that a combination of traditional reductions (essentiality and dominance) and incremental computation of LPR-based lower bounds can exactly solve difficult covering problems orders of magnitude faster than traditional methods.
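A minimal sketch of exact unate covering by branch and bound, with the essentiality and row-dominance reductions named in the abstract, is given below. Note the swap: it prunes with the older independent-set lower bound, not the LPR bound the paper studies, because the latter needs an LP solver. Unit column costs, non-empty rows, and all function names are assumptions.

```python
def reduce_cover(rows, chosen):
    """Apply essentiality and row dominance until nothing changes.
    rows: list of frozensets, each holding the columns that cover that row."""
    changed = True
    while changed and rows:
        changed = False
        for r in rows:
            if len(r) == 1:
                # Essentiality: a row with one candidate column forces that column.
                (col,) = r
                chosen.add(col)
                rows = [q for q in rows if col not in q]
                changed = True
                break
        else:
            # Row dominance: drop row r if another row q is a subset of r,
            # since any column covering q also covers r.
            kept = [r for i, r in enumerate(rows)
                    if not any(q < r or (q == r and j < i)
                               for j, q in enumerate(rows) if j != i)]
            if len(kept) != len(rows):
                rows, changed = kept, True
    return rows


def independent_rows_bound(rows):
    """Greedy set of pairwise disjoint rows; its size is a valid lower bound."""
    used, count = set(), 0
    for r in sorted(rows, key=len):
        if used.isdisjoint(r):
            used |= r
            count += 1
    return count


def min_cover(rows, chosen=frozenset(), best=None):
    """Return a minimum-size set of columns that covers every row."""
    chosen = set(chosen)
    rows = reduce_cover([frozenset(r) for r in rows], chosen)
    if best is not None and len(chosen) + independent_rows_bound(rows) >= len(best):
        return best  # this branch cannot beat the incumbent
    if not rows:
        return chosen
    # Branch on the columns of a shortest row: one of them must be selected.
    for col in sorted(min(rows, key=len)):
        remaining = [q for q in rows if col not in q]
        best = min_cover(remaining, chosen | {col}, best)
    return best


cycle = [{1, 2}, {2, 3}, {3, 4}, {4, 1}]
cover = min_cover(cycle)
```

Replacing `independent_rows_bound` with the optimum of the LP relaxation (rounded up) would give the kind of tighter bound the abstract refers to.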
8.
Compton K. Zhiyuan Li Cooley J. Knol S. Hauck S. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2002,10(3):209-220
Due to its potential to greatly accelerate a wide variety of applications, reconfigurable computing has become a subject of a great deal of research. By mapping the compute-intensive sections of an application to reconfigurable hardware, custom computing systems exhibit significant speedups over traditional microprocessors. However, this potential acceleration is limited by the requirement that the speedups provided must outweigh the considerable cost of reconfiguration. The ability to relocate and defragment configurations on field programmable gate arrays (FPGAs) can dramatically decrease the overall reconfiguration overhead incurred by the use of the reconfigurable hardware. We therefore present hardware solutions to provide relocation and defragmentation support with a negligible area increase over a generic partially reconfigurable FPGA, as well as software algorithms for controlling this hardware. This yields an 8- to 12-fold improvement over the configuration overheads of traditional serially programmed FPGAs.
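The benefit of defragmentation can be sketched with a toy one-dimensional model in which each configuration occupies a contiguous band of rows. The row-based model, the function names, and the example placements are assumptions for illustration; the abstract does not describe the hardware mechanism or the control algorithms at this level.

```python
def largest_free_block(configs, total_rows):
    """configs maps name -> (start_row, height). Returns the largest free gap."""
    largest, cursor = 0, 0
    for start, height in sorted(configs.values()):
        largest = max(largest, start - cursor)
        cursor = start + height
    return max(largest, total_rows - cursor)


def defragment(configs):
    """Relocate configurations toward row 0, keeping their relative order,
    so that all free rows merge into one block at the end."""
    packed, next_row = {}, 0
    for name, (start, height) in sorted(configs.items(), key=lambda kv: kv[1][0]):
        packed[name] = (next_row, height)
        next_row += height
    return packed


# Hypothetical placements on a 32-row device.
configs = {"fir": (0, 4), "dct": (10, 6), "aes": (20, 4)}
before = largest_free_block(configs, total_rows=32)
packed = defragment(configs)
after = largest_free_block(packed, total_rows=32)
```

In this example a 12-row configuration cannot be loaded before defragmentation but fits afterwards without evicting anything, which avoids a later reload of an evicted configuration.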
9.
As a computing paradigm that combines temporal and spatial computation, dynamic reconfigurable computing offers advantages in flexibility, energy efficiency, and area efficiency, attracting interest from both academia and industry. However, dynamic reconfigurable computing is not yet mature because of several unsolved problems. This work introduces the concept, architecture, and compilation techniques of dynamic reconfigurable computing. It also discusses the major existing challenges and points out potential applications.
10.
11.
Eugénia Moreira Bernardino Juan Manuel Sánchez-Pérez 《Optical Switching and Networking》2012,9(2):97-117
In recent years, the number of users of Internet-based applications has grown exponentially, and the demand for transmission capacity (bandwidth) has risen significantly as a result. When managed properly, ring networks are uniquely suited to deliver a large amount of bandwidth in a reliable and inexpensive way. In this paper, we consider two problems that arise in the design of optical telecommunication networks, namely the SONET Ring Assignment Problem (SRAP) and the Intraring Synchronous Optical Network Design Problem (IDP), both known to be NP-hard. In SRAP, the objective is to minimise the number of rings (i.e., DXCs). In IDP, the objective is to minimise the number of ADMs. Both problems are subject to a ring capacity constraint. To solve these problems, we propose two bee-inspired algorithms: Hybrid Artificial Bee Colony and Hybrid Bees Algorithm. We hybridise the basic form of these algorithms with local search in order to refine newly constructed solutions. We also perform comparisons with other algorithms from the literature and use larger instances. The simulation results verify the effectiveness and robustness of the proposed algorithms.
12.
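In the early stages of designing a reconfigurable computing chip, determining the number of each kind of interconnect resource is a key problem. Too few interconnect resources may make some algorithms in the application domain impossible to implement, while too many waste chip area. Based on the characteristics of reconfigurable computing, this paper analyzes three types of interconnect resources: adjacent connections, routed connections, and near-neighbor connections. By building a stochastic model for interconnect resource estimation, it proposes a method for estimating the number of each type of interconnect resource in a reconfigurable computing chip. Simulation results show that the method estimates these numbers fairly accurately, and can therefore guide interconnect resource design for reconfigurable computing and reduce design risk.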
13.
Maestre R. Kurdahi F.J. Fernandez M. Hermida R. Bagherzadeh N. Singh H. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2001,9(6):858-873
Dynamically reconfigurable architectures are emerging as a viable design alternative to implement a wide range of computationally intensive applications. At the same time, an urgent necessity has arisen for support tool development to automate the design process and achieve optimal exploitation of the architectural features of the system. Task scheduling and context (configuration) management become very critical issues in achieving the high performance that digital signal processing (DSP) and multimedia applications demand. This article proposes a strategy to automate the design process which considers all possible optimizations that can be carried out at compilation time, regarding context and data transfers. This strategy is general in nature and could be applied to different reconfigurable systems. We also discuss the key aspects of the scheduling problem in a reconfigurable architecture such as MorphoSys. In particular, we focus on a task scheduling methodology for DSP and multimedia applications, as well as the context management and scheduling optimizations.
14.
This paper describes a new specialized Reconfigurable Cryptographic for Block ciphers Architecture (RCBA). Application-specific computation pipelines can be configured in RCBA according to the characteristics of block cipher processing, which delivers high performance for cryptographic applications. RCBA adopts a coarse-grained reconfigurable architecture that mixes an appropriate amount of static configuration with dynamic configuration. RCBA has been implemented on an Altera FPGA, and representative block cipher algorithms such as DES, Rijndael, and RC6 have been mapped onto the RCBA architecture successfully. Analysis of system performance shows that the RCBA architecture achieves greater flexibility and efficiency than other implementations.
15.
Multiplexer model for RTL satisfiability using MILP  Cited by: 1 (self-citations: 0, citations by others: 1)
Navarro H. Montiel-Nelson J.A. Sosa J. Garcia J.C. Fay D.Q.M. 《Electronics letters》2004,40(7):417-418
New approaches to the satisfiability problem (SAT) for register transfer level (RTL) designs combine arithmetic blocks with Boolean logic to form a mixed integer linear program (MILP). Two-to-one multiplexers with word-level inputs can be decomposed to logic gates, but it is more efficient to describe them in MILP constraints as arithmetic operators. Larger multiplexers are built using a multilevel selection tree. However, such an approach should be improved to optimise the overall efficiency in solving the SAT problem. Proposed is a new MILP model for multiplexers. Experimental results indicate a 50% decrease in the number of constraints and a reduction in MILP complexity from Ω(N^2.4) to Ω(N^1.7), measured in CPU time.
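For readers unfamiliar with word-level multiplexer constraints, a textbook big-M encoding of a two-to-one multiplexer is sketched below. It is not necessarily the model proposed in the letter, which the abstract does not give. The symbols a, b, y (n-bit unsigned words), s (select bit), and the polarity "y = a when s = 1" are assumptions.

```latex
% Assumed: a, b, y are n-bit unsigned words, s is the select bit,
% and the multiplexer outputs y = a when s = 1 and y = b when s = 0.
\begin{align}
  y - a &\le M\,(1 - s), & a - y &\le M\,(1 - s), \\
  y - b &\le M\,s,       & b - y &\le M\,s, \\
  0 \le a,\, b,\, y &\le M, & s &\in \{0, 1\}, \qquad M = 2^{n} - 1 .
\end{align}
```

With s = 1 the first pair forces y = a while the second pair is slack, because any two words differ by at most M; with s = 0 the roles swap. A selection tree of such multiplexers needs four inequalities per node, which gives a sense of why a more compact model is worth having.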
16.
A multiple grid technique for solving electromagnetic field problems using the transmission-line modeling (TLM) method is described. The ideal conversion conditions across the interface between fine and coarse mesh regions are described and the implications of making the approximations needed for a practical implementation are discussed. Simulations are presented showing the accuracy of the method and its benefits in terms of reduced storage and run-time.
17.
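The inherent parallelism of media processing algorithms has driven media processors toward computing-array architectures. After analyzing how algorithm mapping affects circuit execution, this paper combines computing-array design with algorithm mapping, proposes a pipelined mapping scheme for making effective use of the array, and analyzes the effect of this mapping method on system performance. On this basis, a pipeline model is extracted using the IDCT algorithm in H.264 as an example, and a coarse-grained reconfigurable array is designed based on that model. Experimental results show that the array has clear advantages in power consumption, speed, and device utilization, and has good practical value.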
18.
A series of FinFET-based non-volatile logic gates, with multiple logic functions defined by embedded non-volatile states, is proposed for the first time and demonstrated on an advanced CMOS technology platform. The device channels in the proposed CMOS logic gate are controlled by a metal floating gate coupled through slot contacts uniquely available in the FinFET process employed in this study. The new logic gate with non-volatile states enables reconfiguration of a Boolean computing unit at the gate level, aimed at adaptive and specialized systems in the AI era. Furthermore, extended applications in tunable ring oscillators for multi-functional IoT modules are successfully demonstrated in this study.
19.
Kenneth S. Schneider 《International Journal of Network Management》1997,7(2):94-102
In this article, application stories illustrate two rather uncommon network remodelings, one for a semiconductor fabrication plant and the other for a supermarket, in which recently developed hardware is applied to redesign and optimize the networks. © 1997 John Wiley & Sons, Ltd.
20.
Domain-specific coarse-grained reconfigurable architectures (CGRAs) hold great promise for energy-efficient, flexible designs for a suite of applications. Designing such a reconfigurable device for an application domain is very challenging because the needs of different applications must be carefully balanced to achieve the targeted design goals. It requires the evaluation of many potential architectural options to select an optimal solution. Exploring the design space manually would be very time consuming and may not even be feasible for very large designs. Even mapping one algorithm onto a customized architecture can take from minutes to hours. Running a full power simulation on a complete suite of benchmarks for various architectural options requires several days. Finding the optimal point in a design space could require a very long time. We have designed a framework/tool that makes such design space exploration (DSE) feasible. The resulting framework allows testing a family of algorithms and architectural options in minutes rather than days and allows rapid selection of architectural choices. In this paper, we describe our DSE framework for domain-specific reconfigurable computing, where the needs of the application domain drive the construction of the device architecture. The framework has been developed to automate design space case studies, allowing application developers to explore architectural tradeoffs efficiently and reach solutions quickly. We selected some of the core signal processing benchmarks from the MediaBench benchmark suite and some edge-detection benchmarks from the image processing domain for our case studies. We describe two search algorithms: a stepped search algorithm motivated by our manual design studies and a more traditional gradient-based optimization. Approximate energy models are developed in each case to guide the search toward a minimal energy solution.
We validate our search results by comparing the architectural solutions selected by our tool to an architecture optimized manually and by performing sensitivity tests to evaluate the ability of our algorithms to find good-quality minima in the design space. All selected fabric architectures were synthesized on a 130 nm cell-based ASIC fabrication process from IBM. These architectures consume almost the same amount of energy on average, but the gradient-based approach is more general and promises to extend well to new problem domains. We expect these or similar heuristics and the overall design flow of the system to be useful for a wide range of architectures, including mesh-based and other commonly used architectures for CGRAs.
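The abstract does not spell out the stepped search, so the following is only a generic one-parameter-at-a-time search over discrete architectural options, guided by an approximate energy model. The parameter names (`alus`, `mux`), the toy energy function, and `stepped_search` itself are hypothetical and stand in for the framework's real models.

```python
def stepped_search(options, energy, start):
    """Vary one parameter at a time, keep any value that lowers the modeled
    energy, and repeat until a full pass over all parameters brings no gain."""
    current = dict(start)
    best = energy(current)
    improved = True
    while improved:
        improved = False
        for name, values in options.items():
            for value in values:
                trial = {**current, name: value}
                e = energy(trial)
                if e < best:
                    current, best, improved = trial, e, True
    return current, best


def toy_energy(cfg):
    """Made-up approximate energy model with its minimum at alus=8, mux=4."""
    return (cfg["alus"] - 8) ** 2 + 2 * abs(cfg["mux"] - 4) + 5


options = {"alus": [2, 4, 8, 16], "mux": [2, 4, 8]}
config, modeled_energy = stepped_search(options, toy_energy, {"alus": 2, "mux": 8})
```

Each pass costs one model evaluation per option value, so the search stays cheap only as long as the energy model is much faster than a full power simulation; like any local search, it can stop at a local minimum.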