
1.

Optimizing Decoupling Capacitors in 3D Circuits for Power Grid Integrity

Pingqiang Zhou Sridharan K. Sapatnekar S.S. 《Design & Test of Computers, IEEE》2009,26(5):15-25

This article studies one of the EDA problems for 3D IC design. The article presents a design automation solution for power grid optimization in 3D ICs. The authors propose a congestion-aware 3D power supply network optimization algorithm, which applies a sequence-of-linear-programs-based method to optimize the power grid design. We explore the trade-offs between MIM decaps and traditional CMOS decaps in chip design, and we propose a congestion-aware 3D power supply network optimization algorithm to optimize this trade-off. One of the novel features of our work is that it optimizes the power supply network using both conventional CMOS decaps and metal-insulator-metal (MIM) decaps. However, because MIM decaps are built between layers of metal interconnects, they present routing blockages to nets that attempt to cross them, and therein lies the trade-off. The properties of MIM decaps make them attractive for both 2D and 3D chips, but we pay particular attention to the 3D decap problem in this article because, first, the power integrity problem is particularly critical in 3D and requires novel approaches that leverage advances in materials, and second, the added complexity of handling routing blockages in a constrained environment makes the 3D problem especially challenging.

2.

Wirelength-driven fast placement algorithm for island-style FPGAs Total citations: 3, self-citations: 0, citations by others: 3

隋文涛董社勤边计年《计算机辅助设计与图形学学报》2009,21(9)

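Traditional FPGA placement algorithms are time-consuming, which reduces the efficiency of FPGA physical design. To shorten placement time while preserving placement quality, a fast placement algorithm for island-style FPGAs is proposed. It first performs iterative bipartitioning with terminal propagation, then builds an initial placement by minimum-cost flow and refines it with low-temperature simulated annealing. At each partitioning level, the effect of net terminals on net weights is taken into account; for each partitioned region, minimum-cost flow determines the initial placement; in the final stage, low-temperature simulated annealing improves the quality of the initial placement. Experimental results show that the algorithm produces high-quality placements quickly.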
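The final refinement stage described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the half-perimeter wirelength (HPWL) cost, the pairwise swap move, the cooling schedule, and the names `hpwl` and `anneal_low_temp` are all assumptions, and the partitioning and minimum-cost-flow stages are not modeled.

```python
import math
import random

def hpwl(placement, nets):
    """Total half-perimeter wirelength; placement maps cell -> (x, y)."""
    total = 0
    for net in nets:
        xs = [placement[c][0] for c in net]
        ys = [placement[c][1] for c in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

def anneal_low_temp(placement, nets, t_start=0.5, t_end=0.01, alpha=0.9,
                    moves_per_temp=200, seed=0):
    """Refine an existing placement by swapping cell pairs at low temperature.

    Starting cold (assumed t_start=0.5) means uphill moves are rarely
    accepted, so the search stays close to the initial placement.
    """
    rng = random.Random(seed)
    current = dict(placement)
    cost = hpwl(current, nets)
    best, best_cost = dict(current), cost
    cells = sorted(current)
    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            a, b = rng.sample(cells, 2)
            current[a], current[b] = current[b], current[a]
            new_cost = hpwl(current, nets)
            delta = new_cost - cost
            # Metropolis criterion: always accept improvements,
            # accept a worse placement with probability exp(-delta / t).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cost = new_cost
                if cost < best_cost:
                    best, best_cost = dict(current), cost
            else:
                current[a], current[b] = current[b], current[a]  # undo swap
        t *= alpha
    return best, best_cost
```

Because the best placement seen so far is tracked separately, the returned wirelength is never worse than that of the initial placement passed in.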

3.

Lightweight implementation of SILC, CLOC, AES-JAMBU and COLM authenticated ciphers

《Microprocessors and Microsystems》2020

Authenticated encryption schemes provide both confidentiality and integrity services simultaneously. The CAESAR competition will identify a portfolio of authenticated ciphers that is expected to be suitable for widespread adoption and to offer advantages over AES-GCM. An important criterion for selecting the final candidates, besides security, is hardware performance in resource-limited environments. In this paper, the SILC, CLOC, AES-JAMBU, and COLM authenticated ciphers have been selected from the third round of the CAESAR competition for hardware evaluation. The main reasons for choosing these schemes are their lightweight design, sufficient security level, and use of the AES algorithm as their underlying block cipher. To the best of our knowledge, this is the first time that an 8-bit lightweight architecture compatible with API v2 has been presented for the selected schemes. To implement AES, Atomic-AES v2, one of the smallest implementations, has been adapted to the requirements of the selected schemes. Furthermore, to reduce the area of the hardware implementation, several techniques are used, including implementing one AES core in the datapath, sharing registers to store intermediate values, implementing the tweak functions by shuffling wires, and implementing doubling in GF(2¹²⁸) with an 8-bit architecture to construct the higher-order multipliers. The implementation results are presented on ASIC and FPGA platforms. The proposed architecture for each scheme is similar on the two platforms, but different optimization techniques are used for each platform; for example, the AES S-box is ROM-based on FPGA and logic-based on ASIC. Comparing the results with 128-bit implementations shows that the area on FPGA and ASIC is reduced by up to 65% and 88%, respectively. The results of the current study demonstrate that AES-JAMBU has the lowest hardware area and the highest throughput and performance on both platforms. In addition, CLOC has the highest area reduction on both platforms compared with the 128-bit implementations.
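The byte-serial doubling mentioned in this abstract can be illustrated in software. The sketch below is not the paper's hardware design: it assumes a big-endian 16-byte block and the reduction polynomial x^128 + x^7 + x^2 + x + 1 (constant 0x87), a convention commonly used for GF(2^128) doubling in AES-based modes but not stated in the abstract. The name `gf128_double` is hypothetical. The loop handles one byte per step, mirroring an 8-bit datapath.

```python
def gf128_double(block: bytes) -> bytes:
    """Multiply a 16-byte big-endian block by x in GF(2^128), byte by byte."""
    if len(block) != 16:
        raise ValueError("block must be 16 bytes")
    out = bytearray(16)
    carry = 0
    # Walk from the least significant byte (index 15) to the most
    # significant byte (index 0), passing one carry bit between bytes.
    for i in range(15, -1, -1):
        shifted = (block[i] << 1) | carry
        out[i] = shifted & 0xFF
        carry = shifted >> 8
    # A bit shifted out of the top is reduced with the assumed
    # polynomial x^128 + x^7 + x^2 + x + 1, i.e. XOR 0x87 into the low byte.
    if carry:
        out[15] ^= 0x87
    return bytes(out)
```

Higher multiples can be built from this primitive; for instance, multiplication by 3 is the XOR of `gf128_double(b)` with `b` itself.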

4.

A performance-driven redesign algorithm for LUT-based FPGAs

张万鹏童家榕唐璞山《计算机应用与软件》2001,18(3):64-68

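This paper proposes a performance-driven redesign algorithm for LUT-based FPGAs. The algorithm redesigns the circuit using characteristic functions together with corresponding constraints on the original Boolean network. Because the network topology is left unchanged, there is no need to reconsider the circuit's delay or its placement and routing results during redesign.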

5.

ProNoC: A low latency network-on-chip based many-core system-on-chip prototyping platform

《Microprocessors and Microsystems》2017

Network-on-chip (NoC) is an emerging interconnect infrastructure that addresses the scalability limitation of conventional shared-bus architectures for many-core systems-on-chip (MCSoCs). Current field-programmable gate arrays (FPGAs) have over a million lookup tables, making it possible to prototype a complete NoC-based MCSoC on a single FPGA device. FPGA prototyping allows rapid system verification and estimation of optimum design parameters. However, existing NoC-based MCSoC prototypes usually adopt simple NoC architectural functionality. These NoC prototypes cannot give a realistic projection of state-of-the-art application-specific integrated circuit (ASIC) NoCs because they have limited overall system performance. This paper presents ProNoC, an integrated tool for rapid prototyping and validation of NoC-based MCSoC projects targeting FPGA devices. ProNoC adopts most advanced NoC features, such as support for virtual channels (VCs), virtual networks, low-latency routing, and different routing algorithms. Results show that the NoC interconnect in ProNoC outperforms CONNECT, the most recent VC-based prototype NoC, with lower logic cell utilization, higher maximum operating frequency, higher average saturation throughput, and lower average communication latency. Moreover, ProNoC is equipped with a graphical user interface to facilitate the development of MCSoC prototypes on FPGA platforms.

6.

ASIC design of a DDS signal source for instruments, implemented on an FPGA

包本刚朱湘萍《单片机与嵌入式系统应用》2017,(11):60-63

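An ASIC can be implemented using full-custom or semi-custom methods, and using an FPGA for the design-for-testability of the ASIC can greatly reduce ASIC design cost. This paper presents an FPGA-based ASIC design scheme for a direct digital synthesis (DDS) signal source, which can flexibly output arbitrary waveforms and allows the frequency and phase of the waveform to be changed conveniently. The scheme can be embedded in instruments built around FPGA chips; it is structurally simple, functionally capable, and cost-effective, and with minor changes it can be applied to many instrumentation systems, giving it good portability.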
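How a DDS source changes frequency and phase can be shown with a small software model of the usual phase-accumulator structure. This is a sketch under assumptions, not the design from the paper: the 32-bit accumulator, 256-entry table, 10-bit amplitude, and all function names are hypothetical, and the table length is assumed to be a power of two. The output frequency follows f_out = FTW × f_clk / 2^N for an N-bit accumulator and frequency tuning word FTW.

```python
import math

def make_sine_table(addr_bits=8, amp_bits=10):
    """Build one period of a sine wave; any waveform could be stored instead."""
    size = 1 << addr_bits
    peak = (1 << (amp_bits - 1)) - 1
    return [round(peak * math.sin(2 * math.pi * i / size)) for i in range(size)]

def tuning_word(f_out, f_clk, acc_bits=32):
    """Frequency tuning word giving approximately f_out at clock rate f_clk."""
    return round(f_out * (1 << acc_bits) / f_clk)

def dds_samples(table, ftw, n, phase_offset=0, acc_bits=32):
    """Generate n output samples from a phase accumulator and a lookup table."""
    addr_bits = (len(table) - 1).bit_length()  # assumes power-of-two length
    mask = (1 << acc_bits) - 1
    acc = 0
    out = []
    for _ in range(n):
        # The phase offset shifts the waveform without disturbing the accumulator.
        phase = (acc + phase_offset) & mask
        # The top address bits of the phase select the table entry.
        out.append(table[phase >> (acc_bits - addr_bits)])
        acc = (acc + ftw) & mask  # wraps around once per output period
    return out
```

Changing `ftw` changes the output frequency, changing `phase_offset` shifts the phase, and replacing the table contents changes the waveform shape, which matches the flexibility the abstract describes.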

7.

Hardware implementation of dynamic fuzzy logic based routing in Network-on-Chip

《Microprocessors and Microsystems》2017

This paper presents the hardware implementation of a generic fuzzy logic-based adaptive routing scheme for both buffered and bufferless Networks-on-Chip (NoCs). The routing scheme considers the dynamic traffic load and power consumption on neighboring router links to select the output port of an incoming flit. Specifically, fuzzy logic control is used to build a simple, generic, and efficient nonlinear control law that dynamically calculates the input link cost. Basing the link cost on traffic load and power consumption rather than on empty buffer slots makes the proposed algorithm applicable to both buffered and bufferless NoCs. Hardware implementation in ASIC and FPGA technologies demonstrates that the hardware area overhead imposed by the fuzzy control logic ranges from minimal to negligible for practical flit sizes and scales excellently with network size. Furthermore, since the fuzzy control logic is not in the router critical path, it imposes no additional latency. Finally, we demonstrate the efficiency of the proposed routing scheme through simulation-based evaluation against representative conventional counterparts.

8.

Caching and optimized request routing in cloud-based content delivery systems

《Performance Evaluation》2014

Geographically distributed cloud platforms enable an attractive approach to large-scale content delivery. Storage at various sites can be dynamically acquired from (and released back to) the cloud provider so as to support content caching, according to the current demands for the content from the different geographic regions. When storage is sufficiently expensive that not all content should be cached at all sites, two issues must be addressed: how should requests for content be routed to the cloud provider sites, and what policy should be used for caching content using the elastic storage resources obtained from the cloud provider. Existing approaches are typically designed for non-elastic storage and little is known about the optimal policies when minimizing the delivery costs for distributed elastic storage. In this paper, we propose an approach in which elastic storage resources are exploited using a simple dynamic caching policy, while request routing is updated periodically according to the solution of an optimization model. Use of pull-based dynamic caching, rather than push-based placement, provides robustness to unpredicted changes in request rates. We show that this robustness is provided at low cost. Even with fixed request rates, use of the dynamic caching policy typically yields content delivery cost within 10% of that with the optimal static placement. We compare request routing according to our optimization model to simpler baseline routing policies, and find that the baseline policies can yield greatly increased delivery cost relative to optimized routing. Finally, we present a lower-cost approximate solution algorithm for our routing optimization problem that yields content delivery cost within 2.5% of the optimal solution.

9.

Combining temporal partitioning and temporal placement techniques for communication cost improvement

Bouaoui Ouni Ramzi Ayadi Abdellatif Mtibaa 《Advances in Engineering Software》2011,42(7):444-451

In this paper, we present a typical temporal partitioning methodology that temporally partitions a data flow graph on a reconfigurable system. Our approach optimizes the communication cost of the design. This aim is reached by minimizing the data transfer required between design partitions and the routing cost between FPGA modules. Consequently, our algorithm is composed of two main steps. The first step finds a temporal partitioning of the graph that is optimal in terms of communication cost. Next, our approach builds the architecture, on a partially reconfigurable FPGA, that gives the lowest routing cost between modules. The proposed methodology was tested on several examples on the Xilinx Virtex-II Pro. The results show a significant reduction in communication cost compared with other well-known approaches used in this field.

10.

FPGAs and their use in ASIC development


王东霞孙利民窦文华《计算机工程与科学》1995,17(1):55-61

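Application-specific integrated circuits (ASICs) are developing rapidly, and their use is regarded as one of the important hallmarks of advanced electronic equipment design. Field-programmable gate arrays (FPGAs), with their outstanding performance, have become the main devices used for ASIC design. This paper introduces the internal structure and working principle of FPGAs, with emphasis on the process of developing ASICs with them.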

11.

An integrated environment for technology closure of deep-submicron IC designs

Trevillyan L. Kung D. Puri R. Reddy L.N. Kazda M.A. 《Design & Test of Computers, IEEE》2004,21(1):14-22

With larger chip images and increasingly aggressive technologies, key design processes must interoperate. PDS, a physical-synthesis system, accomplishes technology closure through interacting processes of logic optimization, placement, timing, clock insertion, and routing, all using a common infrastructure with robust variable-accuracy analysis abstractions.

12.

Improved design of a reconfigurable computing hardware platform Total citations: 2, self-citations: 2, citations by others: 0


王晟中陈伟男彭澄廉《计算机工程》2010,36(5):250-252

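To address the long configuration times and limited flexibility of existing reconfigurable computing hardware platforms, an improved design is proposed. It is based on a Virtex-4 field-programmable gate array (FPGA) chip that supports two-dimensional reconfigurable regions, which makes the placement of reconfigurable modules more flexible and raises chip area utilization. Integrating a single FPGA and its peripherals on one printed circuit board makes the system structure more compact, and the microprocessor embedded in the FPGA reduces communication and memory-access overhead. Debugging results show that the improved platform is more flexible and offers stronger functionality and scalability.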

13.

A new hardware design method: structured ASIC technology

王国章刘战须自明于宗光《微计算机信息》2006,(1Z):153-155

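This paper introduces a new hardware design method, structured ASIC technology, and its representative design flow; compares standard-cell ASIC, FPGA, and structured ASIC technologies; and finally gives an application block diagram of a structured ASIC.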

14.

Force-directed 3D FPGA placement algorithm

隋文涛董社勤边计年《计算机辅助设计与图形学学报》2011,23(10)

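The complexity of the 3D FPGA placement problem grows exponentially compared with the 2D case, so placement algorithms are time-consuming, which reduces the efficiency of FPGA physical design. To shorten placement time while preserving placement quality, a force-directed 3D FPGA placement algorithm with wirelength as the optimization objective, 3D-WFP, is proposed. The algorithm consists of three stages: global placement, coordinate legalization and layer partitioning, and placement optimization. The force-directed algorithm quickly produces a global placement, which provides more accurate logic-cell position and delay information for the two subsequent stages. A 3D space-filling curve is proposed, and, using the position and delay information, logic cells are legalized and partitioned into layers in order along this curve. The solution space of the low-temperature simulated annealing used for placement optimization is revised, which greatly speeds up its convergence. Compared with existing 3D FPGA placement algorithms, 3D-WFP effectively shortens the total wirelength of the final placement, by up to 7.38%, while maintaining runtime and delay performance.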

15.

Branch pipe routing based on 3D connection graph and concurrent ant colony optimization algorithm Total citations: 1, self-citations: 0, citations by others: 1

Yanfeng Qu Dan Jiang Qingyan Yang 《Journal of Intelligent Manufacturing》2018,29(7):1647-1657

Pipe routing, in particular for branch pipes with multiple terminals, has an important influence on product performance and reliability. This paper develops a new rectilinear branch pipe routing approach for automatic generation of optimal rectilinear branch pipe routes in constrained spaces. Firstly, this paper presents a new 3D connection graph, which is constructed by extending a new 2D connection graph. The new 2D connection graph is constructed according to five criteria in discrete Manhattan spaces. The 3D connection graph can model the 3D constrained layout space efficiently. The length of pipelines and the number of bends are modeled as the optimization objective, considering the number of branch points and three types of engineering constraints. The three types of engineering constraints are modeled by this 3D graph and potential value. Secondly, a new concurrent Max–Min Ant System optimization algorithm, which adopts a concurrent search strategy and a dynamic update mechanism, is used to solve the rectilinear branch pipe routing optimization problem. This algorithm can improve the search efficiency in the 3D constrained layout space. Numerical comparisons with other current approaches in the literature demonstrate the efficiency and effectiveness of the proposed approach. Finally, a case study of pipe routing for aero-engines is conducted to validate this approach.

16.

Remote classroom system for Chinese linguistics teaching based on FPGA and embedded system

《Microprocessors and Microsystems》2021

Online intelligent remote classrooms subject to Chinese Linguistics and FPGA (field-programmable gate array) procedures are used to handle complex blends of teaching and work estimations. FPGA designs are regularly realized as ASICs (application-specific integrated circuits) because of the high model thickness and execution essentials. Embedded designs regularly do not achieve high efficiency and low-power application. The FPGA online course replication appropriately gives the Fault Distributed Algorithm and a hardware stage for organizing and evaluating new features before silicon is set into usage. Chinese Linguistics causes the network interface to grow faster and shortens the ASIC design cycle. The Fault Distributed Algorithm is the improved forefront FPGA, introducing another maximize card that reinforces the latest ASIC and FPGA plans that are a bit of checking trigger. The delivered output is taken care of in a model group and set aside in the data line that makes up an amount model pack. Considering the produced test set, the proposed system discovers each level's progression closeness and, ultimately, the overall likeness.

17.

Emulation of an ASIC power and temperature monitoring system (eTPMon) for FPGA prototyping

《Microprocessors and Microsystems》2017

Hardware monitoring information can be used during system runtime to increase system lifetime and reliability. Examples of such monitoring information are power, temperature, and the aging status of processors. They provide the system with relevant information about the current hardware health. Such information is especially crucial in resource-aware computing concepts that introduce self-organizing behavior to deal with large MPSoCs (Multi-Processor Systems-on-Chip): for resource-aware computing, resources are allocated according to the current requirements. To find suitable resource-application pairs and achieve system targets like optimizing the utilization, current hardware status must be considered during resource allocation. To evaluate and optimize resource allocation strategies during the design phase, FPGA prototyping is often required before implementation in ASIC. The evolution of power, temperature, and aging differs between the ASIC implementation and the FPGA prototype. The FPGA prototype should react to sensor data characterized from the target ASIC design instead of the FPGA's hardware status. This paper describes the design of an emulated ASIC Temperature and Power Monitoring system (eTPMon) for FPGA-based prototyping. The emulation approach for power monitors is based on an instruction-level energy model. For emulating temperature monitors, a thermal RC model is used. eTPMon can supply MPSoC prototypes with the hardware status information (power and temperature of the cores) needed for efficient load distribution, achieving resource-aware computing targets. Based on the eTPMon data, different operating strategies and control targets were evaluated for a 2-tile resource-aware MPSoC system. Values provided by eTPMon are usable for extracting information about the aging of processors, which can be used for increasing the system lifetime.
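The two emulation models named in this abstract can be sketched in a few lines. This is an illustration under assumptions rather than the eTPMon design: it takes a single-node, first-order thermal RC model, C·dT/dt = P − (T − T_amb)/R, integrated with forward Euler, and an energy-per-instruction table. All parameter values, instruction classes, and function names are hypothetical.

```python
def power_from_instructions(counts, energy_nj, interval_s):
    """Average power (W) from per-instruction-class counts in one interval.

    counts and energy_nj map an instruction class to its count and to an
    assumed energy per instruction in nanojoules.
    """
    energy_j = sum(counts[k] * energy_nj[k] for k in counts) * 1e-9
    return energy_j / interval_s

def thermal_rc_step(temp_c, power_w, dt_s, r_th=2.0, c_th=0.5, t_amb_c=25.0):
    """Advance the first-order thermal RC model by one forward-Euler step.

    r_th (K/W), c_th (J/K) and t_amb_c are placeholder values.
    """
    d_temp = (power_w - (temp_c - t_amb_c) / r_th) / c_th
    return temp_c + dt_s * d_temp

def simulate(power_trace_w, dt_s=0.01, r_th=2.0, c_th=0.5, t_amb_c=25.0):
    """Return the temperature after each power sample, starting at ambient."""
    temp = t_amb_c
    temps = []
    for p in power_trace_w:
        temp = thermal_rc_step(temp, p, dt_s, r_th, c_th, t_amb_c)
        temps.append(temp)
    return temps
```

With constant power P, the model settles at T_amb + P·R with time constant R·C, so the step size should be kept well below R·C for the Euler update to stay stable.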

18.

T-LEACH 3D dynamic routing protocol based on node-density weighting Total citations: 1, self-citations: 0, citations by others: 1

余敏李雅晴张琦唐瑞《传感技术学报》2016,29(2):278-284

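With the growing demand for wireless sensor network applications in 3D dynamic environments, designing energy-efficient routing protocols with highly reliable data transmission for 3D networks with dynamic topology is a current research focus in academia. Existing 3D routing protocols do not fully consider the effects of node mobility and environmental factors, and are mostly supplements to and extensions of 2D static protocols. Combining topology-based and geography-based routing protocols, this paper proposes a node-density-weighted T-LEACH 3D dynamic routing protocol. It uses the T-LEACH algorithm to obtain global node-density information, the DDRS algorithm to predict routing voids, and the DDRS-R recovery algorithm to escape voids and correct routes, thereby avoiding local optima and escaping routing voids quickly, which improves overall network robustness and lifetime. Experiments and comparisons on a 3D-environment wireless sensor network routing protocol simulation platform designed by the authors' research group show that the proposed protocol outperforms existing protocols in energy consumption, network lifetime, and data delivery ratio, and has good application prospects.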

19.

Computer-aided prototyping for ASIC-based systems Total citations: 1, self-citations: 0, citations by others: 1

Walters S. 《Design & Test of Computers, IEEE》1991,8(2):4-10

The use of computer-aided prototyping (CAP) with the RPM Emulation System is described. RPM creates a hardware functional prototype from an ASIC or full-custom chip netlist. It reads the chip netlist and then converts the chip design gates into a prototype design. It then synthesizes the prototype design, obtaining the information it needs to configure the reprogrammable hardware, primarily with partitioning and placement and routing technology. Finally, it physically implements the prototype design by electronically configuring the reprogrammable hardware. RPM includes embedded tools for interactive debugging with access to any internal design node, and a facility for handling quick incremental changes to the design. It is argued that other techniques such as silicon prototyping and manual prototyping are not practical; silicon has a poor debugging ability, and manual prototyping cannot handle large designs. The practical benefits of CAP are discussed.

20.

A high-throughput and high-capacity IPv6 routing lookup system

Yi-Mao Hsiao Yuan-Sun Chu Jeng-Farn Lee Jinn-Shyan Wang 《Computer Networks》2013,57(3):782-794

With the growing number of routing entries, IP routing lookup has become the major performance bottleneck in backbone routers. In this paper, a complete hardware-based routing lookup system is proposed to achieve high throughput and high capacity for IPv6. The proposed system is a cache-centric, hash-based architecture that contains a routing lookup application-specific integrated circuit (ASIC) and a memory set. A hash function is used to reduce lookup time for the routing table, and ternary content addressable memory (TCAM) effectively resolves the collision problem. The gate count of the ASIC, excluding the binary content addressable memory (BCAM), is about 5306 gates, using an in-house 0.18 μm CMOS single-poly six-metal standard cell library. The results of post-layout simulations show that the ASIC operates in 3.6 ns so that the routing lookup system approaches 260 Mega lookups per second (Mlps), which is sufficient for 100 Gbps networks. The memory density is good, with each routing entry requiring only 64 bits. Moreover, the routing table only needs 10.24 KB on-chip BCAM, 20.04 KB off-chip TCAM and 29.29 MB DRAM for 3.6 M routing entries in the proposed system.
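The claim that about 260 Mlps suffices for 100 Gbps can be checked with a back-of-the-envelope calculation. The sketch below rests on assumptions not stated in the abstract: one lookup per packet, and a worst case of minimum-size 64-byte Ethernet frames plus 20 bytes of preamble and inter-frame gap per frame. The function names are hypothetical.

```python
def peak_lookup_rate_mlps(cycle_ns):
    """Upper bound on lookups per second (in millions) at one lookup per cycle."""
    return 1e3 / cycle_ns

def worst_case_packet_rate_mpps(link_gbps, frame_bytes=64, overhead_bytes=20):
    """Packets per second (in millions) when a link carries only minimum-size frames."""
    bits_per_packet = (frame_bytes + overhead_bytes) * 8
    return link_gbps * 1e3 / bits_per_packet

cycle_bound = peak_lookup_rate_mlps(3.6)          # bound implied by the 3.6 ns cycle
required = worst_case_packet_rate_mpps(100.0)     # demand of a 100 Gbps link
reported = 260.0                                  # Mlps figure from the abstract
headroom = reported / required
```

Under these assumptions a 100 Gbps link needs about 148.8 million lookups per second, and the 3.6 ns cycle bounds the rate at about 277.8 Mlps, so the reported 260 Mlps sits between the two with a margin of roughly 1.7 times the worst-case demand.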