Similar articles
 Found 20 similar articles; search took 31 ms
1.
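A physical design case study of a system-on-chip in deep-submicron technology   Total citations: 2, self-citations: 0, citations by others: 2
曾宏, 曾璇, 闵昊. 《微电子学》 (Microelectronics), 2005, 35(6): 634-638
Physical design of deep-submicron chips faces many challenges, especially for very-large-scale SoCs: interconnect delay, signal integrity (SI), IR drop and electromigration (EM), third-party IP integration, and so on. Addressing these problems requires new methods in the back-end design flow. Taking the physical design of a 2-million-gate wireless data transmission chip in a 0.18 μm process as an example, the article describes the key design steps and some solutions to the problems encountered, which may serve as a reference for other similar designs.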

2.
Static timing analysis is a key step in the physical design optimization of VLSI designs. The lumped capacitance model for gate delay and the Elmore model for wire delay have been shown to be inadequate for wire-dominated designs. Using the effective capacitance model for the gate delay calculation and model-order reduction techniques for wire delay calculation is prohibitively expensive. In this paper, we present sufficiently accurate and highly efficient filtering algorithms for interconnect timing as well as gate timing analysis. The key idea is to partition the circuit into low and high complexity circuits, whereby low complexity circuits are handled with efficient algorithms, such as the total capacitance algorithm for gate delay and the Elmore metric for wire delay, and high complexity circuits are handled with sign-off algorithms. Experimental results on microprocessor designs show accuracies that are quite comparable with sign-off delay calculators, with a reduction of more than 65% in computation time.
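The Elmore metric mentioned in this abstract can be illustrated with a small sketch. The code below is not taken from the paper; it assumes the simplest case, a single unbranched wire modeled as an RC ladder, and the function name `elmore_delay` and the segment values are hypothetical.

```python
def elmore_delay(resistances, capacitances):
    """Elmore delay at the far end of an unbranched RC ladder.

    resistances[i] is the series resistance of segment i and
    capacitances[i] is the capacitance to ground at the node that
    follows segment i (an assumed, simplified wire model).
    """
    if len(resistances) != len(capacitances):
        raise ValueError("need one capacitance per resistance segment")
    delay = 0.0
    upstream_resistance = 0.0
    for r, c in zip(resistances, capacitances):
        # Each node capacitance is charged through all resistance
        # between the driver and that node.
        upstream_resistance += r
        delay += upstream_resistance * c
    return delay


# Hypothetical wire with total R = 1 and total C = 1, split into two
# equal segments.
two_segment_delay = elmore_delay([0.5, 0.5], [0.5, 0.5])
```

For a uniform wire with total resistance R and capacitance C split into N equal segments, this sum evaluates to RC(N+1)/(2N), which approaches the familiar RC/2 as N grows.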

3.
A high performance communication architecture, SAMBA-bus, is proposed in this paper. In the SAMBA-bus architecture, multiple compatible bus transactions can be performed simultaneously with only a single bus access grant from the bus arbiter. Experimental results show that, compared with a traditional bus architecture, the SAMBA-bus architecture can achieve up to a 3.5-fold improvement in the effective bandwidth and up to a 15-fold reduction in the average communication latency. In addition, the performance of the SAMBA-bus architecture is affected only slightly by arbitration latency, because bus transactions can be performed without waiting for the bus access grant from the arbiter. This feature is desirable in SoC designs with large numbers of modules and long communication delay between modules and the bus arbiter.

4.
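An interconnect optimization algorithm with buffer insertion based on an accurate delay model   Total citations: 2, self-citations: 0, citations by others: 2
As the integration density of VLSI circuits increases and feature sizes keep shrinking, the parasitic effects of wires can no longer be ignored; interconnect delay accounts for a large share of total circuit delay and has become the main factor determining circuit performance. Among interconnect delay optimization techniques, buffer insertion is the most effective way to reduce wire delay. This paper proposes a polynomial-time algorithm that, under an accurate delay model and given a set of feasible buffer insertion locations in the routing region, optimizes the topology of two-pin nets while simultaneously inserting buffers to optimize delay. The algorithm can not only minimize delay but also minimize the number of inserted buffers subject to a delay constraint, thereby avoiding unnecessary area and power overhead.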

5.
Two-Hop Polling: An Access Scheme for Clustered, Multihop Ad hoc Networks   Total citations: 1, self-citations: 0, citations by others: 1
In multiple channel environments, clustering provides a convenient framework for channel access and bandwidth allocation. Many clustering schemes, however, demand that terminals may communicate directly only if they share a common clusterhead. This requirement deactivates otherwise helpful links: those between nodes that belong to different clusters (intercluster links). Links between nodes that belong to different clusters constitute a distributed gateway. In this paper, we evaluate the importance of distributed gateways for two different clustering schemes and propose a novel access scheme for clustered environments using the link-cluster architecture, called two-hop polling (2HP). Two-hop polling manages to utilize intercluster links, leading to better connectivity and throughput.

6.

With advancements in technology, size and speed have become important facets of VLSI interconnects. As technology moves to the deep-submicron level, the channel length of the device shrinks to tens of nanometers, which leads to the requirement of long interconnects in VLSI chips. Interconnects are basic building blocks that vary in size. They provide connections between two or more blocks and pose scaling problems that IC designers face. As scaling increases, the impact of interconnects on VLSI circuits becomes even more important, since it controls all the important electrical characteristics on the chip. With technology scaling, interconnects not only move closer to each other but also change in dimensions, which directly affects circuit parameters. Certain RC models have already been defined to control these parameters; in this paper, the authors propose a new, improved Elmore delay estimation (RC) model to reduce delay and power consumption in interconnect circuits. An optimized Elmore delay calculation was performed for uniform and non-uniform wires to reduce the time constant of the interconnect circuits. The proposed model is then estimated and verified theoretically. The new improved RC model is compared with the designed π-model and shows remarkable results. We also observed a linear relationship between power consumption and delay for both RC models. In the π-model, the power first increases and then decreases as the wire length decreases, whereas in the proposed model the power first increases, then remains constant, and then increases further as the wire length increases. With a uniform distribution, the proposed model achieves an average improvement of 75.167% in power and 74.714% in delay.

7.
Advanced digital receiver principles and technologies for PCS   Total citations: 1, self-citations: 0, citations by others: 1
The synergy between digital radio communications and VLSI signal processing is revolutionizing the design of wireless terminals. Driving this synergy are certain fundamental paradigms in modern communication theory, digital signal processing, and VLSI design. The authors discuss the modern centers-of-gravity model, which they believe is emerging as the basis for the successful design and implementation of advanced digital communication systems. Central to this model are design principles that enable engineers to systematically derive digital receiver structures and explore algorithm and architecture trade-offs using sophisticated tools. Digital signal processing technology is critical to implementing these digital receiver structures efficiently. Finally, CAD tools for digital communications system design and design space exploration are shown to be of crucial importance in the efficient execution of these designs.

8.
As technology scales, more sophisticated fabrication processes cause variations in many different parameters in the device. These variations could severely affect the performance of processors by making the latency of circuits less predictable and thus requiring conservative design approaches. In this paper, we use Monte Carlo simulations in addition to worst-case circuit analysis to establish the overall delay due to process variations in a data cache sub-system under both typical and worst-case conditions. The distribution of the cache critical-path delay in the typical scenario was determined by performing Monte Carlo simulations at different supply voltages, threshold voltages, and transistor lengths on a complete cache design. In addition to establishing the delay variation, we present an adaptive variable-cycle-latency cache architecture that mitigates the impact of process variations on access latency by closely following the typical latency behavior rather than assuming a conservative worst-case design point. Simulation results show that our adaptive data cache can achieve a 9% to 31% performance improvement in a superscalar processor, on the SPEC2000 applications studied, compared to a conventional design. The area overhead for the additional circuits of the adaptive technique is less than 1% of the total cache area. Additional performance improvement potential exists in processors where the data cache access is on the critical path, by allowing a more aggressive clock rate.
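The contrast between a typical-case delay distribution and a worst-case corner can be sketched as follows. This is only an illustration and not the authors' circuit-level simulation: the alpha-power-law delay formula, the normalized parameter means and standard deviations, and the names `gate_delay` and `monte_carlo_delays` are all assumptions introduced here.

```python
import random
import statistics


def gate_delay(vdd, vth, length, alpha=1.3):
    """Normalized delay from an assumed alpha-power-law model."""
    return length * vdd / (vdd - vth) ** alpha


def monte_carlo_delays(samples, seed=0):
    """Sample supply voltage, threshold voltage and channel length
    from assumed normal distributions and return the delays."""
    rng = random.Random(seed)  # fixed seed keeps the sketch repeatable
    delays = []
    for _ in range(samples):
        vdd = rng.gauss(1.00, 0.03)     # hypothetical mean and sigma
        vth = rng.gauss(0.30, 0.02)     # hypothetical mean and sigma
        length = rng.gauss(1.00, 0.04)  # hypothetical mean and sigma
        delays.append(gate_delay(vdd, vth, length))
    return delays


delays = monte_carlo_delays(10_000)
typical_delay = statistics.median(delays)

# Worst-case corner: every parameter pushed three sigma in the slow
# direction at once (low supply, high threshold, long channel).
worst_case_delay = gate_delay(vdd=0.91, vth=0.36, length=1.12)
```

Under these assumed numbers the worst-case corner lies well above the median sampled delay, which is the gap an adaptive, typical-case design tries to recover.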

9.
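孟李林. 《半导体技术》 (Semiconductor Technology), 2008, 33(3): 190-192
Following the prediction of Moore's law, semiconductor integrated-circuit process technology continues to advance rapidly toward deep-submicron nodes, and large-scale IC design technology is a key problem to be solved along the way. Bus-based SoC design technology addressed the difficulties of large-scale IC design, but the use of on-chip buses brings problems such as poor scalability and low average communication efficiency. The network-on-chip (NoC), a new IC architecture proposed in recent years, transplants computer networking techniques into chip design and resolves the problems of SoC design technology at the architectural level. NoC will therefore become the mainstream design technology for the next generation of integrated circuits.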

10.
This paper presents a single-chip programmable platform that integrates most of the hardware blocks required in the design of embedded system chips. The platform includes a 32-bit multithreaded RISC processor (MT-RISC), configurable logic clusters (CLCs), programmable first-in-first-out (FIFO) memories, control circuitry, and on-chip memories. For rapid thread switching, a multithreaded processor equipped with a hardware thread scheduling unit is adopted, and configurable logic is grouped into clusters for IP-based design. By integrating both the multithreaded processor and the configurable logic on a single chip, high-level language-based designs can be easily accommodated by performing the complex and concurrent functions of a target chip on the multithreaded processor and implementing the external interface functions in the configurable logic clusters. A 64-mm² prototype chip integrating a four-threaded MT-RISC, three CLCs, programmable FIFOs, and 8-kB on-chip memories is fabricated in a 0.35-μm CMOS technology with four metal layers; it operates at a 100-MHz clock frequency and consumes 370 mW at a 3.3-V power supply.

11.
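As network technology develops, using network virtualization to overcome the ossification of the TCP/IP network architecture has become one of the mainstream directions for future networks. SDN (software defined networking), as an emerging network architecture, offers an effective solution for network virtualization. This paper first summarizes representative SDN network virtualization platforms and compares deploying virtual networks in SDN with deploying them in traditional network environments. Then, for the virtual network mapping problem in SDN network virtualization platforms, it proposes a latency-sensitive virtualization controller placement algorithm. Finally, experiments verify that the algorithm improves network resource utilization while keeping the communication latency between the controller and the underlying switches within an acceptable range.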

12.
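Optoelectronics and optical communication   Total citations: 1, self-citations: 0, citations by others: 1
The powerful and widely used silicon transistors in today's microelectronic chips are approaching their technological limits in terms of miniaturization, and new technical approaches are continually being explored. Optical microchips, and even optical computers, are among the most attractive new technologies under widespread study, and optical communication is a key technology driving humanity into a new era of communication.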

13.
As more transistors are integrated onto bigger die, an on‐chip multiprocessor will become a promising alternative to the superscalar microprocessor that dominates today's microprocessor marketplace. This paper describes key parts of a new on‐chip multiprocessor, called Raptor, which is composed of four 2‐way superscalar processor cores and one graphic co‐processor. To obtain performance characteristics of Raptor, a program‐driven simulator and its programming environment were developed. The simulation results showed that Raptor can exploit thread level parallelism effectively and offer a promising architecture for future on‐chip multiprocessor designs.

14.
Interconnection of components in a VLSI chip is becoming an increasingly complex problem. In this paper we examine the complexity of the wire routing process and discuss several new approaches to solving the problem using a parallel system architecture. The machines discussed range from compact systems for highly specialized applications to more general designs suited for broader applications. The process speedup due to parallelism and the cost advantage due to the use of large numbers of identical VLSI parts make these new machines practical today.

15.
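By analyzing the technical trend of trunked communication systems evolving along private-network and public-network directions, and taking public security dispatch requirements into account, this paper studies the architecture of a police trunking system based on 5G slicing, comprising an application layer, a service layer, a transport layer, a terminal layer, a standards and management framework, and a security assurance framework. In the network architecture, control signals are carried over an ultra-reliable and low latency communications (uRLLC) slice and service content over an enhanced mobile broadband (eMBB) slice. The paper also proposes the module composition of the trunking service software, covering communication dispatch service logic, integrated service adaptation, and maintenance and management software, and gives recommendations on key technical issues such as the cooperative algorithms it uses, latency guarantees, security and reliability, and scalability. Based on a multi-agent control model, a state synchronization and coordination algorithm between multiple access edge computing (MEC) servers is proposed, providing a foundation for the further development of police trunking systems under 5G.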

16.
We present a framework for the analysis of the decoding delay in multiview video coding (MVC). We show that in real-time applications, an accurate estimation of the decoding delay is essential to achieve a minimum communication latency. As opposed to single-view codecs, the complexity of the multiview prediction structure and the parallel decoding of several views requires a systematic analysis of this decoding delay, which we solve using graph theory and a model of the decoder hardware architecture. Our framework assumes a decoder implementation in general purpose multi-core processors with multi-threading capabilities. For this hardware model, we show that frame processing times depend on the computational load of the decoder and we provide an iterative algorithm to compute jointly frame processing times and decoding delay. Finally, we show that decoding delay analysis can be applied to design decoders with the objective of minimizing the communication latency of the MVC system.
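One way to picture the graph-theoretic side of this analysis is a dependency graph in which a frame can be decoded only after its reference frames. The sketch below is not the paper's algorithm: it assumes fixed per-frame processing times and unlimited parallel cores, whereas the abstract states that processing times depend on decoder load and are computed iteratively. The frame names and the function `completion_times` are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def completion_times(deps, proc_time, arrival):
    """Earliest decode-completion time of every frame.

    deps maps a frame to the frames it is predicted from,
    proc_time gives an assumed fixed decoding time per frame, and
    arrival gives the time at which each coded frame is received.
    """
    finish = {}
    # static_order() yields every frame after all of its references.
    for frame in TopologicalSorter(deps).static_order():
        ready = max([arrival[frame]] +
                    [finish[ref] for ref in deps.get(frame, ())])
        finish[frame] = ready + proc_time[frame]
    return finish


# Hypothetical two-view, two-frame prediction structure.
deps = {
    "v0_f0": [],
    "v1_f0": ["v0_f0"],           # inter-view prediction
    "v0_f1": ["v0_f0"],           # temporal prediction
    "v1_f1": ["v1_f0", "v0_f1"],  # both kinds of prediction
}
proc_time = {frame: 2 for frame in deps}
arrival = {frame: 0 for frame in deps}
finish = completion_times(deps, proc_time, arrival)
```

In this simplified setting the last frame's completion time is the length of the longest dependency chain, which is why the prediction structure directly bounds the achievable decoding delay.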

17.
This paper presents systematic techniques to find low-power high-performance superscalar processors tailored to specific user applications. The model of power is novel because it separates power into architectural and technology components. The architectural component is found via trace-driven simulation, which also produces performance estimates. An example technology model is presented that estimates the technology component, along with critical delay time and real estate usage. This model is based on case studies of actual designs. It is used to solve an important problem: decreasing power consumption in a superscalar processor without greatly impacting performance. Results are presented from runs using simulated annealing to reduce power consumption subject to performance reduction bounds. The major contributions of this paper are the separation of architectural and technology components of dynamic power, the use of trace-driven simulation for architectural power measurement, and the use of a near-optimal search to tailor a processor design to a benchmark.

18.
FABSYN: floorplan-aware bus architecture synthesis   Total citations: 1, self-citations: 0, citations by others: 1
As system-on-chip (SoC) designs become more complex, it is becoming harder to design communication architectures to handle the ever increasing volumes of inter-component communication. Manual traversal of the vast communication design space to synthesize a communication architecture that meets performance requirements becomes infeasible. In this paper, we address this problem by proposing an automated approach for floorplan-aware bus architecture synthesis (FABSYN) to synthesize cost-effective, bus-based communication architectures that satisfy the performance constraints in a design. Our synthesis approach incorporates a high-level floorplanning and wire delay estimation engine to evaluate the feasibility of the synthesized bus architecture and detect bus cycle time violations early in the design flow, at the system level. We present case studies of network communication SoC subsystems for which we synthesized bus architectures, detected and eliminated timing violations, and generated core placements in a matter of hours instead of several days for a manual effort.

19.
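A cluster generation algorithm for hierarchical network-on-chip architectures   Total citations: 3, self-citations: 1, citations by others: 2
王宏伟, 陆俊林, 佟冬, 程旭. 《电子学报》 (Acta Electronica Sinica), 2007, 35(5): 916-920
With advances in semiconductor processes and the growing complexity of embedded electronic products, issues such as throughput, power consumption, signal integrity, latency, and clock synchronization in system-on-chip interconnect structures have become more complex. Bus-based on-chip communication structures can no longer provide adequate communication capability, and communication structures centered on the network-on-chip have emerged. This paper proposes a cluster generation algorithm that partitions a hierarchical network-on-chip design into levels according to the implementation process and application requirements. Experiments show that the algorithm effectively distributes the internal communication of the system chip, improves system performance, and reduces hardware implementation cost, while meeting certain quality-of-service requirements.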

20.
With the de facto transformation of technology into nano-technology, more and more functional components can be embedded on a single silicon die, thus enabling highly pipelined operations such as those required for multimedia applications. In recent years, system-on-chip designs have migrated from fairly simple single processor and memory designs to relatively complicated systems with multiple processors, on-chip memories, standard peripherals, and other functional blocks. The communication between these IP blocks is becoming the dominant critical system path and performance bottleneck of system-on-chip designs. Network-on-chip architectures, such as the Virtual Channel (2004), Black-bus (2004), Pirate (2004), AEthereal (2005), and VICHAR (2006) architectures, emerged as promising solutions for future system-on-chip communication architecture designs. However, these existing architectures all suffer from certain problems, including high area cost and communication latency and/or low network throughput. This paper presents a novel network-on-chip architecture, Pipelining Multi-channel Central Caching, to address the shortcomings of the existing architectures. By embedding a central cache into every switch of the network, blocked head packets can be removed from the input buffers and stored in the caches temporarily, thus alleviating the effect of head-of-line and deadlock problems and achieving higher network throughput and lower communication latency without paying the price of higher area cost. Experimental results showed that the proposed architecture exhibits both hardware simplicity and system performance improvement compared to the existing network-on-chip architectures.
