首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
The advent of various real-time multimedia applications in high-speed networks creates a need for quality of service (QoS) based multicast routing. The Steiner tree problem, is a well-known NP-complete problem, provides the mathematical structure behind multicast communications. Two important QoS constraints are the bandwidth constraint and the end-to-end delay constraint. In this paper, we propose various algorithms to solve the bandwidth-delay-constrained least-cost multicast routing problem based on Tabu Search (TS), addressing issues of the selected initial solution and move type as two major building blocks in short-term memory version of Tabu Search and longer-term memory with associated intensification and diversification strategies as advanced Tabu Search techniques. We evaluate the performance and efficiency of the proposed TS-based algorithms in comparison with other existing TS-based algorithms and heuristics on a variety of random generated networks with regard to total tree cost. Finally we identify the most efficient algorithm uncovered by our testing.  相似文献   

2.
This paper describes a system integrated memory with direct interface to CPU which integrates an SRAM, a DRAM, and control circuitry, including a tag memory (TAG). This memory realizes a computer system without glue chips, and thus enables a computer system which is low cost, low power, and compact size, but still with sufficient performance. Also fast clock cycle time and access time is realized using a newly proposed clock driver and internal signal generator. This memory is fabricated with a quad-polysilicon double-metal 0.55-μm CMOS process which is the same as used in a conventional 16-Mb DRAM. The chip size of 145.3 mm2 is only a 12% increase over the conventional 16-Mb DRAM. The maximum operating frequency is 90-MHz and the operating current at cache-bit is 156-mA. This memory is suitable for various types of computer systems such as personal digital assistants (PDA's), personal computer systems, and embedded controller applications  相似文献   

3.
嵌入式系统中存储器性能研究   总被引:1,自引:0,他引:1  
动态随机存储器是嵌入式系统的一个重要组成部分,而动态随机存储器故障是嵌入式系统故障的一个主要原因之一。在此从动态随机存储器的结构和失效模型出发,有针对地提出了用于检测性能的数据和读写方式,实验证明通过提出的检测方法能够有效地找出潜在的存储器故障,从而能够为嵌入式系统设计人员提供改善系统性能的方法和途径。  相似文献   

4.
This paper describes key techniques for a 1.6-GB/s high data-rate 1-Gb synchronous DRAM (SDRAM). Its high data transfer rate and large memory capacity are targeted to the advanced unified memory system in which a single DRAM (array) is used as both the main memory and the three-dimensional (3-D) graphics frame memory in a time sharing fashion. The 200-MHz high-speed operation is achieved by the unique hierarchical square-shaped memory block (SSMB) layout and the novel distributed bank (D-BANK) architecture. A 0.29 μm2 cell and 581.8 mm2 small die area are achieved using 0.15-μm CMOS technology. The ×61 chip uses 196-pin BGA type chip-scale-package (CSP). Implementation of a built-in self-test (BIST) circuit with a margin test capability is also described  相似文献   

5.
Embedded and portable systems running multimedia applications create a new challenge for hardware architects. A microprocessor for such applications needs to be easy to program like a general-purpose processor and have the performance and power efficiency of a digital signal processor. This paper presents the codevelopment of the instruction set, the hardware, and the compiler for the Vector IRAM media processor. A vector architecture is used to exploit the data parallelism of multimedia programs, which allows the use of highly modular hardware and enables implementations that combine high performance, low power consumption, and reduced design complexity. It also leads to a compiler model that is efficient both in terms of performance and executable code size. The memory system for the vector processor is implemented using embedded DRAM technology, which provides high bandwidth in an integrated, cost-effective manner. The hardware and the compiler for this architecture make complementary contributions to the efficiency of the overall system. This paper explores the interactions and tradeoffs between them, as well as the enhancements to a vector architecture necessary for multimedia processing. We also describe how the architecture, design, and compiler features come together in a prototype system-on-a-chip, able to execute 3.2 billion operations per second per watt  相似文献   

6.
Enabled by the emerging three-dimensional (3D) integration technologies, 3D integrated computing platforms that stack high-density DRAM die(s) with a logic circuit die appear to be attractive for memory-hungry applications such as multimedia signal processing. This paper considers the design of motion estimation accelerator under a 3D logic-DRAM integrated heterogeneous multi-core system framework. In this work, we develop one specific DRAM organization and image frame storage strategy geared to motion estimation. This design strategy can seamlessly support various motion estimation algorithms and variable block size with high energy efficiency. With a DRAM performance modeling/estimation tool and ASIC design at 65 nm, we demonstrate the energy efficiency of such 3D integrated motion estimation accelerators with a case study on HDTV multi-frame motion estimation.  相似文献   

7.
Recently, as multimedia large scale integrated devices (LSIs) have developed, there has been strongly increased demand for high-speed/high-bandwidth LSIs which integrate the DRAM core and logic elements (CPU etc.). However, the high-speed/high-bandwidth operation induces the large switching noise. This noise degrades the DRAM's operating margin, and especially its data retention characteristics. In this paper, we analyze the noise transmission model and propose DRAM and logic compatible design methodologies to maintain the reliability of high-speed/high-bandwidth system LSIs. We also show that good experimental results are obtained on the test device. Furthermore, we propose the most suitable VDD/GND line scheme for on-chip DRAM system LSI  相似文献   

8.
Multicasting applications such as multimedia conferencing, online multiplayer interactive games, and distance learning are becoming increasingly popular. With multiprotocol label switching, Internet protocol networks can offer quality of service and traffic engineering capabilities. This article introduces several approaches for multisource multicast sessions in the context of IP over WDM networks and evaluates their performance in terms of blocking probability, time complexity, and memory consumption. Our simulation study shows that among all the approaches, the newly proposed approach, known as one Bidirectional Tree with Just enough bandwidth reserved on each link of the tree, achieves the best overall performance.  相似文献   

9.
Automotive applications would greatly benefit of multimedia telematic services for many purposes, from tourism and entertainment, to most important issues such as security and traffic management. Within this context, the AIDER system (AIDER is the acronym of Accident Information and Driver Emergency Rescue) is one of the most advanced multimedia mobile services targeted at emergency situations such as road accidents. The AIDER allows the interactive exchange of multimedia data (and in particular, audio, video and biomedical information) between the vehicle and a remote rescue centre, by using several different narrowband radio channels including cellular networks and satellite. In this paper an overview of the AIDER architecture is provided, focusing on the advanced video communication system.  相似文献   

10.
Memory latency has always been a major issue in embedded systems that execute memory-intensive applications. This is even more true as the gap between processor and memory speed continues to grow. Hardware and software prefetching have been shown to be effective in tolerating the large memory latencies inherit in large off-chip memories; however, both types of prefetching have their shortcomings. Hardware schemes are more complex and require extra circuitry to compute data access strides, while software schemes generate prefetch instructions, which if not computed carefully may hamper performance. On the other hand, some applications domains (such as multimedia) have a uniform and known a priori memory access pattern, that if exploited, could yield significant application performance improvement. With this characteristic in mind, we present our findings on hiding memory latency using the direct memory access (DMA) mode, which is present in all modern systems, combined with a software prefetch mechanism, and a customized on-chip memory hierarchy mapping. Compared to previous approaches, we are able to estimate the performance and power metrics, without actually implementing the embedded system. Experimental results on nine well known multimedia and imaging applications prove the efficiency of our technique. Finally, we verify the performance estimations by implementing and simulating the algorithms on the TI C6201 processor.  相似文献   

11.
Low-noise, high-speed circuit techniques for high-density DRAMs (dynamic random-access memories), as well as their application to a single 5-V 16-Mb CMOS DRAM with a 3.3-V internal operating voltage for a memory array, are described. It was found that data-line interference noise becomes unacceptably high (more than 25% of the signal) and causes a serious problem in 16-Mb DRAM memory arrays. A transposed data-line structure is proposed to eliminate the noise. Noise suppression below 5% is confirmed using this transposed data-line structure. A current sense amplifier is also proposed to maintain the data-transmission speed in common I/O lines, in spite of a reduced operating voltage and increased parasitic capacitance loading in the memory array. A speed improvement of 10 ns is achieved. Using these circuit techniques, a 16-Mb CMOS DRAM with a typical RAS access time of 60 ns was realized  相似文献   

12.
A high-speed DRAM data transfer scheme between DRAM and logic parts in merged DRAM logic (MDL) designs is proposed with logically divided DRAM row address mapping. The proposed scheme results in a 20% faster write access and 40% faster read access. It can be used as a general design framework to maximise DRAM access speed in various MDL designs. A test chip has been fabricated by 0.16 μm DRAM technology, and the scheme has been verified in the design of a DRAM L2 cache memory  相似文献   

13.
A cache DRAM which consists of a dynamic RAM (DRAM) as main memory and a static RAM (SRAM) as cache memory is proposed. An error checking and correcting (ECC) scheme utilizing the wide internal data bus is also proposed. It is constructed to be suitable for a four-way set associated cache scheme with more than a 90% hit rate estimated to be obtained. An experimental cache DRAM with 1-Mb DRAM and 8-kb SRAM has been fabricated using a 1.2-μm, triple-polysilicon, single-metal CMOS process. A SRAM access time of 12 ns and a DRAM access time of 80 ns, including an ECC time of 12 ns, have been obtained. Accordingly, an average access time of 20 ns is expected under the condition that the hit rate is 90%. The cache DRAM has a high-speed data mapping capability and high reliability suitable for low-end workstations and personal computers  相似文献   

14.
We have developed a 0.25-μm, 200-MHz embedded RISC processor for multimedia applications. This processor has a dual-issue superscalar datapath that consists of a 32-bit integer unit and a 64-bit single-instruction multiple-data (SIMD) function unit that together have a total of five multiply-adders. An on-chip concurrent Rambus DRAM (C-RDRAM) controller uses interleaved transactions to increase the memory bandwidth of the Rambus channel to 533 Mb/s. The controller also reduces latency by using the transaction interleaving and instruction prefetching. A 64-bit, 200-MHz internal bus transfers data among the CPU core, the C-RDRAM, and the peripherals. These high-data-rate channels improve CPU performance because they eliminate a bottleneck in the data supply. The datapath part of this chip was designed using a functional macrocell library that included placement information for leaf cells and resulted in the SIMD function unit of this chip's having 68000 transistors per square millimeter  相似文献   

15.
New investigations are presented here on a high-density and DRAM-like high-speed non-volatile memory (NVM) application of unified RAM (URAM). For a high-density application of URAM, multiple data storage is demonstrated with a multi-dual cell (MDC). Because each NVM state can be split by programming with a one-transistor (1T) DRAM without a capacitor, the total number of memory states can be doubled. Furthermore, a high-speed DRAM-level NVM scheme is proposed for the joint operation of 1T DRAM buffer programming and NVM post-background programming. The MDC and the proposed scheme are unique URAM properties that can extend the application range of memory devices.  相似文献   

16.
A modular architecture for a DRAM-integrated, multimedia chip with a data transfer rate of 6 to 12 Gbyte/s is proposed. The architecture offers the design flexibility in terms of both DRAM capacity and the logic-memory interface for use in a wide variety of applications. A DRAM macro built from cascadable DRAM bank modules having a 256-kb memory capacity and 128-b I/Os provides flexibility and reconfigurability of DRAM capacity and a high data transfer rate with an area of 6.4 mm2 /Mb. A data transfer circuit (called the “reconfigurable data I/O attachment”), which is attached to the I/O lines of the DRAM macro, provides a flexible logic-memory interface by changing the data-transfer routes between the DRAM macro and logic circuits in real time. A 6.4-Gbyte/s test chip (called the “media chip”) for three-dimensional computer graphics was fabricated to test the proposed design methodology. It integrates an 8-Mb DRAM and four pixel processors on an 8.35×14.6-mm chip by using a 0.4-μm CMOS design rule  相似文献   

17.
Chip-on-heat sink leadframe (COHS-LF) packages offer a simple, low-cost chip encapsulation structure with advanced electrical and thermal performance for high-speed integrated circuit applications. The COHS-LF package is a novel solution to the problems of increased power consumption and signal bandwidth demands that result from high-speed data transmission rates. Not only does it offer high thermal and electrical performance, but also provides a low-cost short time-to-market package solution for high-speed applications. In general, there are two main memory packages employed by the most popular high-speed applications, double data rate (DDR) SDRAM. One is the cheaper, higher parasitic leadframe packages, such as the thin small outline packages (TSOPs), and the other is the more expensive, lower parasitic substrate-based packages, such as the ball grid array (BGA). Due to the requirement for higher ambient temperature and operating frequency for high-speed devices, DDR2 SDRAM packages were switched from conventional TSOPs to more expensive chip-scale packages (i.e., BGA) with lower parasitic effects. And yet, by using an exposed heat sink pasted on the surface of the chip and packed in a conventional leadframe package, the COHS-LF is a simpler, lower cost design. Results of a three-dimensional full-wave electromagnetic field solver and SPICE simulator tests show that the COHS-LF package achieves less signal loss, propagation delay, edge rate degradation, and crosstalk than the BGA package. Furthermore, transient analysis using the wideband T-3/spl pi/ models optimized up to 5.6 GHz for signal speeds as high as 800 Mb/s/lead demonstrates the accuracy of the equivalent circuit model and reconfirms the superior electrical characteristics of COHS-LF package.  相似文献   

18.
This paper proposes the virtual-socket architecture in order to reduce the design turn-around time (TAT) of the embedded DRAM. The required memory density and the function of the embedded DRAM are system dependent. In the conventional design, the DRAM control circuitry with the DRAM memory array is handled as a hardware macro, resulting in the increase in design TAT. On the other hand, our proposed architecture provides the DRAM control circuitry as a software macro to take advantage of the automated tools based on synchronous circuit design. With array-generator technology, this architecture can achieve high quality and quick turn-around time (QTAT) of flexible embedded DRAM that is almost the same as the CMOS ASIC. We applied this virtual-socket architecture to the development of the 61-Mb synchronous DRAM core using 0.18-μm design rule and confirmed the high-speed operation, 166 MHz at CAS latency of two, and 180 MHz at that of three. The experimental results show that our proposed architecture can be applied to the development of the high-performance embedded DRAM with design QTAT  相似文献   

19.
Recently, the level of realism in PC graphics applications has been approaching that of high-end graphics workstations, necessitating a more sophisticated texture data cache memory to overcome the finite bandwidth of the AGP or PCI bus. This paper proposes a multilevel parallel texture cache memory to reduce the required data bandwidth on the AGP or PCI bus and to accelerate the operations of parallel graphics pipelines in PC graphics cards. The proposed cache memory is fabricated by 0.16-μm DRAM-based SOC technology. It is composed of four components: an 8-MB DRAM L2 cache, 8-way parallel SRAM L1 caches, pipelined texture data filters, and a serial-to-parallel loader. For high-speed parallel L1 cache data replacement, the internal bus bandwidth has been maximized up to 75 GB/s with a newly proposed hidden double data transfer scheme. In addition, the cache memory has a reconfigurable architecture in its line size for optimal caching performance in various graphics applications from three-dimensional (3-D) games to high-quality 3-D movies  相似文献   

20.
Today, high-speed multimedia services are becoming one of the most challenging demands of cellular subscribers. Many approaches have been proposed during last decade and still there are many ongoing researches aiming to provide higher speed and higher quality services for users. Multihop cellular network (MCN) is a promising solution to this problem. MCNs offer high-speed communication to mobile stations which are far from their base stations by relaying their data in a multihop connection. Moreover, TDD-CDMA networks are proven to be able to accommodate the asymmetric traffic generated by applications such as Internet browsers or multimedia applications. These types of traffic are most of the time biased towards uplink or downlink. In this paper we deploy multihop relaying in TDD-CDMA networks in order to provide high data rate connections targeting for multimedia applications. We propose time slot allocation schemes for TDD-CDMA MCNs and compare them with conventional schemes used in TDD-CDMA single-hop cellular networks in terms of users’ data rate and maximum capacity in different symmetric and asymmetric traffic scenarios. Results of this study show that our proposed schemes can provide higher data rate as well as capacity and outperform other conventional schemes.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号