1.
The design, implementation, and experimental results for a ternary content addressable search engine chip, known as the Database Accelerator (DBA), are discussed. The DBA chip architecture is presented; it is well suited to serve as a coprocessor for a variety of logic search applications. The core of the DBA system is composed of novel high-density content addressable memory (CAM) cells capable of storing three states. The design of these cells and their support circuitry is described. The CAM cell and support circuitry were fabricated and their operation confirmed. The circuit implementation of the DBA data path is described, with particular emphasis on the optimization of the multiple-response resolver. The timing and control methodology, which simultaneously satisfies the complexity, speed, and robustness requirements of the DBA chip, is reported. Experimental DBA chip results that verify the full functionality and testability of the design are presented.
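The ternary (three-state) matching that such CAM cells implement can be modeled in a few lines of software. The sketch below is illustrative only (the function names are ours, not the paper's): each stored cell holds '0', '1', or a don't-care 'X', and a stored word matches the search key when every non-don't-care cell agrees with the corresponding key bit. A hardware CAM evaluates all words simultaneously; this model simply enumerates them.

```python
def ternary_match(stored: str, key: str) -> bool:
    """A stored word matches when every non-'X' cell equals the key bit."""
    return all(s == 'X' or s == k for s, k in zip(stored, key))

def search(cam: list[str], key: str) -> list[int]:
    """Return the indices of all matching entries (the multiple-response
    set that a resolver like the DBA's must then prioritize)."""
    return [i for i, word in enumerate(cam) if ternary_match(word, key)]
```

For example, searching `["10X1", "1001", "0XXX"]` for the key `"1011"` matches only entry 0, since the 'X' cell accepts either key bit.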

2.
A 1.2-million transistor, 33-MHz, 20-b dictionary search processor (DISP) ULSI has been developed using a 0.8-μm triple-layer-Al CMOS fabrication technology. The 13.02×12.51-mm² chip contains a specially developed 160-kb content addressable memory (CAM) and a cellular automaton processor (CAP). A single DISP chip can store a maximum of 2048 words and performs dictionary search in various search modes, including an approximate word search. The character input rate for the dictionary search operation is 33 million characters per second. The DISP typically consumes 800 mW at a supply voltage of 5 V. A high-speed, functional 50,000-word dictionary search system can be built with 25 DISP chips arranged in parallel, to play an important role in natural language processing.

3.
For real-time image-processing applications, a highly parallel system that exploits parallelism is desirable. A content addressable memory (CAM), or associative processor, that can perform various types of parallel processing with words as the basic unit is a promising component for such a system because of its suitability for LSI implementation. Conventional CAM LSIs, however, have neither efficient functions nor sufficient capacity for pixel-parallel processing. This paper describes a fully parallel 1-Mb CAM LSI. It has advanced functions for processing various pixel-parallel algorithms, such as mathematical morphology and discrete-time cellular neural networks. Moreover, since it has 16K words, or processing elements (PEs), which can process 128×128 pixels in parallel, a board-sized pixel-parallel image-processing system can be implemented using several chips. A chip capable of operating at 56 MHz and 2.5 V was fabricated using 0.25-μm full-custom CMOS technology with five aluminum layers. A total of 15.5 million transistors have been integrated into a 16.1×17.0-mm chip. Typical power dissipation is 0.25 W. Processing performance of various update and data transfer operations is 3-640 GOPS. This CAM LSI will make a significant contribution to the development of compact, high-performance image-processing systems.

4.
This paper describes the circuit technologies and the experimental results for a 1-Mb flash CAM, a content addressable memory LSI based on flash memory technologies. Each memory cell in the flash CAM consists of a pair of flash memory cell transistors. Additionally, four new circuit technologies have been developed: a small-size search sense amplifier; a highly parallel search management circuit; a high-speed priority encoder; and word line/bit line redundancy circuits for higher production yields. A cell size of 10.34 μm² and a die size of 42.9 mm² have been achieved with 0.8-μm design rules. Read access time and search access time are 115 ns and 135 ns, respectively, with a 5-V supply voltage. Power dissipation at 3.3-MHz operation is 210 mW in read access and 140 mW in search access.

5.
Translation functions in high-speed communications networks such as Internet protocol and asynchronous transfer mode are requiring larger and faster lookup tables. Content addressable memories (CAMs) provide built-in hardware lookup capability with high speed and high flexibility in address allocation. Previous high-capacity CAMs have been inadequate for emerging applications: comparators are time-shared among multiple bits or multiple words, resulting in serialized operation. Fully parallel architectures represent the best solution for high-speed operation, but previous fully parallel CAMs have lacked the capacity required for leading-edge networking applications. This paper describes a fully parallel (single-clock-cycle) CAM chip. The chip uses a 0.35-μm digital CMOS technology to achieve 2.5 Mb of CAM storage and 30-MHz operating frequency. Innovative layout techniques are used to achieve two-dimensional decoding, a traditional problem with high-capacity CAMs. Architecture and operation of the chip are described, including a novel NAND match architecture, operation-specific self-timing loops, and on-board cascade management circuits. The chip functions at 31 MHz, with a search access time of 26 ns and an average search power dissipation of 5.2 W at 25 MHz.

6.
Using ternary content addressable memory (TCAM) for high-speed IP address lookup has been gaining popularity due to its deterministic high performance. However, restricted by the slow improvement of memory access speed, the route lookup engines for next-generation terabit routers demand exploiting parallelism among multiple TCAM chips. Traditional parallel methods always incur excessive redundancy and high power consumption. We propose in this paper an original TCAM-based IP lookup scheme that achieves both ultra-high lookup throughput and optimal utilization of the memory while being power-efficient. In our multi-chip scheme, we devise a load-balanced TCAM table construction algorithm together with an adaptive load balancing mechanism. The power efficiency is well controlled by decreasing the number of TCAM entries triggered in each lookup operation. Using four 133-MHz TCAM chips and given 25% more TCAM entries than the original route table, the proposed scheme achieves a lookup throughput of up to 533 MPPS while remaining simple for ASIC implementation.

7.
A novel, low-energy content addressable memory (CAM) structure is presented which achieves an approximately four-fold improvement in energy per access, compared to a standard parallel CAM, when used as tag storage for caches. It exploits the address patterns commonly found in application programs, where testing the four least significant bits of the tag is sufficient to determine over 90% of the tag mismatches; the proposed CAM checks those bits first and evaluates the remainder of the tag only if they match. Although the energy savings come at the cost of a 25% increase in search time, the proposed CAM organization also supports a parallel operating mode without a speed loss but with reduced energy savings.
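The two-phase filtering idea can be modeled in software; this is a behavioral sketch under our own naming, not the paper's circuit. Phase 1 compares only the low-order bits of every stored tag against the query; phase 2 evaluates the full tag only for the (usually small) set of survivors, which is where the energy saving comes from in hardware.

```python
def two_phase_search(tags: list[int], query: int, lsb: int = 4) -> list[int]:
    """Return indices of tags equal to `query`, filtering first on the
    `lsb` least significant bits as the proposed CAM does."""
    mask = (1 << lsb) - 1
    # Phase 1: cheap partial compare on the low-order bits only.
    candidates = [i for i, t in enumerate(tags) if (t & mask) == (query & mask)]
    # Phase 2: full compare, but only for entries that survived phase 1.
    return [i for i in candidates if tags[i] == query]
```

With tags `[0x12, 0x34, 0x14]` and query `0x14`, phase 1 eliminates `0x12` immediately (low nibble 2 ≠ 4) and the full compare is performed on only two entries.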

8.
With a rapid increase in data transmission link rates and immense continuous growth in Internet traffic, the demand for routers that perform Internet protocol packet forwarding at high speed and throughput is ever increasing. The key issue in router performance is the IP address lookup mechanism based on the longest prefix matching scheme. Earlier work on fast Internet protocol version 4 (IPv4) routing table lookup includes software mechanisms based on tree traversal or binary search, and hardware schemes based on content addressable memory (CAM), memory lookups, and CPU caching. These schemes depend on the memory access technology, which limits their performance. The paper presents a binary decision diagram (BDD) based optimized combinational logic for an efficient implementation of a fast address lookup scheme in reconfigurable hardware. The results show that the BDD hardware engine gives a throughput of up to 175.7 million lookups per second (Ml/s) for a large AADS routing table with 33,796 prefixes, a throughput of up to 168.6 Ml/s for an MAE-West routing table with 29,487 prefixes, and a throughput of up to 229.3 Ml/s for the Pacbell routing table with 6,822 prefixes. Besides the performance of the scheme, routing table updates and scalability to Internet protocol version 6 (IPv6) are discussed.
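For readers unfamiliar with the longest prefix matching problem these lookup engines solve, a minimal software reference is shown below. This is a naive linear scan, nothing like the paper's BDD engine; it only defines the correct answer that any fast scheme must reproduce: among all prefixes covering the address, return the next hop of the most specific (longest) one.

```python
import ipaddress

def longest_prefix_match(table: dict[str, str], addr: str):
    """table maps CIDR prefixes to next hops; returns the next hop of the
    longest matching prefix, or None if no prefix covers the address."""
    ip = ipaddress.ip_address(addr)
    best_len, best_hop = -1, None
    for prefix, hop in table.items():
        net = ipaddress.ip_network(prefix)
        if ip in net and net.prefixlen > best_len:
            best_len, best_hop = net.prefixlen, hop
    return best_hop
```

For example, with both `10.0.0.0/8 -> A` and `10.1.0.0/16 -> B` installed, address `10.1.2.3` resolves to `B` because the /16 is the more specific match.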

9.
A ternary content‐addressable memory (TCAM) is a popular hardware device for performing fast IP‐address lookup. Because all entries must be kept sorted in the TCAM, existing entries must be moved when a new entry is inserted. In this paper, we present a scheme for minimizing route update overheads in TCAM‐based forwarding engines. Our optimizations are based on the hierarchy of prefixes in the routing table. The number of memory movements per update depends on the sequence of newly inserted prefixes, not on the initial prefixes in the routing table. For real route-update traces, the average number of movements is less than 0.01. Further, when compared to an existing optimization algorithm, our algorithm shows a 90% reduction in movement overheads in the average case. Copyright © 2005 John Wiley & Sons, Ltd.
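To see why insertions cause movements at all, consider the baseline scheme the paper improves on: entries are kept in descending prefix-length order so that the first TCAM match is always the longest prefix, and inserting a new entry shifts everything below its position. The sketch below models that naive baseline (our own illustration, not the paper's optimized algorithm, which exploits the prefix hierarchy to avoid most of these moves).

```python
def insert_sorted(tcam: list[tuple[int, str]], prefix_len: int, entry: str) -> int:
    """tcam is a list of (prefix_len, entry) kept in descending prefix-length
    order. Insert a new entry and return how many existing entries had to be
    moved down -- the write overhead a TCAM update scheme tries to minimize."""
    pos = 0
    while pos < len(tcam) and tcam[pos][0] >= prefix_len:
        pos += 1
    moved = len(tcam) - pos          # everything below pos shifts by one slot
    tcam.insert(pos, (prefix_len, entry))
    return moved
```

Inserting a /20 into a table holding a /24, a /16, and a /8 lands between the /24 and the /16 and forces two entries to move, illustrating the overhead that ordering-aware schemes drive toward zero.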

10.
Ghosh D., Daly J.C., Fried J. Electronics Letters, 1989, 25(8): 524-526
The design and performance of a content addressable memory (CAM) LSI using a newly developed cell circuit are presented. The LSI has all the functions necessary to implement a high-speed data searching system and is fabricated using a 3-μm CMOS double-metallisation process. A cycle time of 60 ns, with the basic associative operation taking 20 ns, has been measured.

11.
We propose a new CAM architecture for the large-scale integration and low-power operation of a network router application. This CAM reduces entry count by an average of 52%, using a newly developed one-hot-spot block code. This code eliminates redundancy in a memory cell and improves the efficiency of IP address compression. To implement the proposed code, a hierarchical match-line structure and an on-chip entry compression/extraction scheme are introduced. With this architecture, a search-depth control scheme deactivates unnecessary search lines and reduces power consumption by 45%. Using a DRAM cell, our new content addressable memory (CAM) can achieve 1.5 million entries in 0.13-μm technology, six times more than a conventional static ternary CAM.

12.
Lee H.-J. Electronics Letters, 2008, 44(4): 269-270
Content addressable memory (CAM) is used in many applications. As process technology scales into the deep sub-micron regime, the soft error rate increases significantly. Densely integrated memory cells in CAM are prone to soft errors, and bit flipping in CAM leads to an incorrect search operation, which can be fatal from a system point of view. The proposed scheme enables immediate detection of soft errors and correction of the problem with a small number of additional logic gates.

13.
A versatile data string-search VLSI has been fabricated using 1.6-μm CMOS technology. The VLSI consists of an 8-K content addressable memory (CAM) and a 20-K-gate finite-state automaton logic (FSAL). A number of unique functions, such as strict/approximate-match string search and fixed/variable-length `don't care' operations, were implemented. A total of 217,600 transistors have been integrated on an 8.62×12.76-mm die. The unique functions were efficiently tested by the scan-path method. The data comparison rate was 5.12 billion characters/s in a text-search application.

14.
Scalable IP lookup for Internet routers
Internet protocol (IP) address lookup is a central processing function of Internet routers. While a wide range of solutions to this problem have been devised, very few simultaneously achieve high lookup rates, good update performance, high memory efficiency, and low hardware cost. High-performance solutions using content addressable memory devices are a popular but high-cost solution, particularly when applied to large databases. We present an efficient hardware implementation of a previously unpublished IP address lookup architecture, invented by Eatherton and Dittia (see M.S. thesis, Washington Univ., St. Louis, MO, 1998). Our experimental implementation uses a single commodity synchronous random access memory chip and less than 10% of the logic resources of a commercial configurable logic device, operating at 100 MHz. With these quite modest resources, it can perform over 9 million lookups/s, while simultaneously processing thousands of updates/s, on databases with over 100,000 entries. The lookup structure requires 6.3 bytes per address prefix: less than half that required by other methods. The architecture allows performance to be scaled up by using parallel fast IP lookup (FIPL) engines, which interleave accesses to a common memory interface, so performance scales directly with available memory bandwidth. We describe the tree bitmap algorithm, our implementation of it in a dynamically extensible gigabit router being developed at Washington University in Saint Louis, and the results of performance experiments designed to assess its performance under realistic operating conditions.

15.
Many applications would benefit from the availability of large-capacity content addressable memories (CAMs). However, while RAMs, EEPROMs, and other memory types achieve ever-increasing per-chip bit counts, CAMs show little promise of following suit, due primarily to an inherent difficulty in implementing two-dimensional decoding. The serialized operation of most proposed solutions is not acceptable in speed-sensitive environments. In response to the resulting need, this paper describes a fully-parallel (single-clock-cycle) CAM architecture that uses the concept of “preclassification” to realize a second dimension of decoding without compromising throughput. As is typically the case, each CAM entry is used as an index to additional data in a RAM. To achieve improved system integration, the preclassified CAM is merged into the same physical array as its target RAM, and both use the same core cells. Architecture and operation of the resulting novel memory are described, as are two critical-path circuits: the match-line pull-down and the multiple match resolver. The memory circuits, designed in 0.8 μm BiCMOS technology, may be employed in chips as large as 1 Mb, and simulations confirm 37 MHz operation for this capacity. To experimentally verify the feasibility of the architectural and circuit design, an 8 kb test chip was fabricated and found to be fully functional at clock speeds up to 59 MHz, with a power dissipation of 260 mW at 50 MHz.

16.
Experimental analysis of table lookup with holographic storage
This paper designs a content-addressable optical table-lookup system that exploits the associative property of holography, complementary coding, and threshold decision. Using the basic exclusive-OR operation of holographic storage, modulo-4 residue addition is demonstrated, and a simple physical model is given for the relationship among exposure count, exposure time, and diffraction efficiency arising from the dynamic characteristics of multiple-exposure storage in ferroelectric crystals.

17.
This paper demonstrates a keyword match processor capable of performing fast dictionary search with approximate match capability. Using a content addressable memory with processor element cells, the processor can process arbitrarily sized keywords and match input text streams in a single clock cycle. We present an architecture that allows priority detection of multiple keyword matches on single input strings. The processor is also capable of determining approximate matches and providing distance information. A 64-word design has been developed using 19,000 transistors, and it can easily be expanded to larger sizes. Using a modest 0.5-μm process, we achieve cycle times of 10 ns, and the design will scale to smaller feature sizes.
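The "distance information" an approximate matcher reports is typically some edit distance between the input stream and a stored keyword. As a reference point (not the paper's circuit, whose metric is not specified here), the standard Levenshtein distance can be computed with dynamic programming:

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))          # distances from "" to prefixes of b
    for i, ca in enumerate(a, 1):
        cur = [i]                           # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # delete ca
                           cur[j - 1] + 1,               # insert cb
                           prev[j - 1] + (ca != cb)))    # substitute
        prev = cur
    return prev[-1]
```

An approximate dictionary search then accepts a stored keyword when this distance to the input falls below a threshold; e.g. `edit_distance("kitten", "sitting")` is 3.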

18.
A cooperative consistent update algorithm for software-defined networks
于倡和, 兰巨龙, 胡宇翔. 电子学报 (Acta Electronica Sinica), 2018, 46(10): 2341-2346
To achieve consistent updates in software-defined networks, this paper proposes an update algorithm that cooperatively employs three mechanisms: segment routing, ordered update, and two-step replication. The algorithm first applies segment routing, attempting to stitch together the final path of each flow to be updated from existing path rules, and classifies flows as stitchable or non-stitchable according to whether the final path can be composed from existing rules. For stitchable flows, segment routing encapsulates the final-path information in the packet header so that packets can immediately be forwarded along the final path. For non-stitchable flows, the algorithm computes the longest consistent update sequence, updates nodes in that order, and finally uses the two-step replication mechanism to update the remaining nodes. Experiments verify that, compared with previously proposed algorithms, this algorithm not only consumes less ternary content addressable memory (TCAM) space but also offers better applicability and stability.

19.
A novel approach to charge-coupled device (CCD) memory organization has been conceived and implemented in a 16,384-bit memory chip. It utilizes an isoplanar n-channel silicon gate MOS process in conjunction with self-aligned implanted barrier, buried channel CCD technology. The chip is organized in four parallel, identical sections of 32 independent lines, with each line 128 bits long. The four sections are controlled in parallel. Any of the 32 lines (the same line in each of the four sections) can be randomly accessed; hence the name, line addressable random-access memory (LARAM). Each line can be brought to a halt at any of its 128 possible positions. Design features and test results of the memory are described.

20.
Microelectronics Journal, 2014, 45(8): 1118-1124
A novel nanoelectronic single-electron content addressable memory is designed and simulated. The proposed memory has three important building blocks: a storage block, a comparison block, and an addressing block. These building blocks were built from single-electron circuits such as Reset-Set latches, exclusive-OR gates, and a winner-take-all (WTA) neural network. Each building block was separately adjusted to provide room-temperature operation before being connected to the others. Analyses of the stability of each block and of the whole memory circuit were performed. The nanoelectronic memory was successfully validated by simulation.
