Similar articles
20 similar articles found.
1.
A newly designed three-dimensional (3-D) memory die stack package has been established, and a prototype of the 3-D package using mechanical dies has been successfully demonstrated. The fabrication process of the 3-D package consists of: (1) wafer cutting into die segments; (2) die passivation, including sidewall insulation; (3) via opening on the original I/O pads; (4) I/O redistribution from center pads to the sidewall; (5) bare die stacking using polymer adhesive; (6) sidewall interconnection; and (7) solder ball attachment. This new 3-D package design offers several significant improvements over the current 3-D package concept. The unique feature of the newly developed package is that the die sidewalls are insulated before the I/O redistribution, which produces (1) better chip-to-wafer yields and (2) significant process simplification during subsequent fabrication steps. With this design, a 100% die yield can be obtained on a conventional wafer design, without the neighboring die losses that usually occur during the I/O redistribution processes of a conventional 3-D package design. Furthermore, the new 3-D package design can simplify subsequent processes such as I/O redistribution, sidewall insulation, sidewall interconnection, and package formation. It is proven that the mechanical integrity of the prototype 3-D stacked package meets the requirements of the JEDEC Level III and 85°C/85% test

2.
Three-dimensional discrete wavelet transform architectures   (Cited by: 2 total; self-citations: 0; citations by others: 2)
The three-dimensional (3-D) discrete wavelet transform (DWT) suits compression applications well, allowing for better compression on 3-D data as compared with two-dimensional (2-D) methods. This paper describes two architectures for the 3-D DWT, called the 3DW-I and the 3DW-II. The first architecture (3DW-I) is based on folding, whereas the 3DW-II architecture is block-based. Potential applications for these architectures include high definition television (HDTV) and medical data compression, such as magnetic resonance imaging (MRI). The 3DW-I architecture is an implementation of the 3-D DWT similar to folded 1-D and 2-D designs. It allows even distribution of the processing load onto 3 sets of filters, with each set performing the calculations for one dimension. The control for this design is very simple, since the data are operated on in a row-column-slice fashion. Due to pipelining, all filters are utilized 100% of the time, except for the start-up and wind-down times. The 3DW-II architecture uses block inputs to reduce the requirement of on-chip memory. It has a central control unit to select which coefficients to pass on to the lowpass and highpass filters. The memory on the chip will be small compared with the input size since it depends solely on the filter sizes. The 3DW-I and 3DW-II architectures are compared according to memory requirements, number of clock cycles, and processing of frames per second. The two architectures described are the first 3-D DWT architectures
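
The row-column-slice ordering mentioned in the abstract above can be illustrated with a minimal software sketch. It assumes a one-level averaging Haar filter pair and a volume with even dimensions stored as nested lists indexed `volume[z][y][x]`; the function names `haar_1d` and `dwt_3d` are hypothetical, and the sketch does not model the folded or block-based hardware of 3DW-I and 3DW-II.

```python
def haar_1d(signal):
    """One level of an averaging Haar DWT (an assumed filter pair).

    Returns the lowpass half followed by the highpass half.
    """
    half = len(signal) // 2
    low = [(signal[2 * i] + signal[2 * i + 1]) / 2 for i in range(half)]
    high = [(signal[2 * i] - signal[2 * i + 1]) / 2 for i in range(half)]
    return low + high


def dwt_3d(volume):
    """Separable one-level 3-D DWT applied in row-column-slice order."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    # Work on a copy so the input volume is left unchanged.
    out = [[list(row) for row in plane] for plane in volume]

    # Pass 1: filter every row (along x).
    for z in range(nz):
        for y in range(ny):
            out[z][y] = haar_1d(out[z][y])

    # Pass 2: filter every column (along y).
    for z in range(nz):
        for x in range(nx):
            column = haar_1d([out[z][y][x] for y in range(ny)])
            for y in range(ny):
                out[z][y][x] = column[y]

    # Pass 3: filter across slices (along z).
    for y in range(ny):
        for x in range(nx):
            tube = haar_1d([out[z][y][x] for z in range(nz)])
            for z in range(nz):
                out[z][y][x] = tube[z]
    return out
```

Each pass touches one dimension only, which is what lets a hardware design assign one filter set per dimension.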

3.
Three-dimensional (3D) memories using through-silicon vias (TSVs) as vertical buses across memory layers will likely be the first commercial application of 3D integrated circuit technology. The memory dies to be stacked together in a 3D memory are selected by a die-selection method. Conventional die-selection methods do not result in a high enough yield of 3D memories because 3D memories are typically composed of known-good-dies (KGDs), which are repaired using self-contained redundancies. In 3D memory, redundancy sharing between neighboring vertical memory dies using TSVs is an effective strategy for yield enhancement. With the redundancy sharing strategy, a known-bad-die (KBD) possibly becomes a KGD after bonding. In this paper, we propose a novel die-selection method using KBDs as well as KGDs for yield enhancement in 3D memory. The proposed die-selection method uses three search-space conditions, which can reduce the search space for selecting memory dies to manufacture 3D memories. Simulation results show that the proposed die-selection method can significantly improve the yield of 3D memories in various fault distributions.
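
The idea that a known-bad die can become usable once it shares spares with a neighbor can be illustrated with a toy pairing routine. This is only a sketch under strong assumptions: each die is reduced to a scalar fault count and a scalar spare count, a two-layer stack counts as repairable when the pooled spares cover the pooled faults, and the greedy pairing is a generic strategy, not the three search-space conditions of the paper. The name `pair_dies` and the tuple layout are hypothetical.

```python
def pair_dies(dies):
    """Greedily pair dies into two-layer stacks that are repairable
    when spares are shared between the two layers.

    dies: list of (die_id, faults, spares) tuples (hypothetical model).
    Returns a list of (die_id_a, die_id_b) pairs.
    """
    # surplus >= 0: the die can repair itself (KGD); surplus < 0: KBD.
    surplus = sorted((spares - faults, die_id) for die_id, faults, spares in dies)
    pairs = []
    lo, hi = 0, len(surplus) - 1
    while lo < hi:
        if surplus[lo][0] + surplus[hi][0] >= 0:
            # The worst remaining die is rescued by the best remaining die.
            pairs.append((surplus[lo][1], surplus[hi][1]))
            lo += 1
            hi -= 1
        else:
            # Even the best remaining die cannot rescue this one: skip it.
            lo += 1
    return pairs


dies = [("A", 3, 2), ("B", 1, 2), ("C", 0, 2), ("D", 5, 2)]
print(pair_dies(dies))
```

In this example die A is a KBD on its own (three faults, two spares) but forms a repairable stack with die C, whereas die D cannot be rescued by any available partner.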

4.
Three-dimensional (3D) memories using through-silicon vias (TSVs) will likely be the first commercial applications of 3D integrated circuit technology. The yield of a 3D memory can be enhanced by vertical redundancy sharing strategies. The methods used to select memory dies to form 3D memories have a great effect on the 3D memory yield. Since previous die-selection methods share redundancies only between neighboring memory dies, the opportunity to achieve significant yield enhancement is limited. In this paper, a novel die-selection method is proposed for multi-layer 3D memories that shares redundancies among all of the memory dies by using additional TSVs. The proposed method uses three selection conditions to form a good multi-layer 3D memory. Furthermore, the proposed method considers memory fault characteristics, newly detected faults after bonding, and multiple memory blocks in each memory die. Simulation results show that the proposed method can significantly improve the multi-layer 3D memory yield in a variety of situations. The TSV overhead for the proposed method is almost the same as that for the previous methods.

5.
Going vertical, as in 3-D IC design, reduces the distance between vertical active silicon dies, allowing more dies to be placed closer to each other. However, putting 2-D ICs into a three-dimensional structure leads to thermal accumulation due to the closer proximity of active silicon layers. The top die also experiences a longer heat dissipation path. All these contribute to higher and non-uniform temperature variations in 3-D ICs; higher temperature exacerbates negative bias temperature instability (NBTI). NBTI degrades CMOS transistor parameters such as delay, drain current, and threshold voltage. While the impact of transistor aging is well understood from the device point of view, very little is known about its impact on security. We demonstrate that a hardware intruder could leverage this phenomenon to trigger the payload, without requiring a separate triggering circuit. In this paper we provide a detailed analysis of how tiers of 3-D ICs can be subject to exacerbated NBTI. We propose embedding a threshold-voltage extractor circuit in conjunction with a novel NBTI-mitigation scheme as a countermeasure against such anticipated Trojans. We validate through post-layout and Monte Carlo simulations using 45 nm technology that our proposed solution can compensate for NBTI effects in 3-D ICs. With an area overhead of 7% when implemented in a Mod-3 counter, our proposed solution can completely tolerate an NBTI-induced threshold voltage shift of up to 60%.

6.
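This paper presents an FPGA implementation architecture for a fast two-dimensional (2-D) DCT algorithm. The row-column fast algorithm decomposes the 2-D DCT into two one-dimensional (1-D) DCTs. The 1-D DCT draws on the Loeffler DCT algorithm and uses a parallel pipelined structure to raise the circuit's data throughput and computation speed. Simplifying the coefficient matrix and using equivalent forms of the butterfly structure reduce multiplier usage, so that the 1-D DCT core consumes 16 multipliers. The transpose RAM uses eight dual-port RAMs and can read or write eight data items in one clock cycle. Experimental results verify the correctness of the 2-D DCT core design. The circuit structure consumes few resources, is simple to route, has low power consumption, and is suitable for real-time image processing.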
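
The row-column decomposition with a transpose step between the two 1-D passes can be sketched in software as follows. The sketch assumes an orthonormal DCT-II computed directly from its definition rather than the Loeffler factorization used in the hardware, and the function names `dct_1d`, `transpose`, and `dct_2d` are hypothetical.

```python
import math


def dct_1d(x):
    """Orthonormal 1-D DCT-II, computed directly from the definition
    (an assumed stand-in for the Loeffler fast algorithm)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        acc = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i in range(n))
        out.append(scale * acc)
    return out


def transpose(matrix):
    """Software counterpart of the transpose memory between the two passes."""
    return [list(row) for row in zip(*matrix)]


def dct_2d(block):
    """2-D DCT as two 1-D passes: rows first, then columns."""
    row_pass = [dct_1d(row) for row in block]
    column_pass = [dct_1d(col) for col in transpose(row_pass)]
    return transpose(column_pass)
```

For a constant 8×8 block all energy lands in the DC coefficient, which is a quick sanity check on the two-pass structure.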

7.
Three-dimensional (3-D) chip stacking technology provides a new approach to address the so-called memory wall problem. Memory-processor chip stacking reduces this memory wall problem, permitting faster clock rates (with suitable processor logic) or permitting multicore access to shared memory using a large number of vertical vias between tiers in the stack, for ultrawide bit path transfer of data and address information to and from various levels of cache. Although a limited amount of parallel access is possible using conventional two-dimensional (2-D) chip memory-processor approaches, 3-D memory-processor stacking greatly extends this to much larger capacity memories. We evaluate high-clock-rate processors as well as shared memory processors with a large number of cores. Various architectural design options to reduce the impact of the memory wall on the processor performance are explored and validated through simulations. Certain architectural features can be implemented in a 3-D chip, such as an ultrawide, ultrashort vertical bus with low parasitic resistance and the elimination of the conventional electrostatic discharge and packaging parasitics required in multiple-package 2-D solutions. The objective is to reduce the clocks per instruction figure of merit for high clock speeds in order to deliver significant performance levels. High-clock-rate processors can be designed with SiGe heterostructure bipolar transistors to obtain processors operating on the order of 16 or 32 GHz.

8.
Two-dimensional (2-D) filters for video signal processing typically operate at high uniform sampling rates and require very large delay-line (DL) memory blocks. By employing 2-D multirate signal processing techniques to reduce the sampling rate, not only can the DL memory blocks be downsized to save silicon area, but the memory access time can also be increased to save power. This is demonstrated in this paper considering a 2-D switched-capacitor multirate image processor that realizes (2×2)nd-order recursive low-pass and high-pass filtering functions employing half of the storage cells that would be needed in a nonmultirate system. Only one type of operational transconductance amplifier with 1-mS transconductance and 120-MHz unity gain bandwidth is needed for both the vertical filter and associated DL memory blocks and the horizontal decimating filter. Fully differential circuit techniques are employed to increase immunity to charge feedthrough injection in the analog storage cells. The complete system has been implemented in a CMOS 1.0-μm double-poly technology. The core active area is only 2.5×3.0 mm², and at 5-V supply and 18-MHz sampling it dissipates 85 mW.

9.
FAIR: a hardware architecture for real-time 3-D image registration   (Cited by: 2 total; self-citations: 0; citations by others: 2)
Mutual information-based image registration, shown to be effective in registering a range of medical images, is a computationally expensive process, with a typical execution time on the order of minutes on a modern single-processor computer. Accelerated execution of this process promises to enhance efficiency and therefore promote routine use of image registration clinically. This paper presents details of a hardware architecture for real-time three-dimensional (3-D) image registration. Real-time performance can be achieved by setting up a network of processing units, each with three independent memory buses: one each for the two image memories and one for the mutual histogram memory. Memory access parallelization and pipelining, by design, allow each processing unit to be 25 times faster than a processor with the same bus speed, when calculating mutual information using partial volume interpolation. Our architecture provides superior per-processor performance at a lower cost compared to a parallel supercomputer.
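
The role of the mutual histogram in the abstract above can be made concrete with a small sketch that computes mutual information from a joint histogram of two aligned images. It assumes both images are already resampled onto the same grid and flattened to equal-length lists of integer intensities, so it uses plain voxel pairing rather than the partial volume interpolation used by the architecture. The function name `mutual_information` is hypothetical.

```python
import math
from collections import Counter


def mutual_information(image_a, image_b):
    """Mutual information (in bits) of two aligned, flattened images.

    image_a, image_b: equal-length sequences of integer intensities.
    """
    n = len(image_a)
    joint = Counter(zip(image_a, image_b))  # the mutual (joint) histogram
    hist_a = Counter(image_a)               # marginal histogram of image A
    hist_b = Counter(image_b)               # marginal histogram of image B

    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b) written with raw counts to avoid extra divisions.
        mi += p_ab * math.log2(count * n / (hist_a[a] * hist_b[b]))
    return mi
```

Every voxel pair costs one read from each image and one update of the joint histogram, which is the access pattern that motivates three independent memory buses per processing unit.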

10.
In this paper, we propose an efficient pipeline architecture for the DWT 9/7 filter defined in JPEG 2000. The proposed architecture is composed of column and row processors to perform the separable 2-D DWT. Based on the rescheduling DWT algorithm, we derive a new data flow graph to shorten the critical path. The proposed 1-D column processor requires fewer pipeline registers to achieve about the same critical path compared with other lifting-based architectures. For the row processor, the data dependency of each lifting step is reduced to only two computation nodes and therefore more pipeline registers can be applied to achieve higher processing speed without increasing the internal memory size in the 2-D case. That is, for an N × N image, it only requires 4N internal memory to perform the row-wise transform. For the memory bit-width analysis, we use software simulation to reduce the memory bit-width for various compression ratios. Since a portion of information from least significant bits of DWT coefficients would be discarded after EBCOT-tier2 processing, one can decrease the data width of internal memory to perform various compression ratios of JPEG 2000 coding, especially at low bit rates. Our simulation results suggest that it is practically possible to design an energy-aware memory architecture to further reduce the power consumption in future work.

11.
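刘军, 朱承强, 吴玺, 王伟, 任福继. Acta Electronica Sinica (《电子学报》), 2018, 46(3): 629-635
The memory-die stacking scheme and the redundancy-sharing strategy strongly affect the yield of three-dimensional (3-D) memories. To improve 3-D memory yield while reducing the number of TSVs needed for row and column redundancy, an adjacent-layer redundancy-sharing structure is proposed. With this structure, the row and column redundancies of each memory die can be used not only by that layer but also by the adjacent layers. Building on this structure, a new memory-die stacking scheme is proposed. By constructing selection constraints for memory dies, the scheme picks a suitable memory die at each step when stacking a 3-D memory, so that the row and column redundancies are fully used. Experimental results show that, compared with comparable methods reported internationally, the proposed scheme effectively improves 3-D memory yield and reduces the number of TSVs needed for row and column redundancy.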

12.
This paper considers two-dimensional (2-D) retiming, which is the problem of retiming circuits that operate on 2-D signals. We begin by discussing two types of parallelism available in 2-D data processing, which we call inter-iteration parallelism and inter-operation parallelism. We then present two novel techniques for 2-D retiming that can be used to extract inter-operation parallelism. These two techniques are designed to minimize the amount of memory required to implement a 2-D data-flow graph while maintaining a desired clock rate for the circuit. The first technique is based on an integer linear programming (ILP) formulation of the problem, and is called ILP 2-D retiming. This technique considers the entire 2-D retiming problem as a whole, but long central processing unit times are required if the circuit is large. The second technique, called orthogonal 2-D retiming, is a linear programming formulation which is derived by partitioning ILP 2-D retiming into two parts called s- and a-retiming. This technique finds a solution in polynomial time and is much faster than the ILP 2-D retiming technique, but the two subproblems (s- and a-retiming) can give results which are not compatible with one another. To solve this incompatibility problem, a variation of orthogonal 2-D retiming called integer orthogonal 2-D retiming is developed. This technique runs in polynomial time and the s-retiming and a-retiming steps are guaranteed to give compatible results. We show that the techniques presented in this paper can result in memory hardware savings of 50% compared to previously published 2-D retiming techniques.

13.
One major issue in designing image processors is to design a memory system that supports parallel access with a simple interconnection network. This paper presents an efficient memory allocation to minimize the number of memory modules and processing elements with a parallel access capability when multiple windows with arbitrary shapes are specified. This paper also presents an efficient search method based on regularity of window-type image processing. We give some practical examples including a stereo-matching processor for acquiring 3-D information, and an optical-flow processor for motion estimation. These examples show that the numbers of memory modules are reduced to 2.7% and 10%, respectively, in comparison with a basic approach. It is also shown that the search time is less than 1 ms for practical image sizes and window sizes.

14.
An architecture and a design for a high-speed CMOS digital convolver which can be used for real-time one-dimensional (1-D) and two-dimensional (2-D) signal processing are presented. In the 2-D mode this device can be used to convolve 10-bit image data with a 3×3 or 2×5 2-D eight-bit-per-coefficient impulse response at a throughput of 20 Msamples/s. In 1-D applications it can be used as a ten-tap finite-impulse response (FIR) filter. Devices can be cascaded to increase the order of the convolution reference in both dimensions.
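
The 2-D operation that such a convolver performs can be written out as a short reference model. This is a software sketch only: it assumes integer pixel and coefficient values held in nested lists, produces only the output samples where the kernel fully overlaps the image, and does not model the bit widths or pipelining of the device. The function name `convolve_2d` is hypothetical.

```python
def convolve_2d(image, kernel):
    """Direct 2-D convolution over the region where the kernel fits
    entirely inside the image (so the output is smaller than the input)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = [[0] * out_w for _ in range(out_h)]
    for r in range(out_h):
        for c in range(out_w):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    # Convolution flips the kernel in both dimensions.
                    acc += kernel[kh - 1 - i][kw - 1 - j] * image[r + i][c + j]
            out[r][c] = acc
    return out
```

The same routine accepts a 3×3 or a 2×5 kernel, the two 2-D shapes named in the abstract, and a 1×10 kernel reduces it to a ten-tap FIR filter applied along each row.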

15.
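To mark a target image on an arbitrary curved surface with high quality, the two-dimensional (2-D) data laid out on the computer must be converted into three-dimensional (3-D) data that define the actual machining trajectory of the galvanometer scanner. Based on extensive measurements and calculations from laser marking experiments, a region-focused least-squares conformal flattening (FPLSCM) algorithm, which flattens arbitrary free-form surfaces with minimal distortion, was adopted to design practical software for a 3-D galvanometer laser machining system. Theoretical analysis and experimental verification were carried out, and a large amount of marking measurement data was obtained. The results show that, within a range of 300 mm above and below along the z axis, the algorithm keeps the marking distortion caused by differences in height and in the shape of the surface to be machined to about 1%, and that good machining results were obtained on all kinds of 3-D curved surfaces. This work successfully realizes various kinds of surface machining on free-form 3-D surfaces using dynamic focusing, and the optimized algorithm effectively reduces distortion.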

16.
Multiview video coding (MVC) plays an important role in a 3-D video system. In addition, the resolution of HDTV is increasing to present more vivid perception for users. To realize real-time processing of dozens of TOPS, a VLSI solution is necessary. However, ultrahigh computational complexity, a large amount of external memory bandwidth and on-chip SRAM size, and complex MVC prediction structures are three main design challenges in implementing an MVC hardware architecture. In this paper, an MVC single-chip encoder is proposed for H.264/AVC Multiview High Profile and High Profile for 3-D and quad full high definition (QFHD) TV applications, respectively. The 4096 × 2160p multiview video encoder chip is implemented on an 11.46 mm² die with 90 nm CMOS technology. An eight-stage macroblock pipelined architecture with the proposed system scheduling and cache-based prediction core supports real-time processing from one-view 4096 × 2160p to seven-view 720p videos. The 212 Mpixels/s throughput is 3.4 to 7.7 times higher than previous work. A power efficiency of 407 Mpixels/W is achieved, and 94% of the on-chip SRAM size and 79% of the external memory bandwidth are saved by the proposed techniques.

17.
In this paper, a novel two-dimensional (2-D) fast algorithm for realization of the 4 × 4 forward integer transform in H.264 is proposed. Based on matrix operations with the Kronecker product and direct sum, the efficient fast 2-D 4 × 4 forward integer transform can be derived from the proposed one-dimensional fast 4 × 4 forward integer transform through matrix decompositions. The proposed fast 2-D 4 × 4 forward integer transform design does not need transpose memory for a direct parallel pipelined architecture. The fast 2-D 4 × 4 forward integer transform requires fewer latency delays than the state-of-the-art methods. With regular modularity, the proposed fast algorithm is suitable for VLSI implementation to achieve real-time H.264/advanced video coding (AVC) signal processing.
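
The link between the separable form of the transform and a Kronecker-product form, which is what removes the need for a transpose step, can be checked with a small sketch. It uses the 4 × 4 core matrix of the H.264 forward integer transform and omits the scaling that the standard folds into quantization. The helper names are hypothetical, and the sketch shows only the plain identity vec(C X Cᵀ) = (C ⊗ C) vec(X) with row-major flattening, not the paper's fast decomposition with direct sums.

```python
# Core matrix of the H.264 4x4 forward integer transform (scaling omitted).
CF = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_separable(x):
    """Row-column form: Y = CF * X * CF^T."""
    return matmul(matmul(CF, x), transpose(CF))


def kron(a, b):
    """Kronecker product of two nested-list matrices."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def forward_kronecker(x):
    """Single 16x16 matrix-vector product on the row-major flattened block."""
    vec = [value for row in x for value in row]
    big = kron(CF, CF)
    y = [sum(big[r][c] * vec[c] for c in range(16)) for r in range(16)]
    return [y[4 * r:4 * r + 4] for r in range(4)]
```

Both functions return the same 4 × 4 result for any integer block. The second form works on the whole block at once, so no intermediate row-pass result has to be stored and transposed.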

18.
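A novel high-efficiency reconfigurable multi-transform VLSI architecture for H.264 encoding and decoding   (Cited by: 3 total; self-citations: 3; citations by others: 0)
The H.264/AVC standard adopts a 4×4 integer transform. This paper proposes two new two-dimensional (2-D) direct signal flow graphs, one for the 4×4 forward transform and one for the inverse transform. On this basis, a reconfigurable high-performance 2-D architecture supporting multiple transforms is designed. The architecture requires no transpose registers. The circuit was implemented in a 0.18 μm CMOS process. The results show that the architecture is more efficient than existing typical architectures. Compared with an ASIC built from three separate single-transform architectures, the reconfigurable architecture achieves a large saving in chip area (61.1%) at the cost of a small drop in efficiency (14.4%). Operating at a clock frequency of 100 MHz, the circuit can process high-quality video sequences with a resolution of 4096×2048 at 60 frames per second in real time.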

19.
Three-dimensional (3-D) signal processing offers many advantages over two-dimensional (2-D) processing, because it preserves 3-D correlations. In this paper the design and the stability of 3-D rotated filters are considered. These filters are designed by rotating a one-dimensional (1-D) digital filter in 3-D space. The rotated filters are valuable in the design of various 3-D filters which possess prescribed spectral specifications. An efficient algorithm for the design of 3-D lowpass (LP) digital filters, with approximately spherically symmetric magnitude responses, is introduced. To achieve the desirable spectral characteristics, a number of 3-D rotated filters are cascaded. The stability of the spherically symmetric filters designed is considered, and stable realizations are proposed. The relation between the cut-off isopotential sphere of the 3-D filter and the cut-off frequency of the 1-D filter employed in the design is derived. Finally, configurations that result in highpass (HP) and bandpass (BP) filters are proposed. Examples of LP, HP, and BP filters, designed on the basis of the method proposed, are presented. This research was supported by the Public Benefit Foundation, Alexander S. Onassis.

20.
In this paper, a three-dimensional (3-D) memory array architecture is proposed. This new architecture is realized by stacking several cells in series vertically on each cell located in a two-dimensional array matrix. Therefore, this memory array architecture has a conventional horizontal row and column address and a new vertical row address. The total bit-line capacitance of a DRAM using the proposed architecture is suppressed to 37% of that of a normal DRAM when one bit-line has 1-Kbit cells and the same design rules are used. Moreover, the array area of a 1-Mbit DRAM using the proposed architecture is reduced to 11.5% of that of a normal DRAM using the same design rules. A DRAM using the proposed architecture can realize small bit-line capacitance and small array area simultaneously. Therefore, the proposed 3-D memory array architecture is suitable for future ultrahigh-density DRAM.
