Similar Articles
20 similar articles found.
1.
The NTS cache extends a direct-mapped cache with a small fully associative cache that holds blocks predicted to have non-temporal locality. It differs from a victim cache in that there is no direct data path between the two caches, which keeps the structure simple to design and low in power. This paper proposes an improved NTS cache design, the selective conflict-prediction (SCP) cache. The SCP cache uses a conflict-prediction algorithm to monitor the locality of blocks in the cache and selectively fills each block into either the direct-mapped cache or the fully associative cache. Simulation results show that an SCP cache outperforms an NTS cache of comparable capacity.

2.
This paper proposes and implements a 5-bit, bit-multiplexed approximate-LRU algorithm for a 4-way set-associative cache design [1] that reduces circuit complexity and saves valid-RAM space. The basic approach orders the four ways by access recency through bit comparisons and multiplexes the valid bits with the comparison bits. The replacement-selection logic on a miss is given, together with test results from a VHDL implementation. The results show that the circuit is simple, occupies little area, and achieves a high hit rate: in an instruction-cache design, the measured average hit rate is 90.2% for a 1 kB cache, 92.3% for 4 kB, and 94.2% for 16 kB.
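The bit-comparison replacement idea can be illustrated with a standard tree-based pseudo-LRU sketch for one 4-way set. Note this is a generic 3-bit tree-PLRU approximation for illustration only, not the paper's 5-bit bit-multiplexed circuit:

```python
class TreePLRU4:
    """Tree-based pseudo-LRU for one 4-way set, using 3 bits.
    b[0] selects the pair (0 -> ways 0/1, 1 -> ways 2/3);
    b[1] and b[2] select the way within the left/right pair."""

    def __init__(self):
        self.b = [0, 0, 0]

    def touch(self, way):
        # On an access, point every bit on the path *away* from the
        # accessed way, so victim() walks toward colder ways.
        if way < 2:
            self.b[0] = 1            # right pair is now "older"
            self.b[1] = 1 - way      # other way in the left pair
        else:
            self.b[0] = 0            # left pair is now "older"
            self.b[2] = 1 - (way - 2)

    def victim(self):
        # Follow the bits toward the pseudo-least-recently-used way.
        if self.b[0] == 0:
            return self.b[1]         # way 0 or 1
        return 2 + self.b[2]         # way 2 or 3
```

Like the paper's scheme, this trades exact LRU ordering for a small, fast bit circuit; the victim is only approximately the least recently used way.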

3.
0323600 A cache organization scheme using conflict prediction [J] / Li Xiaoming // Acta Electronica Sinica, 2003, 31(5): 724-727 (L)
0323601 An operating-system-centric memory consistency model: the thread consistency model [J] / Dai Huadong // Journal of Computer Research and Development, 2003, 40(2): 351-359 (E)
0323602 Design and implementation of virtual network storage under Windows [J] / Ding Zhen // Computer Engineering, 2003, 29(5): 163-165 (E)

4.
This paper proposes a Wu-Manber (WM) variant based on a randomized fingerprint model (Randomized Fingerprint WM, RFPWM), which computes a unique fingerprint for each pattern to effectively reduce the false-positive rate. Compared with the WM algorithm, RFPWM greatly reduces the hash-collision rate and raises the hit rate, and the effect is even more pronounced on very large pattern sets. Experimental results show that the algorithm matches more efficiently than the traditional WM algorithm, and its advantage grows with the size of the pattern set.
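The fingerprint-gated verification idea can be sketched as follows. This is a hedged illustration only: the toy `fingerprint` hash, the bucketing on the last `block` characters, and the omission of WM's shift table are all simplifications of my own, not the paper's construction:

```python
def fingerprint(s):
    # Toy polynomial hash standing in for the paper's randomized,
    # per-pattern (near-)unique fingerprint.
    h = 0
    for c in s:
        h = (h * 131 + ord(c)) & 0xFFFFFFFF
    return h

def build_index(patterns, block=3):
    # WM-style bucket on each pattern's last `block` characters,
    # storing (length, fingerprint, pattern) for cheap verification.
    index = {}
    for p in patterns:
        index.setdefault(p[-block:], []).append((len(p), fingerprint(p), p))
    return index

def search(text, patterns, block=3):
    """Report (start, pattern) for every pattern occurrence in text.
    A candidate is fully compared only after its fingerprint matches,
    which is what cuts the false-positive verifications down."""
    index = build_index(patterns, block)
    hits = []
    for i in range(block, len(text) + 1):
        for plen, fp, p in index.get(text[i - block:i], []):
            start = i - plen
            if start >= 0 and fingerprint(text[start:i]) == fp:
                if text[start:i] == p:   # fingerprints rarely collide
                    hits.append((start, p))
    return hits
```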

5.
Password-guessing attacks are the most direct way to gain access to an information system, and a password dictionary generated with a suitable method can accurately assess the security of a system's password set. This paper proposes a dictionary-generation method for Chinese password sets (CSNN). The method treats each complete Hanyu Pinyin syllable as a single element and uses the rules of pinyin to segment and process passwords structurally. The processed passwords are fed into a long short-term memory (LSTM) network for training, and the trained model generates the password dictionary. The effectiveness of CSNN is evaluated with hit-rate experiments, comparing it against two classic password-generation methods, probabilistic context-free grammar (PCFG) and a 5th-order Markov chain model, over dictionaries of different sizes. The results show that dictionaries generated by CSNN outperform both baselines overall. Compared with PCFG at 10^7 guesses, the CSNN dictionary improves the hit rate on different test sets by 5.1%-7.4% (6.3% on average); compared with the 5th-order Markov chain model at 8×10^5 guesses, it improves the hit rate by 2.8%-12% (8.2% on average).
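The pinyin-unit segmentation step can be sketched as a greedy longest-match tokenizer. The tiny `SYLLABLES` set below is a hypothetical stand-in for the full inventory of roughly 400 pinyin syllables, and the paper's exact segmentation rules may differ:

```python
# Hypothetical mini inventory; real Hanyu Pinyin has ~400+ syllables.
SYLLABLES = {"zhang", "wang", "li", "liu", "chen", "wei", "min", "hua", "an", "a"}

def tokenize(password):
    """Greedy longest-match split: each complete pinyin syllable becomes
    one token (CSNN's 'whole syllable as one element' idea); anything
    unmatched falls back to one token per character."""
    tokens, i = [], 0
    while i < len(password):
        # Try the longest candidate first (pinyin syllables are <= 6 letters).
        for j in range(min(len(password), i + 6), i, -1):
            if password[i:j].lower() in SYLLABLES:
                tokens.append(password[i:j])
                i = j
                break
        else:
            tokens.append(password[i])
            i += 1
    return tokens
```

Training an LSTM on such syllable-level tokens, rather than raw characters, is what lets the model capture pinyin structure in Chinese passwords.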

6.
Exploiting the frame structure of TDD-CDMA systems, a signal-to-interference ratio (SIR) estimation method based on orthogonal despreading is proposed. Building on this estimator, three easily implemented SIR prediction algorithms are designed: direct prediction, linear-interpolation prediction, and Kalman-filter prediction. Theoretical analysis and simulation results show that the orthogonal-despreading SIR estimator performs well under different channel conditions and transmission modes, and that the direct prediction algorithm offers the best overall performance of the three.
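The three predictor families can be sketched on a series of SIR estimates. This is an illustrative sketch assuming a scalar random-walk state model for the Kalman variant; the paper's actual state model and parameters are not given here:

```python
def predict_direct(history):
    # Direct prediction: the next SIR equals the most recent estimate.
    return history[-1]

def predict_linear(history):
    # Linear extrapolation from the last two estimates.
    return 2 * history[-1] - history[-2]

def kalman_predict(measurements, q=0.01, r=1.0):
    """Scalar Kalman filter with a random-walk model:
    x_k = x_{k-1} + w (process var q), z_k = x_k + v (measurement var r).
    Returns the one-step-ahead prediction after the last measurement."""
    x, p = measurements[0], 1.0
    for z in measurements[1:]:
        p += q                 # time update: predict state variance
        k = p / (p + r)        # Kalman gain
        x += k * (z - x)       # measurement update of the state
        p *= (1 - k)           # updated estimate variance
    return x                   # random-walk model: prediction = filtered state
```

Direct prediction is the cheapest, which is consistent with the abstract's finding that it gives the best overall trade-off.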

7.
A monolithically integrated, high-performance piezoresistive triaxial high-g accelerometer with a measurement range up to 10^5 g was designed, fabricated, and tested. The x- and y-axis units use a three-beam mass structure with micro-beams, and the z-axis unit uses a three-beam twin-island structure. Compared with a conventional single-cantilever or cantilever-mass structure, both structures combine high sensitivity with a high resonant frequency. Structural analysis and design optimization were performed with ANSYS. Fabrication of the middle structural layer mainly involves an integrated piezoresistor process and double-sided deep ICP etching; anodic bonding to a glass substrate and BCB bonding of an upper cap form a three-layer stack suitable for plastic packaging, improving the accelerometer's reliability. The packaged accelerometer was tested by the dropped-bar method: the three axes show sensitivities of 2.28, 2.36, and 2.52 μV/g and resonant frequencies of 309, 302, and 156 kHz. Using a Dongling shock test bench and the comparison calibration method, the nonlinearities of the y- and z-axis accelerometers were measured as 1.4% and 1.8%.

9.
Among the key techniques adopted in the H.264/AVC video compression standard, multiple-reference-frame prediction increases the probability of finding the best matching block and thus greatly improves coding efficiency. Inspired by the direct prediction mode of B frames, a new multiple-reference-frame prediction method based on extended frames is proposed: a new interpolated frame is derived from the co-located blocks in the multiple reference frames and their reference blocks, increasing the temporal resolution of the original sequence. Because the extended frame is closer to the current frame, the probability of finding the best match in motion estimation rises, which in turn improves coding efficiency. Simulation results confirm that the method outperforms the H.264/AVC reference software in coding performance.

10.
Zhang Jiangshan, Zhu Guangxi. Microelectronics, 2002, 32(2): 113-116
A new multilevel motion-estimator architecture is proposed that supports the advanced prediction modes of low-bit-rate video coders such as H.263 and MPEG-4. All levels of the VLSI architecture share a single basic search unit (BSU), reducing chip size. In addition, because the architecture provides a memory-data-flow control circuit for computing the sum of absolute differences (SAD) of 8×8 blocks, the advanced prediction modes can obtain one macroblock motion vector and the four sub-block motion vectors of each macroblock simultaneously. This compact motion-estimation circuit achieves coding quality similar to the full-search block-matching algorithm (FSBMA).

11.
We present a high performance cache structure with a hardware prefetching mechanism that enhances exploitation of spatial and temporal locality. Temporal locality is exploited by selectively moving small blocks into the direct-mapped cache after monitoring their activity in the spatial buffer. Spatial locality is enhanced by intelligently prefetching a neighboring block when a spatial buffer hit occurs. We show that the prefetch operation is highly accurate: over 90% of all prefetches generated are for blocks that are subsequently accessed. Our results show that the system enables the cache size to be reduced by a factor of four to eight relative to a conventional direct-mapped cache while maintaining similar performance.

12.
Power consumption is an increasingly pressing problem in modern processor design. Since the on-chip caches usually consume a significant amount of power, they are one of the most attractive targets for power reduction. This paper presents a two-level filter scheme, consisting of an L1 and an L2 filter, to reduce the power consumption of the on-chip cache. The main idea of the proposed scheme is motivated by the substantial number of unnecessary activities in conventional cache architectures. We use a single block buffer as the L1 filter to eliminate unnecessary cache accesses. For the L2 filter, we then propose a new sentry-tag architecture to further filter out unnecessary way activities when the L1 filter misses. We use SimpleScalar to simulate the SPEC2000 benchmarks and perform HSPICE simulations to evaluate the proposed architecture. Experimental results show that the two-level filter scheme effectively reduces cache power consumption by eliminating most unnecessary cache activities, while the performance penalty is negligible. Compared to a conventional instruction cache (32 kB, two-way) implemented with only the L1 filter, the two-level filter yields roughly a 30% reduction in total cache power consumption. Similarly, compared to a conventional data cache (32 kB, four-way) implemented with only the L1 filter, the total cache power reduction is approximately 46%.
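The L1 block-buffer filter can be sketched in software. `CacheStub` and the `access` interface here are hypothetical stand-ins; the real savings are in tag/data-array switching activity, which this sketch models simply as lookup counts:

```python
class CacheStub:
    """Stand-in for the real cache: just counts tag/data-array lookups."""
    def __init__(self):
        self.lookups = 0

    def access(self, block):
        self.lookups += 1

class BlockBufferFilter:
    """L1 filter: a single block buffer in front of the cache. Repeated
    accesses to the currently buffered block are served without any
    cache activity; only a buffer miss reaches the cache arrays."""
    def __init__(self, cache, block_size=32):
        self.cache = cache
        self.block_size = block_size
        self.buffered_block = None

    def read(self, addr):
        block = addr // self.block_size
        if block == self.buffered_block:
            return "buffer"          # filtered: no cache power spent
        self.cache.access(block)     # buffer miss: normal cache lookup
        self.buffered_block = block
        return "cache"
```

Sequential code with good spatial locality hits the buffer repeatedly, which is why this single-entry filter removes so many cache activations.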

13.
A single 3.3-V only, 8-Gb NAND flash memory with the smallest chip to date, 98.8 mm², has been successfully developed. This is the world's first integrated semiconductor chip fabricated with 56-nm CMOS technologies. The effective cell size including the select transistors is 0.0075 μm² per bit, which is the smallest ever reported. To decrease the chip size, a very efficient floor plan with one-sided row decoder, one-sided page buffer, and one-sided pad is introduced. As a result, an excellent 70% cell area efficiency is realized. The program throughput is drastically improved to twice as large as previously reported and comparable to binary memories. The best ever 10-MB/s programming is realized by increasing the page size from 4 kB to 8 kB. In addition, noise cancellation circuits and the dual VDD-line scheme realize both a small die size and fast programming. An external page copy achieves a fast 93-ms block copy, efficiently using a 1-MB block size.

14.
An aggressive drowsy cache block management, where the cache block is forced into drowsy mode all the time except during write and read operations, is proposed. The word line (WL) is used to enable the normal supply voltage (V_DD_high) to the cache line only when it is accessed for read or write, whereas the drowsy supply voltage (V_DD_low) is enabled to the cache cell otherwise. The proposed block management needs neither extra cycles nor extra control signals to wake the drowsy cache cell, thereby reducing the performance penalty associated with traditional drowsy caches. In fact, the proposed aggressive drowsy mode can reduce the total power consumption of the traditional drowsy mode by 13% or even more, depending on the cache access rate, access frequency, and the CMOS technology used.

15.
Ling Ming, Wu Jianping, Zhang Yang, Mei Chen, Zhai Tingting. Microelectronics, 2012, 42(1): 102-106, 129
A reconfigurable cache architecture can automatically adjust the cache structure to a program's memory-resource demands, which is significant for optimizing system energy. An instruction-cache architecture with reconfigurable capacity and associativity is designed, together with a corresponding efficient adaptive reconfiguration algorithm. Validated on eight benchmarks selected from MiBench and MediaBench, the proposed adaptive reconfigurable cache reduces total system energy by 10.51% on average compared with a fixed 16 kB four-way set-associative instruction cache, with an average performance loss of only 0.34%.

16.
The design of a second-level cache chip with the most suitable architecture for shared-bus multiprocessing is described. This chip supports high-speed (160-MB/s) burst transfer between multilevel caches and a newly proposed cache-consistency protocol. The chip, which supports a 50-MHz CPU and uses 0.8 μm CMOS technology, includes a 32 kB data memory, a 42 kb tag memory, and 21.7 K-gate logic.

17.
This paper presents a new data cache design, cache-processor coupling, which tightly binds an on-chip data cache with a microprocessor. Parallel architectures and high-speed circuit techniques are developed to speed up the address-handling process associated with accessing the data cache. These architectures and circuit techniques reduce the address-handling time by 51%. In addition, newly proposed instructions increase data cache bandwidth by eight times. Excessive power consumption due to the wide-bandwidth data transfer is carefully avoided by newly developed circuit techniques, which reduce dissipation power per bit to 1/26. Simulation of the proposed architecture and circuit techniques yields a 1.8 ns delay each for address handling, cache access, and register access for a 16-kilobyte direct-mapped cache with a 0.4 μm CMOS design rule.

18.
In this paper, we investigate an incentive edge caching mechanism for an internet of vehicles (IoV) system based on the paradigm of software-defined networking (SDN). We start by proposing a distributed SDN-based IoV architecture. Then, based on this architecture, we focus on the economic side of caching by considering a competitive cache-enablers market composed of one content provider (CP) and multiple mobile network operators (MNOs). Each MNO manages a set of cache-enabled small base stations (SBS). The CP incites the MNOs to store its popular contents in the cache-enabled SBSs with the highest access probability to enhance the satisfaction of its users. By leasing their cache-enabled SBSs, the MNOs aim to make more monetary profit. We formulate the interaction between the CP and the MNOs as a Stackelberg game, where the CP acts first as the leader by announcing the popular content quantity that it wishes to cache and fixing the caching popularity threshold, a minimum access probability below which a content cannot be cached. The MNOs then act as followers, responding with the content quantity they accept to cache and the corresponding caching price. A noncooperative subgame is formulated to model the competition between the followers over the CP's limited content quantity. We analyze the leader's and the followers' optimization problems, and we prove the Stackelberg equilibrium (SE). Simulation results show that our game-based incentive caching model achieves optimal utilities and outperforms other incentive caching mechanisms with monopoly cache-enablers, while improving user satisfaction by 30% and reducing the caching cost.

19.
Update-Based Cache Access and Replacement in Wireless Data Access (cited 1 time: 0 self-citations, 1 external)
Caching has been applied to wireless data access with different replacement policies in wireless networks. Most current cache replacement schemes are access-based replacement policies, since they are based on object access frequency/recency information. Access-based replacement policies either ignore or do not focus on update information. However, update information is extremely important, since it can render access information almost useless. In this paper, we consider two fundamental and strongly consistent access algorithms: poll-per-read (PER) and call-back (CB). We propose a server-based PER (SB-PER) cache access mechanism, in which the server makes replacement decisions, and a client-based CB cache access mechanism, in which clients make replacement decisions. Both mechanisms are designed to use both update frequency and access frequency. We further propose two update-based replacement policies: least access-to-update ratio (LA2U) and least access-to-update difference (LAUD). We provide a thorough performance analysis via extensive simulations, evaluating these algorithms in terms of access rate, update rate, cache size, database size, object size, etc. Our study shows that although effective hit ratio is a better metric than cache hit ratio, it is a worse metric than transmission cost, and a higher effective hit ratio does not always mean a lower cost. In addition, the proposed SB-PER mechanism is better than the original PER algorithm in terms of effective hit ratio and cost, and the update-based policies outperform access-based policies in most cases.
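The two update-based policies reduce to a simple victim selection over per-object statistics. A minimal sketch, assuming a hypothetical map of object name to (access_freq, update_freq):

```python
def la2u_victim(objects):
    """LA2U: evict the object with the least access-to-update *ratio*.
    `objects` maps name -> (access_freq, update_freq); a frequently
    updated but rarely read object is the best eviction candidate,
    since each update invalidates its cached copy anyway."""
    return min(objects, key=lambda o: objects[o][0] / objects[o][1])

def laud_victim(objects):
    # LAUD: least access-to-update *difference* instead of ratio.
    return min(objects, key=lambda o: objects[o][0] - objects[o][1])
```

The two policies can pick different victims: the ratio favors evicting objects whose updates dominate proportionally, while the difference favors those whose updates dominate in absolute terms.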

20.
Based on the traditional decimation-in-frequency fast Fourier transform (FFT) algorithm and a two-dimensional FFT algorithm, a high-precision large-point FFT processor is designed. The processing unit uses a single state machine to control the entire computation flow: applying the one-dimensional FFT algorithm for small point counts and the two-dimensional FFT algorithm for large point counts, the processor intelligently selects the appropriate processing flow and buffer management and completes the whole FFT automatically, without software intervention. On top of its support for the large-point two-dimensional FFT, the design also optimizes the twiddle-factor computation to improve precision at large point counts, achieving a signal-to-noise ratio above 130 dB for 4M-point input sequences.
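The large-point-as-2D decomposition the abstract describes is the classic Cooley-Tukey four-step scheme. A numpy sketch under that assumption; the processor's fixed-point pipeline, state machine, and buffer management are not modeled:

```python
import numpy as np

def fft_2d_decomposed(x, n1, n2):
    """Four-step FFT: an N-point FFT (N = n1*n2) computed as n2
    column FFTs of length n1, a twiddle-factor multiply, then n1 row
    FFTs of length n2. The twiddle stage is where precision matters
    most for large N, which is what the processor optimizes."""
    n = n1 * n2
    a = np.asarray(x, dtype=complex).reshape(n1, n2)
    a = np.fft.fft(a, axis=0)                     # length-n1 column FFTs
    k1 = np.arange(n1).reshape(n1, 1)             # output index within columns
    k2 = np.arange(n2).reshape(1, n2)             # input index within rows
    a *= np.exp(-2j * np.pi * k1 * k2 / n)        # twiddle factors W_N^(k1*k2)
    a = np.fft.fft(a, axis=1)                     # length-n2 row FFTs
    return a.T.reshape(-1)                        # X[k] with k = k2*n1 + k1
```

Each sub-FFT touches only n1 or n2 points at a time, which is what lets a processor with limited on-chip buffering handle multi-megapoint transforms.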
