首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 109 毫秒
1.
提出的基于代理服务器的自适应存储缓冲策略,通过自适应地存储已访问过的热点Web资源,缩短了网络传输的延迟和文档的检索时间,减轻了热点Web服务器的压力。  相似文献   

2.
杨梦梦  卢凯  卢锡城 《计算机工程》2005,31(16):80-82,109
阐述非一致性存储访问(NUMA)体系结构中存储结构的特点,分析该结构对操作系统存储管理子系统的影响,介绍针对NUMA存储结构特点在操作系统存储管理子系统的不连续内存支持、节点存储关系描述、存储资源分配等方面所作的工作和优化技术。通过实际系统的验证,文中所提出的技术方案较好地支持了NUMA系统复杂存储结构的管理需求,减少了存储访问延迟,提高了系统性能。  相似文献   

3.
嵌入式系统内存规划方法的研究   总被引:3,自引:0,他引:3  
内存访问延迟是嵌入式系统性能提高的瓶颈。本文以数据在内存的存储方式为出发点来解决内存访问延迟的问题,并用遗传算法实现了优化算法。  相似文献   

4.
为保证数据的完整性和可靠性,云存储中主要采用多副本和纠删码两种存储策略对数据进行冗余保存.针对单一冗余存储策略的不足,考虑存储开销和访问质量等方面因素,根据用户访问数据的规律,提出一种基于纠删码的动态副本冗余存储方案.采用RC纠删码来存储云中海量数据,使用曲线拟合预测访问热度,适时调整副本的数量.实验结果表明,该方案空间利用率高,能有效减小用户访问的平均延迟,提高用户访问的成功率.  相似文献   

5.
为改善应用Hadoop分布式文件系统存储大量小文件时效率低下的问题,将NameNode职责分离,使用单独的NFS服务器同步存储元数据信息,以降低Client数据请求压力,提供大吞吐量数据访问并改善访问延迟;设计文件与数据块的对应模式,允许在同一块中存储多个小文件,并对系统加以实现,为海量小文件的存储提供了一个有效的解决方案。实验结果表明,该机制可以在数据迅速增长的背景下实现海量小文件的高效存取。  相似文献   

6.
多维数据以线性形式在存储系统中进行访问操作,二维及以上维度空间中的相邻节点被不同的映射算法映射到一维空间的不相邻位置。高维空间中进行相邻节点访问时,其一维存储映射位置有着不同的访问距离和访问延迟。提出了基于空间填充曲线Z-Ordering的存储映射方法及其访问距离的度量指标,并和常规优先算法进行了对比,发现能更好地将高维相邻的数据节点簇集到一维存储位置,加强了局部性。调整缓存空间中用于预取的空间大小,可以利用增强的局部性,提高了缓存命中率。实验结果表明,改善了多维数据的访问速度,优化了系统性能。  相似文献   

7.
社交网络的庞大数据需求分布式存储,多个用户的数据分散存储在各个存储和计算节点上可以保持并行性和冗余性。如何在有限的分布式存储空间内高性能存储和访问用户数据具有现实意义。在当前的社交网络系统中,用户数据之间的读写操作会导致大量跨存储节点的远程访问。减少节点间的远程访问可以降低网络负载和访问延迟,提高用户体验。提出一种基于用户交互行为的动态划分复制算法,利用用户之间的朋友关系和评论行为描述社交网络的结构,周期性划分复制用户数据,从而提高本地访问率,降低网络负载。通过真实数据集验证,该算法相比随机划分和复制算法能够大大提升本地访问率,降低访问延迟。  相似文献   

8.
社交网站和电子商务等网络服务发展迅速,这类服务需要存储大量图片、音乐、微博文本等小文件。传统的分布式存储系统,如HDFS(Hadoop distributed file system),是面向大文件而设计的,在存储小文件时会产生元数据开销过大,访问延迟较高等问题,不能适应存储海量小文件的应用环境。分析了TFS(Taobao file system)的系统架构和读写流程,发现TFS在每次读/写过程中至少要建立3次网络连接,增大了读写延迟。针对海量小文件存储带来的挑战和TFS存在的问题,提出了一种新的低延迟、高可用的面向海量小文件的分布式存储方案,并实现了分布式文件系统SFFS(small-file file system)。性能测试表明,SFFS和TFS相比,写延迟降低了76.6%,读延迟降低了约10%。通过对系统结构的分析,相比于TFS,SFFS在中心节点的负载更轻,失效恢复更快,在可用性方面更有优势。  相似文献   

9.
数据存储是无线传感器网络的研究重点.本文分析无线传感器网络中数据存储和访问的相关代价,研究了网状拓扑结构网络中数据存放位置的选择问题.将数据存储问题抽象为传感器节点聚类问题,实现了三种基于聚类的分布式数据存储方法CBDS.为了能够降低能量消耗,CBDS依据生产者和消费者的位置信息、网络拓扑信息及数据速率计算数据存放位置,并且依据这些参数的变化自适应地调整数据存放位置.实验结果表明:CBDS较传统的数据存储方法不仅减少了能量消耗,延长了网络的生命周期,并且降低了访问延迟.  相似文献   

10.
多线程和向量技术相结合是当前微处理器设计的一个重要趋势.提出一种多线程向量处理器中向量数据存储结构,利用多线程切换来隐藏访存延迟,并让向量数据直接访问二级cache来提高带宽.模拟实验表明在所提出的存储结构下,访存带宽随线程数线性增长,向量数据访问带宽明显高于标量数据访问带宽.  相似文献   

11.
刘恒  杨小帆 《计算机应用研究》2012,29(10):3772-3775
动态内存管理的问题对无锁动态数据结构的性能尤为关键,因为多线程环境下的动态内存管理涉及开销较高的同步操作。提出一种构建用于动态无锁数据结构的内存池的方法来减少动态内存使用和与之相伴的动态内存管理开销。该方法通过平衡线程的动态内存消耗来减小内存开销,利用本方法构建的内存池基于线程私有的支持节点窃取的无锁循环队列。本方法具有以下优点:a)用本方法构建的内存池是无锁的;b)能够平衡线程的堆内存消耗;c)可以方便地与动态无锁数据结构集成。实验结果显示,用该方法构造的资源窃取型内存池扩展性较强,且能够在高负载下有效降低无锁数据结构的堆内存消耗和操作执行时间;平衡算法在很大程度上决定内存消耗量,内存池在高负载下的扩展性也受到它所用的数据结构自身多线程访问性能的影响。  相似文献   

12.
Since the development of the HP memristor, much attention has been paid to studies of memris- tive devices and applications, particularly memristor-based nonvolatile semiconductor memory. Owing to its unique properties, theoretically, one could restart a memristor-based computer immediately without the need for reloading the data. Further, current memories are mainly binary and can store only ones and zeros, whereas memristors have multilevel states, which means a single memristor unit can replace many binary transistors and realize higher-density memory. It is believed that memristors can also implement analog storage besides binary and multilevel information memory. In this paper, an implementation scheme for analog memristive memory is considered. A charge-controlled memristor model is derived and the corresponding SPICE model is constructed. Special write and read operations are demonstrated through numerical analysis and circuit simulations. In addition, an audio analog record/play system using a memristor crossbar array is designed. This system can provide great storage capacity (long recording time) and high audio quality with a simple small circuit structure. A series of computer simulations and analyses verify the effectiveness of the proposed scheme.  相似文献   

13.
14.
With the ever-growing storage density, high-speed, and low-cost data access, flash memory has inevitably become popular. Multi-level cell (MLC) NAND flash memory, which can well balance the data density and memory stability, has occupied the largest market share of flash memory. With the aggressive memory scaling, however, the reliability decays sharply owing to multiple interferences. Therefore, the control system should be embedded with a suitable error correction code (ECC) to guarantee the data integrity and accuracy. We proposed the pre-check scheme which is a multi-strategy polar code scheme to strike a balance between reasonable frame error rate (FER) and decoding latency. Three decoders namely binary-input, quantized-soft, and pure-soft decoders are embedded in this scheme. Since the calculation of soft log-likelihood ratio (LLR) inputs needs multiple sensing operations and optional quantization boundaries, a 2-bit quantized hard-decision decoder is proposed to outperform the hard-decoded LDPC bit-flipping decoder with fewer sensing operations. We notice that polar codes have much lower computational complexity compared with LDPC codes. The stepwise maximum mutual information (SMMI) scheme is also proposed to obtain overlapped boundaries without exhausting search. The mapping scheme using Gray code is employed and proved to achieve better raw error performance compared with other alternatives. Hardware architectures are also given in this paper.  相似文献   

15.
Flash memory is becoming a major database storage in building embedded systems or portable devices because of its non-volatile, shock-resistant, power-economic nature, and fast access time for read operations. Flash memory, however, should be erased before it can be rewritten and the erase and write operations are very slow as compared to main memory. Due to this drawback, traditional database management schemes are not easy to apply directly to flash memory database for portable devices. Therefore, we improve the traditional schemes and propose a new scheme called flash two phase locking (F2PL) scheme for efficient transaction processing in a flash memory database environment. F2PL achieves high transaction performance by exploiting the notion of the alternative version coordination which allows previous version reads and efficiently handles slow write/erase operations in lock management processes. We also propose a simulation model to show the performance of F2PL. Based on the results of the performance evaluation, we conclude that F2PL scheme outperforms the traditional schemes.  相似文献   

16.
The static specification of operations executed in parallel using No Operations (NOPs) is another culprit to make code size to be increased in VLIW architecture. Some alternatives in the instruction encoding and memory subsystem are proposed to minimize the impact of NOP on the code size. One is the compressed cache using the packed encoding scheme and the other is the decompressed cache using the unpacked encoding scheme. The compressed cache shows high memory utilization but increases the pipeline branch penalty because it requires very complex fetch hardware. On the contrary, the fetch overhead can be decreased in the decompressed cache because the unpacked encoding scheme allows an instruction to be issued to the pipeline without any recovery process. However, it has a shortcoming that the memory utilization is deteriorated due to the memory allocation irrespective of the number of useful operations. In this research, a new instruction encoding scheme called a semi-packed encoding scheme and the section cache, which enables effective store and retrieval of semi-packed instructions, are proposed. This can decrease the hardware complexity to fetch an instruction and the wasted memory space due to NOPs via the partially fixed length of an instruction. The experimental results reveal that the memory utilization in the section cache is 3.4 times higher than in the decompressed cache. The memory subsystem using the section cache can provide about 15% performance improvement with the moderate size of chip area.  相似文献   

17.
This paper develops a novel, compressed B+-tree based indexing scheme that supports the processing of moving objects in one-, two-, and multi- dimensional spaces. The past, current, and anticipated future trajectories of movements are fully indexed and well organized. No parameterized functions and geometric representations are introduced in our data model so that update operations are not required and the maintenance of index structures can be accomplished by basic insertion and deletion operations. The proposed method has two contributions. First, the spatial and temporal attributes of trajectories are accurately preserved and well organized into compact index structures with very efficient memory space utilization and storage requirement. Second, index maintenance overheads are more economical and query performance is more responsive than those of conventional methods. Both analytical and empirical studies show that our proposed indexing scheme outperforms the TPR-tree.  相似文献   

18.
We present a geometry compression scheme for restricted quadtree meshes and use this scheme for the compression of adaptively triangulated digital elevation models (DEMs). A compression factor of 8–9 is achieved by employing a generalized strip representation of quadtree meshes to incrementally encode vertex positions. In combination with adaptive error-controlled triangulation, this allows us to significantly reduce bandwidth requirements in the rendering of large DEMs that have to be paged from disk. The compression scheme is specifically tailored for GPU-based decoding, since it minimizes dependent memory access operations. We can thus trade CPU operations and CPU–GPU data transfer for GPU processing, resulting in twice faster streaming of DEMs from main memory into GPU memory. A novel storage format for decoded DEMs on the GPU facilitates a sustained rendering throughput of about 300 million triangles per second. Due to these properties, the proposed scheme enables scalable rendering with respect to the display resolution independent of the data size. For a maximum screen-space error below 1 pixel it achieves frame rates of over 100 fps, even on high-resolution displays. We validate the efficiency of the proposed method by presenting experimental results on scanned elevation models of several hundred gigabytes.  相似文献   

19.
NAND flash memory is a promising storage media that provides low-power consumption, high density, high performance, and shock resistance. Due to these versatile features, NAND flash memory is anticipated to be used as storage in enterprise-scale systems as well as small embedded devices. However, unlike traditional hard disks, flash memory should perform garbage collection that consists of a series of erase operations. The erase operation is time-consuming and it usually degrades the performance of storage systems seriously. Moreover, the number of erase operations allowed to each flash memory block is limited. This paper presents a new garbage collection scheme for flash memory based storage systems that focuses on reducing garbage collection overhead, and improving the endurance of flash memory. The scheme also reduces the energy consumption of storage systems significantly. Trace-driven simulations show that the proposed scheme performs better than various existing garbage collection schemes in terms of the garbage collection time, the number of erase operations, the energy consumption, and the endurance of flash memory.  相似文献   

20.
The roofline model not only provides a powerful tool to relate an application's performance with the specific constraints imposed by the target hardware but also offers a graphic representation of the balance between memory access cost and compute throughput. In this work, we present a strategy to break up the tight coupling between the precision format used for arithmetic operations and the storage format employed for memory operations. (At a high level, this idea is equivalent to compressing/decompressing the data in registers before/after invoking store/load memory operations.) In practice, we demonstrate that a “memory accessor” that hides the data compression behind the memory access, can virtually push the bandwidth-induced roofline, yielding higher performance for memory-bound applications using high precision arithmetic that can handle the numerical effects associated with lossy compression. We also demonstrate that memory-bound applications operating on low precision data can increase the accuracy by relying on the memory accessor to perform all arithmetic operations in high precision. In particular, we demonstrate that memory-bound BLAS operations (including the sparse matrix-vector product) can be re-engineered with the memory accessor and that the resulting accessor-enabled BLAS routines achieve lower rounding errors while delivering the same performance as the fast low precision BLAS.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号