期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

崔斌吕雁飞陈学轩《计算机应用》2010,30(1):1-4

闪存容量的增大使在其上构建大型系统成为可能,如何构建闪存数据库也成为数据库的热点研究领域之一。索引结构是数据库中必不可少的结构之一,而B+树是最广泛使用的索引结构。这里对存储在闪存芯片模拟器及固态硬盘上的B+树性能进行了测试及分析。首先介绍了闪存的IO特点,并测试了固态硬盘的基本IO特性。接着,对B+树的插入和查询效率进行了详细地测试。测试发现节点大小,缓存大小,以及数据值的分布方式都会对B+树的性能带来很大影响。例如由于闪存的读取速度不对称,闪存的更新和查询操作最优块大小相差较大。这些测试结果为更好地在闪存上使用B+树索引,并进一步设计出更适合闪存的索引提供了指导。相似文献

2.

非易失性存储器件的性能、可靠性及应用

杜亚娟金凯伦王子烨宁新杰《集成技术》2022,11(3):42-55

随着大数据和人工智能应用的发展,数据呈现爆发式增长,对数据存储的需求日益加剧。传统内存技术的容量已经接近其物理存储密度的极限,而非易失性存储器具有按字节寻址、能耗低、读写速度快等优良特性,有望替代传统的动态随机存储器或磁盘技术。然而,该介质本身也存在一些不足,如使用寿命有限、读写速度不对称、磨损不均衡和错误来源多样等缺点。该文通过阐述常见非易失性存储器的存储原理,调研并总结了一些现有改进技术。相似文献

3.

面向非易失内存的异构索引

刘睿诚张俊晨罗永平金培权《软件学报》2022,33(3):832-848

非易失内存(non-volatile memory,NVM)为数据存储与管理带来新的机遇,但同时也要求已有的索引结构针对NVM的特性进行重新设计.围绕NVM的存取特性,重点研究了树形索引在NVM上的访问、持久化、范围查询等操作的性能优化,并提出了一种上下两层结构的异构索引HART.该索引结合了B+树与Radix树的特点... 相似文献

4.

基于新型存储器件的分布式文件系统性能优化

董聪张晓程文迪石佳《计算机应用》2020,40(12):3594-3603

新型存储器件的I/O性能通常比传统固态驱动器(SSD)高一个数量级,然而使用新型存储器件的分布式文件系统相对于使用SSD的分布式文件系统性能并没有显著的提高,这说明目前的分布式文件系统并不能充分发挥新型存储器件的性能.针对这个问题,对Hadoop分布式文件系统(HDFS)的数据写入流程及传输过程进行了量化分析.通过量化... 相似文献

5.

下载免费PDF全文

安世通胡海波李宇徐建良《计算机科学技术学报》2010,25(3):509-522

With the rapid increasing capacity of flash memory, flash-aware indexing techniques are highly desirable for flash devices. The unique features of flash memory, such as the erase-before-write constraint and the asymmetric read/write cost, severely deteriorate the performance of the traditional B+-tree algorithm. In this paper, we propose an optimized indexing method, called lazy-update B+-tree, to overcome the limitations of flash memory. The basic idea is to defer the committing of update requests to the B... 相似文献

6.

ZB⁺-tree：一种ZNS SSD感知的新型索引结构

下载免费PDF全文

刘扬金培权《计算机研究与发展》2023,60(3):509-524

ZNS SSD是近年来提出的一种新型固态硬盘（solid state drive,SSD）,它以分区（Zone）的方式管理和存取SSD内的数据.相比于传统SSD,ZNS SSD可以有效提升SSD的读写吞吐,降低写放大,减少SSD的预留空间.但是,ZNS SSD要求Zone内必须采用顺序写模式,并且Zone上的空间分配、垃圾回收等任务都需要用户自行控制.ZNS SSD的这些特性对于传统数据库系统的存储管理、索引、缓存等技术均提出了新的挑战.针对如何使传统的B⁺-tree索引结构适配ZNS SSD的问题,提出了一种ZNS SSD感知的新型索引结构——ZB⁺-tree （ZNS-aware B⁺-tree）.ZB⁺-tree是目前已知的首个ZNS SSD感知的索引,它以B⁺-tree为基础,利用ZNS SSD内部支持少量随机写的常规Zone（conventional zone,Cov-Zone）和只支持顺序写的顺序Zone（sequential zone,Seq-Zone）,通过常规Zone来吸收对ZNS SSD的随机写操作.ZB⁺-tree将索引节点分散存储在常规Zone和顺序Zone中,并为2种Zone内的节点分别设计了节点结构,使ZB⁺-tree不仅能够吸收对索引的随机写操作,而且又可以保证顺序Zone内的顺序写要求.在实验中利用null_blk和libzbd模拟ZNS SSD设备,并将现有的CoW B⁺-tree修改后作为对比索引.结果表明,ZB⁺-tree在运行时间、空间利用率等多个指标上均优于CoW B⁺-tree. 相似文献

7.

基于非易失性存储器的存储引擎性能优化

王海涛李战怀张晓赵晓南《集成技术》2022,11(3):56-70

非易失性存储器具有接近内存的读写速度,可利用其替换传统的存储设备,从而提升存储引擎的性能。但是,传统的存储引擎通常使用通用块接口读写数据,导致了较长的 I/O 软件栈,增加了软件层的读写延迟,进而限制了非易失性存储器的性能优势。针对这一问题,该文以 Ceph 大数据存储系统为基础,研究设计了基于非易失性存储器的新型存储引擎 NVMStore,通过内存映射的方式访问存储设备,根据非易失性存储器的字节可寻址和数据持久化特性,优化数据读写流程,从而减小数据写放大以及软件栈的开销。实验结果表明,与使用非易失性存储器的传统存储引擎相比,NVMStore能够显著提升 Ceph 的小块数据读写性能。相似文献

8.

基于非易失性内存的持久化嵌入式内存数据库

下载免费PDF全文

沙行勉陈咸彰马殿龙诸葛晴凤《软件学报》2016,27(S2):320-327

随着近年来嵌入式应用的复杂化和多样化,工业界和学术界提出来用内存数据库满足嵌入式系统对数据处理性能不断提升的要求.然而,现有的内存数据库需要在磁盘或闪存等外存上持久化存储真实的数据库备份,并且以I/O操作的方式将数据库的更新操作同步回外存,有极大的性能开销.此外,这类数据库即便直接部署在新型非易失性内存（non-volatile memory,简称NVM)中,也因为缺乏内存中的持久化机制而不能脱离外存.针对现有内存数据库的不足,提出一套面向NVM的持久化内存数据库设计方案.该方案直接用数据库独立管理NVM,持久化存储NVM的空间信息以及内存数据库的元数据.依据该方案,在典型的内存数据库Redis的基础上实现了可在NVM上持久化的内存数据库.实验结果表明,该方案与既有Redis的持久化方案AOF相比,数据库的启动速度可提高2 400倍,关闭速度可提高5倍,set操作的速度可提高58倍,delete操作的速度可提升34倍. 相似文献

9.

一种wandering B+ tree问题解决方法

下载免费PDF全文

杨勇鹏蒋德钧《计算机研究与发展》2023,60(3):539-554

为了应对磁盘和固态硬盘随机写和顺序写性能差异较大的问题,文件系统和块存储系统通常采用日志结构（log-structured）技术将随机写转换为顺序写. 因此,对于日志结构存储系统数据和元数据的修改都以异地写的方式执行. 在日志结构存储系统中,B+ tree常被用于管理元数据,这就会导致wandering B+ tree问题,即树结点异地更新会导致树结构递归更新. 目前,现有工作主要通过分离树结点的逻辑索引和物理地址,并使用额外的数据结构和物理设备空间存放树结点逻辑索引和物理地址的映射,从而避免递归更新树结构. 但现有方法既引入额外空间开销,又存在额外物理设备空间非顺序写的问题. 提出IBT B+ tree,将树结点逻辑索引和物理地址均存放在树结构中. 同时,基于IBT B+ tree结构引入dirty链表设计,并提出了非递归更新的IBT B+ tree下刷算法. IBT B+ tree既解决了wandering B+ tree问题,又不引入额外的数据结构和物理设备空间,消除了固定物理设备空间的非顺序写. 分别实现IBT B+ tree和基于F2FS中NAT设计的B+ tree,在此基础上设计实现Monty-Dev块存储系统以评价2棵B+ tree. 实验表明,在HDD和SSD介质上,IBT B+ tree在写放大和下刷效率方面均优于NAT B+ tree. 相似文献

10.

下载免费PDF全文

Fan Ziqi Park Dongchul 《计算机科学技术学报》2019,34(1):113-132

Journal of Computer Science and Technology - New non-volatile memory (NVM) technologies are expected to replace main memory DRAM (dynamic random access memory) in the near future. NAND flash... 相似文献

11.

新型非易失存储的安全与隐私问题研究综述

下载免费PDF全文

徐远超闫俊峰万虎孙凤芸张伟功李涛《计算机研究与发展》2016,53(9):1930-1942

近年来,以相变存储器(phase change memory, PCM)为代表的各种新型非易失存储(non-volatile memory, NVM)技术得到广泛关注.NVM同时具有传统内存的字节寻址特性和外存的非易失特性,因而可以同时替代内存和外存,也可以用于混合存储体系结构.NVM具有低延时、高密度、低功耗的优势,有效缓解了存储墙问题.然而,由于应用程序可以直接通过存取指令(load/store)接口访问NVM,并且掉电后存储在NVM上的信息不会丢失,这给NVM的应用带来了一些新的安全和隐私挑战.首先讨论了持久化内存泄漏、不经意写操作、元数据安全、恶意磨损攻击、非易失指针等NVM应用中可能存在的安全问题以及最新的解决方案;然后讨论了数据保护、信息泄露等NVM应用中可能存在的隐私问题及现有的解决方案;最后探讨了NVM还需解决的安全和隐私问题,包括非易失缓存、程序安全等,并提出了一些解决方案,包括权限和保护机制的融合、使用易失性的NVM等. 相似文献

12.

Performance of RAID5 disk arrays with read and write caching 总被引：1，自引：0，他引：1

Jai Menon 《Distributed and Parallel Databases》1994,2(3):261-293

In this paper, we develop analytical models and evaluate the performance of RAID5 disk arrays in normal mode (all disks operational), in degraded mode (one disk broken, rebuild not started) and in rebuild mode (one disk broken, rebuild started but not finished). Models for estimating rebuild time under the assumption that user requests get priority over rebuild activity have also been developed. Separate models were developed for cached and uncached disk controllers. Particular emphasis is on the performance of cached arrays, where the caches are built of Non-Volatile memory and support write caching in addition to read caching. Using these models, we evaluate the performance of arrayed and unarrayed disk subsystems when driven by a database workload such as those seen on systems running any of several popular database managers. In particular, we assume single-block accesses, flat device skew and little seek affinity.With the above assumptions, we find six significant results. First, in normal mode, we find there is no difference in performance between subsystems built out of either small arrays or large arrays as long as the total number of disks used is the same. Second, we find that if our goal is to minimize the average response time of a subsystem in degraded and rebuild modes, it is better to use small arrays rather than large arrays in the subsystem. Third, we find the counter-intuitive result that if our goal is to minimize the average response time of requests to any one array in the subsystem, it is better to use large arrays than small arrays in the subsystem. We call this the best worst-case phenomenon.Fourth, we find that when no caching is used in the disk controller, subsystems built out of arrays have a normal mode performance that is significantly worse than an equivalent unarrayed subsystem built of the same drives. For the specific drive, controller, workload and system parameters we used for our calculations, we find that, without a cache in the controller and operating at typical I/O rates, the normal mode response time of a subsystem built out of arrays is 50% higher than that of an unarrayed subsystem. In rebuild mode, we find that a subsystem built out of arrays can have anywhere from 100% to 200% higher average response time than an equivalent unarrayed subsystem.Out fifth result is that, with cached controllers, the performance differences between arrayed and equivalent unarrayed subsystems shrink considerably. We find that the normal mode response time in a subsystem built out of arrays is only 4.1% higher than that of an equivalent unarrayed system. In degraded (rebuild) mode, a subsystem built out of small arrays has a response time 11% (13%) higher and a subsystem built out of large arrays has a response time 15% (19%) higher than an unarrayed subsystem.Our sixth and last result is that cached arrays have significantly better response times and throughputs than equivalent uncached arrays. For one workload, a cached array with good hit ratios had 5 times the throughout and 10 to 40 times lower response times than the equivalent uncached array. With poor hit ratios, the cached array is still a factor of 2 better in throughput and a factor of 4 to 10 better in response time for this same workload.We conclude that 3 design decisions are important when designing disk subsystems built out of RAID level 5 arrays. First, it is important that disk subsystems built out of arrays have disk controllers with caches, in particular Non-Volatile caches that cache writes in addition to reads. Second, if one were trying to minimize the worst response time seen by any user, one would choose disk array subsystems built out of large RAID level 5 arrays because of the best worst-case phenomenon. Third, if average subsystem response time is the most important design metric, the subsystem should be built out of small RAID level 5 arrays. 相似文献

13.

一种新的基于B~+树结构的XML元素的索引方法

下载免费PDF全文

李卓伟郭松涛《计算机工程与应用》2007,(14)

提出了一种用于搜索XML文档的新的索引方法即RIST。通过采用代码化的结构序列(SES)来表示XML文档和XML查询,得出查询XML数据等同于查找子序列匹配。RIST采用树结构作为查询的基本单元,从而避免了代价高昂的连接操作。另外,RIST还在XML文档的内容和结构上提供了一个统一的索引,所以它的一个很明显的优势就是克服了仅仅根据内容或结构建立索引的弊端。实验表明RIST在支持结构查询上是一种高效的方法。相似文献

14.

Upscaledb: Efficient integer-key compression in a key-value store using SIMD instructions

《Information Systems》2017

Compression can sometimes improve performance by making more of the data available to the processors faster. We consider the compression of integer keys in a B+-tree index. For this purpose, systems such as IBM DB2 use variable-byte compression over differentially coded keys. We revisit this problem with various compression alternatives such as Google's VarIntGB, Binary Packing and Frame-of-Reference. In all cases, we describe algorithms that can operate directly on compressed data. Many of our alternatives exploit the single-instruction-multiple-data (SIMD) instructions supported by modern CPUs. We evaluate our techniques in a database environment provided by Upscaledb, a production-quality key-value database. Our best techniques are SIMD accelerated: they simultaneously reduce memory usage while improving single-threaded speeds. In particular, a differentially coded SIMD binary-packing techniques (BP128) can offer a superior query speed (e.g., 40% better than an uncompressed database) while providing the best compression (e.g., by a factor of ten). For analytic workloads, our fast compression techniques offer compelling benefits. Our software is available as open source. 相似文献

15.

B-tree with fixed node size and no minimum degree

Chandra Prakash 《Journal in Computer Virology》2007,3(3):211-219

In anti-virus and anti-spyware applications, due to multiplicative increase in processing times with increasing complexity of detection logic and fast growing number of signatures, there is a necessity for data structures for quick retrieval and efficient storage of large collection of signatures. This paper presents a variant of B-tree data structure, where the minimum degree constraint is relaxed while maintaining the order of worst case performance bounds for primitive search, insert and delete operations of the B-tree. It presents a detailed case study of the impact of key (signature) size on storage utilization, given fixed sized nodes and also derives a maximum optimal key size with respect to node size. This variant of B-tree is found to be specifically very useful for storage of large number of keys where size of keys exhibit a wide variation and node size remains fixed. 相似文献

16.

可持久化CSB+-树索引技术研究

王胜秦小麟沈尧李博涵史文浩《计算机科学与探索》2015,(2):182-192

现有主存索引方案为实现重用功能仅将更新操作存储到硬盘中,根据操作序列进行索引恢复,实时性和重用性均较差。为进一步提升重用性和实时性,提出了一种可持久化的CSB+-树(cache sensitive B+-tree)索引方案。该方案基于内存映射技术,完整而高效地将索引结构保存到外存中,导入时无需重复创建索引,可节省大量计算资源。针对索引更新过程中出现大量内存碎片问题,采用一种分类内存管理机制进行管理和监视,当内存碎片过多而无法利用时,基于有序键值对进行索引重构以完全消除内存碎片。实验结果表明,所提方案与现有方案相比具有更好的实时性和重用性,同时具有高效的查询处理能力。相似文献

17.

基于CB+-tree的时态XML索引 总被引：1，自引：0，他引：1

徐海燕蒋夏军《计算机工程》2010,36(9):77-79

针对时态查询与时间属性紧密相关的特点,利用时间区间作为改进后B+-tree的索引关键字建立索引,改进后的B+-tree命名为Changing B+-tree(CB+-tree)。实验证明,在CB+-tree上进行时态查询比B+-tree及基于DOM的XML文档的查询效率有所提高。相似文献

18.

下载免费PDF全文

Rui-Xiang Ma Fei Wu Bu-Rong Dong Meng Zhang Wei-Jun Li Chang-Sheng Xie 《计算机科学技术学报》2021,36(5):1037-1050

Due to its low latency,byte-addressable,non-volatile,and high density,persistent memory (PM) is expected to be used to design a high-performance storage system.However,PM also has disadvantages such as limited endurance,thereby proposing challenges to traditional index technologies such as B+ tree.B+ tree is originally designed for dynamic random access memory (DRAM)-based or disk-based systems and has a large write amplification problem.The high write amplification is detrimental to a PM-based system.This paper proposes WO-tree,a write-optimized B+ tree for PM.WO-tree adopts an unordered write mechanism for the leaf nodes,and the unordered write mechanism can reduce a large number of write operations caused by maintaining the entry order in the leaf nodes.When the leaf node is split,WO-tree performs the cache line flushing operation after all write operations are completed,which can reduce frequent data flushing operations.WO-tree adopts a partial logging mechanism and it only writes the log for the leaf node.The inner node recognizes the data inconsistency by the read operation and the data can be recovered using the leaf node information,thereby significantly reducing the logging overhead.Furthermore,WO-tree adopts a lock-free search for inner nodes,which reduces the locking overhead for concurrency operation.We evaluate WO-tree using the Yahoo!Cloud Serving Benchmark(YCSB) workloads.Compared with traditional B+ tree,wB-tree,and Fast-Fair,the number of cache line flushes caused by WO-tree insertion operations is reduced by 84.7％,22.2％,and 30.8％,respectively,and the execution time is reduced by 84.3％,27.3％,and 44.7％,respectively. 相似文献

19.

基于哈希的B+树结构的频繁模式挖掘

叶飞跃《计算机工程》2004,30(13):113-115

提出了一种基干改进的B 树结构及一种新的数据挖掘算法,HB-Minc,该算法通过构造哈希函数,获得B 树的关键字,并在B 树的叶子结点上构建链表结构,记录卡H关关键字的项集及频数,这样在无需产生巨大的候选项集的情况下,挖掘出频繁模式,且具有较高的时间效率。相似文献

20.

基于Haar小波的同源图像消冗技术研究

陈明李超王树鹏吴广君云晓春《计算机研究与发展》2012,(Z1):319-323

针对分辨率不同、品质不同的同源①图像,提出一种基于Haar小波的图像消冗技术.该技术在Haar小波分解提取图像特征的基础上,利用图像特征向量的1-范数建立B+树索引,在B+树中通过范围查询计算不同图像的曼哈顿距离D1.同时为保证消冗的精确性,当D1≤T时,提取图像特征向量的部分数据构建集合,通过阈值t和不同集合中相同元素的个数v来判断是否进行消冗.实验表明,当t=5,T≤7000,消冗率②达到85%,消冗精度③为100%. 相似文献