
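Processor Performance of Main Memory Databases under the TPC-H Workload
Cite this article: Liu Dawei, Luan Hua, Wang Shan, Qin Biao. Processor performance of main memory databases under the TPC-H workload[J]. Journal of Software, 2008, 19(10): 2573-2584.
Authors: Liu Dawei  Luan Hua  Wang Shan  Qin Biao
Affiliations: 1. Key Laboratory of Data Engineering and Knowledge Engineering of the Ministry of Education (Renmin University of China), Beijing 100872; China Petroleum Information Technology Service Center, Beijing 100724
2. Key Laboratory of Data Engineering and Knowledge Engineering of the Ministry of Education (Renmin University of China), Beijing 100872
Funding: National Natural Science Foundation of China; international cooperation project (HP Lab.)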
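Abstract: In 1999, Ailamaki et al. studied how the execution time of database management systems (DBMSs) breaks down on the processor. Subsequent work concentrated on analyzing DBMS bottlenecks on the processor. However, all of these studies were carried out on disk resident databases (DRDBs), and all of them analyzed TPC-C-style workloads. Meanwhile, as hardware has advanced, the memory hierarchy of modern computers has been gradually "moving up": on-chip and off-chip caches keep growing, as do RAM and flash memory. Processor workload analysis should move up accordingly. This paper studies the processor behavior of main memory resident databases (MMDBs) under compute-intensive workloads. The main performance bottleneck of a disk resident database is disk I/O, which can be optimized with techniques such as indexing and compression; the bottleneck of a main memory database, by contrast, lies in data exchange between the processor and memory. To address this problem, the paper first analyzes how processor performance bottlenecks differ between disk resident and main memory databases under the TPC-H workload, gives some optimization suggestions, and proposes a prefetching-based optimization. Second, it experimentally compares how different storage architectures (row store and column store) affect processor utilization, and explores architectural solutions for next-generation main memory databases. It also studies the effect of index structures on the processor's multi-level caches and gives suggestions for index optimization. Finally, it proposes a micro-benchmark for evaluating processor performance and behavior of main memory databases under DSS (decision support system) workloads. The results provide experimental evidence for the architecture design and performance optimization of main memory databases running on next-generation processors.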

Keywords: main memory database  TPC-H workload  processor characterization
Received: 2007/7/20
Revised: 2008/1/29

Main Memory Database TPC-H Workload Characterization on Modern Processor
LIU Da-Wei, LUAN Hua, WANG Shan and QIN Biao. Main Memory Database TPC-H Workload Characterization on Modern Processor[J]. Journal of Software, 2008, 19(10): 2573-2584.
Authors: LIU Da-Wei  LUAN Hua  WANG Shan and QIN Biao
Abstract: In 1999, Ailamaki et al. analyzed the execution time breakdown of database systems on modern computer platforms. The primary motivation of that study and the work that followed was to improve the performance of disk resident databases (DRDBs), which have formed the mainstream of database systems until now. The typical benchmark used in those studies is TPC-C. However, continuing hardware advancements have "moved up" the memory hierarchy: larger and larger on-chip and off-chip caches, steadily increasing RAM capacity, and the commercial availability of large flash memory (solid-state disks) on top of regular disks. To reflect this trend, workload characterization research should likewise move up the memory hierarchy. This paper focuses on main memory databases (MMDBs) and the TPC-H benchmark. Unlike a DRDB, whose performance is I/O bound and may be optimized by high-level mechanisms such as indexing, an MMDB is basically CPU and memory bound. The paper first compares the execution time breakdown of DRDBs and MMDBs and proposes a strategy for optimizing the memory-resident aggregate. It then explores how column-oriented and row-oriented storage models differ in CPU and cache utilization. Furthermore, it measures the performance of MMDBs on CPUs of different generations. In addition, it analyzes the influence of indexes and gives a strategy for main memory database index optimization. Finally, it analyzes each query in the full TPC-H benchmark in detail and obtains systematic results, which help in designing micro-benchmarks for further analysis of CPU cache stalls. The results of this study are expected to benefit the performance optimization of MMDBs and the architecture design of next-generation memory-oriented databases.
Keywords:MMDB(main memory database)  TPC-H workload  processor characterization
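
The abstract's comparison of row-oriented and column-oriented storage can be made concrete with a small back-of-the-envelope model. The sketch below is not the paper's method or measurement; it only counts how many cache lines a scan has to read under each layout. The 64-byte cache line, the 128-byte row, the 8-byte fields, and the field offsets are all assumptions chosen for illustration.

```python
CACHE_LINE = 64  # bytes; assumed cache line size


def lines_touched_row(n_rows, row_width, field_offsets, field_size=8):
    """Count distinct cache lines read when a scan picks the given
    fields out of every row of a contiguous row-oriented table."""
    touched = set()
    for r in range(n_rows):
        base = r * row_width
        for off in field_offsets:
            start = base + off
            end = start + field_size - 1
            # A field may straddle a line boundary, so record every line it covers.
            touched.update(range(start // CACHE_LINE, end // CACHE_LINE + 1))
    return len(touched)


def lines_touched_column(n_rows, n_fields, field_size=8):
    """Count cache lines read when each scanned field lives in its own
    dense array, so every line fetched holds only useful values."""
    column_bytes = n_rows * field_size
    per_column = (column_bytes + CACHE_LINE - 1) // CACHE_LINE  # ceiling division
    return per_column * n_fields


# Hypothetical table: 1000 rows of 128 bytes, aggregate over two 8-byte fields.
row_lines = lines_touched_row(1000, 128, [40, 48])
col_lines = lines_touched_column(1000, 2)
print(row_lines, col_lines)  # prints "1000 250"
```

In this assumed configuration the row layout drags in one full line per row even though only 16 of its 64 bytes are used, while the column layout reads only densely packed values. Real processors add hardware prefetching, TLB effects, and tuple reconstruction costs that this count ignores.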
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.