Index structure of unified access in big data of multi-format

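Citation:	FENG Ya-li, DING Liang-kui, LIU Yong-jiang, WANG Xing-zhao. Index structure of unified access in big data of multi-format[J]. Application Research of Computers, 2013, 30(6): 1664-1667.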

Authors:	FENG Ya-li, DING Liang-kui, LIU Yong-jiang, WANG Xing-zhao

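Affiliation:	1. College of Computer & Information Technology, Northeast Petroleum University, Daqing, Heilongjiang 163318, China; 2. Geophysical Laboratory, Dept. of Technology Research, CNOOC Research Center, Beijing 100027, China; 3. Information Center, China Petroleum Pipeline Company, Langfang, Hebei 065000, China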

Funding:	National Science and Technology Major Project of China (2011ZX05023-005-012)

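Abstract:	To improve the efficiency of unified access to multi-format massive data, a Hadoop-based distributed data reading mode is proposed. Building on a study of non-primary-key index structures for massive data and on the descriptive concept of unified access, an HDFS-based hierarchical index structure applicable to B-trees, R-trees, and their variants is proposed, which overcomes the weakness of the original key-value storage in non-primary-key indexing. A Hadoop buffering strategy, a new data transfer model based on random reads, and a corresponding query processing strategy further reduce data transfer overhead. Experiments show that these methods improve random access efficiency in unified access, reduce query response time and data transfer overhead, and improve the performance of unified access to multi-format massive data.
Keywords:	R-tree; index; massive data; query processing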
Index structure of unified access in big data of multi-format

FENG Ya-li, DING Liang-kui, LIU Yong-jiang, WANG Xing-zhao. Index structure of unified access in big data of multi-format[J]. Application Research of Computers, 2013, 30(6): 1664-1667.

Authors:	FENG Ya-li, DING Liang-kui, LIU Yong-jiang, WANG Xing-zhao

Affiliation:	1. College of Computer & Information Technology, Northeast Petroleum University, Daqing, Heilongjiang 163318, China; 2. Geophysical Laboratory, Dept. of Technology Research, CNOOC Research Center, Beijing 100027, China; 3. Information Center, China Petroleum Pipeline Company, Langfang, Hebei 065000, China

Abstract:	This paper proposes a Hadoop-based distributed data access mode to support unified access to multi-format big data. Drawing on a study of non-primary-key index structures for big data and on the concept of unified access, it presents an HDFS-based hierarchical index structure that applies to B-trees, R-trees, and their variants, overcoming the weakness of the original key-value storage for non-primary-key indexing. To further reduce data transfer overhead, it introduces a Hadoop buffering strategy, a new data transfer model based on random reads, and a corresponding query processing strategy. Experiments show that these methods improve the efficiency of unified access and reduce query response time and data transfer overhead.

Keywords:	R-tree; index; big data; query processing
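
The abstract names three ingredients: a hierarchical index over a non-primary-key attribute, a buffering strategy, and random reads of the matching records. The sketch below illustrates how such pieces might fit together in general; it is not the authors' implementation. Everything in it is an assumption made for illustration: an in-memory `io.BytesIO` stands in for a file on HDFS, the fixed-size `RECORD` layout, the two-level B-tree-like layout in `TwoLevelIndex`, and the LRU policy in `LeafBuffer` are all hypothetical.

```python
import bisect
import io
import struct
from collections import OrderedDict

# Hypothetical fixed-size record: 4-byte attribute + 16-byte payload.
RECORD = struct.Struct("<i16s")


class LeafBuffer:
    """Small LRU cache for index leaf pages (assumed buffering policy)."""

    def __init__(self, capacity=2):
        self.capacity = capacity
        self.pages = OrderedDict()

    def get(self, page_id, loader):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)  # mark as most recently used
            return self.pages[page_id]
        page = loader(page_id)
        self.pages[page_id] = page
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # evict least recently used
        return page


class TwoLevelIndex:
    """Root keeps the first key of each leaf; leaves keep sorted (key, offset) pairs."""

    def __init__(self, entries, leaf_size=4):
        entries = sorted(entries)
        self.leaves = [entries[i:i + leaf_size]
                       for i in range(0, len(entries), leaf_size)]
        self.root = [leaf[0][0] for leaf in self.leaves]
        self.buffer = LeafBuffer()

    def lookup(self, key):
        """Return byte offsets of all records whose indexed attribute equals key."""
        # Duplicates of a non-primary key may begin in the leaf just before
        # the first leaf whose first key is >= key, so start one leaf earlier.
        start = max(bisect.bisect_left(self.root, key) - 1, 0)
        offsets = []
        for page_id in range(start, len(self.leaves)):
            if self.root[page_id] > key:
                break
            # A real system would fetch the page from storage here.
            leaf = self.buffer.get(page_id, lambda i: self.leaves[i])
            offsets.extend(off for k, off in leaf if k == key)
        return offsets


def build_file(rows):
    """Write rows as fixed-size records; return the file and (attribute, offset) entries."""
    f = io.BytesIO()
    entries = []
    for attr, payload in rows:
        entries.append((attr, f.tell()))
        f.write(RECORD.pack(attr, payload.encode()))
    return f, entries


def random_read(f, offset):
    """Seek straight to one record instead of scanning the file."""
    f.seek(offset)
    attr, payload = RECORD.unpack(f.read(RECORD.size))
    return attr, payload.rstrip(b"\x00").decode()


rows = [(n % 5, f"trace-{n}") for n in range(12)]
data, entries = build_file(rows)
index = TwoLevelIndex(entries)
matches = [random_read(data, off) for off in index.lookup(3)]
print(matches)
```

Only the leaves that can contain the key are touched, and only the matching records are read, which is the general reason an index plus random reads can transfer less data than a full scan.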
This article is indexed in CNKI, Wanfang Data, and other databases.