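Optimization of multidimensional index query mechanism based on HBase
Cite this article: XU Jiangfeng, TAN Yulong. Optimization of multidimensional index query mechanism based on HBase[J]. Journal of Computer Applications, 2020, 40(2): 571-577.
Authors: XU Jiangfeng, TAN Yulong
Affiliation: School of Information Engineering, Zhengzhou University, Zhengzhou 450001
Funding: Fundamental Research Funds for the Central Universities (20190605)
Abstract: Key-value stores are designed to retrieve values from very large volumes of data while offering high availability, fault tolerance, and scalability, and they therefore provide much-needed infrastructure for Location-Based Services (LBS). However, complex queries on multidimensional data cannot be processed efficiently, because key-value stores provide no way to access data by multiple attributes. To address the inability of the key-value store HBase to handle multidimensional data efficiently, a unified indexing framework, New-grid, is proposed so that HBase supports multidimensional queries. A set of nodes is organized in an improved P-grid overlay network, which provides efficient data distribution, fault tolerance, and query processing for multidimensional data. For indexing, a Hilbert space-filling curve is used to preserve data locality, so that multidimensional data in the key-value store is managed efficiently. The data itself is managed by HBase's underlying storage, and algorithms for range queries and K-Nearest Neighbor (KNN) queries are proposed, which eliminates the overhead of maintaining separate index tables. Extensive experiments were conducted on Amazon EC2 using clusters of 4, 8, and 16 commodity nodes. The experimental results show that New-grid outperforms MD-HBase and MapReduce.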

Keywords: Location-Based Service (LBS); multidimensional index; HBase; space-filling curve; overlay network
Received: 2019-08-22
Revised: 2019-11-04

Optimization of multidimensional index query mechanism based on HBase
Jiangfeng XU, Yulong TAN. Optimization of multidimensional index query mechanism based on HBase[J]. Journal of Computer Applications, 2020, 40(2): 571-577.
Authors: Jiangfeng XU, Yulong TAN
Affiliation: School of Information Engineering, Zhengzhou University, Zhengzhou, Henan 450001, China
Abstract: Key-value stores are designed to retrieve values from very large volumes of data while remaining highly available, fault-tolerant, and scalable, which makes them much-needed infrastructure for Location-Based Services (LBS). However, complex queries on multidimensional data cannot be processed efficiently, because a key-value store provides no way to access data by multiple attributes. To address the inability of the key-value store HBase to handle multidimensional data efficiently, a unified indexing framework named New-grid was proposed so that HBase supports multidimensional queries. A group of nodes was organized in an improved P-grid overlay network to provide efficient data distribution, fault tolerance, and multidimensional query processing. For indexing, a Hilbert space-filling curve was used to preserve data locality, so that the multidimensional data in the key-value store is managed efficiently. In addition, HBase's underlying storage was used to manage the data, and algorithms for range queries and K-Nearest Neighbor (KNN) queries were proposed, eliminating the overhead of maintaining separate index tables. Extensive experiments were conducted on Amazon EC2 using clusters of 4, 8, and 16 commodity nodes. Experimental results show that New-grid outperforms MD-HBase and MapReduce.
Keywords: Location-Based Service (LBS); multidimensional index; HBase; space-filling curve; overlay network
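
The abstract's central idea, linearizing multidimensional points with a Hilbert space-filling curve so that nearby points receive nearby keys in a sorted key-value store, can be illustrated with a minimal Python sketch. This is not the New-grid implementation: the function names `hilbert_index`, `row_key`, and `range_to_scans`, the two-dimensional integer grid, and the hex row-key format are all assumptions made for illustration, and the range decomposition is a brute-force version that enumerates every cell in the query rectangle.

```python
def hilbert_index(order: int, x: int, y: int) -> int:
    """Position of grid cell (x, y) along a Hilbert curve that fills a
    2**order x 2**order grid (standard iterative xy-to-d conversion)."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the canonical orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def row_key(order: int, x: int, y: int) -> str:
    """Hypothetical row key: zero-padded hex, so that lexicographic order
    (the order a sorted key-value store uses) matches curve order."""
    width = (2 * order + 3) // 4  # hex digits needed for 2*order bits
    return format(hilbert_index(order, x, y), f"0{width}x")


def range_to_scans(order: int, x_lo: int, y_lo: int, x_hi: int, y_hi: int):
    """Naive range query planning: map every cell of the inclusive query
    rectangle to its curve position, then merge consecutive positions into
    (start, end) intervals that could each be served by one contiguous scan."""
    keys = sorted(
        hilbert_index(order, x, y)
        for x in range(x_lo, x_hi + 1)
        for y in range(y_lo, y_hi + 1)
    )
    scans = []
    for k in keys:
        if scans and k == scans[-1][1] + 1:
            scans[-1][1] = k  # extend the current contiguous interval
        else:
            scans.append([k, k])  # start a new interval
    return [tuple(s) for s in scans]
```

Because consecutive curve positions always fall in adjacent grid cells, a small query rectangle tends to map to a few contiguous key intervals, which is why a single sorted table can serve range queries without a separate index table. A production system would decompose the rectangle recursively by quadrant instead of enumerating cells.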
This article is indexed in VIP, Wanfang Data, and other databases.