
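Replica Placement Strategy Based on Multi-layer Consistent Hashing in HDFS
Cite this article: XI Ping, XUE Feng. Replica Placement Strategy Based on Multi-layer Consistent Hashing in HDFS[J]. Computer Systems & Applications (计算机系统应用), 2015, 24(2): 127-133.
Authors: XI Ping, XUE Feng
Affiliation: School of Computer Science and Engineering, Jiangsu University of Science and Technology, Zhenjiang 212003, China
Abstract: The distributed file system HDFS uses a rack-aware replica placement strategy that guarantees data reliability to a certain extent, but the data distribution becomes unbalanced after the system has been running for some time. The Balancer program can redistribute the data, but because it corrects the storage imbalance only after the fact, the system's data read rate and reliability are affected. This paper adopts a replica placement strategy based on multi-layer consistent hashing: a consistent hashing lookup first determines the rack for a data replica, and a second consistent hashing lookup then determines the datanode within that rack, which becomes the storage location. During lookup, the consistent hashing algorithm uses equal address partitioning and virtual nodes, which improve lookup efficiency and the balance of the distribution. Compared with the original strategy, the proposed strategy greatly improves balanced data storage and upload rate, and it is also data-adaptive.

Keywords: consistent hashing; HDFS; replica placement; storage balance; adaptability
Received: 2014/5/13
Revised: 2014/6/20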

Replica Placement Strategy Based on Multi-layer Consistent Hashing in HDFS
XI Ping and XUE Feng. Replica Placement Strategy Based on Multi-layer Consistent Hashing in HDFS[J]. Computer Systems & Applications, 2015, 24(2): 127-133.
Authors: XI Ping and XUE Feng
Affiliation: School of Computer Science and Engineering, Jiangsu University of Science and Technology, Zhenjiang 212003, China
Abstract: The HDFS distributed file system uses a rack-aware replica placement strategy that ensures data reliability to a certain extent, but the data distribution becomes unbalanced after the system has run for a period of time. Although the Balancer program can redistribute the data, it treats the imbalance only after the fact, which affects the system's data read rate and reliability. This paper adopts a replica placement strategy based on multi-layer consistent hashing. First, a consistent hashing algorithm yields the rack that corresponds to a replica; then a second consistent hashing lookup yields the datanode within that rack, which becomes the storage location. During lookup, the consistent hashing algorithm uses equal-sized partitions and virtual nodes, which improve search efficiency and the balance of the distribution. Compared with the original strategy, the proposed strategy greatly improves balanced data storage and upload rate, and it is also data-adaptive.
Keywords: consistent hashing; HDFS; replica placement; storage equilibrium; adaptability
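
The two-step lookup described in the abstract (replica to rack, then rack to datanode) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class names `HashRing` and `TwoLayerPlacement`, the use of MD5, the `vnodes` count, the `replica-<n>` key format, and the rule of skipping a datanode that already holds a replica are all assumptions made for illustration, and the equal-partition optimization mentioned in the abstract is not modeled.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a 32-bit ring position (MD5 is an arbitrary choice here)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:4], "big")


class HashRing:
    """A consistent-hash ring in which every member owns `vnodes` virtual nodes."""

    def __init__(self, members, vnodes=100):
        # Placing each member on the ring many times smooths the distribution.
        self._ring = sorted(
            (ring_position(f"{member}#{i}"), member)
            for member in members
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def lookup(self, key, count=1):
        """Return up to `count` distinct members, walking clockwise from the key."""
        start = bisect.bisect_right(self._points, ring_position(key))
        found = []
        for step in range(len(self._ring)):
            member = self._ring[(start + step) % len(self._ring)][1]
            if member not in found:
                found.append(member)
                if len(found) == count:
                    break
        return found


class TwoLayerPlacement:
    """Layer 1 maps a replica to a rack; layer 2 maps it to a datanode in that rack."""

    def __init__(self, topology, vnodes=100):
        # topology: {rack_name: [datanode_name, ...]} (hypothetical input shape)
        self._topology = {rack: list(nodes) for rack, nodes in topology.items()}
        self._rack_ring = HashRing(list(self._topology), vnodes)
        self._node_rings = {
            rack: HashRing(nodes, vnodes) for rack, nodes in self._topology.items()
        }

    def place(self, block_id, replicas=3):
        """Return one (rack, datanode) pair per replica, never reusing a datanode."""
        chosen = []
        for r in range(replicas):
            key = f"{block_id}/replica-{r}"
            # Try racks in ring order so a full rack falls through to the next one.
            for rack in self._rack_ring.lookup(key, count=len(self._topology)):
                candidates = self._node_rings[rack].lookup(
                    key, count=len(self._topology[rack])
                )
                free = [node for node in candidates if (rack, node) not in chosen]
                if free:
                    chosen.append((rack, free[0]))
                    break
        return chosen
```

Because each position depends only on the hash of the key, calling `place` twice with the same block identifier and topology returns the same locations, and adding or removing a datanode only remaps the keys adjacent to its virtual nodes.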
This article is indexed in CNKI, Wanfang Data, and other databases.