
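An Optimal Hierarchical Deduplication Strategy Based on Content Defined Chunking
Cite this article: LI Jian-jiang, MA Zhan-ning, ZHANG Kai. An Optimal Hierarchical Deduplication Strategy Based on Content Defined Chunking[J]. Acta Electronica Sinica, 2019, 47(5): 1094-1100.
Authors: LI Jian-jiang  MA Zhan-ning  ZHANG Kai
Affiliation: Department of Computer Science and Technology, University of Science and Technology Beijing, Beijing 100083, China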
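Abstract: Over the past several decades, the volume of information has grown exponentially, and storing and protecting such large amounts of data has become a difficult problem. Cloud storage and data deduplication have become the main technologies for addressing it. Deduplication is widely used in cloud storage systems, but mainstream systems suffer from index bloat and uncertainty in data chunking, which waste memory and make chunk sizes unpredictable. To address these problems, this paper proposes a hierarchical deduplication optimization strategy based on content-defined chunking and constructs the corresponding algorithm, which resolves the problems of oversized index tables and of data chunks that are too large or too small. Tests on CNN news page content, comparing deduplication ratio and deduplication time, show that the proposed strategy improves the deduplication ratio by about 3% and reduces deduplication time by about 2% compared with current mainstream deduplication strategies.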

Keywords: cloud storage  data deduplication  data chunking  hierarchical  deduplication ratio
Received: 2017-12-26

An Optimal Hierarchical Deduplication Strategy Based on Content Defined Chunking
LI Jian-jiang, MA Zhan-ning, ZHANG Kai. An Optimal Hierarchical Deduplication Strategy Based on Content Defined Chunking[J]. Acta Electronica Sinica, 2019, 47(5): 1094-1100.
Authors: LI Jian-jiang  MA Zhan-ning  ZHANG Kai
Affiliation: Department of Computer Science and Technology, University of Science and Technology Beijing, Beijing 100083, China
Abstract: Over the past decades, the volume of data has grown exponentially, and storing and protecting it has become a major challenge. Cloud storage and data deduplication are the principal technologies for addressing this challenge, and deduplication is widely used in cloud storage systems. However, current mainstream cloud storage systems suffer from index bloat and unpredictable data block sizes, which waste memory space and produce blocks of widely varying length. To overcome these shortcomings, this paper presents an optimal hierarchical deduplication strategy based on content defined chunking, together with the corresponding algorithm, with the aim of saving memory space and achieving better compression performance. Using the content of CNN news pages as the test set, a comparison of compression ratio and compression time shows that the proposed strategy increases the compression ratio by about 3% and reduces the compression time by about 2% relative to current mainstream deduplication strategies.
Keywords:cloud storage  data deduplication  data block partition  hierarchical  compression ratio  
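
The abstract names content defined chunking with bounded chunk sizes and a compression (deduplication) ratio as the metric, but does not give the algorithm itself. The sketch below is therefore only a generic illustration of those two ideas, not the authors' hierarchical strategy: the gear-style rolling hash, the function names `cdc_chunks` and `deduplicate`, and the parameters `min_size`, `avg_size`, and `max_size` are all assumptions introduced here.

```python
import hashlib

# Hypothetical gear table: 256 fixed pseudo-random 32-bit values, one per byte value.
GEAR = [int.from_bytes(hashlib.sha256(bytes([b])).digest()[:4], "big") for b in range(256)]


def cdc_chunks(data: bytes, min_size: int = 64, avg_size: int = 256, max_size: int = 1024):
    """Split data at content-defined boundaries.

    avg_size must be a power of two (assumed). min_size and max_size bound
    the chunk length, so chunks are never too small or too large, except
    that the final chunk may be shorter than min_size.
    """
    mask = avg_size - 1
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Shift-and-add rolling hash: older bytes shift out of the 32-bit state.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        length = i - start + 1
        if length < min_size:
            continue  # too short to cut yet
        if (h & mask) == 0 or length >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing remainder
    return chunks


def deduplicate(blobs):
    """Store each distinct chunk once; return (store, recipes, ratio).

    ratio = bytes before deduplication / bytes actually stored.
    """
    store, recipes, total = {}, [], 0
    for blob in blobs:
        recipe = []
        for chunk in cdc_chunks(blob):
            fp = hashlib.sha256(chunk).hexdigest()  # chunk fingerprint
            store.setdefault(fp, chunk)             # keep first copy only
            recipe.append(fp)
            total += len(chunk)
        recipes.append(recipe)
    stored = sum(len(c) for c in store.values())
    ratio = total / stored if stored else 1.0
    return store, recipes, ratio


if __name__ == "__main__":
    page = b"".join(hashlib.sha256(str(i).encode()).digest() for i in range(200))
    store, recipes, ratio = deduplicate([page, page])
    print(f"unique chunks: {len(store)}, deduplication ratio: {ratio:.2f}")
```

In this sketch the index is a flat dictionary keyed by the SHA-256 fingerprint; a hierarchical index of the kind the paper proposes would replace that dictionary, and its structure is not specified in the abstract.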
This article is indexed in Wanfang Data and other databases.