
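A Virtual Machine Image Deduplication Method Based on Clustering and Grouping
Cite this article: XU Ji-Wei, ZHANG Wen-Bo, WEI Jun, ZHONG Hua, HUANG Tao. A virtual machine image deduplication method based on clustering and grouping [J]. Journal of Software (软件学报), 2016, 27(2): 466-480.
Authors: XU Ji-Wei, ZHANG Wen-Bo, WEI Jun, ZHONG Hua, HUANG Tao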
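Author affiliations:
XU Ji-Wei: Technology Center of Software Engineering, Institute of Software, Chinese Academy of Sciences, Beijing 100190; State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190; University of Chinese Academy of Sciences, Beijing 100190
ZHANG Wen-Bo: Technology Center of Software Engineering, Institute of Software, Chinese Academy of Sciences, Beijing 100190
WEI Jun: Technology Center of Software Engineering, Institute of Software, Chinese Academy of Sciences, Beijing 100190; State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190; University of Chinese Academy of Sciences, Beijing 100190
ZHONG Hua: Technology Center of Software Engineering, Institute of Software, Chinese Academy of Sciences, Beijing 100190; University of Chinese Academy of Sciences, Beijing 100190
HUANG Tao: Technology Center of Software Engineering, Institute of Software, Chinese Academy of Sciences, Beijing 100190; State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190; University of Chinese Academy of Sciences, Beijing 100190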
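Funding: National Natural Science Foundation of China (61402450); National Key Technology R&D Program of China (2013BAH45F01); National High Technology Research and Development Program of China (863 Program) (2013AA041301); Beijing Natural Science Foundation (4154088)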
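Abstract: With the rise of cloud computing, virtualization technology is used more and more widely, and virtual machines are gradually replacing physical machines as the deployment environment for application services. Driven by requirements such as flexibility and reliability, the number of virtual machine images is growing rapidly, and managing these image files efficiently and economically has become a challenging research topic. Because virtual machine images contain a large number of duplicate data blocks, an efficient deduplication method is essential for image management. However, traditional deduplication methods incur a large resource overhead and interfere with the performance of the virtual machines hosted on the platform, so they are not well suited to cloud environments. This paper proposes a local deduplication method that aims to optimize the image deduplication process. Its core idea is to turn global deduplication into local deduplication, thereby reducing the space complexity of the deduplication algorithm and, in turn, the operation time. The method uses virtual machine image similarity as a heuristic to divide images into groups; when a new image arrives, statistical sampling is used to select the most similar group, and the image is deduplicated within that group. Experimental results show that, at the cost of about 1% additional storage space, the method shortens deduplication time by more than 50%.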

Keywords: cloud computing; virtualization; virtual machine image; storage; deduplication
Received: 2014/4/23
Revised: 2014/12/31

Virtual Machine Image Deduplication Method Based on Clustering
XU Ji-Wei, ZHANG Wen-Bo, WEI Jun, ZHONG Hua and HUANG Tao. Virtual Machine Image Deduplication Method Based on Clustering [J]. Journal of Software, 2016, 27(2): 466-480.
Authors: XU Ji-Wei, ZHANG Wen-Bo, WEI Jun, ZHONG Hua and HUANG Tao
Affiliations:
XU Ji-Wei: Technology Center of Software Engineering, Institute of Software, The Chinese Academy of Sciences, Beijing 100190, China; State Key Laboratory of Computer Science, Institute of Software, The Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100190, China
ZHANG Wen-Bo: Technology Center of Software Engineering, Institute of Software, The Chinese Academy of Sciences, Beijing 100190, China
WEI Jun: Technology Center of Software Engineering, Institute of Software, The Chinese Academy of Sciences, Beijing 100190, China; State Key Laboratory of Computer Science, Institute of Software, The Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100190, China
ZHONG Hua: Technology Center of Software Engineering, Institute of Software, The Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100190, China
HUANG Tao: Technology Center of Software Engineering, Institute of Software, The Chinese Academy of Sciences, Beijing 100190, China; State Key Laboratory of Computer Science, Institute of Software, The Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100190, China
Abstract: Virtualization technology is becoming increasingly prevalent with the rise of cloud computing, and the physical machines used for service hosting are gradually being replaced by virtual ones. Driven by reliability and flexibility considerations, the number of virtual machine images is increasing sharply, and managing them efficiently and economically has become a major challenge. Since a large number of duplicate data blocks exist across different virtual machine images, an efficient deduplication method is vital to virtual machine image management. Existing deduplication approaches are not well suited to cloud environments because they employ time-consuming algorithms that can cause serious performance interference to neighboring virtual machines. This paper proposes a local deduplication method that can greatly optimize the deduplication process for virtual machine images. The main idea is to convert global deduplication into local deduplication, thus considerably reducing the space and time complexity. In this method, images are classified into groups by an improved k-means clustering algorithm according to image similarity. When a new image arrives, a sampling method is used to choose an appropriate group in which to perform the deduplication operation. Experiments show that the approach is robust and effective: it significantly reduces (by more than 50%) the performance interference to hosted virtual machines, with an acceptable increase (about 1%) in disk space usage.
Keywords: cloud computing; virtualization; virtual machine image; storage; deduplication
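
The abstract's two runtime steps, choosing a group for a new image by sampling and then deduplicating only against that group, can be illustrated with a minimal Python sketch. It is not the authors' implementation. Everything concrete here is an assumption the abstract does not state: fixed-size 4 KiB chunks, SHA-1 fingerprints, uniform random sampling, and groups that already exist as in-memory fingerprint sets (the paper builds them with an improved k-means algorithm, which is not reproduced). The function names `chunk_fingerprints`, `choose_group` and `deduplicate_into` are hypothetical.

```python
import hashlib
import random

CHUNK_SIZE = 4096  # assumed chunk size; the abstract does not specify one


def chunk_fingerprints(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[str]:
    """Split an image into fixed-size chunks and fingerprint each with SHA-1."""
    return [
        hashlib.sha1(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def choose_group(
    fps: list[str],
    groups: dict[str, set[str]],
    sample_size: int = 64,
    seed: int = 0,
) -> str:
    """Pick the group whose index contains the most sampled fingerprints.

    Only a random sample of the image's fingerprints is probed, so the cost
    of group selection does not grow with the full image size.
    """
    if not groups:
        raise ValueError("at least one group is required")
    rng = random.Random(seed)  # seeded for reproducibility in this sketch
    sample = rng.sample(fps, min(sample_size, len(fps)))
    return max(groups, key=lambda name: sum(fp in groups[name] for fp in sample))


def deduplicate_into(fps: list[str], index: set[str]) -> int:
    """Deduplicate against one group's index only (local deduplication).

    Returns the number of chunks that were new to the group and would
    therefore have to be stored.
    """
    new_chunks = 0
    for fp in fps:
        if fp not in index:
            index.add(fp)
            new_chunks += 1
    return new_chunks
```

A caller would fingerprint the incoming image, pass the fingerprints to `choose_group`, and then call `deduplicate_into` with the chosen group's index. Chunks that also exist in other groups are stored again, which is where a storage overhead such as the roughly 1% reported in the abstract would come from, while the index searched per image stays small.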