
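An Efficient Two-Round Algorithm for Fast Remote File Synchronization
Cite this article:XU Dan, SHENG Yonghong, JU Dapeng, WU Jianping, WANG Dongsheng. An Efficient Two-Round Algorithm for Fast Remote File Synchronization[J]. 计算机科学与探索, 2011, 5(1): 38-49. DOI: 10.3778/j.issn.1673-9418.2011.01.004
Authors:XU Dan  SHENG Yonghong  JU Dapeng  WU Jianping  WANG Dongsheng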
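Affiliations:1. School of Computer Science and Technology, Beijing University of Posts and Telecommunications, Beijing 100876
2. Department of Computer Science and Technology, Tsinghua University, Beijing 100084
3. Department of Computer Science and Technology, Tsinghua University, Beijing 100084; National Laboratory for Information Science and Technology, Tsinghua University, Beijing 100084
Funding:National Natural Science Foundation of China, Nos. 60833004 and 60673145; National High Technology Research and Development Program of China (863 Program), No. 2009AA1Z104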
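Abstract:Fast remote file synchronization is widely used in file backup and recovery, Web and FTP site mirroring, content distribution networks, and Web access. This paper proposes tpsync, an efficient two-round fast file synchronization algorithm that combines content-based variable-length chunking with fixed-length sliding blocks. Synchronization proceeds in two rounds: the first round uses content-based variable-length chunking to locate, at coarse granularity, the locally changed data segments of the file to be synchronized; the second round applies fixed-length sliding-block chunking to those locally changed data segments...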

Keywords:duplicate data detection  file synchronization  rsync algorithm

High Effective Two-round Remote File Fast Synchronization Algorithm
XU Dan,SHENG Yonghong,JU Dapeng,WU Jianping,WANG Dongsheng. High Effective Two-round Remote File Fast Synchronization Algorithm[J]. Journal of Frontier of Computer Science and Technology, 2011, 5(1): 38-49. DOI: 10.3778/j.issn.1673-9418.2011.01.004
Authors:XU Dan  SHENG Yonghong  JU Dapeng  WU Jianping  WANG Dongsheng
Affiliation:1. School of Computer Science and Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China
2. Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
3. National Laboratory for Information Science and Technology, Tsinghua University, Beijing 100084, China
Abstract:Fast remote file synchronization is widely used in scenarios such as file backup and recovery, Web and FTP site mirroring, content distribution networks, and Web access. This paper presents tpsync, a highly efficient two-round fast synchronization algorithm that combines content-based variable-sized chunking with fixed-sized sliding blocks. In the first round, tpsync uses content-based variable-sized chunking to locate, at coarse granularity, the local changes between similar files. In the second round, it searches the locally changed data segments for differential data at fine granularity using the fixed-sized sliding-block method, so that file synchronization completes after two rounds of data exchange. tpsync is compared experimentally with rsync, the traditional single-round synchronization method. Extensive experiments on text, binary, and database files show that tpsync outperforms rsync in average synchronization time and in the amount of network traffic.
Keywords:duplicated data detection  file synchronization  rsync
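
The two rounds described in the abstract can be illustrated with a minimal in-process sketch. It is not the authors' implementation: the function names (`cdc_chunks`, `round_one`, `round_two`, `sync_sketch`, `apply_delta`), the window, mask, and block sizes, and the choice of Adler-32, SHA-1, and MD5 are all assumptions made for illustration, and both "sides" run in one process instead of exchanging messages over a network.

```python
import hashlib
import zlib

def cdc_chunks(data, window=16, mask=0x3F, min_size=32):
    """Content-based variable-sized chunking: cut where a hash of the last
    `window` bytes matches a bit pattern. Returns a list of (offset, chunk)."""
    chunks, start = [], 0
    for i in range(len(data)):
        end = i + 1
        if end - start < max(min_size, window):
            continue
        # Recomputed at every position for clarity; a real tool would use a rolling hash.
        if zlib.adler32(data[end - window:end]) & mask == 0:
            chunks.append((start, data[start:end]))
            start = end
    if start < len(data):
        chunks.append((start, data[start:]))
    return chunks

def round_one(old, new):
    """Coarse round (assumed form): match whole chunks by strong hash.
    Returns ops plus the old chunks that matched nothing in the new file."""
    old_chunks = cdc_chunks(old)
    old_index = {hashlib.sha1(c).digest(): (off, len(c)) for off, c in old_chunks}
    new_digests, ops = set(), []
    for _, chunk in cdc_chunks(new):
        digest = hashlib.sha1(chunk).digest()
        new_digests.add(digest)
        if digest in old_index:
            ops.append(("copy",) + old_index[digest])
        elif ops and ops[-1][0] == "changed":
            ops[-1] = ("changed", ops[-1][1] + chunk)  # merge adjacent changed chunks
        else:
            ops.append(("changed", chunk))
    leftovers = [(off, c) for off, c in old_chunks
                 if hashlib.sha1(c).digest() not in new_digests]
    return ops, leftovers

def round_two(segment, leftovers, block=8):
    """Fine round (assumed form): rsync-style fixed-size block matching,
    sliding byte by byte through one changed segment."""
    index = {}
    for off, chunk in leftovers:
        for j in range(0, len(chunk) - block + 1, block):
            b = chunk[j:j + block]
            index.setdefault(zlib.adler32(b), []).append((hashlib.md5(b).digest(), off + j))
    ops, literal, i = [], bytearray(), 0
    while i + block <= len(segment):
        w = segment[i:i + block]
        candidates = index.get(zlib.adler32(w))  # cheap weak check first
        match = None
        if candidates:
            strong = hashlib.md5(w).digest()     # confirm with the strong hash
            match = next((o for d, o in candidates if d == strong), None)
        if match is not None:
            if literal:
                ops.append(("literal", bytes(literal)))
                literal = bytearray()
            ops.append(("copy", match, block))
            i += block
        else:
            literal.append(segment[i])
            i += 1
    literal += segment[i:]
    if literal:
        ops.append(("literal", bytes(literal)))
    return ops

def sync_sketch(old, new):
    """Run both rounds and return a delta of copy/literal ops against `old`."""
    ops, leftovers = round_one(old, new)
    delta = []
    for op in ops:
        if op[0] == "copy":
            delta.append(op)
        else:
            delta.extend(round_two(op[1], leftovers))
    return delta

def apply_delta(old, delta):
    """Rebuild the new file from the old file and the delta."""
    out = bytearray()
    for op in delta:
        out += old[op[1]:op[1] + op[2]] if op[0] == "copy" else op[1]
    return bytes(out)

# Usage: a small insertion in the middle of a repetitive file.
old = b"".join(b"record %04d: the quick brown fox\n" % i for i in range(200))
new = old[:3000] + b"INSERTED LINE\n" + old[3000:]
assert apply_delta(old, sync_sketch(old, new)) == new
```

In this sketch only the `literal` ops carry new bytes, while `copy` ops refer to data the receiver already holds. The coarse round keeps the fine-grained sliding search confined to the segments that actually changed, which is the division of labor the abstract describes.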