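Simulation of Information Transmission Redundancy Elimination in Big Data Access
Cite this article: LI Jie, HOU Rui. Simulation of Information Transmission Redundancy Elimination in Big Data Access[J]. Computer Simulation, 2020, 37(3): 148-151, 177.
Authors: LI Jie  HOU Rui
Affiliation: Computer College, Xi'an Shiyou University, Xi'an, Shaanxi 710065
Funding: Scientific Research Program of the Shaanxi Provincial Department of Education
Abstract: Traditional methods for eliminating information transmission redundancy in big data access suffer from low recall, low elimination efficiency, and low speed. To address these problems, a redundancy elimination method based on the Hamming distance is proposed. A window-moving mode that combines sliding and rolling reduces the amount of window computation. The Rsync rolling checksum algorithm is combined with the MD5 algorithm: a rolling checksum can be computed starting at any position in a file, the checksums of consecutive data blocks are obtained through a recurrence relation, and data are matched according to the checksums of the different blocks. The matched blocks are then examined with the CDC chunking detection algorithm, and similarity is computed from the cosine similarity formula and the Hamming distance, which completes the elimination of information transmission redundancy in big data access. Experimental results show that the proposed method effectively improves the recall of redundant information as well as the efficiency and speed of redundancy elimination, and can remove redundant information quickly and accurately.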

Keywords: Big data access  Information transmission  Redundancy elimination
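
The abstract describes combining an Rsync-style rolling checksum with MD5 so that a cheap check value can be updated from one window position to the next, with MD5 confirming candidate matches. The paper's exact procedure is not given on this page, so the following is only a minimal sketch of that general technique. The function names, the 16-bit modulus, and the one-byte window step are assumptions made for illustration, not the authors' implementation.

```python
import hashlib

MOD = 1 << 16  # assumed modulus for the two weak-checksum components


def weak_checksum(block: bytes):
    """Compute the weak checksum pair (a, b) of a block from scratch."""
    n = len(block)
    a = sum(block) % MOD
    b = sum((n - i) * x for i, x in enumerate(block)) % MOD
    return a, b


def roll(a, b, out_byte, in_byte, n):
    """Update (a, b) when the window of length n moves one byte forward."""
    a = (a - out_byte + in_byte) % MOD
    b = (b - n * out_byte + a) % MOD
    return a, b


def index_blocks(reference: bytes, n: int):
    """Map weak checksum -> {MD5 digest: offset} for fixed-size reference blocks."""
    table = {}
    for off in range(0, len(reference) - n + 1, n):
        block = reference[off:off + n]
        digest = hashlib.md5(block).digest()
        table.setdefault(weak_checksum(block), {})[digest] = off
    return table


def find_matches(data: bytes, table, n: int):
    """Return (position in data, offset in reference) for every matching window."""
    matches = []
    if len(data) < n:
        return matches
    a, b = weak_checksum(data[:n])
    pos = 0
    while True:
        candidates = table.get((a, b))
        if candidates:
            # The weak checksum can collide, so confirm with MD5.
            digest = hashlib.md5(data[pos:pos + n]).digest()
            if digest in candidates:
                matches.append((pos, candidates[digest]))
        if pos + n >= len(data):
            break
        a, b = roll(a, b, data[pos], data[pos + n], n)
        pos += 1
    return matches
```

For example, `find_matches(new_data, index_blocks(old_data, 4), 4)` lists the windows of `new_data` that already exist as 4-byte blocks in `old_data`. This sketch always advances by one byte; a scheme that jumps a whole block after a match would do less work.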

Simulation of Information Transmission Redundancy Elimination in Big Data Access
LI Jie, HOU Rui. Simulation of Information Transmission Redundancy Elimination in Big Data Access[J]. Computer Simulation, 2020, 37(3): 148-151, 177.
Authors: LI Jie  HOU Rui
Affiliation: Computer College, Xi'an Shiyou University, Xi'an, Shaanxi 710065, China
Abstract: Traditional methods for eliminating information transmission redundancy suffer from low recall and low elimination efficiency. This paper therefore proposes a method, based on the Hamming distance, for eliminating information transmission redundancy in big data access. First, a window-moving mode that combines sliding and rolling is used to reduce the amount of computation. Second, the Rsync rolling checksum algorithm is combined with the MD5 algorithm: the rolling checksum can be computed starting from any position in the file, so the checksums of consecutive data blocks are obtained through a recurrence relation, and data are matched according to the checksums of the different data blocks. The matched data blocks are then examined with the CDC chunking detection algorithm, and their similarity is calculated from the cosine similarity formula and the Hamming distance. This achieves the elimination of information transmission redundancy in big data access. Simulation results show that the proposed method effectively improves the recall of redundant information and the efficiency and rate of redundancy elimination, and that it can eliminate redundant information quickly and accurately.
Keywords:Big data access  Information transmission  Redundancy elimination
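
The abstract names two similarity measures, cosine similarity and the Hamming distance, without stating on this page how blocks are turned into vectors or fingerprints. The sketch below only illustrates the two measures themselves. Using byte-frequency vectors for the cosine similarity and integer bit fingerprints for the Hamming distance are assumptions made for illustration, and both function names are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(block_a: bytes, block_b: bytes) -> float:
    """Cosine similarity of the byte-frequency vectors of two blocks (assumed representation)."""
    freq_a, freq_b = Counter(block_a), Counter(block_b)
    dot = sum(freq_a[k] * freq_b[k] for k in freq_a.keys() & freq_b.keys())
    norm_a = math.sqrt(sum(v * v for v in freq_a.values()))
    norm_b = math.sqrt(sum(v * v for v in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty block has no direction to compare
    return dot / (norm_a * norm_b)


def hamming_distance(fp_a: int, fp_b: int) -> int:
    """Number of bit positions in which two integer fingerprints differ."""
    return bin(fp_a ^ fp_b).count("1")
```

A cosine similarity near 1 or a small Hamming distance would mark two blocks as candidates for redundancy; any concrete thresholds would have to come from the paper itself.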
This article is indexed in VIP, Wanfang Data, and other databases.