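Research on the Similarity Measure and Structural Index of XML Documents
Cite this article: ZHENG Shi-Hui, ZHOU Ao-Ying, ZHANG Long. Research on the Similarity Measure and Structural Index of XML Documents[J]. Chinese Journal of Computers, 2003, 26(9): 1116-1122.
Authors: ZHENG Shi-Hui, ZHOU Ao-Ying, ZHANG Long
Affiliation: Department of Computer Science and Engineering, Fudan University, Shanghai 200433
Funding: Supported by the National Natural Science Foundation of China (60003016, 60003008) and the National "973" Key Basic Research Development Program (1998030404)
Abstract: This paper proposes a method for quantitatively measuring the difference between XML documents, called the XED distance. Using the simulation relation between nodes, an XML document can be represented as a compact, weighted structural index tree, and the similarity between two XML documents can be determined by computing the edit distance between their index trees. Working on index trees greatly improves the efficiency of judging the structural similarity of two XML documents. The XED distance measure can be applied to structural search over XML documents, XML document clustering, XML document structure extraction, change detection for XML documents, and the incremental computation and maintenance of XML views.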

Keywords: database; XML documents; similarity measure; structural index
Revised: August 28, 2001

Similarity Measure and Structural Index of XML Documents
ZHENG Shi-Hui, ZHOU Ao-Ying, ZHANG Long. Similarity Measure and Structural Index of XML Documents[J]. Chinese Journal of Computers, 2003, 26(9): 1116-1122.
Authors: ZHENG Shi-Hui, ZHOU Ao-Ying, ZHANG Long
Abstract: This paper presents a quantitative approach, called the XED distance, for measuring the difference between two XML documents. An XML document can be represented as a concise, weighted structural index tree. It is proven that the similarity between two XML documents can be measured by the distance between their structural index trees. Because the structural index tree is dramatically smaller than the original document tree, this greatly reduces the cost of measuring the similarity between two XML documents. The approach can be used in many applications, such as approximate searching of XML documents, clustering of XML documents, structure extraction from XML documents, change detection in XML documents, and incremental maintenance of XML views.
Keywords: edit distance; XED distance; structural index tree
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.