
Clustering DTDs: An Interactive Two-Level Approach
Citation: Aoying Zhou, Weining Qian, Hailei Qian, Long Zhang, Yuqi Liang, Wen Jin. Clustering DTDs: An interactive two-level approach[J]. Journal of Computer Science and Technology (计算机科学技术学报), 2002, 17(6): 0-0. DOI: 10.1007/BF02960771
Authors: Aoying Zhou (周傲英)  Weining Qian (钱卫宁)  Hailei Qian (钱海蕾)  Long Zhang (张龙)  Yuqi Liang (梁宇奇)  Wen Jin (金文)
Affiliation: (1) Department of Computer Science, Laboratory for Intelligent Information Processing, Fudan University, Shanghai 200433, P.R. China; (2) Department of Computer Science, Simon Fraser University, Canada
Funding: This work is supported by the NKBRSF of China (Grant No. G1998030414), the National Natural Science Foundation of China (Grant No. 60003016), the National Doctoral Research Foundation of China, and the Joint Project with IBM China Research Lab.
Abstract: XML (eXtensible Markup Language) is a standard widely applied in data representation and data exchange. However, the DTD (Document Type Definition), an important concept of XML, is not taken full advantage of in current applications. This paper presents a new method for clustering DTDs, which can also be used in XML document clustering. The two-level method clusters the elements in DTDs and the DTDs themselves separately. Element clustering forms the first level and provides element clusters, which are generalizations of relevant elements. DTD clustering uses this generalized information and forms the second level of the whole clustering process. The two-level method has the following advantages: 1) it takes into consideration both the content and the structure within DTDs; 2) the generalized information about elements is more useful than the separated words in the vector model; 3) it facilitates the search for outliers. The experiments show that the method is able to categorize the relevant DTDs effectively.

Keywords: clustering   XML (eXtensible Markup Language)   DTD (Document Type Definition)   programming languages   Internet   digital library
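
The two-level idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not state which similarity measures or clustering procedures the paper uses, so everything below is an assumption made for illustration. Element names are compared with `difflib.SequenceMatcher`, DTDs are compared by Jaccard overlap of their element-cluster ids, both levels use a simple greedy single-pass clustering, and the function names, thresholds, and sample DTDs are hypothetical. The sketch also looks only at element names and ignores the structural information the paper says it takes into account.

```python
import re
from difflib import SequenceMatcher

# Matches the element name in declarations such as <!ELEMENT book (title, author)>
ELEMENT_DECL = re.compile(r"<!ELEMENT\s+([\w.\-:]+)")


def element_names(dtd_text):
    """Return the lower-cased set of element names declared in a DTD string."""
    return {name.lower() for name in ELEMENT_DECL.findall(dtd_text)}


def greedy_cluster(items, similar):
    """Put each item into the first cluster whose first member it is similar to,
    or start a new cluster if there is none (assumed stand-in procedure)."""
    clusters = []
    for item in items:
        for cluster in clusters:
            if similar(item, cluster[0]):
                cluster.append(item)
                break
        else:
            clusters.append([item])
    return clusters


def two_level_cluster(dtds, name_threshold=0.8, dtd_threshold=0.5):
    """Cluster element names first, then cluster DTDs by shared element clusters."""
    per_dtd = {label: element_names(text) for label, text in dtds.items()}

    # Level 1: group similar element names across all DTDs.
    all_names = sorted(set().union(*per_dtd.values()))

    def names_similar(a, b):
        return SequenceMatcher(None, a, b).ratio() >= name_threshold

    element_clusters = greedy_cluster(all_names, names_similar)
    cluster_of = {name: i for i, c in enumerate(element_clusters) for name in c}

    # Level 2: describe each DTD by the ids of the element clusters it uses.
    signature = {
        label: {cluster_of[name] for name in names}
        for label, names in per_dtd.items()
    }

    def dtds_similar(a, b):
        union = signature[a] | signature[b]
        if not union:
            return False
        return len(signature[a] & signature[b]) / len(union) >= dtd_threshold

    dtd_clusters = greedy_cluster(sorted(dtds), dtds_similar)
    return element_clusters, dtd_clusters


# Hypothetical toy input, not taken from the paper's experiments.
SAMPLE_DTDS = {
    "book1": "<!ELEMENT book (title, author)> <!ELEMENT title (#PCDATA)> "
             "<!ELEMENT author (#PCDATA)>",
    "book2": "<!ELEMENT book (title, authors)> <!ELEMENT title (#PCDATA)> "
             "<!ELEMENT authors (#PCDATA)>",
    "order": "<!ELEMENT order (item, price)> <!ELEMENT item (#PCDATA)> "
             "<!ELEMENT price (#PCDATA)>",
}

element_clusters, dtd_clusters = two_level_cluster(SAMPLE_DTDS)
print(element_clusters)
print(dtd_clusters)
```

In this sketch the first level merges `author` and `authors` into one element cluster, which is what lets the two book DTDs share an identical signature at the second level even though their raw element names differ. DTD clusters that end up with a single member are one plausible way to surface outlier candidates, though the paper's own outlier handling may differ.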
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), SpringerLink, and other databases.