
Comparative Study of XML Parsing Methods in Heterogeneous Text Data Conversion
Cite this article: HE Zhuoheng, LIU Zhiyong, LI Lu, LI Changming, ZHANG Lin. Comparative Study of XML Parsing Methods in Heterogeneous Text Data Conversion[J]. Computer Engineering, 2020, 46(7): 286-293, 299.
Authors: HE Zhuoheng, LIU Zhiyong, LI Lu, LI Changming, ZHANG Lin
Affiliations: School of Information Science and Technology, Northeast Normal University, Changchun 130024, China; School of Software, Tongji University, Shanghai 200092, China; School of Electrical and Information Engineering, Changchun Guanghua University, Changchun 130033, China; School of Software, Jilin University, Changchun 130012, China
Funding: New Engineering Research and Practice Project of the Education Department of Jilin Province; 13th Five-Year Science and Technology Research Planning Project of the Education Department of Jilin Province
Abstract: This paper compares four methods for parsing XML text during heterogeneous text data conversion: DOM, SAX, JDOM, and DOM4J. Parsing time, heap memory usage, and CPU utilization serve as the evaluation metrics for ranking the four methods. The advantage of this evaluation approach is that the metrics continue to discriminate well among the four methods when the data volume or the data attributes change. Ten XML datasets converted from heterogeneous Web log text data are compared. The experimental results show that when the data volume grows and parsing time is the priority, DOM4J outperforms the other three methods; when memory footprint is the priority, SAX outperforms the other three methods.

Keywords: heterogeneous text; XML parsing; data structure conversion; time complexity; space complexity
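
The kind of comparison the abstract describes can be sketched with the two parsers that ship with the JDK. This is a minimal illustration and not the authors' benchmark: the `ParserComparison` class, the `sampleXml` helper, and the `<log>`/`<entry>` schema are all hypothetical, JDOM and DOM4J are omitted because they are third-party libraries, and only elapsed time is measured (heap and CPU usage would need a profiler or repeated, warmed-up runs to be meaningful).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

final class ParserComparison {

    /** Builds a synthetic log-like XML document with n entry elements (made-up schema). */
    static byte[] sampleXml(int n) {
        StringBuilder sb = new StringBuilder("<log>");
        for (int i = 0; i < n; i++) {
            sb.append("<entry id=\"").append(i).append("\"><status>200</status></entry>");
        }
        return sb.append("</log>").toString().getBytes(StandardCharsets.UTF_8);
    }

    /** DOM: builds the whole tree in memory, then counts the entry elements. */
    static int countWithDom(byte[] xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml));
        return doc.getElementsByTagName("entry").getLength();
    }

    /** SAX: streams parse events and keeps only a counter, so no tree is retained. */
    static int countWithSax(byte[] xml) throws Exception {
        int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if ("entry".equals(qName)) {
                    count[0]++;
                }
            }
        };
        SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new ByteArrayInputStream(xml), handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = sampleXml(100_000);

        // Single-shot timings only; a real benchmark would warm up the JVM and average runs.
        long t0 = System.nanoTime();
        int domCount = countWithDom(xml);
        long t1 = System.nanoTime();
        int saxCount = countWithSax(xml);
        long t2 = System.nanoTime();

        System.out.printf("DOM: %d entries in %d ms%n", domCount, (t1 - t0) / 1_000_000);
        System.out.printf("SAX: %d entries in %d ms%n", saxCount, (t2 - t1) / 1_000_000);
    }
}
```

The structural difference is visible in the code: the DOM path materializes a full `Document` before any query runs, while the SAX path retains nothing beyond a counter, which is consistent with the abstract's finding that SAX is preferable when memory footprint is the priority.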
This article is indexed in databases including VIP and Wanfang Data.