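An Automatic Categorization Method for Web Database Query Results Based on an Improved Decision Tree Algorithm
Cite this article: Meng Xiangfu, Ma Zongmin, Zhang Xiaoyan, Wang Xing. An Automatic Categorization Method for Web Database Query Results Based on an Improved Decision Tree Algorithm[J]. Journal of Computer Research and Development (计算机研究与发展), 2012, 49(12): 2656-2670.
Authors: Meng Xiangfu, Ma Zongmin, Zhang Xiaoyan, Wang Xing
Affiliations: 1. College of Electronic and Information Engineering, Liaoning Technical University, Huludao, Liaoning 125105
2. College of Information Science and Engineering, Northeastern University, Shenyang 110819
Funding: National Science Foundation for Young Scientists of China; National Natural Science Foundation of China (General Program); China National Coal Association Science and Technology Research Guidance Program; Liaoning Provincial Department of Science and Technology Program
Abstract: To address the problem of Web databases returning too many query results, this paper proposes an automatic categorization method for Web database query results based on an improved decision tree algorithm. In the offline phase, the method analyzes the query history of all users in the system and merges semantically similar queries; based on the merged queries, it partitions the original data into multiple tuple clusters, each corresponding to one type of user preference. When a query arrives, the method uses the improved decision tree algorithm, together with the tuple clusters produced offline, to automatically build a labeled hierarchical categorization tree over the query result set, so that users can quickly select and locate the information they need by inspecting the labels. Experimental results show that the proposed method has a relatively low search cost and good categorization quality, and can effectively meet the personalized query needs of different types of users.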

Keywords: Web database; user preference; tuple clustering; C4.5 algorithm; query result categorization

A Categorization Approach Based on Adapted Decision Tree Algorithm for Web Databases Query Results
Meng Xiangfu, Ma Zongmin, Zhang Xiaoyan, Wang Xing. A Categorization Approach Based on Adapted Decision Tree Algorithm for Web Databases Query Results[J]. Journal of Computer Research and Development, 2012, 49(12): 2656-2670.
Authors: Meng Xiangfu, Ma Zongmin, Zhang Xiaoyan, Wang Xing
Affiliations: 1. College of Electronic and Information Engineering, Liaoning Technical University, Huludao, Liaoning 125105; 2. College of Information Science and Engineering, Northeastern University, Shenyang 110819
Abstract: To deal with the problem that too many results are returned from a Web database in response to a user query, this paper proposes a novel approach, based on an adapted decision tree algorithm, for automatically categorizing Web database query results. The query history of all users in the system is analyzed offline, and semantically similar queries are merged into the same cluster. Next, a set of tuple clusters over the original data is generated in accordance with the query clusters, each tuple cluster corresponding to one type of user preference. When a query arrives, a labeled, hierarchical categorization tree is constructed over the query results by applying the adapted decision tree algorithm to the tuple clusters generated in the offline phase; the tree lets the user easily select and locate the information he or she needs by checking the labels. Experimental results demonstrate that the categorization approach achieves a low navigational cost and good categorization effectiveness, and that it can effectively meet the personalized query needs of different types of users.
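The online step described in the abstract can be illustrated with a minimal sketch. It assumes that each result tuple already carries the ID of the tuple cluster it was assigned to offline, and it splits the result set with the standard C4.5 gain ratio on categorical attributes. The abstract does not specify how the authors adapt the decision tree algorithm, so the function names (`build_tree`, `gain_ratio`), the `cluster` field, and the sample data below are all hypothetical and are not the paper's actual method.

```python
import math
from collections import Counter, defaultdict


def entropy(values):
    """Shannon entropy (in bits) of a list of hashable values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def partition(rows, attr):
    """Group rows by their value of a categorical attribute."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attr]].append(row)
    return groups


def gain_ratio(rows, attr, label_key):
    """Standard C4.5 gain ratio for splitting rows on attr (not the paper's adaptation)."""
    groups = partition(rows, attr)
    if len(groups) < 2:
        return 0.0  # a single-valued attribute cannot split anything
    total = len(rows)
    base = entropy([r[label_key] for r in rows])
    remainder = sum(
        len(g) / total * entropy([r[label_key] for r in g]) for g in groups.values()
    )
    split_info = entropy([r[attr] for r in rows])
    return (base - remainder) / split_info


def build_tree(rows, attrs, label_key="cluster", min_size=2):
    """Return a nested dict: a leaf holds tuples, an inner node holds labeled children."""
    labels = {r[label_key] for r in rows}
    if len(rows) <= min_size or len(labels) == 1 or not attrs:
        return {"tuples": rows}
    best = max(attrs, key=lambda a: gain_ratio(rows, a, label_key))
    if gain_ratio(rows, best, label_key) <= 0.0:
        return {"tuples": rows}
    rest = [a for a in attrs if a != best]
    return {
        "split_on": best,
        "children": {
            f"{best} = {value}": build_tree(group, rest, label_key, min_size)
            for value, group in partition(rows, best).items()
        },
    }


# Hypothetical query results; "cluster" is the assumed offline tuple-cluster ID.
results = [
    {"city": "Seattle", "type": "condo", "cluster": 0},
    {"city": "Seattle", "type": "house", "cluster": 0},
    {"city": "Seattle", "type": "condo", "cluster": 0},
    {"city": "Redmond", "type": "house", "cluster": 1},
    {"city": "Redmond", "type": "condo", "cluster": 1},
    {"city": "Redmond", "type": "house", "cluster": 1},
]
tree = build_tree(results, ["city", "type"])
```

In this toy data the `city` attribute separates the two assumed preference clusters perfectly, so the root splits on it and each child label (for example `city = Seattle`) names a category the user could inspect instead of scanning all six tuples.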
Keywords: Web database; user preference; tuple clustering; C4.5 algorithm; query result categorization
This article is indexed in CNKI, Wanfang Data, and other databases.