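An Information Diffluence Mechanism for Chinese Text Filtering
Cite this article:LIN Hong-Fei, LI Ye-Li, YAO Tian-Shun. An information diffluence mechanism for Chinese text filtering[J]. Journal of Computer Research and Development, 2000, 37(4):470-476.
Authors:LIN Hong-Fei  LI Ye-Li  YAO Tian-Shun
Affiliation:Department of Computer Science, Northeastern University, Shenyang 110006, China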
Funding:National Natural Science Foundation of China (grant no. 69675019); Doctoral Program Foundation of the State Education Commission
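Abstract:Information diffluence is a powerful means of improving efficiency in text filtering, and this paper proposes a new information diffluence mechanism for Chinese text filtering. The basic idea is to build on concept expansion and organize the information needs of different users into a tree structure in which their common parts become shared branches. Quantitative matching between texts and profiles is then carried out with the proposed side similarity and side matching ratio. This relaxes the strict constraints that the traditional Boolean model places on matching texts against profiles, compensates for the purely quantitative nature of the vector space model, and reflects users' information needs more comprehensively. Experiments show that the mechanism markedly improves filtering efficiency.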

Keywords:text filtering  concept expansion  information diffluence  decision tree  information processing

AN INFORMATION DIFFLUENCE MECHANISM FOR CHINESE TEXT FILTERING
LIN Hong-Fei,LI Ye-Li,YAO Tian-Shun.AN INFORMATION DIFFLUENCE MECHANISM FOR CHINESE TEXT FILTERING[J].Journal of Computer Research and Development,2000,37(4):470-476.
Authors:LIN Hong-Fei  LI Ye-Li  YAO Tian-Shun
Abstract:Information diffluence plays an important role in improving the efficiency of text filtering, so this paper presents a new mechanism for it. The main idea is as follows. Based on concept expansion of the keywords given by users, user profiles are automatically organized into a CDT (concept-based decision tree), and the diffluence mechanism operates on this CDT. The tree lets users share common segments, and quantitative matching between texts and user profiles is implemented with the side similarity and the side matching ratio. Consequently, the mechanism relaxes the strict Boolean constraint and overcomes the shortcoming of the vector space model, which focuses only on quantitative factors. As a result, it can express the information requirements of diverse users more comprehensively and markedly improve the efficiency of text filtering.
Keywords:text filtering  vector space model  concept expansion  user profiles  information diffluence  decision tree
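
The abstract names the ingredients of the mechanism (a profile tree with shared branches, a side similarity, a side matching ratio) but gives no formulas, so the following Python sketch is only an illustration of the general idea, not the authors' CDT. Everything in it is an assumption: the names `Node`, `build_tree`, `facet_similarity`, and `route`, the definition of similarity as the fraction of a facet's terms found in the text, the definition of the matching ratio as the fraction of a profile's facets that reach a similarity threshold, and both threshold values.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a hypothetical profile tree: a facet (an expanded concept set)."""
    terms: frozenset
    users: set = field(default_factory=set)       # users whose profile ends here
    children: dict = field(default_factory=dict)  # facet terms -> child Node


def build_tree(profiles):
    """Merge user profiles (ordered lists of facets) so common prefixes share branches."""
    root = Node(frozenset())
    for user, facets in profiles.items():
        node = root
        for facet in facets:
            key = frozenset(facet)
            node = node.children.setdefault(key, Node(key))
        node.users.add(user)
    return root


def facet_similarity(terms, words):
    """Assumed measure: fraction of a facet's terms that occur in the text."""
    if not terms:
        return 0.0
    return len(terms & words) / len(terms)


def route(root, words, sim_threshold=0.5, ratio_threshold=0.5):
    """Return {user: match ratio} for users whose ratio reaches ratio_threshold.

    A facet shared by several users is scored once per text, not once per user.
    """
    delivered = {}
    stack = [(child, 0, 0) for child in root.children.values()]
    while stack:
        node, matched, depth = stack.pop()
        depth += 1
        if facet_similarity(node.terms, words) >= sim_threshold:
            matched += 1
        ratio = matched / depth  # assumed "matching ratio" along this branch
        for user in node.users:
            if ratio >= ratio_threshold:
                delivered[user] = ratio
        stack.extend((child, matched, depth) for child in node.children.values())
    return delivered


# Two hypothetical users share the first facet, so the tree has one top-level branch.
profiles = {
    "alice": [["network", "internet"], ["security", "encryption"]],
    "bob": [["network", "internet"], ["price", "market"]],
}
tree = build_tree(profiles)
words = {"internet", "security", "encryption", "protocol"}
result = route(tree, words, ratio_threshold=0.75)
```

Because partial matches contribute to a ratio instead of failing outright, a text can still reach a user when only some facets match, which is the kind of relaxation of Boolean matching the abstract describes. The thresholds here are arbitrary illustration values.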
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.