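An Improved Model for Adaptive Text Information Filtering (一种改进的自适应文本信息过滤模型)
Cite this article: Ma Liang, Chen Qunxiu, Cai Lianhong. An Improved Model for Adaptive Text Information Filtering [J]. Journal of Computer Research and Development (计算机研究与发展), 2005, 42(1): 79-84.
Authors: Ma Liang (马亮), Chen Qunxiu (陈群秀), Cai Lianhong (蔡莲红)
Affiliation: State Key Laboratory of Intelligent Technology and Systems, Department of Computer Science and Technology, Tsinghua University, Beijing 100084
Funding: National High Technology Research and Development Program of China ("863" Program) (2001AA14040)
Abstract: Adaptive information filtering helps users obtain content of interest from large information sources such as the Web, or filter out irrelevant junk information. To address the shortcomings of existing adaptive filtering systems, an improved adaptive text information filtering model is proposed. The model provides two relevance retrieval mechanisms. On this basis, the feedback algorithm is improved, the idea of incremental training is adopted, and a new algorithm is proposed for the adaptive learning mechanism used during filtering. A system based on this model achieved good results in an international evaluation in the related field. Experimental data show that each improvement is effective and that the new model performs better.

Keywords: information retrieval; Web; adaptive information filtering; language model; relevance feedback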

An Improved Model for Adaptive Text Information Filtering
Ma Liang, Chen Qunxiu, Cai Lianhong. An Improved Model for Adaptive Text Information Filtering [J]. Journal of Computer Research and Development, 2005, 42(1): 79-84.
Authors: Ma Liang, Chen Qunxiu, Cai Lianhong
Abstract: Information filtering is commonly used to track topics of interest and to remove junk content from an information stream. Adaptive information filtering, which requires few initial training resources and actively improves itself during filtering, offers better performance and convenience than older approaches. However, difficulties remain in training and adaptive learning. This paper proposes an improved model for adaptive text filtering. The model uses two retrieval/feedback mechanisms: one is based on the vector space model and the Rocchio feedback algorithm, and the other is derived from a recent language-model IR system. Building on these, an incremental learning method using multi-step pseudo feedback is introduced in profile training to keep the bias from the original topic minimal. An adaptive profile-adjustment mechanism for the filtering process is also developed, which newly takes into account the document distribution and the decay rate of the topic features. A running system built on the new model achieved a high evaluation score in a related international contest, indicating that the improvements to the filtering model are effective.
Keywords: information retrieval; Web; adaptive information filtering; language model; relevance feedback
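
The abstract names the vector space model with Rocchio feedback as one of the two retrieval/feedback mechanisms but gives no formulas or parameters. The following is a minimal sketch of textbook Rocchio feedback on sparse term-weight vectors, not the authors' improved algorithm. The function names `rocchio_update` and `cosine`, the dictionary representation, and the coefficients `alpha`, `beta`, and `gamma` are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def rocchio_update(profile, relevant, non_relevant,
                   alpha=1.0, beta=0.75, gamma=0.15):
    """Textbook Rocchio update of a topic profile.

    profile, and each document in relevant / non_relevant, is a dict
    mapping term -> weight. The coefficients are illustrative defaults,
    not values taken from the paper.
    """
    updated = defaultdict(float)
    # Keep a scaled copy of the original profile.
    for term, weight in profile.items():
        updated[term] += alpha * weight
    # Move toward the centroid of relevant documents and away from
    # the centroid of non-relevant ones.
    for docs, coeff in ((relevant, beta), (non_relevant, -gamma)):
        if not docs:
            continue
        for doc in docs:
            for term, weight in doc.items():
                updated[term] += coeff * weight / len(docs)
    # Drop terms whose weight is not positive (a common convention).
    return {t: w for t, w in updated.items() if w > 0}


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


profile = {"filter": 1.0}
relevant = [{"filter": 1.0, "adaptive": 2.0}]
non_relevant = [{"spam": 4.0}]
updated = rocchio_update(profile, relevant, non_relevant)
```

In an adaptive filter, a loop of this kind would score each incoming document against the profile with `cosine`, deliver it if the score passes a threshold, and call `rocchio_update` when feedback on the delivered document arrives. How the paper sets the threshold, weights pseudo feedback, or models topic decay is not stated in the abstract, so none of that is sketched here.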
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.