Similar literature
1.
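An improved adaptive text information filtering model   Total citations: 19, self-citations: 1, citations by others: 18
Adaptive information filtering helps users obtain content of interest from large information sources such as the Web, or filter out irrelevant junk information. To address the shortcomings of existing adaptive filtering systems, this paper proposes an improved adaptive text information filtering model. The model provides two relevance retrieval mechanisms, on the basis of which the feedback algorithm is improved. It also adopts incremental training and proposes a new algorithm for the adaptive learning mechanism used during filtering. A system based on this model achieved good results in an international evaluation in the relevant field. The experimental data indicate that each improvement is effective and that the new model performs better.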

2.
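This paper proposes a method for modeling users' personalized needs based on the vector space model. The keyword weighting algorithm is improved: a web page is divided into four types of logical sections, and a combined weight is obtained by weighting the keyword's weights in each type of section. Following content-based construction and feedback principles, user model construction is divided into a training phase and an adaptive learning phase. In the training phase, an initial user model is trained from user-supplied sample documents and keywords with a class-centroid classification algorithm. In the adaptive learning phase, a periodic adaptive learning mechanism based on the Rocchio algorithm adjusts the user model according to the user's evaluation of the filtering results, improving the ability to track the user's personalized needs dynamically. A prototype personalized information filtering system was developed. Using 中国服装网 as the experimental data source, its filtering performance was compared with that of the Baidu search engine. The results show that the system updates its index promptly, responds quickly, and returns information that is more precise, more reasonable, and closer to users' actual needs.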
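The Rocchio-style adjustment of a user model mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the function name `rocchio_update`, the sparse dictionary representation, and the coefficients `alpha`, `beta`, and `gamma` are assumptions chosen for illustration.

```python
from collections import defaultdict


def rocchio_update(profile, relevant, irrelevant,
                   alpha=1.0, beta=0.75, gamma=0.15):
    """Return a new term-weight profile after one feedback round.

    profile    -- dict mapping term -> weight (the current user model)
    relevant   -- list of document vectors the user judged relevant
    irrelevant -- list of document vectors the user judged irrelevant
    The coefficients are illustrative defaults, not values from the paper.
    """
    updated = defaultdict(float)
    for term, weight in profile.items():
        updated[term] += alpha * weight
    # Add the centroid of relevant documents, subtract that of irrelevant ones.
    for docs, coeff in ((relevant, beta), (irrelevant, -gamma)):
        if not docs:
            continue
        for doc in docs:
            for term, weight in doc.items():
                updated[term] += coeff * weight / len(docs)
    # Terms whose weight is not positive are dropped (a common convention).
    return {term: w for term, w in updated.items() if w > 0}
```

Calling this function once per feedback period, with the documents judged since the previous call, would give a periodic update in the spirit of the abstract.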

3.
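To obtain content of interest from dynamic network information streams, or to filter out irrelevant junk information, an adaptive information filtering system based on the vector space model was designed. The paper describes the system's structure and workflow, and analyzes in depth the key techniques of its implementation, including text representation, initialization of the user template and filtering threshold, feature selection, the adaptive filtering algorithm, template updating, and threshold adjustment.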

4.
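A collaborative filtering algorithm based on real-time user feedback   Total citations: 2, self-citations: 0, citations by others: 2
傅鹤岗 (Fu Hegang), 李冉 (Li Ran). 《计算机应用》, 2011, 31(7): 1744-1747
Traditional memory-based collaborative filtering algorithms scale poorly, while model-based collaborative filtering algorithms give low recommendation quality because their model data lag behind. To address this, a collaborative filtering algorithm based on real-time user feedback is proposed. After a user submits an item rating, the algorithm updates the recommendation model data in real time, so that changes in user interest are reflected more accurately. Experimental results show that the algorithm effectively improves recommendation accuracy and greatly shortens recommendation time.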

5.
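A text filtering model based on concept expansion   Total citations: 8, self-citations: 1, citations by others: 7
While introducing the background of text filtering and the vector space model, this paper proposes a text filtering model that expands the user template with a semantic dictionary. The model first analyzes a text and represents it as a vector in the vector space. After the initial user template is formed, the template is expanded with synonyms, and the expanded template is used for text filtering. Based on user feedback, the template is adaptively modified to follow the user's changing needs and to improve filtering performance. Experiments show that this does improve the system's coverage and efficiency.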
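The synonym expansion step can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how expanded terms are weighted, so the `decay` factor, the function name `expand_template`, and the in-memory synonym dictionary are all hypothetical.

```python
def expand_template(template, synonyms, decay=0.5):
    """Add synonyms of each template term at a reduced weight.

    template -- dict mapping term -> weight (the initial user template)
    synonyms -- dict mapping term -> iterable of synonyms (a stand-in
                for the semantic dictionary; hypothetical structure)
    decay    -- fraction of the original weight given to a synonym
                (an assumed parameter, not taken from the paper)
    """
    expanded = dict(template)
    for term, weight in template.items():
        for synonym in synonyms.get(term, ()):
            # If the synonym is already present, keep the larger weight.
            expanded[synonym] = max(expanded.get(synonym, 0.0), decay * weight)
    return expanded
```

Giving synonyms less weight than the original terms is one way to widen coverage without letting expanded terms dominate the match score.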

6.
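To address the low recommendation quality of user-based and item-based collaborative filtering models, a collaborative filtering model that combines user and item predictions is proposed. The model considers users and items together. First, a well-performing similarity model is adaptively optimized. Then, according to the similarity values, similar users and similar items are selected to construct neighbor sets for the target, and a prediction function produces the user-based and item-based predictions. Finally, an adaptive balance factor reconciles the two to give the final prediction. Comparative experiments under different evaluation criteria show that, compared with typical current models such as RSCF, HCFR, and UNCF, the proposed model performs well not only in item prediction accuracy but also in recommendation accuracy and comprehensiveness.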
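A rough sketch of blending a user-based and an item-based prediction with a balance factor is given below. The abstract does not specify the prediction function or how the balance factor adapts, so the similarity-weighted average and the fixed `balance` argument here are assumptions, and both function names are hypothetical.

```python
def weighted_prediction(neighbours):
    """Similarity-weighted average rating over (similarity, rating) pairs."""
    total = sum(abs(sim) for sim, _ in neighbours)
    if total == 0:
        return None  # no usable neighbours
    return sum(sim * rating for sim, rating in neighbours) / total


def combined_prediction(user_neighbours, item_neighbours, balance=0.5):
    """Blend the user-based and item-based predictions.

    balance is the share given to the user-based prediction. It is fixed
    here for simplicity; the paper describes an adaptive factor instead.
    """
    user_pred = weighted_prediction(user_neighbours)
    item_pred = weighted_prediction(item_neighbours)
    if user_pred is None:
        return item_pred
    if item_pred is None:
        return user_pred
    return balance * user_pred + (1 - balance) * item_pred
```

For example, neighbours `[(0.8, 4), (0.2, 2)]` on the user side and `[(0.5, 5), (0.5, 3)]` on the item side give predictions of about 3.6 and 4.0, which an equal balance blends to about 3.8.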

7.
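A text filtering system based on the vector space model   Total citations: 64, self-citations: 0, citations by others: 64
Text filtering is the process of finding, in a large stream of text data, the texts that satisfy a particular user's needs. The paper first introduces the Text REtrieval Conference (TREC), the most authoritative international evaluation conference in text retrieval, and its text filtering track, in terms of tasks, test topics, corpora, and evaluation measures. It then describes in detail a text filtering system based on the vector space model. The system consists of two phases: training and adaptive filtering. In the training phase, the initial filtering template is built through feature extraction and pseudo-feedback, and an initial threshold is set. In the filtering phase, the template and threshold are adjusted adaptively according to user feedback. The system took part in the evaluation of the ninth Text REtrieval Conference, held in 2000, and achieved very good results, ranking among the best of 15 systems from several countries. Its average precision was 26.5% for adaptive filtering and 31.7% for batch filtering.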
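The filtering phase, in which a document is delivered when its similarity to the template reaches a threshold and the threshold moves with feedback, can be sketched as below. The update rule (a fixed `step` up after an irrelevant delivery, down after a relevant one) and the class name `AdaptiveFilter` are assumptions for illustration; the abstract does not describe the system's actual rule.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class AdaptiveFilter:
    """Threshold filter whose threshold follows user feedback (assumed rule)."""

    def __init__(self, template, threshold=0.3, step=0.05):
        self.template = template
        self.threshold = threshold
        self.step = step

    def accepts(self, doc):
        """Deliver the document if it is similar enough to the template."""
        return cosine(self.template, doc) >= self.threshold

    def feedback(self, relevant):
        """Adjust the threshold after the user judges a delivered document."""
        if relevant:
            # A hit suggests the threshold can be relaxed slightly.
            self.threshold = max(0.0, self.threshold - self.step)
        else:
            # A false alarm suggests the threshold should be stricter.
            self.threshold = min(1.0, self.threshold + self.step)
```

Template updating is left out of this sketch so that the threshold behaviour stays easy to follow.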

8.
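A text filtering model based on lexical chains   Total citations: 5, self-citations: 0, citations by others: 5
While introducing the background of text filtering and the shortcomings of the traditional keyword-based vector space approach, the paper introduces the concept of lexical chains and proposes a text filtering model that represents texts by lexical chains. The model first analyzes a text and represents it as lexical chains. After the initial user template is formed, this template is used for text filtering. Based on user feedback, the template is adaptively modified to follow the user's changing needs and to improve filtering performance. Experiments show that this does improve system precision.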

9.
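A user preference adjustment algorithm for information filtering in an e-commerce environment   Total citations: 5, self-citations: 0, citations by others: 5
徐博艺 (Xu Boyi), 姜丽红 (Jiang Lihong). 《计算机工程》, 2001, 27(10): 102-104
The information filtering process is analyzed, including defining user preferences, receiving the input information stream, filtering, and user feedback. On this basis, the characteristics of collecting and filtering group decision information in a network environment are analyzed, and algorithms for generating and adaptively adjusting user preferences in decision information filtering are proposed.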

10.
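An information security filtering system based on the vector space model   Total citations: 6, self-citations: 0, citations by others: 6
Information filtering is the process of monitoring information sources to find information that meets a user's needs. The paper discusses in detail an information filtering system based on the vector space model. The system consists of a training phase and an adaptive filtering phase. In the training phase, the initial filtering template is built through topic processing and feature extraction, and an initial threshold is set. In the filtering phase, the template and threshold are adjusted adaptively according to user feedback. Finally, the evaluation method and experimental results are given.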

11.
A universal search engine is unable to provide a personal touch to a user query. To overcome this deficiency, vertical search engines are used, which return search results from a specific domain. An alternative is to use a personalized search system. In our endeavor to provide personalized search results, the proposed system, Exclusively Your’s, observes a user's browsing behavior and actions. Based on the observed behavior, it dynamically constructs a user profile consisting of terms related to the user's interests. The constructed profile is later used for query expansion. The goal of the research in this paper is not to provide all the relevant results, but a few high-quality personalized search results at the top of the ranked list, that is, high precision. We performed experiments by personalizing Google, Yahoo, and Naver (a widely used search engine in Korea). The results show that a search engine using Exclusively Your’s yields significant improvement. We also compared the user profile constructed by the proposed approach with other similar personalization approaches; the results show a marginal increase in precision.

12.
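Query expansion is an important part of information retrieval research. Current query expansion is based on a uniform user model and does not consider users' individual interests, which reduces its precision. The paper analyzes the cause of this problem and proposes a user interest expansion model based on concept graphs to improve the precision of query expansion. Experiments show that the method effectively improves the recall and precision of queries.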

13.
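Text feature regions and a matching mechanism for text filtering   Total citations: 3, self-citations: 0, citations by others: 3
To search the Internet for relevant texts according to a user's information needs, the paper proposes a matching mechanism for text filtering. The basic idea is to improve the user template with a dictionary-based concept expansion method, and to compute the global similarity between the expanded user template and a text to obtain preliminary filtering results. Within the text's feature regions, local similarities are then computed for fragments such as the title, abstract paragraph, first paragraph, and last paragraph, so as to evaluate comprehensively how well the text matches the user template. The method is easy to put into practice and gives clear improvements.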

14.
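To address the problem of building user interest models, the paper describes user interests with a user interest tree, adopts a vector space model representation, studies the construction of personalized user models, and proposes an interest model adjustment algorithm. Simulation experiments show that the model effectively improves the precision of retrieval results and so meets users' personalized needs.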

15.
This paper provides a transparent and speculative algorithm for content-based web page prefetching. The algorithm relies on a profile based on the Internet browsing habits of the user. It aims at reducing the perceived latency when the user requests a document by clicking on a hyperlink. The proposed user profile relies on the frequency of occurrence of selected elements forming the web pages visited by the user. These frequencies are employed in a mechanism for predicting the user’s future actions. To anticipate an adjacent action, the anchor text around each of the outbound links is used and weights are assigned to these links. Some of the linked documents are then prefetched and stored in a local cache according to the assigned weights. The proposed algorithm was tested against three different prefetching algorithms and yielded improved cache-hit rates given a moderate bandwidth overhead. Furthermore, the precision of inferring the user’s preference is evaluated through recall-precision curves. Statistical evaluation shows that the achieved improvement in recall-precision performance is significant.
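The idea of weighting outbound links by how well their anchor text matches a frequency-based profile can be sketched as follows. The scoring formula, the whitespace tokenization, and the names `rank_links` and `top_k` are assumptions for illustration and are not taken from the paper.

```python
def rank_links(profile_counts, links, top_k=2):
    """Choose which outbound links to prefetch.

    profile_counts -- dict mapping word -> how often the user has seen it
    links          -- dict mapping URL -> anchor text of that link
    top_k          -- maximum number of links to prefetch (assumed knob
                      standing in for a bandwidth budget)
    """
    total = sum(profile_counts.values()) or 1

    def weight(anchor_text):
        # Share of the profile's mass covered by the anchor words.
        words = anchor_text.lower().split()
        return sum(profile_counts.get(word, 0) for word in words) / total

    scored = [(weight(text), url) for url, text in links.items()]
    # Highest weight first; ties are broken by URL for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for w, url in scored[:top_k] if w > 0]
```

Links whose anchor text shares no words with the profile get zero weight and are never prefetched, which keeps the bandwidth overhead bounded in this sketch.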

16.
Most Web search engines use the content of Web documents and their link structures to assess the relevance of a document to the user’s query. With the growth of the information available on the Web, it becomes difficult for such search engines to satisfy a user information need expressed by a few keywords. Personalized information retrieval is a promising way to resolve this problem by modeling the user profile from the user's general interests and then integrating it into a personalized document ranking model. In this paper, we present a personalized search approach that involves a graph-based representation of the user profile. The user profile refers to the user's interest in a specific search session, defined as a sequence of related queries. It is built by means of score propagation, which activates a set of semantically related concepts of a reference ontology, namely the ODP. The user profile is maintained across related search activities using a graph-based merging strategy. To detect related search activities, we define a session boundary recognition mechanism based on the Kendall rank correlation measure, which tracks changes in the dominant concepts held by the user profile relative to a newly submitted query. Personalization is performed by re-ranking the search results of related queries using the user profile. Our experimental evaluation is carried out using the HARD 2003 TREC collection and shows that our session boundary recognition mechanism based on the Kendall measure achieves significant precision compared with other non-ranking-based measures such as the cosine and WebJaccard similarity measures. Moreover, the results show that graph-based search personalization is effective for improving search accuracy.
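A session boundary test based on Kendall rank correlation can be sketched as below. This computes Kendall tau-a over the concepts shared by two score dictionaries. The decision rule in `is_new_session` and its `threshold` value are assumptions; the abstract does not state how the correlation is turned into a boundary decision.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall tau-a over the concepts present in both score dicts."""
    common = sorted(set(scores_a) & set(scores_b))
    n = len(common)
    if n < 2:
        return 0.0  # not enough shared concepts to compare rankings
    concordant = discordant = 0
    for x, y in combinations(common, 2):
        sign = (scores_a[x] - scores_a[y]) * (scores_b[x] - scores_b[y])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def is_new_session(profile_scores, query_scores, threshold=0.0):
    """Flag a boundary when the two concept rankings disagree (assumed rule)."""
    return kendall_tau(profile_scores, query_scores) < threshold
```

Here `profile_scores` and `query_scores` would map ontology concepts to their weights in the current profile and in the new query, respectively. A correlation near 1 means the dominant concepts are ranked alike, while a negative value suggests a shift of topic.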

17.
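To address the low recall and precision of current Web information retrieval, a personalized local context analysis method is proposed to improve retrieval performance. The method designs a client-side user interest mining model and combines the user interest model with local context analysis, overcoming the shortcomings of local context analysis. Experimental results show that the method effectively improves the recall and precision of Web information retrieval.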

18.
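A semantic model for query expansion based on user interest   Total citations: 1, self-citations: 0, citations by others: 1
Synonymy and ambiguity of words in natural language have long been key factors lowering the recall and precision of information retrieval, and they are even more pronounced in Web search engines. The paper proposes a semantic model for query expansion based on user interest: a semantic ontology knowledge base built on Yahoo is used to eliminate synonymy, and a client-side user interest mining model is designed to eliminate ambiguity. Experimental results show that the method effectively improves the recall and precision of Web information retrieval.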

19.
We propose an information filtering system for documents that builds a user profile from latent semantics obtained by singular value decomposition (SVD) and independent component analysis (ICA). In information filtering systems, it is useful to analyze the latent semantics of documents, and ICA is one method for doing so. We assume that topics are independent of each other. Hence, when ICA is applied to documents, we obtain the topics included in the documents. By using SVD to remove noise before applying ICA, we can improve the accuracy of topic extraction. By representing the documents with those topics, we improve recommendations made with the user profile. In addition, we construct a user profile with a genetic algorithm (GA) and evaluate it by 11-point average precision. We carried out an experiment using a test collection to confirm the advantages of the proposed method. This work was presented in part at the 10th International Symposium on Artificial Life and Robotics, Oita, Japan, February 4–6, 2005.

