Found 17 similar documents (search took 171 ms)
1.
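To mitigate the long-standing problems of topic drift and word mismatch in natural language processing applications, this paper first proposes a weighted itemset support computation method and a pruning method based on item-weight ranking, and presents a weighted association rule mining algorithm based on item-weight ranking for query expansion. It then discusses hybrid, consequent, and antecedent expansion models for association rules, and finally proposes a cross-language query expansion algorithm based on item-weight-ranking mining. The algorithm mines weighted association rules using the new support measure and pruning strategy, and extracts high-quality expansion terms from the rules according to the expansion model to perform cross-language query expansion. Experimental results show that, compared with existing cross-language expansion algorithms based on weighted association rule mining, the proposed algorithm effectively curbs query topic drift and word mismatch and can be used for information retrieval in various languages to improve retrieval performance. Among the expansion models, consequent expansion achieves the best retrieval performance, while hybrid expansion performs worse than both consequent and antecedent expansion; support is more effective for consequent expansion, whereas confidence does more to improve antecedent and hybrid expansion. The proposed mining method can also be applied to text mining, business data mining, and recommender systems to improve their mining performance.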
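As a rough illustration of the quantities this abstract relies on, the sketch below computes the textbook (unweighted) support and confidence of a rule and uses them to pick expansion terms from rule consequents. The paper's weighted support and item-weight pruning are not specified in the abstract, so they are not reproduced here; the function names, the thresholds, and the toy documents are all assumptions made for illustration.

```python
def support(itemset, transactions):
    """Fraction of transactions containing every item (assumes a non-empty list)."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Standard confidence of the rule antecedent -> consequent."""
    sup_a = support(antecedent, transactions)
    if sup_a == 0:
        return 0.0
    return support(set(antecedent) | set(consequent), transactions) / sup_a


def consequent_expansion(query_terms, transactions, min_sup, min_conf):
    """Hypothetical helper: keep terms t where the rule query -> {t} passes both thresholds."""
    query = set(query_terms)
    vocabulary = set().union(*transactions) - query
    scored = []
    for term in sorted(vocabulary):
        sup = support(query | {term}, transactions)
        conf = confidence(query, {term}, transactions)
        if sup >= min_sup and conf >= min_conf:
            scored.append((term, conf))
    # Highest confidence first; ties broken alphabetically.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return [term for term, _ in scored]


# Toy "documents" as term sets (made-up data).
docs = [
    {"query", "expansion", "retrieval"},
    {"query", "expansion", "feedback"},
    {"query", "retrieval"},
    {"image", "retrieval"},
]
terms = consequent_expansion(["query"], docs, min_sup=0.5, min_conf=0.6)
```

Here the query terms play the role of the rule antecedent and the selected terms come from the consequent, which is one plausible reading of "consequent expansion"; antecedent expansion would swap the two roles.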
2.
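To address query topic drift and word mismatch in natural language processing, this paper proposes an association pattern mining and rule expansion algorithm based on the CSC (Copulas-based Support and Confidence) framework, fuses statistically derived association patterns with word vectors that carry contextual semantic information, and proposes a pseudo-relevance feedback query expansion model that combines association pattern mining with word embedding learning. The model mines rule expansion terms from the pseudo-relevance feedback document set, trains word embeddings on the initially retrieved document set to obtain word vectors, computes the vector similarity between each rule expansion term and the original query, and keeps as final expansion terms those rule expansion terms whose similarity is not below a threshold. Experimental results show that the proposed model effectively reduces query topic drift and word mismatch and improves retrieval performance. Compared with existing query expansion methods based on association patterns and on word vectors, the largest average gain in MAP (Mean Average Precision) reaches 17.52%, and the model is more effective for short queries. The proposed mining method can be applied to other text mining tasks and to recommender systems to improve their performance.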
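The final filtering step described in this abstract (keep rule expansion terms whose vector similarity to the original query is not below a threshold) can be sketched as follows. This is a minimal sketch under stated assumptions: cosine similarity is used as the similarity measure, the query is represented as the mean of its term vectors, and the two-dimensional vectors are made-up stand-ins for trained embeddings. None of these choices is confirmed by the abstract.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def query_vector(query_terms, vectors):
    """Assumption: represent the query as the mean of its known term vectors."""
    known = [vectors[t] for t in query_terms if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def filter_expansion_terms(candidates, query_terms, vectors, threshold):
    """Keep candidates whose similarity to the query vector is >= threshold."""
    q = query_vector(query_terms, vectors)
    if q is None:
        return []
    return [
        term
        for term in candidates
        if term in vectors and cosine(vectors[term], q) >= threshold
    ]


# Made-up 2-D vectors standing in for embeddings trained on the initial result set.
vectors = {
    "image": [1.0, 0.0],
    "retrieval": [0.0, 1.0],
    "search": [0.6, 0.8],
    "banana": [-1.0, 0.0],
}
final_terms = filter_expansion_terms(
    ["search", "banana"], ["image", "retrieval"], vectors, threshold=0.5
)
```

With these toy vectors, "search" lies close to the averaged query vector and is kept, while "banana" points away from it and is dropped.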
3.
4.
5.
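Query translation is one of the key factors affecting the performance of cross-language information retrieval (CLIR), and mining translations for out-of-vocabulary (OOV) terms in queries is important for improving CLIR performance. The method uses query expansion with translations of topic words to automatically acquire useful bilingual snippet resources from search engines. It extracts multi-word candidate units from these bilingual snippets using frequency change information and adjacency information, and is compared with common statistics-based multi-word unit extraction methods. In the experiments, the translation mining method achieved a top-1 inclusion rate of 62.02% and a top-10 inclusion rate of 95.35%.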
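For readers unfamiliar with the metric, a top-N inclusion rate can be computed as sketched below, assuming it means the fraction of OOV terms for which a correct translation appears among the first N ranked candidates. That definition, the function name, and the placeholder data are assumptions; the abstract does not define the metric.

```python
def top_n_inclusion_rate(ranked_candidates, gold, n):
    """Share of terms with at least one gold translation in their top-n candidates."""
    hits = 0
    for term, candidates in ranked_candidates.items():
        if gold[term] & set(candidates[:n]):
            hits += 1
    return hits / len(ranked_candidates)


# Placeholder candidate lists, best first (not real translations).
ranked = {
    "OOV1": ["x", "y", "z"],
    "OOV2": ["p", "q"],
}
gold = {"OOV1": {"x"}, "OOV2": {"q"}}

top1 = top_n_inclusion_rate(ranked, gold, 1)
top2 = top_n_inclusion_rate(ranked, gold, 2)
```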
6.
7.
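Query expansion based on a statistical machine translation model (cited by 1: 0 self-citations, 1 by others)
In practical information retrieval applications such as search engines, user queries usually contain only a few keywords, which causes a word mismatch between relevant documents and the user query and seriously degrades retrieval performance. Building on an analysis of the query generation model, this paper proposes a new query expansion method based on statistical machine translation. A statistical machine translation model is used to extract terms in the document collection that are associated with the query terms, and these terms are used for query expansion. Experiments on TREC datasets show that the method consistently improves on the unexpanded language model approach by 12% to 17%, and achieves average precision comparable to the popular query expansion method, pseudo-relevance feedback.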
8.
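To address the low precision of traditional query expansion, this paper proposes a semantic query expansion method based on word sense disambiguation that combines an expanded semantic tree with term co-occurrence frequency. To counter the query topic drift caused by ambiguous query terms, it disambiguates query term senses using the WordNet knowledge source and its domain information, generates an expanded semantic tree from WordNet's hierarchy to produce candidate expansion terms, and then determines the final expansion terms and their weights by the principle of maximum overall relevance between the candidate terms and the user query, so that the expansion terms fully express the user's query and improve matching accuracy. Experiments show that the method achieves higher precision while maintaining recall.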
9.
10.
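CR: a reverse association rule mining algorithm (cited by 4: 0 self-citations, 4 by others)
This paper introduces concepts related to transactions, extends the concepts of traditional association rule mining, and proposes a transaction-based association rule mining algorithm. The algorithm starts from the longer transactions, tries to find long frequent itemsets, and then determines their sub-itemsets, thereby avoiding combinatorial explosion. It scans the original database once and the compressed database twice, which reduces the number of scans compared with the Apriori algorithm and improves mining efficiency.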
11.
Combining positive and negative examples in relevance feedback for content-based image retrieval (cited by 2: 0 self-citations, 2 by others)
M. L. Kherfi D. Ziou A. Bernardi 《Journal of Visual Communication and Image Representation》2003,14(4):428-457
In this paper, we address some issues related to the combination of positive and negative examples to improve the efficiency of image retrieval. We start by analyzing the relevance of the negative example and how it can be interpreted and utilized to mitigate certain problems in image retrieval, such as noise, miss, the page zero problem and feature selection. Then we propose a new relevance feedback approach that uses the positive example (PE) to perform generalization and the negative example (NE) to perform specialization. In this approach, a query containing both PE and NE is processed in two steps. The first step considers the PE alone, in order to reduce the set of images participating in retrieval to a more homogeneous subset. Then, the second step considers both PE and NE and acts on the images retained in the first step. Mathematically, relevance feedback is formulated as an optimization of the intra and inter variances of the PE and NE. The proposed relevance feedback algorithm was implemented in our image retrieval system, which we tested on a collection of more than 10,000 images. The experimental results show how the NE, as considered in our model, can contribute to improving the relevance of the images retrieved.
12.
13.
Wei Jiang Guihua Er Qionghai Dai Jinwei Gu 《IEEE transactions on image processing》2006,15(3):702-712
Content-based image retrieval (CBIR) has become increasingly important in the last decade, and the gap between high-level semantic concepts and low-level visual features hinders further performance improvement. The problem of online feature selection is critical to bridging this gap. In this paper, we investigate online feature selection in the relevance feedback learning process to improve the retrieval performance of the region-based image retrieval system. Our contributions are mainly in three areas. 1) A novel feature selection criterion is proposed, which is based on the psychological similarity between the positive and negative training sets. 2) An effective online feature selection algorithm is implemented in a boosting manner to select the most representative features for the current query concept and combine classifiers constructed over the selected features to retrieve images. 3) To apply the proposed feature selection method in region-based image retrieval systems, we propose a novel region-based representation to describe images in a uniform feature space with real-valued fuzzy features. Our system is suitable for online relevance feedback learning in CBIR by meeting three requirements: learning with a small training set, the intrinsic asymmetry of training samples, and fast response. Extensive experiments, including comparisons with many state-of-the-art methods, show the effectiveness of our algorithm in improving the retrieval performance and saving processing time.
14.
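To automatically adjust the current retrieval task so that the final query results move toward what the user wants, this paper proposes a relevance feedback algorithm based on dual weighting of combined features. Setting the initial weights in image retrieval is treated as an optimization problem, and a quantum genetic algorithm is used to find the global optimum, which serves as the initial weight of each feature during retrieval. In addition, while the combined feature weights are adjusted dynamically, the grey relational degree from grey relational analysis is used as the estimate of each feature's weight, and the rating of every image in the feedback results is included in the grey relational computation, so as to estimate the relative importance of the different features in retrieval. Experimental results show that the algorithm refines the retrieval results and substantially improves retrieval completeness and accuracy.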
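The grey relational degree mentioned in this abstract can be illustrated with the standard formulation from grey relational analysis, where the coefficient at position k is (Δmin + ρ·Δmax) / (Δ(k) + ρ·Δmax) and the degree is the mean of the coefficients. How the paper maps this onto feature weights is not stated, so the sketch below assumes that the reference sequence is the user's ratings of the feedback images, that each feature contributes a sequence of similarity scores for the same images, and that weights are the normalized degrees. The feature names and numbers are made up.

```python
def grey_relational_degrees(reference, sequences, rho=0.5):
    """Standard grey relational degree of each sequence against the reference.

    rho is the distinguishing coefficient; 0.5 is the conventional default.
    """
    deltas = {
        name: [abs(r - x) for r, x in zip(reference, seq)]
        for name, seq in sequences.items()
    }
    all_deltas = [d for ds in deltas.values() for d in ds]
    d_min, d_max = min(all_deltas), max(all_deltas)
    if d_max == 0.0:
        # Every sequence matches the reference exactly.
        return {name: 1.0 for name in sequences}
    degrees = {}
    for name, ds in deltas.items():
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in ds]
        degrees[name] = sum(coeffs) / len(coeffs)
    return degrees


def feature_weights(reference, sequences, rho=0.5):
    """Assumption: feature weights are the degrees normalized to sum to 1."""
    degrees = grey_relational_degrees(reference, sequences, rho)
    total = sum(degrees.values())
    return {name: g / total for name, g in degrees.items()}


# Made-up user ratings for three feedback images and per-feature similarity scores.
ratings = [1.0, 0.5, 0.0]
scores = {
    "color": [0.9, 0.5, 0.1],
    "texture": [0.2, 0.5, 0.9],
}
weights = feature_weights(ratings, scores)
```

In this toy case the color scores track the ratings closely and the texture scores run opposite to them, so color receives the larger weight.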
15.
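This paper proposes a new method, with a simulation, for applying the ant colony algorithm to database queries. The ant colony algorithm is a simulated evolutionary algorithm derived from modelling the foraging behaviour of ants in nature. Compared with traditional algorithms, its main characteristics are positive feedback and parallelism: positive feedback lets the algorithm quickly find good query paths, and parallelism makes parallel query computation easy to implement, which increases query speed. Finally, the ant colony query algorithm and a traditional query algorithm are simulated in Excel and compared.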
16.
17.
Relevance feedback in content-based image retrieval: Bayesian framework, feature subspaces, and progressive learning (cited by 9: 0 self-citations, 9 by others)
Zhong Su Hongjiang Zhang Li S. Shaoping Ma 《IEEE transactions on image processing》2003,12(8):924-937
Research has been devoted in the past few years to relevance feedback as an effective solution to improve the performance of content-based image retrieval (CBIR). In this paper, we propose a new feedback approach with progressive learning capability combined with a novel method for the feature subspace extraction. The proposed approach is based on a Bayesian classifier and treats positive and negative feedback examples with different strategies. Positive examples are used to estimate a Gaussian distribution that represents the desired images for a given query, while the negative examples are used to modify the ranking of the retrieved candidates. In addition, the feature subspace is extracted and updated during the feedback process using a principal component analysis (PCA) technique and based on the user's feedback. That is, in addition to reducing the dimensionality of feature spaces, a proper subspace for each type of feature is obtained in the feedback process to further improve the retrieval accuracy. Experiments demonstrate that the proposed method increases the retrieval speed, reduces the required memory and improves the retrieval accuracy significantly.
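The asymmetric treatment of feedback described in this abstract can be sketched roughly as follows: positive examples fit a Gaussian, candidates are scored by log-likelihood under it, and candidates close to a negative example are pushed down the ranking. This is only an illustrative sketch. The diagonal covariance, the fixed penalty, and the distance radius are assumptions, the PCA subspace update is omitted, and the paper's actual formulas are not given in the abstract.

```python
import math


def fit_diagonal_gaussian(examples, min_var=1e-6):
    """Per-dimension mean and variance of the positive examples (variance floored)."""
    n = len(examples)
    dim = len(examples[0])
    mean = [sum(x[i] for x in examples) / n for i in range(dim)]
    var = [
        max(sum((x[i] - mean[i]) ** 2 for x in examples) / n, min_var)
        for i in range(dim)
    ]
    return mean, var


def log_likelihood(x, mean, var):
    """Log-density of x under a Gaussian with diagonal covariance."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def rank_candidates(candidates, positives, negatives, penalty=1.0, radius=0.5):
    """Rank candidate names, best first; hypothetical penalty near negative examples."""
    mean, var = fit_diagonal_gaussian(positives)

    def score(name):
        x = candidates[name]
        s = log_likelihood(x, mean, var)
        for neg in negatives:
            if math.dist(x, neg) < radius:
                s -= penalty
        return s

    return sorted(candidates, key=score, reverse=True)


# Made-up 2-D feature vectors.
positives = [[1.0, 1.0], [1.2, 0.8], [0.8, 1.2]]
negatives = [[3.0, 3.1]]
candidates = {"a": [1.0, 1.0], "b": [3.0, 3.0], "c": [1.1, 0.9]}
order = rank_candidates(candidates, positives, negatives)
```

Candidate "a" sits at the mean of the positive examples and ranks first, while "b" is both far from the positives and near the negative example, so it ranks last.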