Similar literature
 19 similar documents found
1.
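With the development of network technology, more and more Internet resources are being used in information retrieval, and many studies have shown that social annotation can improve it. In existing personalized ranking methods, the similarity between users is mostly computed from the sets of tags they have both used. In practice, however, user annotation data suffers from sparsity and tag synonymy, which makes the similarity computation inaccurate. Building on previous work, this paper proposes a personalized ranking method that incorporates topic-domain similarity. The method first partitions topic domains, separating web pages and tags with different topical meanings, and identifies tag synonyms through a constructed tag similarity network. It then combines user tags and topic preferences to find users with similar interests and expands each user's annotation information, thereby improving personalized information retrieval. Experimental results on real data show that the method effectively alleviates annotation sparsity and tag synonymy and helps improve the user's retrieval experience.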
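
The tag-synonym problem described in this abstract can be illustrated with a minimal sketch. It assumes Jaccard overlap as the user similarity and a hand-written synonym table. Both are illustrative choices: the abstract does not state which similarity measure the paper uses or how its tag similarity network is built. The names `canonical_tags`, `jaccard`, and `user_similarity` are hypothetical.

```python
def canonical_tags(tags, synonyms):
    """Map each tag to a canonical form using a (hypothetical) synonym table."""
    return {synonyms.get(tag, tag) for tag in tags}


def jaccard(a, b):
    """Jaccard overlap of two sets; defined as 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def user_similarity(tags_u, tags_v, synonyms):
    """Tag-set similarity computed after merging synonymous tags."""
    return jaccard(canonical_tags(tags_u, synonyms),
                   canonical_tags(tags_v, synonyms))


# Assumed synonym table; in the paper this would come from the tag similarity network.
synonyms = {"nlp": "natural-language-processing", "ir": "information-retrieval"}
alice = {"nlp", "search"}
bob = {"natural-language-processing", "ir"}

raw = jaccard(alice, bob)                      # no shared surface forms
merged = user_similarity(alice, bob, synonyms)  # one shared tag after merging
print(raw, merged)
```

Without synonym merging the two users share no tags at all, which is the inaccuracy the abstract points to. After merging they share one tag out of three.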

2.
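A Semantic Retrieval Method Based on Weighted Domain Ontology   Total citations: 2, self-citations: 0, citations by others: 2
A new method, WOSR, is proposed for semantic retrieval of domain information resources that have already been annotated with ontology concepts. WOSR first builds a domain ontology, then assigns weights to concepts using an equal probability distribution method, then derives concept similarity from the concept weights, and finally computes the semantic similarity between the user's query and the information resources and outputs the results ranked by similarity. Experimental results show that WOSR achieves better retrieval performance than other classical methods.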

3.
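王娟, 赖思渝, 李明东. 《计算机应用》 (Journal of Computer Applications), 2009, 29(7): 1947-1950
To improve the performance of image annotation and retrieval, an image annotation and retrieval algorithm based on region segmentation and relevance feedback is proposed. The algorithm exploits the correlation between visual features and annotation information: using region-based visual features, it applies clustering to obtain a set of visually similar images for each image. It computes the similarity to the three nearest classes, then integrates their keyword probability vectors to obtain the keyword probability vector best suited to the image, and annotates the image accordingly. User feedback is used to revise the relationship between query keywords and each class, further improving retrieval accuracy. Experimental results show that the proposed algorithm achieves higher precision and recall.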

4.
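A Cross-Media Retrieval Method Based on Feature Subspace Learning   Total citations: 1, self-citations: 0, citations by others: 1
The method learns the latent relationships among the low-level features of multimedia data of different modalities and, in the feature subspace obtained by dimensionality reduction, corrects the clustering quality of images and audio with an optimization algorithm based on similarity propagation. For relevance feedback, three active learning strategies are designed to compute the conditional probability of unlabeled samples around user-labeled samples, which improves cross-media retrieval efficiency when feedback samples are limited. Experimental results show that the method accurately measures cross-media correlation and effectively supports mutual retrieval between image and audio data.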

5.
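Users can obtain a large amount of information through retrieval platforms, but search results often exhibit topic drift and a bias toward old web pages, and so fail to meet users' actual needs. To mitigate this, an improved PageRank algorithm is proposed. The algorithm uses the BM25 similarity algorithm to compute topic similarity and assigns different influence weights according to the similarity scores, which raises the ranking of pages with high similarity; it uses the number of times a page is retrieved within a search-engine cycle to represent the page's ...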
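
As a rough illustration of the BM25 scoring step mentioned in this abstract, the sketch below implements the common Okapi BM25 formula over tokenized documents. The parameter defaults `k1=1.5` and `b=0.75` and the `log(1 + ...)` form of the inverse document frequency are widely used conventions assumed here, not values taken from the paper. How the scores are turned into PageRank influence weights is not shown, because the abstract is truncated before that detail.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` (non-empty list) against `query`."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: longer-than-average documents are penalized.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    ["pagerank", "link", "analysis"],
    ["cooking", "recipes"],
    ["pagerank", "topic", "drift", "pagerank"],
]
scores = bm25_scores(["pagerank", "topic"], docs)
print(scores)
```

In this toy corpus the third document matches both query terms and scores highest, while the unrelated second document scores zero.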

6.
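To remove the closed nature of annotation information in collaborative annotation systems, the Geography Markup Language (GML) is introduced as a unified data metamodel for annotation information. To reduce the heterogeneity and local autonomy of annotation information, schema matching is introduced to compute the similarity of elements in GML schemas. A dynamic weight allocation function is added to a general schema matching algorithm, and rules are combined according to their priority relationships to complete the element similarity computation. Through schema matching of annotation information, users of the collaborative annotation system can obtain the similarity relationships among annotations directly and transparently during collaborative work, and can exchange information more efficiently in a collaborative environment.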

7.
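Social tags provide additional descriptions of web page content and are intuitively valuable for search. This paper proposes a novel retrieval method that exploits the categorical nature of social tags. The method estimates a topic model by modeling the crowd's annotation information as high-level categories, and then uses that topic model to smooth the language model. This modeling approach reduces the impact of annotation sparsity and expresses tag meaning effectively, thereby improving retrieval. Experiments on a dataset built from TREC evaluations show that the method outperforms LDA-based retrieval and other existing retrieval methods based on tag data.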

8.
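李劲, 张华, 吴浩雄, 向军, 辜希武. 《计算机应用》 (Journal of Computer Applications), 2012, 32(5): 1335-1339
Social annotation is a folksonomy, a classification of web resources by their users, and carries rich semantic information, so applying it to information retrieval helps improve retrieval quality. This paper studies an improved text classification algorithm based on social annotation in order to improve web page classification. Because social annotation is a folksonomy, annotations are produced quite arbitrarily and vary widely in quality; annotation quality is therefore first quantified using the semantic similarity between documents and the semantic similarity between annotations. Annotations are then filtered by quality, and the relatively high-quality ones are used to extend the document vector space model, so that each document is represented as an extended vector composed of its words and its annotation information. A support vector machine is used for the classification experiments. The results show that evaluating annotation quality and filtering out poor annotations, while combining document content with annotations to represent documents, improves classification: compared with traditional content-based classification, the F1 measure improves by 6.2%.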

9.
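Design of a Semantic Annotation Model Based on Ontology Integration   Total citations: 1, self-citations: 0, citations by others: 1
Full realization of the Semantic Web depends on semantic annotation, and annotating web page information can involve multiple ontologies. Accordingly, by studying bridge ontologies, a multi-ontology semantic annotation model built on ontology integration is proposed. The model uses a bridge ontology to integrate a top-level ontology with multiple domain ontologies, applies ontology-based information extraction to annotate web pages semantically, and stores the annotations in an annotation repository so that they are kept separate from the pages, which improves the efficiency of semantic retrieval. An example illustrates the soundness of the model.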

10.
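The similarity between user queries and web resources has long been the core of page ranking research in search engines. Using HowNet's semantic hierarchy model of words, the method mines user interests from the query terms, improves the computation of semantic similarity between the query terms and the mined interest keywords, and iteratively computes the similarity between the user's retrieval request and the web resources after page segmentation. Experimental results show that the improved algorithm considerably raises page ranking accuracy and the first-page hit rate.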

11.
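A Web Page Ranking Algorithm Based on Social Annotation   Total citations: 2, self-citations: 0, citations by others: 2
As a new way of managing and sharing resources, social annotation has attracted large numbers of users, and the resulting volume of annotation data has become a new dimension for evaluating web page quality. This paper studies how to use social annotation to improve web retrieval performance and proposes a page ranking algorithm that combines the query relevance of pages and users with the mutual reinforcement relationship between them. First, a statistical topic model is used to model pages and users through their associated tags and to compute query relevance. Then a bipartite graph model describes the mutual reinforcement between pages and users, with weights assigned according to how well the associated tags match user interests and page content. Finally, query relevance and mutual reinforcement are combined to compute page and user scores simultaneously by iteration. Experimental results show that the proposed retrieval model and mutual reinforcement model effectively improve ranking performance, and that the algorithm clearly outperforms current representative algorithms in retrieval performance.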
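
The iterative scoring over a weighted bipartite graph described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's update rule: the mixing parameter `alpha`, the sum-to-one normalization, and the function names `normalize` and `mutual_reinforce` are all hypothetical, and the query relevance values are supplied by hand instead of coming from a topic model.

```python
def normalize(scores):
    """Rescale scores so they sum to 1 (left unchanged if the sum is zero)."""
    total = sum(scores.values())
    if total == 0:
        return dict(scores)
    return {k: v / total for k, v in scores.items()}


def mutual_reinforce(page_rel, user_rel, edges, alpha=0.5, iters=50):
    """Iteratively score pages and users on a weighted bipartite graph.

    page_rel, user_rel: query relevance per page / per user (assumed given).
    edges: dict mapping (user, page) -> edge weight.
    """
    pages = dict(page_rel)
    users = dict(user_rel)
    for _ in range(iters):
        # Each side keeps a share of its own relevance and receives
        # weighted score from its neighbors on the other side.
        new_pages = {p: alpha * page_rel[p] for p in page_rel}
        new_users = {u: alpha * user_rel[u] for u in user_rel}
        for (u, p), w in edges.items():
            new_pages[p] += (1 - alpha) * w * users[u]
            new_users[u] += (1 - alpha) * w * pages[p]
        pages = normalize(new_pages)
        users = normalize(new_users)
    return pages, users


page_rel = {"p1": 0.5, "p2": 0.5}
user_rel = {"u1": 0.5, "u2": 0.5}
edges = {("u1", "p1"): 1.0, ("u2", "p1"): 1.0, ("u2", "p2"): 0.2}
pages, users = mutual_reinforce(page_rel, user_rel, edges)
print(pages, users)
```

Both pages start with equal query relevance, but `p1` ends up ranked higher because two users annotate it with strong weights, which is the kind of effect mutual reinforcement is meant to capture.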

12.
Collaborative social annotation systems allow users to record and share their own keywords or tags attached to Web resources such as Web pages, photos, or videos. These annotations are a method for organizing and labeling information. They have the potential to help users navigate the Web and locate the resources they need. However, since annotations are posted by users under no central control, problems such as spam and synonymous annotations arise. To use annotation information efficiently to facilitate knowledge discovery from the Web, it is advantageous to organize social annotations from a semantic perspective and embed them into algorithms for knowledge discovery. This inspires Web page recommendation with annotations, in which users and Web pages are clustered so that semantically similar items can be related. In this paper we propose four graphic models which cluster users, Web pages and annotations, and which recommend Web pages for given users by first assigning items to the right cluster. The algorithms are then compared to the classical collaborative filtering recommendation method on a real-world data set. Our results indicate that the graphic models provide better recommendation performance and are robust enough for real applications.

13.
The MADCOW annotation system supports a notion of group, facilitating focused annotations with respect to a domain. In previous work, we adopted ontologies to represent knowledge about domains, thus allowing more refined annotations to a group, and discussed how the use of ontologies facilitates the formulation of semantically significant queries for retrieving annotations on specific topics. We now expand on previous results and study two new types of measures to identify matches between users' interests and groups: Degree Centrality, developed for social networks to assess the quality of concepts in an ontology, and URL concordance, indicating the similarity of interests among users who annotate the same pages.

14.
Zhang  Hongjiang  Chen  Zheng  Li  Mingjing  Su  Zhong 《World Wide Web》2003,6(2):131-155
A major bottleneck in content-based image retrieval (CBIR) systems or search engines is the large gap between the low-level image features used to index images and the high-level semantic contents of images. One solution to this bottleneck is to apply relevance feedback to refine the query or the similarity measures in the image search process. In this paper, we first address the key issues involved in relevance feedback for CBIR systems and present a brief overview of a set of commonly used relevance feedback algorithms. We then present a framework of relevance feedback and semantic learning in CBIR; almost all of the previously proposed methods fit well into this framework. In this framework, low-level features and keyword annotations are integrated in image retrieval and in feedback processes to improve the retrieval performance. We have also extended the framework to a content-based web image search engine in which hosting web pages are used to collect relevant annotations for images and users' feedback logs are used to refine annotations. A prototype system has been developed to evaluate our proposed schemes, and our experimental results indicate that our approach outperforms traditional CBIR systems and relevance feedback approaches.

15.
This paper is concerned with the problem of boosting social annotations using propagation, which is also called social propagation. In particular, we focus on propagating social annotations of web pages (e.g., annotations in Del.icio.us). Social annotations are novel resources and valuable in many web applications, including web search and browsing. Although they are developing fast, social annotations of web pages cover only a small proportion (<0.1%) of the World Wide Web. To alleviate the low coverage of annotations, a general propagation model based on Random Surfer is proposed. Specifically, four steps are included, namely basic propagation, multiple-annotation propagation, multiple-link-type propagation, and constraint-guided propagation. The model is evaluated on a dataset of 40,422 web pages randomly sampled from 100 most popular English sites and ten famous academic sites. Each page’s annotations are obtained by querying the history interface of Del.icio.us. Experimental results show that the proposed model is very effective in increasing the coverage of annotations while still preserving novel properties of social annotations. Applications of propagated annotations on web search and classification further verify the effectiveness of the model.
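
To give a feel for link-based annotation propagation, here is a deliberately simplified sketch. It is not the paper's Random Surfer model and covers none of its four steps in detail: it assumes that each page passes a fixed fraction `decay` of its tag weight, split evenly, to the pages it links to, for a fixed number of `steps`. The function name `propagate` and all parameter values are hypothetical.

```python
from collections import defaultdict


def propagate(annotations, links, decay=0.5, steps=2):
    """Spread tag weights along hyperlinks.

    annotations: page -> {tag: weight} for pages that already have tags.
    links: page -> list of pages it links to.
    Returns page -> {tag: accumulated weight}, including newly covered pages.
    """
    current = {page: dict(tags) for page, tags in annotations.items()}
    total = {page: dict(tags) for page, tags in annotations.items()}
    for _ in range(steps):
        incoming = defaultdict(lambda: defaultdict(float))
        for page, tags in current.items():
            targets = links.get(page, [])
            if not targets:
                continue
            for tag, weight in tags.items():
                share = decay * weight / len(targets)
                for target in targets:
                    incoming[target][tag] += share
        # Accumulate what arrived in this step, then propagate it further next step.
        for page, tags in incoming.items():
            bucket = total.setdefault(page, {})
            for tag, weight in tags.items():
                bucket[tag] = bucket.get(tag, 0.0) + weight
        current = {page: dict(tags) for page, tags in incoming.items()}
    return total


annotations = {"a": {"python": 1.0}}
links = {"a": ["b", "c"], "b": ["c"]}
result = propagate(annotations, links)
print(result)
```

Pages `b` and `c` start with no annotations and end up with a weakened `python` tag, which is how propagation raises coverage in this toy setting.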

16.
Automatic extraction of semantic information from text and links in Web pages is key to improving the quality of search results. However, the assessment of automatic semantic measures is limited by the coverage of user studies, which do not scale with the size, heterogeneity, and growth of the Web. Here we propose to leverage human-generated metadata, namely topical directories, to measure semantic relationships among massive numbers of pairs of Web pages or topics. The Open Directory Project classifies millions of URLs in a topical ontology, providing a rich source from which semantic relationships between Web pages can be derived. While semantic similarity measures based on taxonomies (trees) are well studied, the design of well-founded similarity measures for objects stored in the nodes of arbitrary ontologies (graphs) is an open problem. This paper defines an information-theoretic measure of semantic similarity that exploits both the hierarchical and non-hierarchical structure of an ontology. An experimental study shows that this measure improves significantly on the traditional taxonomy-based approach. This novel measure allows us to address the general question of how text and link analyses can be combined to derive measures of relevance that are in good agreement with semantic similarity. Surprisingly, the traditional use of text similarity turns out to be ineffective for relevance ranking.

17.
Traditional search engines have become the most useful tools to search the World Wide Web. Even though they are good for certain search tasks, they may be less effective for others, such as satisfying ambiguous or synonym queries. In this paper, we propose an algorithm that, with the help of Wikipedia and collaborative semantic annotations, improves the quality of web search engines in the ranking of returned results. Our work is supported by (1) the logs generated after query searching, (2) semantic annotations of queries and (3) semantic annotations of web pages. The algorithm makes use of this information to elaborate an appropriate ranking. To validate our approach we have implemented a system that can apply the algorithm to a particular search engine. Evaluation results show that the number of relevant web resources obtained after executing a query with the algorithm is higher than the one obtained without it.

18.
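A Topic Extraction Algorithm for Vertical Search Engines   Total citations: 1, self-citations: 0, citations by others: 1
To address the topic drift caused by the HITS algorithm assigning equal weight to all links, an improved HITS algorithm is proposed that assigns reasonable link weights based on a computed link value and the semantic topic similarity of web pages, highlighting differences in link importance. Experiments show that the algorithm improves topic relevance by 13% to 42%, largely avoids topic drift, and increases the accuracy of the collected information; it has theoretical and practical value for research on vertical search engines.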
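
A minimal sketch of HITS with per-link weights may help make the idea concrete. The hub and authority updates below are the standard HITS iteration with each link's contribution scaled by a weight. The weights themselves are placeholders: the abstract does not give the formulas for link value or topic similarity, so the numbers in the example are invented for illustration, as are the names `l2_normalize` and `weighted_hits`.

```python
import math


def l2_normalize(scores):
    """Scale scores to unit Euclidean length (left unchanged if all are zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    if norm == 0:
        return dict(scores)
    return {k: v / norm for k, v in scores.items()}


def weighted_hits(edges, iters=50):
    """HITS where each link (src, dst) carries a weight instead of counting as 1."""
    nodes = {node for edge in edges for node in edge}
    auth = {n: 1.0 for n in nodes}
    hub = {n: 1.0 for n in nodes}
    for _ in range(iters):
        # Authority: weighted sum of the hub scores of pages linking in.
        new_auth = {n: 0.0 for n in nodes}
        for (src, dst), w in edges.items():
            new_auth[dst] += w * hub[src]
        auth = l2_normalize(new_auth)
        # Hub: weighted sum of the authority scores of pages linked to.
        new_hub = {n: 0.0 for n in nodes}
        for (src, dst), w in edges.items():
            new_hub[src] += w * auth[dst]
        hub = l2_normalize(new_hub)
    return auth, hub


# Hypothetical weights: high for on-topic links, low for off-topic ones.
edges = {
    ("h", "on_topic"): 0.9,
    ("h", "off_topic"): 0.1,
    ("g", "on_topic"): 0.8,
}
auth, hub = weighted_hits(edges)
print(auth, hub)
```

With equal weights, an off-topic page linked from a strong hub would inherit that hub's full score. Down-weighting the off-topic link keeps its authority low, which is the mechanism the abstract credits for reducing topic drift.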

19.
Recent research has shown that more and more web users utilize social annotations to manage and organize the resources that interest them. With the growing popularity of social annotations, it is therefore becoming increasingly important to use them to achieve effective web search. However, no previous studies have used a statistical model to examine the relationships between queries and social annotations. Motivated by this observation, we use social annotations to re-rank search results. We optimize the retrieval ranking method by integrating the query-annotation similarity into the query-document similarity. Specifically, we calculate the query-annotation similarity by using a statistical language model, which we simply call a language model. The initial search results are then re-ranked according to a weighted combination of the query-document similarity score and the query-annotation similarity score. Experimental results show that the proposed method can improve the NDCG score by 8.13%. We further conduct an empirical evaluation of the method by using a query set including about 300 popular social annotations and constructed phrases. More generally, the optimized results with social annotations based on a language model can be of significant benefit to web search.
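
The re-ranking scheme in this abstract, a weighted combination of a query-document score and a language-model query-annotation score, can be sketched as below. Several details are assumptions made for illustration: Jelinek-Mercer smoothing against the pooled annotations, the parameter values `weight=0.7` and `smoothing=0.5`, and the function names. The abstract does not say which smoothing or weights the authors used, and a real system would also need to put the two scores on comparable scales.

```python
from collections import Counter


def query_likelihood(query, annotations, collection, smoothing=0.5):
    """P(query | annotation language model) with Jelinek-Mercer smoothing."""
    ann_tf = Counter(annotations)
    col_tf = Counter(collection)
    prob = 1.0
    for term in query:
        p_ann = ann_tf[term] / len(annotations) if annotations else 0.0
        p_col = col_tf[term] / len(collection) if collection else 0.0
        # Mix the page's own annotation model with the collection model.
        prob *= (1 - smoothing) * p_ann + smoothing * p_col
    return prob


def rerank(query, results, weight=0.7, smoothing=0.5):
    """Re-rank (doc_id, doc_score, annotations) triples by a combined score."""
    collection = [tag for _, _, tags in results for tag in tags]
    scored = []
    for doc_id, doc_score, tags in results:
        ann_score = query_likelihood(query, tags, collection, smoothing)
        scored.append((weight * doc_score + (1 - weight) * ann_score, doc_id))
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]


results = [
    ("d1", 0.60, ["news", "sports"]),
    ("d2", 0.55, ["python", "tutorial", "python"]),
    ("d3", 0.20, []),
]
ranking = rerank(["python"], results)
print(ranking)
```

In this example `d2` overtakes `d1` despite a slightly lower query-document score, because its annotations match the query.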
