Similar literature
 Found 20 similar articles; search took 0 ms
1.
Modern search engines record user interactions and use them to improve search quality. In particular, user click-through data has been used successfully to improve click-through rate (CTR), Web search ranking, and query recommendations and suggestions. Although click-through logs provide implicit feedback on users' click preferences, deriving accurate absolute relevance judgments is difficult because of click noise and behavior biases. Previous studies showed that user clicking behavior is biased by factors such as "position" (a user's attention decreases from top to bottom) and "trust" (Web site reputation affects a user's judgment). To address these problems, researchers have proposed several behavior models (usually referred to as click models) to describe users' actual browsing behavior and to obtain an unbiased estimate of result relevance. In this study, we review recent efforts to construct click models for better search ranking and propose a novel convolutional neural network architecture for building click models. Compared to traditional click models, our model not only considers user behavior assumptions as input signals but also uses the content and context information of search engine result pages. In addition, our model uses parameters from traditional click models to restrict the meaning of some outputs in our model's hidden layer. Experimental results show that the proposed model achieves considerable improvement over state-of-the-art click models on the evaluation metric of click perplexity.
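Click perplexity, the evaluation metric named in this abstract, is commonly defined per rank position as 2 raised to the negative mean log2-likelihood of the observed clicks. The following is a minimal sketch under that common definition; the function name, the clipping constant and the sample values are illustrative assumptions, not taken from the paper.

```python
import math


def click_perplexity(predicted, clicked):
    """Perplexity of predicted click probabilities at one rank position.

    predicted: click probabilities in [0, 1], one per query session.
    clicked:   observed clicks (1 or 0), aligned with `predicted`.
    A perfect model scores 1.0; a coin-flip model scores 2.0.
    """
    eps = 1e-12  # assumed clipping constant, keeps log2 finite
    total = 0.0
    for q, c in zip(predicted, clicked):
        q = min(max(q, eps), 1.0 - eps)
        total += c * math.log2(q) + (1 - c) * math.log2(1.0 - q)
    return 2.0 ** (-total / len(predicted))


# A model that always predicts 0.5 is maximally uncertain.
print(click_perplexity([0.5, 0.5], [1, 0]))
```

Lower values are better, so an improvement "on click perplexity" means the model's predicted click probabilities track the observed clicks more closely.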

2.
Inspired by the swarm intelligence of particle swarms, this paper proposes a novel global harmony search algorithm (NGHS) to solve reliability problems. The proposed algorithm includes two important operations: position updating and genetic mutation with a small probability. The former enables the worst harmony in the harmony memory to move rapidly toward the global best harmony in each iteration, and the latter effectively prevents the NGHS from becoming trapped in a local optimum. In a large number of experiments, the proposed algorithm demonstrated a stronger capacity for space exploration than most other approaches to solving reliability problems. The results show that the NGHS can be an efficient alternative for solving reliability problems.
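The two operations described in this abstract can be sketched roughly as follows. This is one plausible reading, not the authors' exact procedure: the reflection-style step toward the best harmony, the unconditional replacement of the worst harmony, the parameter defaults and the function name `nghs_minimize` are all assumptions, and the objective passed in is a stand-in for a reliability objective.

```python
import random


def nghs_minimize(f, bounds, memory_size=5, iterations=2000,
                  p_mutation=0.01, seed=0):
    """Minimize f over box `bounds` with an NGHS-style search (sketch)."""
    rng = random.Random(seed)
    memory = [[rng.uniform(lo, hi) for lo, hi in bounds]
              for _ in range(memory_size)]
    for _ in range(iterations):
        memory.sort(key=f)  # best harmony first, worst last
        best, worst = memory[0], memory[-1]
        new = []
        for (lo, hi), b, w in zip(bounds, best, worst):
            # Position updating (assumed form): step from the worst value
            # toward, and possibly past, the best value, clipped to bounds.
            target = min(max(2.0 * b - w, lo), hi)
            x = w + rng.random() * (target - w)
            # Genetic mutation with a small probability: random restart
            # of this coordinate to keep diversity.
            if rng.random() < p_mutation:
                x = rng.uniform(lo, hi)
            new.append(x)
        memory[-1] = new  # assumed: the worst harmony is always replaced
    return min(memory, key=f)


best = nghs_minimize(lambda x: sum(v * v for v in x), [(-5.0, 5.0)] * 2)
print(best)
```

Because only the worst harmony is overwritten, the best harmony found so far is never lost, while mutation occasionally injects a fresh value to escape local optima.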

3.
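梁秋实, 吴一雷, 封磊. 《计算机应用》 (Journal of Computer Applications), 2012, 32(11): 2989-2993
In microblog search, rankings that rely solely on follower counts leave room for follower-count inflation. By treating users as Web pages and the "follow" relation between users as links between pages, the basic PageRank idea of page importance is brought into microblog user search. A state transition matrix and an automatically iterating MapReduce workflow are introduced to parallelize the computation, yielding a MapReduce-based ranking algorithm for microblog user search. Experiments on the Hadoop platform show that the algorithm keeps a user's rank from depending solely on follower count, raises the rank of more "important" users in search results, and improves the relevance and quality of search results.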
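The core idea of this abstract, running PageRank over the "follow" graph, can be illustrated on a single machine. The sketch below is a plain power iteration; it does not reproduce the paper's MapReduce workflow, and the damping factor, iteration count, handling of users who follow nobody, and the name `user_rank` are assumptions.

```python
def user_rank(follows, damping=0.85, iterations=50):
    """PageRank over a follow graph.

    follows: dict mapping each user to the list of users they follow,
             so every entry is an edge follower -> followee.
    """
    users = set(follows)
    for followees in follows.values():
        users.update(followees)
    n = len(users)
    rank = {u: 1.0 / n for u in users}
    for _ in range(iterations):
        new_rank = {u: (1.0 - damping) / n for u in users}
        for u in users:
            out = follows.get(u, [])
            if out:
                share = damping * rank[u] / len(out)
                for v in out:
                    new_rank[v] += share
            else:
                # A user who follows nobody spreads rank evenly (assumed).
                share = damping * rank[u] / n
                for v in users:
                    new_rank[v] += share
        rank = new_rank
    return rank


ranks = user_rank({"a": ["c"], "b": ["c"], "c": ["a"]})
print(sorted(ranks, key=ranks.get, reverse=True))
```

In this toy graph user "a" has a single follower, yet ranks well above "b" because that follower is the highly ranked "c"; rank depends on who follows a user, not only on how many do.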

4.
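PageRank pays little attention to page content and focuses only on hyperlink analysis, so the pages a user actually needs often do not rank near the top. To address this, a fused page-ranking algorithm for search engines is proposed. The algorithm considers three main aspects (term weight, link analysis, and user preference) to obtain a weight for each URL. Every page waiting to be crawled thus has its own weight, and the hyperlink selection program picks the page or batch of pages with the highest weights to crawl, with the aim of precise retrieval.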

5.
We develop a new algorithm for clustering search results. Unlike many other clustering systems recently proposed as a post-processing step for Web search engines, our system is not based on phrase analysis inside snippets but instead uses latent semantic indexing on the whole document content. A main contribution of the paper is a novel strategy, called dynamic SVD clustering, for discovering the optimal number of singular values to use for clustering. Moreover, the SVD computation step performs well in practice, which makes clustering feasible when term vectors are available. We show that the algorithm has very good classification performance and that it can be used effectively to cluster the results of a search engine so that users can browse them more easily. The algorithm has been integrated into the Noodles search engine, a tool for searching and clustering Web and desktop documents.

6.
Topic-sensitive PageRank: a context-sensitive ranking algorithm for Web search   Total citations: 14, self-citations: 0, citations by others: 14
The original PageRank algorithm for improving the ranking of search-query results computes a single vector, using the link structure of the Web, to capture the relative "importance" of Web pages, independent of any particular search query. To yield more accurate search results, we propose computing a set of PageRank vectors, biased using a set of representative topics, to capture more accurately the notion of importance with respect to a particular topic. For ordinary keyword search queries, we compute the topic-sensitive PageRank scores for pages satisfying the query using the topic of the query keywords. For searches done in context (e.g., when the search query is performed by highlighting words in a Web page), we compute the topic-sensitive PageRank scores using the topic of the context in which the query appeared. By using linear combinations of these (precomputed) biased PageRank vectors to generate context-specific importance scores for pages at query time, we show that we can generate more accurate rankings than with a single, generic PageRank vector. We describe techniques for efficiently implementing a large-scale search system based on the topic-sensitive PageRank scheme.
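The query-time step described in this abstract, a linear combination of precomputed topic-biased PageRank vectors, might look like the sketch below. It assumes the combination weights are the probabilities of each topic given the query or its context; the topic names, page names, numbers and the function name are hypothetical.

```python
def topic_sensitive_scores(topic_weights, topic_vectors):
    """Combine precomputed topic-biased PageRank vectors at query time.

    topic_weights: dict topic -> weight (assumed: P(topic | query)).
    topic_vectors: dict topic -> {page: biased PageRank score}.
    Returns dict page -> combined importance score.
    """
    pages = set()
    for vector in topic_vectors.values():
        pages.update(vector)
    return {
        page: sum(w * topic_vectors[t].get(page, 0.0)
                  for t, w in topic_weights.items())
        for page in pages
    }


# Hypothetical precomputed vectors for two topics.
vectors = {
    "sports": {"p1": 0.4, "p2": 0.1},
    "tech": {"p1": 0.2, "p2": 0.6},
}
print(topic_sensitive_scores({"sports": 0.75, "tech": 0.25}, vectors))
```

Only the weights change per query, so the expensive link analysis is done once offline per topic and the per-query cost is a small weighted sum.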

7.
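The traditional Web page ranking algorithm Okapi BM25 often suffers from domain drift, in which returned pages are unrelated to the domain of the query keywords, and improved algorithms require domain vectors to be built by hand. To address these problems, a Web search ranking algorithm based on BM25 and a Softmax regression classification model is proposed. The method first preprocesses the page text and represents it as vectors with a bag-of-words model. It then trains a Softmax regression classifier on a small amount of page data to predict class scores for the test pages, and combines these scores with the BM25 retrieval scores to produce the final ranking. Experimental results show that the algorithm achieves good ranking results without manually built domain vectors.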
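The pipeline in this abstract has two scoring parts: a BM25 retrieval score and a Softmax class score. The sketch below implements the standard Okapi BM25 formula and a numerically stable softmax; how the paper combines the two is not stated in the abstract, so the weighted sum in `combined_score`, its `weight` parameter and the toy corpus are purely illustrative assumptions.

```python
import math


def bm25_score(query_terms, doc_terms, corpus, k1=1.5, b=0.75):
    """Okapi BM25 score of one tokenized document for a tokenized query."""
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)  # document frequency
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
        tf = doc_terms.count(term)
        norm = tf + k1 * (1.0 - b + b * len(doc_terms) / avg_len)
        score += idf * tf * (k1 + 1.0) / norm
    return score


def softmax(logits):
    """Turn raw class scores into probabilities (stable form)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combined_score(bm25, class_prob, weight=0.5):
    """Hypothetical fusion: weighted sum of BM25 and class probability."""
    return (1.0 - weight) * bm25 + weight * class_prob


corpus = [["web", "search", "ranking"], ["image", "clustering"]]
print(bm25_score(["ranking"], corpus[0], corpus), softmax([2.0, 0.0]))
```

The class probability acts as a domain prior: two pages with equal BM25 scores are separated by how likely each is to belong to the query's domain.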

8.
A new QoS ontology and its QoS-based ranking algorithm for Web services   Total citations: 4, self-citations: 0, citations by others: 4
Web service composition is a promising solution for building distributed applications on the Internet, in which Web service discovery is a key step. When a number of Web services offer similar functionality, they must be ranked to select the best services for a request. QoS information, which can reflect a user's expectation and experience of using a service, is often used as a distinguishing factor in a service ranking algorithm. Different service providers and participants may use different QoS concepts to describe service quality, which raises the issue of semantic interoperability of QoS. In this paper, we propose a novel approach for designing and developing a QoS ontology and its QoS-based ranking algorithm for evaluating Web services. The QoS ontology not only supports describing QoS information in great detail but also helps service participants express their QoS offers and demands at different levels of expectation. The QoS-based ranking algorithm adopts the Analytic Hierarchy Process (AHP), a multiple-criteria decision-making technique, as the underlying mechanism for a flexible and dynamic ranking algorithm. The proposed QoS ontology and ranking algorithm can be used in various applications to facilitate automatic and dynamic discovery and selection of Web services.
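AHP, the mechanism named in this abstract, derives criterion weights from a pairwise comparison matrix and then scores each alternative against the weighted criteria. The sketch below uses the row geometric-mean approximation, a common way to obtain AHP priority weights; whether the paper uses this or the principal eigenvector is not stated, and the two-criterion example and function names are assumptions.

```python
import math


def ahp_weights(pairwise):
    """Priority weights from a reciprocal pairwise comparison matrix.

    pairwise[i][j] says how much more important criterion i is than j,
    with pairwise[j][i] == 1 / pairwise[i][j].
    Uses the row geometric-mean approximation.
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def score_service(weights, normalized_qos):
    """Weighted sum of QoS values already normalized to [0, 1]."""
    return sum(w * q for w, q in zip(weights, normalized_qos))


# Hypothetical: criterion 0 is judged twice as important as criterion 1.
weights = ahp_weights([[1.0, 2.0], [0.5, 1.0]])
print(weights, score_service(weights, [0.9, 0.4]))
```

Services are then ranked by descending score, and changing the comparison matrix re-weights the criteria without touching the QoS data.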

9.
Despite the effectiveness of search engines, the persistently increasing amount of Web data continuously complicates the search task. Efforts have therefore concentrated on personalized search that takes user preferences into account. A new concept is introduced in this direction: search based on ranking a local set of categories that make up a user search profile. New algorithms are presented that use Web page categories to personalize search results. A series of user-based experiments shows that the proposed solutions are efficient. Finally, we extend our techniques to the design of topic-focused crawlers, which can be considered an alternative form of personalized search.

10.
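陈伟柱, 陈英, 吴燕. 《计算机应用》 (Journal of Computer Applications), 2005, 25(5): 995-997, 1003
A new search engine ranking algorithm based on classification techniques, CategoryRank, is proposed. Using category information, the algorithm computes page ranking scores more accurately and improves the accuracy of search engine ranking. It analyzes the link graph based on the category information between any two pages and, compared with algorithms such as PageRank, models users' browsing habits more accurately. For each page on the Web, the algorithm also computes its category attributes, which directly reflect the page's importance to different users. Finally, the offline and online models of the algorithm are unified, and the operating mechanism of the algorithm in search engine ranking is explained.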

11.
Filter modeling using gravitational search algorithm   Total citations: 4, self-citations: 0, citations by others: 4
This paper presents a new approach to linear and nonlinear filter modeling based on a gravitational search algorithm (GSA). The unknown filter parameters are treated as a vector to be optimized. Examples of infinite impulse response (IIR) filter design, as well as a rational nonlinear filter, are given. To verify the effectiveness of the proposed GSA-based filter modeling, different initial populations in the presence of different measurable noises are tested in simulations. Genetic algorithm (GA) and particle swarm optimization (PSO) are also used to model the same examples, and the simulation results are compared. The results confirm the efficiency of the proposed method.

12.
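A cross-shaped motion search algorithm   Total citations: 2, self-citations: 0, citations by others: 2
In recent years many fast motion estimation algorithms have appeared, yet the heavy computation of motion search remains the bottleneck for video compression speed. Based on the distribution characteristics of motion vectors, this paper proposes a new motion search algorithm. The algorithm is structurally simple, and test results show that it improves considerably on the original DS (diamond search) algorithm in both the number of search points and image quality; in the best case it needs only 3/4 of the search points of the DS algorithm.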

13.
Periodic orbit theory gives the basic framework for studying quantum-classical correspondence. In this paper, we first report the existence of a certain surface, which we call the devil's staircase surface. Second, taking advantage of some intriguing properties of this surface, we propose a new method to search exhaustively for periodic orbits in the anisotropic Kepler problem. Our method takes full account of an intriguing property of the initial value problem of the anisotropic Kepler problem, and it reduces the two-dimensional search to a one-dimensional search. Using this method, all of the periodic orbits up to the length \(2N=20\) (altogether 19284 distinct periodic orbits) have been obtained, which exceeds the world record of 76 periodic orbits up to \(2N=10\).

14.
The main objective of the article is to enable reliability analysts, engineers, managers, and practitioners to analyze the failure behavior of a system in a more consistent and logical manner. To this end, the authors propose a structured methodological framework that uses both qualitative and quantitative techniques for risk and reliability analysis of the system. The framework has been applied to model and analyze a complex industrial system from a paper mill. In the quantitative framework, after the Petri net model of the system is developed, the failure and repair data are synthesized using fuzzy arithmetic operations. Various system parameters of managerial importance, such as repair time, failure rate, mean time between failures, availability, and expected number of failures, are computed to quantify the behavior in terms of fuzzy, crisp, and defuzzified values. Further, to improve the reliability and maintainability characteristics of the system, an in-depth qualitative analysis is carried out using failure mode and effect analysis (FMEA) by listing all possible failure modes, their causes, and their effects on system performance. To address the limitations of the traditional FMEA method based on the risk priority number score, a risk ranking approach based on fuzzy and Grey relational analysis is proposed to prioritize failure causes.

15.
This paper proposes a novel correlation-based memetic framework (MA-C) that combines a genetic algorithm (GA) with local search (LS) using correlation-based filter ranking. The local filter method fine-tunes the population of GA solutions by adding or deleting features based on the Symmetrical Uncertainty (SU) measure. The focus is on filter methods that can assess the goodness or ranking of individual features. An empirical study of MA-C on several commonly used large-scale gene expression datasets indicates that it outperforms recent methods in the literature in terms of classification accuracy, selected feature size, and efficiency. We also investigate the balance between local and genetic search to maximize the search quality and efficiency of MA-C.
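Symmetrical Uncertainty, the measure this abstract uses to rank features, is the mutual information between two discrete variables normalized by the sum of their entropies: SU(X, Y) = 2 * (H(X) + H(Y) - H(X, Y)) / (H(X) + H(Y)). A minimal sketch for discrete (or already discretized) values follows; the function names and toy data are illustrative, and how MA-C uses the scores inside its local search is not reproduced here.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (base 2) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n)
                for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU in [0, 1]: 0 means independent, 1 means fully predictive."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant
    hxy = entropy(list(zip(x, y)))  # joint entropy H(X, Y)
    return 2.0 * (hx + hy - hxy) / (hx + hy)


labels = [0, 0, 1, 1]
print(symmetrical_uncertainty([0, 0, 1, 1], labels))  # informative feature
print(symmetrical_uncertainty([0, 1, 0, 1], labels))  # uninformative feature
```

A filter-style local search would keep or add features with high SU against the class label and drop those with low SU.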

16.
We consider the online scheduling problem with m−1 (m ≥ 2) uniform machines, each with a processing speed of 1, and one machine with a speed of s, 1 ≤ s ≤ 2, to minimize the makespan. The worst-case bound of the well-known list scheduling (LS) algorithm is given in [Y. Cho, S. Sahni, Bounds for list schedules on uniform processors, SIAM J. Comput. 9 (1980) 91-103]. An algorithm with a better competitive ratio was proposed in [R. Li, L. Shi, An on-line algorithm for some uniform processor scheduling, SIAM J. Comput. 27 (1998) 414-422]; it has a worst-case bound of 2.8795 for large m and s = 2. In this note we present a 2.45-competitive algorithm for m ≥ 4 and any s, 1 ≤ s ≤ 2.
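The baseline in this abstract, list scheduling on m−1 unit-speed machines plus one machine of speed s, can be sketched as follows. It assumes the usual rule for uniform machines: each arriving job goes to the machine on which it would finish earliest, with ties broken by machine index. This illustrates LS only, not the 2.45-competitive algorithm of the note; the function name and job sizes are hypothetical.

```python
def list_schedule(jobs, m, s):
    """Online list scheduling; returns the makespan.

    jobs: processing requirements in arrival order.
    m:    total number of machines (m - 1 of speed 1, one of speed s).
    s:    speed of the fast machine.
    """
    speeds = [1.0] * (m - 1) + [float(s)]
    finish = [0.0] * m  # current completion time of each machine
    for p in jobs:
        # Pick the machine where this job would complete earliest.
        i = min(range(m), key=lambda k: finish[k] + p / speeds[k])
        finish[i] += p / speeds[i]
    return max(finish)


print(list_schedule([2, 2, 2], m=2, s=2))
```

Each decision is made without knowledge of future jobs, which is why the worst-case ratio against the optimal offline makespan is the quantity of interest.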

17.
The affective component has been acknowledged as critical to understanding information search behavior and user-computer interaction. Few studies analyze the emotions users feel when searching for product information with search engines. The present study analyzes the emotional outcomes of the online search process, taking into account the user's (a) perceptions of success and of the effort exerted in the search process, (b) initial affective state, and (c) emotions felt during the search process. In addition, we identify profiles of online searchers based on the emotional outcomes of the search process, which allows us to differentiate the emotional processes and behavioral patterns that lead to such emotions. The results stress the importance of the affective component of online search behavior, given that these emotional outcomes are likely to influence all the subsequent actions that users perform on the Web.

18.
A new algorithm, RAV (reparameterized angle variations), is proposed that makes explicit use of trajectory information, in which the time evolution of the pen coordinates plays a crucial role. The algorithm is robust against stroke connections and abbreviations as well as shape distortions, while maintaining reasonable robustness against stroke-order variations. Preliminary experiments are reported on the Kuchibue_d-96-02 database from the Tokyo University of Agriculture and Technology. Received July 24, 2000 / Revised October 6, 2000

19.
A new local search based hybrid genetic algorithm for feature selection   Total citations: 2, self-citations: 0, citations by others: 2
This paper presents a new hybrid genetic algorithm (HGA) for feature selection (FS), called HGAFS. The vital aspect of this algorithm is the selection of a salient feature subset of reduced size. HGAFS incorporates a new local search operation, embedded in the HGA, to fine-tune the search in the FS process. The local search technique works on the basis of the distinct and informative nature of input features, computed from their correlation information. The aim is to guide the search so that newly generated offspring are adjusted toward the less correlated (distinct) features, which capture both general and special characteristics of a given dataset. HGAFS thus reduces the redundancy of information among the selected features. In addition, HGAFS emphasizes selecting a small subset of salient features by using a subset size determination scheme. We have tested HGAFS on 11 real-world classification datasets with dimensions varying from 8 to 7129 and compared its performance with the results of ten other well-known FS algorithms. HGAFS consistently performs better at selecting subsets of salient features, with better resulting classification accuracy.

20.
In this paper, we consider the problem of clustering and re-ranking Web image search results so as to improve diversity at high ranks. We propose a novel ranking framework, cluster-constrained conditional Markov random walk (CCCMRW), which has two key steps: first, cluster images into topics, and then perform a Markov random walk in an image graph conditioned on constraints from the image cluster information. To cluster the retrieval results of Web images, a novel graph clustering model is proposed. We explore the surrounding text to mine the correlations between words and images, and these correlations are used to improve the clustering results. Two kinds of correlations, word-to-image and word-to-word, are mainly considered. As a standard text processing technique, the tf-idf method cannot measure the word-to-image correlation directly. We therefore propose to combine tf-idf with a novel word feature, visibility, to infer the word-to-image correlation. Using a latent Dirichlet allocation model, we define a topic relevance function to compute the weights of word-to-word correlations. Taking word-to-image correlations as heterogeneous links and word-to-word correlations as homogeneous links, graph clustering algorithms such as complex graph clustering and spectral co-clustering are used to cluster images into topics. To perform CCCMRW, a two-layer image graph is constructed, with image cluster nodes added as an upper layer above a base image graph. Conditioned on the image cluster information from the upper layer, the Markov random walk is constrained to walk across different image clusters, so as to give high rank scores to images of different topics and thereby gain diversity. Encouraging clustering and re-ranking outputs on Google image search results are reported.
