Similar literature
 20 similar documents found (search time: 0 ms)
1.

We design an information retrieval algorithm that mimics the stochastic behavior of decision-makers (DMs) when evaluating the alternatives displayed by an online search engine. The algorithm consists of a decision tree that incorporates all the 1024 decision nodes that may arise from the information retrieval process of DMs. We calibrate the behavior of the algorithm to the one observed from online users and run several sets of 1,000,000 queries. Each query lets DMs decide which subset of the ten alternatives composing the initial page of results to click, allowing us to evaluate their behavior as ranking reliability is assumed to decrease when DMs decide not to click on an alternative. We compare the click-through rates (CTRs) obtained when modifying the degree of ranking reliability derived from the alternatives displayed on the first page of search results. We illustrate how the stability of the CTR prevails among the top-ranked alternatives within relatively reliable scenarios while it drops when imposing large initial decrements in reliability. The resulting consequences regarding the importance of relative ranking positions are analyzed, the top three alternatives exhibiting a generally contained decrease in their CTRs that contrasts with the cumulative pattern arising from the fourth position onwards.


2.
Topic-sensitive PageRank: a context-sensitive ranking algorithm for Web search   Cited by: 14 (self-citations: 0; citations by others: 14)
The original PageRank algorithm for improving the ranking of search-query results computes a single vector, using the link structure of the Web, to capture the relative "importance" of Web pages, independent of any particular search query. To yield more accurate search results, we propose computing a set of PageRank vectors, biased using a set of representative topics, to capture more accurately the notion of importance with respect to a particular topic. For ordinary keyword search queries, we compute the topic-sensitive PageRank scores for pages satisfying the query using the topic of the query keywords. For searches done in context (e.g., when the search query is performed by highlighting words in a Web page), we compute the topic-sensitive PageRank scores using the topic of the context in which the query appeared. By using linear combinations of these (precomputed) biased PageRank vectors to generate context-specific importance scores for pages at query time, we show that we can generate more accurate rankings than with a single, generic PageRank vector. We describe techniques for efficiently implementing a large-scale search system based on the topic-sensitive PageRank scheme.
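The idea of precomputing topic-biased PageRank vectors and combining them linearly at query time can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and does not come from the abstract: the toy link graph `LINKS`, the seed sets in `TOPIC_SEEDS`, the damping factor of 0.85, the uniform handling of dangling pages, and the query-time topic weights in `QUERY_WEIGHTS`.

```python
def pagerank(links, bias, damping=0.85, iterations=50):
    """Power iteration where the teleport step follows `bias` instead of a uniform jump.

    links: dict mapping each page to the list of pages it links to.
    bias:  dict mapping pages to teleport probabilities (should sum to 1).
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Teleport mass goes only to pages favoured by the topic bias.
        new_rank = {p: (1.0 - damping) * bias.get(p, 0.0) for p in pages}
        for p in pages:
            targets = links[p]
            if targets:
                share = damping * rank[p] / len(targets)
                for q in targets:
                    new_rank[q] += share
            else:
                # Dangling page: spread its mass uniformly (an illustrative choice).
                share = damping * rank[p] / n
                for q in pages:
                    new_rank[q] += share
        rank = new_rank
    return rank


def topic_bias(pages, topic_pages):
    """Uniform teleport distribution over the seed pages of one topic."""
    weight = 1.0 / len(topic_pages)
    return {p: (weight if p in topic_pages else 0.0) for p in pages}


def blend(topic_ranks, topic_weights):
    """Query-time linear combination of the precomputed topic vectors."""
    pages = next(iter(topic_ranks.values())).keys()
    return {
        p: sum(w * topic_ranks[t][p] for t, w in topic_weights.items())
        for p in pages
    }


# Hypothetical toy data, for illustration only.
LINKS = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
TOPIC_SEEDS = {"sports": {"a"}, "science": {"d"}}

# Offline step: one biased PageRank vector per topic.
TOPIC_RANKS = {
    topic: pagerank(LINKS, topic_bias(LINKS, seeds))
    for topic, seeds in TOPIC_SEEDS.items()
}

# Query-time step: weights expressing how strongly the query matches each topic.
QUERY_WEIGHTS = {"sports": 0.7, "science": 0.3}
SCORES = blend(TOPIC_RANKS, QUERY_WEIGHTS)
```

Only the cheap `blend` step runs per query in this sketch; the expensive power iteration is done once per topic beforehand, which mirrors the precomputation described in the abstract.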

3.
Despite the effectiveness of search engines, the persistently increasing amount of web data continuously obscures the search task. Efforts have thus concentrated on personalized search that takes account of user preferences. A new concept is introduced in this direction: search based on ranking a local set of categories that comprise a user search profile. New algorithms are presented that utilize web page categories to personalize search results. A series of user-based experiments shows that the proposed solutions are efficient. Finally, we extend the application of our techniques to the design of topic-focused crawlers, which can be considered an alternative form of personalized search.

4.
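浦慧忠 (Pu Huizhong). 《软件》 (Software), 2014, (7): 126-128
Because users' interests differ, this paper studies how to obtain effective data on user interests from users' browsing behavior. To address the shortcomings of existing user interest models, and drawing on related techniques from Web mining, it first builds the user interest model explicitly and then updates it implicitly, yielding a user interest model that can adapt to changes in user interests.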

5.
In this paper, we present a novel meta-feature generation method in the context of meta-learning, which is based on rules that compare the performance of individual base learners in a one-against-one manner. In addition to these new meta-features, we also introduce a new meta-learner called Approximate Ranking Tree Forests (ART Forests) that performs very competitively when compared with several state-of-the-art meta-learners. Our experimental results are based on a large collection of datasets and show that the proposed new techniques can improve the overall performance of meta-learning for algorithm ranking significantly. A key point in our approach is that each performance figure of any base learner for any specific dataset is generated by optimising the parameters of the base learner separately for each dataset.

6.
Searching for relevant information on the World Wide Web is often a laborious and frustrating task for casual and experienced users. To help improve searching on the Web based on a better understanding of user characteristics, we investigate what types of knowledge are relevant for Web-based information seeking, and which knowledge structures and strategies are involved. Two experimental studies are presented, which address these questions from different angles and with different methodologies. In the first experiment, 12 established Internet experts are first interviewed about search strategies and then perform a series of realistic search tasks on the World Wide Web. From this study a model of information seeking on the World Wide Web is derived and then tested in a second study. In the second experiment, two potentially relevant types of knowledge are compared directly. Effects of Web experience and domain-specific background knowledge are investigated with a series of search tasks in an economics-related domain (introduction of the Euro currency). We find differential and combined effects of both Web experience and domain knowledge: while successful search performance requires the combination of the two types of expertise, specific strategies directly related to Web experience or domain knowledge can be identified.

7.
8.
This study addresses the graphical Web directory, a new way to present the hierarchical structure of a Web directory. An analysis of the characteristics and problems of current Web directories is presented. The graphical Web directory feature, which supports information processing and decision making during Web directory browsing in Web search, is proposed to improve users' performance and satisfaction. An experiment was conducted to test the effectiveness of the proposed feature. The results of the experiment indicated that (a) this feature improved users' initial and overall search performance by 32.6% and 43.4%, respectively, and (b) it also improved users' satisfaction by 27.7%.

10.
In this paper, we consider the problem of clustering and re-ranking web image search results so as to improve diversity at high ranks. We propose a novel ranking framework, namely cluster-constrained conditional Markov random walk (CCCMRW), which has two key steps: first, cluster images into topics, and then perform a Markov random walk in an image graph conditioned on constraints from the image cluster information. To cluster the retrieval results of web images, a novel graph clustering model is proposed. We explore the surrounding text to mine the correlations between words and images, and these correlations are then used to improve the clustering results. Two kinds of correlations, namely word-to-image and word-to-word correlations, are mainly considered. As a standard text-processing technique, the tf-idf method cannot measure the word-to-image correlation directly. Therefore, we propose to combine the tf-idf method with a novel word feature, namely visibility, to infer the word-to-image correlation. Using the latent Dirichlet allocation model, we define a topic relevance function to compute the weights of word-to-word correlations. Taking word-to-image correlations as heterogeneous links and word-to-word correlations as homogeneous links, graph clustering algorithms, such as complex graph clustering and spectral co-clustering, are respectively used to cluster images into topics. To perform CCCMRW, a two-layer image graph is constructed, with image cluster nodes added as an upper layer on top of a base image graph. Conditioned on the image cluster information from the upper layer, the Markov random walk is constrained to walk across different image clusters, so as to give high rank scores to images of different topics and thereby gain diversity. Encouraging clustering and re-ranking outputs on Google image search results are reported.

11.
This paper provides a transparent and speculative algorithm for content-based web page prefetching. The algorithm relies on a profile based on the Internet browsing habits of the user. It aims at reducing the perceived latency when the user requests a document by clicking on a hyperlink. The proposed user profile relies on the frequency of occurrence of selected elements forming the web pages visited by the user. These frequencies are employed in a mechanism for the prediction of the user's future actions. For the anticipation of an adjacent action, the anchor text around each of the outbound links is used and weights are assigned to these links. Some of the linked documents are then prefetched and stored in a local cache according to the assigned weights. The proposed algorithm was tested against three different prefetching algorithms and yielded improved cache-hit rates given a moderate bandwidth overhead. Furthermore, the precision of accurately inferring the user's preference is evaluated through recall-precision curves. Statistical evaluation confirms that the achieved improvement in recall-precision performance is significant.

12.
User profiling for Web page filtering   Cited by: 2 (self-citations: 0; citations by others: 2)
To help address pressing problems with information overload, researchers have developed personal agents to provide assistance to users in navigating the Web. To provide suggestions, such agents rely on user profiles representing interests and preferences, which makes acquiring and modeling interest categories a critical component in their design. Existing profiling approaches have only partially tackled the characteristics that distinguish user profiling from related tasks. The authors' technique generates readable user profiles that accurately capture interests, starting from observations of user behavior on the Web.

13.
The primary goal of the secure socket layer protocol (SSL) is to provide confidentiality and data integrity between two communicating entities. Since the most computationally expensive step in the SSL handshake protocol is the server's RSA decryption, the proposed secret exchange algorithm is introduced to speed up SSL session initialization. This paper first points out that the previous batch method is impractical since it requires multiple certificates. It then proposes a unique certificate scheme to overcome the problem. The optimization strategy, which is based on a constrained model that considers a user requirements-aware security ranking, focuses on the optimal result for different public key sizes. The parameter is further optimized by integrating user requirements for Internet QoS, such as the stability of the system and the tolerable response time. Finally, the proposed algorithm is shown to be practical and efficient through both analysis and simulation studies.

14.
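With web logs growing rapidly in scale in today's Internet environment and offering extremely rich content for mining, this paper surveys the main literature and work in China on web-log-based user behavior detection and user profiling. It briefly describes the basic theory of these two fields and summarizes their main applications through cases from five sectors: public security work, e-commerce, healthcare, tourism, and libraries. Mining web logs can greatly improve user experience, but its shortcomings in privacy protection must also be faced squarely.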

15.
16.
Web search     

17.
Relevance ranking in georeferenced video search   Cited by: 1 (self-citations: 0; citations by others: 1)
The rapid adoption and deployment of ubiquitous video cameras has led to the collection of voluminous amounts of media data. However, indexing and searching of large video databases remain a very challenging task. Recently, some recorded video data are automatically annotated with meta-data collected from various sensors such as Global Positioning System (GPS) and compass devices. In our earlier work, we proposed the notion of a viewable scene model derived from the fusion of location and direction sensor information with a video stream. Such georeferenced media streams are useful in many applications and, very importantly, they can effectively be searched via their meta-data on a large scale. Consequently, search by geo-properties complements traditional content-based retrieval methods. The result of a georeferenced video query will in general consist of a number of video segments that satisfy the query conditions, but with more or less relevance. For example, a building of interest may appear in a video segment, but may only be visible in a corner. Therefore, an essential and integral part of a video query is the ranking of the result set according to the relevance of each clip. An effective result ranking is even more important for video than it is for text search, since the browsing of results can only be achieved by viewing each clip, which is very time consuming. In this study, we investigate and present three ranking algorithms that use spatial and temporal properties of georeferenced videos to effectively rank search results. To allow our techniques to scale to large video databases, we further introduce a histogram-based approach that allows fast online computations. An experimental evaluation demonstrates the utility of the proposed methods.

18.
When people talk to each other, eye contact is very important for trustful and efficient communication. Video-conferencing systems were invented to enable such communication over large distances, more recently relying mostly on the Internet and personal computers. Despite the low cost of such solutions, broader acceptance and use of these means of communication has not happened yet. One of the most important reasons for this situation is that it is almost impossible to establish eye contact between distant parties on the most common hardware configurations of such video-conferencing systems, where the camera for face capture is usually mounted above the computer monitor on which the face of the correspondent is observed. Different hardware and software solutions to this problem of missing eye contact have been proposed over the years. In this article we propose a simple solution that can improve the subjective feeling of eye contact, based on how people perceive 3D scenes displayed on slanted surfaces, and offer some experiments in support of the hypothesis.

19.
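The PageRank algorithm pays little attention to page content and focuses only on hyperlink analysis, so the pages users actually need are often not ranked near the top. To address this, a fused page-ranking algorithm for search engines is proposed. The algorithm considers three main aspects (term weight, link analysis, and user preference) to obtain a weight score for each URL. Every page awaiting crawling thus has its own weight score, and the hyperlink selection routine uses these weights to pick the single page or batch of pages with the highest weights to crawl, so as to achieve precise retrieval.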
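The abstract does not say how the three signals are combined. As a rough sketch only, the selection step could look like the following, assuming a simple weighted sum: the `UrlSignals` record, the `fused_score` and `select_top` helpers, the example URLs, and the weights (0.4, 0.4, 0.2) are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class UrlSignals:
    """Per-URL inputs, each assumed to be normalised to the range [0, 1]."""
    url: str
    term_weight: float       # content signal: how well page terms match the topic
    link_score: float        # link-analysis signal, e.g. a PageRank-style score
    user_preference: float   # personalisation signal derived from user behaviour


def fused_score(signals, weights=(0.4, 0.4, 0.2)):
    """Weighted sum of the three signals (the linear form is an assumption)."""
    w_term, w_link, w_pref = weights
    return (
        w_term * signals.term_weight
        + w_link * signals.link_score
        + w_pref * signals.user_preference
    )


def select_top(candidates, k=1, weights=(0.4, 0.4, 0.2)):
    """Pick the k candidate URLs with the highest fused score to crawl next."""
    ranked = sorted(candidates, key=lambda s: fused_score(s, weights), reverse=True)
    return ranked[:k]


# Hypothetical candidates, for illustration only.
CANDIDATES = [
    UrlSignals("http://example.com/a", term_weight=0.9, link_score=0.2, user_preference=0.1),
    UrlSignals("http://example.com/b", term_weight=0.5, link_score=0.6, user_preference=0.9),
    UrlSignals("http://example.com/c", term_weight=0.1, link_score=0.9, user_preference=0.2),
]

NEXT_BATCH = [s.url for s in select_top(CANDIDATES, k=2)]
```

With these made-up numbers, page `b` is selected first even though it leads on neither term weight nor link score alone, which is the kind of effect a fused score is meant to produce.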

20.
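A personalized recommendation method is proposed for new customers shopping on a commercial site. First, the browsing information of customers who have already made purchases is used to generate shopping behavior models. The new customer's browsing behavior on the site is then captured to generate a browsing behavior model. Nearest-neighbor collaborative filtering is used to produce the set of users whose behavior is closest to the new customer's, and the products already purchased by these nearest neighbors are recommended to the new customer. The method can provide new customers with timely and accurate personalized product information.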
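The nearest-neighbor step described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: browsing behavior is represented as per-category visit counts, similarity is measured with cosine similarity, and the `BUYERS` data, the `recommend` helper, and the parameters `k` and `top_n` are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse visit-count vectors (dicts)."""
    dot = sum(u[key] * v[key] for key in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(new_browsing, buyers, k=2, top_n=3):
    """Recommend items bought by the k buyers whose browsing is most similar."""
    neighbours = sorted(
        ((cosine(new_browsing, data["browsing"]), name) for name, data in buyers.items()),
        reverse=True,
    )[:k]
    votes = {}
    for similarity, name in neighbours:
        for item in buyers[name]["purchased"]:
            # Each neighbour votes for its purchases, weighted by similarity.
            votes[item] = votes.get(item, 0.0) + similarity
    return sorted(votes, key=lambda item: (-votes[item], item))[:top_n]


# Hypothetical data: customers with purchase history and their browsing counts.
BUYERS = {
    "u1": {"browsing": {"laptops": 5, "mice": 2}, "purchased": ["laptop", "mouse"]},
    "u2": {"browsing": {"laptops": 3, "bags": 1}, "purchased": ["laptop", "laptop bag"]},
    "u3": {"browsing": {"novels": 4, "comics": 2}, "purchased": ["novel"]},
}

# A new customer with no purchases yet, only browsing behavior on the site.
NEW_CUSTOMER = {"laptops": 2, "mice": 1}
RECOMMENDED = recommend(NEW_CUSTOMER, BUYERS)
```

Because the new customer has no purchase history, only browsing vectors are compared in this sketch; purchases enter solely through the neighbors' records.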
