Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
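A Survey of Research on Knowledge Evaluation in KDD   Total citations: 12, self-citations: 1, cited by others: 11
In the knowledge discovery process, mining algorithms generate a large number of patterns, most of which are of no interest to users. Evaluating these patterns and selecting the knowledge that is interesting and useful to users is therefore a crucial step, which makes research on knowledge evaluation important. The paper first analyzes how the evaluation process can be combined with knowledge discovery; it then introduces the comprehensive evaluation measure (interestingness) from both objective and subjective perspectives; finally, it outlines a new evaluation method the authors propose for causal association rules.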

2.
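Data mining discovers hidden structures and patterns in data. Many of the discovered patterns, however, may already be known to the user, which makes them meaningless and uninteresting. The literature mostly emphasizes the accuracy and comprehensibility of classification rules, but discovering interesting rules remains a daunting challenge for data mining algorithms. This paper uses a genetic data mining method that measures the interestingness of classification rules as they are generated, producing interesting rules directly. Experiments show that the method is feasible and efficient.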

3.
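Research on a Multi-Strategy Method for Mining Rules of Interest   Total citations: 20, self-citations: 1, cited by others: 19
Data mining discovers a large number of rules from large databases, and selecting the rules of interest is an important research topic in knowledge discovery. This paper studies a method that uses domain knowledge to measure the subjective degree of interest in a rule, gives a function for computing an objective degree of interest that measures a rule's conciseness and novelty, and proposes a multi-strategy method for selecting the rules that users care about.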

4.
What makes patterns interesting in knowledge discovery systems   Total citations: 6, self-citations: 0, cited by others: 6
One of the central problems in the field of knowledge discovery is the development of good measures of interestingness of discovered patterns. Such measures are divided into objective measures, which depend only on the structure of a pattern and the underlying data used in the discovery process, and subjective measures, which also depend on the class of users who examine the pattern. The paper focuses on subjective measures of interestingness. These measures are classified into actionable and unexpected, and the relationship between them is examined. The unexpected measure of interestingness is defined in terms of the user's belief system, and the interestingness of a pattern is expressed in terms of how it affects that belief system. The paper also discusses how this unexpected measure of interestingness can be used in the discovery process.

5.
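A user interest model describes a user's personal information, professional background, preferences, and historical behavior. From this information, a system can discover and predict the user's information needs and provide personalized information recommendation services. The user interest model is an important factor in the service efficiency of a recommender system, so modeling user interest is one of the key issues in implementing a personalized recommender system. Based on the characteristics of users of educational websites, this paper proposes a dynamic model that combines fixed interests and temporary interests.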

6.
When a search engine user becomes interested in an area that is new to him/her, it is difficult to enter a query that precisely expresses the interest, or to select areas that include it, because the user is still a beginner in that area. This paper presents a system called Index Navigator, which tells the user which areas he/she is interested in, which keywords to enter as a query, and which documents concern the interest. A hard problem for such a system is understanding the user's interest from the queries entered. Index Navigator employs an inference method called Cost-based Cooperation of Multi-Abducers (CCMA) to understand a user's interest from the history of the user's queries (expressions of interest in incomplete keywords), even if the speed at which the user's interest changes cannot be estimated. According to the experimental results, Index Navigator with this method guided the user to areas, keywords, and documents relevant to his/her interest.

7.
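Research on Interestingness Factors Affecting Association Rule Mining   Total citations: 7, self-citations: 2, cited by others: 7
Association rule mining is an important area of data mining research, and one of its key problems is evaluating how interesting the mined rules are. In practice, a large number of rules can be mined from a data source, but most of them are not necessarily interesting to the user. The interestingness of association rules can be assessed from both objective and subjective perspectives. Templates are used to separate the rules that interest the user from those that do not, which provides the subjective assessment; constraints are added on top of the confidence and support of association rules for the objective assessment.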
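
The two kinds of assessment named in this abstract can be illustrated with a minimal sketch. It assumes the standard textbook definitions of support and confidence and a very simple notion of a template (a set of items allowed on each side of a rule); the names `support`, `confidence`, and `matches_template` are hypothetical and are not taken from the paper.

```python
from typing import FrozenSet, List, Set, Tuple

# A rule is (antecedent, consequent); both sides are sets of item names.
Rule = Tuple[FrozenSet[str], FrozenSet[str]]


def support(itemset: FrozenSet[str], transactions: List[Set[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(rule: Rule, transactions: List[Set[str]]) -> float:
    """support(antecedent and consequent) / support(antecedent)."""
    antecedent, consequent = rule
    base = support(antecedent, transactions)
    if base == 0.0:
        return 0.0
    return support(antecedent | consequent, transactions) / base


def matches_template(rule: Rule,
                     allowed_antecedent: Set[str],
                     allowed_consequent: Set[str]) -> bool:
    """Assumed template form: each side may only use the listed items."""
    antecedent, consequent = rule
    return antecedent <= allowed_antecedent and consequent <= allowed_consequent


transactions = [
    {"milk", "bread"},
    {"milk"},
    {"bread", "butter"},
    {"milk", "bread", "butter"},
]
rule = (frozenset({"milk"}), frozenset({"bread"}))
print(support(frozenset({"milk", "bread"}), transactions))  # objective measure
print(confidence(rule, transactions))                       # objective measure
print(matches_template(rule, {"milk", "butter"}, {"bread"}))  # subjective filter
```

In this sketch the objective side is a pair of numeric thresholds a caller could apply to `support` and `confidence`, while the subjective side is the user-supplied template; the additional constraints the paper mentions are not described in the abstract and are not modelled here.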

8.
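Dai Min, Huang Yalou. 《计算机应用》 (Journal of Computer Applications), 2006, 26(1): 207-209
Association rules are usually presented as a list of rules, and many association rule mining algorithms generate a large number of rules, which makes it very difficult for users to understand the rules and find the interesting ones. To identify important rules while preserving the completeness of the mining result, the paper proposes presenting association rules hierarchically, from general to specific, according to the generality of the rules. The set of most general rules in the mining result first expresses the most general and basic domain knowledge; the more specific rules under each general rule can then be viewed level by level as the user requests. This presentation allows association rules to be viewed at different levels and makes the mining result easier to manage and understand.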

9.
Order-preserving submatrix (OPSM) has become important in modelling biologically meaningful subspace clusters, capturing the general tendency of gene expressions across a subset of conditions. With advances in microarray and analysis techniques, large volumes of gene expression datasets and OPSM mining results are produced. An OPSM query can efficiently retrieve relevant OPSMs from this large body of OPSM data. However, improving OPSM query relevancy remains a difficult task in real-life exploratory data analysis. First, it is hard to capture subjective interestingness aspects, e.g., the analyst's expectations given her/his domain knowledge. Second, even when these expectations can be declaratively specified, it is still challenging to use them during the computational process of OPSM queries. To the best of our knowledge, existing methods mainly focus on batch OPSM mining, while few works involve OPSM queries. To solve these problems, the paper proposes two constrained OPSM query methods, which exploit user-defined constraints to search for relevant results in two kinds of indices introduced in the paper. Extensive experiments on real datasets demonstrate that queries based on the multi-dimension index (cIndex) and the enumerating sequence index (esIndex) perform better than brute-force search.

10.
Knowledge discovery in databases is used to discover useful and understandable knowledge from large databases. A knowledge discovery process consists of two steps, the data mining step and the evaluation step. This paper studies evaluating and ranking the interestingness of summaries generated from databases, which is part of the second step, using diversity measures. Sixteen previously analyzed diversity measures of interestingness are used along with three not previously considered ones, taken from different well-known areas. The latter three measures are evaluated theoretically according to five principles that a measure must satisfy to be considered acceptable for ranking summaries. A theoretical correlation study of the eight measures that satisfy all five principles is presented, based on mathematical proofs. An empirical evaluation is conducted using three real databases, and a classification of the eight measures is then deduced. The resulting classification is used to reduce the number of measures to only two, which are the best over all criteria and produce dissimilar results. This helps users interpret the most important discovered knowledge in their decision-making process.
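
As a rough illustration of ranking summaries with a diversity measure, the sketch below scores each summary by the distribution of its tuple counts. The abstract does not name the measures it studies, so Shannon entropy and the Simpson index are used here purely as assumed, well-known examples; the function names and the choice to rank higher scores first are also assumptions.

```python
import math
from typing import Callable, Dict, List, Sequence


def shannon_entropy(counts: Sequence[int]) -> float:
    """Entropy (in bits) of the count distribution of one summary."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("a summary needs at least one tuple")
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probs)


def simpson_index(counts: Sequence[int]) -> float:
    """Sum of squared proportions: 1.0 means all tuples fall in one row."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("a summary needs at least one tuple")
    return sum((c / total) ** 2 for c in counts)


def rank_summaries(summaries: Dict[str, Sequence[int]],
                   measure: Callable[[Sequence[int]], float]) -> List[str]:
    """Return summary names ordered by descending score (an assumption)."""
    return sorted(summaries, key=lambda name: measure(summaries[name]),
                  reverse=True)


summaries = {
    "by_region": [25, 25, 25, 25],   # evenly spread counts
    "by_product": [97, 1, 1, 1],     # one dominant row
}
print(rank_summaries(summaries, shannon_entropy))
print(rank_summaries(summaries, simpson_index))
```

The two measures order the example summaries in opposite directions, which is the kind of disagreement that motivates comparing measures and keeping only a small set that produce dissimilar rankings.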

11.
Many multimedia content-based retrieval systems allow query formulation with the user setting the relative importance of features (e.g., color, texture, shape, etc.) to mimic the user's perception of similarity. However, the systems do not modify their similarity matching functions, which are defined during system development. We present a neural network-based learning algorithm for adapting the similarity matching function toward the user's query preference based on his/her relevance feedback. The relevance feedback is given as ranking errors (misranks) between the retrieved and desired lists of multimedia objects. The algorithm is demonstrated for facial image retrieval using the NIST Mugshot Identification Database, with encouraging results.

12.
In this paper, we advance a technique to develop a user profile for information retrieval through knowledge acquisition techniques. The profile bridges the discrepancy between user-expressed keywords and system-recognizable index terms. The approach is based on the application of personal construct theory to determine a user's vocabulary and his/her view of different documents in a training set. The elicited knowledge is used to develop a model for each phrase/concept given by the user by employing machine learning techniques. Our model correlates the concepts in a user's vocabulary to the index terms present in the documents in the training set. Computation of dependence between the user phrases also contributes to the development of the user profile and to creating a classification of documents. The resulting system is capable of automatically identifying the user concepts and translating queries into index terms computed by the conventional indexing process. The system is evaluated using the standard measures of precision and recall by comparing its performance against that of the SMART system for different queries. This research is supported by NSF grant IRI-8805875.

13.
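Association rule mining is an important area of data mining research, and one of its key problems is evaluating the interestingness of the mined rules. Previous research has found that in practice it is often easy to mine a large number of rules from a data source, but most of these rules are of no interest to the user. This paper discusses two aspects of rule interestingness measures, subjective and objective, and then describes how templates can be used to mine interesting rules.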

14.
Mining association rules and mining sequential patterns both aim to discover customer purchasing behaviors from a transaction database, so that the quality of business decisions can be improved. However, the transaction database can be very large. Finding all the association rules and sequential patterns in a large database is very time consuming, and users may be interested in only some of the information. Moreover, the criteria that discovered association rules and sequential patterns must meet may differ from one user requirement to another. Much information that is uninteresting with respect to the user requirements can be generated when traditional mining methods are applied. Hence, a data mining language is needed so that users can query only the knowledge that interests them from a large database of customer transactions. In this paper, such a data mining language is presented. With it, users can specify the items of interest and the criteria for the association rules or sequential patterns to be discovered. Efficient data mining techniques are also proposed to extract the association rules and sequential patterns according to the user requirements.


15.
Personalized Web search for improving retrieval effectiveness   Total citations: 11, self-citations: 0, cited by others: 11
Current Web search engines are built to serve all users, independent of the special needs of any individual user. Personalization of Web search is to carry out retrieval for each user incorporating his/her interests. We propose a novel technique to learn user profiles from users' search histories. The user profiles are then used to improve retrieval effectiveness in Web search. A user profile and a general profile are learned from the user's search history and a category hierarchy, respectively. These two profiles are combined to map a user query into a set of categories which represent the user's search intention and serve as a context to disambiguate the words in the user's query. Web search is conducted based on both the user query and the set of categories. Several profile learning and category mapping algorithms and a fusion algorithm are provided and evaluated. Experimental results indicate that our technique to personalize Web search is both effective and efficient.

16.
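Design of a Personalized Search Engine Based on User Interest   Total citations: 1, self-citations: 0, cited by others: 1
The paper builds a user interest model for personalized settings. The model mines the user's own information and the access information of other users to obtain a user interest vector, which is then used to filter retrieval results so that the results match the user's personal preferences. Finally, the model is applied to design a personalized search engine system.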
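
Filtering retrieval results with a user interest vector could look roughly like the sketch below. The abstract does not say how the vector is compared with a result, so cosine similarity over term weights and a fixed threshold are assumptions, and `cosine` and `filter_results` are hypothetical names.

```python
import math
from typing import Dict, List, Tuple

Vector = Dict[str, float]  # term -> weight


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two sparse term-weight vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_results(interest: Vector,
                   results: List[Tuple[str, Vector]],
                   threshold: float = 0.2) -> List[Tuple[str, float]]:
    """Keep results whose similarity to the interest vector reaches the
    (assumed) threshold, ordered from most to least similar."""
    scored = [(doc_id, cosine(interest, vec)) for doc_id, vec in results]
    kept = [(doc_id, score) for doc_id, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


interest = {"python": 1.0, "mining": 1.0}
results = [
    ("d1", {"python": 1.0}),
    ("d2", {"cooking": 1.0}),
    ("d3", {"python": 1.0, "mining": 1.0}),
]
for doc_id, score in filter_results(interest, results):
    print(doc_id, round(score, 3))
```

How the interest vector itself is learned from the user's own data and from other users' access records is the substance of the paper and is not reproduced here.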

17.
This paper describes the implementation of evolutionary techniques for information filtering and collection from the World Wide Web. We consider the problem of building intelligent agents to facilitate a person's search for information on the Web. An intelligent agent has been developed that uses a metagenetic algorithm in order to collect and recommend Web pages that will be interesting to the user. The user's feedback on the agent's recommendations drives the learning process to adapt the user's profile with his/her interests. The software agent utilizes the metagenetic algorithm to explore the search space of user interests. Experimental results are presented in order to demonstrate the suitability of the metagenetic algorithm's approach on the Web.

18.
We propose a robotic wheelchair that observes the user and the environment. It can understand the user's intentions from his/her behaviors and the environmental information. It also observes the user when he/she is off the wheelchair, recognizing the user's commands indicated by hand gestures. Experimental results show our approach to be promising. Although the current system uses face direction, for people who find it difficult to move their faces, it can be modified to use the movements of the mouth, eyes, or any other body parts that they can move. Since such movements are generally noisy, the integration of observing the user and the environment will be effective in understanding the real intentions of the user and will be a useful technique for better human interfaces.

19.
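The paper proposes a biometric-based (k, n) threshold group signature mechanism, under which users in the system authenticate their identity with their own biometric features in order to sign messages on behalf of the whole group. Biometric authentication can use biometric information such as a user's fingerprint or iris to recover the secret assigned to that user in advance. In the proposed biometric-based threshold signature scheme, there is a group of n users. Each user's secret is stored in a tamper-resistant smart card; the user authenticates with his or her biometric features, and once authentication succeeds the smart card can sign on the user's behalf. When any k or more users in the system have been authenticated, the system as a whole can form a signature that represents the group.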
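
The (k, n) threshold property at the heart of this scheme can be illustrated with a minimal sketch. The abstract does not specify the underlying construction, so Shamir secret sharing over a prime field is swapped in here as an assumed, standard example; the biometric authentication and smart-card steps are not modelled, and `split_secret` and `recover_secret` are hypothetical names.

```python
import random
from typing import List, Optional, Tuple

PRIME = 2**127 - 1  # a Mersenne prime, used as the field modulus

Share = Tuple[int, int]  # (x, f(x) mod PRIME)


def split_secret(secret: int, k: int, n: int,
                 rng: Optional[random.Random] = None) -> List[Share]:
    """Split `secret` into n shares so that any k of them recover it."""
    if not (1 <= k <= n) or not (0 <= secret < PRIME):
        raise ValueError("need 1 <= k <= n and 0 <= secret < PRIME")
    rng = rng or random.SystemRandom()
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation mod PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares: List[Share]) -> int:
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


shares = split_secret(123456789, k=3, n=5)
print(recover_secret(shares[:3]) == 123456789)
print(recover_secret(shares[2:]) == 123456789)
```

A real threshold signature scheme would normally have each authenticated participant produce a partial signature and combine those, rather than reconstruct the secret in one place; the sketch only shows why any k of the n shares suffice.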

20.
Businesses and people often organize their information of interest (IOI) into a hierarchy of folders (or categories). The personalized folder hierarchy provides a natural way for each of the users to manage and utilize his/her IOI (a folder corresponds to an interest type). Since the interest is relatively long-term, continuous web scanning is essential. It should be directed by precise and comprehensible specifications of the interest. A precise specification may direct the scanner to those spaces that deserve scanning, while a specification comprehensible to the user may facilitate manual refinement, and a specification comprehensible to information providers (e.g. Internet search engines) may facilitate the identification of proper seed sites to start scanning. However, expressing such specifications is quite difficult (and even implausible) for the user, since each interest type is often implicitly and collectively defined by the content (i.e. documents) of the corresponding folder, which may even evolve over time. In this paper, we present an incremental text mining technique to efficiently identify the user's current interest by mining the user's information folders. The specification mined for each interest type specifies the context of the interest type in conjunctive normal form, which is comprehensible to general users and information providers. The specification is also shown to be more precise in directing the scanner to those sites that are more likely to provide IOI. The user may thus maintain his/her folders and then constantly get IOI, without paying much attention to the difficult tasks of interest specification and seed identification.
