Similar literature
Found 20 similar documents; search took 16 ms
1.
Systems for knowledge discovery in databases   Cited by: 7 (self-citations: 0, citations by others: 7)
Knowledge-discovery systems face challenging problems from real-world databases, which tend to be dynamic, incomplete, redundant, noisy, sparse, and very large. These problems are addressed and some techniques for handling them are described. A model of an idealized knowledge-discovery system is presented as a reference for studying and designing new systems. This model is used in the comparison of three systems: CoverStory, EXPLORA, and the Knowledge Discovery Workbench. The deficiencies of existing systems relative to the model reveal several open problems for future research.

2.
This paper describes a graphical user-interface for database-oriented knowledge discovery systems, DBLEARN, which has been developed for extracting knowledge rules from relational databases. The interface, designed using a query-by-example approach, provides a graphical means of specifying knowledge-discovery tasks. The interface supplies a graphical browsing facility to help users to perceive the nature of the target database structure. In order to guide users' task specification, a cooperative, menu-based guidance facility has been integrated into the interface. The interface also supplies a graphical interactive adjusting facility for helping users to refine the task specification to improve the quality of learned knowledge rules.

3.
The Web has profoundly reshaped our vision of information management and processing, highlighting the power of a collaborative model of information production and consumption. This new vision influences the Knowledge Discovery in Databases domain as well. In this paper we propose a service-oriented, semantic-supported approach to the development of a platform for sharing and reuse of resources (data processing and mining techniques). The platform enables the management of different implementations of the same technique and is characterized by a community-centered attitude, with functionalities for both resource production and consumption, supporting end-users with different skills as well as resource providers with different technical and domain-specific capabilities. We first describe the semantic framework underlying the approach, then we demonstrate how this framework is exploited to give different functionalities to users through the presentation of the platform functionalities.

4.

In this editorial we briefly discuss past research on discovering first-order-logic patterns in databases. After a short introduction on the fields of Knowledge Discovery in Databases (KDD) and Inductive Logic Programming (ILP), we highlight some important areas of current research in the intersection of these two areas. Our goal is to provide readers who are not experts in these areas with a minimal background that will help them put the four contributions in the remainder of this special issue into the proper context.

5.
Modern database technologies process large volumes of data to discover new knowledge. Some large databases make discovery computationally expensive. Additional knowledge, known as domain or background knowledge, can often guide and restrict the search for interesting knowledge. This paper discusses mechanisms by which domain knowledge can be used effectively in discovering knowledge from databases. In particular, we look at the use of domain knowledge to reduce the size of the database for discovery, to optimize the hypotheses which represent the interesting knowledge to be discovered, to optimize the queries used to prove the hypotheses, and to avoid possible redundant and contradictory rule discovery. Some experimental results using the IDIS knowledge discovery tool are provided. ©2000 John Wiley & Sons, Inc.

6.
A framework for knowledge discovery and evolution in databases   Cited by: 8 (self-citations: 0, citations by others: 8)
A concept for knowledge discovery and evolution in databases is described. The key issues include: using a database query to discover new rules; using not only positive examples (answers to a query), but also negative examples to discover new rules; and harmonizing existing rules with the new rules. A tool for characterizing the exceptions in databases and evolving knowledge as a database evolves is developed.

7.
The Nielsen Opportunity Explorer™ product can be used by sales and trade marketing personnel within consumer packaged goods manufacturers to understand how their products are performing in the marketplace and find opportunities to sell more product, more profitably, to the retailers. Opportunity Explorer uses data collected at the point-of-sale terminals and by auditors of A. C. Nielsen. Opportunity Explorer uses a knowledge base of market research expertise to analyze large databases and generate interactive reports using knowledge discovery templates, converting a large space of data into concise, inter-linked information frames. Each information frame addresses specific business issues, and leads the user to seek related information by means of dynamically created hyperlinks.

8.
Knowledge discovery is a wide-ranging process that includes data mining, which is used to find meaningful and useful patterns in large amounts of data. In order to explore the factors that affect the success of university students, a knowledge discovery software system called MUSKUP has been developed and tested on student data. In this system, decision tree classification is employed as the data mining technique. With this software system, all the tasks involved in the knowledge discovery process are kept together. The advantage of this approach is access to all the functionalities of SQL Server and Analysis Services through a single piece of software. The study was carried out on data from university students. According to the results of the study, the type of registration to the university and the income level of the student's family were found to be associated with student success.

9.
10.
Abstract-driven pattern discovery in databases   Cited by: 6 (self-citations: 0, citations by others: 6)
The problem of discovering interesting patterns in large volumes of data is studied. Patterns can be expressed not only in terms of the database schema but also in user-defined terms, such as relational views and classification hierarchies. The user-defined terminology is stored in a data dictionary that maps it into the language of the database schema. A pattern is defined as a deductive rule expressed in user-defined terms that has a degree of uncertainty associated with it. Methods are presented for discovering interesting patterns based on abstracts, which are summaries of the data expressed in the language of the user.

11.
An approach to knowledge discovery in complex molecular databases is described. The machine learning paradigm used is structured concept formation, in which objects described in terms of components and their interrelationships are clustered and organized in a knowledge base. Symbolic images are used to represent classes of structured objects. A discovered molecular knowledge base is successfully used in the interpretation of a high-resolution electron density map.

12.
One way to fill in missing values is to use correlations between the attributes of the data. The problem is that it is difficult to identify relations within data containing missing values. Accordingly, we develop a kernel-based missing data imputation method in this paper. This approach aims at making an optimal inference on statistical parameters (mean, distribution function, and quantile) after missing data are imputed. We refer to this approach as the parameter optimization method (POP algorithm). We experimentally evaluate our approach and demonstrate that our POP algorithm (random regression imputation) is much better than deterministic regression imputation in efficiency and in generating an inference on the above parameters.
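
The contrast between deterministic and random regression imputation drawn in this abstract can be illustrated with a minimal sketch. It assumes a plain least-squares line rather than the kernel-based estimator the paper describes, and it is not the POP algorithm; the function names `fit_line` and `impute` and the sample rows are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def impute(rows, stochastic=False, rng=None):
    """Fill missing y values (None) in a list of (x, y) pairs.

    Deterministic mode uses the fitted value alone. Stochastic ("random")
    mode adds a residual drawn from the observed rows, which preserves
    more of the spread of y than the fitted value does.
    """
    rng = rng or random.Random(0)  # fixed seed only to keep the sketch repeatable
    observed = [(x, y) for x, y in rows if y is not None]
    slope, intercept = fit_line([x for x, _ in observed],
                                [y for _, y in observed])
    residuals = [y - (slope * x + intercept) for x, y in observed]
    filled = []
    for x, y in rows:
        if y is None:
            y = slope * x + intercept
            if stochastic:
                y += rng.choice(residuals)
        filled.append((x, y))
    return filled


# Hypothetical data: the last row has a missing y value.
rows = [(1, 2.0), (2, 4.5), (3, 5.5), (4, None)]
print(impute(rows))                    # deterministic fill
print(impute(rows, stochastic=True))   # fitted value plus a sampled residual
```

Deterministic imputation places every filled value exactly on the fitted line, so it tends to understate variance; the stochastic variant trades some per-value accuracy for a more faithful distribution, which matters when the goal is inference on quantities such as quantiles.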

13.

Concept or Galois Lattices (CL) provide a productive framework for a variety of problems that arise in Knowledge Discovery in Databases (KDD). This paper introduces this special issue of Applied Artificial Intelligence devoted to applications of Concept Lattices for KDD (CLKDD). The papers in this volume come from a call for papers issued after the first International Workshop on Concept Lattice-based Theory, Methods and Tools for KDD held in July 2001 at Stanford University. Another special issue devoted to algorithms and methods of CLKDD will appear in the Journal of Experimental and Theoretical Artificial Intelligence.

14.
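Association rules are an important research topic in the field of knowledge discovery in databases (KDD). Fuzzy association rules can express human knowledge in natural language and have attracted widespread attention from KDD researchers in recent years. However, most current methods for discovering fuzzy association rules still rely on the support and confidence measures commonly used in classical association rule discovery. In fact, fuzzy association rules can be interpreted in different ways, and the choice of interpretation strongly affects the rule discovery method. Starting from a logical point of view, this paper defines fuzzy logic rules, support, degree of implication, and related concepts, and proposes a fuzzy logic rule discovery algorithm that combines fuzzy logic concepts with the Apriori algorithm to discover fuzzy logic rules from a given quantitative database.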
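
For readers unfamiliar with the classical support and confidence measures that this abstract contrasts with its fuzzy logic rules, the following is a minimal sketch over crisp (non-fuzzy) transactions. It does not implement the paper's fuzzy support or degree of implication; the function names and the sample transactions are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent."""
    joint = support(transactions, set(antecedent) | set(consequent))
    base = support(transactions, antecedent)
    return joint / base if base else 0.0


# Hypothetical market-basket data.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
print(support(transactions, {"bread", "milk"}))       # 2 of 4 transactions
print(confidence(transactions, {"bread"}, {"milk"}))  # 2 of the 3 bread transactions
```

Apriori-style algorithms use the fact that support can only shrink as an itemset grows, so any superset of an infrequent itemset can be pruned without being counted.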

15.
This paper deals with an approach to knowledge discovery in databases applied in order to identify a dynamic model of an existing machine. The problem considered in the paper is how to identify dynamic models suitable for model-based diagnosis of a physical object. Special attention is paid to identification in an unsupervised way when large databases collected by a SCADA system are handled. A method for identifying dynamic models of objects and processes is presented, and its usefulness in technical diagnostics is shown. The method for analyzing quantitative dynamic data is based on available methods of knowledge discovery in databases. The essence of the method is to project the values of the considered set of attributes into the so-called multidimensional space of regressors. A genetic algorithm was used to select the subset of relevant features. Knowledge was induced using the support vector machines (SVM) method. The AIC measure as well as our own heuristic function were applied as evaluation criteria. The method was applied to discover a model of the temperature changes of a pump. Within the framework of the research, data gathered by an industrial system that registers data on a particular object, a deep-well pumping station, were analyzed.

16.
Knowledge discovery in databases using lattices   Cited by: 3 (self-citations: 0, citations by others: 3)
The rapid pace at which data gathering, storage and distribution technologies are developing is outpacing our advances in techniques for helping humans to analyse, understand, and digest the vast amounts of resulting data. This has led to the birth of knowledge discovery in databases (KDD) and data mining, a process whose goal is to selectively extract knowledge from data. A range of techniques, including neural networks, rule-based systems, case-based reasoning, machine learning, and statistics, can be applied to the problem. We discuss the use of concept lattices to determine dependences in the data mining process. We first define concept lattices, after which we show how they represent knowledge and how they are formed from raw data. Finally, we show how the lattice-based technique addresses different processes in KDD, especially visualization and navigation of discovered knowledge.

17.
Knowledge discovery in time series databases   Cited by: 13 (self-citations: 0, citations by others: 13)
Adding the dimension of time to databases produces time series databases (TSDB) and introduces new aspects and difficulties to data mining and knowledge discovery. In this correspondence, we introduce a general methodology for knowledge discovery in TSDB. The process of knowledge discovery in TSDB includes cleaning and filtering of time series data, identifying the most important predicting attributes, and extracting a set of association rules that can be used to predict the time series behavior in the future. Our method is based on signal processing techniques and the information-theoretic fuzzy approach to knowledge discovery. The computational theory of perception (CTP) is used to reduce the set of extracted rules by fuzzification and aggregation. We demonstrate our approach on two types of time series: stock-market data and weather data.

18.
The proliferation of large masses of data has created many new opportunities for those working in science, engineering and business. The field of data mining (DM) and knowledge discovery from databases (KDD) has emerged as a new discipline in engineering and computer science. In the modern sense of DM and KDD, the focus tends to be on extracting information characterized as knowledge from data that can be very complex and in large quantities. Industrial engineering, with the diverse areas it comprises, presents unique opportunities for the application of DM and KDD, and for the development of new concepts and techniques in this field. Many industrial processes are now automated and computerized in order to ensure the quality of production and to minimize production costs. A computerized process records large masses of data during its functioning. These real-time data, recorded to ensure that production steps can be traced, can also be used to optimize the process itself. A French truck manufacturer decided to exploit the data sets of measures recorded during the testing of diesel engines manufactured on its production lines. The goal was to discover knowledge in the data of the engine test process in order to significantly reduce (by about 25%) the processing time. This paper presents the study of knowledge discovery utilizing the KDD method. All the steps of the method were used, and two additional steps were needed. The study allowed us to develop two systems: the discovery application, which provides a real-time prediction model (with an actual reduction of 28%), and the discovery support environment, which now allows those who are not experts in statistics to extract their own knowledge for other processes.

19.
Efficient discovery of interesting statements in databases   Cited by: 3 (self-citations: 0, citations by others: 3)
The Explora system supports Discovery in Databases by large-scale search for interesting instances of statistical patterns. In this paper we describe how Explora assesses interestingness and achieves computational efficiency. These problems arise because of the variety of patterns and the immense combinatorial possibilities of generating instances when studying relations between variables in subsets of data. First, the user must be saved from getting overwhelmed with a deluge of findings. To restrict the search with respect to the analysis goals, the user can focus each discovery task performed during an interactive and iterative exploration process. Some basic organization principles of search can further limit the search effort. One principle is to organize search hierarchically and to evaluate first the statistical or information-theoretic evidence of the general hypotheses. Then more special hypotheses can be eliminated from further search if a more general hypothesis was already verified. But this approach alone has some drawbacks and, even in moderately sized data, does not prevent large sets of findings. Therefore, in a second evaluation phase, further aspects of interestingness are assessed. A refinement strategy selects the most interesting of the statistically significant statements. A second problem for discovery systems is efficiency. Each hypothesis evaluation requires many data accesses. We describe strategies that reduce data accesses and speed up computation.

20.