20 similar documents found (search time: 15 ms)
1.
In this paper, we propose the I-MIN model for knowledge discovery and knowledge management in evolving databases. The model splits the KDD process into three phases. The schema designed during the first phase abstracts the generic mining requirements of the KDD process and provides a mapping between the generic KDD process and (user-)specific KDD subprocesses. The generic process is executed periodically during the second phase, creating windows of condensed knowledge called knowledge concentrates. During the third phase, which corresponds to actual mining by the end users, specific KDD subprocesses are invoked to mine the knowledge concentrates. The model provides a set of mining operators for developing mining applications that discover and renew, preserve and reuse, and share knowledge for effective knowledge management. These operators can be invoked either through a declarative query language or by writing applications. The architectural proposal emulates a DBMS-like environment for the managers, administrators, and end users in an organization. Knowledge management functions, such as sharing and reuse of the discovered knowledge among users and periodic updating of the discovered knowledge, are supported. The I-MIN model facilitates complete documentation and control of all KDD endeavors in an organization, which helps structure and streamline its KDD operations.
2.
MineSet aids knowledge discovery and supports decision making based on relational data. It uses visualization and data mining to arrive at interesting results. Providing diverse visualization tools lets users choose the most appropriate method for a given problem. The client-server architecture performs most of the computationally intensive tasks on a server, while the processed results return to the client for visualization. The paper discusses MineSet's database visualization and data mining visualization.
3.
As the amount of streaming audio and video available to World Wide Web users grows, tools for analyzing and indexing this content will become increasingly important. Frequently, knowledge management applications and information portals synthesize unstructured text information from the Web, intranets, and partner sites. Given this context, we crawl a statistically significant number of Web pages, detect those that contain streaming media links, crawl the media links to extract associated metadata, and then use the crawl data to build a resource list for Web media. We have used these findings to build a media indexing application that uses content-based indexing methods.
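The detection step described in this abstract can be sketched as scanning page markup for links that point at streaming-media files. This is a minimal illustration using only the standard library; the extension list and detection heuristic are my assumptions, not the paper's actual method.

```python
from html.parser import HTMLParser

# Hypothetical set of streaming-media file extensions (illustrative only;
# the abstract does not specify which formats were detected).
MEDIA_EXTENSIONS = (".ram", ".rm", ".asx", ".asf", ".mp3", ".mov")

class MediaLinkDetector(HTMLParser):
    """Collects href/src attribute values that look like streaming-media links."""
    def __init__(self):
        super().__init__()
        self.media_links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value and value.lower().endswith(MEDIA_EXTENSIONS):
                self.media_links.append(value)

def find_media_links(html):
    """Return all media-looking links found in an HTML string."""
    parser = MediaLinkDetector()
    parser.feed(html)
    return parser.media_links
```

In a crawler, `find_media_links` would run on each fetched page; pages with a non-empty result would then be queued for metadata extraction.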
4.
Knowledge-discovery systems face challenging problems from real-world databases, which tend to be dynamic, incomplete, redundant, noisy, sparse, and very large. These problems are addressed and some techniques for handling them are described. A model of an idealized knowledge-discovery system is presented as a reference for studying and designing new systems. This model is used in the comparison of three systems: CoverStory, EXPLORA, and the Knowledge Discovery Workbench. The deficiencies of existing systems relative to the model reveal several open problems for future research.
5.
This paper presents a data-intensive architecture that demonstrates the ability to support applications from a wide range of application domains, and to support the different types of users involved in defining, designing, and executing data-intensive processing tasks. The prototype architecture is introduced, and the pivotal role of DISPEL as a canonical language is explained. The architecture promotes the exploration and exploitation of distributed and heterogeneous data and spans the complete knowledge discovery process, from data preparation, to analysis, to evaluation and reiteration. The architecture evaluation included large-scale applications from astronomy, cosmology, hydrology, functional genetics, image processing, and seismology.
7.
Inductive logic programming (ILP) is concerned with the induction of logic programs from examples and background knowledge.
In ILP, the shift of attention from program synthesis to knowledge discovery resulted in advanced techniques that are practically
applicable for discovering knowledge in relational databases. This paper gives a brief introduction to ILP, presents selected
ILP techniques for relational knowledge discovery and reviews selected ILP applications.
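The core ILP idea summarized above (inducing a clause from positive and negative examples plus background knowledge) can be illustrated with a toy brute-force search. The facts, examples, and candidate literals below are my own illustrative data, not from the paper; real ILP systems search a far richer hypothesis space.

```python
from itertools import combinations

# Toy background knowledge (illustrative facts, not from the paper).
parent = {("ann", "mary"), ("ann", "tom"), ("tom", "eve")}
female = {"mary", "ann", "eve"}

positives = {("mary", "ann"), ("eve", "tom")}                 # daughter(X, Y) holds
negatives = {("tom", "ann"), ("ann", "mary")}                 # daughter(X, Y) does not hold

# Candidate body literals for the clause daughter(X, Y) :- ...
literals = {
    "female(X)":    lambda x, y: x in female,
    "female(Y)":    lambda x, y: y in female,
    "parent(Y, X)": lambda x, y: (y, x) in parent,
    "parent(X, Y)": lambda x, y: (x, y) in parent,
}

def covers(body, example):
    """A clause body covers an example if every literal holds for it."""
    return all(literals[lit](*example) for lit in body)

def induce():
    """Find the shortest body covering all positives and no negatives."""
    for size in range(1, len(literals) + 1):
        for body in combinations(literals, size):
            if all(covers(body, p) for p in positives) and \
               not any(covers(body, n) for n in negatives):
                return body
    return None
```

Here the search recovers the classic clause `daughter(X, Y) :- female(X), parent(Y, X)`, which is the standard textbook ILP example.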
Nada Lavrač, Ph.D.: She is a senior research associate at the Department of Intelligent Systems, J. Stefan Institute, Ljubljana, Slovenia (since
1978) and a visiting professor at the Klagenfurt University, Austria (since 1987). Her main research interest is in machine
learning, in particular inductive logic programming and intelligent data analysis in medicine. She received a BSc in Technical
Mathematics and MSc in Computer Science from Ljubljana University, and a PhD in Technical Sciences from Maribor University,
Slovenia. She is coauthor of KARDIO: A Study in Deep and Qualitative Knowledge for Expert Systems, The MIT Press 1989, and
Inductive Logic Programming: Techniques and Applications, Ellis Horwood 1994, and coeditor of Intelligent Data Analysis in
Medicine and Pharmacology, Kluwer 1997. She was the coordinator of the European Scientific Network in Inductive Logic Programming
ILPNET (1993–1996) and program co-chair of the 8th European Machine Learning Conference ECML’95 and the 7th International Workshop
on Inductive Logic Programming ILP’97.
Sašo Džeroski, Ph.D.: He is a research associate at the Department of Intelligent Systems, J. Stefan Institute, Ljubljana, Slovenia (since 1989).
He has held visiting researcher positions at the Turing Institute, Glasgow (UK), Katholieke Universiteit Leuven (Belgium),
German National Research Center for Computer Science (GMD), Sankt Augustin (Germany) and the Foundation for Research and Technology-Hellas
(FORTH), Heraklion (Greece). His research interest is in machine learning and knowledge discovery in databases, in particular
inductive logic programming and its applications and knowledge discovery in environmental databases. He is co-author of Inductive
Logic Programming: Techniques and Applications, Ellis Horwood 1994. He is the scientific coordinator of ILPnet2, The Network
of Excellence in Inductive Logic Programming. He was program co-chair of the 7th International Workshop on Inductive Logic
Programming ILP’97 and will be program co-chair of the 16th International Conference on Machine Learning ICML’99.
Masayuki Numao, Ph.D.: He is an associate professor at the Department of Computer Science, Tokyo Institute of Technology. He received a bachelor
of engineering in electrical and electronics engineering in 1982 and his Ph.D. in computer science in 1987 from Tokyo Institute
of Technology. He was a visiting scholar at CSLI, Stanford University from 1989 to 1990. His research interests include Artificial
Intelligence, Global Intelligence and Machine Learning. Numao is a member of Information Processing Society of Japan, Japanese
Society for Artificial Intelligence, Japanese Cognitive Science Society, Japan Society for Software Science and Technology
and AAAI.
9.
Conference mining and expert finding are useful academic knowledge discovery problems from an academic recommendation point of view. Group-level (GL) topic modeling provides richer text semantics and relationships, which results in denser topics; denser topics are more useful for academic discovery tasks than Element-level (EL) or Document-level (DL) topic modeling, which produce sparser topics. Previous methods performed academic knowledge discovery by using network connectivity (only links, not the text of documents), keyword-based matching (no semantics), or the semantics-based intrinsic structure of words across documents (semantics at the DL), while ignoring the semantics-based intrinsic structure of words and the relationships between conferences (semantics at the GL). In this paper, we consider the semantics-based intrinsic structure of words and the relationships present in conferences (richer text semantics and relationships) by modeling at the GL. We propose group topic modeling methods based on Latent Dirichlet Allocation (LDA). A detailed empirical evaluation shows that our proposed GL methods significantly outperform DL methods for conference mining and expert finding.
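The group-level idea can be approximated by pooling each conference's papers into one pseudo-document before fitting LDA, so the model captures conference-level rather than document-level word co-occurrence. This is a minimal sketch of that pooling step with scikit-learn; the toy corpus is invented and the authors' actual GL models are more elaborate than simple pooling.

```python
from collections import defaultdict
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Toy corpus of (conference, abstract) pairs -- illustrative data only.
papers = [
    ("KDD", "frequent pattern mining association rules"),
    ("KDD", "clustering large databases data mining"),
    ("ACL", "parsing syntax grammar language"),
    ("ACL", "machine translation language model"),
]

# Group level: pool each conference's papers into one pseudo-document.
groups = defaultdict(list)
for conf, text in papers:
    groups[conf].append(text)
pseudo_docs = [" ".join(texts) for texts in groups.values()]

# Fit LDA on the pooled documents; each row of theta is a
# conference-topic distribution rather than a paper-topic one.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(pseudo_docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
theta = lda.fit_transform(X)
```

The resulting conference-topic distributions could then be compared (e.g. by cosine similarity) for conference mining, or combined with author information for expert finding.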
10.
The Web has profoundly reshaped our vision of information management and processing, highlighting the power of a collaborative model of information production and consumption. This new vision influences the Knowledge Discovery in Databases domain as well. In this paper we propose a service-oriented, semantically supported approach to developing a platform for sharing and reusing resources (data processing and mining techniques). The platform enables the management of different implementations of the same technique and takes a community-centered approach, with functionalities for both resource production and consumption, serving end users with different skills as well as resource providers with different technical and domain-specific capabilities. We first describe the semantic framework underlying the approach, and then demonstrate how this framework is exploited by presenting the platform's functionalities.
11.
This paper describes a graphical user interface for database-oriented knowledge discovery systems, DBLEARN, which has been developed for extracting knowledge rules from relational databases. The interface, designed using a query-by-example approach, provides a graphical means of specifying knowledge-discovery tasks. It supplies a graphical browsing facility to help users perceive the structure of the target database. To guide users' task specification, a cooperative, menu-based guidance facility has been integrated into the interface. The interface also supplies a graphical, interactive adjustment facility that helps users refine the task specification and improve the quality of the learned knowledge rules.
12.
The Internet plays an important role in society as a whole. Among the services that arose from the Internet are collaborative systems, in which many users create the systems' content from personal experience. One of the many collaborative systems in use today is the sharing of gastronomic recipes. The field of Web information retrieval has shown increasing interest in recovering the information contained in this environment and studying it to identify relationships, such as the ingredients used in preparing a dish, which can be found through textual data mining techniques. In this scope, the present work proposes a methodology for knowledge discovery in gastronomic recipes collected from various data sources. For this, information such as the ingredients, quantities, units of measure, preparation directions, and other characteristics associated with the recipes is used. With the results of the experiments presented in this article, this work represents a first step toward a service that, besides aggregating recipes from various sources, explores the collective knowledge that can be discovered by analyzing the hundreds of thousands of recipes available on the Internet.
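The extraction of ingredients, quantities, and units from recipe text can be sketched as a simple pattern-matching pass over ingredient lines. The unit list and regular expression below are illustrative assumptions, not the methodology actually used in the article.

```python
import re

# Hypothetical unit vocabulary; a real system would need a much larger,
# locale-aware list.
UNITS = r"(?:cups?|tbsp|tsp|g|kg|ml|l)"
LINE = re.compile(
    rf"^\s*(?P<qty>\d+(?:[./]\d+)?)\s+(?P<unit>{UNITS})\s+(?P<name>.+?)\s*$",
    re.IGNORECASE,
)

def parse_ingredient(line):
    """Parse one ingredient line into quantity, unit, and name, or None."""
    m = LINE.match(line)
    if not m:
        return None
    return {
        "quantity": m.group("qty"),
        "unit": m.group("unit").lower(),
        "name": m.group("name").lower(),
    }
```

Applied across hundreds of thousands of recipes, such structured records would allow the relationships mentioned in the abstract (e.g. which ingredients co-occur in a dish) to be mined.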
13.
This article introduces the idea of using nonmonotonic inheritance networks for the storage and maintenance of knowledge discovered in data (revisable knowledge discovery in databases). While existing data mining strategies for knowledge discovery in databases typically involve initial structuring through the use of identification trees and the subsequent extraction of rules from these trees for use in rule-based expert systems, such strategies have difficulty in coping with additional information which may conflict with that already used for the automatic generation of rules. In the worst case, the entire automatic sequence may have to be repeated. If nonmonotonic inheritance networks are used instead of rules for storing knowledge discovered in databases, additional conflicting information can be inserted directly into such structures, thereby bypassing the need for recompilation. © 1996 John Wiley & Sons, Inc.
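The key property claimed above, that conflicting new knowledge can be inserted without recompiling rules, can be illustrated with a tiny inheritance network in which more specific nodes defeat inherited defaults. This is my own minimal sketch (the classic "penguins don't fly" example), not the article's formalism.

```python
# Each node names its parent ("isa") and its locally asserted properties.
# A property asserted lower in the isa-chain overrides an inherited default.
network = {
    "bird":    {"isa": None,      "props": {"flies": True}},
    "penguin": {"isa": "bird",    "props": {"flies": False}},  # exception node
    "tweety":  {"isa": "penguin", "props": {}},
}

def lookup(node, prop):
    """Walk up the isa-chain; the most specific assertion wins."""
    while node is not None:
        entry = network[node]
        if prop in entry["props"]:
            return entry["props"][prop]
        node = entry["isa"]
    return None
```

Adding the `penguin` exception required only inserting one node; nothing derived from `bird` had to be regenerated, which is the advantage the article argues for over tree-extracted rule sets.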
14.
A concept for knowledge discovery and evolution in databases is described. The key issues include: using a database query to discover new rules; using not only positive examples (answers to a query) but also negative examples to discover new rules; and harmonizing existing rules with the new rules. A tool for characterizing the exceptions in databases and evolving knowledge as a database evolves is developed.
16.
Data mining (DM) models are knowledge-intensive information products that enable knowledge creation and discovery. As large volumes of data are generated with high velocity from a variety of sources, there is a pressing need to place DM model selection and self-service knowledge discovery in the hands of business users. However, existing knowledge discovery and data mining (KDDM) approaches do not sufficiently address key elements of data mining model management (DMMM) such as model sharing, selection, and reuse. Furthermore, they are written mainly from a knowledge engineer's perspective, so the business requirements of business users are often lost. To bridge these semantic gaps, we propose an ontology-based DMMM approach for self-service model selection and knowledge discovery. We develop a DM³ ontology to translate business requirements into model selection criteria and measurements, provide a detailed deployment architecture for its integration within an organization's KDDM application, and use the example of a student loan company to demonstrate the utility of the DM³.
17.
The age of Internet technology has introduced new types of attacks on assets that did not exist before. Databases that represent information assets are subject to attacks with malicious intentions, such as stealing sensitive data, deleting records, or violating the integrity of the database. Many countermeasures have been designed and implemented to protect databases and the information they host from attacks. While preventive measures can be overcome, and detection measures may detect an attack only after damage has occurred, there is a need for a recovery algorithm that restores the database to its correct state before the attack. Numerous damage assessment and recovery algorithms have been proposed by researchers. In this work, we present an efficient, lightweight detection and recovery algorithm based on the matrix approach that can be used to recover from malicious attacks. We compare our algorithm with other approaches and show the performance results.
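The general damage-assessment idea behind matrix-based recovery can be sketched as follows: record each committed transaction's read and write sets, then, starting from the malicious transaction, mark as affected every later transaction that read a dirty item (its writes become dirty in turn). The log format and propagation rule here are my simplification for illustration, not the authors' specific algorithm.

```python
# Transaction log in commit order: (transaction id, read set, write set).
# Illustrative data only.
log = [
    ("T1", {"read": {"x"}, "write": {"y"}}),
    ("T2", {"read": {"y"}, "write": {"z"}}),
    ("T3", {"read": {"a"}, "write": {"b"}}),
]

def assess_damage(log, malicious, dirty_items):
    """Propagate damage forward: reading a dirty item taints a transaction,
    and a tainted transaction's writes become dirty."""
    dirty = set(dirty_items)
    affected = {malicious}
    started = False
    for tid, ops in log:
        if tid == malicious:
            started = True
            dirty |= ops["write"]
            continue
        if started and ops["read"] & dirty:
            affected.add(tid)
            dirty |= ops["write"]
    return affected, dirty
```

Recovery would then undo the affected transactions and restore the dirty items from the last clean state; here T2 is tainted through item `y`, while T3, which touched no dirty items, survives untouched.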
18.
Objective: Traditional Chinese Medicine (TCM) provides an alternative method for achieving and maintaining good health. Due to the increasing prevalence of TCM and the large volume of TCM data accumulated through thousands of years, there is an urgent need to efficiently and effectively explore this information and its hidden rules with knowledge discovery in databases (KDD) techniques. This paper describes the design and development of a knowledge discovery system for TCM, as well as the newly proposed KDD techniques integrated in this system.
20.
The discovery of multi-level knowledge is important to allow queries at and across different levels of abstraction. While there are some similarities between our research and that of others in this area, the work reported in this paper does not directly involve databases and is differently motivated. Our research is interested in taking data in the form of rule bases and finding multi-level knowledge. This paper describes our motivation, our preferred technique for acquiring the initial knowledge, known as Ripple-Down Rules, the use of Formal Concept Analysis to develop an abstraction hierarchy, and our application of these ideas to knowledge bases from the domain of chemical pathology. We also provide an example of how the approach can be applied to other propositional knowledge bases and suggest that it can be used as an additional phase in many existing data mining approaches.
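The Formal Concept Analysis step mentioned above builds an abstraction hierarchy from a binary object-attribute table by enumerating formal concepts, i.e. maximal (extent, intent) pairs. A minimal brute-force sketch follows; the toy pathology-flavored context is invented for illustration, and real FCA algorithms (e.g. NextClosure) avoid enumerating all subsets.

```python
from itertools import chain, combinations

# Toy formal context: objects (tests) x attributes -- illustrative data only.
context = {
    "test_A": {"high_sodium", "abnormal"},
    "test_B": {"high_sodium"},
    "test_C": {"abnormal"},
}

def common_attrs(objs):
    """Attributes shared by all given objects (all attributes if none given)."""
    sets = [context[o] for o in objs]
    if not sets:
        return {a for s in context.values() for a in s}
    return set.intersection(*sets)

def extent(attrs):
    """Objects possessing every attribute in attrs."""
    return {o for o, s in context.items() if attrs <= s}

def concepts():
    """Enumerate formal concepts (extent, intent) by closing object subsets."""
    found = set()
    objs = list(context)
    for subset in chain.from_iterable(combinations(objs, r) for r in range(len(objs) + 1)):
        intent = common_attrs(subset)
        found.add((frozenset(extent(intent)), frozenset(intent)))
    return found
```

Ordering the resulting concepts by extent inclusion yields the concept lattice, which serves as the abstraction hierarchy over the rule-base attributes.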