1.
One of the major challenges in data mining is the extraction of comprehensible knowledge from recorded data. In this paper, a coevolutionary-based classification technique, namely COevolutionary Rule Extractor (CORE), is proposed to discover classification rules in data mining. Unlike existing approaches where candidate rules and rule sets are evolved at different stages in the classification process, the proposed CORE coevolves rules and rule sets concurrently in two cooperative populations to confine the search space and to produce good rule sets that are comprehensive. The proposed coevolutionary classification technique is extensively validated on seven datasets obtained from the University of California, Irvine (UCI) machine learning repository, which are representative artificial and real-world data from various domains. Comparison results show that the proposed CORE produces comprehensive and good classification rules for most datasets, which are competitive with existing classifiers in the literature. Simulation results obtained from box plots also show that CORE is relatively robust and invariant to random partitioning of the datasets.
2.
Goran Hrovat Gregor Stiglic Peter Kokol Milan Ojsteršek 《Computer methods and programs in biomedicine》2014
With the increased acceptance of electronic health records, there is growing interest in applying data mining approaches within this field. This study introduces a novel approach for exploring and comparing temporal trends within different in-patient subgroups, which is based on association rule mining using the Apriori algorithm and linear model-based recursive partitioning. The Nationwide Inpatient Sample (NIS), Healthcare Cost and Utilization Project (HCUP), Agency for Healthcare Research and Quality was used to evaluate the proposed approach. This study presents a novel approach where visual analytics on big data is used for trend discovery in the form of a regression tree with scatter plots in the leaves of the tree. The trend lines are used for directly comparing linear trends within a specified time frame. Our results demonstrate the existence of opposite trends in relation to age- and sex-based subgroups that would be impossible to discover using traditional trend-tracking techniques. Such an approach can be employed in decision support applications for policy makers when organizing campaigns, or by hospital management for observing trends that cannot be directly discovered using traditional analytical techniques.
3.
Information technology plays an important role in medicine because of the advanced decision support systems (DSS) it can provide. We provide an overview of the building blocks necessary for a medical decision support system and introduce seven research articles in this special issue that describe the development and evaluation of individual medical DSS building blocks or complete medical DSS.
4.
5.
Irma Becerra-Fernandez Stelios H. Zanakis Steven Walczak 《Computers & Industrial Engineering》2002,43(4):787-800
This paper presents the insights gained from applying knowledge discovery in databases (KDD) processes for the purpose of developing intelligent models, used to classify a country's investing risk based on a variety of factors. Inferential data mining techniques, like C5.0, as well as intelligent learning techniques, like neural networks, were applied to a dataset of 52 countries. The dataset included 27 variables (economic, stock market performance/risk and regulatory efficiencies) on 52 countries, whose investing risk category was assessed in a Wall Street Journal survey of international experts. The results of applying KDD techniques to the dataset are promising, and successfully classified most countries as compared to the experts' classifications. Implementation details, results, and future plans are also presented.
6.
7.
Francesco Bonchi Fosca Giannotti Claudio Lucchese Salvatore Orlando Raffaele Perego Roberto Trasarti 《Information Systems》2009
In this article we present ConQueSt, a constraint-based querying system able to support the intrinsically exploratory (i.e., human-guided, interactive and iterative) nature of pattern discovery. Following the inductive database vision, our framework provides users with an expressive constraint-based query language, which allows the discovery process to be effectively driven toward potentially interesting patterns. Such constraints are also exploited to reduce the cost of pattern mining computation. ConQueSt is a comprehensive mining system that can access real-world relational databases from which to extract data. Through a friendly graphical user interface (GUI), the user can define complex mining queries with a few clicks. After a pre-processing step, mining queries are answered by an efficient and robust pattern mining engine which incorporates state-of-the-art data and search space reduction techniques. Resulting patterns are then presented to the user in a pattern browsing window, and possibly stored back in the underlying database as relations.
8.
Knowledge discovery from spatial transactions Total citations: 2, self-citations: 0, cited by others: 2
We propose a general mechanism to represent the spatial transactions in a way that allows the use of the existing data mining
methods. Our proposal allows the analyst to exploit the layered structure of geographical information systems in order to
define the layers of interest and the relevant spatial relations among them. Given a reference object, it is possible to describe
its neighborhood by considering the attribute of the object itself and the objects related by the chosen relations. The resulting
spatial transactions may be either considered like “traditional” transactions, by considering only the qualitative spatial
relations, or their spatial extension can be exploited during the data mining process. We explore both these cases. First
we tackle the problem of classifying a spatial dataset, by taking into account the spatial component of the data to compute
the statistical measure (i.e., the entropy) necessary to learn the model. Then, we consider the task of extracting spatial
association rules, by focusing on the qualitative representation of the spatial relations. The feasibility of the process
has been tested by implementing the proposed method on top of a GIS tool and by analyzing real world data.
9.
The proliferation of large masses of data has created many new opportunities for those working in science, engineering and business. The field of data mining (DM) and knowledge discovery from databases (KDD) has emerged as a new discipline in engineering and computer science. In the modern sense of DM and KDD the focus tends to be on extracting information characterized as knowledge from data that can be very complex and in large quantities. Industrial engineering, with the diverse areas it comprises, presents unique opportunities for the application of DM and KDD, and for the development of new concepts and techniques in this field. Many industrial processes are now automated and computerized in order to ensure the quality of production and to minimize production costs. A computerized process records large masses of data during its functioning. This real-time data which is recorded to ensure the ability to trace production steps can also be used to optimize the process itself. A French truck manufacturer decided to exploit the data sets of measures recorded during the test of diesel engines manufactured on their production lines. The goal was to discover knowledge in the data of the test engine process in order to significantly reduce (by about 25%) the processing time. This paper presents the study of knowledge discovery utilizing the KDD method. All the steps of the method have been used and two additional steps have been needed. The study allowed us to develop two systems: the discovery application is implemented giving a real-time prediction model (with a real reduction of 28%) and the discovery support environment now allows those who are not experts in statistics to extract their own knowledge for other processes.
10.
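Data mining uses methods from artificial intelligence, machine learning, and mathematical statistics to extract valuable information from data, bringing users substantial economic and social benefits. The application of data mining in medicine is still in its early stages; as the approach becomes more widespread, its medical applications will receive growing attention.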
11.
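Data mining uses methods from artificial intelligence, machine learning, and mathematical statistics to extract valuable information from data, bringing users substantial economic and social benefits. The application of data mining in medicine is still in its early stages; as the approach becomes more widespread, its medical applications will receive growing attention.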
12.
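Visualization of high-dimensional data and a fast clustering algorithm Total citations: 2, self-citations: 0, cited by others: 2
This paper introduces a visualization method for high-dimensional data and, building on it, a distance algorithm that can be used for fast clustering. The method is robust and has low computational complexity, O(n^1). Finally, we apply the method to clustering a financial data cube, mainly to mine the behavior patterns of dominant traders (庄家) as evidence for whether market manipulation is taking place.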
13.
Supervised classification is one of the tasks most frequently carried out by so-called Intelligent Systems. Thus, a large
number of techniques have been developed based on Artificial Intelligence (Logic-based techniques, Perceptron-based techniques)
and Statistics (Bayesian Networks, Instance-based techniques). The goal of supervised learning is to build a concise model
of the distribution of class labels in terms of predictor features. The resulting classifier is then used to assign class
labels to the testing instances where the values of the predictor features are known, but the value of the class label is
unknown. This paper describes various classification algorithms and the recent attempt for improving classification accuracy—ensembles
of classifiers.
14.
15.
16.
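Association rules are one of the most important patterns in data mining and have great practical value, but traditional association rule mining algorithms based on the support-confidence framework have many shortcomings in real applications. This work introduces correlation analysis and designs a genetic-algorithm-based method for mining positively correlated association rules. Finally, the algorithm is applied to the practical problem of analyzing and mining the clinical experience of renowned senior practitioners of traditional Chinese medicine; experiments show that it effectively compensates for the shortcomings of traditional association rule mining algorithms.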
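A short sketch may help show why the support-confidence framework alone can mislead. The abstract does not say which correlation measure the authors use, so the code below assumes lift, one common choice; the function name `rule_metrics` and the tea/coffee data are hypothetical, and the genetic search itself is not shown.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    Lift is used here as an assumed correlation measure: lift > 1 suggests
    positive correlation, lift < 1 suggests negative correlation.
    """
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if antecedent <= t)
    n_c = sum(1 for t in transactions if consequent <= t)
    n_ac = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_a and n_c else 0.0
    return support, confidence, lift


# Hypothetical baskets: 4 with tea and coffee, 1 with tea only, 5 with coffee only.
baskets = [{"tea", "coffee"}] * 4 + [{"tea"}] + [{"coffee"}] * 5
support, confidence, lift = rule_metrics(baskets, {"tea"}, {"coffee"})
print(support, confidence, lift)
```

In this toy data the rule tea -> coffee has confidence 0.8, which looks strong, yet its lift is below 1 because coffee appears in 9 of 10 baskets anyway. A miner restricted to positively correlated rules would discard it.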
17.
Relational concept discovery in structured datasets Total citations: 2, self-citations: 0, cited by others: 2
M. Huchard M. Rouane Hacene C. Roume P. Valtchev 《Annals of Mathematics and Artificial Intelligence》2007,49(1-4):39-76
18.
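Association rule discovery algorithms are applied to mining a database of traditional Chinese medicine prescriptions, with the aim of discovering association rules among the individual herbs in prescriptions, as well as herb pairs and herb groups, which can provide an important basis for developing new Chinese medicines. Because the commonly used association rule discovery algorithm, Apriori, suffers from having to scan the database many times, a matrix-based association rule discovery algorithm, Apriori_Matrix, is proposed. It reduces the time that Apriori spends on repeated comparisons during the itemset join step and can greatly improve the efficiency of association rule mining. Given that a traditional Chinese medicine database has a limited number of individual herbs, varied compatibility rules, and multiple prescriptions for the same condition, the improved algorithm helps shorten the development cycle for new medicines.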
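The general idea of replacing repeated database scans with an in-memory matrix can be sketched as follows. This is not the Apriori_Matrix algorithm from the paper, whose details the abstract does not give; it is a minimal illustration that assumes a bitmap representation (one integer bit column per item) and a simplified candidate-generation step. All function names and the herb data are hypothetical.

```python
from itertools import combinations


def build_columns(transactions):
    """Scan the transactions once. Bit t of columns[item] is set
    when transaction number t contains that item."""
    columns = {}
    for t, transaction in enumerate(transactions):
        for item in transaction:
            columns[item] = columns.get(item, 0) | (1 << t)
    return columns


def support(columns, itemset):
    """Support count = number of set bits in the AND of the item columns."""
    items = iter(itemset)
    bits = columns[next(items)]
    for item in items:
        bits &= columns[item]
    return bin(bits).count("1")


def frequent_itemsets(transactions, min_support):
    """Level-wise search over the bit columns, with no further database scans.

    Candidates at size k are all k-combinations of items that still occur in
    some frequent (k-1)-itemset (simpler than the classic Apriori join).
    """
    columns = build_columns(transactions)
    level = [(i,) for i in sorted(columns) if support(columns, (i,)) >= min_support]
    frequent = {}
    k = 1
    while level:
        for itemset in level:
            frequent[itemset] = support(columns, itemset)
        k += 1
        alive = sorted({item for itemset in level for item in itemset})
        level = [c for c in combinations(alive, k)
                 if support(columns, c) >= min_support]
    return frequent


# Hypothetical prescriptions, each a set of herbs.
prescriptions = [
    {"ginseng", "licorice", "ginger"},
    {"ginseng", "licorice"},
    {"licorice", "ginger"},
    {"ginseng", "licorice", "ginger"},
]
print(frequent_itemsets(prescriptions, min_support=3))
```

The database is read only in `build_columns`; every later support count is a bitwise AND followed by a bit count, which is the kind of saving a matrix-based variant aims for.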
19.
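Qin Yan 《数字社区&智能家居》2011,(10)
Data mining is a technology that has emerged in recent years, through which people can discover valuable information hidden behind data. It has become the most effective way of dealing with the data explosion facing today's enterprise information systems, and it gives decision makers a scientific, theoretical basis for all kinds of business decisions. This paper explains and analyzes the meaning and basic algorithms of data mining, and discusses its specific applications in e-commerce.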
20.
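Design and implementation of a validation tool for clustering algorithms Total citations: 1, self-citations: 0, cited by others: 1
As data mining technology develops, the number of clustering algorithms keeps growing. Data mining places certain typical requirements on clustering algorithms, and verifying whether an algorithm meets them has become a problem that needs solving. Because real sample sets are hard to obtain, and many cannot be used to test clustering algorithms, a tool was designed and implemented that evaluates loaded clustering algorithms on constructed sample sets and displays the clustering results.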