Similar literature
1.
One of the major challenges in data mining is the extraction of comprehensible knowledge from recorded data. In this paper, a coevolutionary classification technique, the COevolutionary Rule Extractor (CORE), is proposed to discover classification rules in data mining. Unlike existing approaches, where candidate rules and rule sets are evolved at different stages of the classification process, CORE coevolves rules and rule sets concurrently in two cooperative populations to confine the search space and to produce good rule sets that are comprehensive. The technique is extensively validated on seven datasets from the University of California, Irvine (UCI) machine learning repository, which are representative artificial and real-world data from various domains. Comparison results show that CORE produces comprehensive and good classification rules for most datasets and is competitive with existing classifiers in the literature. Simulation results obtained from box plots also show that CORE is relatively robust and invariant to random partitioning of the datasets.

2.
With the increased acceptance of electronic health records, interest in applying data mining approaches within this field is growing. This study introduces a novel approach for exploring and comparing temporal trends within different in-patient subgroups, based on association rule mining using the Apriori algorithm and linear model-based recursive partitioning. The Nationwide Inpatient Sample (NIS), Healthcare Cost and Utilization Project (HCUP), Agency for Healthcare Research and Quality was used to evaluate the proposed approach. Visual analytics on big data is used for trend discovery in the form of a regression tree with scatter plots in the leaves of the tree. The trend lines are used to compare linear trends directly within a specified time frame. Our results demonstrate the existence of opposite trends in age- and sex-based subgroups that would be impossible to discover using traditional trend-tracking techniques. Such an approach can be employed in decision support applications for policy makers when organizing campaigns, or by hospital management for observing trends that cannot be directly discovered using traditional analytical techniques.

3.
Information technology plays an important role in medicine because of the advanced decision support systems (DSS) it can provide. We provide an overview of the building blocks necessary for a medical decision support system and introduce seven research articles in this special issue that describe the development and evaluation of individual medical DSS building blocks or complete medical DSS.

4.
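Research progress in video mining   Total citations: 2, self-citations: 2, citations by others: 2
Video mining has attracted growing attention from researchers in China and abroad in recent years, but the research is still at an early stage: practical application systems are few, and the relevant concepts, system architectures, and technical methods need further study. Based on a review of research developments in China and abroad, this paper summarizes and comments on the current state of video mining research, discusses the concept of video mining, clarifies how video mining relates to and differs from related technologies, and looks ahead to the key problems in video mining research and approaches to solving them.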

5.
This paper presents the insights gained from applying knowledge discovery in databases (KDD) processes to develop intelligent models that classify a country's investing risk based on a variety of factors. Inferential data mining techniques, like C5.0, as well as intelligent learning techniques, like neural networks, were applied to a dataset of 52 countries. The dataset included 27 variables (economic, stock market performance/risk and regulatory efficiencies) for each country, whose investing risk category was assessed in a Wall Street Journal survey of international experts. The results of applying KDD techniques to the dataset are promising: the models classified most countries in agreement with the experts' classifications. Implementation details, results, and future plans are also presented.

6.
7.
In this article we present ConQueSt, a constraint-based querying system able to support the intrinsically exploratory (i.e., human-guided, interactive and iterative) nature of pattern discovery. Following the inductive database vision, our framework provides users with an expressive constraint-based query language, which allows the discovery process to be effectively driven toward potentially interesting patterns. Such constraints are also exploited to reduce the cost of pattern mining computation. ConQueSt is a comprehensive mining system that can access real-world relational databases from which to extract data. Through a friendly graphical user interface (GUI), the user can define complex mining queries with a few clicks. After a pre-processing step, mining queries are answered by an efficient and robust pattern mining engine that incorporates state-of-the-art data and search space reduction techniques. Resulting patterns are then presented to the user in a pattern browsing window, and possibly stored back in the underlying database as relations.

8.
Knowledge discovery from spatial transactions   Total citations: 2, self-citations: 0, citations by others: 2
We propose a general mechanism to represent spatial transactions in a way that allows the use of existing data mining methods. Our proposal allows the analyst to exploit the layered structure of geographical information systems in order to define the layers of interest and the relevant spatial relations among them. Given a reference object, it is possible to describe its neighborhood by considering the attributes of the object itself and of the objects related to it by the chosen relations. The resulting spatial transactions may either be treated like “traditional” transactions, by considering only the qualitative spatial relations, or their spatial extension can be exploited during the data mining process. We explore both cases. First, we tackle the problem of classifying a spatial dataset by taking into account the spatial component of the data to compute the statistical measure (i.e., the entropy) necessary to learn the model. Then, we consider the task of extracting spatial association rules by focusing on the qualitative representation of the spatial relations. The feasibility of the process has been tested by implementing the proposed method on top of a GIS tool and by analyzing real-world data.

9.
The proliferation of large masses of data has created many new opportunities for those working in science, engineering and business. The field of data mining (DM) and knowledge discovery from databases (KDD) has emerged as a new discipline in engineering and computer science. In the modern sense of DM and KDD, the focus tends to be on extracting information characterized as knowledge from data that can be very complex and in large quantities. Industrial engineering, with the diverse areas it comprises, presents unique opportunities for the application of DM and KDD, and for the development of new concepts and techniques in this field. Many industrial processes are now automated and computerized in order to ensure the quality of production and to minimize production costs. A computerized process records large masses of data during its functioning. This real-time data, which is recorded to ensure that production steps can be traced, can also be used to optimize the process itself. A French truck manufacturer decided to exploit the data sets of measures recorded during the testing of diesel engines manufactured on its production lines. The goal was to discover knowledge in the data of the engine test process in order to significantly reduce (by about 25%) the processing time. This paper presents the knowledge discovery study, which used the KDD method. All the steps of the method were used, and two additional steps were needed. The study allowed us to develop two systems: the discovery application, which is implemented and gives a real-time prediction model (with a real reduction of 28%), and the discovery support environment, which now allows those who are not experts in statistics to extract their own knowledge for other processes.

10.
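Data mining uses artificial intelligence, machine learning, mathematical statistics, and other methods to extract valuable information from data, bringing users substantial economic and social benefits. The application of data mining in medicine is still in its infancy; as the approach becomes more widespread, its medical applications will receive more attention and emphasis.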

11.
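Data mining uses artificial intelligence, machine learning, mathematical statistics, and other methods to extract valuable information from data, bringing users substantial economic and social benefits. The application of data mining in medicine is still in its infancy; as the approach becomes more widespread, its medical applications will receive more attention and emphasis.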

12.
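Visualization and a fast clustering algorithm for high-dimensional data   Total citations: 2, self-citations: 0, citations by others: 2
Yang Li. 《计算机科学》 (Computer Science), 2006, 33(11): 132-133
This paper introduces a visualization method for high-dimensional data and, through it, a distance algorithm that can be used for fast clustering. The method is robust and has a low computational complexity of O(n^1). Finally, we apply the method to clustering financial data cubes, mainly to mine the behavior patterns of dominant traders and to serve as evidence of whether market manipulation exists.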

13.
Machine learning: a review of classification and combining techniques   Total citations: 1, self-citations: 0, citations by others: 1
Supervised classification is one of the tasks most frequently carried out by so-called Intelligent Systems. Thus, a large number of techniques have been developed based on Artificial Intelligence (Logic-based techniques, Perceptron-based techniques) and Statistics (Bayesian Networks, Instance-based techniques). The goal of supervised learning is to build a concise model of the distribution of class labels in terms of predictor features. The resulting classifier is then used to assign class labels to the testing instances where the values of the predictor features are known, but the value of the class label is unknown. This paper describes various classification algorithms and a recent approach to improving classification accuracy: ensembles of classifiers.
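As a rough illustration of the ensemble idea mentioned in this abstract, the sketch below combines several classifiers by unweighted majority vote. It is only one simple combining scheme and is not taken from the reviewed paper; the function name `majority_vote` and the three threshold rules are hypothetical.

```python
from collections import Counter


def majority_vote(classifiers, x):
    """Return the label predicted by most classifiers for instance x."""
    votes = [clf(x) for clf in classifiers]
    # most_common(1) yields the (label, count) pair with the highest count.
    return Counter(votes).most_common(1)[0][0]


# Three hypothetical weak classifiers over a two-feature instance.
# An odd number of binary voters avoids ties.
classifiers = [
    lambda x: "pos" if x[0] > 0.5 else "neg",
    lambda x: "pos" if x[1] > 0.5 else "neg",
    lambda x: "pos" if x[0] + x[1] > 1.0 else "neg",
]

label_a = majority_vote(classifiers, (0.9, 0.2))
label_b = majority_vote(classifiers, (0.1, 0.8))
```

Here the second classifier is outvoted on the first instance and the other two on neither, which is the intuition behind combining: individual errors are masked when most members agree on the correct label.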

14.
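This paper proposes a substructure discovery algorithm based on evolutionary algorithms and incorporates the idea of hill climbing into the design of the crossover and mutation operators. The algorithm can effectively escape local extrema and achieves good experimental results. Applied to research on China's regional economy, the mining results reflect the current trends in China's economic development and the problems that exist.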

15.
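Knowledge discovery based on gene expression programming: history, achievements, and future directions   Total citations: 27, self-citations: 1, citations by others: 27
This paper surveys the history, distinctive features, and achievements of knowledge discovery techniques based on Gene Expression Programming (GEP). It analyzes the key techniques by which GEP solves complex problems through simple encoding. It highlights work in this area, such as GEP-based polynomial factorization, frequent function mining, function mining from noisy data, and sunspot prediction. It also offers the authors' views on strategies for further developing GEP-based knowledge discovery techniques.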

16.
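Positively correlated association rules and their application in traditional Chinese medicine   Total citations: 1, self-citations: 0, citations by others: 1
Association rules are one of the important patterns in data mining and have great practical value, but traditional association rule mining algorithms based on the support-confidence framework have many shortcomings in practice. This paper introduces correlation analysis and designs a genetic-algorithm-based algorithm for mining positively correlated association rules. Finally, the algorithm is applied to the real problem of analyzing and mining the clinical experience of renowned veteran practitioners of traditional Chinese medicine; experiments show that it effectively remedies the shortcomings of traditional association rule mining algorithms.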
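To make the notion of a positively correlated rule concrete, the sketch below adds a lift check to the usual support and confidence thresholds: a rule A -> B is kept only when lift = support(A, B) / (support(A) * support(B)) exceeds 1. This is an assumption about what "positive correlation" means here, and it enumerates item pairs exhaustively instead of using the genetic algorithm the abstract describes. The function name, thresholds, and toy herb data are all hypothetical.

```python
from itertools import permutations


def positive_rules(transactions, min_support=0.3, min_confidence=0.6):
    """Return single-item rules A -> B that pass support, confidence and lift > 1."""
    n = len(transactions)
    items = sorted(set().union(*transactions))

    def support(itemset):
        # Fraction of transactions that contain every item in itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    rules = []
    for a, b in permutations(items, 2):
        s_ab = support({a, b})
        s_a, s_b = support({a}), support({b})
        if s_ab < min_support or s_a == 0 or s_b == 0:
            continue
        confidence = s_ab / s_a
        lift = s_ab / (s_a * s_b)  # > 1 means A and B co-occur more than by chance
        if confidence >= min_confidence and lift > 1.0:
            rules.append((a, b, round(s_ab, 3), round(confidence, 3), round(lift, 3)))
    return rules


# Hypothetical prescriptions, each a set of herb identifiers.
toy = [
    {"herb_a", "herb_b"},
    {"herb_a", "herb_b", "herb_c"},
    {"herb_c"},
    {"herb_a", "herb_b"},
    {"herb_c", "herb_d"},
]
rules = positive_rules(toy)
```

The lift filter matters because a rule can have high confidence simply because its consequent is common; requiring lift above 1 discards such rules.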

17.
18.
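Research and application of an efficient algorithm for discovering association rules in Chinese medicines   Total citations: 1, self-citations: 0, citations by others: 1
Association rule discovery is introduced into data mining of a Chinese medicine prescription database in order to discover association rules among the individual herbs in prescriptions, as well as herb pairs and herb groups, which can provide an important basis for developing new Chinese medicines. Because the commonly used association rule discovery algorithm, Apriori, has the drawback of scanning the database multiple times, a matrix-based association rule discovery algorithm, Apriori_Matrix, is proposed. It reduces the time spent on repeated comparisons during the itemset join step of Apriori and can greatly improve the efficiency of association rule mining. Given that in Chinese medicine databases the number of kinds of individual herbs is limited, compatibility rules differ, and one condition corresponds to multiple prescriptions, the improved algorithm helps shorten the development cycle for new medicines.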
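The abstract does not spell out how Apriori_Matrix works, so the following is only a generic sketch of the matrix idea under stated assumptions: the database is scanned once into a Boolean transaction-by-item matrix, stored here column-wise as Python integers used as bit vectors, and the support of a joined itemset is obtained by ANDing columns instead of rescanning the database. The function name `frequent_itemsets` and the toy data are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_count):
    """Level-wise frequent-itemset search over a column-wise Boolean matrix."""
    # Single scan: column[item] is an int whose bit r is set when row r has item.
    column = {}
    for row, items in enumerate(transactions):
        for item in items:
            column[item] = column.get(item, 0) | (1 << row)

    def count(mask):
        # Number of rows (set bits) covered by the mask.
        return bin(mask).count("1")

    # Level 1: frequent single items, keyed by one-element tuples.
    level = {(i,): m for i, m in column.items() if count(m) >= min_count}
    result = {}
    while level:
        result.update({k: count(m) for k, m in level.items()})
        next_level = {}
        for a, b in combinations(sorted(level), 2):
            # Join two k-itemsets only when they share their first k-1 items.
            if a[:-1] != b[:-1]:
                continue
            mask = level[a] & level[b]  # row-wise AND replaces a database rescan
            if count(mask) >= min_count:
                next_level[a + (b[-1],)] = mask
        level = next_level
    return result


# Hypothetical prescriptions over three herbs.
recipes = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B", "C"}, {"A", "B", "C"}]
found = frequent_itemsets(recipes, min_count=3)
```

With a limited number of distinct herbs, as the abstract notes for Chinese medicine databases, the matrix has few columns, which keeps this representation compact.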

19.
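Data mining is a technology that has developed in recent years and through which people can discover valuable information hidden behind data. It has become the most effective way to cope with the data explosion facing today's enterprise information systems, and it gives decision makers scientific theoretical support for business decisions. This paper explains and analyzes the meaning and basic algorithms of data mining and discusses its specific applications in e-commerce.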

20.
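Design and implementation of a tool for validating clustering algorithms   Total citations: 1, self-citations: 0, citations by others: 1
As data mining technology develops, clustering algorithms are becoming more numerous. Data mining places certain typical requirements on clustering algorithms, and how to verify whether a clustering algorithm meets these requirements has become a problem that needs solving. Because real sample sets are hard to obtain and many cannot be used to test clustering algorithms, a tool was designed and implemented. The paper discusses using constructed sample sets to evaluate the clustering algorithms loaded into the tool and to display the clustering results.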
