Similar literature
20 similar documents found
1.
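To improve the accuracy of analog-circuit fault diagnosis, a new multiple-kernel extreme learning machine diagnostic model is proposed, combined with a feature-selection algorithm based on the one-dimensional fuzziness between fault features. The model introduces a virtual base kernel so that the regularization parameter is folded into the solution of the base-kernel weights. In addition, the within-class scatter of the feature space is integrated into the multiple-kernel optimization objective, so that while the training error is minimized, fault samples of the same mode become more concentrated, which effectively improves discrimination between fault modes. Two analog-circuit diagnosis examples show that, compared with single-kernel learning algorithms, the proposed method significantly improves diagnostic accuracy and isolates hard-to-distinguish fault samples more accurately into their corresponding ambiguity groups; compared with general multiple-kernel learning algorithms, it achieves similar diagnostic accuracy at a lower time cost.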

2.
蒋胤傑, 况琨, 吴飞. 《智能系统学报》, 2020, 15(1): 175-182
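Data-driven machine learning, especially deep learning, has made great progress in natural language processing, computer vision, and speech recognition, and is a hot topic in artificial intelligence research. However, traditional machine learning uses optimization algorithms to fit the optimal model on a training data set, that is, the model with the smallest average loss, whereas in many real-world problems (such as commercial auctions and resource allocation) the goal of learning should be an equilibrium solution, one that still performs well in dynamic situations. This requires applying game-theoretic ideas to big-data intelligence. Methods such as Monte Carlo tree search and reinforcement learning make it possible to combine game theory with artificial intelligence and to seek equilibrium solutions of adversarial game models. Moving from the optimal solution of data fitting to the equilibrium solution of adversarial games gives big-data intelligence a broader range of applications.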

3.
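A new rough-set double-learning method is proposed in which a genetic algorithm performs the outer-level learning and a rule-extraction method performs the inner-level learning. The basic idea is as follows: first, a genetic algorithm is introduced, the attributes are encoded, and rules are extracted for different attribute combinations; next, the rule set is checked against test samples and a fitness function is built from the resulting recognition rate; finally, the best attribute combination and the corresponding knowledge rules are obtained under suitable genetic operators. Compared with other methods, the proposed method integrates attribute reduction and rule extraction, and the whole process is highly adaptive. The method is verified with a worked example.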

4.
Learning rules from incomplete training examples by rough sets   (Cited by: 1 total; self-citations: 0; citations by others: 1)
Machine learning can extract desired knowledge from existing training examples and ease the development bottleneck in building expert systems. Most learning approaches derive rules from complete data sets. If some attribute values are unknown in a data set, it is called incomplete. Learning from incomplete data sets is usually more difficult than learning from complete data sets. In the past, rough-set theory has been widely used in dealing with data classification problems. In this paper, we deal with the problem of producing a set of certain and possible rules from incomplete data sets based on rough sets. A new learning algorithm is proposed, which can simultaneously derive rules from incomplete data sets and estimate the missing values in the learning process. Unknown values are first assumed to be any possible values and are gradually refined according to the incomplete lower and upper approximations derived from the given training examples. The examples and the approximations then interact with each other to derive certain and possible rules and to estimate appropriate unknown values. The rules derived can then serve as knowledge concerning the incomplete data set.
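
The lower and upper approximations mentioned in this abstract can be illustrated with a minimal Python sketch of standard rough-set theory. It assumes complete data only; the paper's handling of missing values is not reproduced here, and the function name `approximations` and the toy rows are hypothetical, not taken from the paper.

```python
from collections import defaultdict


def approximations(rows, attrs, target_class):
    """Return (lower, upper) index sets for the objects labelled target_class.

    rows is a list of (attribute_dict, label) pairs with no missing values.
    """
    # Group objects that are indiscernible on the chosen attributes.
    blocks = defaultdict(set)
    for i, (values, _label) in enumerate(rows):
        blocks[tuple(values[a] for a in attrs)].add(i)

    target = {i for i, (_v, label) in enumerate(rows) if label == target_class}
    lower, upper = set(), set()
    for block in blocks.values():
        if block <= target:   # block lies entirely inside the class
            lower |= block
        if block & target:    # block overlaps the class
            upper |= block
    return lower, upper


# Hypothetical toy data for illustration only.
rows = [
    ({"temp": "high", "cough": "yes"}, "flu"),
    ({"temp": "high", "cough": "yes"}, "flu"),
    ({"temp": "high", "cough": "no"}, "flu"),
    ({"temp": "high", "cough": "no"}, "healthy"),
    ({"temp": "normal", "cough": "no"}, "healthy"),
]
lower, upper = approximations(rows, ["temp", "cough"], "flu")
```

In this toy example, objects in `lower` would support certain rules for the class, while objects that appear only in `upper` would support possible rules.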

5.
刘晓平. 《计算机仿真》, 2006, 23(4): 103-105, 113
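Data mining is the process of extracting hidden knowledge from large amounts of raw data. Most data mining tools use rule discovery and decision-tree classification techniques to find patterns and rules in data, and their core is an induction algorithm. Compared with traditional statistical methods, classification results obtained with machine learning techniques are more interpretable. When mining a particular data set without the relevant domain knowledge, it is hard for users or decision makers to determine which induction algorithm to choose, so various algorithms need to be tried. With MLC++, decision makers can easily compare the effectiveness of different classification algorithms on a particular data set and choose a suitable one. System developers can also use MLC++ to design various hybrid algorithms.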

6.
7.
Fast algorithms for mining fuzzy rules in relational databases   (Cited by: 10 total; self-citations: 0; citations by others: 10)
陈宁, 陈安, 周龙骧. 《软件学报》, 2001, 12(7): 949-959
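Association rules and temporal rules are among the tasks of data mining. In earlier algorithms, rules are usually expressed with crisp values or concepts, which often lack practical meaning and are not easy for users to understand. This paper studies the mining of fuzzy association rules and fuzzy temporal rules from large relational databases. Based on fuzzy set theory, two algorithms for mining fuzzy association rules are proposed, and each is then extended into an algorithm for mining fuzzy temporal rules. Rules expressed with fuzzy concepts are closer to the way people think and express themselves, which makes the rules easier to understand.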

8.
Most data complexity studies have focused on characterizing the complexity of the entire data set and do not provide information about individual instances. Knowing which instances are misclassified and understanding why they are misclassified and how they contribute to data set complexity can improve the learning process and could guide the future development of learning algorithms and data analysis methods. The goal of this paper is to better understand the data used in machine learning problems by identifying and analyzing the instances that are frequently misclassified by learning algorithms that have shown utility to date and are commonly used in practice. We identify instances that are hard to classify correctly (instance hardness) by classifying over 190,000 instances from 64 data sets with 9 learning algorithms. We then use a set of hardness measures to understand why some instances are harder to classify correctly than others. We find that class overlap is a principal contributor to instance hardness. We seek to integrate this information into the training process to alleviate the effects of class overlap and present ways that instance hardness can be used to improve learning.
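
As a rough illustration of the idea of identifying instances that several learning algorithms misclassify, the sketch below computes, for each instance, the fraction of models that got it wrong. This is only a simple proxy assumed here for illustration; it is not claimed to be the paper's definition of instance hardness, and the function name and the toy predictions are hypothetical.

```python
def misclassification_rate(true_labels, predictions_by_model):
    """For each instance, return the fraction of models that misclassified it."""
    n_models = len(predictions_by_model)
    rates = []
    for i, truth in enumerate(true_labels):
        wrong = sum(1 for preds in predictions_by_model if preds[i] != truth)
        rates.append(wrong / n_models)
    return rates


# Hypothetical labels and per-model predictions (one list per model).
true_labels = ["a", "a", "b", "b"]
predictions_by_model = [
    ["a", "b", "b", "b"],
    ["a", "b", "b", "a"],
    ["a", "a", "b", "b"],
    ["a", "b", "b", "b"],
]
hardness = misclassification_rate(true_labels, predictions_by_model)
```

Instances with a rate near 1 are the ones most models get wrong, which is where a follow-up analysis such as checking for class overlap would start.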

9.
Many real‐world problems require multilabel classification, in which each training instance is associated with a set of labels. There are many existing learning algorithms for multilabel classification; however, these algorithms assume implicit negativity, where missing labels in the training data are automatically assumed to be negative. Additionally, many of the existing algorithms do not handle incremental learning in which new labels could be encountered later in the learning process. A novel multilabel adaptation of the backpropagation algorithm is proposed that does not assume implicit negativity. In addition, this algorithm can, using a naïve Bayesian approach, infer missing labels in the training data. This algorithm can also be trained incrementally as it dynamically considers new labels. This solution is compared with existing multilabel algorithms using data sets from multiple domains, and the performance is measured with standard multilabel evaluation metrics. It is shown that our algorithm improves classification performance for all metrics by an overall average of 7.4% when at least 40% of the labels are missing from the training data and improves by 18.4% when at least 90% of the labels are missing.

10.
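A new fuzzy neural network (FNN) based on fuzzy C-means clustering (FCM) and a real-valued genetic algorithm (RVGA) is proposed. Before the fuzzy rules are trained, FCM is used to extract typical data from the training data in order to remove outliers and reconcile conflicts within the data. A new real-valued genetic algorithm is then used to train on this typical data. Its crossover and mutation operators act directly on real values rather than on bits as in the traditional approach, which can greatly reduce training time and achieve global optimization. Simulation experiments on nonlinear function identification demonstrate the superiority of the method.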
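
The fuzzy C-means step named in this abstract can be sketched in plain Python using the standard FCM membership and center updates. This is a minimal sketch on 1-D data with a fixed iteration count; how the paper selects "typical data" from the clustering result is not stated in the abstract and is not modelled here. The function name `fcm_1d`, the fuzzifier default `m=2.0`, and the sample points are assumptions.

```python
def fcm_1d(points, centers, m=2.0, iterations=50):
    """Run fuzzy C-means on 1-D data, starting from the given centers.

    Returns the final centers and the memberships computed in the last pass.
    """
    centers = list(centers)
    power = 2.0 / (m - 1.0)
    for _ in range(iterations):
        memberships = []
        for x in points:
            dists = [abs(x - c) for c in centers]
            if 0.0 in dists:
                # The point sits exactly on a center: full membership there.
                row = [1.0 if d == 0.0 else 0.0 for d in dists]
                total = sum(row)
                row = [r / total for r in row]
            else:
                # Standard update: u_j = 1 / sum_k (d_j / d_k)^(2 / (m - 1)).
                row = [1.0 / sum((d_j / d_k) ** power for d_k in dists)
                       for d_j in dists]
            memberships.append(row)
        # Each center is the mean of the points weighted by u^m.
        centers = [
            sum((u[j] ** m) * x for u, x in zip(memberships, points))
            / sum(u[j] ** m for u in memberships)
            for j in range(len(centers))
        ]
    return centers, memberships


# Hypothetical 1-D data with two well-separated groups.
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers, memberships = fcm_1d(points, [0.0, 6.0])
```

One plausible use of the result, stated here only as an assumption, is to treat points whose largest membership is low as candidates for outliers.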

11.
Association rule mining algorithms mostly use a randomly generated single seed to initialize a population without paying attention to the effectiveness of that population in evolutionary learning. Recently, research has shown a significant impact of the initial population on the production of good solutions over several generations of a genetic algorithm. Single-seed-based genetic algorithms suffer from the following major challenges: (1) the solutions of a genetic algorithm vary, since different seeds generate different initial populations, and (2) it is difficult to define a good seed for a specific application. To avoid these problems, in this paper we propose the MSGA, a new multiple-seeds-based genetic algorithm which generates multiple seeds from different domains of a solution space to discover high-quality rules from a large data set. This scheme introduces an m-domain model and an m-seeds selection process through which the whole solution space is subdivided into m domains of the same size, with a seed selected from each domain. Use of these seeds enables this method to generate an effective initial population for evolutionary learning of the fitness value of each rule. As a result, strong searching efficiency is obtained at the beginning of the evolution, achieving fast convergence. The MSGA is tested with different mutation and crossover operators for mining interesting Boolean association rules from four real-world data sets. The results are compared to different single-seed-based genetic algorithms under the same conditions.
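
The m-domain idea described in this abstract (split the solution space into m equal-size domains and take one seed from each) can be sketched as follows. The abstract does not say how the MSGA encodes the solution space or chooses a seed inside a domain, so this sketch assumes a 1-D integer space and a uniform random draw; `pick_domain_seeds` and its parameters are hypothetical names.

```python
import random


def pick_domain_seeds(space_size, m, rng):
    """Split range(space_size) into m equal-width domains; draw one seed from each."""
    if space_size % m != 0:
        raise ValueError("space_size must be divisible by m for equal-size domains")
    width = space_size // m
    # Domain k covers the half-open interval [k * width, (k + 1) * width).
    return [rng.randrange(k * width, (k + 1) * width) for k in range(m)]


# Assumed toy setting: 1000 candidate solutions split into 4 domains.
seeds = pick_domain_seeds(space_size=1000, m=4, rng=random.Random(0))
```

Each returned seed lies in a different domain, so a population initialized from them starts spread across the whole space rather than around a single point.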

12.
Dividend policy is one of the most important managerial decisions affecting firm value. Although there are many studies regarding decision-making problems, such as credit policy decisions through bankruptcy prediction and credit scoring, there is no research, to our knowledge, about dividend prediction or dividend policy forecasting using machine learning approaches, in spite of the significance of dividends. To deal with the problems raised in the literature, we suggest a knowledge refinement model that can refine the multiple rules extracted through rule-based algorithms from dividend data sets by utilizing a genetic algorithm (GA). The new technique, called “GAKR (genetic algorithm knowledge refinement)”, aims to combine the advantages of both knowledge consolidation and GA. The main result of the cross-validation procedure is the average accuracy rate of prediction in the five sets over the five iterations. The experiments show that the GAKR model always outperforms the other models in dividend policy prediction; it predicts future dividend policy more correctly than any of the other models. The major advantages of the GAKR model can be summarized as follows: (1) The classification process of GAKR can be very fast with a compact set of rules. In other words, a fast training mechanism for GAKR is possible regardless of data set size. (2) The multiple rules extracted by the GAKR development process are much simpler and easier to understand. Moreover, the GAKR model can discriminate redundant rules and inconsistent rules.

13.
A set of classification rules can be considered as a disjunction of rules, where each rule is a disjunct. A small disjunct is a rule covering a small number of examples. Small disjuncts are a serious problem for effective classification, because the small number of examples satisfying these rules makes their prediction unreliable and error-prone. This paper offers two main contributions to the research on small disjuncts. First, it investigates six candidate solutions (algorithms) for the problem of small disjuncts. Second, it reports the results of a meta-learning experiment, which produced meta-rules predicting which algorithm will tend to perform best for a given data set. The algorithms investigated in this paper belong to different machine learning paradigms and their hybrid combinations, as follows: two versions of a decision-tree (DT) induction algorithm; two versions of a hybrid DT/genetic algorithm (GA) method; one GA; one hybrid DT/instance-based learning (IBL) algorithm. Experiments with 22 data sets evaluated both the predictive accuracy and the simplicity of the discovered rule sets, with the following conclusions. If one wants to maximize predictive accuracy only, then the hybrid DT/IBL seems to be the best choice. On the other hand, if one wants to maximize both predictive accuracy and rule set simplicity -- which is important in the context of data mining -- then a hybrid DT/GA seems to be the best choice.

14.
We present an approach to designing cellular automata-based multiprocessor scheduling algorithms in which knowledge about the scheduling process is extracted. We consider the simplest case, in which a multiprocessor system is limited to two processors. To design cellular automata corresponding to a given program graph, we propose a generic definition of program graph neighborhood, transparent to the various kinds, sizes, and shapes of program graphs. The cellular automata-based scheduler works in two modes: learning mode and operation mode. Discovered rules are typically suitable for sequential cellular automata working as a scheduler, while the most interesting and promising feature of cellular automata is their massive parallelism. To overcome difficulties in evolving parallel cellular automata rules, we propose using a coevolutionary genetic algorithm. Rules discovered this way enable us to design effective parallel schedulers. We present a number of experimental results for both sequential and parallel scheduling algorithms discovered in the context of a cellular automata-based scheduling system.

15.
In this paper, fuzzy inference models for pattern classification are developed and fuzzy inference networks based on these models are proposed. Most existing fuzzy rule-based systems have difficulty deriving inference rules and membership functions directly from training data; rules and membership functions are instead obtained from experts. Some approaches use backpropagation (BP) type learning algorithms to learn the parameters of membership functions from training data. However, BP algorithms take a long time to converge and require the number of inference rules to be set in advance. Determining the number of inference rules demands a great deal of experience from the designer. In this paper, self-organizing learning algorithms are proposed for the fuzzy inference networks. In the proposed learning algorithms, the number of inference rules and the membership functions in the inference rules are determined automatically during the training procedure. The learning speed is fast. The proposed fuzzy inference network (FIN) classifiers possess both the structure and the learning ability of neural networks, and the fuzzy classification ability of fuzzy algorithms. Simulation results on fuzzy classification of two-dimensional data are presented and compared with those of the fuzzy ARTMAP. The proposed fuzzy inference networks perform better than the fuzzy ARTMAP and need fewer training samples.

16.
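Rotating machinery fault diagnosis based on fuzzy C-means clustering and rough set theory   (Cited by: 5 total; self-citations: 0; citations by others: 5)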
李如强, 陈进, 伍星. 《信息与控制》, 2004, 33(3): 355-360
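A fault diagnosis method for rotating machinery based on fuzzy C-means clustering and rough set theory is proposed. The method comprises two processes: rough-set rule learning and diagnostic rule matching. The learning process takes duplicate and conflicting objects in the samples into account, so that the diagnostic rules obtained cover all learning samples and rule strengths are obtained. In diagnostic rule matching, the diagnosis is derived from the significance of the condition attributes in a rule, the degree to which the condition attributes match, the rule strength, and a threshold on the diagnostic conclusion, which makes the conclusion more objective. Finally, experiments verify the effectiveness of the method.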

17.
The design of fuzzy controllers for the implementation of behaviors in mobile robotics is a complex and highly time-consuming task. The use of machine learning techniques such as evolutionary algorithms or artificial neural networks for the learning of these controllers makes it possible to automate the design process. In this paper, the automated design of a fuzzy controller using genetic algorithms for the implementation of the wall-following behavior in a mobile robot is described. The algorithm is based on the iterative rule learning approach, and is characterized by three main points. First, learning places no restrictions on either the number of membership functions or their values. Second, the training set is composed of a set of examples uniformly distributed along the universe of discourse of the variables. This guarantees that the quality of the learned behavior does not depend on the environment, and also that the robot will be capable of facing different situations. Finally, the trade-off between the number of rules and the quality/accuracy of the controller can be adjusted by selecting the value of a parameter. Once the knowledge base has been learned, a process for its reduction and tuning is applied, increasing the cooperation between rules and reducing their number.

18.
Kazakov, Dimitar; Manandhar, Suresh. 《Machine Learning》, 2001, 43(1-2): 121-162
This article presents a combination of unsupervised and supervised learning techniques for the generation of word segmentation rules from a raw list of words. First, a language bias for word segmentation is introduced and a simple genetic algorithm is used in the search for a segmentation that corresponds to the best bias value. In the second phase, the words segmented by the genetic algorithm are used as an input for the first order decision list learner CLOG. The result is a set of first order rules which can be used for segmentation of unseen words. When applied to either the training data or unseen data, these rules produce segmentations which are linguistically meaningful and, to a large degree, conform to the annotation provided.

19.
张旗, 石纯一. 《软件学报》, 1996, 7(6): 339-344
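In the real world, AI systems are inevitably affected by noise, and whether a system works effectively depends on how sensitive it is to noise. Explanation-based learning (EBL) is no exception. This paper discusses how explanation-based learning can be handled when examples are affected by noise and proposes an algorithm, NR-EBL (noise-resistant EBL). Unlike existing explanation-based learning methods, NR-EBL can still learn when the training examples contain noise, so as to capture the actual problem distribution. Unlike similar work, NR-EBL points out that correct concept recognition depends on the regularities of the noise, and it tries to discover and learn those regularities from the set of training examples. With knowledge of the noise regularities, NR-EBL can be expected to achieve a higher recognition rate than EBL and similar work when recognizing concepts. NR-EBL combines explanation-based learning with ideas from statistical pattern recognition. It extends the existing explanation-based learning model to the case where examples contain noise; the original EBL algorithm is just a special case of it.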

20.
Application of a diagnostic system to a helicopter gearbox is presented. The diagnostic system is a nonparametric pattern classifier that uses a multi-valued influence matrix (MVIM) as its diagnostic model and benefits from a fast learning algorithm that enables it to estimate its diagnostic model from a small number of measurement-fault data. To test this diagnostic system, vibration measurements were collected from a helicopter gearbox test stand during accelerated fatigue tests and at various fault instances. The diagnostic results indicate that the MVIM system can accurately detect and diagnose various gearbox faults so long as they are included in training.
