Similar literature
20 similar documents found (search time: 31 ms)
1.
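To achieve fast, accurate, and dynamic prediction of coal and gas outbursts, and taking into account the multiple factors that influence them, a prediction method based on clustering and case-based reasoning (CBR) is proposed. Feature weights for describing cases are obtained with a PCA-based weight determination method and are used to cluster the cases in the case base, so that cases in the same cluster are highly similar to one another. Case retrieval and matching are then carried out efficiently on the basis of the clustering result, which speeds up outburst prediction. The method was validated on measured data. The results show that its predictions are highly accurate and that its average prediction time is 40% of that of an existing CBR method for coal and gas outburst prediction.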

2.
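To improve fault diagnosis accuracy for the Tennessee-Eastman (TE) process, this paper studies a case retrieval method that replaces the distance metric with a learning pseudo metric (LPM), and builds a case-based reasoning (CBR) fault diagnosis model of the TE process. First, the LPM criterion is established and the LPM model is trained. Second, the similarity between the target case and every source case is measured, and the source cases similar to the target case are retrieved. The solution of the target case is then decided from these similar cases by the majority reuse principle. Finally, the method is tested on operating data from the TE process and compared with typical CBR, back-propagation (BP) neural networks, and support vector machines. The results show that the method effectively improves fault diagnosis accuracy and has some potential for application in real chemical processes.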
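
The retrieve-then-vote step described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the Euclidean `distance` is only a stand-in for the trained LPM, and the function name, the parameter `k`, and the toy case base are all hypothetical.

```python
import math
from collections import Counter

def euclidean(a, b):
    # Stand-in metric (assumption); the paper would use its learned pseudo metric here.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def retrieve_and_reuse(case_base, target, k=3, distance=euclidean):
    """case_base is a list of (features, solution) pairs."""
    # Retrieve: rank source cases by their distance to the target case.
    ranked = sorted(case_base, key=lambda case: distance(case[0], target))
    neighbours = ranked[:k]
    # Reuse: the most frequent solution among the retrieved cases wins.
    votes = Counter(solution for _, solution in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical toy case base with two-dimensional features.
case_base = [
    ((0.1, 0.2), "normal"),
    ((0.2, 0.1), "normal"),
    ((0.9, 0.8), "fault_1"),
    ((0.8, 0.9), "fault_1"),
    ((0.85, 0.95), "fault_1"),
]
print(retrieve_and_reuse(case_base, (0.82, 0.88)))
```

Passing the metric in as a parameter keeps the vote logic unchanged if a learned similarity replaces the Euclidean one.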

3.
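Research on classification methods for imbalanced data sets   Total citations: 2, self-citations: 0, citations by others: 2
Traditional classification algorithms tend to favor the majority class when handling imbalanced data, which leads to low classification accuracy on the minority class. For the classification of imbalanced data, this paper first introduces existing performance measures for imbalanced data classification, then introduces the commonly used data sampling methods and existing classification methods, and finally introduces hybrid methods that combine data sampling with classification methods.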

4.
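A case retrieval algorithm based on k-means clustering   Total citations: 1, self-citations: 1, citations by others: 1
To address problems in the case retrieval algorithms of CBR systems, the case base is clustered following the idea of the k-means algorithm, and a case retrieval algorithm is designed on top of the clustering. The rules for selecting sample cases are analyzed, and the case retrieval algorithm is discussed in detail. Experimental results show that the method effectively improves both the recall of retrieval results and retrieval efficiency.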
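
A rough sketch of retrieval on top of a clustered case base is given below. It assumes the clusters and their centroids have already been computed (for example by k-means); the data layout, the function names, and the toy values are hypothetical and not taken from the paper.

```python
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def retrieve(clusters, query):
    """clusters is a list of (centroid, cases) pairs built beforehand."""
    # Step 1: compare the query with the centroids only.
    _, cases = min(clusters, key=lambda c: dist(c[0], query))
    # Step 2: search for the most similar case inside that single cluster.
    return min(cases, key=lambda case: dist(case, query))

# Hypothetical pre-clustered case base.
clusters = [
    ((0.1, 0.1), [(0.0, 0.0), (0.2, 0.1), (0.1, 0.2)]),
    ((5.0, 5.0), [(4.9, 5.1), (5.2, 4.8), (4.9, 5.1)]),
]
print(retrieve(clusters, (0.18, 0.12)))
```

The query is compared with one centroid per cluster plus the members of one cluster, rather than with every case in the case base.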

5.
Case-based reasoning (CBR) solves many real-world problems under the assumption that similar observations have similar outputs. As an implementation of this assumption, and inspired by the technique for order performance by similarity to ideal solution (TOPSIS), this paper proposes a new type of multiple criteria CBR method for binary business failure prediction (BFP) with similarities to positive and negative ideal cases (SPNIC). Assuming that the binary prediction of business failure generates two results, i.e., failure and non-failure, we set the principle of this forecasting method, termed SPNIC-based CBR, as follows: new observations should have the same output as the positive or negative ideal case to which they are more similar. From the perspective of CBR, the SPNIC-based CBR forecasting method consists of R4 processes: retrieving positive and negative ideal cases, reusing solutions of ideal cases to forecast, retaining cases, and reconstructing the case base. As a demonstration, we applied this method to forecast business failure in China with three data representations of a previously collected dataset from a normal economic environment and one representation of a recently collected dataset from a financial crisis environment. The results indicate that this new CBR forecasting method produces significantly better short-term discriminative capability than the comparative methods, except for the support vector machine, in the normal economic environment. In contrast, it cannot produce acceptable performance in the financial crisis environment. Further topics about this method are discussed.
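
The decision principle stated in this abstract (assign the output of whichever ideal case is more similar) might be sketched as below. The abstract does not say how the ideal cases are constructed, so this sketch assumes, purely for illustration, that each ideal case is the per-feature mean of its class and that "positive" means non-failure; the similarity function and all names are likewise hypothetical.

```python
import math

def mean_case(cases):
    # Assumption: an ideal case is the per-feature mean of its class.
    n = len(cases)
    return tuple(sum(col) / n for col in zip(*cases))

def similarity(a, b):
    # Assumption: similarity decreases with Euclidean distance.
    return 1.0 / (1.0 + math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b))))

def spnic_predict(positive_ideal, negative_ideal, observation):
    s_pos = similarity(observation, positive_ideal)
    s_neg = similarity(observation, negative_ideal)
    # The observation takes the output of the more similar ideal case.
    return "non-failure" if s_pos >= s_neg else "failure"

# Hypothetical financial-ratio vectors.
healthy = [(0.8, 0.7), (0.9, 0.6)]
failed = [(0.2, 0.1), (0.1, 0.3)]
pos, neg = mean_case(healthy), mean_case(failed)
print(spnic_predict(pos, neg, (0.7, 0.6)))
print(spnic_predict(pos, neg, (0.2, 0.25)))
```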

6.
The problem of limited minority class data is encountered in many class-imbalanced applications but has received little attention. Synthetic over-sampling, a popular family of class-imbalance learning methods, can introduce much noise when the minority class has limited data, since the synthetic samples are not i.i.d. samples of the minority class. Most sophisticated synthetic sampling methods tackle this problem by denoising or by generating samples more consistent with the ground-truth data distribution, but their assumptions about the true noise or the ground-truth data distribution may not hold. To adapt synthetic sampling to the problem of limited minority class data, the proposed Traso framework treats synthetic minority class samples as an additional data source and exploits transfer learning to transfer knowledge from them to the minority class. As an implementation, the TrasoBoost method first generates synthetic samples to balance class sizes. Then, in each boosting iteration, the weights of synthetic samples decrease and the weights of original data increase when they are misclassified, and remain unchanged otherwise. The misclassified synthetic samples are potential noise and thus have smaller influence in the following iterations. In addition, the weights of minority class instances change more than those of majority class instances, so that they are more influential, and only original data are used to estimate the error rate, so that the estimate is immune to noise. Finally, since the synthetic samples are highly related to the minority class, all of the weak learners are aggregated for prediction. Experimental results show that TrasoBoost outperforms many popular class-imbalance learning methods.
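
The weight update rule described in this abstract can be illustrated with the sketch below. The abstract gives no formulas, so the multiplicative constants `factor` and `minority_boost` are hypothetical placeholders (a boosting method would normally derive such factors from the error rate); only the direction of each change follows the text.

```python
def update_weights(weights, is_synthetic, is_minority, misclassified,
                   factor=2.0, minority_boost=1.5):
    """One illustrative weight update; the constants are assumptions."""
    updated = []
    for w, synthetic, minority, missed in zip(
            weights, is_synthetic, is_minority, misclassified):
        if missed:
            # Minority-class instances change more than majority-class ones.
            step = factor * (minority_boost if minority else 1.0)
            # Misclassified synthetic samples are treated as potential noise
            # and shrink; misclassified original samples grow.
            w = w / step if synthetic else w * step
        # Correctly classified samples keep their weight.
        updated.append(w)
    return updated

weights = [1.0, 1.0, 1.0, 1.0]
is_synthetic = [True, False, False, False]
is_minority = [True, True, False, False]
misclassified = [True, True, True, False]
new_weights = update_weights(weights, is_synthetic, is_minority, misclassified)
print(new_weights)
```

The error-rate estimate on original data only, and the aggregation of weak learners, are not shown here.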

7.
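Knowledge representation and reuse techniques are studied, and a knowledge reuse method based on extenics theory and case-based reasoning is proposed. The method uses the matter-element model to describe the cases and problems of case-based reasoning in a unified way that combines quantitative and qualitative information, uses a retrieval algorithm based on the extension distance to obtain the best-matching case that meets the design and manufacturing requirements, and applies extension transformations to adapt the retrieved case so that the new case meets the needs of the problem. Finally, the method is verified on the type-selection scheme design of a large hydraulic turbine.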

8.
Abstract: Case-based reasoning (CBR) has been used in various problem-solving areas such as financial forecasting, credit analysis and medical diagnosis. However, conventional CBR has the limitation that it has no criterion for choosing the nearest cases based on the probabilistic similarity of cases. It uses a fixed number of neighbors without considering an optimal number for each target case, so it does not guarantee optimal similar neighbors for various target cases. This lowers predictability because of deviation from the desired similar neighbors. In this paper we suggest a new case extraction technique called statistical case-based reasoning. The main idea involves a dynamic adaptation of the optimal number of neighbors by considering the distribution of distances between potential similar neighbors for each target case. To do this, our technique finds the optimal distance threshold and selects similar neighbors satisfying the distance threshold criterion. We apply this new method to five real-life medical data sets and compare the results with those of a statistical method, logistic regression, and with those of the learning methods C5.0, CART, neural networks and conventional CBR. The results show that the proposed technique outperforms many other methods, overcomes the limitation of conventional CBR, and provides improved classification accuracy.
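
The idea of letting a distance threshold, rather than a fixed k, decide how many neighbors each target case gets can be sketched as follows. The abstract does not state how the optimal threshold is found; the rule used here (mean distance minus `z` standard deviations) and the fallback to the single nearest case are assumptions made only for illustration.

```python
import math
import statistics

def select_neighbours(cases, target, z=0.5):
    """Return the cases whose distance to the target is under a per-target threshold."""
    dists = [(math.dist(c, target), c) for c in cases]
    values = [d for d, _ in dists]
    # Assumed threshold rule: derived from this target's own distance distribution.
    threshold = statistics.mean(values) - z * statistics.pstdev(values)
    chosen = [c for d, c in dists if d <= threshold]
    if not chosen:
        # Assumed fallback: keep at least the nearest case.
        chosen = [min(dists, key=lambda pair: pair[0])[1]]
    return chosen

# Hypothetical case base: three cases near the origin, two far away.
cases = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (6.0, 5.0)]
print(len(select_neighbours(cases, (0.05, 0.05))))
print(len(select_neighbours(cases, (5.5, 5.0))))
```

With this toy data the two targets end up with different numbers of neighbors, which is the behavior a fixed-k retrieval cannot provide.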

9.
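A reasoning technique combining cases and rules   Total citations: 5, self-citations: 0, citations by others: 5
Over the years, machine learning researchers have proposed many hybrid architectures to improve machine learning performance. This paper proposes a reasoning technique that combines case-based reasoning with rule-based reasoning, together with a case base partitioning algorithm. The aim is to exploit the strengths of both kinds of reasoning and improve problem-solving efficiency. Some test results and related conclusions are given at the end.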

10.
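Imbalanced data classification is a hot topic in machine learning research. To address the low minority-class recognition rate of traditional classifiers on imbalanced data, this paper proposes an improved AdaBoost classification algorithm based on clustering. The algorithm first performs cluster-based undersampling: K-means clustering is run on the majority-class samples, the cluster centroids are extracted, and as many centroids as there are minority-class samples are combined with all minority-class samples to form a new balanced training set. To avoid a drop in classification accuracy when too few minority-class samples make the training set too small, a minority over-sampling technique is combined with the cluster-based undersampling. Then, borrowing the idea of cost-sensitive learning, the classification error function of AdaBoost's base classifiers is modified so that samples of different classes incur asymmetric misclassification losses. Experimental results show that the algorithm gives the model highly representative training samples and improves minority-class accuracy while maintaining overall classification performance.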
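
The cluster-based undersampling step can be sketched with a small, plain k-means as below. This is an illustrative sketch under simple assumptions (Lloyd's iterations with a fixed iteration count, random initial centroids, hypothetical function names and data); the over-sampling and cost-sensitive AdaBoost parts of the abstract are not shown.

```python
import math
import random

def kmeans_centroids(points, k, iterations=20, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            groups[nearest].append(p)
        # Recompute each centroid; an empty cluster keeps its old centroid.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids

def cluster_undersample(majority, minority, seed=0):
    # One centroid per minority sample, so both classes end up the same size.
    centroids = kmeans_centroids(majority, k=len(minority), seed=seed)
    return centroids, list(minority)

majority = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
minority = [(2.0, 2.0), (2.5, 2.5)]
centroids, kept = cluster_undersample(majority, minority)
print(len(centroids), len(kept))
```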

11.
12.
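Modeling of the laminar cooling process based on case-based reasoning   Total citations: 9, self-citations: 3, citations by others: 9
For the hot-rolling laminar cooling process, which combines nonlinearity, time-varying parameters, and distributed parameters, first-principles modeling is combined with case-based reasoning. A case is constructed from the operating condition of the strip in the laminar cooling process, matching historical cases are retrieved from the case base, and the model parameters for the current condition are inferred from the features of the actual condition and of the matched conditions, which determines the dynamic model of the laminar cooling process. The model can predict how the strip temperature changes over the whole cooling process. An experimental comparison on real data from the hot-rolling laminar cooling process of a steel company shows that the proposed modeling method is effective.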

13.
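Liu Ning, Zhu Bo, Yin Yanchao, Li Xiuchen. 《控制与决策》 (Control and Decision), 2023, 38(9): 2614-2621
CGAN can learn the distribution of data and has been introduced into imbalanced data processing to oversample minority-class samples. Because it can generate new samples that follow the original data distribution, it performs better than traditional resampling methods. However, how well CGAN learns the distribution is easily limited by sample size: when the minority class is small, it cannot fully learn the distribution, and the quality of the generated samples is hard to guarantee. To address this, a method for balancing imbalanced data that combines CGAN with SMOTEENN is proposed. First, starting from the existing minority-class samples, SMOTEENN is used to generate minority-class samples of a certain scale. Then a CGAN model is trained on this basis so that it can generate new samples consistent with the distribution of the original minority-class samples. Finally, the CGAN is used to regenerate new samples that follow the original minority-class distribution, and these are used to build a balanced data set. To verify the method, comparative experiments were carried out on public imbalanced data sets. The results show that, compared with several classic imbalanced data processing methods and methods reported in recent literature, the proposed method has a clear advantage on several evaluation metrics for imbalanced data classification.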

14.
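Existing machine learning algorithms struggle to improve the classification accuracy of minority-class samples in imbalanced online sequential data. To address this, an imbalanced online sequential extreme learning machine based on principal curves is proposed. Its core idea is to balance the samples of each class according to the distribution of the online sequential data, which reduces the blindness of minority-sample synthesis. The method has an offline stage and an online stage. In the offline stage, principal curves are used to build a distribution model for each class, the minority class is oversampled with the synthetic minority over-sampling algorithm, each sample is assigned a membership degree according to its projection distance to the corresponding principal curve, and majority-class samples and synthetic minority-class samples are then pruned according to the membership interval to build the initial model. In the online stage, minority-class samples that arrive sequentially are oversampled, the sequential samples are balanced according to the membership interval, and the network weights are updated dynamically. Theoretical analysis proves that an upper bound on the information loss of the algorithm exists. Simulation experiments on UCI benchmark data sets and real Macau meteorological data show that, compared with existing typical algorithms, the algorithm predicts minority-class samples more accurately and is numerically more stable.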

15.
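Imbalanced data classification is a current research focus in machine learning. Traditional classification algorithms usually assume a balanced data set and cannot be applied directly to learning from imbalanced data. This paper proposes an improved boosting algorithm for imbalanced classification based on feature selection. It weighs the importance of the data set's different types of attributes to the minority class and selects the attributes that are more meaningful for correctly predicting minority-class samples, which also reduces the dimensionality of the data. An imbalanced classification algorithm is then applied to bring the data to a balanced state. Finally, a new modification is proposed for the original algorithm's problem that the weights of misclassified samples grow too quickly; it effectively restrains the growth rate of the weights. Experimental results show that the algorithm effectively improves classification performance on imbalanced data, especially for the minority class.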

16.
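ISMOTE algorithm for imbalanced data sets   Total citations: 1, self-citations: 0, citations by others: 1
Xu Dandan, Wang Yong, Cai Lijun. 《计算机应用》 (Journal of Computer Applications), 2011, 31(9): 2399-2401
To improve classification performance on the minority class of imbalanced data sets, the ISMOTE algorithm is proposed. It performs random interpolation inside the n-dimensional sphere formed by a minority-class instance and its nearest minority-class neighbor, which reduces the degree of imbalance in the data distribution. Experiments on real data sets compare it with SMOTE and with classifying the imbalanced data directly. The results show that ISMOTE achieves higher classification accuracy and effectively improves classifier performance.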
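
Random interpolation inside an n-dimensional sphere might look like the sketch below. The abstract does not specify the sphere's center or radius; this sketch assumes it is centered on the minority instance with radius equal to the distance to its nearest minority neighbor, and samples uniformly inside it. All function names and data are hypothetical.

```python
import math
import random

def nearest_neighbour(x, others):
    return min(others, key=lambda o: math.dist(x, o))

def sample_in_ball(center, radius, rng):
    n = len(center)
    # A normalized Gaussian vector gives a uniformly random direction.
    direction = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(d * d for d in direction))
    # Scaling by u ** (1 / n) makes the point uniform in the ball's volume.
    r = radius * rng.random() ** (1.0 / n)
    return tuple(c + r * d / norm for c, d in zip(center, direction))

def ismote_like_sample(x, minority, rng):
    others = [m for m in minority if m != x]
    neighbour = nearest_neighbour(x, others)
    # Assumed geometry: ball centered on x, reaching its nearest minority neighbor.
    return sample_in_ball(x, math.dist(x, neighbour), rng)

rng = random.Random(42)
minority = [(1.0, 1.0), (1.5, 1.0), (4.0, 4.0)]
x = (1.0, 1.0)
new_point = ismote_like_sample(x, minority, rng)
print(new_point)
```

By contrast, plain SMOTE places the new point on the line segment between the instance and a neighbor.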

17.
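Multi-agent technology is applied to case retrieval for information systems. Combining CBR with the idea of Web Services, a model framework and operating workflow are proposed for a CBR-based multi-agent system for information system case retrieval. A case retrieval algorithm based on intelligent clustering is designed, and the retrieval process is optimized through the self-organizing learning of a neural network, which makes the multi-agent system highly autonomous, self-learning, and self-improving. The work offers a reference for the research and development of information system case retrieval systems.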

18.
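Wang Li, Chen Hongmei. 《计算机科学》 (Computer Science), 2018, 45(9): 260-265
When synthesizing samples, SMOTE (Synthetic Minority Over-sampling TEchnique) looks for the K nearest neighbors only within the minority class, so the density of minority-class samples is unchanged after oversampling. A new oversampling algorithm, NKSMOTE (New Kernel Synthetic Minority Over-Sampling Technique), is therefore proposed. The algorithm first uses a nonlinear mapping function to map the samples into a high-dimensional kernel space, then computes, in the kernel space, the K nearest neighbors of each minority-class sample among all samples, and finally assigns different oversampling rates to minority-class samples according to how strongly their distribution affects classification performance, which changes the imbalance ratio of the data set. Decision tree (DT), error back-propagation (BP), and random forest (RF) classifiers are used in the experiments, and several classic oversampling methods are compared with the proposed one in multiple groups of experiments. Results on UCI data sets show that NKSMOTE achieves better classification performance.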
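
Distances in a kernel space can be computed without the explicit mapping, using the identity ‖φ(x) − φ(y)‖² = K(x, x) − 2K(x, y) + K(y, y). The sketch below applies it to find the K nearest neighbors of a minority sample among all samples. The RBF kernel, `gamma`, the function names, and the data are assumptions; the abstract does not name the kernel, and the oversampling-rate step is not shown.

```python
import math

def rbf(x, y, gamma=1.0):
    # Assumed kernel; any positive-definite kernel could be passed instead.
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def kernel_distance(x, y, kernel=rbf):
    # Squared feature-space distance from kernel evaluations only.
    sq = kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)
    return math.sqrt(max(sq, 0.0))  # guard against tiny negative rounding

def k_nearest_in_kernel_space(x, samples, k, kernel=rbf):
    # Neighbors are searched among all samples, not only the minority class.
    others = [s for s in samples if s != x]
    return sorted(others, key=lambda s: kernel_distance(x, s, kernel))[:k]

samples = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (3.0, 3.0)]
neighbours = k_nearest_in_kernel_space((0.0, 0.0), samples, k=2)
print(neighbours)
```

With an RBF kernel the neighbor ordering coincides with the Euclidean one, so this example only demonstrates the mechanics; other kernels can produce a different ordering.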

19.
Representing biomedical knowledge is an essential task in intelligent systems for biomedical informatics. Case-based reasoning (CBR) holds the promise of representing contextual knowledge in a way that was not possible before with traditional knowledge-based methods. One main issue in biomedical CBR is dealing with the rate at which new knowledge is generated in biomedical fields, which often makes the content of a case base partially obsolete. This article proposes to use the concept of the prototypical case to ensure that a CBR system keeps up to date with current research advances in the biomedical field. Prototypical cases have served various purposes in biomedical CBR systems, among them organizing and structuring the memory, guiding the retrieval and reuse of cases, and bootstrapping a CBR system's memory when real cases are not available in sufficient quantity or quality. This paper emphasizes the different roles prototypical cases can play in CBR systems and presents knowledge maintenance as a very important novel role for them.

20.
This paper presents a simultaneous optimization method for a case-based reasoning (CBR) system using a genetic algorithm (GA) for financial forecasting. Prior research proposed many hybrid models of CBR and the GA for selecting a relevant feature subset or optimizing feature weights. Most of this research used the GA to improve only some of the architectural factors of the CBR model. However, the performance of the CBR model may be enhanced when these factors are considered simultaneously. In this study, the GA simultaneously optimizes multiple factors of the CBR system. Experimental results show that the GA approach to simultaneous optimization of the CBR model outperforms other conventional approaches for financial forecasting.
