首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
Failure mode (FM) and bearing capacity of reinforced concrete (RC) columns are key concerns in structural design and/or performance assessment procedures. The failure types, i.e., flexure, shear, or mix of the above two, will greatly affect the capacity and ductility of the structure. Meanwhile, the design methodologies for structures of different failure types will be totally different. Therefore, developing efficient and reliable methods to identify the FM and predict the corresponding capacity is of special importance for structural design/assessment management. In this paper, an intelligent approach is presented for FM classification and bearing capacity prediction of RC columns based on the ensemble machine learning techniques. The most typical ensemble learning method, adaptive boosting (AdaBoost) algorithm, is adopted for both classification and regression (prediction) problems. Totally 254 cyclic loading tests of RC columns are collected. The geometric dimensions, reinforcing details, material properties are set as the input variables, while the failure types (for classification problem) and peak capacity forces (for regression problem) are set as the output variables. The results indicate that the model generated by the AdaBoost learning algorithm has a very high accuracy for both FM classification (accuracy = 0.96) and capacity prediction (R2 = 0.98). Different learning algorithms are also compared and the results show that ensemble learning (especially AdaBoost) has better performance than single learning. In addition, the bearing capacity predicted by the AdaBoost is also compared to that by the empirical formulas provided by the design codes, which shows an obvious superior of the proposed method. In summary, the machine learning technique, especially the ensemble learning, can provide an alternate to the conventional mechanics-driven models in structural design in this big data time.  相似文献   

2.
Clustering ensemble integrates multiple base clustering results to obtain a consensus result and thus improves the stability and robustness of the single clustering method. Since it is natural to use a hypergraph to represent the multiple base clustering results, where instances are represented by nodes and base clusters are represented by hyperedges, some hypergraph based clustering ensemble methods are proposed. Conventional hypergraph based methods obtain the final consensus result by partitioning a pre-defined static hypergraph. However, since base clusters may be imperfect due to the unreliability of base clustering methods, the pre-defined hypergraph constructed from the base clusters is also unreliable. Therefore, directly obtaining the final clustering result by partitioning the unreliable hypergraph is inappropriate. To tackle this problem, in this paper, we propose a clustering ensemble method via structured hypergraph learning, i.e., instead of being constructed directly, the hypergraph is dynamically learned from base results, which will be more reliable. Moreover, when dynamically learning the hypergraph, we enforce it to have a clear clustering structure, which will be more appropriate for clustering tasks, and thus we do not need to perform any uncertain postprocessing, such as hypergraph partitioning. Extensive experiments show that, our method not only performs better than the conventional hypergraph based ensemble methods, but also outperforms the state-of-the-art clustering ensemble methods.  相似文献   

3.
尹玉  詹永照  姜震 《计算机应用》2019,39(8):2204-2209
在视频语义检测中,有标记样本不足会严重影响检测的性能,而且伪标签样本中的噪声也会导致集成学习基分类器性能提升不足。为此,提出一种伪标签置信选择的半监督集成学习算法。首先,在三个不同的特征空间上训练出三个基分类器,得到基分类器的标签矢量;然后,引入加权融合样本所属某个类别的最大概率与次大概率的误差和样本所属某个类别的最大概率与样本所属其他各类别的平均概率的误差,作为基分类器的标签置信度,并融合标签矢量和标签置信度得到样本的伪标签和集成置信度;接着,选择集成置信度高的样本加入到有标签的样本集,迭代训练基分类器;最后,采用训练好的基分类器集成协作检测视频语义概念。该算法在实验数据集UCF11上的平均准确率到达了83.48%,与Co-KNN-SVM算法相比,平均准确率提高了3.48个百分点。该算法选择的伪标签能体现样本所属类别与其他类别的总体差异性,又能体现所属类别的唯一性,可减少利用伪标签样本的风险,有效提高视频语义概念检测的准确率。  相似文献   

4.
This paper presents a method for improved ensemble learning, by treating the optimization of an ensemble of classifiers as a compressed sensing problem. Ensemble learning methods improve the performance of a learned predictor by integrating a weighted combination of multiple predictive models. Ideally, the number of models needed in the ensemble should be minimized, while optimizing the weights associated with each included model. We solve this problem by treating it as an example of the compressed sensing problem, in which a sparse solution must be reconstructed from an under-determined linear system. Compressed sensing techniques are then employed to find an ensemble which is both small and effective. An additional contribution of this paper, is to present a new performance evaluation method (a new pairwise diversity measurement) called the roulette-wheel kappa-error. This method takes into account the different weightings of the classifiers, and also reduces the total number of pairs of classifiers needed in the kappa-error diagram, by selecting pairs through a roulette-wheel selection method according to the weightings of the classifiers. This approach can greatly improve the clarity and informativeness of the kappa-error diagram, especially when the number of classifiers in the ensemble is large. We use 25 different public data sets to evaluate and compare the performance of compressed sensing ensembles using four different sparse reconstruction algorithms, combined with two different classifier learning algorithms and two different training data manipulation techniques. We also give the comparison experiments of our method against another five state-of-the-art pruning methods. These experiments show that our method produces comparable or better accuracy, while being significantly faster than the compared methods.  相似文献   

5.
Abstract Error Correcting Output Coding (ECOC) methods for multiclass classification present several open problems ranging from the trade-off between their error recovering capabilities and the learnability of the induced dichotomies to the selection of proper base learners and to the design of well-separated codes for a given multiclass problem. We experimentally analyse some of the main factors affecting the effectiveness of ECOC methods. We show that the architecture of ECOC learning machines influences the accuracy of the ECOC classifier, highlighting that ensembles of parallel and independent dichotomic Multi-Layer Perceptrons are well-suited to implement ECOC methods. We quantitatively evaluate the dependence among codeword bit errors using mutual information based measures, experimentally showing that a low dependence enhances the generalisation capabilities of ECOC. Moreover we show that the proper selection of the base learner and the decoding function of the reconstruction stage significantly affects the performance of the ECOC ensemble. The analysis of the relationships between the error recovering power, the accuracy of the base learners, and the dependence among codeword bits show that all these factors concur to the effectiveness of ECOC methods in a not straightforward way, very likely dependent on the distribution and complexity of the data.An erratum to this article can be found at  相似文献   

6.
ContextBlocking bugs are bugs that prevent other bugs from being fixed. Previous studies show that blocking bugs take approximately two to three times longer to be fixed compared to non-blocking bugs.ObjectiveThus, automatically predicting blocking bugs early on so that developers are aware of them, can help reduce the impact of or avoid blocking bugs. However, a major challenge when predicting blocking bugs is that only a small proportion of bugs are blocking bugs, i.e., there is an unequal distribution between blocking and non-blocking bugs. For example, in Eclipse and OpenOffice, only 2.8% and 3.0% bugs are blocking bugs, respectively. We refer to this as the class imbalance phenomenon.MethodIn this paper, we propose ELBlocker to identify blocking bugs given a training data. ELBlocker first randomly divides the training data into multiple disjoint sets, and for each disjoint set, it builds a classifier. Next, it combines these multiple classifiers, and automatically determines an appropriate imbalance decision boundary to differentiate blocking bugs from non-blocking bugs. With the imbalance decision boundary, a bug report will be classified to be a blocking bug when its likelihood score is larger than the decision boundary, even if its likelihood score is low.ResultsTo examine the benefits of ELBlocker, we perform experiments on 6 large open source projects – namely Freedesktop, Chromium, Mozilla, Netbeans, OpenOffice, and Eclipse containing a total of 402,962 bugs. We find that ELBlocker achieves F1 and EffectivenessRatio@20% scores of up to 0.482 and 0.831, respectively. On average across the 6 projects, ELBlocker improves the F1 and EffectivenessRatio@20% scores over the state-of-the-art method proposed by Garcia and Shihab by 14.69% and 8.99%, respectively. Statistical tests show that the improvements are significant and the effect sizes are large.ConclusionELBlocker can help deal with the class imbalance phenomenon and improve the prediction of blocking bugs. ELBlocker achieves a substantial and statistically significant improvement over the state-of-the-art methods, i.e., Garcia and Shihab’s method, SMOTE, OSS, and Bagging.  相似文献   

7.
《Pattern recognition》2014,47(2):854-864
In this work, a new one-class classification ensemble strategy called approximate polytope ensemble is presented. The main contribution of the paper is threefold. First, the geometrical concept of convex hull is used to define the boundary of the target class defining the problem. Expansions and contractions of this geometrical structure are introduced in order to avoid over-fitting. Second, the decision whether a point belongs to the convex hull model in high dimensional spaces is approximated by means of random projections and an ensemble decision process. Finally, a tiling strategy is proposed in order to model non-convex structures. Experimental results show that the proposed strategy is significantly better than state of the art one-class classification methods on over 200 datasets.  相似文献   

8.
Meta-learning is one of the latest research directions in machine learning, which is considered to be one of the most probably ways to realize strong artificial intelligence. Meta-learning focuses on seeking solutions for machines to learn like human beings do - to recognize things through only few sample data and quickly adapt to new tasks. Challenges occur in how to train an efficient machine model with limited labeled data, since the model is easily over-fitted. In this paper, we address this obvious but important problem and propose a metric-based meta-learning model, which combines attention mechanisms and ensemble learning method. In our model, we first design a dual path attention module which considers both channel attention and spatial attention module, and the attention modules have been stacked to conduct a meta-learner for few shot meta-learning. Then, we apply an ensemble method called snap-shot ensemble to the attention-based meta-learner in order to generate more models in a single episode. Features abstracted from the models are put into the metric-based architecture to compute a prototype for each class. Our proposed method intensifies the feature extracting ability of backbone network in meta-learner and reduces over-fitting through ensemble learning and metric learning method. Experimental results toward several meta-learning datasets show that our approach is effective.  相似文献   

9.
This article addresses the problem of identifying the most likely music performer, given a set of performances of the same piece by a number of skilled candidate pianists. We propose a set of very simple features for representing stylistic characteristics of a music performer, introducing ‘norm-based’ features that relate to a kind of ‘average’ performance. A database of piano performances of 22 pianists playing two pieces by Frédéric Chopin is used in the presented experiments. Due to the limitations of the training set size and the characteristics of the input features we propose an ensemble of simple classifiers derived by both subsampling the training set and subsampling the input features. Experiments show that the proposed features are able to quantify the differences between music performers. The proposed ensemble can efficiently cope with multi-class music performer recognition under inter-piece conditions, a difficult musical task, displaying a level of accuracy unlikely to be matched by human listeners (under similar conditions).  相似文献   

10.
The volunteer computing paradigm, along with the tailored use of peer-to-peer communication, has recently proven capable of solving a wide area of data-intensive problems in a distributed scenario. The Mining@Home framework is based on these paradigms and it has been implemented to run a wide range of distributed data mining applications. The efficiency and scalability of the architecture can be fully exploited when the overall task can be partitioned into distinct jobs that may be executed in parallel, and input data can be reused, which naturally leads to the use of data cachers. This paper explores the opportunities offered by Mining@Home for coping with the discovery of classifiers through the use of the bagging approach: multiple learners are used to compute models from the same input data, so as to extract a final model with high statistical accuracy. Analysis focuses on the evaluation of experiments performed in a real distributed environment, enriched with simulation assessment–to evaluate very large environments–and with an analytical investigation based on the iso-efficiency methodology. An extensive set of experiments allowed to analyze a number of heterogeneous scenarios, with different problem sizes, which helps to improve the performance by appropriately tuning the number of workers and the number of interconnected domains.  相似文献   

11.
王轩  张林  高磊  蒋昊坤 《计算机应用》2018,38(10):2772-2777
为应对抽样不均匀带来的影响,以基于代表的分类算法为基础,提出一种用于符号型数据分类的留一法集成学习分类算法(LOOELCA)。首先采用留一法获得n个小训练集,其中n为初始训练集大小。然后使用每个训练集构建独立的基于代表的分类器,并标注出分类错误的分类器及对象。最后,标注分类器和原始分类器形成委员会并对测试集对象进行分类。如委员会表决一致,则直接给该测试对象贴上类标签;否则,基于k最近邻(kNN)算法并利用标注对象对测试对象分类。在UCI标准数据集上的实验结果表明,LOOELCA与基于代表的粗糙集覆盖分类(RBC-CBNRS)算法相比,精度平均提升0.35~2.76个百分点,LOOELCA与ID3、J48、Naïve Bayes、OneR等方法相比也有更高的分类准确率。  相似文献   

12.
Compressing videos while maintaining an acceptable level of Quality of Experience (QoE) is indispensable. To this aim, a feasible method is to further increase the Quantization Parameter (QP) of video stream to eliminate visual redundancy, simultaneously utilizing perceptual characteristics of Human Visual System (HVS) to impose a threshold constraint on the maximum QP. In this paper, we employ Just Noticeable Distortion (JND) to characterize the aforementioned threshold constraint, thereby avoiding perceptual loss during QP refinement process. We propose an effective JND-based algorithm for QP optimization, in which a video saliency detection is introduced to extract regions of interest, a refinement model based on a lightweight network is designed to predict QP value and an ensemble learning method to improve generalization performance. Theoretical analysis and experimental results demonstrate that the proposed algorithm has been successfully applied to Versatile Video Coding (VVC) to achieve significant bitrate reduction without sacrificing perceived quality.  相似文献   

13.
In multi-instance learning, the training set is composed of labeled bags each consists of many unlabeled instances, that is, an object is represented by a set of feature vectors instead of only one feature vector. Most current multi-instance learning algorithms work through adapting single-instance learning algorithms to the multi-instance representation, while this paper proposes a new solution which goes at an opposite way, that is, adapting the multi-instance representation to single-instance learning algorithms. In detail, the instances of all the bags are collected together and clustered into d groups first. Each bag is then re-represented by d binary features, where the value of the ith feature is set to one if the concerned bag has instances falling into the ith group and zero otherwise. Thus, each bag is represented by one feature vector so that single-instance classifiers can be used to distinguish different classes of bags. Through repeating the above process with different values of d, many classifiers can be generated and then they can be combined into an ensemble for prediction. Experiments show that the proposed method works well on standard as well as generalized multi-instance problems. Zhi-Hua Zhou is currently Professor in the Department of Computer Science & Technology and head of the LAMDA group at Nanjing University. His main research interests include machine learning, data mining, information retrieval, and pattern recognition. He is associate editor of Knowledge and Information Systems and on the editorial boards of Artificial Intelligence in Medicine, International Journal of Data Warehousing and Mining, Journal of Computer Science & Technology, and Journal of Software. He has also been involved in various conferences. Min-Ling Zhang received his B.Sc. and M.Sc. degrees in computer science from Nanjing University, China, in 2001 and 2004, respectively. Currently he is a Ph.D. candidate in the Department of Computer Science & Technology at Nanjing University and a member of the LAMDA group. His main research interests include machine learning and data mining, especially in multi-instance learning and multi-label learning.  相似文献   

14.
Emotion is a status that combines people’s feelings, thoughts, and behaviors, and plays a crucial role in communication among people. Large studies suggest that human emotions can also be conveyed through online interactions. Previous studies have addressed the mechanism of emotional contagion; however, emotional contagion, through users of online social networks, has not yet been thoroughly researched. Therefore, in this study, initially, the definition of emotion roles, which may play an important role in emotional contagion, is introduced. On this basis, an emotion role mining approach based on multiview ensemble learning (ERM-ME) is proposed to detect emotion roles in social networks by fusing the information contained in different features. The ERM-ME approach includes three stages: detection of emotional communities, local fusion, and global fusion. First, ERM-ME divides emotional communities based on user emotional preferences. Then, emotional features are employed to train basic classifiers, which are then combined into meta-classifiers. Finally, an accuracy-based weighted voting scheme is used to integrate the results of meta-classifiers to achieve a more accurate and comprehensive classification. Experiments and evaluations are performed using Flickr and Microblog datasets to verify the practicability and effectiveness of the proposed method. Extensive experimental results show that the proposed approach outperforms alternative methods. The micro F-score is used as an evaluation indicator. Using the ERM-ME approach, the indicator is improved by approximately 1.09%–14.57% on Flickr and 5.19%–8.95% on Microblog, compared with Graph Convolutional Network, random forest, AdaBoost, bagging, and stacking.  相似文献   

15.
蔡铁  伍星  李烨 《计算机应用》2008,28(8):2091-2093
为构造集成学习中具有差异性的基分类器,提出基于数据离散化的基分类器构造方法,并用于支持向量机集成。该方法采用粗糙集和布尔推理离散化算法处理训练样本集,能有效删除不相关和冗余的属性,提高基分类器的准确性和差异性。实验结果表明,所提方法能取得比传统集成学习算法Bagging和Adaboost更好的性能。  相似文献   

16.
Graph-based semi-supervised classification depends on a well-structured graph. However, it is difficult to construct a graph that faithfully reflects the underlying structure of data distribution, especially for data with a high dimensional representation. In this paper, we focus on graph construction and propose a novel method called semi-supervised ensemble classification in subspaces, SSEC in short. Unlike traditional methods that execute graph-based semi-supervised classification in the original space, SSEC performs semi-supervised linear classification in subspaces. More specifically, SSEC first divides the original feature space into several disjoint feature subspaces. Then, it constructs a neighborhood graph in each subspace, and trains a semi-supervised linear classifier on this graph, which will serve as the base classifier in an ensemble. Finally, SSEC combines the obtained base classifiers into an ensemble classifier using the majority-voting rule. Experimental results on facial images classification show that SSEC not only has higher classification accuracy than the competitive methods, but also can be effective in a wide range of values of input parameters.  相似文献   

17.
标题分类是对一个标题性语句进行分类,通常这个标题是不超过20个字的短文本,内容精炼概括性强。针对标题文本的特征稀疏性和含义不确定性,提出了一种融合随机森林与贝叶斯多项式的标题分类算法。该算法把贝叶斯多项式模型引入到随机森林底层分类器构建过程中,同时利用随机森林附带的OOB数据提出了一种基于二维权重分布的投票机制。最后在图书馆真实书目数据上进行实验,针对分类性能与当前基于LDA主题扩展的SVM算法进行对比。实验结果表明在一定条件下,该方法性能稳定,表现较佳。  相似文献   

18.
Generalized additive models (GAMs) are a generalization of generalized linear models (GLMs) and constitute a powerful technique which has successfully proven its ability to capture nonlinear relationships between explanatory variables and a response variable in many domains. In this paper, GAMs are proposed as base classifiers for ensemble learning. Three alternative ensemble strategies for binary classification using GAMs as base classifiers are proposed: (i) GAMbag based on Bagging, (ii) GAMrsm based on the Random Subspace Method (RSM), and (iii) GAMens as a combination of both. In an experimental validation performed on 12 data sets from the UCI repository, the proposed algorithms are benchmarked to a single GAM and to decision tree based ensemble classifiers (i.e. RSM, Bagging, Random Forest, and the recently proposed Rotation Forest). From the results a number of conclusions can be drawn. Firstly, the use of an ensemble of GAMs instead of a single GAM always leads to improved prediction performance. Secondly, GAMrsm and GAMens perform comparably, while both versions outperform GAMbag. Finally, the value of using GAMs as base classifiers in an ensemble instead of standard decision trees is demonstrated. GAMbag demonstrates performance comparable to ordinary Bagging. Moreover, GAMrsm and GAMens outperform RSM and Bagging, while these two GAM ensemble variations perform comparably to Random Forest and Rotation Forest. Sensitivity analyses are included for the number of member classifiers in the ensemble, the number of variables included in a random feature subspace and the number of degrees of freedom for GAM spline estimation.  相似文献   

19.
齐峰  刘希玉 《控制与决策》2010,25(11):1684-1688
针对数据挖掘领域分类问题的特点.提出了基于多神经树集成的分类模型(CMBNTE).该模型利用改进遗传规划算法和粒子群算法,实现单个神经树模型的优化;借鉴集成学习思想,将多个神经树模型组合成最终的分类模型.在6个UCI数据集上的实验结果表明,该模型能较好地解决分类问题,尤其适用于多分类属性的复杂分类问题.  相似文献   

20.
Self-adaptation is an inherent part of any natural and intelligent system. Specifically, it is about the ability of a system to reconcile its requirements or goal of existence with the environment it is interacting with, by adopting an optimal behavior. Self-adaptation becomes crucial when the environment changes dynamically over time. In this paper, we investigate self-adaptation of classification systems at three levels: (1) natural adaptation of the base learners to change in the environment, (2) contributive adaptation when combining the base learners in an ensemble, and (3) structural adaptation of the combination as a form of dynamic ensemble. The present study focuses on neural network classification systems to handle a special facet of self-adaptation, that is, incremental learning (IL). With IL, the system self-adjusts to accommodate new and possibly non-stationary data samples arriving over time. The paper discusses various IL algorithms and shows how the three adaptation levels are inherent in the system's architecture proposed and how this architecture is efficient in dealing with dynamic change in the presence of various types of data drift when applying these IL algorithms.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号