Similar literature
Found 20 similar articles (search time: 27 ms)
1.
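Ensemble classification combines several weak classifiers according to some rule and can effectively improve classification performance. In the combination process, the weak classifiers typically differ in how much they contribute to the final result. The extreme learning machine (ELM) is a recently proposed learning algorithm for training single-hidden-layer feedforward neural networks. Using ELM as the base classifier, this paper proposes a weighted ELM ensemble method based on differential evolution, which uses the differential evolution algorithm to optimize the weight of each base classifier in the ensemble. Experimental results show that, compared with simple-voting ensembles and AdaBoost-based ensembles, the proposed method achieves higher classification accuracy and better generalization.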
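The weighted combination described in this abstract can be illustrated with a minimal sketch. It shows only the weighted vote and an accuracy function that a weight optimizer such as differential evolution could maximize; the function names, the label format, and the use of accuracy as the fitness are assumptions for illustration, not details taken from the paper.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Combine one predicted label per base classifier into a single label."""
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per base classifier")
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    # The label with the largest accumulated weight wins.
    return max(scores, key=scores.get)


def ensemble_accuracy(all_predictions, weights, true_labels):
    """Assumed fitness: fraction of samples the weighted vote gets right.

    all_predictions[i] holds the labels the base classifiers assign to sample i.
    """
    hits = sum(
        weighted_vote(preds, weights) == truth
        for preds, truth in zip(all_predictions, true_labels)
    )
    return hits / len(true_labels)
```

An optimizer would repeatedly propose a weight vector and keep those that raise `ensemble_accuracy` on a validation set; with uniform weights the function reduces to simple majority voting.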

2.
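Performance comparison of several machine learning methods in face recognition   Total citations: 2, self-citations: 1, citations by others: 2
BP neural networks, RBF neural networks, support vector machines (SVM), and ensemble learning are currently the four most widely used machine learning methods. This paper applies these four methods to face recognition and tests and evaluates their performance on the ORL face image database. The results show that SVM and ensemble learning performed best in the experiments and are the most suitable as feature classifiers for face recognition.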

3.
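This paper combines the extreme learning machine (ELM) algorithm with the rotation forest algorithm and proposes RF-ELM, an ensemble learning model that uses ELM as the base classifier within the rotation forest framework. Three groups of prediction experiments were carried out on 8 datasets. Based on the results, the paper discusses how the number of hidden-layer neurons in ELM affects prediction results, as well as the instability of a single ELM model's predictions. RF-ELM is then compared with a single ELM model and with a Bagging-based ELM ensemble. Two sets of comparison experiments, on stability and on prediction accuracy, show that ensemble learning effectively improves the performance of ELM models and that RF-ELM is more stable and more accurate than the other two models, confirming that RF-ELM is an effective ELM ensemble learning model.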

4.
Many multi-class classification algorithms in statistics and machine learning combine several binary classifiers in order to construct an overall classifier. In the popular pairwise ensemble, one classifier is built for each pair of classes, resulting in pairwise bipartite rankings. In contrast, ordinal regression algorithms consider a single ranking function for several ordered classes. It is known in the literature that pairwise ensembles can be useful for ordinal regression. However, can single ranking models make a contribution to multi-class classification? The answer to this question should be affirmative, as supported by theoretical results presented in this article. We conduct a formal analysis of the consistency of pairwise bipartite rankings by uncovering the conditions under which they can be equivalently expressed in terms of a single ranking. Similar to the utility representability of pairwise preference relations, it turns out that transitivity plays a crucial role in the characterization of the ranking representability of pairwise bipartite rankings. To this end, we introduce the new concepts of strict ranking representability, a restrictive condition that can be verified easily, and AUC ranking representability, a practically more useful condition that is more difficult to verify. However, the link between pairwise bipartite rankings and dice games allows us to formulate necessary transitivity conditions for AUC ranking representability. A sufficient condition, on the other hand, is obtained by introducing a new type of transitivity that can be verified by solving an integer quadratic program.

5.
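Traditional lightning prediction methods usually adopt a single best machine learning algorithm and rarely account for phenomena such as the spatio-temporal variation of meteorological data. To address this, a short-term lightning forecasting algorithm that uses multiple machine learning models under an ensemble strategy is proposed. First, attribute reduction is applied to the meteorological data to lower its dimensionality. Second, several heterogeneous machine learning classifiers are trained on the dataset, and the best base classifiers are selected by prediction quality. Finally, weights are trained for the selected base classifiers and combined with the ensemble strategy to produce the final classifier. Experiments show that the method outperforms the traditional single-best approach, improving average prediction accuracy by 9.5%.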

6.
The paper proposes a semantic-based metadata framework for personalised interaction with TV media in a connected home context. Our approach allows the current home media centres to go beyond the simple concept of electronic programme guides and to offer the users a personalised media experience in an ambient home environment. The user’s characteristics, preferences and context are used to personalise the user’s experience of viewing and interacting with multimedia content on different heterogeneous devices. The TV-Anytime specification provides the basis for the metadata framework for handling content from IP, digital broadcast, and Blu-ray disc sources.

7.
This article proposes a new approach to improve the classification performance of remotely sensed images with an aggregative model based on classifier ensemble (AMCE). AMCE is a multi-classifier system with two procedures, namely ensemble learning and predictions combination. Two ensemble algorithms (Bagging and AdaBoost.M1) were used in the ensemble learning process to stabilize and improve the performance of single classifiers (i.e. maximum likelihood classifier, minimum distance classifier, back propagation neural network, classification and regression tree, and support vector machine (SVM)). Prediction results from single classifiers were integrated according to a diversity measurement with an averaged double-fault indicator and different combination strategies (i.e. weighted vote, Bayesian product, logarithmic consensus, and behaviour knowledge space). The suitability of the AMCE model was examined using a Landsat Thematic Mapper (TM) image of Dongguan city (Guangdong, China), acquired on 2 January 2009. Experimental results show that the proposed model was significantly better than the most accurate single classification (i.e. SVM) in terms of classification accuracy (i.e. from 88.83% to 92.45%) and kappa coefficient (i.e. from 0.8624 to 0.9088). A stepwise comparison illustrates that both ensemble learning and predictions combination with the AMCE model improved classification.
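The double-fault indicator mentioned in this abstract is commonly defined as the fraction of samples that two classifiers both misclassify. The sketch below computes it for one pair and averages it over all pairs; the abstract does not say how the averaging is done, so the plain mean over pairs is an assumption, as are the function names.

```python
from itertools import combinations


def double_fault(pred_a, pred_b, truth):
    """Fraction of samples that both classifiers label incorrectly."""
    both_wrong = sum(a != t and b != t for a, b, t in zip(pred_a, pred_b, truth))
    return both_wrong / len(truth)


def averaged_double_fault(all_predictions, truth):
    """Mean pairwise double-fault over every pair of classifiers (assumed averaging).

    all_predictions holds one list of predicted labels per classifier.
    Lower values mean the classifiers rarely fail on the same samples.
    """
    pairs = list(combinations(all_predictions, 2))
    if not pairs:
        raise ValueError("need at least two classifiers")
    return sum(double_fault(a, b, truth) for a, b in pairs) / len(pairs)
```

A low averaged value suggests the classifiers make their errors on different samples, which is the situation in which combining them tends to help.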

8.
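A classification ensemble algorithm based on local random subspaces   Total citations: 1, self-citations: 0, citations by others: 1
Classifier ensemble learning is one of the current hot topics in machine learning research. However, for high-dimensional data, the classical fully random approach struggles to guarantee the performance of the sub-classifiers. This paper therefore proposes a classification ensemble algorithm based on local random subspaces. The algorithm first applies a feature selection method to obtain an effective feature sequence, then divides the sequence into several segments and samples randomly within each segment according to that segment's sampling ratio, which improves both the performance and the diversity of the sub-classifiers. Experiments on 5 UCI datasets and 5 gene datasets show that the method outperforms a single classifier and, in most cases, outperforms classical classification ensemble methods.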
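The segment-wise sampling step in this abstract can be sketched as follows. The sketch assumes the features are already ranked by some feature selection method, that segments are of equal size, and that each segment's sampling ratio is supplied by the caller; none of these specifics is given in the abstract, and the function name is hypothetical.

```python
import random


def local_random_subspace(ranked_features, ratios, rng):
    """Draw a feature subset segment by segment from a ranked feature list.

    ranked_features: feature indices ordered from most to least relevant.
    ratios: one sampling ratio in [0, 1] per segment (assumed equal-size segments).
    rng: a random.Random instance, so results are reproducible.
    """
    n_segments = len(ratios)
    size = -(-len(ranked_features) // n_segments)  # ceiling division
    subset = []
    for i, ratio in enumerate(ratios):
        segment = ranked_features[i * size:(i + 1) * size]
        k = round(ratio * len(segment))
        # Sample without replacement inside this segment only.
        subset.extend(rng.sample(segment, k))
    return subset
```

Calling the function once per sub-classifier with a higher ratio for the top-ranked segment keeps most of the strong features in every subset while the random draw still varies the subsets across sub-classifiers.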

9.
The TV-Anytime standard describes the structures of categories of digital TV program metadata, as well as user profile metadata for TV programs. We describe a natural language (NL) model for the users to interact with the TV-Anytime metadata and preview TV programs from their mobile devices. The language fully utilises the TV-Anytime metadata specifications (upper ontologies), as well as domain-specific ontologies. The interaction model does not use clarification dialogues, but it uses the user profiles as well as TV-Anytime metadata information and ontologies to rank the possible responses in case of ambiguities. We describe implementations of the model that run on a PDA and on a mobile phone, and manage the metadata on a remote TV-Anytime-compatible TV set. We present user evaluations of the approach. Finally, we propose a generalised implementation framework that can be used to easily provide NL interfaces for mobile devices for different applications and ontologies.

10.
Objective: Manually evaluating machine learning algorithms and selecting a suitable classifier from a list of candidate classifiers is a highly time-consuming and challenging task. If the selection is not done carefully and accurately, the resulting classification model will not deliver the expected performance. In this study, we present an accurate multi-criteria decision making methodology (AMD) which empirically evaluates and ranks classifiers and allows end users or experts to choose the top-ranked classifier for their applications to learn and build classification models.
Methods and material: Existing classifier performance analysis and recommendation methodologies lack (a) an appropriate method for selecting suitable evaluation criteria, (b) a relative consistent weighting mechanism, (c) fitness assessment of the classifiers' performances, and (d) satisfaction of various constraints during the analysis process. To assist machine learning practitioners in the selection of suitable classifier(s), the AMD methodology is proposed. It presents an expert group-based criteria selection method; a relative consistent weighting scheme; a new ranking method, called optimum performance ranking criteria, based on multiple evaluation metrics, statistical significance and fitness assessment functions; and implicit and explicit constraint satisfaction at the time of analysis. For ranking the classifiers' performance, the proposed ranking method integrates the Wgt.Avg.F-score, CPUTimeTesting, CPUTimeTraining, and Consistency measures using the technique for order of preference by similarity to ideal solution (TOPSIS). The final relative closeness scores produced by TOPSIS are ranked, and practitioners select the best-performing (top-ranked) classifier for the problem at hand.
Findings: Based on extensive experiments performed on 15 publicly available UCI and OpenML datasets using 35 classification algorithms from heterogeneous families of classifiers, an average Spearman's rank correlation coefficient of 0.98 is observed. The AMD method thus showed improved performance, with an average Spearman's rank correlation coefficient of 0.98 compared with correlation coefficients of 0.83 and 0.045 for the state-of-the-art ranking methods performance of algorithms (PAlg) and adjusted ratio of ratios (ARR).
Conclusion and implication: The evaluation, empirical analysis of results and comparison with state-of-the-art methods demonstrate the feasibility of the AMD methodology, especially the selection and weighting of the right evaluation criteria and the accurate ranking and selection of optimum-performance classifier(s) for the user's application data. AMD reduces experts' time and effort and improves system performance by using the suitable classifier recommended by the AMD methodology.
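The TOPSIS step named in this abstract can be sketched in its standard textbook form: normalize each criterion, weight it, and score every alternative by its relative closeness to the ideal solution. The sketch is not the AMD implementation; the vector normalization, the weights, and which criteria are treated as benefits or costs are assumptions supplied by the caller.

```python
import math


def topsis(matrix, weights, benefit):
    """Return the relative closeness score of each alternative (higher is better).

    matrix: one row per alternative, one column per criterion.
    weights: one weight per criterion.
    benefit: True where larger is better (e.g. F-score), False where smaller
             is better (e.g. CPU time).
    """
    n_criteria = len(weights)
    # Vector-normalise each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_criteria)]
    weighted = [
        [weights[j] * row[j] / norms[j] if norms[j] else 0.0 for j in range(n_criteria)]
        for row in matrix
    ]
    columns = list(zip(*weighted))
    ideal = [max(c) if b else min(c) for c, b in zip(columns, benefit)]
    anti_ideal = [min(c) if b else max(c) for c, b in zip(columns, benefit)]
    scores = []
    for row in weighted:
        d_best = math.dist(row, ideal)        # distance to the ideal solution
        d_worst = math.dist(row, anti_ideal)  # distance to the anti-ideal solution
        total = d_best + d_worst
        scores.append(d_worst / total if total else 0.0)
    return scores
```

For the four measures listed above, a caller would mark the F-score and Consistency columns as benefits and the two CPU-time columns as costs, then sort the classifiers by descending score.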

11.
This paper presents the cluster-based ensemble classifier, an approach to generating an ensemble of classifiers using multiple clusters within classified data. Clustering is incorporated to partition the data set into multiple clusters of highly correlated data that are difficult to separate otherwise, and different base classifiers are used to learn class boundaries within the clusters. As the different base classifiers engage on different difficult-to-classify subsets of the data, the learning of the base classifiers is more focussed and accurate. A selection rather than fusion approach achieves the final verdict on patterns of unknown classes. The impact of clustering on the learning parameters and accuracy of a number of learning algorithms including neural network, support vector machine, decision tree and k-NN classifier is investigated. A number of benchmark data sets from the UCI machine learning repository were used to evaluate the cluster-based ensemble classifier, and the experimental results demonstrate its superiority over bagging and boosting.

12.
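Automatic smart contract classification system based on semantic embedding models and transaction information   Total citations: 1, self-citations: 0, citations by others: 1
As a breakthrough extension of blockchain technology, smart contracts let users implement custom code logic on the blockchain, making blockchain technology simpler and easier to use. With the rapid growth of smart contract code, managing and organizing massive amounts of it is becoming increasingly challenging. AI-based code classification systems can automatically categorize code according to its textual information, helping people manage and organize code. Taking smart contracts on the Ethereum platform as an example, and given that word embedding models can capture the semantic information of code, this paper proposes a smart contract classification system based on word embedding models. In addition, because every smart contract is associated with a series of transactions, we use the transaction information of smart contracts to gain a deeper understanding of their logical behavior. To the best of our knowledge, this is the first research attempt at automatic classification of smart contract code. Test results show that the system achieves fairly satisfactory classification performance.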

13.
Improving the accuracy of machine learning algorithms is vital in designing high performance computer-aided diagnosis (CADx) systems. Research has shown that the performance of a base classifier might be enhanced by ensemble classification strategies. In this study, we construct rotation forest (RF) ensemble classifiers of 30 machine learning algorithms to evaluate their classification performances using Parkinson's, diabetes and heart disease datasets from the literature. In the experiments, first the feature dimension of the three datasets is reduced using the correlation based feature selection (CFS) algorithm. Second, classification performances of 30 machine learning algorithms are calculated for the three datasets. Third, 30 classifier ensembles are constructed based on the RF algorithm to assess performances of the respective classifiers with the same disease data. All the experiments are carried out with a leave-one-out validation strategy and the performances of the 60 algorithms are evaluated using three metrics: classification accuracy (ACC), kappa error (KE) and area under the receiver operating characteristic (ROC) curve (AUC). Base classifiers achieved average accuracies of 72.15%, 77.52% and 84.43% for the diabetes, heart and Parkinson's datasets, respectively. The RF classifier ensembles produced average accuracies of 74.47%, 80.49% and 87.13% for the respective diseases. RF, a newly proposed classifier ensemble algorithm, might be used to improve the accuracy of miscellaneous machine learning algorithms to design advanced CADx systems.

14.
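To improve prediction accuracy, this paper combines the stacking ensemble framework from machine learning with multiple classifiers to learn label distributions, and proposes a heterogeneous ensemble learning algorithm based on label distribution learning (HELA-LDL). The algorithm builds a two-layer model: the first layer performs heterogeneous ensemble learning on the sample data by combining classifiers and fuses their learning results; the fused results are fed to a second-layer classifier, whose prediction is a label distribution with confidence values. Comparative experiments on dedicated datasets show that HELA-LDL can exploit the strengths of the individual algorithms in different scenarios, and a stability analysis further demonstrates the effectiveness of the algorithm.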

15.
Different classifiers with different characteristics and methodologies can complement each other and cover their internal weaknesses; so classifier ensemble is an important approach to handle the weakness of single classifier based systems. In this article we explore an automatic and fast function to approximate the accuracy of a given classifier on a typical dataset. Then employing the function, we can convert the ensemble learning to an optimisation problem. So, in this article, the target is to achieve a model to approximate the performance of a predetermined classifier over each arbitrary dataset. According to this model, an optimisation problem is designed and a genetic algorithm is employed as an optimiser to explore the best classifier set in each subspace. The proposed ensemble methodology is called classifier ensemble based on subspace learning (CEBSL). CEBSL is examined on some datasets and it shows considerable improvements.

16.
This approach proposes the creation and management of adaptive learning systems by combining component technology, semantic metadata, and adaptation rules. A component model allows interaction among components that share consistent assumptions about what each provides and each requires of the other. It allows indexing, using, reusing, and coupling of components in different contexts, powering adaptation. Our claim is that semantic metadata are required to allow real reuse and assembly of educational components. Finally, a rule language is used to define strategies to rewrite the user query and the user model. The former allows searching for components that develop concepts not appearing in the user query but related to user goals, whereas the latter allows inferring user knowledge that is not explicit in the user model.
John Freddy Duitama received his M.Sc. degree in system engineering from the University of Antioquia, Colombia (South America). He is currently a doctoral candidate at the GET - Institut National des Télécommunications, Evry, France. This work is sponsored by the University of Antioquia, where he is an assistant professor. His research interests include the semantic web and web-based learning systems, educational metadata and learning objects.
Bruno Defude received his Ph.D. in Computer Science from the University of Grenoble (I.N.P.G) in 1986. He is currently Professor in the Department of Computer Science at the GET - Institut National des Télécommunications, Evry, France, where he leads the SIMBAD project (Semantic Interoperability for MoBile and ADaptive applications). His major fields of research interest are databases and the semantic web, specifically personalized data access, adaptive systems, metadata, interoperability and semantic peer-to-peer systems, with e-learning as a privileged application area. He is a member of ACM SIGMOD.
Amel Bouzeghoub received a Ph.D. in Computer Sciences at Pierre et Marie Curie University, France. In 2000, she joined the Computer Sciences Department of GET-INT (Institut National des Télécommunications) at Evry, France, as an associate professor. Her research interests include topics related to Web-based Learning Systems, Semantic Metadata for learning resources, Adaptive Learning Systems and Intelligent Tutoring Systems.
Claire Lecocq received an Engineer Degree and a Ph.D. in Computer Sciences in 1994 and 1999, respectively. In 1997, she joined the Computer Sciences Department at GET-INT (Institut National des Télécommunications) of Evry, France, as an associate professor. Her first research interests included spatial databases and visual query languages. She is now working on adaptive learning systems, particularly on semantic metadata and user models.

17.
This paper presents a method for combining domain knowledge and machine learning (CDKML) for classifier generation and online adaptation. The method exploits advantages in domain knowledge and machine learning as complementary information sources. Whereas machine learning may discover patterns in interest domains that are too subtle for humans to detect, domain knowledge may contain information on a domain not present in the available domain dataset. CDKML has three steps. First, prior domain knowledge is enriched with relevant patterns obtained by machine learning to create an initial classifier. Second, genetic algorithms refine the classifier. Third, the classifier is adapted online on the basis of user feedback using the Markov decision process. CDKML was applied in fall detection. Tests showed that the classifiers developed by CDKML have better performance than machine-learning classifiers generated on a training dataset that does not adequately represent all real-life cases of the learned concept. The accuracy of the initial classifier was 10 percentage points higher than the best machine-learning classifier and the refinement added 3 percentage points. The online adaptation improved the accuracy of the refined classifier by an additional 15 percentage points.

18.
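A survey of semi-supervised ensemble learning   Total citations: 3, self-citations: 0, citations by others: 3
Semi-supervised learning and ensemble learning are two very important research directions in machine learning. Semi-supervised learning focuses on using both labeled and unlabeled samples to obtain high-performance classifiers, while ensemble learning aims to combine multiple learners to improve the accuracy of weak learners. Semi-supervised ensemble learning is a new machine learning approach that combines the two to improve the generalization performance of classifiers. First, an analysis of the development of semi-supervised ensemble learning shows that it originated from disagreement-based semi-supervised learning methods. Then, existing semi-supervised ensemble learning methods are analyzed and divided into two broad categories, semi-supervised-based ensemble learning and ensemble-based semi-supervised learning, and the main methods are introduced. Finally, existing research is summarized and open problems worth future study are discussed.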

19.
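Introducing the idea of ensemble learning into incremental learning can significantly improve learning performance. Most recent research on ensemble-based incremental learning combines multiple homogeneous classifiers by weighted voting, which does not adequately resolve the stability-plasticity dilemma in incremental learning. To address this, a heterogeneous classifier ensemble algorithm for incremental learning is proposed. During training, to make the model more stable, the algorithm trains multiple base classifiers on new data and adds them to the heterogeneous ensemble, while a locality-sensitive hash table stores a sketch of the data for later nearest-neighbor lookup of test samples. To adapt to continually changing data, newly obtained data is also used to update the voting weights of the base classifiers in the ensemble. When predicting the class of a test sample, the data in the locality-sensitive hash table that are similar to the test sample serve as a bridge for computing each base classifier's dynamic weight for that sample; the voting weights and dynamic weights of the base classifiers are then combined to determine the sample's class. Comparative experiments show that the incremental algorithm has relatively high stability and generalization ability.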

20.
Hu Li  Ye Wang  Hua Wang  Bin Zhou 《World Wide Web》2017,20(6):1507-1525
Imbalanced streaming data is commonly encountered in real-world data mining and machine learning applications, and has attracted much attention in recent years. In practice, imbalanced data and streaming data are normally encountered together; however, little research has addressed the two types of data together. In this paper, we propose a multi-window based ensemble learning method for the classification of imbalanced streaming data. Three types of windows are defined to store the current batch of instances, the latest minority instances, and the ensemble classifier. The ensemble classifier consists of a set of latest sub-classifiers, and the instances employed to train each sub-classifier. All sub-classifiers are weighted prior to predicting the class labels of newly arriving instances, and new sub-classifiers are trained only when the precision is below a predefined threshold. Extensive experiments on synthetic datasets and real-world datasets demonstrate that the new approach can efficiently and effectively classify imbalanced streaming data, and generally outperforms existing approaches.
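The window mechanics in this abstract can be illustrated with a rough sketch. It keeps a bounded window of recent minority instances and a bounded window of sub-classifiers, re-weights the sub-classifiers on each batch, and trains a new one only when precision falls below a threshold. The class name, the `train` callable, the accuracy-based weights, and the use of minority-class precision are all assumptions made for illustration; the paper's actual rules are not given in the abstract.

```python
from collections import deque


class MultiWindowEnsemble:
    def __init__(self, train, max_minority=100, max_members=5, threshold=0.8):
        self.train = train  # hypothetical: callable(list of (x, y)) -> predict function
        self.minority = deque(maxlen=max_minority)  # latest minority instances
        self.members = deque(maxlen=max_members)    # (predict function, weight) pairs
        self.threshold = threshold

    def predict(self, x):
        """Weighted vote of the current sub-classifiers; None if there are none."""
        votes = {}
        for predict_fn, weight in self.members:
            label = predict_fn(x)
            votes[label] = votes.get(label, 0.0) + weight
        return max(votes, key=votes.get) if votes else None

    def process_batch(self, batch, minority_label):
        """Update weights and windows from one labelled batch; return the precision."""
        if not batch:
            raise ValueError("batch must not be empty")
        # Assumed weighting rule: each member's accuracy on the current batch.
        self.members = deque(
            ((fn, sum(fn(x) == y for x, y in batch) / len(batch)) for fn, _ in self.members),
            maxlen=self.members.maxlen,
        )
        flagged = [y for x, y in batch if self.predict(x) == minority_label]
        precision = sum(y == minority_label for y in flagged) / len(flagged) if flagged else 0.0
        if precision < self.threshold:
            # Train on the batch plus stored minority instances to offset the imbalance.
            self.members.append((self.train(list(batch) + list(self.minority)), 1.0))
        self.minority.extend((x, y) for x, y in batch if y == minority_label)
        return precision
```

Because both windows are bounded deques, the oldest minority instances and the oldest sub-classifiers are dropped automatically as new ones arrive.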
