Similar literature
20 similar documents found
1.
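Machine learning is now widely applied in data mining, image processing, natural language processing, and biometric recognition. The "no free lunch" (NFL) theorem in machine learning states that no single algorithm is suited to every problem, a point that is especially relevant to supervised learning. One should therefore try several different algorithms on a problem, evaluate them on a "test set", and select the best one. Based on the development of machine learning, the author analyzes the strengths and weaknesses of several common algorithms and discusses their prospects.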
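The evaluate-and-select procedure described in this abstract can be sketched as follows. This is a minimal illustration, not the author's experiment: the synthetic one-dimensional data, the three candidate models (`fit_majority`, `fit_threshold`, `fit_one_nn`), and the 300/100 split are all assumptions made for the example.

```python
import random

def make_data(n, seed):
    """Synthetic data (assumption): class 0 is centred at 0.0, class 1 at 2.0."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        data.append((rng.gauss(2.0 * label, 1.0), label))
    return data

def fit_majority(train):
    """Baseline: always predict the most frequent training label."""
    ones = sum(y for _, y in train)
    majority = 1 if ones * 2 >= len(train) else 0
    return lambda x: majority

def fit_threshold(train):
    """Predict 1 when x lies above the midpoint of the two class means."""
    means = []
    for c in (0, 1):
        xs = [x for x, y in train if y == c]
        means.append(sum(xs) / len(xs))
    cut = (means[0] + means[1]) / 2
    return lambda x: 1 if x > cut else 0

def fit_one_nn(train):
    """Predict the label of the closest training point."""
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]

def accuracy(model, data):
    """Fraction of examples in data that the model labels correctly."""
    return sum(model(x) == y for x, y in data) / len(data)

data = make_data(400, seed=0)
train, test = data[:300], data[300:]  # held-out test set

candidates = {"majority": fit_majority, "threshold": fit_threshold, "1-NN": fit_one_nn}
scores = {name: accuracy(fit(train), test) for name, fit in candidates.items()}
best = max(scores, key=scores.get)
print(scores, "best:", best)
```

In practice a separate validation set is often used for the selection step, so that the test-set score of the chosen model remains an unbiased estimate.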

2.
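To study the generalization error of machine learning, a new notion of learning-algorithm stability, pointwise hypothesis stability, is proposed under change-one error estimation. The relationships among four notions of stability (pointwise hypothesis stability, CV stability, overlap stability, and weak hypothesis stability) are discussed, and it is concluded that pointwise hypothesis stability is the weakest of the four.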

3.
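Ten years of research progress in Lie group machine learning   Total citations: 2, self-citations: 0, citations by others: 2
This paper reviews recent research progress in Lie group machine learning from three aspects. First, it explains why Lie group structures are used to describe data or features, thereby clarifying how Lie group machine learning differs from traditional machine learning methods, and it illustrates the generality of Lie group representations through their wide use in artificial intelligence. Second, it surveys the main learning algorithms developed since Lie group machine learning was proposed, with emphasis on recent advances. Finally, in light of the current state of research, it suggests some future research directions.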

4.
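A survey and discussion of deep learning   Total citations: 2, self-citations: 0, citations by others: 2
Machine learning is the study of learning regularities from data through computational models and algorithms. It has many applications in fields that require mining regularities from complex data and has become one of the core technologies of artificial intelligence in the broad sense. In recent years, a variety of deep neural networks have achieved remarkable results on a large number of machine learning problems, forming deep learning, the most prominent new branch of machine learning, and setting off a new wave of research on machine learning theory, methods, and applications. This paper surveys the core principles and typical optimization algorithms of representative deep learning methods, reviews and discusses the connections and differences between deep learning and earlier machine learning methods, and offers a preliminary discussion of some open problems in deep learning.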

5.
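Structural damage identification is an active research topic, and several inductive learning methods have been applied to it. In this paper, four machine learning methods, divide-and-conquer (DAC), separate-and-conquer (SAC), bagging, and radial basis function neural networks (RBFNN), are used to locate damage in a concrete cantilever beam. The results show that the inductive learning methods, bagging in particular, clearly outperform the neural network method when the noise level exceeds 50%.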

6.
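There are tens of thousands of butterfly species, each closely associated with particular plants, so automatic identification of butterfly species is of great significance. Research on butterfly species identification in the wild is constrained by existing datasets, which cover few species and contain few samples (images) per class, making it hard for machine-learning-based identification to generalize. In addition, occlusion of butterfly wings in the wild makes classification features difficult to learn. A new meta-learning-based model for automatic butterfly species identification, DL-MAML (deep learning advanced model-agnostic meta-learning), is therefore proposed to identify arbitrary butterfly species in the wild. First, DL-MAML uses L2 regularization to improve the objective function and the parameter update method of the classical meta-learning algorithm MAML (model-agnostic meta-learning) and adds a two-layer feature learning module to MAML, which reduces the risk of overfitting and addresses the generalization difficulty faced by existing butterfly species identification in the wild. Second, the ResNet34 deep learning model is used to extract butterfly classification features and to preprocess the images into representations that serve as input to the meta-learning module of DL-MAML, overcoming its insufficient feature extraction and the difficulty of learning classification features caused by wing occlusion in the wild. Extensive ablation experiments and experimental comparisons with similar models show that the initial model learned by the DL-MAML algorithm...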

7.
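Deep learning (DL) is an important research branch of machine learning and is now widely used in smart agriculture for flower recognition, weed detection, and pest and disease detection. The author reviews the development of deep learning, describes the mainstream object recognition algorithms based on convolutional neural networks, and applies two typical image recognition algorithms, Faster R-CNN and YOLO, to flower images. The strengths and weaknesses of the two methods in flower image recognition are compared and analyzed, and directions for further research are proposed.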

8.
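Deep learning and its new progress in object and behavior recognition   Total citations: 12, self-citations: 7, citations by others: 5
Deep learning is a new research area in machine learning. Building deep networks with deep learning methods to extract features is a research direction that is attracting attention in object and behavior recognition. To encourage more computer vision researchers to explore and discuss deep learning, and to advance research on object and behavior recognition, this paper gives an overview of deep learning and its recent progress in object and behavior recognition. It first introduces the basic state of research, the main concepts, and the principles of deep learning; it then presents some recent advances in applying deep learning to object and behavior recognition; finally, it describes the relationship between deep learning and neural networks, the advantages and disadvantages of deep learning, and the main problems that current deep learning theory needs to solve. This should be helpful to researchers who intend to apply deep learning to object and behavior recognition.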

9.
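In recent years, machine learning and deep learning have made great progress in machine vision, and facial expression recognition has become one of its popular areas. Applying expression recognition lets computers better understand human emotions, which gives it considerable research value and application prospects. This paper summarizes the public datasets commonly used in expression recognition; introduces the basic pipeline and common methods of expression recognition, along with a study and analysis of how different convolutional neural networks are applied to it; and, regarding the existing problems in the field of expression recognition and...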

10.
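The linear market value function model is a method for identifying potential customers in targeted selling. This paper applies three classic machine learning algorithms (the WH algorithm, the EG algorithm, and the EG± algorithm) to this model to obtain the market value function, and implements an experimental system. The experimental results show that applying these learning algorithms to the linear market value function model is effective and feasible.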

11.
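Ensemble classification of data streams under supervised or semi-supervised learning is a meaningful research direction. This paper covers three aspects: base classifiers, key techniques, and ensemble strategies. The base classifiers discussed are mainly decision trees, neural networks, and support vector machines; the key techniques are presented in terms of incremental and online learning; the ensemble strategies are mainly boosting and stacking. The strengths and weaknesses of the different ensemble methods, the algorithms they are compared against, and the experimental datasets are summarized and analyzed. Finally, directions for further research are given, including the handling of concept drift under supervised and semi-supervised learning, the study of homogeneous and heterogeneous ensembles, and data stream ensemble classification under unsupervised learning.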

12.
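Detecting malicious URLs is important for defending against network attacks. Because supervised learning requires a large number of labeled samples, this paper trains a malicious URL detection model with semi-supervised learning, which reduces the cost of labeling data. The algorithm improves on traditional semi-supervised co-training: two classifiers are trained on data preprocessed in two ways, with expert knowledge and with Doc2Vec, and samples on which the two classifiers' predictions agree with high confidence are given pseudo-labels and used to continue training the classifiers. Experimental results show that, using only 0.67% labeled data, the method trains two classifiers of different types whose detection precision reaches 99.42% and 95.23%, respectively, which is close to the performance of supervised learning and better than self-training and co-training.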
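The pseudo-label selection step described above (keep only samples on which both classifiers agree with high confidence) can be sketched as follows. This is an illustrative sketch, not the paper's implementation: the function name `select_pseudo_labels`, the 0.9 confidence threshold, and the example probabilities are assumptions.

```python
def select_pseudo_labels(probs_a, probs_b, threshold=0.9):
    """Pick unlabeled samples that two classifiers label identically and confidently.

    probs_a, probs_b: one list of class probabilities per unlabeled sample,
    produced by classifier A and classifier B respectively.
    threshold: hypothetical minimum confidence required from BOTH classifiers.
    Returns a list of (sample_index, pseudo_label) pairs.
    """
    selected = []
    for i, (pa, pb) in enumerate(zip(probs_a, probs_b)):
        label_a = max(range(len(pa)), key=pa.__getitem__)  # argmax for classifier A
        label_b = max(range(len(pb)), key=pb.__getitem__)  # argmax for classifier B
        if label_a == label_b and min(pa[label_a], pb[label_b]) >= threshold:
            selected.append((i, label_a))
    return selected

# Example probabilities for four unlabeled URLs (class 0 = benign, class 1 = malicious).
probs_a = [[0.97, 0.03], [0.20, 0.80], [0.05, 0.95], [0.92, 0.08]]
probs_b = [[0.95, 0.05], [0.10, 0.90], [0.02, 0.98], [0.30, 0.70]]

pseudo = select_pseudo_labels(probs_a, probs_b)
print(pseudo)  # [(0, 0), (2, 1)]
```

Sample 1 is rejected because classifier A is not confident enough, and sample 3 because the two classifiers disagree; the selected pairs would then be added to the training data for the next round.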

13.
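High-dimensional electrocardiogram (ECG) data contain many irrelevant features, so supervised machine learning techniques struggle to achieve high sensitivity and high specificity at the same time. After preprocessing the ECG data (correcting baseline drift, removing high-frequency noise, and fitting polynomial features), a classification model based on supervised multiple correspondence analysis (MCA) dimensionality reduction is proposed to classify heartbeats automatically. The method discretizes continuous ECG data into categorical data, develops a supervised MCA dimensionality reduction technique to extract the key features of the ECG data, and then classifies ECG heartbeat data automatically with various classification algorithms. Tests on the ECG dataset of the PTB Diagnostic Database show that, compared with several classification techniques based on supervised machine learning, the various classification algorithms within the supervised MCA dimensionality reduction framework classify ECG heartbeat data automatically with relatively high sensitivity and specificity.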

14.
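He Li, Han Wenxiu. Computer Engineering (《计算机工程》), 2005, 31(12): 18-19, 80
In machine learning, classifier fusion has become a new research area. This paper introduces a new method that uses meta decision trees (MDTs) to fuse multiple classifiers, and explains MDTs, meta-attributes, and the stacking framework for combining multiple classifiers with MDTs.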

15.
Recently developed methods for learning sparse classifiers are among the state-of-the-art in supervised learning. These methods learn classifiers that incorporate weighted sums of basis functions with sparsity-promoting priors encouraging the weight estimates to be either significantly large or exactly zero. From a learning-theoretic perspective, these methods control the capacity of the learned classifier by minimizing the number of basis functions used, resulting in better generalization. This paper presents three contributions related to learning sparse classifiers. First, we introduce a true multiclass formulation based on multinomial logistic regression. Second, by combining a bound optimization approach with a component-wise update procedure, we derive fast exact algorithms for learning sparse multiclass classifiers that scale favorably in both the number of training samples and the feature dimensionality, making them applicable even to large data sets in high-dimensional feature spaces. To the best of our knowledge, these are the first algorithms to perform exact multinomial logistic regression with a sparsity-promoting prior. Third, we show how nontrivial generalization bounds can be derived for our classifier in the binary case. Experimental results on standard benchmark data sets attest to the accuracy, sparsity, and efficiency of the proposed methods.

16.
Tri-training: exploiting unlabeled data using three classifiers   Total citations: 24, self-citations: 0, citations by others: 24
In many practical data mining applications, such as Web page classification, unlabeled training examples are readily available, but labeled ones are fairly expensive to obtain. Therefore, semi-supervised learning algorithms such as co-training have attracted much attention. In this paper, a new co-training style semi-supervised learning algorithm, named tri-training, is proposed. This algorithm generates three classifiers from the original labeled example set. These classifiers are then refined using unlabeled examples in the tri-training process. In detail, in each round of tri-training, an unlabeled example is labeled for a classifier if the other two classifiers agree on the labeling, under certain conditions. Since tri-training neither requires the instance space to be described with sufficient and redundant views nor does it put any constraints on the supervised learning algorithm, its applicability is broader than that of previous co-training style algorithms. Experiments on UCI data sets and application to the Web page classification task indicate that tri-training can effectively exploit unlabeled data to enhance the learning performance.
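The agreement rule at the core of a tri-training round can be sketched as follows. This is a minimal illustration of only the "other two classifiers agree" step; the additional conditions the abstract mentions are not specified there and are not modeled here. The function name `labels_for` and the classifier names `h1`, `h2`, `h3` are hypothetical.

```python
def labels_for(target, predictions):
    """Collect pseudo-labels for one classifier from the agreement of the other two.

    predictions: dict mapping a classifier name to its list of predicted labels
    on the same unlabeled examples (exactly three classifiers are expected).
    Returns {example_index: label} for every example on which the two
    classifiers other than `target` predict the same label.
    """
    others = [name for name in predictions if name != target]
    if len(others) != 2:
        raise ValueError("tri-training uses exactly three classifiers")
    first, second = (predictions[name] for name in others)
    return {i: a for i, (a, b) in enumerate(zip(first, second)) if a == b}

# Predictions of three classifiers on four unlabeled examples (illustrative values).
predictions = {
    "h1": [0, 1, 1, 0],
    "h2": [0, 1, 0, 0],
    "h3": [1, 1, 0, 0],
}

new_for_h1 = labels_for("h1", predictions)
print(new_for_h1)  # {1: 1, 2: 0, 3: 0}
```

Note that example 2 is labeled 0 for `h1` even though `h1` itself predicted 1: each classifier is taught by the agreement of the other two.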

17.
Software defect prediction is an important decision support activity in software quality assurance. The limited number of labelled modules usually makes the prediction difficult, and the class‐imbalance characteristic of software defect data negatively affects the decisions of classifiers. Semi‐supervised learning can build high‐performance classifiers by using a large amount of unlabelled modules together with the labelled modules. Ensemble learning achieves a better prediction capability for class‐imbalance data by using a series of weak classifiers to reduce the bias generated by the majority class. In this paper, we propose a new semi‐supervised software defect prediction approach, non‐negative sparse‐based SemiBoost learning. The approach is capable of exploiting both labelled and unlabelled data and is formulated in a boosting framework. In order to enhance the prediction ability, we design a flexible non‐negative sparse similarity matrix, which can fully exploit the similarity of historical data by incorporating the non‐negativity constraint into sparse learning to better learn the latent clustering relationship among software modules. The widely used datasets from NASA projects are employed as test data to evaluate the performance of all compared methods. Experimental results show that non‐negative sparse‐based SemiBoost learning outperforms several representative state‐of‐the‐art semi‐supervised software defect prediction methods. Copyright © 2016 John Wiley & Sons, Ltd.

18.
Prototype classifiers have been studied for many years. However, few methods can realize incremental learning. Moreover, most prototype classifiers require users to predetermine the number of prototypes, and an improper prototype number might undermine the classification performance. To deal with these issues, in this paper we propose an online supervised algorithm named Incremental Learning Vector Quantization (ILVQ) for classification tasks. The proposed method makes three contributions. (1) By designing an insertion policy, ILVQ incrementally learns new prototypes, including both between-class incremental learning and within-class incremental learning. (2) By employing an adaptive threshold scheme, ILVQ automatically and dynamically learns the number of prototypes needed for each class according to the distribution of the training data. Therefore, unlike most current prototype classifiers, ILVQ needs no prior knowledge of the number of prototypes or their initial values. (3) A technique for removing useless prototypes is used to eliminate noise introduced into the input data. Experimental results show that the proposed ILVQ can accommodate the incremental data environment and provide good recognition performance and storage efficiency.

19.
In this paper, we present a weakly supervised learning approach for spoken language understanding (SLU) in domain-specific dialogue systems. We model the task of spoken language understanding as a two-stage classification problem. First, the topic classifier is used to identify the topic of an input utterance. Second, restricted to the recognized target topic, the slot classifiers are trained to extract the corresponding slot-value pairs. The approach is mainly data-driven and requires only a minimally annotated corpus for training while retaining the robustness and depth of understanding for spoken language. More importantly, it allows weakly supervised strategies to be employed for training the two kinds of classifiers, which could significantly reduce the number of labeled sentences required. We investigated active learning and naive self-training for the two kinds of classifiers. We also propose a practical method for bootstrapping topic-dependent slot classifiers from a small amount of labeled sentences. Experiments have been conducted in the context of the Chinese public transportation information inquiry domain and the English DARPA Communicator domain. The experimental results show the effectiveness of our proposed SLU framework and demonstrate the possibility of reducing human labeling efforts significantly.

20.
The paper presents a novel approach for voice activity detection. The main idea behind the presented approach is to use, alongside the likelihood ratio of a statistical model-based voice activity detector, a set of informative distinct features in order to enhance the detection performance via a supervised learning approach. The statistical model-based voice activity detector, which was chosen based on a comparison with other similar detectors in an earlier work, models the spectral envelope of the signal, and we derive its likelihood ratio. Furthermore, the likelihood ratio together with 70 other features was meticulously analyzed with an input variable selection algorithm based on partial mutual information. The resulting analysis produced a 13-element reduced input vector which, when compared to the full input vector, did not undermine the detector performance. The evaluation is performed on a speech corpus consisting of recordings made by six different speakers, which were corrupted with three different types of noise at three noise levels. In the end, we tested three different supervised learning algorithms for the task, namely, support vector machine, Boost, and artificial neural networks. The experimental analysis was performed by 10-fold cross-validation, from which threshold-averaged receiver operating characteristic curves were constructed. Also, the area under the curve score and the Matthews correlation coefficient were calculated for both the three supervised learning classifiers and the statistical model-based voice activity detector. The results showed that the classifier with the reduced input vector significantly outperformed the standalone detector based on the likelihood ratio, and that among the three classifiers, Boost showed the most consistent performance.
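The two evaluation scores named in this abstract, the area under the ROC curve and the Matthews correlation coefficient, can be computed as in the sketch below. It uses the standard definitions of both metrics; the labels, detector scores, and the 0.5 decision threshold are illustrative assumptions, not data from the paper.

```python
import math

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = speech, 0 = non-speech)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0

def auc(y_true, scores):
    """Area under the ROC curve as the probability that a positive outscores a negative.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Illustrative frame labels and detector scores (assumed values).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # hypothetical threshold of 0.5

mcc_value = mcc(y_true, y_pred)
auc_value = auc(y_true, scores)
print(round(mcc_value, 3), round(auc_value, 3))  # 0.333 0.889
```

AUC summarizes ranking quality across all thresholds, while MCC scores the decisions made at one fixed threshold, which is why the two are often reported together.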
