1.
We propose a new differentially-private decision forest algorithm that minimizes both the number of queries required, and the sensitivity of those queries. To do so, we build an ensemble of random decision trees that avoids querying the private data except to find the majority class label in the leaf nodes. Rather than using a count query to return the class counts like the current state-of-the-art, we use the Exponential Mechanism to only output the class label itself. This drastically reduces the sensitivity of the query – often by several orders of magnitude – which in turn reduces the amount of noise that must be added to preserve privacy. Our improved sensitivity is achieved by using “smooth sensitivity”, which takes into account the specific data used in the query rather than assuming the worst-case scenario. We also extend work done on the optimal depth of random decision trees to handle continuous features, not just discrete features. This, along with several other improvements, allows us to create a differentially private decision forest with substantially higher predictive power than the current state-of-the-art.
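The leaf query described above can be illustrated with a minimal sketch. This is not the authors' implementation: it uses the plain global sensitivity of the majority-count utility rather than smooth sensitivity, and the privacy budget value is only an example.

```python
import numpy as np

def exponential_mechanism_label(class_counts, epsilon, sensitivity=1.0, rng=None):
    """Privately output a leaf's majority class label.

    Utility of each label = its count in the leaf; adding or removing one
    record changes any count by at most 1, so the utility's sensitivity is 1.
    """
    rng = np.random.default_rng() if rng is None else rng
    labels = np.array(sorted(class_counts))
    utilities = np.array([class_counts[c] for c in labels], dtype=float)
    # Exponential Mechanism: P(label) proportional to exp(eps * u / (2 * sensitivity))
    scores = epsilon * utilities / (2.0 * sensitivity)
    scores -= scores.max()                       # numerical stability
    probs = np.exp(scores) / np.exp(scores).sum()
    return labels[rng.choice(len(labels), p=probs)]

# Example: a leaf holding 40 records of class "a" and 3 of class "b"
print(exponential_mechanism_label({"a": 40, "b": 3}, epsilon=0.5))
```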
2.
Gaussian Processes are powerful tools in machine learning which offer wide applicability in regression and classification problems due to their non-parametric and non-linear behavior. However, one of their main drawbacks is the training time complexity, which scales cubically with the number of samples. Our work addresses this issue by combining Gaussian Processes with Randomized Decision Forests to enable fast learning. An important advantage of our method is its simplicity and the ability to directly control the trade-off between classification performance and computation speed. Experiments on an indoor place recognition task show that our method can handle large training sets in reasonable time while retaining a good classification accuracy.
3.
Gaussian processes are powerful modeling tools in machine learning which offer wide applicability for regression and classification tasks due to their non-parametric and non-linear behavior. However, one of their main drawbacks is the training time complexity, which scales cubically with the number of examples. Our work addresses this issue by combining Gaussian processes with random decision forests to enable fast learning. An important advantage of our method is its simplicity and the ability to directly control the tradeoff between classification performance and computational speed. Experiments on an indoor place recognition task and on standard machine learning benchmarks show that our method can handle large training sets of up to three million examples in reasonable time while retaining good classification accuracy.
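One plausible way to realize the combination described in items 2 and 3 is to use a randomized tree only to partition the data and then fit an exact Gaussian process on each small leaf. The sketch below follows that idea under stated assumptions; the estimator choices and the `max_leaf_size` parameter are illustrative, not the authors' exact formulation.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.gaussian_process import GaussianProcessClassifier

class TreeGatedGP:
    """Partition the data with a randomized tree, then fit one exact GP per
    leaf, so the cubic GP training cost applies only to small leaves.
    max_leaf_size directly controls the speed/accuracy trade-off."""

    def __init__(self, max_leaf_size=500, random_state=0):
        self.max_leaf_size = max_leaf_size
        self.random_state = random_state
        self.leaf_models = {}

    def fit(self, X, y):
        n_leaves = max(2, len(X) // self.max_leaf_size)
        self.gate = DecisionTreeClassifier(splitter="random",
                                           max_leaf_nodes=n_leaves,
                                           random_state=self.random_state).fit(X, y)
        leaves = self.gate.apply(X)              # leaf index of every sample
        for leaf in np.unique(leaves):
            mask = leaves == leaf
            if len(np.unique(y[mask])) > 1:
                self.leaf_models[leaf] = GaussianProcessClassifier().fit(X[mask], y[mask])
            else:                                # pure leaf: constant label
                self.leaf_models[leaf] = y[mask][0]
        return self

    def predict(self, X):
        leaves = self.gate.apply(X)
        out = np.empty(len(X), dtype=object)
        for leaf in np.unique(leaves):
            mask = leaves == leaf
            model = self.leaf_models[leaf]
            out[mask] = model.predict(X[mask]) if hasattr(model, "predict") else model
        return out
```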
4.
Speech emotion recognition has been one of the interesting issues in speech processing over the last few decades. Modelling the emotion recognition process serves both to understand and to assess the performance of the system. This paper compares two different models for speech emotion recognition using vocal tract features, namely the first four formants and their respective bandwidths. The first model is based on a decision tree and the second employs logistic regression. Whereas the decision tree models are rooted in machine learning, the regression models have a strong statistical basis. The logistic regression models and the decision tree models developed in this work for several binary classification cases were validated by speech emotion recognition experiments conducted on a Malayalam emotional speech database of 2800 speech files collected from ten speakers. The models are not only simple but also meaningful, since they indicate the contribution of each predictor. The experimental results indicate that speech emotion recognition using formants and bandwidths was better modelled using decision trees, which gave higher emotion recognition accuracies than logistic regression. The highest accuracy obtained with a decision tree was 93.63%, for the classification of positive-valence emotional speech as surprised or happy, using seven features. When using logistic regression for the same binary classification, the highest accuracy obtained was 73%, with eight features.
5.
This paper presents an integrated system for emotion detection. In this research effort, we have taken into account the fact that emotions are most widely expressed through the eyes and mouth. The proposed system uses color images and consists of three modules. The first module implements skin detection, using Markov random field models for image segmentation and skin detection. A set of colored images containing human faces has been used as the training set. A second module is responsible for eye and mouth detection and extraction; it uses the HLV color space of the specified eye and mouth regions. The third module detects the emotions shown in the eyes and mouth, using edge detection and measuring the gradient over the eye and mouth regions. The paper provides results from applying the system, along with proposals for further research.
6.
Much of the previous attention on decision trees has focused on splitting criteria and the optimization of tree sizes. The dilemma between overfitting and achieving maximum accuracy is seldom resolved. A method to construct a decision-tree-based classifier is proposed that maintains the highest accuracy on training data and improves generalization accuracy as it grows in complexity. The classifier consists of multiple trees constructed systematically by pseudorandomly selecting subsets of components of the feature vector, that is, trees constructed in randomly chosen subspaces. The subspace method is compared to single-tree classifiers and other forest-construction methods in experiments on publicly available datasets, where its superiority is demonstrated. We also discuss independence between trees in a forest and relate it to the combined classification accuracy.
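A minimal sketch of the random-subspace construction described above follows; the feature-subset size and the unweighted majority vote are illustrative choices, not the paper's exact settings.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def fit_random_subspace_forest(X, y, n_trees=50, subspace_frac=0.5, seed=0):
    """Grow each tree on a pseudorandomly chosen subset of the feature
    dimensions (a random subspace) and remember which subset it used."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    k = max(1, int(subspace_frac * n_features))
    forest = []
    for _ in range(n_trees):
        dims = rng.choice(n_features, size=k, replace=False)
        tree = DecisionTreeClassifier().fit(X[:, dims], y)
        forest.append((dims, tree))
    return forest

def predict_majority(forest, X):
    """Combine the trees by unweighted majority vote."""
    votes = np.stack([tree.predict(X[:, dims]) for dims, tree in forest])
    out = []
    for col in votes.T:                          # votes for one sample
        labels, counts = np.unique(col, return_counts=True)
        out.append(labels[np.argmax(counts)])
    return np.array(out)
```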
7.
Successful use of probabilistic classification requires well-calibrated probability estimates, i.e., the predicted class probabilities must correspond to the true probabilities. In addition, a probabilistic classifier must, of course, also be as accurate as possible. In this paper, Venn predictors, and their special case Venn-Abers predictors, are evaluated for probabilistic classification, using random forests as the underlying models. Venn predictors output multiple probabilities for each label, i.e., the predicted label is associated with a probability interval. Since all Venn predictors are valid in the long run, the size of the probability intervals is very important, with tighter intervals being more informative. The standard solution when calibrating a classifier is to employ an additional step, transforming the outputs from a classifier into probability estimates, using a labeled data set not employed for training of the models. For random forests, and other bagged ensembles, it is, however, possible to use the out-of-bag instances for calibration, making all training data available for both model learning and calibration. This procedure has previously been applied successfully to conformal prediction, but is here evaluated for the first time for Venn predictors. The empirical investigation, using 22 publicly available data sets, showed that all four versions of the Venn predictors were better calibrated than both the raw estimates from the random forest and the standard techniques Platt scaling and isotonic regression. Regarding both informativeness and accuracy, the standard Venn predictor calibrated on out-of-bag instances was the best setup evaluated. Most importantly, calibrating on out-of-bag instances, instead of using a separate calibration set, resulted in tighter intervals and more accurate models on every data set, for both the Venn predictors and the Venn-Abers predictors.
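The out-of-bag calibration idea the abstract relies on can be sketched as below. This is not a Venn predictor: as a simplification it uses isotonic regression (one of the standard techniques mentioned above) as the calibration map, and it assumes binary labels encoded as 0/1.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.isotonic import IsotonicRegression

def fit_oob_calibrated_forest(X, y, n_trees=500, seed=0):
    """Calibrate on out-of-bag predictions so that all training data is used
    for both model learning and calibration (illustrated with isotonic
    regression rather than the Venn predictors evaluated in the paper).
    Assumes binary labels y in {0, 1}."""
    rf = RandomForestClassifier(n_estimators=n_trees, oob_score=True,
                                random_state=seed).fit(X, y)
    oob_p1 = rf.oob_decision_function_[:, 1]     # OOB P(class 1) per training sample
    calibrator = IsotonicRegression(out_of_bounds="clip").fit(oob_p1, y)
    return rf, calibrator

def predict_calibrated(rf, calibrator, X):
    """Map the forest's raw probabilities through the OOB-fitted calibrator."""
    return calibrator.predict(rf.predict_proba(X)[:, 1])
```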
8.
To address the problem that a single speech feature cannot fully represent speech emotion, the LSF parameters, which have good quantization and interpolation properties, are fused with the MFCC parameters, which reflect the auditory characteristics of the human ear, yielding a new line-spectrum-weighted MFCC (WMFCC) feature. A Gaussian mixture model is then used to build a model space for this feature, producing GW-MFCC model-space parameters that capture higher-dimensional detail and further improve emotion recognition performance. Validation on the Berlin emotional speech corpus shows that the new feature improves the recognition rate by 5.7% and 6.9% over conventional MFCC and LSF, respectively. The experimental results indicate that the proposed WMFCC and GW-MFCC parameters effectively represent speech emotion information and raise the speech emotion recognition rate.
9.
The recognition of the emotional state of speakers is a multi-disciplinary research area that has received great interest over the last few years. One of the most important goals is to improve voice-based human–machine interaction. Several works in this domain use prosodic features or the spectral characteristics of the speech signal, with neural networks, Gaussian mixtures and other standard classifiers. Usually, there is no acoustic interpretation of the types of errors in the results. In this paper, the spectral characteristics of emotional signals are used in order to group emotions based on acoustic rather than psychological considerations. Standard classifiers based on Gaussian Mixture Models, Hidden Markov Models and Multilayer Perceptrons are tested. These classifiers have been evaluated with different configurations and input features in order to design a new hierarchical method for emotion classification. The proposed multiple-feature hierarchical method for seven emotions, based on spectral and prosodic information, improves the performance over the standard classifiers and the fixed features.
10.
For the problem of speech emotion recognition, a multi-classifier fusion method based on decision templates is proposed, in which sub-classifiers are built from different subsets of acoustic features. Using different subsets substantially increases the "diversity" among the sub-classifiers, which is a prerequisite for multi-classifier fusion algorithms to succeed. Compared with majority-voting fusion and support vector machines, the method achieves better recognition results. In addition, the reasons why the method performs well are examined from the perspective of diversity analysis.
11.
Risk classification and prediction, an essential research direction that aims to identify and predict risks for various applications, is studied in this paper. To identify and predict risks, numerous researchers build models that uncover hidden information about a label (positive credit or negative credit). Fuzzy logic is robust in dealing with ambiguous data and thus benefits classification and prediction. However, the optimal way to apply fuzzy logic depends on the characteristics of the data and the objectives, and finding such a way is extraordinarily tricky. This paper therefore proposes a general membership function model for fuzzy sets (GMFMFS) in the fuzzy decision tree and extends it to the fuzzy random forest method. The proposed methods can be applied to identify and predict credit risks with nearly optimal fuzzy sets. In addition, we analyze the feasibility of GMFMFS and prove that the GMFMFS-based linear membership function can be extended to a nonlinear membership function without a significant increase in computational complexity. The GMFMFS-based fuzzy decision tree is tested on a real US credit dataset, the Susy dataset from UCI, and synthetic big-data datasets. The experimental results further demonstrate the effectiveness and potential of the GMFMFS-based fuzzy decision tree with both linear and nonlinear membership functions.
12.
In recent years, applying machine learning methods to traffic classification has become an emerging research direction in network measurement. Most existing work uses naive Bayes and its improved variants, but these classifiers, which rest on Bayes' theorem, depend heavily on the distribution of the sample space and are potentially unstable. To address this, the C4.5 decision tree is introduced for traffic classification. The C4.5 method builds its classification model from information entropy and does not require the prior probabilities to be assumed stable. Experimental results show that the C4.5 decision tree effectively avoids the impact of changes in the distribution of network flows.
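The entropy-based split criterion that distinguishes C4.5 from the prior-dependent Bayesian approaches can be sketched as follows. This is a toy illustration of the information gain ratio on a made-up flow attribute, not a full C4.5 implementation.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy of a label vector, in bits."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def gain_ratio(labels, split_values):
    """C4.5-style criterion: information gain of a candidate split,
    normalized by the split's own entropy (its 'split information')."""
    total = entropy(labels)
    n = len(labels)
    cond, split_info = 0.0, 0.0
    for v in np.unique(split_values):
        mask = split_values == v
        w = mask.sum() / n
        cond += w * entropy(labels[mask])
        split_info -= w * np.log2(w)
    return (total - cond) / split_info if split_info > 0 else 0.0

# Toy example: does a hypothetical "port == 80" attribute separate flow classes?
flow_class = np.array(["web", "web", "p2p", "web", "p2p", "p2p"])
port_is_80 = np.array([1, 1, 0, 1, 0, 0])
print(gain_ratio(flow_class, port_is_80))        # 1.0: a perfectly informative split
```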
13.
To recognize three classes of mental tasks, an improved decision-tree support vector machine (SVM) algorithm is proposed. The method combines a decision tree with SVMs to construct a multi-class SVM classifier; to reduce the "error accumulation" effect introduced by the decision tree, a class-distribution-based separability measure is used to determine the branching of the tree. On a dataset provided by the IDIAP research institute for the 2005 international brain-computer interface (BCI) competition, the highest classification accuracy reached 80.8%, clearly higher than conventional multi-class SVMs, demonstrating the effectiveness of the algorithm.
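The tree-of-SVMs structure can be sketched for three classes as below; as a simplification, the class separated at the root is simply given, rather than chosen by the class-distribution-based separability measure used in the paper.

```python
import numpy as np
from sklearn.svm import SVC

class ThreeClassTreeSVM:
    """Binary decision tree of SVMs for three classes: the root SVM separates
    `first` from the other two classes, and a second SVM separates the
    remaining pair."""

    def __init__(self, first):
        self.first = first
        self.root = SVC(kernel="rbf")
        self.leaf = SVC(kernel="rbf")

    def fit(self, X, y):
        self.root.fit(X, (y == self.first).astype(int))    # `first` vs. the rest
        mask = y != self.first
        self.leaf.fit(X[mask], y[mask])                     # remaining two classes
        return self

    def predict(self, X):
        pred = np.empty(len(X), dtype=object)
        at_root = self.root.predict(X).astype(bool)
        pred[at_root] = self.first
        if (~at_root).any():
            pred[~at_root] = self.leaf.predict(X[~at_root])
        return pred
```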
14.
Domain adaptation algorithms are widely used in cross-corpus speech emotion recognition; however, many of them sacrifice the discriminability of target-domain samples while reducing the domain discrepancy, so that those samples crowd densely around the model's decision boundary and degrade its performance. To address this, a cross-corpus speech emotion recognition method based on decision-boundary-optimized domain adaptation (DBODA) is proposed. Features are first processed by a convolutional neural network and then fed into a maximizing nuclear norm and mean discrepancy (MNMD) module, which reduces the inter-domain discrepancy while maximizing the nuclear norm of the target-domain emotion prediction probability matrix, thereby improving the discriminability of target-domain samples and optimizing the decision boundary. In six groups of cross-corpus experiments built on the Berlin, eNTERFACE and CASIA corpora, the average recognition accuracy of the proposed method exceeds that of the other algorithms by 1.68 to 11.01 percentage points, indicating that the model effectively reduces the sample density at the decision boundary and improves prediction accuracy.
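The nuclear-norm quantity that this method maximizes can be written in a few lines. The sketch below is only an illustration of that one term: the batch shape and the way the prediction matrix is produced are assumptions, and the mean-discrepancy part of the MNMD module is omitted.

```python
import numpy as np

def nuclear_norm_bonus(pred_probs):
    """pred_probs: (batch, n_emotions) matrix of target-domain prediction
    probabilities (rows sum to 1). Its nuclear norm (the sum of its singular
    values) is large when predictions are both confident and diverse, so
    training that maximizes it (i.e. subtracts it from the loss) pushes
    target samples away from the decision boundary."""
    singular_values = np.linalg.svd(pred_probs, compute_uv=False)
    return singular_values.sum()

# A confident, diverse batch has a larger nuclear norm than an uncertain one
confident = np.array([[0.9, 0.05, 0.05], [0.05, 0.9, 0.05], [0.05, 0.05, 0.9]])
uncertain = np.full((3, 3), 1 / 3)
print(nuclear_norm_bonus(confident), ">", nuclear_norm_bonus(uncertain))
```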
16.
To cope with the high uncertainty and incomplete information of measurement data in radar networks, the concept of a quasi-decision tree is introduced, building on the idea of decision-tree classification, and a feature-level fusion recognition algorithm based on quasi-decision-tree classification is proposed. The algorithm requires no training samples: it classifies while it builds the tree, choosing the attribute with the greatest information gain as the splitting attribute for the measurement data, and thereby recognizes the target. It can handle measurement data with missing values and makes full use of the feature information in the measurements. Simulation results show that the quasi-decision-tree classification algorithm is a simple and effective feature-level fusion recognition algorithm.
17.
An algorithm is developed for the design of an efficient decision tree with application to the pattern recognition problems involving discrete variables. The problem of evaluating an extremely large number of trees in search of a minimum cost decision tree is tackled by defining a criterion to estimate the minimum expected cost of a tree in terms of the weights of its terminal nodes and costs of the measurements, which then is used to establish the search procedure for the efficient decision tree. The concept of prime events is used to obtain the number of modes and the corresponding weights in the design samples. An application of the proposed algorithm is presented for the design of an efficient decision tree for classifying Devanagri numerals.
19.
Recently, deep learning methodologies have become popular for analysing physiological signals in multiple modalities via hierarchical architectures for human emotion recognition. Most state-of-the-art approaches use deep learning directly for emotion classification; however, deep learning is mostly effective for deep feature extraction. In this research, we therefore applied an unsupervised deep belief network (DBN) for depth-level feature extraction from fused observations of Electro-Dermal Activity (EDA), Photoplethysmogram (PPG) and Zygomaticus Electromyography (zEMG) sensor signals. Afterwards, the DBN-produced features are combined with statistical features of EDA, PPG and zEMG to prepare a feature-fusion vector. The prepared feature vector is then used to classify five basic emotions, namely Happy, Relaxed, Disgust, Sad and Neutral. As the emotion classes are not linearly separable from the feature-fusion vector, a Fine Gaussian Support Vector Machine (FGSVM) with a radial basis function kernel is used for non-linear classification of human emotions. Our experiments on a public multimodal physiological signal dataset show that the DBN- and FGSVM-based model significantly increases the emotion recognition accuracy compared to existing state-of-the-art emotion classification techniques.
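The final classification stage described above reduces to feature fusion followed by an RBF-kernel SVM. The sketch below uses scikit-learn's SVC with a relatively large gamma as a stand-in for the "Fine Gaussian" SVM, and assumes the DBN-extracted and statistical feature arrays are already computed; the function name and gamma value are illustrative.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def fit_fusion_svm(dbn_features, stat_features, labels, gamma=4.0):
    """Concatenate DBN-extracted deep features with hand-crafted statistical
    features of EDA/PPG/zEMG, then train a non-linear RBF-kernel SVM
    (a relatively large gamma mimics a 'fine' Gaussian kernel)."""
    X = np.hstack([dbn_features, stat_features])     # feature-fusion vector
    model = make_pipeline(StandardScaler(),
                          SVC(kernel="rbf", gamma=gamma))
    return model.fit(X, labels)

# `labels` would hold the five emotions: Happy, Relaxed, Disgust, Sad, Neutral
```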
20.
Given that estrus detection in dairy cows in China currently relies mainly on manual observation, which is inefficient and has a high misjudgment rate, wireless sensor nodes with embedded accelerometers were attached to cows to monitor their behavior in real time. The acceleration data were analyzed with a time-series model based on a binary decision tree of support vector machines, classifying the cows' behavior layer by layer into standing still, walking slowly, running and mounting. Experimental results show that the algorithm achieves 95.5% classification accuracy for both slight and vigorous movement patterns of the cows; the findings provide...