Similar articles
Found 10 similar articles (search time: 15 ms)
1.
This paper studies the effects of acoustic features, speaker normalization methods, and statistical modeling techniques on speaker state classification. We focus on the effect of simple partial least squares (SIMPLS) in unbalanced binary classification. Beyond dimension reduction and low computational complexity, the SIMPLS classifier (SIMPLSC) shows notably higher prediction accuracy for the class with fewer samples. Therefore, an asymmetric SIMPLS classifier (ASIMPLSC) is proposed to improve the performance of SIMPLSC on the class with more samples. Furthermore, we combine the outputs of multiple systems (the ASIMPLS classifier and support vector machines) by score-level fusion to exploit the complementary information in diverse systems. The proposed speaker state classification system is evaluated in several experiments on unbalanced data sets. In the Interspeech 2011 Speaker State Challenge, we achieved the best results for the 2-class task of the Sleepiness Sub-Challenge, with an unweighted average recall of 71.7%. Further experimental results on the SEMAINE data sets show that the ASIMPLSC achieves absolute improvements of 6.1%, 6.1%, 24.5%, and 1.3% in weighted average recall over the AVEC 2011 baseline system on the emotional speech binary classification tasks for four dimensions, namely activation, expectation, power, and valence, respectively.
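The abstract above reports both unweighted and weighted average recall. As a minimal sketch of how the two measures differ on unbalanced data, the following uses made-up labels (not data from the paper) and hypothetical helper names:

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return a dict mapping each true label to its recall."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def unweighted_average_recall(y_true, y_pred):
    """Mean of the per-class recalls; every class counts equally."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


def weighted_average_recall(y_true, y_pred):
    """Per-class recalls weighted by each class's share of the samples."""
    recalls = per_class_recall(y_true, y_pred)
    totals = Counter(y_true)
    n = len(y_true)
    return sum(recalls[c] * totals[c] / n for c in recalls)


# Toy unbalanced example (illustrative only): 8 samples of class 0, 2 of class 1.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [0, 1]  # one of the two minority samples is misclassified

uar = unweighted_average_recall(y_true, y_pred)
war = weighted_average_recall(y_true, y_pred)
```

In this toy case the minority-class error lowers the unweighted measure much more than the weighted one, which is why the unweighted form is often preferred when classes are unbalanced.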

2.
Clustering is the task of classifying patterns or observations into clusters or groups. Clustering in high-dimensional feature spaces raises several complications: the unknown shape of the data, which is typically non-Gaussian and follows different distributions; the unknown number of clusters in the case of unsupervised learning; and the existence of noisy, redundant, or uninformative features, which normally compromise modeling capability and speed. High-dimensional data clustering has therefore been a subject of extensive research in data mining, pattern recognition, image processing, computer vision, and other areas for several decades. However, most existing research tackles one or two problems at a time, which is unrealistic because all of the problems are connected and should be tackled simultaneously. Thus, in this paper, we propose two novel inference frameworks for unsupervised non-Gaussian feature selection in the context of finite asymmetric generalized Gaussian (AGG) mixture-based clustering. The AGG distribution is chosen mainly for its ability not only to approximate a large class of statistical distributions (e.g. impulsive, Laplacian, Gaussian and uniform distributions) but also to include asymmetry. In addition, the two frameworks perform model parameter estimation and model complexity determination (i.e., both model and feature selection) simultaneously, in the same step. This is done by incorporating a minimum message length (MML) penalty in the model learning step for the first framework, and by fading out the redundant densities in the mixture using the rival penalized EM (RPEM) algorithm for the second. Furthermore, for both algorithms, we tackle the problem of noisy and uninformative features by determining a set of relevant features for each data cluster.
The efficiency of the proposed algorithms is validated by applying them to challenging real-world problems, namely action and facial expression recognition.

3.
A FORTRAN program is presented which generates a statistical model of broadscale spatially coherent data, and from that model identifies and removes outlying data values. The algorithm also interpolates missing data values by making use of this model, as well as the assumption of broadscale coherence. Examples of the application of this technique to geomagnetic data are presented. Preprocessing data with this method is seen to yield a significant improvement in the statistical efficiency and consistency of subsequent estimators.

4.
Multiset features extracted from the same pattern usually represent different characteristics of the data, while matrices or 2-order tensors are common forms of data in real applications. Hence, how to extract multiset features from matrix data is an important research topic in pattern recognition. In this paper, by analyzing the relationship between CCA and 2D-CCA, a novel feature extraction method called multiple rank canonical correlation analysis (MRCCA) is proposed as an extension of 2D-CCA. Unlike CCA and 2D-CCA, MRCCA seeks k pairs of left transforms and k pairs of right transforms to maximize correlation. In addition, a multiset version of MRCCA, termed multiple rank multiset canonical correlation analysis (MRMCCA), is developed. Experimental results on five real-world data sets demonstrate the viability of the formulation. They also show that the recognition rate of our method is higher than that of other methods and that its computing time is competitive.

5.
A dataset of spectral signatures (leaf level) of tropical dry forest trees and lianas and an airborne hyperspectral image (crown level) are used to test three hyperspectral data reduction techniques (principal component analysis, forward feature selection and wavelet energy feature vectors) along with pattern recognition classifiers to discriminate between the spectral signatures of lianas and trees. At the leaf level, the forward waveband selection method gave the best results, followed by the wavelet energy feature vector and a form of principal component analysis. For the same dataset, our results indicate that no pattern recognition classifier performed best across all reduction techniques, and that none of the parametric classifiers had the lowest overall training and testing errors. At the crown level, in addition to higher testing error rates (7%), no data reduction technique was found to be optimal. The significant wavebands also differed between the leaf and crown levels. At the leaf level, the visible region of the spectrum was the most important for discriminating between lianas and trees, whereas at the crown level the shortwave infrared was also important, in addition to the visible and near infrared.

6.
This paper (Part II) investigates the motion of a redundant anthropomorphic arm during the writing task. Two approaches are applied. The first is based on the concept of distributed positioning, which is suitable for modeling the “writing” task before fatigue symptoms occur. The second uses the concept of “virtual fatigue” (VF), a variable whose dynamic behavior is analogous to that of biological fatigue. VF enables the arm to reconfigure itself and take postures appropriate for the current level of fatigue. The study includes an analysis of the legibility and inclination of handwriting, and a set of simulation results that show most practical aspects of human-like robot performance.

7.
This two-part paper is concerned with the analysis and achievement of human-like behavior by robot arms (manipulators). The analysis involves three issues: (i) the resolution of the inverse kinematics problem of redundant robots, (ii) the separation of the end-effector's motion into two components, i.e. the smooth (low accelerated) component and the fast (accelerated) component, and (iii) the fatigue of the motors (actuators) of the robot joints. In the absence of fatigue, human-like performance is achieved by partitioning the robot joints into “smooth” and “accelerated” ones (called distributed positioning, DP). Actuator fatigue is represented by the so-called “virtual fatigue” (VF) concept. When fatigue sets in, human-like performance is achieved by relying more heavily on the joints (motors) that are less fatigued, as the human arm does. Part I of the paper presents the theory behind this approach, while Part II applies it to the handwriting task and provides extensive simulation results that support the theoretical expectations.

8.
The spatial distribution of sponge species richness (SSR) and its relationship with the environment are important for marine ecosystem management, but they are either unavailable or unknown. Hence we applied random forest (RF), generalised linear model (GLM) and their hybrid methods with geostatistical techniques to SSR data, addressing relevant issues with variable selection and model selection. It was found that: 1) of five variable selection methods, one is suitable for selecting optimal RF predictive models; 2) traditional model selection methods are unsuitable for identifying GLM predictive models, and joint application of RF and AIC can select accuracy-improved models; 3) highly correlated predictors may improve RF predictive accuracy; 4) hybrid methods for RF can accurately predict count data; and 5) effects of model averaging are method-dependent. This study depicted the non-linear relationships between SSR and predictors, generated the spatial distribution of SSR with high accuracy, and revealed the association of high SSR with hard seabed features.

9.
Grid-based simulation usually involves large quantities of data at each stage of the simulation process. These data include simulation input and output files, intermediate results files, log and error files, associated metadata, and information capturing the processes that generate the data. How to store and manage data files effectively within a Grid computing environment is becoming an increasingly important issue. This paper illustrates how we built a lightweight e-Science infrastructure for data management within a Grid computing environment, including the integration of data curation activities into the entire Grid-based simulation process. Rather than focusing on specific implementation details, we aim to identify the key issues and research challenges, describing how various existing technologies and tools can best be integrated to address these requirements and challenges. Although the paper uses the case of quantum mechanical simulation of materials properties, much of the discussion is kept as generic as possible so that the approaches, methods and practice (e.g. integrated approach, workflow taxonomy and development approach, simple but useful semantic annotation approach) can be applied to wider domains and disciplines to facilitate digital research. A comparison between our approach and Cloud computing, and lessons learned in data management within the Grid computing environment, are also presented. Copyright © 2012 John Wiley & Sons, Ltd.

10.
In statistical data mining and spatial statistics, many problems (such as detection and clustering) can be formulated as optimization problems whose objective functions are functions of consecutive subsequences. Some examples are (1) searching for a high activity region in a Bernoulli sequence, (2) estimating an underlying boxcar function in a time series, and (3) locating a high concentration area in a point process. A comprehensive search algorithm always ends up with a high order of computational complexity. For example, if a length-n sequence is considered, the total number of all possible consecutive subsequences grows quadratically with n, so a comprehensive search algorithm requires at least O(n^2) numerical operations.

We present a multiscale-approximation-based approach. It is shown that, most of the time, this method finds exactly the same solution as a comprehensive search algorithm does. The derived multiscale approximation methods (MAMEs) have low complexity: for a length-n sequence, the computational complexity of a MAME can be as low as O(n). Numerical simulations verify these improvements.

The MAME approach is particularly suitable for problems with large data sets. One known drawback is that this method does not guarantee the exact optimal solution in every single run. However, simulations show that as long as the underlying subjects possess statistical significance, a MAME finds the optimal solution with probability almost equal to one.
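To make the cost of the comprehensive search in this abstract concrete, here is a minimal sketch of an exhaustive scan for a high activity region in a 0/1 sequence. It is only the brute-force baseline, not the MAME method. The function name, the background rate `p0`, and the standardized-excess score are illustrative assumptions; the abstract does not specify an objective function.

```python
import math


def exhaustive_best_window(x, p0=0.1):
    """Score every consecutive window x[i:j] and return the best one.

    The score is a hypothetical standardized excess of ones over an
    assumed background rate p0. It is an illustrative choice only.
    """
    n = len(x)
    # Prefix sums let each window's count of ones be read in O(1).
    prefix = [0]
    for v in x:
        prefix.append(prefix[-1] + v)

    best = (-math.inf, 0, 0)  # (score, start, end), end exclusive
    evaluated = 0
    for i in range(n):
        for j in range(i + 1, n + 1):
            m = j - i                   # window length
            k = prefix[j] - prefix[i]   # number of ones in the window
            score = (k - p0 * m) / math.sqrt(m * p0 * (1 - p0))
            evaluated += 1
            if score > best[0]:
                best = (score, i, j)
    return best, evaluated


# Made-up sequence with a burst of ones at positions 3..6.
x = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0]
(best_score, start, end), evaluated = exhaustive_best_window(x)
```

For this length-12 input the double loop evaluates n(n+1)/2 = 78 windows, which is the quadratic growth that motivates a cheaper approximate search.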

