
1.

A novel adaptive cuckoo search algorithm for intrinsic discriminant analysis based face recognition

《Applied Soft Computing》2016

This paper presents a novel adaptive cuckoo search (ACS) algorithm for optimization. The step size is made adaptive from the knowledge of its fitness function value and its current position in the search space. Another important feature of the ACS algorithm is its speed: it is faster than the CS algorithm. Here, an attempt is made to make the cuckoo search (CS) algorithm parameter-free, without a Levy step. The proposed algorithm is validated using twenty-three standard benchmark test functions. The second part of the paper proposes an efficient face recognition algorithm using ACS, principal component analysis (PCA) and intrinsic discriminant analysis (IDA). The proposed algorithms are named PCA + IDA and ACS–IDA. PCA + IDA offers a perturbation-free algorithm for dimension reduction, while ACS–IDA is used to find the optimal feature vectors for classifying the face images based on the IDA. For the performance analysis, we use three standard face databases: YALE, ORL, and FERET. A comparison of the proposed method with state-of-the-art methods reveals the effectiveness of our algorithm.

2.

Spoken emotion recognition through optimum-path forest classification using glottal features

Alexander I. Iliev, Michael S. Scordilis, João P. Papa, Alexandre X. Falcão 《Computer Speech and Language》2010,24(3):445-460

A new method for the recognition of spoken emotions is presented based on features of the glottal airflow signal. Its effectiveness is tested on the new optimum-path forest (OPF) classifier as well as on six other previously established classification methods: the Gaussian mixture model (GMM), support vector machine (SVM), artificial neural network with multilayer perceptron (ANN-MLP), k-nearest neighbor rule (k-NN), Bayesian classifier (BC) and the C4.5 decision tree. The speech database used in this work was collected in an anechoic environment with ten speakers (5 M and 5 F), each speaking ten sentences in four different emotions: Happy, Angry, Sad, and Neutral. The glottal waveform was extracted from fluent speech via inverse filtering. The investigated features included the glottal symmetry and MFCC vectors of various lengths, both for the glottal and the corresponding speech signal. Experimental results indicate that the best performance is obtained for the glottal-only features, with SVM and OPF generally providing the highest recognition rates, while for GMM or the combination of glottal and speech features performance was relatively inferior. For this text-dependent, multi-speaker task, the top-performing classifiers achieved perfect recognition rates for the case of 6th order glottal MFCCs.

3.

Cross grouping strategy based 2DPCA method for face recognition

《Applied Soft Computing》2015

The grouping strategy exactly specifies the form of the covariance matrix and is therefore essential. Most 2DPCA methods use the original 2D image matrices to form the covariance matrix, which means that the strategy is to group the random variables by row or column of the input image. Because of their grouping strategies these methods have two main drawbacks. Firstly, 2DPCA and some of its variants such as A2DPCA, DiaPCA and MatPCA preserve only the covariance information between the elements of these groups. This directly implies that 2DPCA and these variants eliminate some covariance information, whereas PCA preserves such information, which can be useful for recognition. Secondly, all the existing methods suffer from relatively high intra-group correlation, since the random variables in a row, column, or block are closely located and highly correlated. To overcome these drawbacks we propose a novel grouping strategy named the cross grouping strategy. The algorithm focuses on reducing the redundancy among the row and the column vectors of the image matrix. While doing this, the algorithm completely preserves the covariance information of PCA between local geometric structures in the image matrix, which is only partially maintained in 2DPCA and its variants. In addition, intra-group correlation in the proposed method is weak compared with 2DPCA and its variants, because the random variables are spread over the whole face image. These properties make the proposed algorithm superior to 2DPCA and its variants. To achieve this, the image cross-covariance matrix is calculated from the summation of the outer products of the column and the row vectors of all images. The singular value decomposition (SVD) is then applied to the image cross-covariance matrix. The right and the left singular vectors of the SVD of the image cross-covariance matrix are used as the optimal projective vectors. Further, in order to reduce the dimension, LDA is applied to the feature space of the proposed method (proposed method + LDA). The exhaustive experimental results demonstrate that the proposed grouping strategy for 2DPCA is superior to 2DPCA, its specified variants and PCA, and that the proposed method + LDA outperforms bi-directional PCA + LDA.

4.

Modeling phonetic pattern variability in favor of the creation of robust emotion classifiers for real-life applications

《Computer Speech and Language》2014,28(2):483-500


5.

A statistical model for robust integration of narrowband cues in speech

《Computer Speech and Language》2001,15(2):175-194

We investigate a statistical model for integrating narrowband cues in speech. The model is inspired by two ideas in human speech perception: (i) Fletcher’s hypothesis (1953) that independent detectors, working in narrow frequency bands, account for the robustness of auditory strategies, and (ii) Miller and Nicely’s analysis (1955) that perceptual confusions in noisy bandlimited speech are correlated with phonetic features. We apply the model to detecting the phonetic feature [+/− sonorant] that distinguishes vowels, approximants, and nasals (sonorants) from stops, fricatives, and affricates (obstruents). The model is represented by a multilayer probabilistic network whose binary hidden variables indicate sonorant cues from different parts of the frequency spectrum. We derive the Expectation-Maximization algorithm for estimating the model’s parameters and evaluate its performance on clean and corrupted speech.

6.

Speech emotion recognition algorithm based on multi-level SVM classification

Ren Hao, Ye Liang, Li Yue, Sha Xuejun 《计算机应用研究》 (Application Research of Computers) 2017,34(6)

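To improve the recognition accuracy of speech emotion recognition systems, this paper proposes a PCA-based multi-level SVM emotion classification algorithm built on the conventional support vector machine (SVM) method. Emotions that are easy to distinguish are separated first. For emotions that are highly confusable and cannot be separated directly by a further multi-level classification strategy, principal component analysis (PCA) is used to reduce the feature dimension, and the emotion type of the input speech is then determined level by level. Compared with speech emotion recognition based on the conventional SVM classification algorithm, the proposed method raises the average recognition rate over 7 emotions by 5.05% and reduces the feature dimension by 58.3%, which demonstrates the correctness and effectiveness of the proposed method.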

7.

Finger-vein pattern identification using SVM and neural network technique

Jian-Da Wu, Chiung-Tsiung Liu 《Expert systems with applications》2011,38(11):14284-14289

This paper presents a support vector machine (SVM) technique for finger-vein pattern identification in a personal identification system. Finger-vein pattern identification is one of the most secure and convenient techniques for personal identification. In the proposed system, the finger-vein pattern is captured by an infrared LED and a CCD camera because the vein pattern is not easily observed in visible light. The proposed verification system consists of image pre-processing and pattern classification. In this work, principal component analysis (PCA) and linear discriminant analysis (LDA) are applied in the image pre-processing stage for dimension reduction and feature extraction. For pattern classification, the system uses an SVM and an adaptive neuro-fuzzy inference system (ANFIS). PCA is used to remove noise residing in the discarded dimensions, and LDA to retain the main features. The features are then used in pattern classification and identification. Classification using SVM reaches 98% accuracy and takes only 0.015 s. This result is superior to that of the ANFIS neural network in the proposed system.

8.

A new approach of audio emotion recognition

《Expert systems with applications》2014,41(13):5858-5869

A new architecture for intelligent audio emotion recognition is proposed in this paper. It fully utilizes both prosodic and spectral features in its design. It has two main paths in parallel and can recognize 6 emotions. Path 1 is designed based on intensive analysis of different prosodic features. Significant prosodic features are identified to differentiate emotions. Path 2 is designed based on analysis of spectral features. Extraction of Mel-Frequency Cepstral Coefficient (MFCC) features is followed by Bi-directional Principal Component Analysis (BDPCA), Linear Discriminant Analysis (LDA) and Radial Basis Function (RBF) neural classification. This path has a structure of 3 parallel BDPCA + LDA + RBF sub-paths, each of which handles two emotions. Fusion modules are also proposed for weight assignment and decision making. The performance of the proposed architecture is evaluated on the eNTERFACE’05 and RML databases. Simulation results and comparisons reveal good performance of the proposed recognizer.

9.

An expert system for detection of breast cancer based on association rules and neural network

Murat Karabatak, M. Cevdet Ince 《Expert systems with applications》2009,36(2):3465-3469

This paper presents an automatic diagnosis system for detecting breast cancer based on association rules (AR) and a neural network (NN). In this study, AR is used for reducing the dimension of the breast cancer database and NN is used for intelligent classification. The performance of the proposed AR + NN system is compared with that of the NN model. The dimension of the input feature space is reduced from nine to four by using AR. In the test stage, 3-fold cross validation was applied to the Wisconsin breast cancer database to evaluate the performance of the proposed system. The correct classification rate of the proposed system is 95.6%. This research demonstrates that AR can be used for reducing the dimension of the feature space and that the proposed AR + NN model can be used to obtain fast automatic diagnostic systems for other diseases.
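The 3-fold cross validation mentioned above can be sketched in a few lines. This is only a minimal illustration of how k-fold index splitting generally works, not the authors' code: the function name `k_fold_indices` is hypothetical, the folds are contiguous, and no shuffling or stratification is applied (the abstract does not say whether the original study used either).

```python
def k_fold_indices(n_samples, k=3):
    """Yield (train, test) index lists for k contiguous folds.

    Hypothetical helper for illustration; real studies usually shuffle
    (and often stratify) the samples before splitting.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("need 2 <= k <= n_samples")
    base, extra = divmod(n_samples, k)  # the first `extra` folds get one more sample
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Example: 7 samples split into 3 folds of sizes 3, 2 and 2.
folds = list(k_fold_indices(7, k=3))
for train, test in folds:
    print("train:", train, "test:", test)
```

Each sample appears in exactly one test fold, so averaging the per-fold classification rates gives an estimate that uses every sample for testing once.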

10.

Spider specie identification and verification based on pattern recognition of it cobweb

Jaime R. Ticay-Rivas, Marcos del Pozo-Baños, William G. Eberhard, Jesús B. Alonso, Carlos M. Travieso 《Expert systems with applications》2013,40(10):4213-4225

Biodiversity conservation is a global priority in which the study of every type of living form is a fundamental task. Among the huge number of species on the planet, spiders play an important role in almost every habitat. This paper presents a comprehensive study of the reliability of the most widely used feature extractors for the problem of spider species recognition from their cobwebs, in both identification and verification modes. We applied preprocessing to the cobweb images in order to retain only the valid information and computed the optimal image size to reach the highest performance. We used principal component analysis (PCA), independent component analysis (ICA), the Discrete Cosine Transform (DCT), the Discrete Wavelet Transform (DWT) and discriminative common vectors as feature extractors, and proposed the fusion of several of them to improve the system’s performance. Finally, we used a Least Squares Support Vector Machine with a radial basis function kernel as the classifier. We implemented K-Fold and Hold-Out cross-validation techniques in order to obtain reliable results. PCA provided the best performance, reaching a success rate of 99.65% ± 0.21 in identification mode and 99.98% ± 0.04 of the area under the Receiver Operating Characteristic (ROC) curve in verification mode. The best combination of feature extractors was PCA, DCT, DWT and ICA, which achieved a success rate of 99.96% ± 0.16 in identification mode and perfect verification.

11.

ANN based simulation and experimental verification of analytical four- and five-parameters models of PV modules

《Simulation Modelling Practice and Theory》2013

In this article, an artificial neural network (ANN) is adopted to predict photovoltaic (PV) panel behavior under realistic weather conditions. ANN results are compared with analytical four- and five-parameter models of the PV module. The inputs of the models are the daily total irradiation, air temperature and module voltage, while the outputs are the current and power generated by the panel. Analytical models of PV modules, based on the manufacturer datasheet values, are simulated in the Matlab/Simulink environment. A multilayer perceptron is used to predict the operating current and power of the PV module. The best network configuration for predicting panel current had a 3–7–4–1 topology, so this two-hidden-layer topology was selected as the best model for predicting panel current under similar conditions. Results obtained from the PV module simulation and the optimal ANN model have been validated experimentally. Results showed that the ANN model provides a better prediction of the current and power of the PV module than the analytical models. The coefficient of determination (R²), mean square error (MSE) and mean absolute percentage error (MAPE) values for the optimal ANN model were 0.971, 0.002 and 0.107, respectively. A comparative study of the ANN and analytical models was also carried out. Among the analytical models, the five-parameter model, with MAPE = 0.112, MSE = 0.0026 and R² = 0.919, gave better prediction than the four-parameter model (with MAPE = 0.152, MSE = 0.0052 and R² = 0.905). Overall, the 3–7–4–1 ANN model outperformed the four-parameter model and was marginally better than the five-parameter model.
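For readers unfamiliar with the three error measures quoted above, the following sketch shows their usual textbook definitions. It is an assumption that the paper uses exactly these forms (in particular, MAPE is written here as a fraction rather than a percentage, which matches the magnitude of the reported values); the function names and the sample numbers are hypothetical and are not data from the study.

```python
def mse(y_true, y_pred):
    """Mean square error: average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, expressed as a fraction (0.1 = 10%).

    Assumes no true value is zero.
    """
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up measured vs. predicted currents, for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(mse(measured, predicted), mape(measured, predicted), r_squared(measured, predicted))
```

Lower MSE and MAPE and a higher R² indicate a better fit, which is the sense in which the abstract ranks the ANN above the five-parameter model and the five-parameter model above the four-parameter one.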

12.

Efficient multicast schemes for 3-D Networks-on-Chip

《Journal of Systems Architecture》2013,59(9):693-708

3-D Networks-on-Chip (NoCs) have been proposed as a potent solution to address both the interconnection and design complexity problems facing future System-on-Chip (SoC) designs. In this paper, two topology-aware multicast routing algorithms in support of 3-D NoCs are proposed: Multicasting XYZ (MXYZ) and Alternative XYZ (AL + XYZ). In essence, MXYZ is a simple dimension-order multicast routing algorithm that targets 3-D NoC systems built upon regular topologies. To support multicast routing in irregular regions, AL + XYZ can be applied, where an alternative output channel is sought to forward/replicate the packets whenever the output channel determined by MXYZ is not available. To evaluate the performance of MXYZ and AL + XYZ, extensive experiments have been conducted comparing MXYZ and AL + XYZ against a path-based multicast routing algorithm and an irregular-region-oriented multiple unicast routing algorithm, respectively. The experimental results confirm that the proposed MXYZ and AL + XYZ schemes, respectively, have lower latency and power consumption than the other two routing algorithms, making the two proposed algorithms more suitable for supporting multicasting in 3-D NoC systems. In addition, the hardware implementation cost of AL + XYZ is shown to be quite modest.
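The dimension-order idea that MXYZ builds on can be illustrated with plain unicast XYZ routing on a regular 3-D mesh: a packet is moved along X until the X coordinate matches, then along Y, then along Z. The sketch below covers only that generic baseline. It does not reproduce the multicast replication of MXYZ or the alternative-channel logic of AL + XYZ, whose details are not given in the abstract, and the function name `xyz_route` is hypothetical.

```python
def xyz_route(src, dst):
    """Return the list of mesh nodes visited by unicast XYZ dimension-order routing.

    src and dst are (x, y, z) coordinates on a regular 3-D mesh.
    Illustrative baseline only; not the paper's multicast algorithm.
    """
    current = list(src)
    path = [tuple(current)]
    for axis in range(3):  # 0 = X, 1 = Y, 2 = Z, always in this order
        step = 1 if dst[axis] > current[axis] else -1
        while current[axis] != dst[axis]:
            current[axis] += step
            path.append(tuple(current))
    return path


print(xyz_route((0, 0, 0), (2, 1, 1)))
# [(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 1, 0), (2, 1, 1)]
```

Because every packet resolves the dimensions in the same fixed order, the route is deterministic and needs no routing tables, which is why dimension-order schemes are described as simple for regular topologies.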

13.

Performance of radial basis and LM-feed forward artificial neural networks for predicting daily watershed runoff

Mohammad Zounemat-kermani, Ozgur Kisi, Taher Rajaee 《Applied Soft Computing》2013,13(12):4633-4644

This study investigated the effects of upstream stations’ flow records on the performance of artificial neural network (ANN) models for predicting daily watershed runoff. As a comparison, a multiple linear regression (MLR) analysis was also examined using various statistical indices. Five streamflow measuring stations on the Cahaba River, Alabama, were selected as case studies. Two different ANN models, a multilayer feed-forward neural network using the Levenberg–Marquardt learning algorithm (LMFF) and a radial basis function (RBF) network, were introduced in this paper. These models were then used to forecast one-day-ahead streamflows. Correlation analysis was applied to determine the architecture of each ANN model in terms of input variables. Several statistical criteria (RMSE, MAE and coefficient of correlation) were used to check the model accuracy against the observed data by means of K-fold cross validation. Additionally, residual analysis was applied to the model results. The comparison revealed that using upstream records could significantly increase the accuracy of the ANN and MLR models in predicting daily stream flows (by around 30%). The comparison of the prediction accuracy of both ANN models (LMFF and RBF) and the linear regression method indicated that the ANN approaches were more accurate than the MLR in predicting streamflow dynamics. The LMFF model improved the average root mean square error (RMSE_ave) and the average mean absolute percentage error (MAPE_ave) of the multiple linear regression forecasts by about 18% and 21%, respectively. Although the RBF model performed better for predicting the highest range of flow rate (flood events, RMSE_ave/RBF = 26.8 m³/s vs. RMSE_ave/LMFF = 40.2 m³/s), in general the results suggested that the LMFF method was somewhat superior to the RBF method in predicting watershed runoff (RMSE/LMFF = 18.8 m³/s vs. RMSE/RBF = 19.2 m³/s). Finally, statistical differences between measured and predicted medians were evaluated using the Mann-Whitney test, and differences in variances were evaluated using Levene's test.

14.

Modified Köhler illumination for LED-based projection display

《Displays》2014,35(2):84-89

Common projection optics use Köhler illumination to achieve the required lighting. These systems always prevent the realization of a compact optical configuration along with a high lumen output. Based on conventional Köhler illumination, a modified Köhler illumination system for LED-based projection display is presented in this paper, which can significantly reduce the system volume while allowing for adequate and homogeneous illumination. Equipped with the proposed system, a pocket-sized CF-LCoS projector with physical dimensions of 27.4 mm × 19.4 mm × 9.6 mm is designed, simulated and analyzed. Compared to conventional approaches, this design could offer an average 43% volume reduction with acceptable tolerance. To the best of our knowledge, the screen uniformity of 90.2% and the light efficiency of 56.5% are competitive with those of currently commercialized pocket-sized CF-LCoS projectors.

15.

A PCA-based segment-level feature

Zhang Xingming, Wang Keren, Huang Shanqi 《电子技术应用》 (Application of Electronic Technique) 2011,37(5)

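A PCA-based segment-level feature (PCAULF) is proposed. Built on existing frame-level speech features, it introduces the long-term characteristics of speech by computing segment-level features. Applying PCA to the segment-level features has two effects. On the one hand, it removes the redundancy introduced by the segment-level features, reduces the data dimension, and increases recognition speed. On the other hand, it suppresses the influence of noise on the recognition system and improves the robustness of the segment-level features. In the training stage, segment-level features are computed for all speech and a transformation matrix is obtained with PCA. In the testing stage, the transformation matrix is first used to reduce the dimension of the segment-level features, and classification is then performed. Experimental results show that the feature effectively improves recognition accuracy and speed and is better suited to real-time speaker recognition systems.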

16.

Prediction of cutting temperature in orthogonal machining of AISI 316L using artificial neural network

《Applied Soft Computing》2016

In this study, an approach based on an artificial neural network (ANN) was proposed to predict the experimental cutting temperatures generated in orthogonal turning of AISI 316L stainless steel. Experimental and numerical analyses of the cutting forces were carried out to numerically obtain the cutting temperature. For this purpose, cutting tests were conducted using coated (TiCN + Al₂O₃ + TiN and Al₂O₃) and uncoated cemented carbide inserts. The Deform-2D programme was used for numerical modelling, together with the Johnson–Cook (J–C) material model. The numerical cutting forces for the coated and uncoated tools were compared with the experimental results. In addition, the cutting temperature value for each cutting tool was obtained numerically. The artificial neural network model was used to predict numerical cutting temperatures by means of the numerical cutting forces. The best results in predicting the cutting temperature were obtained using a network architecture with one hidden layer of seven neurons and the LM learning algorithm. Finally, the experimental cutting temperatures were predicted by entering the experimental cutting forces into a formula obtained from the artificial neural networks. Statistical results (R², RMSE, MEP) were quite satisfactory. This demonstrates that the established ANN model is a powerful one for predicting the experimental cutting temperatures.

17.

Noise robust ASR in reverberated multisource environments applying convolutive NMF and Long Short-Term Memory

Martin Wöllmer, Felix Weninger, Jürgen Geiger, Björn Schuller, Gerhard Rigoll 《Computer Speech and Language》2013,27(3):780-797

This article proposes and evaluates various methods to integrate the concept of bidirectional Long Short-Term Memory (BLSTM) temporal context modeling into a system for automatic speech recognition (ASR) in noisy and reverberated environments. Building on recent advances in Long Short-Term Memory architectures for ASR, we design a novel front-end for context-sensitive Tandem feature extraction and show how the Connectionist Temporal Classification approach can be used as a BLSTM-based back-end, as an alternative to Hidden Markov Models (HMM). We combine context-sensitive BLSTM-based feature generation and speech decoding techniques with source separation by convolutive non-negative matrix factorization. Applying our speaker-adapted multi-stream HMM framework that processes MFCC features from NMF-enhanced speech as well as word predictions obtained via BLSTM networks and non-negative sparse classification (NSC), we obtain an average accuracy of 91.86% on the PASCAL CHiME Challenge task at signal-to-noise ratios ranging from −6 to 9 dB. To our knowledge, this is the best result ever reported for the CHiME Challenge task.

18.

Compound Linguistic Scale

《Applied Soft Computing》2014

Rating scales are essential interfaces in many research areas, such as decision making and recommendation. Some issues concerning their syntactic and semantic structures are still open to discussion. This research proposes a Compound Linguistic Scale (CLS), a two-dimensional rating scale, as a promising rating interface. The CLS is comprised of a Compound Linguistic Variable (CLV) and a Deductive Rating Strategy (DRS). CLV can ideally produce 21 to 73 ((7 ± 2)((7 ± 2) − 1) + 1) ordinal-in-ordinal rating items, which extends the classic rating scales, usually based on the 7 ± 2 principle, to better reflect the raters’ preferences, whilst DRS is a double-step rating approach for a rater to choose a compound linguistic term among two-dimensional options on a dynamic rating interface. The numerical analyses show that the proposed CLS can address the rating dilemma for a single rater and more accurately reflects consistency among various raters. CLS can be applied to surveys, questionnaires, psychometrics, recommender systems and decision analysis in various application domains.
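The range "21 to 73" quoted above follows from evaluating n(n − 1) + 1 at the two ends of the 7 ± 2 interval. The short sketch below tabulates the count for every n from 5 to 9, reading "7 ± 2" as the integer range 5..9, which is an interpretation of the abstract rather than a definition taken from the paper. The function name `cls_item_count` is hypothetical.

```python
def cls_item_count(n):
    """Number of ordinal-in-ordinal rating items for n base terms: n(n - 1) + 1."""
    return n * (n - 1) + 1


# Tabulate the count over the 7 +/- 2 range (n = 5..9).
counts = {n: cls_item_count(n) for n in range(5, 10)}
for n, items in counts.items():
    print(n, items)
```

The endpoints n = 5 and n = 9 give 21 and 73 items, matching the figures in the abstract, and the classic midpoint n = 7 gives 43.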

19.

Detailed check of the LDA + U and GGA + U corrected method for defect calculations in wurtzite ZnO

Gui-Yang Huang, Chong-Yu Wang, Jian-Tao Wang 《Computer Physics Communications》2012,183(8):1749-1752

Based on a detailed check of the LDA + U and GGA + U corrected methods, we found that the transition energy levels depend almost linearly on the effective U parameter. GGA + U seems to be better than LDA + U, with an effective U parameter of about 5.0 eV. However, though the results between LDA and GGA are very different before correction, the corrected transition energy levels spread less than 0.3 eV. These more or less consistent results indicate the necessity and validity of LDA + U and GGA + U correction.

20.

Language identification using multi-core processors

A. Hanani, M.J. Carey, M.J. Russell 《Computer Speech and Language》2012,26(5):371-383

Graphics processing units (GPUs) provide substantial processing power for little cost. We explore the application of GPUs to speech pattern processing, using language identification (LID) to demonstrate their benefits. Realization of the full potential of GPUs requires both effective coding of predetermined algorithms and, if there is a choice, selection of the algorithm or technique for a specific function that is most able to exploit the GPU. We demonstrate these principles using the NIST LRE 2003 standard LID task, a batch processing task which involves the analysis of over 600 h of speech. We focus on two parts of the system, namely the acoustic classifier, which is based on a 2048-component Gaussian Mixture Model (GMM), and acoustic feature extraction. In the case of the latter we compare a conventional FFT-based analysis with IIR and FIR filter banks, both in terms of their ability to exploit the GPU architecture and LID performance. With no increase in error rate, our GPU-based system, with an FIR-based front-end, completes the NIST LRE 2003 task in 16 h, compared with 180 h for the conventional FFT-based system on a standard CPU (a speed-up factor of more than 11). This includes a 61% decrease in front-end processing time. In the GPU implementation, front-end processing accounts for 8% and 10% of the total computing times during training and recognition, respectively. Hence the reduction in front-end processing achieved in the GPU implementation is significant.