Found 20 similar documents (search time: 15 ms)
1.
Speaker recognition systems perform almost ideally in neutral talking environments; however, they perform poorly in emotional talking environments. This research aims to improve the low performance of text-independent, emotion-dependent speaker identification in emotional talking environments by employing Second-Order Circular Suprasegmental Hidden Markov Models (CSPHMM2s) as classifiers. The approach was tested on our speech database, which is composed of 50 speakers talking in six emotional states: neutral, angry, sad, happy, disgust, and fear. Our results show that the average speaker identification performance in these talking environments based on CSPHMM2s is 81.50%, with an improvement rate of 5.61%, 3.39%, and 3.06% over First-Order Left-to-Right Suprasegmental Hidden Markov Models (LTRSPHMM1s), Second-Order Left-to-Right Suprasegmental Hidden Markov Models (LTRSPHMM2s), and First-Order Circular Suprasegmental Hidden Markov Models (CSPHMM1s), respectively. Results from subjective evaluation by human judges fall within 2.26% of those obtained with CSPHMM2s.
3.
Speaker recognition performance in emotional talking environments is not as high as it is in neutral talking environments. This work proposes, implements, and evaluates a new approach to enhance performance in emotional talking environments. The proposed approach identifies the unknown speaker using both his/her gender and emotion cues. Both Hidden Markov Models (HMMs) and Suprasegmental Hidden Markov Models (SPHMMs) have been used as classifiers in this work. The approach has been tested on our collected emotional speech database, which is composed of six emotions. The results show that speaker identification performance based on both gender and emotion cues is higher than that based on gender cues only, emotion cues only, and neither gender nor emotion cues by 7.22%, 4.45%, and 19.56%, respectively. This work also shows that the optimum speaker identification performance occurs when the classifiers are completely biased towards suprasegmental models, with no impact from acoustic models, in emotional talking environments. The average speaker identification performance achieved with the proposed approach falls within 2.35% of that obtained in subjective evaluation by human judges.
4.
Neural Computing and Applications - In this work, we conducted an empirical comparative study of the performance of text-independent speaker verification in emotional and stressful environments....
6.
Loop transfer recovery is considered to be a special form of loop sensitivity shaping in this paper. This viewpoint suggests a design strategy which relaxes the requirement that the estimator or controller be unbiased. This strategy is illustrated using a stable, SISO example with a nonminimum phase zero. The approach still faces the design tradeoffs and limitations inherent in all feedback systems, including those which apply to nonminimum phase plants. The formulation used here, however, suggests a different approach for dealing with these issues.
7.
Speaker identification from whispered speech is of great importance in forensic science as well as in many other applications. Whispered speech differs in many characteristics from its neutral counterpart, which makes identification difficult. This paper presents the use of only the well-performing timbral features selected by a hybrid selection method, and the effect of the distance measure used in a KNN classifier on identification accuracy. The results using timbral features are compared with those using MFCC features; the accuracy with the former is observed to be higher. KNN classifiers using the distance functions most likely to suit a whispered-speech database, such as Euclidean and city-block, are also compared. The combination of timbral features and a KNN classifier with the city-block distance function yields the highest identification accuracy.
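The comparison of distance functions in a KNN classifier can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the toy two-dimensional feature vectors, and the choice of k = 3 are all assumptions made for illustration, and real timbral or MFCC features would have many more dimensions.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def city_block(a, b):
    """City-block (L1, Manhattan) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_vectors, train_labels, query, k=3, distance=city_block):
    """Return the majority label among the k training vectors nearest to query."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: distance(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two speakers, three feature vectors each.
train_vectors = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
                 (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_labels = ["A", "A", "A", "B", "B", "B"]

for metric in (euclidean, city_block):
    print(metric.__name__, knn_predict(train_vectors, train_labels,
                                       (0.15, 0.1), k=3, distance=metric))
```

Passing the distance as a parameter keeps the classifier unchanged while the metric is swapped, which is the kind of comparison the abstract describes.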
8.
We present a model for unconstrained and unobtrusive identification and tracking of people in smart environments and answering
queries about their whereabouts. Our model supports biometric recognition based upon multiple modalities such as face, gait,
and voice in a uniform manner. The key technical idea underlying our approach is to abstract a smart environment by a state transition system in which each state records a set of individuals who are present in various zones of the environment. Since biometric recognition
is inexact, state information is inherently probabilistic in nature. An event abstracts a biometric recognition step, and
the transition function abstracts the reasoning necessary to effect state transitions. In this manner, we are able to integrate
different biometric modalities uniformly and also different criteria for state transitions. Fusion of biometric modalities
is also supported by our model. We define performance metrics for a smart environment in terms of the concepts of ‘precision’
and ‘recall’. We have developed a prototype implementation of our proposed concepts and provide experimental results in this
paper. Our conclusion is that the state transition model is an effective abstraction of a smart environment and serves as
a good basis for developing practical systems.
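As a rough illustration of the 'precision' and 'recall' metrics mentioned above, the sketch below uses the standard set-based definitions applied to the occupants of a single zone. The abstract does not give the paper's exact definitions (which may be probabilistic), so the function name, the example names, and the convention for empty sets are all assumptions.

```python
def precision_recall(reported, actual):
    """Set-based precision and recall for the people reported in one zone.

    reported: individuals the system claims are present (hypothetical input)
    actual:   individuals who are really present (ground truth)
    """
    reported, actual = set(reported), set(actual)
    true_positives = len(reported & actual)
    # Assumed convention: an empty set yields a score of 1.0 (nothing to get wrong).
    precision = true_positives / len(reported) if reported else 1.0
    recall = true_positives / len(actual) if actual else 1.0
    return precision, recall


# Hypothetical example: the system reports three people, four are present.
p, r = precision_recall({"alice", "bob", "carol"},
                        {"alice", "bob", "dave", "erin"})
print(f"precision={p:.3f} recall={r:.3f}")
```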
9.
In this paper, an application of speaker identification in the automobile industry is proposed. The work is divided into two main parts. The first part deals with the task of speaker identification, where a system is trained and tested for multiple users using a database of isolated Hindi digits and Hindi sentences. A new hybrid algorithm is used for speaker identification, which captures the benefits of both the LPC and MFCC feature extraction techniques. The proposed technique shows an improvement of 2.05% over conventional MFCC features for isolated Hindi digits and 12.41% for Hindi sentences. It also shows an improvement over LPC of 53.26% for Hindi sentences and 32.51% for isolated Hindi digits. The proposed features were also tested in a real-time noisy environment by adding speech and F16 noise to the test voice samples, with the degree of distortion varying from 0 to 20 dB. The second part describes the interfacing techniques and the design of the hardware configuration for seat adjustment. The proposed model is designed using MATLAB. Speech samples from users are recorded through a microphone. Different features of this wav file are evaluated and fed into the model generated during the testing phase. Depending on the outcome from the classifier, a user is identified. Once the user is successfully identified, signals are sent to the servo motor through an Arduino microcontroller interfaced through MATLAB to automatically adjust the driver’s seat.
10.
In this paper, the application of soft computing techniques to predicting an occupant's behaviour in an inhabited intelligent environment is addressed. The research studies the daily activities of elderly people with dementia who live in their own homes. Occupancy sensors are used to extract the movement patterns of the occupant. The occupancy data is then converted into temporal sequences of activities, which are eventually used to predict the occupant's behaviour. To build the prediction model, different dynamic recurrent neural networks are investigated. Recurrent neural networks have shown a great ability to find the temporal relationships of input patterns. The experimental results show that the non-linear autoregressive network with exogenous inputs correctly extracts the long-term prediction patterns of the occupant and outperforms the Elman network. The results presented here are validated using data generated from a simulator and from real environments.
11.
This paper proposes a classification scheme that incorporates the Karhunen-Loeve transform (KLT) and the Gaussian mixture model (GMM) for text-independent speaker identification. Our results show that the combination benefits both classification accuracy and computational cost. For a database of 500 Mandarin speakers, it is demonstrated that an accuracy improvement of up to 4% and a tenfold reduction in computational cost can be achieved compared with the conventional GMM.
12.
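An effective automatic recognition algorithm is presented for the modulation classification of MPSK and MQAM signals in multipath environments. It uses a set of robust, multipath-resistant cumulant invariants together with the cost function of an adaptive blind equalization algorithm as recognition features. The cost function converges to its minimum when the blind equalizer matches the constellation of the received symbols. Simulations show that the method can effectively identify BPSK, QPSK, 8PSK, 16QAM, 32QAM, and 64QAM signals over multipath channels.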
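To give a sense of why cumulants can separate modulation types, the sketch below computes the standard fourth-order cumulants C40 and C42 for ideal, unit-power BPSK and QPSK constellations. This is only an illustration under stated assumptions: the abstract does not specify which cumulant invariants the paper uses, the function name is hypothetical, and no multipath channel, noise, or blind equalizer is modelled.

```python
import cmath
import math


def fourth_order_cumulants(symbols):
    """Sample estimates of C40 and C42 for a zero-mean complex symbol sequence.

    Uses the standard relations
        C40 = M40 - 3 * C20**2
        C42 = M42 - |C20|**2 - 2 * C21**2
    where C20 = E[x^2], C21 = E[|x|^2], M40 = E[x^4], M42 = E[|x|^4].
    """
    n = len(symbols)
    c20 = sum(s * s for s in symbols) / n
    c21 = sum(abs(s) ** 2 for s in symbols) / n
    m40 = sum(s ** 4 for s in symbols) / n
    m42 = sum(abs(s) ** 4 for s in symbols) / n
    c40 = m40 - 3 * c20 ** 2
    c42 = m42 - abs(c20) ** 2 - 2 * c21 ** 2
    return c40, c42


# Ideal equiprobable constellations with unit average power (assumed inputs).
bpsk = [1 + 0j, -1 + 0j]
qpsk = [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)]

for name, constellation in (("BPSK", bpsk), ("QPSK", qpsk)):
    c40, c42 = fourth_order_cumulants(constellation)
    print(name, "C40 =", c40, "C42 =", c42)
```

For these ideal constellations the two modulations give clearly different C42 values (about -2 for BPSK and about -1 for QPSK), which is what makes such statistics usable as classification features.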
13.
Stakeholders are the first emerging challenge in any software project. Their identification is a critical task for success.
Nevertheless, many authors consider them as a default product of a non-explained identification process. Several aspects must
be considered when the project is carried out in environments where multiple organizations interact. These complex contexts
demand extremely hard efforts. Stakeholders must be identified taking into account their attributes (types, roles), which
must be extended and defined for these environments. In general, there are no methodologies that allow performing this task
in a systematic way for the development of interorganizational information systems. The paper proposes a method for carrying
out stakeholder identification considering the diverse dimensions involved in interorganizational environments: organizational,
interorganizational and external. It allows the systematic specification of all the people, groups and organizations whose
interests and needs can affect or are affected by the interorganizational system. Also diverse stakeholders’ attributes such
as types, roles, influence and interest are defined, analyzed, and included in the method. They are all important in later
stages of any software project.
14.
This paper presents an efficient approach for automatic speaker identification based on cepstral features and the Normalized Pitch Frequency (NPF). Most relevant speaker identification methods adopt a cepstral strategy. Including the pitch frequency as a new feature in the speaker identification process is expected to enhance identification accuracy. In the proposed framework, a neural classifier with a single hidden layer is used. Different transform domains are investigated for reliable feature extraction from the speech signal. Moreover, a pre-processing noise reduction step is used prior to feature extraction to enhance the performance of the speaker identification system. Simulation results prove that the NPF as a feature enhances the performance of the speaker identification system, especially with the Discrete Cosine Transform (DCT) and a wavelet denoising pre-processing step.
15.
In recent years, various computers have been compromised through several paths. In particular, attack patterns and paths are becoming more varied than in the past. Furthermore, systems damaged by hackers are used as zombie systems to attack other web servers or personal computers, so there is a high probability of spreading secondary damage such as DDoS. Also, hacking and malicious code were previously carried out for self-display or simple curiosity, but recently they have become related to monetary extortion. In order to respond to incidents correctly, it is important to measure the damage to a system rapidly and determine the attack paths. This paper discusses an on-site investigation methodology for incident response and also describes the limitations of this methodology.
17.
Computer Mediated Environments (CMEs) allow people to communicate and interact electronically, either synchronously or asynchronously, their key characteristic being online interactivity. This study attempts to provide a better understanding of communication behavior in CMEs, the study objective being to investigate the effects of the level of interactivity on web users’ attitudes and intentions towards the use of online communication tools. It tests constructs based on system characteristics (interactivity), extrinsic motivation (the Technology acceptance model), and intrinsic motivation (Flow theory) in an integrated theoretical framework for online communication behavior. This study demonstrates the development of a reliable and valid measure to capture several critical constructs in order to understand online communication behavior. Questionnaires were placed on the website for voluntary participants who use online communication tools to complete. The statistical results revealed that attitude and behavioral intention are directly affected by users’ internal and external motivation, and are indirectly affected by interactivity through the perceived ease of use, perceived usefulness, and flow experience. This shows that interactivity is an important element of web-based information technology for absorbing users, and is not only mediated by task-oriented (external) motivation but also entertainment-oriented (internal) motivation.
19.
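To address the speaker recognition problem, a recognition method based on support vector machines and wavelet analysis is proposed, together with its framework model. Wavelet analysis is applied to signal preprocessing; on that basis, its singularity detection principle is used to separate the speech signal from noise and achieve speech enhancement. Finally, training and testing are carried out on samples, and an SVM is used to classify and identify speakers.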
20.
In this paper we present indoor positioning within unknown environments as an unsupervised labelling task on sequential data. We explore a probabilistic framework relying on wireless network radio signals and contextual information, which is increasingly available in large environments. Thus, we form an informative spatial classifier without resorting to a pre-determined map, and show the potential of the approach using both simulated and real data sets. Results demonstrate the ability of the procedure to segregate structures of radio signal observations and form clustered regions in association with areas of interest to the user; thus, we show it is possible to differentiate location between closely spaced zones of variable size and shape.