Similar literature
 20 similar documents found (search time: 15 ms)
1.
Speaker localization is a technique for locating and tracking an active speaker among multiple acoustic sources using a microphone array. Microphone arrays are used to improve the quality of speech signals recorded in meeting rooms and other places. In this work, the time delay between the source and each microphone is estimated using a localization method called time difference of arrival (TDOA). TDOA localization consists of two steps: (a) a time delay estimator and (b) a localization estimator. For time delay estimation, the generalized cross-correlation using the phase transform, the generalized cross-correlation using maximum likelihood, the linear prediction (LP) residual and the Hilbert envelope of the LP residual are chosen for estimating the location of a person. A new speaker localization algorithm based on group search optimization (GSO) is proposed. The performance of this algorithm is analyzed and compared with the Gauss–Newton nonlinear least-squares method and a genetic algorithm. Experimental results show that the proposed GSO method outperforms the other methods in terms of mean square error, root mean square error, mean absolute error, mean absolute percentage error, Euclidean distance and mean absolute relative error.
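
The time delay estimation step named in this abstract can be illustrated with a minimal sketch of generalized cross-correlation with the phase transform (GCC-PHAT). This is not the authors' implementation: the function names `dft`, `gcc_phat_delay` and `delay_to_range_difference` are hypothetical, the naive O(n^2) DFT stands in for an FFT so that the example needs only the standard library, and the default speed of sound of 343 m/s is an assumption.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (illustrative stand-in for an FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def gcc_phat_delay(sig, ref, max_lag):
    """Return the lag in samples by which `sig` trails `ref` (negative if it leads)."""
    # Zero-pad both signals so the circular correlation does not wrap around.
    n = len(sig) + len(ref)
    sig_f = dft(list(sig) + [0.0] * (n - len(sig)))
    ref_f = dft(list(ref) + [0.0] * (n - len(ref)))
    # Cross-power spectrum of the two channels.
    cross = [s * r.conjugate() for s, r in zip(sig_f, ref_f)]
    # PHAT weighting: discard the magnitude and keep only the phase.
    whitened = [c / (abs(c) + 1e-12) for c in cross]
    cc = dft(whitened, inverse=True)
    # Search the peak of the weighted correlation over the allowed lags.
    return max(range(-max_lag, max_lag + 1), key=lambda lag: cc[lag % n].real)


def delay_to_range_difference(lag_samples, sample_rate_hz, speed_of_sound=343.0):
    """Convert a sample lag into a path-length difference in metres."""
    return lag_samples / sample_rate_hz * speed_of_sound
```

Each microphone pair yields one such delay; a localization estimator (Gauss–Newton, a genetic algorithm, or GSO in the abstract) would then search for the source position that best explains the set of range differences.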

2.
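Neural-network-based inverse hysteresis model   Total citations: 1, self-citations: 0, citations by others: 1
A new neural-network-based inverse hysteresis model is proposed. An elementary inverse hysteresis operator (EIHO) is constructed using a continuous coordinate transformation. The EIHO supplies the neural network with basic inverse-hysteresis information and, together with the input of the inverse hysteresis, serves as the network input. This converts the inverse hysteresis from a multi-valued mapping into a one-to-one mapping, so that it can be approximated by the neural network. A set of measured data is used to test the validity of the model, and the experimental results show that the modeling method is effective.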

3.
This paper proposes a multimodal approach to distinguish silence from speech situations, and to identify the location of the active speaker in the latter case. In our approach, a video camera is used to track the faces of the participants, and a microphone array is used to estimate the Sound Source Location (SSL) using the Steered Response Power with the phase transform (SRP-PHAT) method. The audiovisual cues are combined, and two competing Hidden Markov Models (HMMs) are used to detect silence or the presence of a person speaking. If speech is detected, the corresponding HMM also provides the spatio-temporally coherent location of the speaker. Experimental results show that incorporating the HMM improves the results over the unimodal SRP-PHAT, and the inclusion of video cues provides even further improvements.

4.
Multimedia Tools and Applications - Emotional speaker recognition under real-life conditions has become an urgent need for several applications. This paper proposes a novel approach using multiple...

5.
A hierarchical neural-network-based approach for circuit tuning at the post-fabrication stage is proposed. In this approach, measurements that characterize the behavior of the circuit under test are first selected. The best candidate circuit parameters for tuning are also determined. A training set comprising the selected circuit measurements is then constructed. These measurements are calculated during simulations in which the circuit parameter values are uniformly distributed in a tolerance region around their nominal values. The training set is fed to a self-organizing map neural network to cluster the measurements. The generated clusters are manipulated and classified via a hierarchical circuit tuning procedure. Based on this classification, tuning values for the tuning parameters are calculated. Situations in which the circuit cannot be tuned are also addressed. Experimental results indicate that the developed approach provides a robust and efficient technique for circuit tuning.

6.
7.
8.
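Localization system based on the DV-Hop localization algorithm and RSSI ranging   Total citations: 4, self-citations: 1, citations by others: 4
To address the problems of the DV-Hop algorithm in an experimental environment, a received signal strength indicator (RSSI) ranging module is added to assist localization and improve the algorithm. To implement the localization system, an RSSI model of the current experimental environment is first established. The model is then applied to control the DV-Hop localization process from two sides, the anchor nodes and the non-anchor nodes. Experiments show that, at the cost of a small increase in computational complexity, the improved localization system is more stable and more accurate, and can be applied to wireless sensor networks.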
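
The two ingredients mentioned here, an RSSI ranging model and DV-Hop distance estimation, can be sketched as follows. The abstract does not state which RSSI model was fitted, so the log-distance path-loss model below is an assumption, and the function names and parameters (`rssi_at_d0_dbm`, `path_loss_exponent`, `d0`) are hypothetical. The hop-size formula is the standard DV-Hop one: an anchor divides its total distance to the other anchors by the total hop count to them.

```python
import math


def rssi_to_distance(rssi_dbm, rssi_at_d0_dbm, path_loss_exponent, d0=1.0):
    """Invert the log-distance model RSSI(d) = RSSI(d0) - 10 * n * log10(d / d0)."""
    return d0 * 10 ** ((rssi_at_d0_dbm - rssi_dbm) / (10.0 * path_loss_exponent))


def average_hop_size(anchor, other_anchors):
    """Standard DV-Hop hop size for one anchor.

    `anchor` is an (x, y) position; `other_anchors` is a list of
    ((x, y), hop_count) pairs describing the remaining anchors.
    """
    total_distance = sum(
        math.hypot(anchor[0] - pos[0], anchor[1] - pos[1]) for pos, _ in other_anchors
    )
    total_hops = sum(hops for _, hops in other_anchors)
    return total_distance / total_hops


def dv_hop_distance(hop_size, hop_count):
    """Plain DV-Hop estimate of the distance from a node to an anchor."""
    return hop_size * hop_count
```

One plausible way to combine the two, again as an assumption rather than the paper's scheme, is to prefer the RSSI-derived distance for one-hop neighbours and fall back to the hop-based estimate for anchors that are several hops away.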

9.
A major challenge in automatic speaker verification (ASV) is to improve performance with short speech segments for end-user convenience in real-world applications. In this paper, we present a detailed analysis of the effects of duration variability on state-of-the-art i-vector and classical Gaussian mixture model-universal background model (GMM-UBM) based ASV systems. We observe an increase in the uncertainty of model parameter estimation for i-vector based ASV as the speech duration decreases. To compensate for the effect of duration variability in short utterances, we propose an adaptation technique for the Baum-Welch statistics estimation used in i-vector extraction. Information from pre-estimated background model parameters is used for the adaptation. The ASV performance with the proposed approach is considerably superior to that of the conventional i-vector based system. Furthermore, fusing the proposed i-vector based system with GMM-UBM further improves ASV performance, especially for short speech segments. Experiments conducted on two speech corpora, NIST SRE 2008 and 2010, show a relative improvement in equal error rate (EER) in the range of 12–20%.

10.
This correspondence introduces a new text-independent speaker verification method, derived from the basic idea in pattern recognition that the discriminating ability of a classifier can be improved by removing the information common to the classes. By looking for the speech characteristics common to a group of speakers, a global speaker model can be established. Subtracting the score obtained from this model normalizes the conventional likelihood score, resulting in a more compact score distribution and lower equal error rates. Several experiments are carried out to demonstrate the effectiveness of the proposed method.
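
The normalization described in this abstract amounts to subtracting the score of a global speaker model from the score of the claimed speaker's model. The sketch below illustrates that subtraction under simplifying assumptions: each model is a single diagonal-covariance Gaussian given as a `(mean, variance)` pair rather than whatever model family the authors used, and the names `gaussian_log_likelihood`, `normalized_score`, `verify` and the zero threshold are hypothetical.

```python
import math


def gaussian_log_likelihood(frames, mean, var):
    """Average per-frame log-likelihood under a diagonal-covariance Gaussian."""
    total = 0.0
    for frame in frames:
        for x, m, v in zip(frame, mean, var):
            total += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(frames)


def normalized_score(frames, speaker_model, global_model):
    """Claimed-speaker score minus global-model score (a log-likelihood ratio)."""
    speaker_ll = gaussian_log_likelihood(frames, *speaker_model)
    global_ll = gaussian_log_likelihood(frames, *global_model)
    return speaker_ll - global_ll


def verify(frames, speaker_model, global_model, threshold=0.0):
    """Accept the identity claim when the normalized score reaches the threshold."""
    return normalized_score(frames, speaker_model, global_model) >= threshold
```

Because the global model absorbs what all speakers have in common, the difference depends mainly on speaker-specific evidence, which is why a single threshold can work across utterances.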

11.
This paper presents a neural-network-based PID-like control strategy applicable to a class of nonlinear control problems commonly encountered in the process-control industry. An artificial neural network is used to compensate for the plant's nonlinear dynamics so that the overall closed-loop system can be described in terms of an equivalent error system. The strategy is described in detail, and then evaluated and compared with an alternative control system design that uses conventional gain-scheduled PID controllers. The paper includes real-time experimental results from applying the proposed technique to level control of a coupled-tanks system.

12.
In this article, a neural network with radial-basis functions (RBF-NN) is applied to microwave imaging of cylinders. Initially, the shape function of the target cylinder is expanded in a Fourier series. The RBF-NN is trained on direct-scattering data sets and can thus predict the images of the target cylinders. © 2004 Wiley Periodicals, Inc. Int J RF and Microwave CAE 14, 398–403, 2004.

13.
In financial time series forecasting, a problem we often encounter is how to increase the prediction accuracy as much as possible when the financial data are noisy. In this study, we discuss the use of supervised neural networks as a meta-learning technique to design a financial time series forecasting system that addresses this problem. In this system, data sampling techniques are first used to generate different training subsets from the original datasets. On these training subsets, different neural networks with different initial conditions or training algorithms are then trained to form different prediction models, i.e., base models. Subsequently, to improve the efficiency of metamodel predictions, principal component analysis (PCA) is used as a pruning tool to generate an optimal set of base models. Finally, a neural-network-based nonlinear metamodel is produced by learning from the selected base models, so as to improve the prediction accuracy. For illustration and verification purposes, the proposed metamodel is applied to four typical financial time series. The empirical results reveal that the proposed neural-network-based nonlinear metamodeling technique is a very promising approach to financial time series forecasting.

14.
Iris localization plays a decisive role in the overall performance of an iris biometric system, because it isolates the valid part of the iris. This study proposes a reliable iris localization technique that proceeds in three steps. First, it extracts the inner contour of the iris within a sliding window in an eye image using a multi-valued adaptive threshold and the two-dimensional (2D) properties of binary objects. Then, it localizes the outer contour of the iris using an edge-detecting operator in a sub-image centered at the pupil center. Finally, it regularizes the iris contours to compensate for their non-circular structure. The proposed technique is tested on the following public iris databases: CASIA V1.0, CASIA-Iris-Lamp, IITD V1.0, and MMU V1.0. The experimental results and accuracy of the proposed scheme, compared with other state-of-the-art techniques, confirm its satisfactory performance.

15.
ANNSTLF: a neural-network-based electric load forecasting system   Total citations: 10, self-citations: 0, citations by others: 10
A key component of the daily operation and planning activities of an electric utility is short-term load forecasting, i.e., the prediction of hourly loads (demand) for the next hour to several days out. The accuracy of such forecasts has a significant economic impact on the utility. This paper describes a load forecasting system known as ANNSTLF (artificial neural-network short-term load forecaster), which has received wide acceptance in the electric utility industry and is presently being used by 32 utilities across the USA and Canada. ANNSTLF can consider the effect of temperature and relative humidity on the load. Besides its load forecasting engine, ANNSTLF contains forecasters that can generate the hourly temperature and relative humidity forecasts needed by the system. ANNSTLF is based on a multiple-ANN strategy that captures various trends in the data. Both the first and the second generation of the load forecasting engine are discussed and compared. The building block of the forecasters is a multilayer perceptron trained with the error backpropagation learning rule. An adaptive scheme is employed to adjust the ANN weights during online forecasting. The forecasting models are site independent, and only the number of hidden-layer nodes of the ANNs needs to be adjusted for a new database. The results of testing the system on data from ten different utilities are reported.

16.
The double traveling salesman problem is a variation of the basic traveling salesman problem where targets can be reached by two salespersons operating in parallel. The real problem addressed by this work concerns the optimization of the harvest sequence for the two independent arms of a fruit-harvesting robot. This application poses further constraints, like a collision-avoidance function. The proposed solution is based on a self-organizing map structure, initialized with as many artificial neurons as the number of targets to be reached. One of the key components of the process is the combination of competitive relaxation with a mechanism for deleting and creating artificial neurons. Moreover, in the competitive relaxation process, information about the trajectory connecting the neurons is combined with the distance of neurons from the target. This strategy prevents tangles in the trajectory and collisions between the two tours. Results of tests indicate that the proposed approach is efficient and reliable for harvest sequence planning. Moreover, the enhancements added to the pure self-organizing map concept are of wider importance, as proved by a traveling salesman problem version of the program, simplified from the double version for comparison.

17.
In this paper, a new text-independent speaker verification method, GSMSV, is proposed based on likelihood score normalization. In this method, a global speaker model is established to represent the universal features of speech and to normalize the likelihood score. Statistical analysis demonstrates that this normalization removes the common factors of speech and brings out the differences between speakers. As a result, the equal error rate decreases significantly, the verification procedure is accelerated, and the system adapts better to speaking speed.

18.
Pattern localization is a fundamental task in machine vision, and autofocus is a requirement for any automated inspection system because it allows greater variation in the distance from the camera to the object being imaged. In this paper, we propose a unified approach to simultaneous autofocus and alignment for pattern localization by extending the idea of the image reference approach. Under the least trimmed squares (LTS) scheme, the proposed hybrid weighted Hausdorff distance (HWHD) is a robust similarity metric that combines the Hausdorff distance (HD) with edge-amplitude normalized gradient (EANG) matching. The EANG is designed to characterize the different degrees of blur at the edge points as focus cues, and is immune to illumination variations between the reference and the target image. We experimentally illustrate its performance on simulated as well as real data.
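
To give a feel for how trimming makes a Hausdorff-type distance robust, here is a minimal sketch of a directed Hausdorff distance with least-trimmed-style averaging. It is not the HWHD of the abstract: the EANG term and the weighting are omitted, the mean of plain (unsquared) nearest-neighbour distances is one of several possible trimmed variants, and the name `trimmed_directed_hausdorff` and the `keep_fraction` parameter are hypothetical.

```python
import math


def trimmed_directed_hausdorff(points_a, points_b, keep_fraction=0.8):
    """Mean of the smallest nearest-neighbour distances from A to B.

    For every point in A the distance to its closest point in B is computed;
    only the `keep_fraction` smallest of these distances are averaged, so
    outliers in A (for example spurious edge points) do not dominate.
    """
    nearest = sorted(
        min(math.hypot(ax - bx, ay - by) for bx, by in points_b)
        for ax, ay in points_a
    )
    keep = max(1, int(keep_fraction * len(nearest)))
    return sum(nearest[:keep]) / keep
```

With `keep_fraction=1.0` every point contributes, so a single stray edge point can inflate the distance; lowering the fraction discards the worst matches before averaging.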

19.
This paper proposes an efficient technique for automatic localization of the ear from side face images. The technique is rotation, scale and shape invariant and makes use of the connected components in a graph obtained from the edge map of the side face image. It has been evaluated on the IIT Kanpur database, consisting of 2672 side faces with variable sizes, rotations and shapes, and the University of Notre Dame database, containing 2244 side faces with variable background and poor illumination. Experimental results reveal the efficiency and robustness of the technique.

20.
This paper presents a novel global localization approach for mobile robots that exploits line-segment features in any structured environment. The main contribution of this paper is an effective data association approach, the Line-segment Relation Matching (LRM) technique, which is based on the generation and exploration of an Interpretation Tree (IT). A new representation of the geometric patterns of line segments, called the Relation Table, is proposed. It contains the relative geometric position of every line segment with respect to the others (or itself) in a coordinate-frame-independent sense. Based on this, a Relation-Table constraint is applied to minimize the search space of the IT, thereby greatly reducing the processing time of LRM. The least-squares algorithm is then applied to estimate the robot pose from the matched line-segment pairs. A global localization system can thus be realized by integrating the LRM technique with a hypothesis tracking framework that is able to handle pose ambiguity. Simulations were designed and carried out to show both the strengths and the weaknesses of our system compared with earlier methods. We also present practical experiments illustrating that our approach is highly robust against uncertainties from sensor occlusions and extraneous observations in a highly dynamic environment. Additionally, our system is shown to handle initialization easily and to recover quickly from a localization failure.
