55 results found (search time: 31 ms)
41.
Face recognition by independent component analysis (total citations: 74; self-citations: 0; citations by others: 74)
Bartlett M.S. Movellan J.R. Sejnowski T.J. 《Neural Networks, IEEE Transactions on》2002,13(6):1450-1464
A number of current face recognition algorithms use face representations found by unsupervised statistical methods. Typically these methods find a set of basis images and represent faces as a linear combination of those images. Principal component analysis (PCA) is a popular example of such methods. The basis images found by PCA depend only on pairwise relationships between pixels in the image database. In a task such as face recognition, in which important information may be contained in the high-order relationships among pixels, it seems reasonable to expect that better basis images may be found by methods sensitive to these high-order statistics. Independent component analysis (ICA), a generalization of PCA, is one such method. We used a version of ICA derived from the principle of optimal information transfer through sigmoidal neurons. ICA was performed on face images in the FERET database under two different architectures, one which treated the images as random variables and the pixels as outcomes, and a second which treated the pixels as random variables and the images as outcomes. The first architecture found spatially local basis images for the faces. The second architecture produced a factorial face code. Both ICA representations were superior to representations based on PCA for recognizing faces across days and changes in expression. A classifier that combined the two ICA representations gave the best performance.
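The abstract's description of PCA (basis images plus a linear combination per face) can be illustrated with a minimal sketch. It assumes NumPy and uses random toy data in place of face images; the function names `pca_basis`, `encode`, and `decode` are hypothetical and not taken from the paper.

```python
import numpy as np

def pca_basis(images, n_components):
    """Return (mean, basis); the rows of basis are orthonormal PCA basis images."""
    mean = images.mean(axis=0)
    centered = images - mean
    # Rows of vt are eigenvectors of the pixel covariance matrix,
    # i.e. they depend only on pairwise (second-order) pixel statistics.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def encode(image, mean, basis):
    """Coefficients of one image in the PCA basis."""
    return basis @ (image - mean)

def decode(coeffs, mean, basis):
    """Rebuild an image as the mean plus a linear combination of basis images."""
    return mean + coeffs @ basis

rng = np.random.default_rng(0)
faces = rng.normal(size=(6, 16))   # toy stand-in: 6 "images" of 16 pixels each
mean, basis = pca_basis(faces, n_components=5)
coeffs = encode(faces[0], mean, basis)
reconstruction = decode(coeffs, mean, basis)
```

Six centered images span at most five dimensions, so five components reconstruct each toy image exactly. An ICA variant would replace the SVD step with a method sensitive to higher-order statistics.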
42.
Jiucang Hao Attias H. Nagarajan S. Te-Won Lee Sejnowski T.J. 《IEEE transactions on audio, speech, and language processing》2009,17(1):24-37
This paper presents a new approximate Bayesian estimator for enhancing a noisy speech signal. The speech model is assumed to be a Gaussian mixture model (GMM) in the log-spectral domain. This is in contrast to most current models, which operate in the frequency domain. Exact signal estimation is a computationally intractable problem. We derive three approximations to enhance the efficiency of signal estimation. The Gaussian approximation transforms the log-spectral domain GMM into the frequency domain using a minimal Kullback-Leibler (KL) divergence criterion. The frequency domain Laplace method computes the maximum a posteriori (MAP) estimator for the spectral amplitude. Correspondingly, the log-spectral domain Laplace method computes the MAP estimator for the log-spectral amplitude. Further, the gain and noise spectrum adaptation are implemented using the expectation-maximization (EM) algorithm within the GMM under Gaussian approximation. The proposed algorithms are evaluated by applying them to enhance speech corrupted by speech-shaped noise (SSN). The experimental results demonstrate that the proposed algorithms offer improved signal-to-noise ratio, lower word recognition error rate, and less spectral distortion.
43.
Hao J Lee TW Sejnowski TJ 《IEEE transactions on audio, speech, and language processing》2010,18(6):1127-1136
This paper presents a novel probabilistic approach to speech enhancement. Instead of a deterministic logarithmic relationship, we assume a probabilistic relationship between the frequency coefficients and the log-spectra. The speech model in the log-spectral domain is a Gaussian mixture model (GMM). The frequency coefficients obey a zero-mean Gaussian whose covariance equals the exponential of the log-spectra. This results in a Gaussian scale mixture model (GSMM) for the speech signal in the frequency domain, since the log-spectra can be regarded as scaling factors. The probabilistic relation between frequency coefficients and log-spectra allows these to be treated as two random variables, both to be estimated from the noisy signals. Expectation-maximization (EM) was used to train the GSMM and Bayesian inference was used to compute the posterior signal distribution. Because exact inference of this full probabilistic model is computationally intractable, we developed two approaches to enhance the efficiency: the Laplace method and a variational approximation. The proposed methods were applied to enhance speech corrupted by Gaussian noise and speech-shaped noise (SSN). For both approximations, signals reconstructed from the estimated frequency coefficients provided higher signal-to-noise ratio (SNR) and those reconstructed from the estimated log-spectra produced lower word recognition error rate because the log-spectra fit the inputs to the recognizer better. Our algorithms effectively reduced the SSN, which algorithms based on spectral analysis were not able to suppress.
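The model structure described in this abstract can be written out as a sketch. The notation is an assumption, not the paper's: s is the log-spectrum vector with components s_k, x_k is the frequency coefficient in bin k, and π_c, μ_c, Σ_c are the GMM weights, means, and covariances. The complex-valued form of the coefficient density is not specified in the abstract and is left generic here.

```latex
\begin{aligned}
p(s) &= \sum_{c} \pi_c \, \mathcal{N}(s \mid \mu_c, \Sigma_c)
  && \text{(GMM in the log-spectral domain)} \\
p(x_k \mid s_k) &= \mathcal{N}\!\left(x_k \mid 0,\; e^{s_k}\right)
  && \text{(zero-mean Gaussian, variance } e^{s_k}\text{)} \\
p(x) &= \sum_{c} \pi_c \int \mathcal{N}(s \mid \mu_c, \Sigma_c)
  \prod_{k} \mathcal{N}\!\left(x_k \mid 0,\; e^{s_k}\right) ds
  && \text{(resulting Gaussian scale mixture)}
\end{aligned}
```

The integral over s has no closed form, which is consistent with the abstract's statement that exact inference is intractable and approximations are needed.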
44.
Classifying facial actions (total citations: 20; self-citations: 0; citations by others: 20)
Donato G. Bartlett M.S. Hager J.C. Ekman P. Sejnowski T.J. 《IEEE transactions on pattern analysis and machine intelligence》1999,21(10):974-989
The facial action coding system (FACS) is an objective method for quantifying facial movement in terms of component actions. This paper explores and compares techniques for automatically recognizing facial actions in sequences of images. These techniques include: analysis of facial motion through estimation of optical flow; holistic spatial analysis, such as principal component analysis, independent component analysis, local feature analysis, and linear discriminant analysis; and methods based on the outputs of local filters, such as Gabor wavelet representations and local principal components. Performance of these systems is compared to naive and expert human subjects. Best performances were obtained using the Gabor wavelet representation and the independent component representation, both of which achieved 96 percent accuracy for classifying 12 facial actions of the upper and lower face. The results provide converging evidence for the importance of using local filters, high spatial frequencies, and statistical independence for classifying facial actions.
45.
Makeig Scott; Jung Tzyy-Ping; Sejnowski Terrence J. 《Canadian Metallurgical Quarterly》2000,54(4):266
Examined performance patterns and concurrent EEG spectra in 4 Ss (mean age of 30.5 yrs) performing a continuous visuomotor compensatory tracking task in 15–20 min bouts during a 42-hr sleep deprivation study. During periods of good performance, participants made compensatory trackball movements about twice per second, attempting to keep a target disk near a central ring. Autocorrelations of the time series representing the distance of the target disk from the ring center showed that marked near-18-sec cycles in performance again appeared during periods of poor performance. There were phases of poor or absent performance accompanied by an increase in EEG power that was largest at 3–4 Hz. These studies show that in drowsy humans, opening and closing of the gates of behavioral awareness is marked not by the appearance of (12–14 Hz) sleep spindles, but by prominent EEG amplitude changes in the low theta band. Further, both EEG and behavioral changes during drowsiness often exhibit stereotyped 18-sec cycles. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
46.
K Zhang I Ginzburg BL McNaughton TJ Sejnowski 《Canadian Metallurgical Quarterly》1998,79(2):1017-1044
Physical variables such as the orientation of a line in the visual field or the location of the body in space are coded as activity levels in populations of neurons. Reconstruction or decoding is an inverse problem in which the physical variables are estimated from observed neural activity. Reconstruction is useful first in quantifying how much information about the physical variables is present in the population and, second, in providing insight into how the brain might use distributed representations in solving related computational problems such as visual object recognition and spatial navigation. Two classes of reconstruction methods, namely, probabilistic or Bayesian methods and basis function methods, are discussed. They include important existing methods as special cases, such as population vector coding, optimal linear estimation, and template matching. As a representative example for the reconstruction problem, different methods were applied to multi-electrode spike train data from hippocampal place cells in freely moving rats. The reconstruction accuracy of the trajectories of the rats was compared for the different methods. Bayesian methods were especially accurate when a continuity constraint was enforced, and the best errors were within a factor of two of the information-theoretic limit on how accurate any reconstruction can be and were comparable with the intrinsic experimental errors in position tracking. In addition, the reconstruction analysis uncovered some interesting aspects of place cell activity, such as the tendency for erratic jumps of the reconstructed trajectory when the animal stopped running. In general, the theoretical values of the minimal achievable reconstruction errors quantify how accurately a physical variable is encoded in the neuronal population in the sense of mean square error, regardless of the method used for reading out the information. 
One related result is that the theoretical accuracy is independent of the width of the Gaussian tuning function only in two dimensions. Finally, all the reconstruction methods considered in this paper can be implemented by a unified neural network architecture, which the brain feasibly could use to solve related problems.
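A minimal sketch of Bayesian reconstruction in one dimension follows. It assumes independent Poisson spiking and Gaussian tuning curves, which the abstract does not spell out, and it omits the continuity constraint. All names and parameter values (`tuning_curves`, `bayes_decode`, the 0.5 s window, the 20 Hz peak rate) are hypothetical.

```python
import numpy as np

def tuning_curves(positions, centers, width, peak_rate):
    """Mean firing rate (Hz) of each cell at each position; shape (cells, positions)."""
    d = positions[None, :] - centers[:, None]
    return peak_rate * np.exp(-0.5 * (d / width) ** 2)

def bayes_decode(counts, rates, window, prior=None):
    """Posterior over positions given spike counts observed in one time window.

    Assumes independent Poisson cells:
    log P(x | n) = sum_i n_i log f_i(x) - window * sum_i f_i(x) + log prior + const.
    """
    log_post = counts @ np.log(rates + 1e-12) - window * rates.sum(axis=0)
    if prior is not None:
        log_post = log_post + np.log(prior)
    log_post -= log_post.max()          # avoid overflow before exponentiating
    post = np.exp(log_post)
    return post / post.sum()

positions = np.linspace(0.0, 1.0, 101)  # candidate positions on a unit track
centers = np.linspace(0.0, 1.0, 11)     # place-field centers of 11 toy cells
rates = tuning_curves(positions, centers, width=0.1, peak_rate=20.0)
window = 0.5                            # seconds of spikes per estimate
true_index = 37                         # true position is positions[37] = 0.37
counts = np.round(rates[:, true_index] * window)  # noise-free expected counts
posterior = bayes_decode(counts, rates, window)
estimate = positions[np.argmax(posterior)]
```

With real spike trains, `counts` would come from the recorded data, and a continuity prior could be passed through the `prior` argument to penalize jumps between successive estimates.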
47.
KT Moortgat CH Keller TH Bullock TJ Sejnowski 《Canadian Metallurgical Quarterly》1998,95(8):4684-4689
What are the limits and modulators of neural precision? We address this question in the most regular biological oscillator known, the electric organ command nucleus in the brainstem of wave-type gymnotiform fish. These fish produce an oscillating electric field, the electric organ discharge (EOD), used in electrolocation and communication. We show here that the EOD precision, measured by the coefficient of variation (CV = SD/mean period) is as low as 2 × 10^-4 in five species representing three families that range widely in species and individual mean EOD frequencies (70-1,250 Hz). Intracellular recording in the pacemaker nucleus (Pn), which commands the EOD cycle by cycle, revealed that individual Pn neurons of the same species also display an extremely low CV (CV = 6 × 10^-4, 0.8 μs SD). Although the EOD CV can remain at its minimum for hours, it varies with novel environmental conditions, during communication, and spontaneously. Spontaneous changes occur as abrupt steps (250 ms), oscillations (3-5 Hz), or slow ramps (10-30 s). Several findings suggest that these changes are under active control and depend on behavioral state: mean EOD frequency and CV can change independently; CV often decreases in response to behavioral stimuli; and lesions of one of the two inputs to the Pn had more influence on CV than lesions of the other input.
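The precision measure defined in this abstract, CV = SD/mean period, can be computed as in the sketch below. The period values are made up for illustration, and the use of the sample standard deviation (`ddof=1`) is an assumption, since the abstract does not say which estimator was used.

```python
import numpy as np

def coefficient_of_variation(periods):
    """CV = SD / mean of a sequence of cycle periods (sample SD, ddof=1)."""
    periods = np.asarray(periods, dtype=float)
    return periods.std(ddof=1) / periods.mean()

# Hypothetical cycle periods in milliseconds (a 500 Hz oscillator with tiny jitter)
periods_ms = [2.0000, 2.0004, 1.9996, 2.0002, 1.9998]
cv = coefficient_of_variation(periods_ms)
```

Because CV is a ratio, it is unitless, so periods measured in any consistent time unit give the same value.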
48.
M Bazhenov I Timofeev M Steriade TJ Sejnowski 《Canadian Metallurgical Quarterly》1998,18(16):6444-6465
Repetitive stimulation of the dorsal thalamus at 7-14 Hz produces an increasing number of spikes at an increasing frequency in neocortical neurons during the first few stimuli. Possible mechanisms underlying these cortical augmenting responses were analyzed with a computer model that included populations of thalamocortical cells, thalamic reticular neurons, up to two layers of cortical pyramidal cells, and cortical inhibitory interneurons. Repetitive thalamic stimulation produced a low-threshold intrathalamic augmentation in the model based on the deinactivation of the low-threshold Ca2+ current in thalamocortical cells, which in turn induced cortical augmenting responses. In the cortical model, augmenting responses were more powerful in the "input" layer compared with those in the "output" layer. Cortical stimulation of the network model produced augmenting responses in cortical neurons in distant cortical areas through corticothalamocortical loops and low-threshold intrathalamic augmentation. Thalamic stimulation was more effective in eliciting augmenting responses than cortical stimulation. Intracortical inhibition had an important influence on the genesis of augmenting responses in cortical neurons: a shift in the balance between intracortical excitation and inhibition toward excitation transformed an augmenting response into a long-lasting paroxysmal discharge. The predictions of the model were compared with in vivo recordings from neurons in cortical area 4 and thalamic ventrolateral nucleus of anesthetized cats. The known intrinsic properties of thalamic cells and thalamocortical interconnections can account for the basic properties of cortical augmenting responses.
49.
Independent component analysis using an extended infomax algorithm for mixed subgaussian and supergaussian sources (total citations: 3; self-citations: 0; citations by others: 3)
An extension of the infomax algorithm of Bell and Sejnowski (1995) is presented that is able to blindly separate mixed signals with sub- and supergaussian source distributions. This was achieved by using a simple type of learning rule first derived by Girolami (1997) by choosing negentropy as a projection pursuit index. Parameterized probability distributions that have sub- and supergaussian regimes were used to derive a general learning rule that preserves the simple architecture proposed by Bell and Sejnowski (1995), is optimized using the natural gradient by Amari (1998), and uses the stability analysis of Cardoso and Laheld (1996) to switch between sub- and supergaussian regimes. We demonstrate that the extended infomax algorithm is able to separate 20 sources with a variety of source distributions easily. Applied to high-dimensional data from electroencephalographic recordings, it is effective at separating artifacts such as eye blinks and line noise from weaker electrical signals that arise from sources in the brain.
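The learning rule with regime switching can be sketched as below. The abstract does not state the formulas, so this assumes the commonly cited form of the rule, ΔW ∝ [I − K tanh(u)uᵀ − uuᵀ]W, with switching signs k_i = sign(E[sech²(u_i)]·E[u_i²] − E[u_i tanh(u_i)]). The function names, learning rate, mixing matrix, and toy sources are all hypothetical.

```python
import numpy as np

def switching_signs(u):
    """Per-row sign: +1 if the estimate looks supergaussian, -1 if subgaussian."""
    sech2 = 1.0 / np.cosh(u) ** 2
    criterion = (sech2.mean(axis=1) * (u ** 2).mean(axis=1)
                 - (np.tanh(u) * u).mean(axis=1))
    return np.sign(criterion)

def extended_infomax_step(W, x, learning_rate=0.01):
    """One batch natural-gradient update of the unmixing matrix W.

    x has shape (n_signals, n_samples); the source estimates are u = W @ x.
    """
    n_signals, n_samples = x.shape
    u = W @ x
    k = switching_signs(u)
    grad = (np.eye(n_signals)
            - (k[:, None] * np.tanh(u)) @ u.T / n_samples
            - u @ u.T / n_samples)
    return W + learning_rate * grad @ W

rng = np.random.default_rng(0)
n = 20000
sources = np.vstack([
    rng.uniform(-np.sqrt(3), np.sqrt(3), n),  # subgaussian source, unit variance
    rng.laplace(0.0, 1 / np.sqrt(2), n),      # supergaussian source, unit variance
])
signs = switching_signs(sources)              # evaluated on the true sources

A = np.array([[1.0, 0.6], [0.4, 1.0]])        # hypothetical mixing matrix
W = np.eye(2)
for _ in range(200):
    W = extended_infomax_step(W, A @ sources)
```

How well `W @ A` approaches a scaled permutation matrix depends on the learning rate and iteration count chosen here, which are illustrative rather than tuned.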
50.
The basis function theory of spatial representations explains how neurons in the parietal cortex can perform nonlinear transformations from sensory to motor coordinates. The authors present computer simulations showing that unilateral parietal lesions leading to a neuronal gradient in basis function maps can account for the behavior of patients with hemineglect, including (a) neglect in line cancellation and line bisection experiments; (b) neglect in multiple frames of reference simultaneously; (c) relative neglect, a form of what is sometimes called object-centered neglect; and (d) neglect without optic ataxia. Contralateral neglect arises in the model because the lesion produces an imbalance in the salience of stimuli that is modulated by the orientation of the body in space. These results strongly support the basis function theory for spatial representations in humans and provide a computational model of hemineglect at the single-cell level. (PsycINFO Database Record (c) 2010 APA, all rights reserved)