Similar Documents
 Found 20 similar documents (search time: 78 ms)
1.
Fog is a major factor affecting expressway traffic safety. Automatic recognition of expressway fog visibility from surveillance images can provide technical support for intelligent management and decision-making by traffic authorities. Starting from the atmospheric scattering model, several physical factors related to fog density are identified, and a multi-path fusion recognition network integrating these factors is proposed. The network uses three paths to jointly learn deep visual features, transmission-matrix features, and scene-depth features, and an attention fusion module is designed to adaptively fuse the three feature types for visibility-level recognition. A synthetic dataset and a real expressway-scene dataset are also constructed for network parameter learning and performance evaluation; the images in the real dataset were collected from surveillance videos of several expressways in China. Experiments on both datasets show that the proposed method adapts to different surveillance scenes and recognizes visibility levels more accurately than existing methods, effectively improving recognition accuracy.
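A minimal sketch, in Python/NumPy, of the attention-fusion step described above; the projection shapes and the softmax scoring are illustrative assumptions, not the authors' published architecture:

```python
# Attention-based fusion of three feature streams (visual, transmission,
# depth): per-stream scores -> softmax weights -> weighted sum.
import numpy as np

def attention_fuse(streams, W, v):
    """streams: list of (d,) feature vectors; W: (k, d) projection; v: (k,) scorer."""
    scores = np.array([v @ np.tanh(W @ s) for s in streams])  # one scalar per stream
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                                  # softmax attention weights
    return sum(w * s for w, s in zip(weights, streams))       # adaptively fused feature

rng = np.random.default_rng(0)
visual, transmission, depth = (rng.normal(size=64) for _ in range(3))
fused = attention_fuse([visual, transmission, depth],
                       W=rng.normal(size=(16, 64)), v=rng.normal(size=16))
print(fused.shape)  # (64,) -- fed to a visibility-level classifier
```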

2.
Although iris recognition technology has been reported to be more stable and reliable than other biometric systems, performance can be degraded by many factors such as small eyes, camera defocusing, eyelash occlusions and specular reflections on the surface of glasses. In this paper, we propose a new multi-unit iris authentication method for mobile phones that uses score-level fusion based on a support vector machine (SVM) together with a quality assessment method. Compared to previous research, this paper presents the following two contributions. First, we reduced the false rejection rate and improved iris recognition accuracy by using iris quality assessment. Second, if both iris images were determined to be of bad quality, we captured the iris images again without running the recognition process. If only one of the left and right iris images was regarded as good, it was used alone for recognition. However, if both the left and right iris images were good, we performed multi-unit iris recognition using score-level fusion based on an SVM. Experimental results showed that the accuracy of the proposed method was superior to previous methods that used only one good iris image or that used conventional fusion methods.
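A hedged sketch of the decision logic (quality gating plus SVM score fusion); the quality measure, matcher, and thresholds below are placeholders, not the paper's implementation:

```python
import numpy as np
from sklearn.svm import SVC

def authenticate(left, right, quality, match_score, svm, q_thresh=0.5):
    ql, qr = quality(left), quality(right)
    if ql < q_thresh and qr < q_thresh:
        return "recapture"                        # both images bad: capture again
    if ql >= q_thresh and qr >= q_thresh:         # both good: fuse the two scores
        s = np.array([[match_score(left), match_score(right)]])
        return "accept" if svm.predict(s)[0] == 1 else "reject"
    good = left if ql >= qr else right            # one good image: use it alone
    return "accept" if match_score(good) > 0.5 else "reject"

# toy fusion SVM trained on (left_score, right_score) pairs
X = np.vstack([np.random.rand(50, 2) * 0.4,          # impostor score pairs
               0.6 + np.random.rand(50, 2) * 0.4])   # genuine score pairs
svm = SVC().fit(X, np.array([0] * 50 + [1] * 50))

print(authenticate("imgL", "imgR",
                   quality=lambda im: 0.9,           # placeholder quality measure
                   match_score=lambda im: 0.8, svm=svm))
```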

3.
In this paper we propose a new approach to real-time view-based pose recognition and interpolation. Pose recognition is particularly useful for identifying camera views in databases, video sequences, video streams, and live recordings. All of these applications require a fast pose recognition process, in many cases at video real-time rates. It should further be possible to extend the database with new material, i.e., to update the recognition system online. The method that we propose is based on P-channels, a special kind of information representation which combines advantages of histograms and local linear models. Our approach is motivated by its similarity to information representation in biological systems, but its main advantage is its robustness against common distortions such as clutter and occlusion. The recognition algorithm consists of three steps: (1) low-level image features for color and local orientation are extracted at each point of the image; (2) these features are encoded into P-channels by combining similar features within local image regions; (3) the query P-channels are compared to a set of prototype P-channels in a database using a least-squares approach. The algorithm is applied in two scene registration experiments with fisheye camera data, one for pose interpolation from synthetic images and one for finding the nearest view in a set of real images. The method compares favorably to SIFT-based methods, in particular concerning interpolation. The method can be used for initializing pose-tracking systems, either when starting the tracking or when the tracking has failed and the system needs to re-initialize. Due to its real-time performance, the method can also be embedded directly into the tracking system, allowing a sensor fusion unit to choose dynamically between frame-by-frame tracking and pose recognition.
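The least-squares matching step lends itself to a short sketch; the prototype matrix, the pose interpolation rule, and all dimensions below are illustrative assumptions:

```python
# Express the query channel vector as a least-squares combination of
# prototype channel vectors; the coefficients pick the nearest view and
# can also interpolate the associated poses.
import numpy as np

rng = np.random.default_rng(1)
P = rng.random((256, 40))             # 40 prototype P-channel vectors (d=256)
poses = rng.uniform(0, 360, 40)       # pose (e.g. yaw angle) of each prototype
q = P[:, 7] * 0.6 + P[:, 8] * 0.4     # query built from two neighbouring views

w, *_ = np.linalg.lstsq(P, q, rcond=None)    # least-squares coefficients
best = int(np.argmax(w))                     # nearest view ...
w_pos = np.maximum(w, 0)
pose_hat = poses @ (w_pos / w_pos.sum())     # ... or an interpolated pose
print(best, pose_hat)
```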

4.
This paper studies in-air handwriting recognition based on a 3D accelerometer and proposes a classification method based on fused time-frequency features. Short-time energy (STE) features and low-frequency components are extracted from the acceleration data, and frequency-domain features (WPD+FFT) are extracted after a fast Fourier transform. The time-domain STE features and the frequency-domain WPD+FFT features are fused, the fused features are reduced in dimensionality with principal component analysis, and a support vector machine is used for classification. Experimental results show that the method improves the recognition rate of the in-air handwriting recognition system.
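The pipeline maps naturally onto scikit-learn; the frame size, FFT band, and labels below are toy assumptions (and the wavelet-packet step is simplified away):

```python
# Time-frequency fusion: short-time energy (time domain) + FFT magnitudes
# (frequency domain), concatenated, PCA-reduced, classified with an SVM.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

def features(acc, frame=32):
    frames = acc[: len(acc) // frame * frame].reshape(-1, frame)
    ste = (frames ** 2).sum(axis=1)            # short-time energy per frame
    spec = np.abs(np.fft.rfft(acc))[:64]       # low-frequency FFT magnitudes
    return np.concatenate([ste, spec])         # fused time-frequency vector

rng = np.random.default_rng(2)
X = np.array([features(rng.normal(size=512)) for _ in range(40)])
y = rng.integers(0, 4, 40)                     # 4 handwriting classes (toy labels)
clf = make_pipeline(PCA(n_components=10), SVC()).fit(X, y)
```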

5.
6.
Audio-visual speech modeling for continuous speech recognition   (total citations: 3; self-citations: 0; others: 3)
This paper describes a speech recognition system that uses both acoustic and visual speech information to improve recognition performance in noisy environments. The system consists of three components: a visual module, an acoustic module, and a sensor fusion module. The visual module locates and tracks the lip movements of a given speaker and extracts relevant speech features. This task is performed with an appearance-based lip model that is learned from example images. Visual speech features are represented by contour information of the lips and grey-level information of the mouth area. The acoustic module extracts noise-robust features from the audio signal. Finally, the sensor fusion module is responsible for the joint temporal modeling of the acoustic and visual feature streams and is realized using multistream hidden Markov models (HMMs). The multistream method allows the definition of different temporal topologies and levels of stream integration and hence enables the modeling of temporal dependencies more accurately than traditional approaches. We present two different methods to learn the asynchrony between the two modalities and to incorporate it in the multistream models. The superior performance of the proposed system is demonstrated on a large multispeaker database of continuously spoken digits. On a recognition task at 15 dB acoustic signal-to-noise ratio (SNR), acoustic perceptual linear prediction (PLP) features lead to a 56% error rate, noise-robust RASTA-PLP (relative spectra) acoustic features to a 7.2% error rate, and combined noise-robust acoustic and visual features to a 2.5% error rate.
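The multistream combination rule can be illustrated with a standard weighted log-likelihood formulation (a common form in the literature, not necessarily the paper's exact parameterisation):

```python
# Per-state log-likelihoods of the audio and video streams are combined
# with stream exponents before Viterbi decoding.
import numpy as np

def combine_streams(logp_audio, logp_video, lam=0.7):
    """lam weights the acoustic stream, (1 - lam) the visual stream.
    At low acoustic SNR, lam would be lowered to trust the lips more."""
    return lam * logp_audio + (1.0 - lam) * logp_video

logp_a = np.log(np.array([0.6, 0.3, 0.1]))    # acoustic state likelihoods
logp_v = np.log(np.array([0.2, 0.7, 0.1]))    # visual state likelihoods
print(np.argmax(combine_streams(logp_a, logp_v)))  # fused most-likely state
```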

7.
In this paper we address the problem of classifying vector sets. We motivate and introduce a novel method based on comparisons between corresponding vector subspaces. In particular, there are two main areas of novelty: (i) we extend the concept of principal angles between linear subspaces to manifolds with arbitrary nonlinearities; (ii) it is demonstrated how boosting can be used for application-optimal principal angle fusion. The strengths of the proposed method are empirically demonstrated on the task of automatic face recognition (AFR), in which it is shown to outperform state-of-the-art methods in the literature.
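The linear building block, principal angles between two subspaces, can be computed from the SVD of the product of orthonormal bases; the nonlinear manifold extension and the boosted fusion are not reproduced in this sketch:

```python
import numpy as np

def principal_angles(A, B):
    """A, B: (d, k) matrices whose columns span the two subspaces."""
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    s = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.arccos(np.clip(s, -1.0, 1.0))   # angles in radians, ascending

rng = np.random.default_rng(3)
A = rng.normal(size=(100, 5))                 # vector set 1 (e.g. face images)
B = A @ rng.normal(size=(5, 5)) + 0.01 * rng.normal(size=(100, 5))
print(principal_angles(A, B))                 # near zero: very similar subspaces
```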

8.
This paper addresses the robustness issue of information fusion for visual recognition. Analyzing limitations in existing fusion methods, we discover two key factors affecting the performance and robustness of a fusion model under different data distributions, namely (1) data dependency and (2) the fusion assumption on the posterior distribution. Considering these two factors, we develop a new framework to model dependency based on probabilistic properties of posteriors without any assumption on the data distribution. Making use of the range characteristics of posteriors, the fusion model is formulated as an analytic function multiplied by a constant with respect to the class label. With the analytic fusion model, we give an equivalent condition to the independence assumption and derive the dependency model from the marginal distribution property. Since the number of terms in the dependency model increases exponentially, the Reduced Analytic Dependency Model (RADM) is proposed based on the convergent property of analytic functions. Finally, the optimal coefficients in the RADM are learned by incorporating label information from training data to minimize the empirical classification error under a regularized least-squares criterion, which ensures discriminative power. Experimental results from robust non-parametric statistical tests show that the proposed RADM method statistically significantly outperforms eight state-of-the-art score-level fusion methods on eight image/video datasets for different tasks of digit, flower, face, human action, object, and consumer video recognition.
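A very reduced sketch of the idea: expand the classifiers' posteriors analytically (here a degree-2 polynomial truncation) and fit the coefficients by regularized least squares against the labels. This is an interpretation of the abstract, not the authors' derivation:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(4)
P = rng.random((200, 3))                      # posteriors from 3 classifiers
y = (P.mean(axis=1) > 0.5).astype(float)      # toy ground-truth labels

Phi = PolynomialFeatures(degree=2).fit_transform(P)  # truncated analytic expansion
fuser = Ridge(alpha=1.0).fit(Phi, y)          # regularized least-squares coefficients
scores = fuser.predict(Phi)                   # fused scores for classification
```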

9.
A face recognition method fusing two kinds of principal component analysis   (total citations: 1; self-citations: 1; others: 0)
A face recognition method that fuses two kinds of principal component analysis is proposed. First, two different principal component analysis methods are used to obtain separate face recognition results. Then, from an information-fusion perspective, the results are fused using the principle of fuzzy comprehensive evaluation, giving the final recognition result. Experiments on the ORL face database show that the recognition performance of this method is better than that of either single principal component analysis method.
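A sketch, with assumed normalisation and weights, of fuzzy comprehensive fusion of the two PCA outputs:

```python
# Each PCA variant yields a membership vector over identities; a weighted
# fuzzy combination gives the final decision.
import numpy as np

def fuzzy_fuse(m1, m2, w=(0.5, 0.5)):
    m1, m2 = m1 / m1.sum(), m2 / m2.sum()     # normalize to membership degrees
    fused = w[0] * m1 + w[1] * m2             # weighted-average fuzzy operator
    return int(np.argmax(fused))              # final identity

m_pca1 = np.array([0.1, 0.7, 0.2])            # memberships from PCA variant 1
m_pca2 = np.array([0.2, 0.5, 0.3])            # memberships from PCA variant 2
print(fuzzy_fuse(m_pca1, m_pca2))             # -> 1
```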

10.
ICA-based ear recognition with nonlinear adaptive feature fusion   (total citations: 3; self-citations: 0; others: 3)
To address the poor robustness of single-feature ear recognition to rotation, a nonlinear adaptive feature-fusion method is proposed. First, two complementary independent-component features of the ear are extracted; they are then concatenated with weights to form a high-dimensional fused feature; finally, kernel principal component analysis is applied for nonlinear dimensionality reduction. Experimental results show that, under ear pose rotation, the fused feature markedly improves the recognition rate over a single feature, and the proposed method outperforms traditional concatenation-based fusion.
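A sketch of the fusion chain; the weight value and kernel choice are assumptions:

```python
# Two complementary ICA feature sets are weighted, concatenated, and
# reduced nonlinearly with kernel PCA.
import numpy as np
from sklearn.decomposition import KernelPCA

rng = np.random.default_rng(5)
F1 = rng.normal(size=(60, 40))                # ICA feature set 1 (60 ear images)
F2 = rng.normal(size=(60, 40))                # complementary ICA feature set 2

a = 0.6                                       # adaptive weight (fixed here for the sketch)
fused = np.hstack([a * F1, (1 - a) * F2])     # weighted serial concatenation
low = KernelPCA(n_components=15, kernel="rbf").fit_transform(fused)
print(low.shape)                              # (60, 15) nonlinear embedding
```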

11.
In traditional multi-biometric fusion recognition methods, hand-crafted feature extraction is blind and inconsistent, and feature fusion suffers from spatial mismatch or excessively high dimensionality. To address this, a multi-biometric fusion recognition method based on deep learning is proposed, which combines convolutional neural network (CNN) extraction of face and iris features, parametric t-SNE feature dimensionality reduction, and support vector machine (SVM) classification. Experimental results show that, compared with single-biometric recognition and other fusion methods, the proposed fusion method is more robust and clearly improves recognition performance.
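A rough sketch of the chain; as a stated simplification, scikit-learn's non-parametric t-SNE stands in for the parametric t-SNE of the paper (parametric t-SNE, unlike this one, can map unseen samples), and random vectors stand in for CNN embeddings:

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.svm import SVC

rng = np.random.default_rng(6)
face = rng.normal(size=(100, 128))            # CNN face embeddings (stand-in)
iris = rng.normal(size=(100, 128))            # CNN iris embeddings (stand-in)
y = rng.integers(0, 10, 100)                  # subject labels (toy)

fused = np.hstack([face, iris])               # feature-level fusion
low = TSNE(n_components=2, perplexity=20).fit_transform(fused)
clf = SVC().fit(low, y)                       # fused-feature classifier
```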

12.
Subtitle recognition under multimodal data fusion, as studied in this paper, aims to recognize text lines from image and audio data. Most existing multimodal fusion methods rely on pre-fusion or post-fusion, which is neither well justified nor easy to interpret. We believe that fusing images and audio before the decision layer, i.e., intermediate fusion, takes advantage of the complementary multimodal data and will benefit text line recognition. To this end, we propose: (i) a novel cyclic autoencoder based on a convolutional neural network. The feature dimensions of the two modalities are aligned while stabilizing the compressed image features, so that the high-dimensional features of the different modalities are fused at a shallow level of the model; (ii) a residual attention mechanism that helps improve recognition performance. Regions of interest in the image are enhanced and regions of disinterest are weakened, so the features of text regions can be extracted without further increasing the depth of the model; (iii) a fully convolutional network for video subtitle recognition. We choose DenseNet-121 as the backbone network for feature extraction, which effectively enables the recognition of video subtitles against complex backgrounds. The experiments are performed on our custom datasets, and the automatic and manual evaluation results show that our method achieves state-of-the-art performance.
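The residual attention idea in (ii) reduces to a one-line operation; the shapes and the sigmoid mask below are assumptions:

```python
# A soft mask M in [0, 1] re-weights the feature map; the residual form
# (1 + M) * F keeps the original features where the mask is uninformative.
import numpy as np

def residual_attention(F, M):
    """F: (C, H, W) feature map; M: (1, H, W) attention mask in [0, 1]."""
    return (1.0 + M) * F       # enhance text regions, keep the rest intact

rng = np.random.default_rng(7)
F = rng.normal(size=(64, 8, 32))
M = 1 / (1 + np.exp(-rng.normal(size=(1, 8, 32))))   # sigmoid mask
print(residual_attention(F, M).shape)                # (64, 8, 32)
```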

13.
Organisations are increasingly relying on Big Data to discover correlations and patterns in data that would previously have remained hidden, and to use this new information to increase the quality of their business activities. In this paper we present a ‘story’ of Big Data from initial data collection to final visualization, by way of data fusion, analysis, and clustering. For this, we present a complete workflow on (a) how to represent the heterogeneous collected data using the high-performance RDF language, how to perform the fusion of the Big Data in RDF by resolving the issue of entity disambiguation, and how to query those data to provide more relevant and complete knowledge; and (b) since the data arrive as streams, we propose batchStream, a micro-batching version of the growing neural gas approach, which is capable of clustering data streams with a single pass over the data. The batchStream algorithm allows us to discover clusters of arbitrary shape without any assumption on the number of clusters. This Big Data workflow is implemented on the Spark platform, and we demonstrate it on synthetic and real data.
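A heavily simplified sketch of the single-pass, micro-batching idea behind batchStream; the full growing-neural-gas machinery (edge ageing, node insertion and removal) is omitted:

```python
# Each micro-batch of the stream moves its nearest prototypes toward the
# batch mean of the points it won -- one pass, no revisiting of old data.
import numpy as np

def micro_batch_update(protos, batch, lr=0.1):
    d = ((batch[:, None, :] - protos[None, :, :]) ** 2).sum(-1)
    win = d.argmin(axis=1)                    # nearest prototype per point
    for k in np.unique(win):
        protos[k] += lr * (batch[win == k].mean(axis=0) - protos[k])
    return protos

rng = np.random.default_rng(8)
protos = rng.normal(size=(5, 2))              # initial prototypes
for _ in range(100):                          # stream of micro-batches
    protos = micro_batch_update(protos, rng.normal(size=(32, 2)))
```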

14.
Multimodal fusion is a complex topic. For surveillance applications, audio–visual fusion is very promising given the complementary nature of the two streams. However, drawing the correct conclusion from multi-sensor data is not straightforward. In previous work we analysed a database with audio–visual recordings of unwanted behavior in trains (Lefter et al., 2012) and focused on a limited subset of the recorded data. We collected multi- and unimodal assessments by humans, who gave aggression scores on a 3-point scale. We showed that there are no trivial fusion algorithms to predict the multimodal labels from the unimodal labels, since part of the information is lost when using the unimodal streams. We proposed an intermediate step to discover the structure in the fusion process. This step is based upon meta-features, and we find a set of five which have an impact on the fusion process. In this paper we extend the findings of (Lefter et al., 2012) to the general case using the entire database. We show that the meta-features have a positive effect on the fusion process in terms of labels. We then compare three fusion methods that encapsulate the meta-features. They are based on automatic prediction of the intermediate-level variables and of multimodal aggression from state-of-the-art low-level acoustic, linguistic, and visual features. The first fusion method applies multiple classifiers to predict intermediate-level features from the low-level features, and then predicts the multimodal label from the intermediate variables. The other two approaches are based on probabilistic graphical models, one using (Dynamic) Bayesian Networks and the other Conditional Random Fields. We learn that each approach has its strengths and weaknesses in predicting specific aggression classes, and using the meta-features yields significant improvements in all cases.
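The first fusion method (stacked classifiers over intermediate meta-features) can be sketched as follows; the features, the five meta-features, and the labels are all stand-ins:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(9)
X_low = rng.normal(size=(300, 20))            # low-level audio+visual features
meta = rng.integers(0, 2, (300, 5))           # five intermediate meta-features
y = rng.integers(0, 3, 300)                   # 3-point aggression score

# stage 1: one classifier per meta-feature, from low-level features
stage1 = [LogisticRegression(max_iter=1000).fit(X_low, meta[:, j]) for j in range(5)]
meta_hat = np.column_stack([m.predict(X_low) for m in stage1])

# stage 2: multimodal aggression label from the predicted meta-features
stage2 = LogisticRegression(max_iter=1000).fit(meta_hat, y)
```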

15.
Multimodal emotion recognition is an important topic in affective computing. This paper studies bimodal emotion recognition from facial expressions and body gestures and proposes a method based on bilateral sparse partial least squares (BSPLS). First, spatio-temporal features of the expression and gesture modalities are extracted from video sequences as emotion feature vectors. Then BSPLS-based dimensionality reduction is used to further extract emotion features from the two modalities, which are combined into a new emotion feature vector. Finally, two classifiers are used for emotion classification. Experiments are conducted on the widely used FABO bimodal (expression and gesture) emotion database, and the method is compared with several subspace methods (principal component analysis, canonical correlation analysis, partial least squares regression) to evaluate its recognition performance. The results show that fusing the two modalities is more effective than using a single modality, and BSPLS achieves the highest emotion recognition rate among the compared methods.
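A sketch of the reduce-then-fuse step; as a stated simplification, scikit-learn's dense PLS stands in for bilateral *sparse* PLS (BSPLS):

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(10)
face = rng.normal(size=(80, 200))             # spatio-temporal expression features
body = rng.normal(size=(80, 150))             # spatio-temporal gesture features
y = np.eye(6)[rng.integers(0, 6, 80)]         # one-hot emotion labels (toy)

pls_f = PLSRegression(n_components=8).fit(face, y)
pls_b = PLSRegression(n_components=8).fit(body, y)
fused = np.hstack([pls_f.transform(face), pls_b.transform(body)])  # new emotion vector
```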

16.
In this paper we focus on the aggregation of IDS alerts, an important component of the alert fusion process. We exploit fuzzy measures and fuzzy sets to design simple and robust alert aggregation algorithms. Exploiting fuzzy sets, we are able to state robustly whether or not two alerts are “close in time”, while dealing with noisy and delayed detections. A performance metric for the evaluation of fusion systems is also proposed. Finally, we evaluate the fusion method on alert streams from an anomaly-based IDS.
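The fuzzy “close in time” test can be sketched with a trapezoidal membership function; the shape and widths below are assumptions:

```python
# A membership over the inter-alert delay tolerates noisy and delayed
# detections better than a crisp time window.
def close_in_time(t1, t2, full=2.0, zero=10.0):
    """Membership in [0, 1]: 1 below `full` seconds apart, 0 beyond `zero`."""
    dt = abs(t1 - t2)
    if dt <= full:
        return 1.0
    if dt >= zero:
        return 0.0
    return (zero - dt) / (zero - full)        # linear shoulder in between

# aggregate two alerts only if the membership exceeds a chosen cut
print(close_in_time(100.0, 103.5))            # 0.8125 -> likely the same event
```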

17.
赵炯  樊养余 《测控技术》2010,29(11):37-40
A new KCCA feature-fusion algorithm is proposed. First, local SIFT features and global Pseudo-Zernike moment features are extracted from the target image, and the local features are preprocessed with the K-means algorithm. Then KCCA is used to extract correlated features from the two feature sets for fusion, and the fused features are fed to an SVM classifier. Classification and recognition experiments are carried out on a remote-sensing aircraft image database. Compared with single-feature and CCA-based fusion strategies, the KCCA recognition rate is markedly higher. Theoretical analysis and experimental results confirm that the algorithm is accurate and reliable and can effectively improve the accuracy of image classification and recognition systems.
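A sketch of the correlated-feature fusion; as an assumption, scikit-learn's linear CCA stands in for kernel CCA, and random vectors stand in for the K-means-coded SIFT and Pseudo-Zernike features:

```python
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.svm import SVC

rng = np.random.default_rng(11)
sift = rng.normal(size=(120, 100))            # K-means-coded local features
zern = rng.normal(size=(120, 36))             # global Pseudo-Zernike moments
y = rng.integers(0, 5, 120)                   # aircraft classes (toy)

cca = CCA(n_components=10).fit(sift, zern)
u, v = cca.transform(sift, zern)              # correlated projections
clf = SVC().fit(np.hstack([u, v]), y)         # fused feature -> SVM
```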

18.
Most fingerprint recognition systems are based on the use of a minutiae set, an unordered collection of minutiae locations and orientations suffering from various deformations such as translation, rotation, and scaling. The spectral minutiae representation introduced in this paper is a novel method to represent a minutiae set as a fixed-length feature vector, which is invariant to translation, and in which rotation and scaling become translations, so that they can be easily compensated for. These characteristics enable the combination of fingerprint recognition systems with template protection schemes that require a fixed-length feature vector. This paper introduces algorithms for two representation methods: the location-based spectral minutiae representation and the orientation-based spectral minutiae representation. Both are evaluated using two correlation-based spectral minutiae matching algorithms. We present the performance of our algorithms on three fingerprint databases. We also show how the performance can be improved by using a fusion scheme and singular points.
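A toy illustration of the core property the spectral representation exploits: the 2-D Fourier magnitude of a minutiae map is translation invariant (the log-polar step that turns rotation and scaling into translations is not implemented here):

```python
import numpy as np

img = np.zeros((64, 64))
minutiae = [(10, 20), (30, 40), (50, 15)]     # toy minutiae locations
for x, y in minutiae:
    img[x, y] = 1.0
shifted = np.roll(np.roll(img, 5, axis=0), 3, axis=1)  # translated minutiae set

mag = np.abs(np.fft.fft2(img))
mag_shifted = np.abs(np.fft.fft2(shifted))
print(np.allclose(mag, mag_shifted))          # True: fixed-length, shift-invariant
```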

19.
In this paper, a new approach to multimodal finger biometrics based on the fusion of finger vein and finger geometry recognition is presented. In the proposed method, Band-Limited Phase-Only Correlation (BLPOC) is utilized to measure the similarity of finger vein images. Unlike previous methods, BLPOC is resilient to noise, occlusions, and rescaling factors, and can thus enhance the performance of finger vein recognition. For finger geometry recognition, a new type of geometrical feature called Width-Centroid Contour Distance (WCCD) is proposed, which combines the finger width with the Centroid Contour Distance (CCD). Compared with a single type of feature, the fusion of width and CCD improves the accuracy of finger geometry recognition. Finally, we integrate finger vein and finger geometry recognition by a score-level fusion method based on the weighted SUM rule. Experimental evaluation using our own database, collected from 123 volunteers, showed efficient recognition performance with an equal error rate (EER) of 1.78% and a total processing time of 24.22 ms.
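A sketch of band-limited phase-only correlation for image matching; the band-limit fraction below is an assumed parameter:

```python
# BLPOC: correlate only the phase of the two spectra, restricted to the
# reliable low-frequency band; the peak height is the similarity score.
import numpy as np

def blpoc(f, g, band=0.5):
    F, G = np.fft.fft2(f), np.fft.fft2(g)
    R = F * np.conj(G)
    R /= np.abs(R) + 1e-12                    # keep phase only
    R = np.fft.fftshift(R)
    h, w = R.shape
    bh, bw = int(h * band / 2), int(w * band / 2)
    Rb = R[h//2-bh:h//2+bh, w//2-bw:w//2+bw]  # keep the central (low) band
    peak = np.abs(np.fft.ifft2(np.fft.ifftshift(Rb)))
    return peak.max()                         # similarity score

rng = np.random.default_rng(12)
a = rng.random((64, 64))
print(blpoc(a, np.roll(a, 3, axis=1)) > blpoc(a, rng.random((64, 64))))  # True
```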

20.
Ear-face feature fusion for identity verification   (total citations: 1; self-citations: 0; others: 1)
To address the poor robustness of ear-only recognition to pose variation, and given that the face is similar and complementary to the ear in both image properties and physiological location, a multimodal feature-fusion method is used to improve the recognition rate under pose variation. Unlike traditional independent component analysis, which first obtains independent basis vectors (ICA1), a method that uses ICA to directly obtain independent discriminative features (ICA2) is proposed. On the USTB image database, the two kinds of ICA features are fused in both single-modal and multimodal settings. Experiments show that fusing the two features improves the single-modality recognition rate, and that multimodal recognition outperforms ear-only or face-only recognition.
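A sketch of the ICA2 idea, using scikit-learn's FastICA to extract statistically independent features per image (as opposed to independent basis images, ICA1); the data and dimensions are stand-ins, and the exact data orientation used by the authors is an assumption:

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(13)
ears = rng.normal(size=(60, 400))             # 60 ear images, 400 pixels each
faces = rng.normal(size=(60, 400))            # 60 face images of the same subjects

f_ear = FastICA(n_components=20, random_state=0).fit_transform(ears)
f_face = FastICA(n_components=20, random_state=0).fit_transform(faces)
fused = np.hstack([f_ear, f_face])            # multimodal feature fusion
```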
