1.
Multimodal representation learning has gained increasing importance in various real-world multimedia applications. Most previous approaches focused on exploring inter-modal correlation by learning a common or intermediate space in a conventional way, e.g. Canonical Correlation Analysis (CCA). These works neglected the fusion of multiple modalities at a higher semantic level. In this paper, inspired by the success of deep networks in multimedia computing, we propose a novel unified deep neural framework for multimodal representation learning. To capture the high-level semantic correlations across modalities, we adopt deep learning features as the image representation and topic features as the text representation. For joint model learning, a 5-layer neural network is designed, with supervised pre-training enforced in the first 3 layers for intra-modal regularization. Extensive experiments on the benchmark Wikipedia and MIR Flickr 25K datasets show that our approach achieves state-of-the-art results compared with both shallow and deep models in multimodal and cross-modal retrieval.
2.
Multimedia Tools and Applications - Nowadays, digital protection has gained greater prominence in daily digital activities. It is vital for people to keep new passwords in their minds and...
3.
Sharma D. Gupta N. Chattopadhyay C. Mehta S. 《International Journal on Document Analysis and Recognition》2019,22(4):417-429
International Journal on Document Analysis and Recognition (IJDAR) - In the recent past, there has been a steep increase in the use of online platforms to search for desired products. Real estate...
4.
5.
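To make full use of the latent information in face images, this paper proposes the Multi-Scale Convolutional Auto-Encoder (MSCAE), a method that obtains multi-scale image features by using convolution kernels of different sizes. The features extracted at different scales reflect the essential information of a face and allow the face image to be reconstructed more faithfully. The feature extraction framework is a hierarchical structure of alternating convolution and sampling layers, which makes the features highly invariant to rotation, translation, and scaling. MSCAE is trained in encoder-decoder mode to obtain a feature extractor, which is then used to extract features that are fused into a feature vector for classification. Classification results with a BP neural network on the ORL and Yale face databases show that multi-scale features outperform single-scale features in both recognition rate and performance. In addition, fusing MSCAE features with HOG (Histograms of Oriented Gradients) features yields a higher recognition rate than either feature alone.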
6.
7.
Gao Guangwei Wang Yannan Huang Pu Chang Heyou Lu Huimin Yue Dong 《Multimedia Tools and Applications》2020,79(21-22):14903-14917
Multimedia Tools and Applications - Matching sketch facial images to mug-shot images has crucial significance in law enforcement and digital entertainment. Conventional methods always assume that...
8.
Linear discriminant analysis (LDA) is one of the most popular supervised feature extraction techniques used in machine learning and pattern classification. However, LDA captures only the global geometrical structure of the data and ignores the geometrical structure of local data points. Though many articles have been published to address this issue, most of them are incomplete in the sense that only part of the local information is used. We show here that there are three kinds of local information in total, namely, local similarity information, local intra-class pattern variation, and local inter-class pattern variation. We first propose a new method, the enhanced within-class LDA (EWLDA) algorithm, to incorporate the local similarity information, and then propose a complete framework, the complete global–local LDA (CGLDA) algorithm, to incorporate all three kinds of local information. Experimental results on two image databases demonstrate the effectiveness of our algorithms.
9.
Palani Balasubramanian Elango Sivasankar Viswanathan K Vignesh 《Multimedia Tools and Applications》2022,81(4):5587-5620
Multimedia Tools and Applications - The progressive growth of today’s digital world has made news spread exponentially faster on social media platforms like Twitter, Facebook, and Weibo....
10.
Ajmal Mian 《Pattern recognition》2011,44(5):1068-1075
This paper presents an online learning approach to video-based face recognition that does not make any assumptions about the pose, expressions or prior localization of facial landmarks. Learning is performed online while the subject is imaged and gives near real-time feedback on the learning status. Face images are automatically clustered based on the similarity of their local features. The learning process continues until the clusters have a required minimum number of faces and the distance of the farthest face from its cluster mean is below a threshold. A voting algorithm is employed to pick the representative features of each cluster. Local features are extracted from arbitrary keypoints on faces, as opposed to pre-defined landmarks, and the algorithm is inherently robust to large-scale pose variations and occlusions. During recognition, video frames of a probe are sequentially matched to the clusters of all individuals in the gallery, and its identity is decided on the basis of the best temporally cohesive cluster matches. Online experiments (using live video) were performed on a database of 50 enrolled subjects and another 22 unseen impostors. The proposed algorithm achieved a recognition rate of 97.8% and a verification rate of 100% at a false accept rate of 0.0014. For comparison, experiments were also performed using the Honda/UCSD database, and a 99.5% recognition rate was achieved.
11.
Multimedia Tools and Applications - Age variation is a major problem in face recognition under uncontrolled conditions such as pose, illumination, and expression. Most of the works of this...
12.
A novel cascade face recognition system using hybrid feature extraction is proposed. Three sets of face features are extracted. The merits of the Two-Dimensional Complex Wavelet Transform (2D-CWT) are analyzed. For face recognition feature extraction, it is shown that 2D-CWT compares favorably with the traditionally used 2D Gabor transform in terms of computational complexity and feature stability. The proposed recognition system combines three Artificial Neural Network (ANN) classifiers and a gating network trained on the three feature sets. A computationally efficient fitness function for the genetic algorithm is proposed to evolve the best weights of the ensemble classifier. Experiments demonstrate that the overall recognition rate and reliability are significantly improved in both still face recognition and video-based face recognition.
13.
Pu Ying-Hung Chiu Po-Sheng Tsai Yu-Shiuan Liu Meng-Tsung Hsieh Yi-Zeng Lin Shih-Syun 《The Journal of supercomputing》2022,78(4):5285-5305
The Journal of Supercomputing - Because of the rise of deep learning and neural networks, algorithms based on deep learning have also been developed and subtly applied in daily life. This paper...
14.
Multimedia Tools and Applications - In this paper, a novel 3D face reconstruction technique is proposed along with a sequential deep learning-based framework for face recognition. It uses the...
15.
Santu Rana Wanquan Liu Svetha Venkatesh 《Pattern recognition》2009,42(11):2850-2862
In this paper we propose a new optimization framework that unites some of the existing tensor-based methods for face recognition on a common mathematical basis. Tensor-based approaches rely on the ability to decompose an image into its constituent factors (i.e. person, lighting, viewpoint, etc.) and then utilize these factor spaces for recognition. We first develop a multilinear optimization problem relating an image to its constituent factors and then develop our framework by formulating a set of strategies that can be followed to solve this optimization problem. The novelty of our research is that the proposed framework offers an effective methodology for explicit non-empirical comparison of the different tensor methods and provides a way to determine the applicability of these methods to different recognition scenarios. Importantly, the framework allows comparative analysis on the basis of the quality of the solutions offered by these methods. Our theoretical contribution has been validated by extensive experimental results on four benchmark datasets, which we present along with a detailed discussion.
16.
A unified framework for subspace face recognition Total citations: 2, self-citations: 0, citations by others: 2
PCA, LDA, and Bayesian analysis are the three most representative subspace face recognition approaches. In this paper, we show that they can be unified under the same framework. We first model face difference with three components: intrinsic difference, transformation difference, and noise. A unified framework is then constructed by using this face difference model and a detailed subspace analysis on the three components. We explain the inherent relationship among different subspace methods and their unique contributions to the extraction of discriminating information from the face difference. Based on the framework, a unified subspace analysis method is developed using PCA, Bayes, and LDA as three steps. A 3D parameter space is constructed using the three subspace dimensions as axes. Searching through this parameter space, we achieve better recognition performance than standard subspace methods.
17.
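Two-dimensional principal component analysis (2DPCA) extracts global features of the face, yet local features play a very important role in face recognition. To address this, an adaptively weighted 2DPCA based on local features is proposed. The algorithm first divides the face image into three independent sub-blocks (upper, middle, and lower) according to local features and applies 2DPCA to each sub-block. It then adaptively computes the expected contribution of each sub-block to recognition and uses that value as the sub-block's weight in the classifier to improve the recognition rate. Experimental results demonstrate the effectiveness and feasibility of the algorithm.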
18.
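Feature extraction has long been a focus of face recognition research. The Local Feature Analysis (LFA) algorithm obtains not only global facial features but also local feature information; however, its output contains too much redundant, correlated information, which hinders recognition. Independent Component Analysis (ICA) can effectively extract the higher-order statistics of a signal and thus removes the correlation among components well. A feature extraction method that combines the two is presented, and experiments show that it extracts facial features effectively.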
19.
In this paper, a novel learning methodology for face recognition, the LearnIng From Testing data (LIFT) framework, is proposed. Because many face recognition problems are characterized by inadequate training examples and the availability of vast numbers of testing examples, we aim to exploit useful information from the testing data to facilitate learning. The one-against-all technique is integrated into the learning system to recover the labels of the testing data, and the training population is then expanded with the recovered data. Neural networks and support vector machines are used as the base learning models. Furthermore, we integrate two other transductive methods, the consistency method and the LRGA method, into the LIFT framework. Experimental results and various hypothesis tests on five popular face benchmarks illustrate the effectiveness of the proposed framework.
20.
Multimodal identity recognition by fusing ear and profile face Total citations: 1, self-citations: 0, citations by others: 1
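First, classifiers based on full-space linear discriminant analysis (FSLDA) are built separately for the ear and the profile face. Then the product, sum, and median multi-classifier fusion rules common in Bayesian decision theory are applied, and the voting algorithm is improved. Experimental results show that, compared with recognition using ear or profile-face features alone, multimodal recognition fusing the ear and profile face achieves a higher recognition rate and extends the recognition range.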
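
The product, sum, and median fusion rules named in the last abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `fuse`, the three-class setup, and the score values are hypothetical, and neither the FSLDA classifiers nor the improved voting algorithm (which the abstract does not specify) is reproduced. Each classifier is assumed to output one posterior-like score per class.

```python
from math import prod
from statistics import median

# Fixed combination rules: each maps the scores that all classifiers
# assigned to one class onto a single fused score.
COMBINERS = {"product": prod, "sum": sum, "median": median}


def fuse(posteriors, rule="sum"):
    """Fuse per-classifier class scores and return (best_class, fused_scores).

    posteriors: one list of class scores per classifier, all the same length.
    rule: "product", "sum", or "median".
    """
    combine = COMBINERS[rule]
    # zip(*posteriors) groups the scores given to the same class by every classifier.
    fused = [combine(scores) for scores in zip(*posteriors)]
    best_class = max(range(len(fused)), key=fused.__getitem__)
    return best_class, fused


# Hypothetical scores for three enrolled identities (illustrative values only).
ear = [0.5, 0.4, 0.1]      # the ear classifier alone favours identity 0
profile = [0.2, 0.7, 0.1]  # the profile-face classifier favours identity 1

for rule in COMBINERS:
    label, _ = fuse([ear, profile], rule)
    print(rule, "->", label)
```

With only two classifiers, the median rule reduces to the average of the two scores, so it ranks classes the same way as the sum rule; the rules differ once three or more classifiers are fused.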