Found 20 similar articles (search time: 11 ms)
Multi-view learning methods attain outstanding performance in different fields compared with single-view strategies. In this paper, the Gaussian Process Latent Variable Model (GPLVM), a generative and non-parametric model, is exploited to represent multiple views in a common subspace. Specifically, a latent variable shared across the views is assumed to be transformed into the observations through distinct Gaussian Process projections. However, this assumption is purely generative, which makes it intractable to estimate the fused variable directly at test time. To tackle this problem, another projection from the observed data to the shared variable is learned simultaneously, using view-shared and view-specific kernel parameters under the Gaussian Process structure. Furthermore, for the classification task, label information is also modeled as being generated from the latent variable through a Gaussian Process transformation. Extensive experimental results on multi-view datasets demonstrate the superiority and effectiveness of our model in comparison with state-of-the-art algorithms.

Multi-view learning exploits structural constraints among multiple views to learn effectively from data. Although it has made great methodological progress in recent years, current generalization theory is still insufficient to prove the merit of multi-view learning. This paper blends stability into multi-view PAC-Bayes analysis to explore the generalization performance and effectiveness of multi-view learning algorithms. We propose a novel view-consistency regularization to produce an informative prior that helps to obtain a stability-based multi-view bound. Furthermore, for the purpose of computation, we derive an upper bound on the stability coefficient involved in the PAC-Bayes bound of multi-view regularization algorithms, taking the multi-view support vector machine as an example. Experiments provide strong evidence for the advantageous generalization bounds of multi-view learning over single-view learning. We also experimentally explore the strengths and weaknesses of the proposed stability-based bound compared with previous non-stability multi-view bounds.

A major challenge for appearance-based learning techniques is robustness against data corruption and irrelevant within-class data variation. This paper presents a robust kernel that lets kernel-based approaches achieve better robustness on several visual learning problems. By incorporating a robust error function from robust statistics together with a deformation-invariant distance measure, the proposed kernel is shown to be insensitive to noise and robust to intra-class variations. We prove that this robust kernel satisfies the requirements for a valid kernel, so it has good properties when used with kernel-based learning machines. In the experiments, we validate the superior robustness of the proposed kernel over state-of-the-art algorithms on several applications, including handwritten digit classification, face recognition and data visualization.

Multi-view learning for classification has achieved remarkable performance compared with single-view methods. Inspired by instance-based learning, which directly regards the instance as the prior and preserves the valuable information in different instances, a Multi-view Instance Attention Fusion Network (MvIAFN) is proposed to efficiently exploit the correlation across both instances and views. Specifically, a small number of instances from different views are first sampled as the set of templates. Given an additional instance, it can be re-represented through an attention strategy based on the similarities between it and the selected templates. Thanks to this strategy, the given instance preserves additional information from the selected instances, which achieves the purpose of extracting the instance correlation. Additionally, for each sample, we not only perform instance attention within each single view but also compute attention across multiple views, allowing us to fuse them into a fused attention for each view. Experimental results on datasets substantiate the effectiveness of the proposed method compared with state-of-the-art methods.

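Multi-view video coding with an improved prediction structure   Cited by: 1 (self-citations: 0, citations by others: 1)
Multi-view video coding is one of the active research topics in video coding. To address the high coding complexity and poor random-access performance of the hierarchical B-frame prediction structure used in Joint Multi-view Video Coding (JMVC), an improved prediction structure is proposed. In the proposed structure, frames in B views that use the previous frame as their temporal reference are coded with temporal prediction only, and no inter-view prediction is performed for the non-key frames of any P view, which effectively reduces computational complexity and improves random-access performance. The position of the I view is chosen appropriately to reduce the coding-efficiency loss caused by simplifying the structure. Experimental results show that, compared with the hierarchical B-frame prediction structure, the proposed structure significantly reduces the average encoding time with only a small loss in coding efficiency. The improved prediction structure also offers better random-access performance.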

Video-based, real-time multi-view stereo   Cited by: 1 (self-citations: 0, citations by others: 1)
We investigate the problem of obtaining a dense reconstruction in real time from a live video stream. In recent years, multi-view stereo (MVS) has received considerable attention and a number of methods have been proposed. However, most methods operate under the assumption of a relatively sparse set of still images as input and unlimited computation time. Video-based MVS has received less attention despite the fact that video sequences offer significant benefits in terms of usability of MVS systems. In this paper we propose a novel video-based MVS algorithm that is suitable for real-time, interactive 3D modeling with a hand-held camera. The key idea is a per-pixel, probabilistic depth estimation scheme that updates posterior depth distributions with every new frame. The current implementation is capable of updating 15 million distributions/s. We evaluate the proposed method against the state-of-the-art real-time MVS method and show improvement in terms of accuracy.

Zhu Xiaobin, Li Zhuangzi, Zhang Xiao-Yu, Li Peng, Xue Ziyu, Wang Lei. Multimedia Tools and Applications, 2019, 78(20): 29271-29290
Multimedia Tools and Applications - Convolutional Neural Networks (CNNs) have been established as a powerful class of models for image classification and related tasks. However, the fully-connected...

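To address the difficulty of obtaining labeled samples in hyperspectral remote sensing images, this work studies a multi-view active learning algorithm that selects a small number of high-quality query samples for interactive labeling. First, a bank of three-dimensional (3D) Gabor filters at different scales and orientations is used to extract spatial-spectral features from the hyperspectral image. Next, the 3D Gabor features with strong class-discriminative ability are selected to construct multiple views. Finally, a sample query strategy based on the minimum multi-view posterior probability difference (MPPD) is proposed. Starting from 30 initially labeled samples, after 100 iterations the 3D Gabor multi-view features combined with the MPPD query strategy reach overall classification accuracies of 94.16% and 91.30% on the ROSIS Pavia University and AVIRIS Indiana Pines datasets, respectively. This indicates that 3D Gabor filtering effectively extracts spatial-spectral features from hyperspectral remote sensing images and provides diverse, complementary feature views, and that the MPPD query strategy selects the most valuable query samples.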

Many applications in computer vision and computer graphics require dense correspondences between images of multi-view video streams. Most state-of-the-art algorithms estimate correspondences by considering pairs of images. However, in multi-view videos, several images capture nearly the same scene. In this article we show that this redundancy can be exploited to estimate more robust and consistent correspondence fields. We use the multi-video data structure to establish a confidence measure based on the consistency of the correspondences in a loop of three images. This confidence measure can be applied after flow estimation is terminated to find the pixels for which the estimate is reliable. However, including the measure directly into the estimation process yields dense and highly accurate correspondence fields. Additionally, application of the loop consistency confidence measure allows us to include sparse feature matches directly into the dense optical flow estimation. With the confidence measure, spurious matches can be successfully suppressed during optical flow estimation while correct matches contribute to increase the accuracy of the flow.
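The loop-consistency idea in the abstract above can be illustrated with a minimal sketch. It assumes three dense flow fields (A to B, B to C, C to A) stored as nested lists of `(dx, dy)` tuples, nearest-neighbour lookup, and a Gaussian mapping from loop error to confidence; the function names, the `sigma` parameter and the Gaussian form are illustrative assumptions, not the measure defined in the article.

```python
import math

def lookup(flow, x, y):
    """Nearest-neighbour lookup of a (dx, dy) flow vector; None if outside the image."""
    height, width = len(flow), len(flow[0])
    xi, yi = int(round(x)), int(round(y))
    if 0 <= xi < width and 0 <= yi < height:
        return flow[yi][xi]
    return None

def loop_error(flow_ab, flow_bc, flow_ca, x, y):
    """Follow pixel (x, y) through A -> B -> C -> A and return its distance from the start."""
    px, py = float(x), float(y)
    for flow in (flow_ab, flow_bc, flow_ca):
        vec = lookup(flow, px, py)
        if vec is None:
            return math.inf  # the loop left the image, so it cannot be closed
        px, py = px + vec[0], py + vec[1]
    return math.hypot(px - x, py - y)

def loop_confidence(flow_ab, flow_bc, flow_ca, x, y, sigma=1.0):
    """Hypothetical confidence in [0, 1]: 1 for a perfectly closed loop, 0 for a broken one."""
    err = loop_error(flow_ab, flow_bc, flow_ca, x, y)
    return math.exp(-(err / sigma) ** 2)
```

A pixel whose three correspondences return it exactly to its starting position gets confidence 1; a pixel whose loop leaves the image gets confidence 0.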

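With advances in Earth observation technology, the accumulation of massive geoscientific spatio-temporal field data places new demands on the modeling, retrieval, and analysis of such data. An organization method for multidimensional spatio-temporal field data is built on tensor structures; a data storage structure based on the space-time cube model is established, and the corresponding data operations and data interfaces are defined. A hierarchical indexing mechanism for spatio-temporal field data and an analysis method for geoscientific spatio-temporal field data based on tensor operators are then designed. System validation with satellite altimetry data shows that the model effectively supports the representation, retrieval, and analysis of multidimensional spatio-temporal field data, and is a useful exploration of the analysis and modeling of high-dimensional spatio-temporal field data.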

Due to noise and the limited number of training samples, the within-set and between-set sample covariance matrices in canonical correlation analysis (CCA) usually deviate from the true ones. In this paper, we re-estimate the within-set and between-set covariance matrices to reduce the negative effect of this deviation. Specifically, we use the idea of fractional order to correct the eigenvalues and singular values in the corresponding sample covariance matrices, and then construct fractional-order within-set and between-set scatter matrices, which markedly alleviate the deviation problem. On this basis, a new approach called fractional-order embedding canonical correlation analysis (FECCA) is proposed to reduce the dimensionality of multi-view data for classification tasks. The proposed method is evaluated on various handwritten numeral, face and object recognition problems. Extensive experimental results on the CENPARMI, UCI, AT&T, AR, and COIL-20 databases show that FECCA is very effective and clearly outperforms existing joint dimensionality reduction and feature extraction methods in terms of classification accuracy. Moreover, its improvements in recognition rate are statistically significant in most cases at the 0.05 significance level.
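One plausible reading of the fractional-order correction described above can be written out as follows. The symbols ($S_{xx}$, $S_{xy}$, $Q$, $U$, $V$, $\alpha$, $\beta$) and the restriction of the exponents to $[0, 1]$ are assumptions made for illustration; the abstract itself only states that eigenvalues and singular values are corrected with a fractional order.

```latex
\begin{aligned}
S_{xx} &= Q \,\operatorname{diag}(\lambda_1, \dots, \lambda_p)\, Q^{\top}
  &&\Longrightarrow&
S_{xx}^{\alpha} &= Q \,\operatorname{diag}(\lambda_1^{\alpha}, \dots, \lambda_p^{\alpha})\, Q^{\top}, \\
S_{xy} &= U \,\operatorname{diag}(\sigma_1, \dots, \sigma_r)\, V^{\top}
  &&\Longrightarrow&
S_{xy}^{\beta} &= U \,\operatorname{diag}(\sigma_1^{\beta}, \dots, \sigma_r^{\beta})\, V^{\top},
\qquad 0 \le \alpha, \beta \le 1 .
\end{aligned}
```

Under this reading, $\alpha = \beta = 1$ recovers the ordinary sample matrices, while exponents below 1 compress the spread of the spectrum, pulling large eigenvalues and singular values toward 1.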

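To address the problems that multimodal fusion performs poorly and fails to fully mine the key multi-view emotional information within specific time periods, a multi-view temporal multimodal sentiment classification model is proposed to extract key emotional information from multiple views within specific time periods. First, data from two text views, the title and the body text, are given low-dimensional word embeddings and sequence representations to extract multimodal temporal features from the different views, and features are extracted from two image views, cropping and horizontal mirroring. Second, a recurrent neural network is used to build temporal sequence interaction features of the multimodal data, increasing mutual information. Finally, joint training based on contrastive learning completes the sentiment classification. The model is evaluated on two multimodal sentiment classification benchmark datasets, Yelp and Multi-ZOL, achieving accuracies of 73.92% and 69.15%, respectively. Comprehensive experiments show that multi-view multimodal sentence sequences over specific time periods improve model performance.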

We present a variational segmentation method which exploits color, edge and spatial information between an arbitrary number of views. In contrast to purely image-based information like color and gradient, spatial consistency is a new cue for segmentation, which originates from the field of 3D reconstruction. We show that this cue can be easily integrated in a variational formulation and allows pixel-accurate segmentation, even for objects which are hard to segment. The use of inherently parallel algorithms and the implementation on modern GPUs allows us to apply this method to semi-supervised and completely automatic settings. On publicly available datasets we show that our method is faster and more accurate than the state of the art. Successful applications within a catadioptric measurement system and in multi-view background subtraction show its practical relevance.

Reflected solar radiances measured by the pushbroom cameras of the Multiangle Imaging SpectroRadiometer (MISR) on the Terra satellite at nine viewing angles are combined to give eight stereo pairs. These are analyzed with stereo-photogrammetric methods to measure the geometry of a convective cloud system. Both cloud-top heights and cloud sides are retrieved with a precision of about 200-300 m. Two case studies of deep, convective clouds over ocean are considered. The accuracy of the MISR retrieval is tested in the first case study by reference to coincident, higher resolution stereo data from ASTER, showing how the accuracy of the cloud-top height retrieval is improved using the oblique MISR views. In the second case study, the entire cross-section of the cloud aligned with the viewing azimuthal direction is measured, using all nine cameras. The methodology presented is an important step towards more routine retrievals of the 3D geometrical reconstruction of isolated, deep-convective clouds. Such reconstructions are a necessary prerequisite to the subsequent 3D radiative transfer modeling used to aid the remote sensing of the elusive microphysical properties of such clouds.

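Mao Jinlian. 《计算机应用》 (Journal of Computer Applications), 2013, 33(7): 1955-1959
To address the lack of data adaptivity in how existing multi-view learning algorithms construct neighborhood graphs, an adaptive multi-view learning (AMVL) algorithm is proposed. The algorithm first exploits the automatic sample-selection property of the L1 norm to build a directed L1 graph for each view. Then, based on the resulting L1 graphs, it minimizes the low-dimensional reconstruction error in each view. Finally, it performs multi-view global coordinate alignment across the views to obtain the objective function of the adaptive multi-view learning algorithm. An iterative optimization method is also proposed to solve this objective function. The algorithm is applied to image classification and compared with existing algorithms on two public image datasets, Corel5K and NUS-WIDE-OBJECT. Experimental results show that the proposed method improves classification accuracy by up to 5% and 2% on the two datasets, respectively; that the optimization algorithm converges within 100 iterations; and that the number of neighbors obtained by the algorithm adapts to the data.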

Since the multiple-kernel representation made it possible to represent several features of the target in the same tracking model, tracking multiple features with kernel-based methods has received a great deal of attention. In spite of these efforts, the formulation has been limited to tracking planar targets or targets rotating inside a plane parallel to the image plane. The aim of this paper is to extend multi-kernel tracking to cope with other situations. To this end, we consider the triangular mesh described by the centers of the kernels and we develop the estimation of a set of affine transforms, one for each mesh triangle, subject to the constraint that the affine transform of each triangle must be compatible with the affine transforms of contiguous triangles. The method is applied to sequences including face and car tracking. Results show improved performance over previous kernel tracking methods, which generally work with too restricted a set of movements.

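A mixed kernel function for support vector machines   Cited by: 2 (self-citations: 0, citations by others: 2)
The kernel function is the core of the support vector machine (SVM), and different kernel functions produce different classification results. Because each common kernel function has its own strengths and weaknesses, and in order to obtain a kernel with strong learning and generalization ability, a new kernel, the mixed kernel function, is constructed as a linear combination of a local kernel and a global kernel, based on the basic property that the sum of two kernel functions is still a kernel function. This kernel combines the advantages of local and global kernels. The mixed kernel is used in a supply-chain forecasting experiment for a process enterprise, and the simulation results verify its effectiveness and correctness.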
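The construction described above, a linear combination of a local and a global kernel, can be sketched as follows. The abstract does not say which kernels are combined; the Gaussian RBF (as the local kernel), the polynomial kernel (as the global kernel), the convex weighting and all parameter names here are assumptions chosen for illustration.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """Local kernel (assumed choice): Gaussian RBF, exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def poly_kernel(x, y, degree=2, coef0=1.0):
    """Global kernel (assumed choice): polynomial, (<x, y> + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree

def mixed_kernel(x, y, weight=0.5, gamma=0.5, degree=2, coef0=1.0):
    """Convex combination: weight * local + (1 - weight) * global.

    Non-negative weights keep the result a valid kernel, because sums and
    non-negative scalings of kernels are again kernels.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    local_part = rbf_kernel(x, y, gamma)
    global_part = poly_kernel(x, y, degree, coef0)
    return weight * local_part + (1.0 - weight) * global_part
```

Setting `weight` to 1 or 0 recovers the pure local or pure global kernel, so the single parameter trades off fitting ability near the training points against behaviour far from them.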

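To address the excessive computational complexity of variable-block-size mode selection in multi-view video coding (MVC), a fast mode selection algorithm based on mode complexity is proposed. The algorithm first analyzes the distribution of block sizes in the multi-view video coding reference model (JMVC). It then introduces the concept of mode complexity to determine the mode characteristics of the current macroblock. Finally, macroblocks are divided into three mode types: if the current macroblock belongs to the simple mode, only the 16×16 partition is checked and all other partitions are skipped; if it belongs to the medium mode, 8×8 blocks are skipped; if it belongs to the complex mode, all mode partitions are checked. In this way the algorithm terminates unnecessary mode selection early, greatly reducing the amount of computation. Experimental results show that the proposed algorithm keeps almost the same coding efficiency as the full search in JMVC while reducing computational complexity by 62.75%.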
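The three-way early-termination rule described above can be sketched as a small dispatcher. The abstract does not give the mode-complexity formula or its thresholds, so the score, the `low` and `high` thresholds and the partition list below are hypothetical placeholders.

```python
# Candidate inter partitions for one macroblock (assumed list, for illustration).
ALL_MODES = ("16x16", "16x8", "8x16", "8x8")

def classify_macroblock(complexity, low=0.3, high=0.7):
    """Map a mode-complexity score to a class; both thresholds are hypothetical."""
    if complexity < low:
        return "simple"
    if complexity < high:
        return "medium"
    return "complex"

def candidate_modes(mode_class):
    """Return the partitions that still have to be checked for a macroblock class."""
    if mode_class == "simple":
        return ("16x16",)  # only the 16x16 partition is checked
    if mode_class == "medium":
        return tuple(m for m in ALL_MODES if m != "8x8")  # 8x8 blocks are skipped
    if mode_class == "complex":
        return ALL_MODES  # every partition is checked
    raise ValueError(f"unknown mode class: {mode_class!r}")
```

For example, `candidate_modes(classify_macroblock(0.1))` leaves a single partition to evaluate, which is where the saving over a full search comes from.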

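A combined kernel function for support vector machines   Cited by: 11 (self-citations: 0, citations by others: 11)
Zhang Bing, Kong Rui. 《计算机应用》 (Journal of Computer Applications), 2007, 27(1): 44-46
The kernel function is the core of the support vector machine (SVM), and different kernel functions produce different classification results; it is also one of the harder parts of SVM theory to understand. By introducing a kernel function, an SVM can readily implement nonlinear algorithms. The paper first discusses the nature of kernel functions and explains the relationship between a kernel and the space it maps to. It then gives the theorems and methods for constructing kernel functions, explains that kernels fall into two broad classes, local kernels and global kernels, and points out their differences and respective advantages. Finally, a new kernel, the combined kernel function, is proposed and applied to an SVM in face recognition experiments, whose results confirm its effectiveness.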

Mental workload (MWL) classification is a critical problem for the quantitative assessment and analysis of operator functional state in many safety-critical situations with indispensable human–machine cooperation. MWL can be measured by psychophysiological signals. In this work, we propose a novel restricted Boltzmann machine (RBM) architecture for MWL classification. In relation to this architecture, we examine two main issues: the optimal structure of the RBM and the selection of the most important EEG channels (electrodes) for MWL classification. Trial-and-error and entropy-based pruning methods are compared for RBM structure identification. The degree of importance of the EEG channels is calculated from the weights in a well-trained network in order to select the most relevant channels for the classification task. Extensive comparative results show that the selected EEG channels lead to accurate MWL classification across subjects.
