Learning Robust Similarity Measures for 3D Partial Shape Retrieval   (cited by: 1; self-citations: 0; citations by others: 1)
In this paper, we propose a novel approach to learning robust ground distance functions for the Earth Mover's distance, making it appropriate for quantifying the partial similarity between two feature sets. First, we define the ground distance as a monotonic transformation of commonly used feature-to-feature base distance (or similarity) measures, so that when computing the Earth Mover's distance the algorithm can focus on the feature pairs that are correctly matched while being less affected by irrelevant ones. As a result, the proposed method is especially suited to 3D partial shape retrieval, where occlusion and clutter are serious problems. We prove that when the transformation satisfies certain conditions, the metric property of the base distance is sufficient to guarantee that the ground distance is a metric (and so is the Earth Mover's distance), which makes fast shape retrieval on large databases technically possible. Second, we propose a discriminative learning framework, based on the real AdaBoost algorithm, to optimize the transformation function. The optimization is performed in the space of piecewise constant approximations of the transformation, without making any parametric assumption. Finally, extensive experiments on 3D partial shape retrieval demonstrate the effectiveness of the proposed techniques.
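
The abstract does not state which conditions the transformation must satisfy, so the following is only an illustrative sketch under an assumption: a non-decreasing, subadditive transformation with f(0) = 0 (here a saturating `min(d, tau)`) keeps the triangle inequality of the base distance, whereas a convex one such as squaring can break it. The function names, the 1D features, and the threshold `tau` are all hypothetical and are not taken from the paper.

```python
from itertools import permutations

def base_distance(a, b):
    """Base feature-to-feature distance; absolute difference on 1D features (assumed)."""
    return abs(a - b)

def saturate(d, tau=3.0):
    """Saturating transform: non-decreasing, subadditive, and zero at zero."""
    return min(d, tau)

def square(d):
    """Convex transform, used here as a counter-example."""
    return d * d

def satisfies_triangle_inequality(points, transform):
    """Check f(d(x, z)) <= f(d(x, y)) + f(d(y, z)) for every ordered triple."""
    for x, y, z in permutations(points, 3):
        direct = transform(base_distance(x, z))
        detour = transform(base_distance(x, y)) + transform(base_distance(y, z))
        if direct > detour + 1e-12:
            return False
    return True

features = [0.0, 1.0, 2.0, 5.0]
print(satisfies_triangle_inequality(features, saturate))  # True
print(satisfies_triangle_inequality(features, square))    # False
```

Saturation also matches the stated goal informally: distances between badly mismatched features are capped, so irrelevant pairs contribute a bounded cost.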

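In markerless human motion tracking, the tracked target lacks distinctive features and the background is complex, so the tracked pose deviates substantially from the ground truth and long video sequences cannot be tracked. To address this, a 3D human motion tracking algorithm for monocular video based on deformable appearance template matching is proposed, in which the human appearance model consists of a 3D human skeleton model and a 2D cardboard model. First, the Euler rotation angles of the joints are computed by inverse kinematics under skeleton proportion constraints. Forward kinematics then gives the 3D coordinates of the pixels in the cardboard model, and these pixels are projected into the 2D image according to the camera imaging model to obtain the deformed appearance template. Finally, histogram matching yields the tracking result. Experiments show that the algorithm achieves fairly good tracking results for some complex long-sequence human motions and can be applied to human-computer interaction, animation production, and similar fields.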

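A video temporal synchronization algorithm based on reconstructing the 3D trajectories of moving targets is proposed. The video sequences to be synchronized are captured simultaneously by different cameras in the same scene, with no restrictive constraints on the scene or the camera motion. Assuming the camera projection matrix of every frame is known, the 3D trajectories of the moving targets are first reconstructed on a discrete cosine transform basis. A rank constraint on the trajectory-basis coefficient matrix is then proposed to measure the spatio-temporal alignment between sub-segments of different sequences. Finally, a cost matrix is built and a graph-based method achieves nonlinear temporal synchronization between the videos. The method does not rely on known point correspondences; tracked points in different videos may even correspond to different 3D points, provided they satisfy the following assumption: the spatial position of the 3D point corresponding to a tracked point in the observed sequence can be described as a linear combination of a subset of the 3D points corresponding to all tracked points in the reference sequence, and this linear relation remains unchanged. Unlike most existing methods, which require feature points to be tracked through the whole image sequence, the method can use image point trajectories of differing lengths. Its robustness and performance are verified on simulated data and on real datasets.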

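Similarity Search over Multi-Dimensional Time Series Based on a Shape-Feature k-d Tree   (cited by: 2; self-citations: 0; citations by others: 2)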
黄河, 史忠植, 郑征. 软件学报 (Journal of Software), 2006, 17(10): 2048-2056
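Multi-dimensional time series are an important class of data objects in information systems, and similarity search is central to their use. A common way to compare two sequences (or subsequences) is to map them to curves in space and compute the Euclidean distance between the curves. The main drawback of this approach is that it considers only the overall distance between the sequences and does not reflect their local variation. To address this, a new fast similarity search method for multi-dimensional time series is proposed. It combines the local variation characteristics of the sequences with the index structure (a k-d tree), so that local-variation matching is performed while the k-d tree is searched, which greatly improves query efficiency and accuracy. Experimental results demonstrate the effectiveness of the algorithm.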

In this paper, a new method for deformable 3D shape registration is proposed. The algorithm computes shape transitions based on local similarity transforms, which makes it possible to model not only as-rigid-as-possible deformations but also local and global scale. We formulate an ordinary differential equation (ODE) that describes the transition of a source shape towards a target shape. We assume that both shapes are roughly pre-aligned (e.g., frames of a motion sequence). The ODE consists of two terms. The first causes the deformation by pulling the source shape points towards corresponding points on the target shape. Initial correspondences are estimated by closest-point search and then refined by an efficient smoothing scheme. The second term regularizes the deformation by drawing the points towards locally defined rest positions. These are given by the optimal similarity transform that matches the initial (undeformed) neighborhood of a source point to its current (deformed) neighborhood. The proposed ODE allows for a very efficient explicit numerical integration. This avoids the repeated solution of large linear systems that is usually required when solving the registration problem within general-purpose non-linear optimization frameworks. We experimentally validate the proposed method on a variety of real data and compare it with several state-of-the-art approaches.

郑明明, 林志毅. 计算机工程 (Computer Engineering), 2019, 45(10): 266-271
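Based on the isometry invariance of the biharmonic distance, a similarity measure for 3D shapes is proposed. The definition, formal expression, and discrete computation of the biharmonic distance are given, and singular value decomposition is applied to the biharmonic distance matrix of a shape. The eigenvalues of the biharmonic distance matrix are extracted as the shape descriptor, and the cosine distance between the eigenvalues of a pair of shapes is taken as their similarity. Experiments on the TOSCA2010 database show that, compared with the FMPS and SHED methods, the method achieves a better balance between time cost and shape matching quality.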
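
The comparison step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the spectrum of each biharmonic distance matrix has already been computed and sorted in descending order, and it assumes that descriptors of unequal length are compared on their leading values only. The function name and the sample values are hypothetical.

```python
import math

def cosine_distance(spectrum_a, spectrum_b):
    """Cosine distance between two spectral descriptors.

    Only the leading min(len(a), len(b)) values are compared (an assumption).
    """
    k = min(len(spectrum_a), len(spectrum_b))
    a, b = spectrum_a[:k], spectrum_b[:k]
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Hypothetical descriptors: shape_b is shape_a with every value doubled.
shape_a = [9.0, 4.0, 1.0]
shape_b = [18.0, 8.0, 2.0]
shape_c = [1.0, 4.0, 9.0]

print(cosine_distance(shape_a, shape_b))  # close to 0: same direction
print(cosine_distance(shape_a, shape_c))  # clearly larger
```

One consequence of the cosine form is that a uniform scaling of all spectrum values leaves the distance unchanged, so the comparison depends on the shape of the spectrum rather than its magnitude.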

We describe a novel approach for 3-D ear biometrics using video. A series of frames is extracted from a video clip, and the region of interest in each frame is independently reconstructed in 3-D using shape from shading. The resulting 3-D models are then registered using the iterative closest point algorithm. We iteratively consider each model in the series as a reference model and calculate the similarity between the reference model and every model in the series using a similarity cost function. Cross validation is performed to assess the relative fidelity of each 3-D model. The model that demonstrates the greatest overall similarity is determined to be the most stable 3-D model and is subsequently enrolled in the database. Experiments are conducted using a gallery set of 402 video clips and a probe set of 60 video clips. The results (95.0% rank-1 recognition rate and 3.3% equal error rate) indicate that the proposed approach can produce recognition rates comparable to systems that use 3-D range data. To the best of our knowledge, we are the first to develop a 3-D ear biometric system that obtains a 3-D ear structure from a video sequence.

Similarity Analysis of Video Sequences Using an Artificial Neural Network   (cited by: 1; self-citations: 1; citations by others: 0)
Comparison of video sequences is an important operation in many multimedia information systems. The similarity measure used for comparison is typically based on some measure of correlation with the perceptual similarity (or difference) amongst the video sequences, or with the similarity (or difference) in some measure of semantics associated with the video sequences. In content-based similarity analysis, the video data are expressed in terms of different features. Similarity matching is then performed by quantifying the feature relationships between the target video and query video shots, using either an individual feature or a feature combination. In this study, two approaches are proposed for the similarity analysis of video shots. In the first approach, mosaic images are created from video shots, and the similarity analysis is done by determining the similarities amongst the mosaic images. In the second approach, key frames are extracted for each video shot, and the similarity amongst video shots is determined by comparing their key frames. The features extracted include image histograms, slopes, edges, and wavelets. Both individual features and feature combinations are used in similarity matching using an artificial neural network. The similarity rank of the query video shots is determined from the values of the coefficient of determination and the mean absolute error. The study shows that the mosaic-based similarity analysis can be expected to yield a more reliable result, whereas the key frame-based similarity analysis could potentially be applied to a wider range of applications. The weighted non-linear feature combination is shown to yield better results than a single feature for video similarity analysis. The coefficient of determination is shown to be a better criterion than the mean absolute error in similarity matching analysis.
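
The two ranking criteria named in the abstract have standard textbook definitions, sketched below. The abstract does not say how the paper applies them to network outputs, so the inputs here (a list of target values and a list of predicted values) and the function names are assumptions made only for illustration.

```python
def mean_absolute_error(targets, predictions):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(targets, predictions)) / len(targets)

def coefficient_of_determination(targets, predictions):
    """R^2 = 1 - SS_res / SS_tot, using the mean of the targets as baseline."""
    mean_target = sum(targets) / len(targets)
    ss_res = sum((t - p) ** 2 for t, p in zip(targets, predictions))
    ss_tot = sum((t - mean_target) ** 2 for t in targets)
    return 1.0 - ss_res / ss_tot

targets = [1.0, 2.0, 3.0]
perfect = [1.0, 2.0, 3.0]
constant = [2.0, 2.0, 2.0]  # always predicts the mean of the targets

print(coefficient_of_determination(targets, perfect))   # 1.0
print(coefficient_of_determination(targets, constant))  # 0.0
print(mean_absolute_error(targets, perfect))            # 0.0
```

A higher coefficient of determination and a lower mean absolute error both indicate a closer match, so a ranking built on them must sort the two criteria in opposite directions.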

The recent introduction of 3D shape analysis frameworks able to quantify the deformation of a shape into another in terms of the variation of real functions yields a new interpretation of the 3D shape similarity assessment and opens new perspectives. Indeed, while the classical approaches to similarity mainly quantify it as a numerical score, map-based methods also define (dense) shape correspondences. After presenting in detail the theoretical foundations underlying these approaches, we classify them by looking at their most salient features, including the kind of structure and invariance properties they capture, as well as the distances and the output modalities according to which the similarity between shapes is assessed and returned. We also review the usage of these methods in a number of 3D shape application domains, ranging from matching and retrieval to annotation and segmentation. Finally, the most promising directions for future research developments are discussed.

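To address the high production cost of traditional human animation and the constraints that capture devices place on human motion, a 3D human animation method based on motion tracking in monocular video is proposed. The system framework is presented first. A scaled orthographic projection model and a human skeleton model are then used to recover the 3D coordinates of the joints, and the Euler rotation angles of the joints are computed by inverse kinematics. Finally, the human body is modeled with the H-Anim standard, and the joint Euler angles drive a virtual human to produce 3D human animation. Experiments show that the system can track human motion and reconstruct it in 3D accurately, and that it can be applied to human animation production.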
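
The abstract gives no formulas, so the sketch below only illustrates the standard geometric relation that scaled orthographic projection implies for a bone of known length: if two joints project to image points `p1` and `p2` under scale `s`, their depth difference satisfies dz^2 = L^2 - ((u1 - u2)^2 + (v1 - v2)^2) / s^2. The function name and parameters are hypothetical, and the sign of the depth difference is not determined by this relation alone.

```python
import math

def relative_depth(p1, p2, bone_length, scale):
    """Magnitude of the depth difference between two joints.

    p1, p2: (u, v) image coordinates of the joints.
    bone_length: known 3D length of the bone joining them.
    scale: scale factor of the scaled orthographic camera.
    """
    du = (p1[0] - p2[0]) / scale
    dv = (p1[1] - p2[1]) / scale
    squared = bone_length ** 2 - du * du - dv * dv
    if squared < 0:
        # The projected bone is longer than the bone itself: inconsistent input.
        raise ValueError("projected length exceeds bone length for this scale")
    return math.sqrt(squared)

# A bone of length 5 whose projection has length 3 at scale 1.
print(relative_depth((0.0, 0.0), (3.0, 0.0), bone_length=5.0, scale=1.0))  # 4.0
```

Because only the magnitude is recovered, each bone leaves a two-way ambiguity (towards or away from the camera) that has to be resolved by other constraints, for example joint limits or temporal continuity.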

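A view-based algorithm for comparing the similarity of 3D models is proposed. Orthogonal planar projection images carrying depth information are first computed for each 3D model. Zernike descriptors and Reeb graphs are then used to compare the shape similarity of these projection images. Finally, the similarity of the 3D models is obtained from the shape similarity of their projection images. Experiments show that the algorithm retrieves 3D models with good accuracy and is robust to rotation of the coordinate system, model noise, and mesh simplification and subdivision.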

Measuring Video Similarity   (cited by: 7; self-citations: 2; citations by others: 7)
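In content-based video retrieval systems, the most common retrieval mode is query by example video: the user submits a video and the system returns a series of similar videos. However, how to define when two videos are similar remains a difficult problem. This paper introduces a new method to address it. First, the concept of a shot centroid feature vector is proposed, which reduces the storage required for key-frame features. Second, drawing on factors underlying human visual judgment, methods are proposed for measuring similarity between shots and overall similarity between videos, which gives great flexibility for measurement at different granularities and is also reasonable in practice. Retrieval experiments demonstrate the effectiveness of the algorithm.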
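
The abstract does not define the shot centroid feature vector. The sketch below assumes the simplest reading, the arithmetic mean of a shot's key-frame feature vectors, and compares two shots by the Euclidean distance between their centroids. Both choices, and all names, are assumptions made for illustration.

```python
import math

def shot_centroid(key_frame_features):
    """Mean of the key-frame feature vectors of one shot (assumed definition)."""
    count = len(key_frame_features)
    dim = len(key_frame_features[0])
    return [sum(f[i] for f in key_frame_features) / count for i in range(dim)]

def shot_distance(shot_a, shot_b):
    """Euclidean distance between the centroids of two shots (assumed measure)."""
    return math.dist(shot_centroid(shot_a), shot_centroid(shot_b))

# Hypothetical 2D features for the key frames of two shots.
shot_1 = [[0.0, 2.0], [2.0, 4.0]]
shot_2 = [[4.0, 7.0]]

print(shot_centroid(shot_1))          # [1.0, 3.0]
print(shot_distance(shot_1, shot_2))  # 5.0
```

Under this reading the storage saving is direct: one vector is kept per shot regardless of how many key frames the shot contains.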

A Progressive Similarity Matching Algorithm for 3D Models   (cited by: 1; self-citations: 1; citations by others: 0)
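An algorithm for constructing a feature binary tree of a 3D model is proposed, and the similarity of 3D models is obtained by matching their feature binary trees. The feature binary tree is independent of rotation and translation of the model coordinate system and is suited to progressive matching of 3D models. Experiments show that the algorithm matches the similarity of 3D models well.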

Accurate depth estimation is a challenging yet essential step in the conversion of a 2D image sequence to a 3D stereo sequence. We present a novel approach to constructing a temporally coherent depth map for each image in a sequence. The quality of the estimated depth is high enough for the purpose of 2D to 3D stereo conversion. Our approach first combines the video sequence into a panoramic image. A user can scribble on this single panoramic image to specify depth information. The depth is then propagated to the remainder of the panoramic image. This depth map is then remapped to the original sequence and used as the initial guess for each individual depth map in the sequence. Our approach greatly simplifies the user interaction required during depth assignment and allows for relatively free camera movement during the generation of the panoramic image. We demonstrate the effectiveness of our method by showing stereo-converted sequences with various camera motions.

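Node self-localization is one of the key technologies of wireless sensor networks. The 3D sequence centroid algorithm uses the perpendicular bisector planes between every pair of anchor nodes to divide the localization space into three kinds of regions (edges, faces, and volumes), which narrows the range in which an unknown node may lie. Within that range, the centroid of the triangle formed by the three points nearest to the unknown node is then computed as the estimate of the node's position. The algorithm reduces the large error of the 2D sequence algorithm and needs no additional hardware for special functions. Simulation results show that the algorithm achieves high localization accuracy and meets the needs of unknown-node localization in 3D space.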

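Research on Fusion Algorithms for Infrared and Visible-Light Video Sequences   (cited by: 1; self-citations: 0; citations by others: 1)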
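A fusion method for infrared and visible-light video sequences based on dynamic target region detection is proposed. An improved hybrid frame-difference method detects the target regions in the infrared image sequence. A new image fusion rule based on the nonsubsampled contourlet transform fuses the target regions of the infrared and visible-light images, and the fused target region is combined with the background of the registered visible-light image to obtain the final fused image. Experiments show that, compared with other traditional methods, the images produced by the new algorithm have the largest information entropy, standard deviation, and mutual information, and that its fusion quality is better than that of the other algorithms. The result retains the target features of the infrared image well and preserves the detail of the visible-light image, and the method is shift-invariant with good real-time performance.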

Research on Background Modeling Based on Long Video Sequences   (cited by: 1; self-citations: 0; citations by others: 1)
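To address the difficulty that existing background modeling algorithms have with non-stationary scene changes, a background modeling method based on long video sequences is proposed. The method comprises three main steps: training, retrieval, and updating. In training, the long video is first cut into segments and the background image of each segment is computed; background descriptors are then obtained by image downsampling and dimensionality reduction, and a clustering algorithm groups the descriptors to generate a background memory dictionary. In retrieval, a non-stationary state test is designed from the proportion of foreground pixels; if a non-stationary change occurs, the distances between the descriptor of the current image and the descriptors in the background dictionary are computed, and the background image of the nearest descriptor is taken as the current background. In updating, an update test is designed from the proportion of foreground pixels; if the foreground proportion stays too large, a new background is generated and the background dictionary and background image library are updated. When a non-stationary change occurs (such as a sudden lighting change), the algorithm converts background model recovery into background retrieval, ensuring that a background model is obtained stably. The framework is combined with short-term spatial-domain background models (ViBe and MOG as examples), and testing focuses on background estimation and moving-object detection under non-stationary changes. Tests on multiple video sequences show that the framework handles non-stationary changes effectively, improves object detection, and significantly reduces the false detection rate.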
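
The retrieval step described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: the dictionary is a plain list of (descriptor, background id) pairs, the distance is Euclidean, and the non-stationary test is a single fixed threshold on the foreground pixel ratio. The threshold value of 0.5 and all names are hypothetical.

```python
import math

def nearest_background(descriptor, dictionary):
    """Return the id of the dictionary entry whose descriptor is closest."""
    closest = min(dictionary, key=lambda entry: math.dist(descriptor, entry[0]))
    return closest[1]

def select_background(foreground_ratio, descriptor, dictionary,
                      current_id, threshold=0.5):
    """Keep the current background unless the scene looks non-stationary.

    A foreground ratio above the threshold is treated as a non-stationary
    change (an assumed rule), which triggers a dictionary lookup.
    """
    if foreground_ratio > threshold:
        return nearest_background(descriptor, dictionary)
    return current_id

# Hypothetical dictionary with two stored backgrounds.
dictionary = [([0.0, 0.0], "day"), ([10.0, 10.0], "night")]

print(select_background(0.9, [9.0, 8.0], dictionary, "day"))  # night
print(select_background(0.1, [9.0, 8.0], dictionary, "day"))  # day
```

The lookup replaces a slow re-estimation of the background with a nearest-neighbor search, which is the sense in which model recovery becomes a retrieval problem.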

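This paper applies the shape distribution algorithm to measure the similarity of 3D bones reconstructed from CT data, and proposes improvements. The basic steps are: select random points on the model surface, compute the distance between each pair of points, build a shape distribution histogram from the shape distribution function, and compare the shape distribution histograms to give the similarity between different bones. Experiments show that the method measures bone similarity accurately.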
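
The basic steps listed above can be sketched as below. This is a minimal illustration of the generic pairwise-distance shape distribution, not the improved version the abstract mentions. It assumes the surface points have already been sampled, uses all point pairs, a fixed histogram range, and an L1 difference between histograms; the names, bin count, and the synthetic sphere and ellipsoid point sets are hypothetical stand-ins for real bone models.

```python
import math
import random
from itertools import combinations

def shape_distribution(points, bins, max_distance):
    """Normalized histogram of distances between all pairs of surface points."""
    counts = [0] * bins
    pairs = 0
    for p, q in combinations(points, 2):
        index = min(int(math.dist(p, q) / max_distance * bins), bins - 1)
        counts[index] += 1
        pairs += 1
    return [c / pairs for c in counts]

def histogram_difference(h1, h2):
    """L1 difference between two histograms; 0 means identical distributions."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def sample_sphere(count, rng, stretch=1.0):
    """Random points on a unit sphere, optionally stretched along x."""
    points = []
    for _ in range(count):
        x, y, z = (rng.gauss(0.0, 1.0) for _ in range(3))
        norm = math.sqrt(x * x + y * y + z * z)
        points.append((stretch * x / norm, y / norm, z / norm))
    return points

rng = random.Random(0)
sphere = shape_distribution(sample_sphere(200, rng), bins=8, max_distance=6.0)
ellipsoid = shape_distribution(sample_sphere(200, rng, stretch=3.0),
                               bins=8, max_distance=6.0)

print(histogram_difference(sphere, sphere))     # 0.0
print(histogram_difference(sphere, ellipsoid))  # positive: the shapes differ
```

Both histograms must share the same bin count and distance range for the comparison to be meaningful, so in practice models are usually normalized to a common scale first.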


