Similar Documents
20 similar documents found; search took 119 ms.
1.
胡正平  张敏姣  邱悦  潘佩云  郑媛 《信号处理》2019,35(7):1180-1190
To address the low pedestrian re-identification rates caused by internal and external variations such as pedestrian pose, occlusion, illumination, and camera equipment, a video pedestrian re-identification algorithm is proposed that combines spatio-temporal features with camera-correlated adaptive feature enhancement and MFA. The method first extracts histogram of oriented 3D gradients (HOG3D) features from the video and appearance features from individual images, then combines the two as the feature descriptor of the pedestrian target, improving the effectiveness of the feature description. In the distance-metric stage, the features undergo adaptive feature enhancement before Marginal Fisher Analysis (MFA), strengthening the links among common features and further improving the discriminability of the metric. Experiments on the iLIDS-VID and PRID 2011 video pedestrian datasets examine the performance gains from adding the HOG3D features and the camera-correlated adaptive feature enhancement; the results show that the algorithm makes full use of the motion information contained in video, yields a robust video pedestrian re-identification matching model, and improves matching accuracy.

2.
A Pedestrian Re-identification Algorithm Based on Statistical Inference
Pedestrian re-identification is the task of, given a pedestrian image, identifying images of the same person in an existing gallery of pedestrian images that may come from non-overlapping camera views. The problem has great practical significance and also poses many challenges. This paper proposes a pedestrian re-identification algorithm based on statistical inference. The algorithm learns a similarity metric between two pedestrian images from a statistical-inference perspective and uses this function to search the gallery for the query person. Experiments on the public VIPeR dataset show that the algorithm outperforms existing pedestrian re-identification algorithms, that the time cost of learning the similarity metric is markedly lower than that of existing learning-based algorithms, and that it alleviates overfitting of the learned metric when only a few training samples are available.

3.
《现代电子技术》2020,(5):36-41
Cross-camera pedestrian re-identification is a challenging task because of variations in illumination, viewpoint, and pose. To further improve matching accuracy, a more discriminative feature representation, the enhanced Local Maximal Occurrence (eLOMO) descriptor, is designed, and a matching model that fuses multiple distance metrics via boosting is proposed. When extracting eLOMO features, color and texture features are taken at two different scales, horizontal stripes and dense grids, yielding a more discriminative pedestrian appearance descriptor. For matching, AdaBoost is used to combine the strengths of several distance-metric learning models. Experimental results on the public VIPeR and PRID450S re-identification datasets show that the method effectively improves re-identification performance.
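
The abstract names AdaBoost as the fusion mechanism for several distance metrics but gives no formulas. A minimal sketch of boosting-style fusion, with hypothetical base metrics and the standard AdaBoost weight rule (not necessarily the authors' exact model), might look like:

```python
import numpy as np

def euclidean(a, b):
    return np.linalg.norm(a - b)

def cosine_dist(a, b):
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def boosted_weights(errors):
    """AdaBoost-style weights: a metric with training error err gets
    weight 0.5 * ln((1 - err) / err), larger for more reliable metrics."""
    errors = np.asarray(errors, dtype=float)
    return 0.5 * np.log((1.0 - errors) / errors)

def fused_distance(a, b, metrics, weights):
    """Final matching score: weighted sum of the base distance metrics."""
    return sum(w * m(a, b) for w, m in zip(weights, metrics))
```

A metric that errs on 20% of training pairs thus outweighs one that errs on 40%, and identical descriptors still fuse to distance zero.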

4.
With the wide deployment of video surveillance equipment, pedestrian re-identification has become a key task in intelligent video surveillance with broad application prospects. This paper proposes a pedestrian re-identification algorithm based on foreground extraction with a deep decomposition network (DDN) and mapping-model learning. The DDN model first segments the foreground of the pedestrian image; color-histogram features of the foreground and Gabor texture features of the original image are then extracted. Using these pedestrian features, a cross-camera mapping model is learned; the learned model transforms the features of the probe and gallery sets into a space with more consistent feature distributions, where distance measurement and ranking are performed. Experiments show that the algorithm extracts relatively robust pedestrian features, overcomes background interference, and effectively improves the re-identification matching rate.

5.
Pedestrian re-identification remains challenging because of variations in target pose, camera angle, and lighting. Most existing methods focus on improving accuracy and pay little attention to real-time performance. This paper therefore proposes a real-time pedestrian re-identification algorithm based on enhanced aggregated channel features (ACF). ACF is used to detect pedestrians, and on this basis histogram features and texture features are combined with it to form the enhanced ACF, which serves as the feature descriptor for re-identification. A metric-learning method is used to train the re-identification model. Experiments on four datasets show that, compared with traditional re-identification features, the proposed descriptor approaches the best re-identification accuracy while being faster to compute. The complete pedestrian detection and re-identification system runs at over 10 frame·s^(-1), largely meeting the requirements of real-time pedestrian re-identification.

6.
张述照  阮秋琦  安高云 《信号处理》2014,30(11):1279-1285
To track pedestrians accurately under occlusion, this paper proposes an adaptive tracking algorithm that continually learns new appearance models. The algorithm first takes color-invariant feature planes as root features to represent the initial feature space. Tracking is then cast as a binary (0/1) classification problem, and partial least squares (PLS) is used to model target appearance features and their class labels, yielding foreground and background templates. As the target's appearance changes, PLS analyzes sample information from multiple appearance features on the color-invariant planes and continually updates the templates, achieving pedestrian tracking that is robust to occlusion. Experiments on common datasets show that the algorithm tracks well in video frames with both dull and vivid colors.

7.
For point clouds acquired by a 2D lidar, a multi-algorithm combination that reduces the pedestrian-recognition error rate is proposed, building on an arc-like leg-shape recognition algorithm. The method first applies Gaussian filtering to reduce noise, then clusters the data with a nearest-neighbor clustering algorithm, and finally detects pedestrian legs in the clustered data using a combined cluster-center-angle algorithm and least-squares circle fitting, completing pedestrian recognition. The algorithm is implemented on a mixed LabVIEW and MATLAB platform and was validated with a lidar on pedestrians in the field; compared with any single recognition algorithm, the combined approach reduced the pedestrian-recognition error rate from 40% to below 10%, demonstrating good performance.
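
The leg-detection step relies on least-squares circle fitting of 2D lidar points. As an illustration only (the abstract does not specify the fitting variant), the standard Kåsa linear least-squares circle fit can be written as:

```python
import numpy as np

def fit_circle(points):
    """Kasa least-squares circle fit.

    A circle x^2 + y^2 + a*x + b*y + c = 0 is linear in (a, b, c),
    so we solve a*x + b*y + c = -(x^2 + y^2) in the least-squares sense
    and recover center (cx, cy) = (-a/2, -b/2), radius r = sqrt(cx^2 + cy^2 - c)."""
    x, y = points[:, 0], points[:, 1]
    A = np.column_stack([x, y, np.ones_like(x)])
    rhs = -(x**2 + y**2)
    (a, b, c), *_ = np.linalg.lstsq(A, rhs, rcond=None)
    cx, cy = -a / 2.0, -b / 2.0
    r = np.sqrt(cx**2 + cy**2 - c)
    return cx, cy, r
```

A cluster whose fitted radius falls in a plausible leg range (roughly 5-15 cm) would then be a leg candidate.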

8.
Pedestrian detection against complex backgrounds is easily disturbed by the environment (trees in particular), leading to inaccurate recognition, and pedestrian features are difficult to extract. Using the image depth information obtained by binocular vision, this paper proposes a binocular target-depth localization method that reduces the computation of traditional feature description and yields regions of interest. A convolutional-neural-network pedestrian model is then introduced to recognize the regions of interest precisely, achieving pedestrian detection. The model is trained on a mixture of the INRIA pedestrian database and the MIT database, deep-learning the contour features of pedestrians to form a pedestrian model for judging the regions of interest. Experiments show that the proposed method is effective and outperforms both general deep-learning pedestrian detectors and traditional pedestrian-detection methods.

9.
To improve the robustness of the MLAPG distance-metric algorithm for person re-identification, this paper proposes Equid-MLAPG, a re-identification algorithm based on an equidistant metric-learning strategy. In MLAPG, the unbalanced distribution of positive and negative sample pairs in the projected space makes the margin hyperparameter more strongly affected by the distances of negative pairs. The proposed Equid-MLAPG algorithm therefore requires each positive pair to map to a single point in the transformed space, i.e., the positive-pair distance is zero, so that at convergence the distance distributions of positive and negative pairs do not overlap. Experiments show that Equid-MLAPG achieves good results on commonly used re-identification datasets, with better recognition rates and broad applicability.

10.
《现代电子技术》2019,(10):175-178
In pedestrian re-identification, illumination, camera settings, and other factors affect the color of pedestrian images, and some image detail is lost during feature extraction. To address this, a re-identification method based on fusing features from overlapping stripes is proposed. Before feature extraction, the image is segmented into overlapping stripes, and an HSV color histogram and a Gabor texture-feature histogram are extracted from each stripe. The HSV histogram strengthens the discriminability of the color information; the overlapping-stripe segmentation mitigates the loss of image detail; and the Gabor features, being sensitive to edges, add detail information. The extracted features are fused into a feature descriptor, and a cross-view logistic metric-learning algorithm is then used for recognition. Experiments on the VIPeR and GRID datasets reach rank-1 rates of 31.68% and 16.32% respectively, with clear gains at rank-10 and rank-20. The results show that the proposed method improves the re-identification rate.
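
The overlapping-stripe segmentation can be sketched as below; the stripe height, step, and the grayscale histogram used as the per-stripe descriptor are illustrative stand-ins for the HSV and Gabor histograms actually used:

```python
import numpy as np

def overlapping_stripes(img, stripe_h=16, step=8):
    """Split an H x W image into horizontal stripes of height stripe_h,
    sliding by step rows so that consecutive stripes overlap."""
    return [img[top:top + stripe_h]
            for top in range(0, img.shape[0] - stripe_h + 1, step)]

def stripe_histograms(img, bins=8):
    """Hypothetical per-stripe descriptor: an intensity histogram for each
    overlapping stripe, concatenated into one feature vector."""
    feats = []
    for s in overlapping_stripes(img):
        h, _ = np.histogram(s, bins=bins, range=(0, 256), density=True)
        feats.append(h)
    return np.concatenate(feats)
```

Because each image row falls into two stripes (except at the borders), detail near stripe boundaries contributes to two histogram blocks instead of being split.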

11.
To determine the three-dimensional (3-D) shape of a live embryo is a technically challenging task. The authors show that reconstructions of live embryos can be done by collecting images from different viewing angles using a robotic macroscope, establishing point correspondences between these views by block matching, and using a new 3-D reconstruction algorithm that accommodates camera positioning errors. The algorithm assumes that the images are orthographic projections of the object and that the camera scaling factors are known. Point positions and camera errors are found simultaneously. Reconstructions of test objects and embryos show that meaningful reconstructions are possible only when camera positioning and alignment errors are accommodated since these errors can be substantial. Reconstructions of early-stage axolotl embryos were made from sets of 33 images. In a typical reconstruction, 781 points, each visible in at least three different views, were used to form 1511 triangles to represent the embryo surface. The resulting reconstruction had a mean radius of error of 0.27 pixels (1.1 μm). Mathematical properties of the reconstruction algorithm are identified and discussed.

12.
邵枫  蒋刚毅  郁梅 《光电子快报》2009,5(3):232-235
Color and geometry inconsistency between different views is an urgent problem in multi-view imaging applications. In this paper, we present a color correction and geometric calibration method for multi-view images based on feature correspondences between views. First, keypoints in the views are detected with the scale-invariant feature transform and accurately matched by bi-directional feature matching between different views. Multiplicative and additive errors between matching keypoints are then calculated to achieve color correction. In addition, an affine transformation between minimum-cost matching keypoints is established to achieve geometric calibration. The experimental results verify the effectiveness of the proposed method in color correction and geometric calibration, and a higher coding efficiency is obtained.
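
Estimating multiplicative and additive errors between matched keypoints amounts to fitting a per-channel gain and offset by least squares. A minimal sketch under that reading (function names are hypothetical):

```python
import numpy as np

def gain_offset(src_vals, ref_vals):
    """Fit ref ~= gain * src + offset over the intensities of matched
    keypoints, one color channel at a time (least squares)."""
    A = np.column_stack([src_vals, np.ones_like(src_vals)])
    (gain, offset), *_ = np.linalg.lstsq(A, ref_vals, rcond=None)
    return gain, offset

def correct(channel, gain, offset):
    """Apply the multiplicative/additive correction to a whole channel."""
    return np.clip(gain * channel + offset, 0.0, 255.0)
```

The fitted (gain, offset) pair is then applied to every pixel of the view being corrected, not just the keypoints.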

13.
Nowadays short texts can be widely found in various social data in relation to the 5G-enabled Internet of Things (IoT). Short text classification is a challenging task due to its sparsity and the lack of context. Previous studies mainly tackle these problems by enhancing the semantic information or the statistical information individually. However, the improvement achieved by a single type of information is limited, while fusing various information may help to improve the classification accuracy more effectively. To fuse various information for short text classification, this article proposes a feature fusion method that integrates the statistical feature and the comprehensive semantic feature together by using the weighting mechanism and deep learning models. In the proposed method, we apply Bidirectional Encoder Representations from Transformers (BERT) to generate word vectors on the sentence level automatically, and then obtain the statistical feature, the local semantic feature and the overall semantic feature using the Term Frequency-Inverse Document Frequency (TF-IDF) weighting approach, a Convolutional Neural Network (CNN) and a Bidirectional Gated Recurrent Unit (BiGRU). The fusion feature is then obtained for classification. Experiments are conducted on five popular short text classification datasets and a 5G-enabled IoT social dataset, and the results show that our proposed method effectively improves the classification performance.
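
As an illustration of the TF-IDF weighting step, the sketch below computes TF-IDF weights from tokenized documents and uses them to weight-average word vectors. The tiny one-hot vectors stand in for the BERT word vectors of the abstract, and all names are hypothetical:

```python
import numpy as np

def tfidf_weights(docs):
    """Build a sorted vocabulary and a dense doc-by-term TF-IDF matrix
    from tokenized documents (lists of word strings)."""
    vocab = sorted({w for d in docs for w in d})
    idx = {w: i for i, w in enumerate(vocab)}
    tf = np.zeros((len(docs), len(vocab)))
    for i, d in enumerate(docs):
        for w in d:
            tf[i, idx[w]] += 1.0
        tf[i] /= max(len(d), 1)           # term frequency
    df = np.count_nonzero(tf, axis=0)      # document frequency
    idf = np.log(len(docs) / df) + 1.0     # smoothed inverse doc frequency
    return vocab, tf * idf

def weighted_sentence_vector(doc, word_vecs, vocab, weights):
    """Fuse word vectors into one sentence vector, each word scaled by
    its TF-IDF weight (a stand-in for the statistical feature branch)."""
    idx = {w: i for i, w in enumerate(vocab)}
    vs = np.array([word_vecs[w] for w in doc])
    ws = np.array([weights[idx[w]] for w in doc])
    return (ws[:, None] * vs).sum(axis=0) / ws.sum()
```

In the paper's pipeline this statistical branch would be concatenated with the CNN and BiGRU semantic features before the classifier.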

14.
张国山  张培崇  王欣博 《红外与激光工程》2018,47(2):203004-0203004(9)
Perceptual aliasing and perceptual variability caused by drastic changes in scene appearance pose great challenges for visual scene recognition. Most existing methods that use convolutional neural networks (CNNs) directly take the distance between CNN features, with a threshold, to measure the similarity of two images, which works poorly when scene appearance changes drastically. This paper therefore proposes a new visual scene recognition method based on multi-level feature difference maps. First, a CNN model pre-trained on a scene-centric dataset extracts features from perceptually varied images of the same scene and perceptually aliased images of different scenes. Then, exploiting the distinct characteristics of different CNN layers, multi-layer CNN features are fused to construct a multi-level feature difference map that characterizes the difference between two images. Finally, visual scene recognition is treated as a binary classification problem: a new CNN classification model is trained on the feature difference maps to decide whether two images come from the same scene. Experimental results show that difference maps built from multi-layer CNN features reflect the differences between two images well, and the proposed method effectively overcomes perceptual aliasing and perceptual variability, achieving good recognition under drastic changes in scene appearance.

15.
This work presents a novel method for the visual servoing control problem based on second-order conic optimization. Special cases of the proposed method provide similar results as those obtained by the position-based and image-based visual servoing methods. The goal in our approach is to minimize both the end-effector trajectory in the Cartesian space and image feature trajectories simultaneously. For this purpose, a series of second-order conic optimization problems is solved. Each problem starts from the current camera pose and finds the camera velocity as well as the next camera pose such that (1) the next camera pose is as close as possible to the line connecting the initial and desired camera poses, and (2) the next feature points are as close as possible to the corresponding lines connecting the initial and desired feature points. To validate our approach, we provide simulations and experimental results for several different camera configurations.

16.
In multi-view video, a number of cameras capture the same scene from different viewpoints. Color variations between the camera views may deteriorate the performance of multi-view video coding or virtual view rendering. In this paper, a fast color correction method for multi-view video is proposed by modeling spatio-temporal variation. In the proposed method, multi-view keyframes are defined to establish the spatio-temporal relationships needed for accurate and fast implementation. For keyframes, accurate color correction is performed with a spatial color-discrepancy model: disparity estimation finds correspondence points between views, and linear regression on these sets of points yields the optimal correction coefficients. For non-keyframes, fast color correction is performed with a temporal-variation model: time-invariant regions are detected to track the change trends of the correction coefficients. Experimental results show that, compared with other methods, the proposed method greatly increases correction speed without noticeable quality degradation and obtains higher coding performance.

17.
For the robust detection of pedestrians in intelligent video surveillance, an approach to multi-view and multi-plane data fusion is proposed. Through the estimated homography, foreground regions are projected from multiple camera views to a reference view. To identify false-positive detections caused by foreground intersections of non-corresponding objects, the homographic transformations for a set of parallel planes, which range from the head plane to the ground, are applied. Multiple features including occupancy information and colour cues are extracted from these planes for joint decision-making. Experimental results on real-world sequences have demonstrated the good performance of the proposed approach in pedestrian detection for intelligent visual surveillance.
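
Projecting foreground regions between views through an estimated homography reduces to mapping points in homogeneous coordinates. A minimal sketch (the homography estimation itself is assumed given):

```python
import numpy as np

def project_points(H, pts):
    """Map N 2-D points through a 3x3 homography H: lift to homogeneous
    coordinates, multiply, and divide by the third coordinate."""
    pts_h = np.column_stack([pts, np.ones(len(pts))])
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]
```

In the multi-plane scheme, one such H exists per parallel plane (head plane down to the ground), and the same foreground points are projected through each of them before fusion.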

18.
In this article, we present some simple and effective techniques for accurately calibrating a multi-camera acquisition system. The proposed methods proved capable of accurate results even with very simple calibration target sets and low-cost imaging devices, such as standard TV-resolution cameras connected to commercial frame-grabbers. In fact, our calibration approach yielded results about the same as those of other traditional calibration methods based on large 3D target sets. The proposed calibration strategy is based on a multi-view, multi-camera approach: it analyzes a number of views of a simple calibration target set placed in different (unknown) positions. Furthermore, the method uses a self-calibration approach, which can refine the a priori knowledge of the world coordinates of the targets (even when such information is very poor) while estimating the parameters of the camera model. Finally, we propose a method to make the calibration technique adaptive through the analysis of natural scene features, allowing the camera parameters to remain accurate throughout the acquisition session in the presence of parameter drift.

19.
Demosaicing, or color filter array (CFA) interpolation, estimates missing color channels of raw mosaiced images from a CFA to reproduce full‐color images. It is an essential process for single‐sensor digital cameras with CFAs. In this paper, a new demosaicing method for digital cameras with Bayer‐like W‐RGB CFAs is proposed. To preserve the edge structure when reproducing full‐color images, we propose an edge direction–adaptive method using color difference estimation between different channels, which can be applied to practical digital camera use. To evaluate the performance of the proposed method in terms of CPSNR, FSIM, and S‐CIELAB color distance measures, we perform simulations on sets of mosaiced images captured by an actual prototype digital camera with a Bayer‐like W‐RGB CFA. The simulation results show that the proposed method demosaics better than a conventional one by approximately +22.4% CPSNR, +0.9% FSIM, and +36.7% S‐CIELAB distance.

20.
Multiview super resolution image reconstruction (SRIR) is often cast as a resampling problem by merging non-redundant data from multiple images on a finer grid, while inverting the effect of the camera point spread function (PSF). One main problem with multiview methods is that resampling from nonuniform samples (provided by multiple images) and the inversion of the PSF are highly nonlinear and ill-posed problems. Non-linearity and ill-posedness are typically overcome by linearization and regularization, often through an iterative optimization process, which essentially trades off the very same information (i.e. high frequency) that we want to recover. We propose a different point of view for multiview SRIR that is very much like single-image methods which extrapolate the spectrum of one image selected as reference from among all views. However, for this, the proposed method relies on information provided by all other views, rather than prior constraints as in single-image methods, which may not be an accurate source of information. This is made possible by deriving explicit closed-form expressions that define how the local high frequency information that we aim to recover for the reference high resolution image is related to the local low frequency information in the sequence of views. The locality of these expressions due to modeling using wavelets reduces the problem to an exact and linear set of equations that are well-posed and solved algebraically without requiring regularization or interpolation. Results and comparisons with recently published state-of-the-art methods show the superiority of the proposed solution.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号