Similar literature
20 similar documents found
1.
Pan Baiyu, Zhang Liming, Yin Hanxiong, Lan Jun, Cao Feilong. 《Multimedia Tools and Applications》 2021, 80(13): 19179-19201

3D movies/videos have become increasingly popular in the market; however, they are usually produced by professionals. This paper presents a new technique for the automatic conversion of 2D to 3D video based on RGB-D sensors, which can be easily conducted by ordinary users. To generate a 3D image, one approach is to combine the original 2D color image and its corresponding depth map together to perform depth image-based rendering (DIBR). An RGB-D sensor is one of the inexpensive ways to capture an image and its corresponding depth map. The quality of the depth map and the DIBR algorithm are crucial to this process. Our approach is twofold. First, the depth maps captured directly by RGB-D sensors are generally of poor quality because there are many regions missing depth information, especially near the edges of objects. This paper proposes a new RGB-D sensor based depth map inpainting method that divides the regions with missing depths into interior holes and border holes. Different schemes are used to inpaint the different types of holes. Second, an improved hole filling approach for DIBR is proposed to synthesize the 3D images by using the corresponding color images and the inpainted depth maps. Extensive experiments were conducted on different evaluation datasets. The results show the effectiveness of our method.


2.
Multimedia Tools and Applications - With the advent of stereo cameras, salient object detection for RGB-D images is attracting more and more interest. Most existing algorithms treat RGB-D image as...

3.
This paper proposes a two-stage system for text detection in video images. In the first stage, text lines are detected based on the edge map of the image, leading to a high recall rate at low computational cost. In the second stage, the result is refined using a sliding window and an SVM classifier trained on features obtained by a new Local Binary Pattern-based operator (eLBP) that describes the local edge distribution. The whole algorithm is used in a multiresolution fashion, enabling detection of characters over a broad size range. Experimental results, based on a new evaluation methodology, show the promising overall performance of the system on a challenging corpus and demonstrate the superior discriminating ability of the proposed feature set against the best features reported in the literature.

4.
3D object detection is a critical part of environmental perception systems and one of the most fundamental tasks in understanding the 3D visual world, benefiting a series of downstream real-world applications. RGB-D images include object texture and semantic information, as well as depth information describing spatial geometry. Recently, numerous 3D object detection models for RGB-D images have been proposed with excellent performance, but surveys of this area are still absent. To stimulate future research, this paper provides a detailed analysis of current developments in 3D object detection methods for RGB-D images. It covers three major parts: background on 3D object detection, RGB-D data details, and comparative results of state-of-the-art methods on several publicly available datasets, with an emphasis on contributions, design ideas, and limitations, as well as insightful observations and inspiring future research directions.

5.
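A face occlusion detection method for surveillance scenes is proposed. Face verification is performed with the AdaBoost algorithm, and the face is divided into regions that are analyzed block by block for occlusion. The method first determines whether a person has entered the scene. If so, face occlusion detection is carried out: the eye region is checked for occlusion using AdaBoost together with a sunglasses feature extraction method, while the mouth region is checked using a Gaussian skin-color model. Experimental results show that the method detects face occlusion in real time with good results, is suitable for surveillance scenes such as bank ATMs, and has high practical value.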

6.
Yang Ning, Zhang Chen, Zhang Yumo, Yang Haowei, Du Ling. 《Multimedia Tools and Applications》 2022, 81(25): 35831-35842
Multimedia Tools and Applications - Within-image co-salient object detection (wCoSOD) identifies the common and salient objects within an image, which can benefit many applications, such as...

7.
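Objective: Visual saliency plays an important role in many vision-driven applications. These application areas are shifting from 2D to 3D vision, so saliency models based on RGB-D data have attracted wide attention. Unlike saliency in 2D images, RGB-D saliency involves cues from many different modalities. These multimodal cues are both complementary and competing, and how to exploit and fuse them effectively remains a challenge. Traditional fusion models struggle to take full advantage of multimodal cues, so this work studies multimodal cue fusion in the formation of RGB-D saliency. Method: An RGB-D saliency detection model based on a superpixel-level conditional random field is proposed. Saliency cues of different modalities are extracted, including planar cues, depth cues, and motion cues. A conditional random field is built with superpixels as its units. Combining the influence of the multimodal cues with a smoothness constraint on the saliency values of neighboring image regions, a global energy function is designed as the optimization objective of the model; it characterizes the interaction among the multimodal cues. The weights of the multimodal cues in the energy function are learned by a convolutional neural network. Results: The model was compared with six saliency detection methods on two public RGB-D video saliency datasets and outperformed the current state-of-the-art models on all relevant datasets and evaluation metrics. Relative to the second-best scores, its AUC (area under curve), sAUC (shuffled AUC), SIM (similarity), PCC (Pearson correlation coefficient), and NSS (normalized scanpath saliency) improved by 2.3%, 2.3%, 18.9%, 21.6%, and 56.2% on the IRCCyN dataset, and by 2.0%, 1.4%, 29.1%, 10.6%, and 23.3% on the DML-iTrack-3D dataset. Internal comparisons of the model further verified that the proposed fusion method outperforms other traditional fusion methods. Conclusion: The conditional random field and convolutional neural network in the proposed RGB-D saliency detection model make full use of the strengths of cues from different modalities and fuse them effectively, which improves saliency detection performance and can be useful in vision-driven applications.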
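Two of the metrics named in this abstract, PCC and NSS, have simple standard definitions. The sketch below is a minimal illustration of those textbook definitions and is not taken from the paper: maps are assumed to be flattened into plain lists, and the function names `pearson_cc` and `nss` are hypothetical.

```python
import math


def pearson_cc(predicted, reference):
    """Pearson correlation between two equally sized, flattened maps."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_r = sum(reference) / n
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(predicted, reference))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    return cov / math.sqrt(var_p * var_r)


def nss(predicted, fixation_indices):
    """Mean of the z-scored saliency values at the fixated positions."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    std_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted) / n)
    scores = [(predicted[i] - mean_p) / std_p for i in fixation_indices]
    return sum(scores) / len(scores)
```

Both functions assume a non-constant saliency map; a constant map has zero variance and would need separate handling.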

8.
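To obtain multi-lane information in traffic surveillance scenes automatically and effectively, a multi-lane detection algorithm based on skeletonized edges is proposed, removing the dependence of video processing on a fixed scene and on explicit prior lane position information. The algorithm mainly processes a static traffic background image: after background extraction, filtering, and morphological preprocessing, the Hough transform determines the skeleton lines of the lane positions. The lane-line angle is then constrained by the driving direction, candidate lane lines are detected using the geometric imaging properties of lane lines, and the lane lines and lane regions are obtained. Experiments show that the method detects multiple lanes effectively under different traffic scenes and lighting conditions, is robust, and has high engineering value.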

9.
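Object recognition is the basis of intelligent analysis in video surveillance, but in scenes with illumination changes, shadows, and cluttered backgrounds, objects are often misjudged or clustered unreasonably. To address these problems, a surveillance-video object extraction method based on the human visual system (HVS) is proposed. Drawing on the HVS principle of visual attention, it corrects the repeated detections and wrong segmentations in background-subtraction results. Following the tracking characteristics of the HVS and the continuity of object motion, it also combines detection results from adjacent frames to extract object regions completely and accurately. Finally, simulation experiments on actually captured video show that the proposed object detection algorithm is more accurate and also performs well against complex backgrounds.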

10.
Variation in illumination conditions caused by weather, time of day, etc., makes it difficult to build video surveillance systems for real-world scenes. Cast shadows are especially troublesome, typically for object tracking from a fixed viewpoint, since they change the appearance of objects depending on whether the objects are inside or outside the shadow. In this paper, we handle such appearance variations by removing shadows in the image sequence. This can be considered a preprocessing stage that leads to robust video surveillance. To achieve this, we propose a framework based on the idea of intrinsic images. Unlike previous methods of deriving intrinsic images, we derive time-varying reflectance images and corresponding illumination images from a sequence of images instead of assuming a single reflectance image. Using the obtained illumination images, we normalize the input image sequence in terms of incident lighting distribution to eliminate shadowing effects. We also propose an illumination normalization scheme that can potentially run in real time, utilizing the illumination eigenspace, which captures the illumination variation due to weather, time of day, etc., and a shadow interpolation method based on shadow hulls. This paper describes the theory of the framework with simulation results and shows its effectiveness with object tracking results on real scene data sets.

11.
3D object pose estimation for grasping and manipulation is a crucial task in robotic and industrial applications. Robustness and efficiency are desirable properties for robotic manipulation that remain very challenging in complex and cluttered scenes, because 3D objects differ in appearance, illumination, and occlusion when seen from different viewpoints. This article proposes a Semantic Point Pair Feature (PPF) method for 3D object pose estimation, which combines semantic image segmentation using deep learning with voting-based 3D object pose estimation. The Part Mask RCNN is presented to obtain the semantic object-part segmentation related to the point cloud of the object, which is combined with the PPF method for 3D object pose estimation. To reduce the cost of collecting datasets in cluttered scenes, a physically simulated environment is constructed to generate labeled synthetic semantic datasets. Finally, two robotic bin-picking experiments are demonstrated, and the Part Mask RCNN for scene segmentation is evaluated on the constructed 3D object datasets. The experimental results show that the proposed Semantic PPF method improves the robustness and efficiency of 3D object pose estimation in cluttered scenes with partial occlusions.

12.
13.
In this paper, we propose a context-sensitive technique for unsupervised change detection in multitemporal remote sensing images. The technique is based on a fuzzy clustering approach and accounts for the spatial correlation between neighboring pixels of the difference image produced by comparing two images acquired over the same geographical area at different times. Since the ranges of pixel values of the difference image belonging to the two clusters (changed and unchanged) generally overlap, fuzzy clustering techniques seem to be an appropriate and realistic choice to identify them (the pattern recognition literature shows that fuzzy sets handle this type of situation very well). Two fuzzy clustering algorithms, namely the fuzzy c-means (FCM) and Gustafson-Kessel clustering (GKC) algorithms, are used for this task in the proposed work. For clustering, various image features are extracted using the neighborhood information of pixels. FCM and GKC are hybridized with two other optimization techniques, genetic algorithm (GA) and simulated annealing (SA), to further enhance performance. To show the effectiveness of the proposed technique, experiments are conducted on two multispectral and multitemporal remote sensing images. A fuzzy cluster validity index (Xie-Beni) is used to quantitatively evaluate the performance. Results are compared with those of existing Markov random field (MRF) and neural network based algorithms and found to be superior. The proposed technique is less time consuming and, unlike MRF, does not require any a priori knowledge of the distributions of changed and unchanged pixels.
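As an illustration of the fuzzy c-means step mentioned in this abstract, here is a minimal sketch of plain FCM on scalar difference values. It omits the neighborhood features, the GKC variant, and the GA/SA hybridization described in the abstract; the function name `fuzzy_c_means_1d`, the fuzzifier `m=2.0`, and the fixed iteration count are assumptions made for the example.

```python
def fuzzy_c_means_1d(values, centers, m=2.0, iterations=50):
    """Plain fuzzy c-means on scalar values.

    Returns the final centres and the memberships computed in the last
    iteration (one row per value, one column per cluster).
    """
    centers = list(centers)
    memberships = []
    for _ in range(iterations):
        # Membership update: u_ik is proportional to d_ik^(-2/(m-1)).
        memberships = []
        for x in values:
            dists = [abs(x - c) for c in centers]
            if min(dists) == 0.0:
                # The value coincides with a centre: full membership there.
                hit = dists.index(0.0)
                row = [1.0 if k == hit else 0.0 for k in range(len(centers))]
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
                total = sum(inv)
                row = [w / total for w in inv]
            memberships.append(row)
        # Centre update: mean of the values weighted by u_ik^m.
        for k in range(len(centers)):
            weights = [row[k] ** m for row in memberships]
            denom = sum(weights)
            if denom > 0.0:
                centers[k] = sum(w * x for w, x in zip(weights, values)) / denom
    return centers, memberships
```

With two clusters, a pixel can be labeled "changed" when its membership in the cluster with the larger centre exceeds 0.5.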

14.
The change-detection problem can be viewed as an unsupervised classification problem with two classes corresponding to changed and unchanged areas. Image differencing is a widely used approach to change detection. It is based on the idea of generating a difference image that represents the modulus of the spectral change vectors associated with each pixel in the study area. To separate out the changed and unchanged classes in the difference image automatically, any unsupervised technique can be used. Thresholding is one of the cheapest techniques among them. However, in thresholding approaches, selection of the best threshold value is not a trivial task. In this work, several non-fuzzy and fuzzy histogram thresholding techniques are investigated and compared for the change-detection problem. Experimental results, carried out on different multitemporal remote sensing images (acquired before and after an event), are used to assess the effectiveness of each of the thresholding techniques. Among all the thresholding techniques investigated here, Liu's fuzzy entropy followed by Kapur's entropy are found to be the most robust techniques.
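One of the techniques named in this abstract, Kapur's entropy thresholding, can be sketched briefly. The code below follows the standard maximum-entropy criterion on a histogram of difference-image values; it is an illustration under that assumption, not the authors' implementation, and the function name `kapur_threshold` is hypothetical.

```python
import math


def kapur_threshold(histogram):
    """Return the bin index t that maximises the summed entropies of the
    two classes {0..t} and {t+1..L-1} (Kapur's criterion)."""
    total = sum(histogram)
    probs = [count / total for count in histogram]
    best_t, best_entropy = 0, float("-inf")
    cumulative = 0.0
    for t in range(len(probs) - 1):
        cumulative += probs[t]
        low_mass, high_mass = cumulative, 1.0 - cumulative
        if low_mass <= 0.0 or high_mass <= 0.0:
            continue  # one class would be empty at this threshold
        h_low = -sum((p / low_mass) * math.log(p / low_mass)
                     for p in probs[: t + 1] if p > 0)
        h_high = -sum((p / high_mass) * math.log(p / high_mass)
                      for p in probs[t + 1:] if p > 0)
        if h_low + h_high > best_entropy:
            best_t, best_entropy = t, h_low + h_high
    return best_t
```

Pixels whose difference value falls in a bin above the returned index would be labeled changed, the rest unchanged.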

15.
Multimedia Tools and Applications - With the emergence of consumer RGB-D sensors, discriminative modeling has been shown to perform well in estimating human body pose. However, articulated hand...

16.
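To address the low efficiency of pedestrian detection in real-time analysis of surveillance video, a fast pedestrian detection method is proposed. First, motion detection extracts the moving regions, which are then size-extended, normalized, and stitched together according to the requirements of pedestrian detection. Next, the Haar features of each moving region are extracted quickly on the stitched image using an integral image, and a twin support vector machine performs fast feature classification. Finally, inter-frame filtering with a bounding-box intersection strategy reduces false pedestrian detections. Experiments show that the method detects pedestrians in real time with a lower detection error rate than existing mainstream methods.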
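The integral-image trick that makes Haar feature extraction fast can be shown in a few lines. This is a generic sketch of a summed-area table and one two-rectangle Haar feature, not the paper's pipeline; the function names and the list-of-lists image layout are assumptions made for the example.

```python
def integral_image(image):
    """Summed-area table with an extra zero row and zero column."""
    height, width = len(image), len(image[0])
    table = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += image[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def rect_sum(table, top, left, height, width):
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])


def haar_two_rect_horizontal(table, top, left, height, width):
    """Left half minus right half of a window; width is assumed even."""
    half = width // 2
    return (rect_sum(table, top, left, height, half)
            - rect_sum(table, top, left + half, height, half))
```

Once the table is built, every rectangle sum costs the same four lookups regardless of the rectangle size, which is why many Haar features can be evaluated per window in real time.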

17.
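To address the instability and high false-alarm rate of traditional fire flame detection techniques, a flame detection and recognition algorithm based on an artificial neural network is proposed. By analyzing the dynamic characteristics of flame images, it uses features of the flame image sequence such as eccentricity, radiality, and overall movement, combined with a learning vector quantization (LVQ) neural network, for training and simulation. Experimental results show that the algorithm effectively improves the fast classification of suspicious flames in surveillance video images, is stable, and achieves high flame recognition accuracy.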

18.
This study proposes a superpixel-based active contour model (SACM) for unsupervised change detection from satellite images. The accuracy of change detection produced by the traditional active contour model suffers from the trade-off parameter. The SACM is designed to address this limitation by incorporating the spatial and statistical information of superpixels. The proposed method consists of three main steps. First, the difference image is created with the change vector analysis method from two temporal satellite images. Second, the statistical region merging method is applied to the difference image to produce a superpixel map. Finally, SACM is designed based on the superpixel map to detect changes from the difference image. The SACM incorporates spatial and statistical information and retains the accurate shapes and outlines of superpixels. Experiments were conducted on two data sets, namely Landsat-7 Enhanced Thematic Mapper Plus and SPOT 5, to validate the proposed method. Experimental results show that SACM reduces the effects of the trade-off parameter. The proposed method also increases the robustness of the traditional active contour model to input parameters and improves its effectiveness. In summary, SACM often outperforms some existing methods and provides an effective unsupervised change detection method.
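The first step of this abstract, building a difference image by change vector analysis, reduces to a per-pixel vector norm. The sketch below shows only that step, assuming each image is a list of rows whose pixels are tuples of band values; the function name `change_vector_magnitude` is hypothetical, and the superpixel and active contour stages are not covered.

```python
import math


def change_vector_magnitude(image_t1, image_t2):
    """Per-pixel Euclidean norm of the spectral difference vector."""
    return [
        [
            math.sqrt(sum((b2 - b1) ** 2 for b1, b2 in zip(pixel1, pixel2)))
            for pixel1, pixel2 in zip(row1, row2)
        ]
        for row1, row2 in zip(image_t1, image_t2)
    ]
```

Large magnitudes indicate pixels whose spectral signature moved far between the two acquisition dates and are therefore candidates for the changed class.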

19.
In this paper, we present a real-time image processing technique for the detection of steam in video images. The assumption made is that the presence of steam acts as a blurring process, which changes the local texture pattern of an image while reducing the amount of details. The problem of detecting steam is treated as a supervised pattern recognition problem. A statistical hidden Markov tree (HMT) model derived from the coefficients of the dual-tree complex wavelet transform (DT-CWT) in small 48×48 local regions of the image frames is used to characterize the steam texture pattern. The parameters of the HMT model are used as an input feature vector to a support vector machine (SVM) technique, specially tailored for this purpose. By detecting and determining the total area covered by steam in a video frame, a computerized image processing system can automatically decide if the frame can be used for further analysis. The proposed method was quantitatively evaluated by using a labelled image data set with video frames sampled from a real oil sand video stream. The classification results were 90% correct when compared to human labelled image frames. The technique is useful as a pre-processing step in automated image processing systems.

20.
With advances in digital video technology, video surveillance plays a vital role in ensuring safety and security. Surveillance systems are deployed in a wide range of applications to monitor the environment and analyse the activities in it. A single camera or a multi-camera installation generates a huge amount of data that is stored and processed for security purposes. Because of time constraints, it is very tedious for an analyst to go through the full content. Video summarization overcomes this limitation. It is intended to make video analysis comprehensible by removing duplication and extracting key frames from the video. To produce an easily interpreted outline, the available video summarization methods try to capture a summary of the main occurrences, scenes, or objects in a frame. Depending on the application, it is necessary to summarize the events in the scene and to detect the objects (static or dynamic) recorded in the video. This paper therefore reviews the methods used for video summarization and gives a comparative study of different techniques. It also presents the object detection, object classification, and object tracking algorithms available in the literature.
