Similar literature
1.
Pan Baiyu, Zhang Liming, Yin Hanxiong, Lan Jun, Cao Feilong. Multimedia Tools and Applications, 2021, 80(13): 19179-19201

3D movies/videos have become increasingly popular in the market; however, they are usually produced by professionals. This paper presents a new technique for the automatic conversion of 2D to 3D video based on RGB-D sensors, which can be easily conducted by ordinary users. To generate a 3D image, one approach is to combine the original 2D color image and its corresponding depth map together to perform depth image-based rendering (DIBR). An RGB-D sensor is one of the inexpensive ways to capture an image and its corresponding depth map. The quality of the depth map and the DIBR algorithm are crucial to this process. Our approach is twofold. First, the depth maps captured directly by RGB-D sensors are generally of poor quality because there are many regions missing depth information, especially near the edges of objects. This paper proposes a new RGB-D sensor based depth map inpainting method that divides the regions with missing depths into interior holes and border holes. Different schemes are used to inpaint the different types of holes. Second, an improved hole filling approach for DIBR is proposed to synthesize the 3D images by using the corresponding color images and the inpainted depth maps. Extensive experiments were conducted on different evaluation datasets. The results show the effectiveness of our method.
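
The DIBR step described above can be illustrated with a minimal sketch on a single scanline. This is not the paper's algorithm: the linear depth-to-shift mapping, the convention that a larger depth value means a nearer pixel, and the naive hole filling from the nearest valid neighbor are all assumptions made for illustration, and the names `render_view` and `fill_holes` are hypothetical.

```python
def render_view(color_row, depth_row, max_shift=3):
    """Warp one scanline to a virtual view.

    Assumption: a larger depth value means the pixel is nearer to the camera,
    so it is shifted further (up to max_shift pixels to the left).
    """
    width = len(color_row)
    out = [None] * width          # None marks a hole (no source pixel landed here)
    out_depth = [-1] * width      # z-buffer: the nearest pixel wins a conflict
    max_depth = max(depth_row) or 1
    for x in range(width):
        shift = round(max_shift * depth_row[x] / max_depth)
        tx = x - shift
        if 0 <= tx < width and depth_row[x] > out_depth[tx]:
            out[tx] = color_row[x]
            out_depth[tx] = depth_row[x]
    return out


def fill_holes(row):
    """Naive hole filling: copy the nearest valid pixel to the right, else to the left."""
    filled = list(row)
    for x, value in enumerate(row):
        if value is None:
            right = next((v for v in row[x + 1:] if v is not None), None)
            left = next((v for v in reversed(row[:x]) if v is not None), None)
            filled[x] = right if right is not None else left
    return filled


colors = [10, 20, 30, 40, 50, 60]
depths = [0, 0, 0, 255, 255, 0]   # two foreground pixels in the middle
warped = render_view(colors, depths, max_shift=2)
print(warped)
print(fill_holes(warped))
```

Because the foreground shifts left in this sketch, the holes open on its right side, which is why the right-hand (background) neighbor is tried first.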

2.
RGB-D cameras such as PrimeSense and Microsoft Kinect are popular sensors in simultaneous localization and mapping (SLAM) research on mobile robots because they provide both vision and depth information. Most state-of-the-art RGB-D SLAM systems employ the Iterative Closest Point (ICP) algorithm to align point features, whose spatial positions are computed from the corresponding depth data of the sensors. However, the depth measurements of features are often disturbed by noise because visual features tend to lie at the margins of real objects. To reduce the estimation error, we propose a method that extracts and selects the features with reliable depth values, i.e., planar point features. The planar features improve the accuracy and robustness of traditional ICP while keeping the computation cost reasonable for real-time applications. An efficient RGB-D SLAM system based on planar features is also demonstrated, with trajectory and map results from open datasets and a physical robot in real-world experiments.

3.
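Objective: To improve the coding efficiency of High Efficiency Video Coding (HEVC) so that it meets the demands of real-time encoding and transmission of high-resolution, high-frame-rate video. Analysis shows that intra coding unit (CU) partitioning has a decisive influence on HEVC coding efficiency, so speeding up CU partitioning can greatly improve the real-time performance of HEVC encoding. Methods: Analysis of video data shows strong temporal and spatial correlation, and intra CU partitioning results show the same correlation; the partitioning results of the previous frame and of the current frame can therefore be used for prediction to speed up intra CU partitioning. Accordingly, this paper presents a fast intra CU partitioning algorithm. It first uses the temporal correlation between adjacent frames and the spatial correlation within a frame to make an initial determination of the coding tree unit (CTU) shape of the current coding block, and then uses the average depth of the co-located CTU in the previous frame, the depths of already-coded CTUs in the current frame, and the corresponding rate-distortion cost to decide the final CTU shape of the current coding block. At a specified frame interval the algorithm inserts a refresh frame, which uses the standard CU partitioning of the HM16.7 model to avoid the error accumulation caused by fast CU partitioning. Results: The algorithm was tested on videos of different resolutions and frame rates. Compared with the HEVC reference model HM16.7, it saves about 40% of encoding time on average, with essentially unchanged video quality and a slight increase in bitrate; the bitrate increase for high-resolution, high-frame-rate videos is generally smaller than that for low-resolution, low-frame-rate videos. Conclusion: Within the HEVC framework, the algorithm exploits the temporal and spatial correlation of video data to optimize intra CU partitioning, which plays an important role in improving the real-time performance of HEVC encoding, particularly for high-resolution, high-frame-rate video.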
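
As a rough illustration of how temporal and spatial neighbors might narrow the CU depth search, the sketch below predicts a depth range for the current CTU. The abstract does not give the actual decision rule (which also uses rate-distortion cost), so the averaging rule, the function name `predict_depth_range`, and the default refresh interval are all hypothetical. It assumes the common configuration in which CU depth runs from 0 (64×64) to 3 (8×8).

```python
import math

MAX_DEPTH = 3  # assumed: CU depth 0 is 64x64, depth 3 is 8x8


def predict_depth_range(frame_idx, colocated_avg_depth, neighbor_depths,
                        refresh_interval=8):
    """Return a (low, high) CU depth range to search for the current CTU.

    Hypothetical rule: average the co-located CTU's mean depth (temporal cue)
    with the depths of already-coded neighbor CTUs (spatial cue), then search
    only the integer depths around that average.
    """
    # Refresh frames fall back to the full search to stop error accumulation.
    if frame_idx % refresh_interval == 0:
        return (0, MAX_DEPTH)
    samples = [colocated_avg_depth] + list(neighbor_depths)
    mean_depth = sum(samples) / len(samples)
    low = max(0, math.floor(mean_depth))
    high = min(MAX_DEPTH, math.ceil(mean_depth))
    return (low, high)


print(predict_depth_range(5, 1.5, [1, 2], refresh_interval=4))
print(predict_depth_range(8, 1.5, [1, 2], refresh_interval=4))
```

Restricting the search to one or two depths instead of four is where a scheme of this kind would save time; the refresh frame trades some of that saving for robustness.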

4.
Image segmentation is one of the most important topics in the field of computer vision. As a result, many image segmentation approaches have been proposed, and interactive methods based on energy minimization, such as GrabCut, have shown successful results. Automating the entire segmentation process is, however, very difficult because virtually all interactive methods require a considerable amount of user interaction. We believe that if additional information is provided to users in order to guide them effectively, the amount of interaction required can be reduced. Consequently, in this paper we propose an efficient foreground extraction algorithm, which utilizes depth information from RGB-D sensors such as Microsoft Kinect and offers users guidance in the foreground extraction process. Our approach can be applied as a pre-processing step for interactive and energy-minimization-based segmentation approaches. Our proposed method is able to segment the foreground from images and give hints that reduce interaction with users. In our method, we make use of the characteristics of depth information captured by RGB-D sensors and describe them using information from the structure tensor. Further, we show experimentally that our proposed method separates foreground from background sufficiently well for real-world images.

5.
RGB-D sensors have in recent years become easily accessible to general users. They provide both a color image and a depth image of the scene and, besides being used for object modeling, they can also offer important cues for object detection and tracking in real time. In this context, the work presented in this paper investigates the use of consumer RGB-D sensors for object detection and pose estimation from natural features. Two methods based on depth-assisted rectification are proposed, which transform features extracted from the color image to a canonical view using depth data in order to obtain a representation invariant to rotation, scale and perspective distortions. While one method is suitable for textured objects, either planar or non-planar, the other method focuses on texture-less planar objects. Qualitative and quantitative evaluations of the proposed methods are performed, showing that they can obtain better results than some existing methods for object detection and pose estimation, especially when dealing with oblique poses.

6.
Although the introduction of commercial RGB-D sensors has enabled significant progress in visual navigation methods for mobile robots, structured-light-based sensors, like Microsoft Kinect and Asus Xtion Pro Live, have some important limitations with respect to their range, field of view, and depth measurement accuracy. The recent introduction of the second-generation Kinect, which is based on the time-of-flight measurement principle, brought robotics and computer vision researchers a sensor that overcomes some of these limitations. However, as the new Kinect is, just like the older one, intended for computer games and human motion capture rather than for navigation, it is unclear how much navigation methods, such as visual odometry and SLAM, can benefit from the improved parameters. While there are many publicly available RGB-D data sets, only a few of them provide the ground truth information necessary for evaluating navigation methods, and to the best of our knowledge, none of them contains sequences registered with the new version of Kinect. Therefore, this paper describes a new RGB-D data set, which is a first attempt to systematically evaluate indoor navigation algorithms on data from two different sensors in the same environment and along the same trajectories. This data set contains synchronized RGB-D frames from both sensors and the appropriate ground truth from an external motion capture system based on distributed cameras. We describe in detail the data registration procedure and then evaluate our RGB-D visual odometry algorithm on the obtained sequences, investigating how the specific properties and limitations of both sensors influence the performance of this navigation method.

7.
Manimaran S., Sastry V. N., Gopalan N. P. The Journal of Supercomputing, 2022, 78(14): 16336-16363

Sensors play a vital role in the smartphone for sensing-enabled mobile activities and applications. Different sources, like mobile applications and websites, access the sensors and use them for various purposes. The user needs permission to access the permission-imposed sensors. Using the generic sensor application programming interface, the user can access the no-permission-imposed sensors directly without any permission. Attackers target these sensors and make smartphones vulnerable at the application, device and network levels. The attackers access the sensor’s information and use it for purposes such as identifying personal identification numbers and stealing users' personal information. This paper presents STMAD, a novel allowlist-based intrusion prevention system that mitigates sensor-based threats on smartphones by detecting an attacker's malicious access through different channels. STMAD functions as a lightweight preventive mechanism for all sensors on the smartphone and prevents attackers from accessing sensors maliciously. The experimental results show that the proposed defense mechanism is more efficient and incurs minimal overhead. An informal security analysis also showed that STMAD protects against various attacks.

8.
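Guo Lei, Wang Xiaodong, Xu Bowen, Wang Jian. Journal of Computer Applications (《计算机应用》), 2018, 38(4): 1157-1163
To address the high computational complexity of intra prediction in High Efficiency Video Coding (HEVC), a fast intra prediction algorithm based on texture features is proposed for prediction mode selection and coding unit partitioning. The texture direction strength at each depth level is used to decide whether a coding unit needs to be split and to reduce the number of candidate modes. First, for the coding unit at each depth level, the texture direction strength is computed pixel by pixel in combination with the pixel variance to determine texture complexity, and the final partition depth is predicted with a threshold strategy. Second, the vertical and horizontal direction strengths are compared and the probability distribution of candidate prediction modes is analyzed statistically, which reduces the number of prediction modes, determines the optimal candidate mode subset, and further lowers coding complexity. Compared with the HM15.0 platform, the proposed algorithm saves 51.997% of encoding time on average, while BDPSNR (Bjontegaard Delta Peak Signal-to-Noise Rate) drops by only 0.059 dB and BDBR (Bjontegaard Delta Bit Rate) rises by only 1.018%. The experimental data show that the proposed algorithm effectively reduces coding complexity while keeping signal-to-noise ratio and bitrate essentially unchanged, which benefits real-time HEVC video applications.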

9.

This paper presents a novel unified hardware architecture that computes the 4×4, 8×8, 16×16 and 32×32 two-dimensional (2-D) integer DCT for the HEVC standard using a single 1-D DCT block, with lower complexity and hardware cost. As the large HEVC transforms require a huge number of computations, especially multiplications, this paper proposes a modified algorithm that reduces the computational complexity. The goal is to ensure maximum circuit reuse during the computation while keeping the same quality of encoded videos. The hardware architecture is described in VHDL and synthesized on an Altera FPGA. Its throughput reaches a processing rate of up to 52 million pixels per second at a 90 MHz clock frequency. An IP core is presented using an embedded video system on a programmable chip (SoPC) for implementation and validation of the proposed design. Finally, the proposed architecture has significant advantages in terms of hardware cost and improved performance compared with related work in the literature. This architecture can be used in real-time ultra-high-definition (UHD) TV coding applications.
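
The idea of computing a 2-D transform with one reusable 1-D block can be sketched in software for the 4×4 case. The matrix below is the HEVC 4×4 core transform matrix; the two-pass structure with a transpose in between is a generic separable-transform sketch, not the paper's architecture, and the intermediate scaling and rounding shifts of the standard are deliberately omitted. The function names are hypothetical.

```python
# HEVC 4x4 core (integer DCT) transform matrix
HEVC_DCT4 = [
    [64,  64,  64,  64],
    [83,  36, -36, -83],
    [64, -64, -64,  64],
    [36, -83,  83, -36],
]


def dct_1d(vec):
    """One 1-D transform: multiply the matrix by a length-4 vector."""
    return [sum(c * v for c, v in zip(row, vec)) for row in HEVC_DCT4]


def one_pass(block):
    """Transform every column with the 1-D block and emit the result transposed."""
    columns = zip(*block)
    return [dct_1d(list(col)) for col in columns]


def dct_2d(block):
    """2-D transform C * X * C^T by running the same 1-D pass twice.

    Scaling and rounding shifts between the passes are omitted in this sketch.
    """
    return one_pass(one_pass(block))


flat = [[1] * 4 for _ in range(4)]
print(dct_2d(flat))  # only the DC coefficient is non-zero for a flat block
```

Each call to `one_pass` computes the transpose of C·M, so applying it twice yields C·X·Cᵀ; in hardware the transpose corresponds to a transpose buffer between two uses of the same 1-D unit.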

10.

Video standards are crucial for exchanging video content, enabling a myriad of services and supporting a wide variety of devices ranging from personal devices to clouds and IoT. One of the core requirements in video standards is rate control, which regulates bit allocation and picture quality. This paper presents an overview of rate control techniques in the HEVC video coding standard. While providing insight into the rate control mechanism specific to HEVC, it describes the basic operating principle of rate control algorithms, including their essential parameters, outputs, and performance measures. We review rate control in past coding standards and bring out the basic features of HEVC that drive the need for new rate control algorithms. In addition, we delineate a Rate-Distortion model-based taxonomy of the various algorithms, including their classification criteria. The paper also gives a second classification of the rate control algorithms based on their basic principles and mechanisms. The article also explains the scalable extension of HEVC, namely SHVC, while highlighting some of the possible SHVC rate control design challenges. Finally, we present some of the unresolved research issues in HEVC rate control and outline possible future research directions.

11.
Lin HongWei, Li Xiangqun, Gao Mingliang, Deng Keyan, Xu Yongsheng. Multimedia Tools and Applications, 2022, 81(9): 12495-12518

High efficiency video coding (HEVC) has achieved high coding efficiency as the video coding standard. For rate control in HEVC, the conventional R-λ scheme is based on mean absolute difference in allocating bits; however, the scheme does not fully utilize the perceptual importance variation to guide rate control, thus the subjective and objective quality of coded videos has room to improve. Therefore, in this paper, we propose a rate control scheme that considers perceptual importance. We first develop a perceptual importance analysis scheme to accurately abstract the spatial and temporal perceptual importance maps of video contents. The results of the analysis are then used to guide the bit allocation. Utilizing this model, a region-level bit allocation procedure is developed to maintain video quality balance. Subsequently, a largest coding unit (LCU)-level bit allocation scheme is designed to obtain the target bit of each LCU. To achieve a more accurate bitrate, an improved R-λ model based on the Broyden-Fletcher-Goldfarb-Shanno model is utilized to update the R-λ parameter. The experimental results showed that our method not only improved subjective and objective video quality with lower bitrate errors compared to the original RC in HEVC, but also outperformed state-of-the-art methods.

12.
Recently, high-efficiency video coding (HEVC) has been developed as a new video coding standard focusing on the coding of ultrahigh-definition videos, as high-resolution and high-quality videos are becoming more popular. However, one of the most important challenges in this new standard is its encoding time complexity, which makes it quite difficult to implement the HEVC encoder as a real-time system. In this paper, we address this problem in a new way. Generally, in a natural video sequence a large number of coding blocks are “skip” blocks, which need not be transmitted and can be generated on the decoder side using the reference pictures. We propose an early skip detection technique for HEVC. Our proposed method is based on identifying the motionless and homogeneous regions in a video sequence. Moreover, a novel entropy difference-based calculation is proposed, which can predict the skip coding blocks more accurately in a natural video sequence. The experimental results show that our proposed technique reduces encoding time by more than 30% relative to the conventional HEVC encoder, with negligible degradation in video quality.
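
To make the idea of an entropy-difference test concrete, here is a minimal sketch that flags a block as a skip candidate when its Shannon entropy is close to that of the co-located reference block and the two blocks barely differ. The abstract does not specify the paper's calculation, so the decision rule, the thresholds `entropy_tol` and `sad_tol`, and the function names are all hypothetical.

```python
import math
from collections import Counter


def block_entropy(pixels):
    """Shannon entropy (in bits) of the pixel-value histogram of a block."""
    counts = Counter(pixels)
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def is_skip_candidate(cur, ref, entropy_tol=0.1, sad_tol=8):
    """Hypothetical early-skip test on two flattened, equally sized blocks.

    A block is a candidate when its entropy matches the reference block's
    entropy (similar texture) and the sum of absolute differences is small
    (little or no motion).
    """
    sad = sum(abs(a - b) for a, b in zip(cur, ref))
    entropy_diff = abs(block_entropy(cur) - block_entropy(ref))
    return entropy_diff <= entropy_tol and sad <= sad_tol


print(is_skip_candidate([10, 10, 11, 10], [10, 10, 10, 11]))
print(is_skip_candidate([10, 10, 10, 10], [10, 200, 10, 200]))
```

In a real encoder such a test would run before the full mode decision, so that blocks passing it bypass the remaining rate-distortion search.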

13.
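Zerotree coding is an effective image coding algorithm, but noise destroys the zerotree structure and reduces its efficiency. For noisy images, an image partitioning coding algorithm based on multiple scanning thresholds is proposed. The algorithm uses a multi-scan-threshold structure to apply soft-threshold denoising to the noisy image and to refine the successive approximation quantization process. For significant high-frequency subbands it applies image partitioning coding: only significant coefficients are coded, while the large number of insignificant coefficients are grouped into image blocks and left uncoded, which lowers the bitrate more effectively. Experimental results show that the algorithm removes noise effectively and improves both the quality of the coded image and the coding efficiency; at the same compression ratio it outperforms the EZW algorithm in coding speed and in reconstructed image quality.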
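
The two building blocks named above, soft thresholding and the halving threshold sequence of successive approximation quantization, can be sketched as follows. This shows the textbook forms only (standard soft thresholding, and the usual EZW initial threshold, the largest power of two not exceeding the peak coefficient magnitude); how the paper combines them in its multi-scan-threshold structure is not stated in the abstract. The function names are hypothetical, and the input is assumed to contain at least one non-zero coefficient.

```python
import math


def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; magnitudes below t become 0."""
    out = []
    for c in coeffs:
        mag = abs(c) - t
        out.append(0 if mag <= 0 else (mag if c > 0 else -mag))
    return out


def scan_thresholds(coeffs, passes):
    """Thresholds for successive approximation: start at 2^floor(log2(max|c|)), halve each pass."""
    peak = max(abs(c) for c in coeffs)   # assumed to be non-zero
    t0 = 2 ** math.floor(math.log2(peak))
    return [t0 / 2 ** k for k in range(passes)]


coeffs = [5, -3, 1, -0.5]
print(soft_threshold(coeffs, 2))
print(scan_thresholds(coeffs, 3))
```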

14.
This paper presents an automatic real-time video matting system. The proposed system consists of two novel components. In order to automatically generate trimaps for live videos, we advocate a Time-of-Flight (TOF) camera-based approach to video bilayer segmentation. Our algorithm combines color and depth cues in a probabilistic fusion framework. The scene depth information returned by the TOF camera is less sensitive to environment changes, which makes our method robust to illumination variation, dynamic background and camera motion. For the second step, we perform alpha matting based on the segmentation result. Our matting algorithm uses a set of novel Poisson equations that are derived for handling multichannel color vectors, as well as the captured depth information. Real-time processing speed is achieved by optimizing the algorithm for parallel processing on graphics hardware. We demonstrate the effectiveness of our matting system with an extensive set of experimental results.

15.
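RGB-D indoor image semantic segmentation algorithm based on attention awareness and semantic awareness   Total citations: 1, self-citations: 0, citations by others: 1
In recent years, fully convolutional neural networks have effectively improved the accuracy of semantic segmentation. However, because indoor environments are complex, semantic segmentation of indoor scenes remains a challenging problem. With the advent of depth sensors, researchers began to use depth information to improve semantic segmentation. Most previous studies simply fuse RGB features and depth features with equally weighted concatenation or summation, and fail to make full use of the complementary information between RGB and depth features. This paper proposes...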

16.
High efficiency video coding (HEVC), the latest international video coding standard, greatly outperforms previous standards such as H.264/AVC in terms of coding bitrate and video quality. The coding efficiency improvement in HEVC is achieved by introducing several new techniques, such as a recursive quad-tree structure and an increased number of intra prediction modes. However, these new techniques also increase the computational load. In this paper, we propose a solution for fast I-frame coding in the HEVC standard using the homogeneity of Coding Units (CUs). The proposed solution consists of two stages. In the first stage, we evaluate CU homogeneity by computing a parameter named dominant direction strength and use it to predict the CU size. In the second stage, we select 11 modes out of 35 for the specified CU size based on the dominant direction of the CU. Experimental results indicate that the proposed method achieves on average a 45.8% reduction in coding time, with coding efficiency very similar to that of the HEVC reference software. Moreover, we designed a three-stage pipelined architecture for our method which can operate at a maximum clock rate of 235 MHz, which means it can be used for real-time coding of HEVC videos in the all-intra configuration up to level 6.2.

17.
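Combining frame rate conversion with the new video compression coding standard HEVC helps improve video compression efficiency. Directly using the motion vectors of the low-frame-rate video from the HEVC bitstream for frame rate up-conversion gives unsatisfactory results. To address this, this paper proposes a video compression algorithm that combines HEVC with frame rate up-conversion based on motion vector refinement. First, frames are dropped from the original video at the encoder to reduce the frame rate. Second, the low-frame-rate video is encoded and decoded with HEVC. Then, at the decoder, the motion vectors extracted from the HEVC bitstream are further refined using joint forward-backward motion estimation, so that the refined motion vectors are closer to the true motion of objects. Finally, motion-compensated frame rate up-conversion restores the video sequence to its original frame rate. Experimental results show that, compared with the HEVC standard, the proposed algorithm saves some bitrate at the same video quality. Moreover, compared with other algorithms at the same bitrate saving, the PSNR of the video reconstructed by the proposed algorithm is 0.5 dB higher on average.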

18.
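To address the low accuracy of moving object segmentation in compressed-domain video with complex backgrounds, a motion segmentation method based on the latest compression standard, HEVC, is proposed. First, the block partitions and the corresponding motion vectors are extracted from the HEVC compressed bitstream, and the motion vectors are classified with labels in the spatial domain (within a frame) and in the temporal domain (between frames). Then an MRF model is used to estimate motion consistency over the label field, yielding more accurate moving objects. Finally, the mask produced by the MRF segmentation is output. Experiments show that the method achieves effective and reliable segmentation, and that for videos with multiple moving objects it outperforms the other methods compared.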

19.
The new video coding standard, High Efficiency Video Coding (HEVC), achieves much higher coding efficiency than the state-of-the-art H.264. Transcoding H.264 video to HEVC video is important to enable gradual migration to HEVC. Therefore, a fast H.264 to HEVC transcoding algorithm based on region feature analysis is proposed. First, each frame is segmented into three regions in units of coding tree units (CTUs) based on the correlation between image coding complexity and the coding bits of the H.264 source stream. Then the search depth range of each CTU is adaptively decided according to the region type. After that, motion vectors are denoised by filtering and clustered in order to analyze the region features of each coding unit (CU). Based on the analysis results, the minimum search depth of the CU and the partitions of the prediction unit (PU) are optimally selected, and the motion vector predictor and search window size of motion estimation are also optimally decided to further reduce the computational complexity. Experimental results show that the proposed algorithm achieves a significant improvement in transcoding speed while maintaining high rate-distortion performance.

20.
The latest video coding standard, High Efficiency Video Coding (HEVC), can achieve much higher coding efficiency than previous video coding standards. In particular, by exploiting the hierarchical B-picture prediction structure, temporal redundancy among neighboring frames is eliminated remarkably well. In practice, videos available to consumers, such as TV series, movies, and talk shows, usually contain many repeated shots. According to our observations, when these videos are encoded by HEVC with the hierarchical B-picture structure, the temporal correlation in each shot is well exploited. However, the long-term correlation between repeated shots has not been used. We propose a long-term prediction (LTP) scheme to use the long-term temporal correlation between correlated shots in a video. The long-term reference (LTR) frames of a source video are chosen by clustering similar shots and extracting the representative frames, and a modified hierarchical B-picture coding structure based on an LTR frame is introduced to support long-term temporal prediction. An adaptive quantization method is further designed for LTR frames to improve the overall video coding efficiency. Experimental results show that up to 22.86% coding gain can be achieved using the new coding scheme.
