Similar Articles
 20 similar articles found (search time: 203 ms)
1.
Guaranteeing interoperability between devices and applications is the core role of standards organizations. Since its first JPEG standard in 1992, the Joint Photographic Experts Group (JPEG) has published several image coding standards that have been successful in a plethora of imaging markets. Recently, these markets have become subject to potentially disruptive innovations owing to the rise of new imaging modalities such as light fields, point clouds, and holography. These so-called plenoptic modalities hold the promise of facilitating a more efficient and complete representation of 3D scenes when compared to classic 2D modalities. However, due to the heterogeneity of plenoptic products that will hit the market, serious interoperability concerns have arisen. In this paper, we particularly focus on the holographic modality and outline how the JPEG committee has addressed these tremendous challenges. We discuss the main use cases and provide a preliminary list of requirements. In addition, based on the discussion of real-valued and complex data representations, we elaborate on potential coding technologies that range from approaches utilizing classical 2D coding technologies to holographic content-aware coding solutions. Finally, we address the problem of visual quality assessment of holographic data covering both visual quality metrics and subjective assessment methodologies.
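The abstract contrasts real-valued and complex data representations for holograms. One common bridge between the two, which also makes classical 2D codecs applicable, is splitting the complex field into two real-valued planes. A minimal sketch (function names are illustrative, not from any JPEG specification):

```python
import cmath

def complex_to_amp_phase(field):
    """Split a complex-valued hologram into two real planes
    (amplitude and phase) that a standard 2D image codec can carry."""
    amp = [[abs(z) for z in row] for row in field]
    phase = [[cmath.phase(z) for z in row] for row in field]
    return amp, phase

def amp_phase_to_complex(amp, phase):
    """Inverse mapping, applied at the decoder side."""
    return [[a * cmath.exp(1j * p) for a, p in zip(ra, rp)]
            for ra, rp in zip(amp, phase)]

# Tiny 2x2 example field
field = [[1 + 1j, 2 - 1j], [0.5j, -3 + 0j]]
amp, phase = complex_to_amp_phase(field)
rec = amp_phase_to_complex(amp, phase)
```

Real codecs additionally quantize and wrap the phase plane; this sketch only shows the lossless round trip.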

2.
Video coding technologies have played a major role in the explosion of large-market digital video applications and services. In this context, the very popular MPEG-x and H.26x video coding standards adopted a predictive coding paradigm, where complex encoders exploit the data redundancy and irrelevancy to 'control' much simpler decoders. This codec paradigm fits applications and services such as digital television and video storage well, where decoder complexity is critical, but does not match the requirements of emerging applications such as visual sensor networks, where encoder complexity is more critical. The Slepian–Wolf and Wyner–Ziv theorems brought the possibility to develop so-called Wyner–Ziv video codecs, following a different coding paradigm where it is the task of the decoder, and no longer of the encoder, to (fully or partly) exploit the video redundancy. Theoretically, Wyner–Ziv video coding incurs no compression performance penalty relative to the more traditional predictive coding paradigm (at least under certain conditions). In Wyner–Ziv video codecs, the so-called side information, which is a decoder estimate of the original frame to code, plays a critical role in the overall compression performance. For this reason, much research effort has been invested in the past decade to develop increasingly efficient side information creation methods. The main objective of this paper is to review and evaluate the available side information creation methods after proposing a classification taxonomy to guide this review, allowing more solid conclusions to be drawn and the next relevant research challenges to be better identified.
After classifying the side information creation methods into four classes, notably guess, try, hint and learn, the review of the most important techniques in each class and the evaluation of some of them leads to the important conclusion that which side information creation method provides the best rate-distortion (RD) performance depends on the amount of temporal correlation in each video sequence. It also became clear that the best available Wyner–Ziv video coding solutions are almost systematically based on the learn approach. The best solutions are already able to systematically outperform H.264/AVC Intra, and also the H.264/AVC zero-motion standard solutions for specific types of content.
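The simplest member of the "guess" class estimates the frame to be decoded from the surrounding key frames. A minimal sketch, assuming grayscale frames as nested lists (real codecs refine this with block-based motion-compensated interpolation):

```python
def average_interpolation_si(prev_frame, next_frame):
    """Naive 'guess'-class side information: estimate the temporal
    midpoint frame as the pixel-wise average of the two key frames."""
    return [[(p + n) / 2 for p, n in zip(rp, rn)]
            for rp, rn in zip(prev_frame, next_frame)]

prev = [[0, 10], [20, 30]]
nxt = [[2, 10], [20, 34]]
si = average_interpolation_si(prev, nxt)
```

The quality of this estimate directly bounds the Wyner–Ziv rate: the closer the side information is to the original frame, the fewer parity bits the decoder must request.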

3.
Lin Sen, Zhao Zhenyu, Ren Xiaokui, Tao Zhiyong. Infrared and Laser Engineering, 2022, 51(8): 20210702-1–20210702-12
3D point cloud data processing plays an important role in object segmentation, medical image segmentation, virtual reality, and other fields. However, existing 3D point cloud learning networks have a limited global feature extraction range and struggle to describe local high-level semantic information, leading to incomplete point cloud feature representations. To address these problems, a point cloud classification and segmentation network that compensates global features with semantic information is proposed. First, the input point cloud is aligned to a canonical space as an input-transformation preprocessing step. Then, a dilated edge convolution module extracts features from each layer of the transformed data, which are stacked to form the global feature. During local feature extraction, the extracted low-level semantic information is used to describe high-level semantic information and effective geometric features, compensating for point cloud features missed by the global feature. Finally, the global feature and the local high-level semantic information are fused to obtain the overall feature of the point cloud. Experimental results show that the proposed method outperforms current classical and state-of-the-art algorithms in both classification and segmentation performance.

4.
To address the sparsity and spatially discrete distribution of LiDAR point clouds, a new graph convolution feature extraction module is designed by combining voxel partitioning with a graph representation, and a LiDAR 3D point cloud object detection algorithm based on a voxelized graph convolutional neural network is proposed. By eliminating the computational redundancy of conventional 3D convolutional neural networks, the method improves not only the network's object detection capability but also its ability to analyze the topological information of point clouds. On the KITTI public dataset, the proposed method improves detection performance over the baseline network on the 3D and bird's-eye-view object detection tasks for cars, pedestrians, and cyclists, with a maximum gain of 13.75% on the car 3D detection task. Experiments show that the graph convolution feature extraction module effectively improves the overall detection performance and the network's ability to learn topological relationships in the data, providing a new approach for 3D point cloud object detection.
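The voxel partitioning step that precedes graph construction can be sketched as follows; the function name and grouping scheme are illustrative, not the paper's actual implementation. Each occupied voxel would then become a node in the graph:

```python
import math
from collections import defaultdict

def voxelize(points, voxel_size):
    """Group 3D points into voxels keyed by integer grid indices.
    Occupied voxels can serve as graph nodes for graph convolution."""
    grid = defaultdict(list)
    for x, y, z in points:
        key = (math.floor(x / voxel_size),
               math.floor(y / voxel_size),
               math.floor(z / voxel_size))
        grid[key].append((x, y, z))
    return grid

points = [(0.1, 0.2, 0.3), (0.4, 0.1, 0.2), (1.2, 0.0, 0.0)]
grid = voxelize(points, 0.5)
```

Because only occupied voxels are stored, the sparsity of LiDAR data is preserved instead of being padded into a dense 3D grid, which is where the redundancy savings over plain 3D convolution come from.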

5.
This paper systematically analyzes the main challenges of assured data deletion in cloud environments, pointing out that the virtualization and multi-tenancy characteristics of cloud computing, together with its leased, on-demand delivery business model, are the root causes of the many security problems that call for assured deletion services, and gives a deeper interpretation of assured deletion of cloud data. From a security perspective, recent research is analyzed and reviewed in three categories: assured deletion based on trusted execution environments, assured deletion based on key management, and assured deletion based on access control policies; the advantages of the key techniques and methods in each category, as well as their common open problems, are identified. Finally, future development trends in the field of assured cloud data deletion are discussed.
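The key-management category rests on a simple principle: if data only ever exist in the cloud in encrypted form, destroying the locally held key renders the remote copies unrecoverable. A toy sketch of that idea (one-time-pad XOR for illustration only, NOT production cryptography; a real system would use an authenticated cipher):

```python
import secrets

def encrypt(data: bytes, key: bytes) -> bytes:
    """Toy one-time-pad: XOR each data byte with a key byte.
    XOR is its own inverse, so the same call decrypts."""
    return bytes(d ^ k for d, k in zip(data, key))

# Encrypt before upload; only the ciphertext reaches cloud storage.
key = secrets.token_bytes(16)
ciphertext = encrypt(b"tenant secret 01", key)

# Normal access: decrypt with the locally held key.
recovered = encrypt(ciphertext, key)

# Assured deletion: destroy the key. Remote ciphertext copies,
# replicas, and backups all become unrecoverable without it.
key = None
```

This is why key management reduces the deletion problem from "find and wipe every replica" to "destroy one small key", at the cost of making key storage itself the new trust anchor.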

6.
In recent years, interest in multiview video systems has increased. In these systems, a typical predictive coding approach exploits the inter-view correlation at a joint encoder, requiring the various cameras to communicate with each other. However, many applications call for simple sensing systems, preventing the cameras from communicating with each other and thus precluding a predictive coding approach. Wyner–Ziv (WZ) video coding is a promising solution for those applications, since it is the WZ decoder's task to (fully or partly) exploit the video redundancy. The rate-distortion (RD) performance of WZ video coding strongly depends on the quality of the so-called side information (SI), which is a decoder estimate of the original frame to code. In multiview WZ (MV-WZ) video coding, the target is to best exploit the available correlation not only in time, as in the monoview case, but also between views. Thus, the multiview SI results from the fusion of a temporally created SI and an inter-view created SI. In this context, the main objective of this paper is to propose a classification taxonomy to organize the many inter-view SI creation and SI fusion techniques available in the literature and to review the most relevant techniques in each class. The inter-view SI creation techniques are classified into two classes, notably matching based and scene-geometry based, while the SI fusion techniques are classified into three classes, notably time, view and time-view driven. After reviewing the most relevant inter-view SI creation and SI fusion techniques guided by the proposed classification taxonomy, conclusions are drawn about the current state of the art, making it possible to better identify the next research challenges in the multiview WZ video coding paradigm.
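The SI fusion step described above combines a temporally created SI and an inter-view created SI into one estimate. A minimal per-pixel sketch (the selection criterion and the use of a decoder-available reference frame are illustrative assumptions; the surveyed techniques use more elaborate reliability measures):

```python
def fuse_side_information(temporal_si, interview_si, reference):
    """Pixel-wise SI fusion: for each pixel, pick the candidate value
    closer to a reference the decoder already has (e.g. a previously
    decoded frame). Frames are nested lists of intensities."""
    fused = []
    for rt, rv, rr in zip(temporal_si, interview_si, reference):
        fused.append([t if abs(t - r) <= abs(v - r) else v
                      for t, v, r in zip(rt, rv, rr)])
    return fused

temporal = [[10, 50]]
interview = [[12, 20]]
reference = [[11, 21]]
fused = fuse_side_information(temporal, interview, reference)
```

In the paper's taxonomy, a rule like this one would be reference driven; time-driven and view-driven fusion instead favor one candidate a priori depending on the correlation structure.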

7.
3D multiview autostereoscopic display and its key technologies
Zhang Zhaoyang, An Ping, Liu Suxing. Chinese Journal of Electron Devices, 2008, 31(1): 302-307
As the next-generation video display technology after two-dimensional (2D) DTV/HDTV-based display, three-dimensional (3D) multiview autostereoscopic display has become an international research hotspot. To build a multiview autostereoscopic display system, the relevant key technologies are described, including light-field representation models and light-field acquisition systems; efficient multiview coding and transmission methods compatible with current video standards; efficient rendering of arbitrary viewpoints at the decoder; 3D display technologies; and multiview autostereoscopic display. For these key technologies, current international development trends and open problems are analyzed, and a solution for a 3D video processing system based on interactive autostereoscopic display is proposed.

8.
With the continuous development of 3D data acquisition technology in recent years, large-scene point cloud data have become increasingly easy to obtain. Deep learning frameworks are now mature in 2D image processing, but large-scene point clouds are irregular 3D data, and processing them directly with 3D convolutional neural networks suffers from low classification accuracy and high computational complexity. To address the long computation times and low accuracy of deep-learning-based point cloud classification, this paper proposes a large-scene point cloud classification method based on binary neural networks. A feature-value computation method is designed for the irregular 3D point cloud data; the resulting point cloud feature images are processed with the IR-Net binary neural network, and the Dynamic ReLU activation function is further adopted to improve computational efficiency, yielding the final classification results. Experimental results show that the proposed method achieves a classification accuracy of 97.6% on the Oakland dataset and accuracies of 92.3% and 97.2% on the GML datasets, demonstrating that Dy-ResNet effectively improves point cloud classification accuracy, reduces computational complexity, and improves training efficiency.

9.
He Zhouyan, Jiang Zhidi, Yu Mei. Journal of Optoelectronics·Laser, 2021, 32(10): 1046-1054
As an effective representation of physical objects in 3D space, 3D colored point clouds can provide a rich, immersive visual experience, but distortions introduced during acquisition, processing, coding, and transmission degrade their visual quality. How to monitor the visual quality of colored point clouds is therefore an important and pressing problem. This paper projects the 3D colored point cloud onto 2D planes and proposes a colored point cloud visual quality assessment method based on global and local perceptual features. First, the 3D colored point cloud is converted into a color texture projection map and a geometry projection map. Then, according to the different manifestations of texture and geometry distortions in these projection maps, the corresponding distortion features are described and extracted: global color and local texture features from the color texture projection map, and global and local geometry features from the geometry projection map. Finally, all global and local perceptual features are combined into a feature vector to predict the visual quality of the colored point cloud. Experimental results on two subjective quality databases (SJTU-PCQA and CPCD2.0) show that the proposed method outperforms 13 representative existing visual quality assessment methods and agrees better with subjective perceptual quality.
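The projection step that produces the texture and geometry maps can be sketched as an orthographic projection with a z-buffer, where the nearest point per pixel wins. This is a simplified stand-in for the paper's projection scheme; names and the single-view orthographic setup are assumptions:

```python
def project_to_maps(points, colors, width, height):
    """Orthographically project a colored point cloud onto the XY
    plane: the nearest point per pixel wins (z-buffer), producing a
    color texture map and a geometry (depth) map."""
    INF = float("inf")
    depth = [[INF] * width for _ in range(height)]
    texture = [[None] * width for _ in range(height)]
    for (x, y, z), c in zip(points, colors):
        u, v = int(x), int(y)
        if 0 <= u < width and 0 <= v < height and z < depth[v][u]:
            depth[v][u] = z
            texture[v][u] = c
    return texture, depth

points = [(1.2, 1.8, 5.0), (1.4, 1.6, 3.0), (0.0, 0.0, 7.0)]
colors = ["red", "blue", "green"]
texture, depth = project_to_maps(points, colors, 4, 4)
```

Texture distortions then appear as color artifacts in the texture map, while geometry distortions appear as depth discontinuities in the geometry map, which is what allows the two feature families to be extracted separately.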

10.
Point cloud coding is one of the key technologies supporting the wide application of point clouds and is a current hotspot in research and standardization. This paper reviews the evolution of coding techniques for point cloud geometry information and attribute information, and compares the coding efficiency of several typical methods for dense and sparse point clouds. Future point cloud coding research will focus on inter-frame prediction to remove the correlation between frames of dynamic point clouds, as well as on end-to-end point cloud coding and task-driven point cloud coding.
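A core idea in geometry coding, used by octree-based codecs, is to encode point positions as per-node occupancy bytes: one bit per child octant, traversed breadth-first. A minimal sketch under stated assumptions (fixed depth, no entropy coding, points given as (x, y, z) tuples inside a cubic bounding box):

```python
def encode_octree(points, origin, size, depth):
    """Encode point positions as a breadth-first stream of occupancy
    bytes, one bit per occupied child octant of each node."""
    stream = []
    nodes = [(origin, size, points)]
    for _ in range(depth):
        next_nodes = []
        for (ox, oy, oz), s, pts in nodes:
            h = s / 2
            children = [[] for _ in range(8)]
            for p in pts:
                # Child index: one bit per axis (low/high half).
                i = ((p[0] >= ox + h)
                     | ((p[1] >= oy + h) << 1)
                     | ((p[2] >= oz + h) << 2))
                children[i].append(p)
            occ = 0
            for i, ch in enumerate(children):
                if ch:
                    occ |= 1 << i
                    off = ((i & 1) * h, ((i >> 1) & 1) * h,
                           ((i >> 2) & 1) * h)
                    next_nodes.append(((ox + off[0], oy + off[1],
                                        oz + off[2]), h, ch))
            stream.append(occ)
        nodes = next_nodes
    return stream

points = [(0.1, 0.1, 0.1), (0.9, 0.9, 0.9)]
stream = encode_octree(points, (0.0, 0.0, 0.0), 1.0, 2)
```

Because empty octants produce no further bytes, sparse clouds compress well; production codecs additionally entropy-code the occupancy bytes using neighbor context.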

11.
Raw point cloud processing using capsule networks is widely adopted in classification, reconstruction, and segmentation due to its ability to preserve spatial agreement of the input data. However, most existing capsule-based network approaches are computationally heavy and fail at representing the entire point cloud as a single capsule. We address these limitations of existing capsule network based approaches by proposing PointCaps, a novel convolutional capsule architecture with parameter sharing. Along with PointCaps, we propose a novel Euclidean distance routing algorithm and a class-independent latent representation. The latent representation captures physically interpretable geometric parameters of the point cloud, and with dynamic Euclidean routing, PointCaps represents the spatial (point-to-part) relationships of points well. PointCaps has a significantly lower number of parameters and requires significantly fewer FLOPs while achieving better reconstruction with comparable classification and segmentation accuracy for raw point clouds compared to state-of-the-art capsule networks.

12.
Existing point cloud classification research is usually conducted on datasets with complete structure and clear semantics. However, in real point cloud scenes, occlusion and truncation may destroy the completeness of objects and degrade classification performance. To solve this problem, we propose an incomplete point cloud classification network (IPC-Net) with data augmentation and similarity measurement. The proposed network learns the feature representation of incomplete point clouds and their semantic differences from complete ones for classification. Specifically, IPC-Net adopts random erasing-based data augmentation to deal with incomplete point clouds. IPC-Net also introduces an auxiliary loss function weighted by attention scores to measure the similarity between incomplete and complete point clouds. Extensive experiments verify that IPC-Net can classify incomplete point clouds and significantly improves the robustness of point cloud classification under different levels of completeness.
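The random erasing-based augmentation can be sketched as dropping all points inside a randomly centred sphere, which mimics occlusion or truncation at training time. A minimal sketch; the spherical erasing region and parameter names are assumptions, not IPC-Net's exact scheme:

```python
import random

def random_erase(points, radius, rng):
    """Random-erasing augmentation for point clouds: remove every
    point within `radius` of a randomly chosen centre point."""
    centre = rng.choice(points)
    r2 = radius * radius
    return [p for p in points
            if sum((a - b) ** 2 for a, b in zip(p, centre)) > r2]

pts = [(float(i), 0.0, 0.0) for i in range(10)]
out = random_erase(pts, 1.5, random.Random(42))
```

Training on such artificially incomplete clouds, paired with the similarity loss against the complete original, is what gives the network robustness to real occlusion.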

13.
With the rapid development of computer vision, point cloud techniques have been widely used in practical applications such as obstacle detection, roadside detection, and smart city construction. However, efficiently classifying large-scale point clouds remains an open challenge. To relieve the heavy computation and low accuracy of point cloud classification, a large-scale point cloud classification framework based on a light bottleneck transformer (light-BotNet) is proposed. First, two-dimensional (2D) and three-dimensional (3D) feature values of the large-scale point cloud are extracted to construct point cloud feature images, using prior knowledge to normalize the point cloud features. Then, the feature images are fed to the classification network, and light-BotNet is applied for point cloud classification. Combining traditional image features with a transformer network is an interesting attempt. To evaluate the proposed method, the large-scale point cloud benchmark Oakland 3D is used. In the experiments, the proposed method achieved 98.1% accuracy on the Oakland 3D dataset. Compared with other methods, it both reduces memory consumption and improves classification accuracy for large-scale point cloud classification.
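The hand-crafted 3D feature values packed into such feature images are commonly derived from the eigenvalues of a local neighborhood's covariance matrix. A sketch of that standard recipe (the specific features light-BotNet uses may differ; this shows the widely used linearity/planarity/scattering triple):

```python
import numpy as np

def geometric_features(neighborhood):
    """Eigenvalue-based shape features of a local point neighborhood,
    following the common covariance-eigenvalue recipe."""
    pts = np.asarray(neighborhood, dtype=float)
    cov = np.cov(pts.T)
    l3, l2, l1 = np.sort(np.linalg.eigvalsh(cov))  # ascending: l1 >= l2 >= l3
    s = l1 + 1e-12
    return {"linearity": (l1 - l2) / s,
            "planarity": (l2 - l3) / s,
            "scattering": l3 / s}

# A flat 3x3 grid of points (z = 0) should score high on planarity.
plane = [(float(x), float(y), 0.0) for x in range(3) for y in range(3)]
feats = geometric_features(plane)
```

Arranging such per-point values into a 2D raster is what lets an image-style network like a bottleneck transformer consume irregular point data.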

14.
SLAM (Simultaneous Localization and Mapping) is a key technology for the intelligent perception of mobile robots. However, most existing SLAM methods assume a static environment; when frequently moving obstacles are present, SLAM mapping suffers from motion distortion, preventing the robot from localizing and navigating accurately. Meanwhile, the 3D point cloud data obtained by LiDAR and other 3D scanning devices contain a large number of redundant points; this redundancy not only wastes storage space but also degrades the real-time performance of point cloud processing algorithms. To address these problems, this paper proposes a SLAM motion distortion removal method and a curvature-based point cloud classification and simplification framework. It corrects SLAM motion distortion through laser interpolation and then classifies and simplifies the corrected point cloud. The approach improves SLAM mapping accuracy while effectively removing redundant points in regions with indistinct features, greatly improving computational efficiency.
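Curvature-based simplification typically keeps all points in high-curvature (feature-rich) regions and subsamples flat regions where points are redundant. A minimal sketch, assuming a per-point curvature measure (e.g. surface variation of the local neighborhood) has already been computed; the threshold and subsampling rate are illustrative:

```python
def simplify(points, curvatures, thresh=0.01, keep_every=4):
    """Keep every high-curvature point; in flat regions, keep only
    one point out of every `keep_every`."""
    out = []
    flat_seen = 0
    for p, c in zip(points, curvatures):
        if c > thresh:
            out.append(p)            # feature region: keep all
        else:
            if flat_seen % keep_every == 0:
                out.append(p)        # flat region: subsample
            flat_seen += 1
    return out

points = list(range(10))
curvatures = [0.2 if i in (0, 5) else 0.0 for i in range(10)]
kept = simplify(points, curvatures)
```

This preserves the geometric detail that SLAM registration relies on (edges, corners) while discarding the bulk of the points on walls and floors.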

15.
Ubiquitous computing is emerging as a new paradigm in next-generation information technology. This paradigm has been embodied in numerous business models and applications through many ubiquity-related technologies. In this study, a new taxonomy for these business applications and technologies is suggested. To demonstrate its practical value, two case applications of the taxonomy are conducted, in which 24 ubiquitous computing services and 19 ubiquitous computing projects are classified to analyze the current status of ubiquitous computing.

16.
At the sender of a free-viewpoint television (FTV) system, the data consist of texture maps captured by multiple cameras and their corresponding depth information; at the receiver, virtual views are rendered from the texture sequences and the estimated depth via 3D warping. Obtaining high-quality depth information is therefore an important part of an FTV system. Because current non-interactive depth estimation is performed frame by frame, the resulting depth map sequences often lack temporal consistency. Ideally, the depth values of static regions in adjacent frames should be identical, but the estimates for such regions often differ, which severely degrades coding efficiency and rendering quality. Since a depth map represents the distance of the corresponding scene in the texture map from the camera, erroneous depth values can be detected through effective analysis of the texture. Based on judgments of depth-value reliability and the motion attributes of the current region, a depth map consistency enhancement method based on adaptive temporal weighting is proposed. Experiments show that the proposed algorithm effectively suppresses erroneous depth discontinuities in static regions and produces more stable depth map sequences, improving the temporal rendering quality of virtual views while also improving coding efficiency.

17.
Yuan Hui, Zhang Dexiang, Wang Weiwei, Li Yujun. Mobile Networks and Applications, 2020, 25(5): 1863-1872
The 3D point cloud is one of the most common and basic 3D object representation models and is widely used in virtual/augmented reality applications, e.g., immersive...

18.
Many existing visual tracking methods are based on sparse representation models; most are either generative or discriminative, which makes object tracking difficult when objects undergo large pose changes, illumination variation, or partial occlusion. To address this issue, in this paper we propose a collaborative object tracking model with local sparse representation. The key idea of our method is to develop a local sparse representation-based discriminative model (SRDM) and a local sparse representation-based generative model (SRGM). In the SRDM module, the appearance of a target is modeled by local sparse codes, which form training data for a linear classifier that discriminates the target from the background. In the SRGM module, the appearance of the target is represented by a sparse coding histogram, and a sparse coding-based similarity measure is applied to compute the distance between the histograms of a target candidate and the target template. Finally, a collaborative similarity measure is proposed that combines the two models, and the corresponding likelihood of the target candidates is fed into a particle filter framework to estimate the target state sequentially over time. Experiments on publicly available video sequence benchmarks show that the proposed tracker is robust and effective.

19.
To address the under-segmentation in 3D point cloud semantic segmentation caused by the lack of fine-grained contextual information, an algorithm based on a contextual attention CNN is proposed. First, fine-grained features in local regions of the point cloud are mined through an attention encoding mechanism. Second, contextual features between multi-scale local regions are captured by a contextual recurrent neural network encoding mechanism and used to compensate the fine-grained local features. Finally, a multi-head mechanism is used to enhance the generalization ability of the network. Experiments show that the mIoU of the proposed algorithm on the three standard datasets ShapeNet Parts, S3DIS, and vKITTI is 85.4%, 56.7%, and 38.1%, respectively, demonstrating good segmentation performance and generalization ability.

20.
Point cloud semantic segmentation is a fundamental step in 3D point cloud data processing and a key component of 3D scene understanding, reconstruction, and object recognition. To address the limited input information and low accuracy of current 3D point cloud semantic segmentation, this paper augments the 3D coordinates of each point with its RGB values and its normalized coordinates within the room it belongs to, enriching the information at the network input and further improving the segmentation accuracy of the model; finally, PointNe...
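The enriched per-point input described above can be sketched as a 9-dimensional feature vector: raw coordinates, normalized RGB, and coordinates normalized to the room's bounding box. A minimal sketch (function name and exact channel ordering are assumptions based on common PointNet-style pipelines):

```python
def point_features(points, colors, room_min, room_max):
    """Build a 9-D per-point input: (x, y, z, r, g, b, nx, ny, nz),
    where nx/ny/nz are coordinates normalized to the room's bounding
    box and r/g/b are 8-bit colors scaled to [0, 1]."""
    feats = []
    for (x, y, z), (r, g, b) in zip(points, colors):
        nx = (x - room_min[0]) / (room_max[0] - room_min[0])
        ny = (y - room_min[1]) / (room_max[1] - room_min[1])
        nz = (z - room_min[2]) / (room_max[2] - room_min[2])
        feats.append((x, y, z, r / 255.0, g / 255.0, b / 255.0,
                      nx, ny, nz))
    return feats

points = [(2.0, 4.0, 1.0)]
colors = [(255, 0, 128)]
feats = point_features(points, colors, (0.0, 0.0, 0.0), (4.0, 8.0, 2.0))
```

The normalized room coordinates give the network a cue about absolute position within the scene (e.g. near floor vs. near ceiling), which raw coordinates alone do not convey consistently across rooms of different sizes.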


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号