Similar Documents
Found 20 similar documents (search time: 31 ms)
1.
The increase in Internet bandwidth and the developments in 3D video technology have paved the way for the delivery of 3D Multi-View Video (MVV) over the Internet. However, large amounts of data and dynamic network conditions result in frequent network congestion, which may prevent video packets from being delivered on time. As a consequence, the 3D video experience may well be degraded unless content-aware precautionary mechanisms and adaptation methods are deployed. In this work, a novel adaptive MVV streaming method is introduced which addresses future-generation 3D immersive MVV experiences with multi-view displays. When the user experiences network congestion that makes adaptation necessary, the rate-distortion-optimal set of views, pre-determined by the server, is truncated from the delivered MVV streams. To maintain a high Quality of Experience (QoE) during periods of network congestion, the proposed method involves the calculation of low-overhead additional metadata that is delivered to the client. The proposed adaptive 3D MVV streaming solution is tested using the MPEG Dynamic Adaptive Streaming over HTTP (MPEG-DASH) standard. Extensive objective and subjective evaluations are presented, showing that the proposed method provides significant quality enhancement under adverse network conditions.
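The congestion-triggered view truncation described above can be sketched as a greedy rate-distortion selection: drop the view whose removal loses the least quality per saved bit until the remaining set fits the measured bandwidth. This is an illustrative sketch only; the view names, bitrates, and distortion values below are invented, not taken from the paper.

```python
# Hypothetical sketch of server-side rate-distortion view truncation
# under congestion. All numbers are illustrative assumptions.

def truncate_views(views, budget_kbps):
    """Drop the views whose removal costs the least distortion until the
    remaining set fits the available bandwidth.

    views: list of (view_id, rate_kbps, distortion_if_dropped) tuples.
    Returns the ids of the views kept, in their original order.
    """
    kept = list(views)
    # Drop the cheapest-to-lose view (lowest distortion per saved kbps) first.
    while sum(r for _, r, _ in kept) > budget_kbps and len(kept) > 1:
        kept.remove(min(kept, key=lambda v: v[2] / v[1]))
    return [vid for vid, _, _ in kept]

views = [("V0", 800, 4.0), ("V1", 600, 1.2), ("V2", 700, 3.5), ("V3", 500, 0.9)]
print(truncate_views(views, 1600))   # the two most valuable views survive
```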

2.
Multi-view video (MVV) is an interactive media application in which multiple cameras are placed around a scene to record several viewpoints, allowing users to select viewpoints and roam through the scene. The cameras simultaneously capture the same scene from different angles, each camera representing a distinct view; multiple video streams covering different spatial angles can be transmitted to the client at once and combined to synthesize the view the user requests. MVV is a new form of video with depth perception and interactive capability, and a highly promising future multimedia application. However, current wireless bandwidth-allocation mechanisms for multi-view video do not consider efficiency when many users with differing decoding capabilities coexist. In general, synthesizing a view requires at least the two reference views to its left and right (or more) to be transmitted simultaneously for the synthesized view's quality not to fall below that of a directly transmitted view, which multiplies the network data volume. At the same time, device capability affects perceived quality: phone screens and large high-definition displays place different demands on the transmitted video data rate, so the client's decoding capability must be taken into account to deliver genuinely satisfactory perceived quality. This paper considers mobile wireless bandwidth constraints, the decoding capabilities of heterogeneous client hardware, view size, and bandwidth consumption, and uses game theory to maximize the overall return on network resources. Wireless resource allocation for MVV transmission is studied in several specific scenarios. First, with known user-satisfaction parameters and no bandwidth limit (ample bandwidth), how to determine the unit price each user should pay. Second, with an uncertain number of users, how to determine how many users can be admitted and the unit price they must offer. Third, under the joint constraints of limited device decoding capability (maximum scene complexity) and limited network bandwidth, how to perform user admission control and multi-view video quality optimization simultaneously, so that network revenue and user utility are both maximized. The proposed algorithms are analyzed theoretically, proving the reasonableness of the parameter settings; in multi-view mobile network resource scheduling, the proposed algorithms make it convenient to set the required price parameters. Performance is compared in terms of view size, price, user utility, and network revenue. Simulation results show that, under identical experimental conditions, the method improves multi-view user utility by 5% and 12% respectively and increases overall network revenue by 32%. The algorithm simultaneously optimizes network revenue and overall user utility, improving network resource utilization for multi-view video with many users.
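The first pricing scenario above can be illustrated with a standard market-clearing sketch: if each user has a logarithmic utility u_i(x) = w_i·log(x) − p·x, the demand at unit price p is x_i = w_i/p, so the price that exactly exhausts total bandwidth B is p = Σw_i / B. The satisfaction parameters below are invented for illustration, not values from the paper.

```python
# Minimal sketch of unit-price setting with logarithmic user utilities.
# Each user i demands w_i / p; the clearing price fills bandwidth exactly.

def clearing_price(weights, bandwidth):
    return sum(weights) / bandwidth

def allocations(weights, bandwidth):
    p = clearing_price(weights, bandwidth)
    return [w / p for w in weights]

w = [3.0, 1.0, 2.0]           # per-user satisfaction parameters (assumed)
alloc = allocations(w, 12.0)  # 12 units of bandwidth to share
print(alloc)                  # proportional shares
```

At the clearing price each user receives a share proportional to its satisfaction parameter, and the bandwidth is used up exactly.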

3.

The Internet of Things (IoT) has received a great deal of attention in recent years, and is still being approached from a wide range of views. At the same time, video data now accounts for over half of Internet traffic. With beyond-high-definition content now available, it is worth understanding the performance effects, especially for real-time applications. High Efficiency Video Coding (HEVC) aims to reduce bandwidth utilisation while maintaining perceived video quality in comparison with its predecessor codecs. Its adoption aims to bring significant improvements to areas such as television broadcast, multimedia streaming/storage, and mobile communications. Although there have been attempts at HEVC streaming, the literature and implementations offered do not take changes in the HEVC specifications into consideration. Beyond this point, little research seems to exist on live streaming of real-time HEVC-coded content. Our contribution fills this gap by enabling compliant, real-time networked HEVC visual applications. This is done by implementing a technique for real-time HEVC encapsulation in the MPEG-2 Transport Stream (MPEG-2 TS) and HTTP Live Streaming (HLS), thereby removing the need for multi-platform clients to receive and decode HEVC streams. We take this further by evaluating the transmission of 4K UHDTV HEVC-coded content in a typical wireless environment using both computers and mobile devices, while considering well-known factors such as obstruction, interference, and other unseen factors that affect network performance and video quality. Our results suggest that 4K UHD can be streamed at 13.5 Mb/s and delivered to multiple devices without loss in perceived quality.
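The encapsulation step can be sketched at the packet level: MPEG-2 TS carries elementary streams in fixed 188-byte packets with a 4-byte header (sync byte 0x47, a 13-bit PID, a 4-bit continuity counter). The sketch below builds one such packet; the PID value and payload are illustrative, and real muxing also needs PAT/PMT tables and PES framing, which are omitted here.

```python
# A minimal sketch of MPEG-2 TS packetisation as used for HEVC/HLS delivery.
# PID and payload are illustrative assumptions; PAT/PMT and PES are omitted.

TS_PACKET_SIZE = 188

def ts_packet(pid, counter, payload, unit_start=False):
    assert 0 <= pid < 2**13 and len(payload) <= TS_PACKET_SIZE - 4
    header = bytes([
        0x47,                                      # sync byte
        (0x40 if unit_start else 0) | (pid >> 8),  # flags + PID high bits
        pid & 0xFF,                                # PID low bits
        0x10 | (counter & 0x0F),                   # payload-only + counter
    ])
    # Pad with 0xFF stuffing so the packet is exactly 188 bytes.
    return header + payload + b"\xff" * (TS_PACKET_SIZE - 4 - len(payload))

pkt = ts_packet(pid=0x100, counter=3, payload=b"HEVC NAL data", unit_start=True)
print(len(pkt), hex(pkt[0]))   # 188 0x47
```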


4.
Three-Dimensional Multi-View Video (3D MVV) comprises multiple video streams captured by different cameras around an object. Efficient compression is therefore essential to meet resource constraints while preserving decisive MVV quality at the receiver. Extensive 3D MVV encoding and transmission over mobile networks or the Internet are vulnerable to packet losses caused by severe channel faults and restricted bandwidth. In this work, we propose a new Encoder-Independent Decoder-Dependent Depth-Assisted Error Concealment (EIDD-DAEC) algorithm. It exploits the depth correlations between temporally, spatially, and inter-view adjacent Macro-Blocks (MBs) to conceal erroneous streams. At the encoder, the existing inter-view, temporal, and spatial matching is exploited to compress the 3D MVV streams efficiently and to estimate the Disparity Vectors (DVs) and Motion Vectors (MVs). At the decoder, the MVs and DVs gathered from the received coded streams are used to calculate additional depth-assisted MVs and DVs, which are then combined with the collected candidate texture-color MV and DV groups to conceal the lost MBs of inter- and intra-coded frames. Finally, the optimum DV and MV concealment candidates are selected by the Directional Interpolation Error Concealment Algorithm (DIECA) and the Decoder Motion Vector Estimation Algorithm (DMVEA), respectively. Experimental results on several standard 3D MVV sequences verify the efficacy of the proposed EIDD-DAEC algorithm, which achieves improved objective and subjective results without generating or transporting depth maps at the encoder. The proposed work achieves high 3D MVV quality, with an average Peak Signal-to-Noise Ratio (PSNR) gain of up to 0.95–2.70 dB over state-of-the-art error concealment algorithms that do not employ depth-assisted correlations, across different Quantization Parameters (QPs) and Packet Loss Rates (PLRs) of up to 40%.
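The candidate-selection step can be illustrated with a simple external boundary-matching criterion: each candidate motion vector fetches a block from the reference frame, and the one whose edges best match the correctly decoded pixels bordering the lost block wins. This is a simplified stand-in for DIECA/DMVEA, assuming synthetic frames and candidate vectors.

```python
import numpy as np

def conceal_block(ref, cur, x, y, bs, candidates):
    """Pick the candidate MV whose reference block's top/left edges best
    match the decoded pixels bordering the lost block, then copy it in."""
    best_mv, best_cost = None, float("inf")
    for dx, dy in candidates:
        rx, ry = x + dx, y + dy
        if not (0 <= rx and rx + bs <= ref.shape[1]
                and 0 <= ry and ry + bs <= ref.shape[0]):
            continue
        blk = ref[ry:ry + bs, rx:rx + bs]
        # External boundary matching: compare block edges against the
        # correctly decoded pixels just outside the lost block.
        cost = (np.abs(blk[0, :] - cur[y - 1, x:x + bs]).sum()
                + np.abs(blk[:, 0] - cur[y:y + bs, x - 1]).sum())
        if cost < best_cost:
            best_cost, best_mv = cost, (dx, dy)
    cur[y:y + bs, x:x + bs] = ref[y + best_mv[1]:y + best_mv[1] + bs,
                                  x + best_mv[0]:x + best_mv[0] + bs]
    return best_mv

ref = np.add.outer(np.arange(16.0), np.arange(16.0))  # smooth ramp "frame"
cur = ref.copy()
cur[4:8, 4:8] = 0.0                                   # simulate a lost MB
mv = conceal_block(ref, cur, x=4, y=4, bs=4,
                   candidates=[(0, 0), (3, 0), (0, 3), (-2, -2)])
print(mv)
```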

5.

Internet Protocol Television (IPTV) is an emerging network application and one of the more reliable platforms for high-speed Internet video services, offering many live and on-demand services. However, existing implementations still suffer from quality degradation and delay while trying to limit frames and consume bandwidth efficiently over the network channel; efficient bandwidth utilization remains a major issue on IPTV platforms, and integrating video processing into the network platform is a challenging task for video-on-demand (VoD) applications. This paper addresses these drawbacks with a Frame Frequency Error Optimization (FFEO) based HEVC approach, called U-HEVC. The FFEO method improves video quality by frame interpolation. U-HEVC delivers roughly 50% better compression, in line with the existing HEVC standard, and also provides better visual quality at half the bit rate. Analysis shows that the proposed U-HEVC attains better results than existing HEVC compression when larger numbers of packets are affected at different bit-rate levels: with HEVC, the frame loss is 0.38% at 1 Mbps, 0.46% at 2 Mbps, 0.63% at 4 Mbps, and 0.94% at 8 Mbps, somewhat higher than with U-HEVC. The paper presents studies of an IPTV environment based on U-HEVC using the frame-frequency error-optimization technique.
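The frame-interpolation idea can be sketched in its simplest form: a missing frame is synthesized by linearly blending its decoded neighbours. This is a bare-bones stand-in for the paper's FFEO method, which is not specified in detail here; a real interpolator would be motion-compensated.

```python
import numpy as np

def interpolate_frame(prev_frame, next_frame, t=0.5):
    """Linearly blend two decoded frames to synthesize the missing one.
    (Simplified illustration; not the paper's actual FFEO algorithm.)"""
    return ((1.0 - t) * prev_frame + t * next_frame).astype(prev_frame.dtype)

a = np.full((2, 2), 10.0)     # frame at time t-1
b = np.full((2, 2), 30.0)     # frame at time t+1
print(interpolate_frame(a, b))  # every pixel 20.0
```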


6.
Video compression technology is an important component of intelligent user interfaces for interactive multimedia systems that employ technologies and services such as image processing, pattern recognition, computer vision, and cloud computing. High-Efficiency Video Coding (HEVC) was established to meet the demand for very-high-quality multimedia services such as ultra-high-definition video. The HEVC standard defines three unit types: the coding unit (CU), the prediction unit (PU), and the transform unit; the many choices among them introduce high complexity in exchange for coding performance. We propose a fast algorithm applicable to both the CU and PU stages. To reduce computational complexity, we propose a CU splitting algorithm that terminates the CU decision early, based on the rate-distortion costs of the CU at the parent and current levels. For PUs, we develop a fast PU decision based on spatio-temporal and depth correlation at the PU level. Experimental results show that our algorithm provides a significant encoding-time reduction with only a small loss in video quality, compared with the original HEVC Test Model (HM) version 10.0 software and a previous algorithm.
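The early-termination idea can be sketched as a simple rule: if the current CU already codes cheaply relative to its parent, skip evaluating the four sub-CUs. The cost values and threshold factor below are illustrative assumptions, not the paper's actual criterion.

```python
# Sketch of early CU-split termination based on parent/current RD costs.
# Costs and the threshold factor alpha are invented for illustration.

def decide_split(rd_cost_cu, rd_cost_parent, depth, max_depth, alpha=0.25):
    if depth >= max_depth:
        return False
    # Early termination: the CU already codes well relative to its parent.
    if rd_cost_cu <= alpha * rd_cost_parent:
        return False
    return True   # otherwise fall through to the full quad-split RD check

print(decide_split(rd_cost_cu=50.0, rd_cost_parent=400.0, depth=1, max_depth=3))   # False
print(decide_split(rd_cost_cu=300.0, rd_cost_parent=400.0, depth=1, max_depth=3))  # True
```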

7.
The major surveillance camera manufacturers have begun incorporating wireless networking functionality into their products to enable wireless access. However, the video feeds from such cameras can only be accessed within the transmission range of the cameras. These cameras must be connected to backbone infrastructure in order to access them from more than one hop away. This network infrastructure is both time-consuming and expensive to install, making it impractical in many rapid deployment situations (e.g., to provide temporary surveillance at a crime scene). To overcome this problem, we propose the MeshVision system that incorporates wireless mesh network functionality directly into the cameras. Video streams can be pulled from any camera within a network of MeshVision cameras, irrespective of how many hops away that camera is. To manage the trade-off between video stream quality and the number of video streams that could be concurrently accessed over the network, MeshVision uses a bandwidth adaptation mechanism. This mechanism monitors the wireless network looking for drops in link quality or signs of congestion and adjusts the quality of existing video streams in order to reduce that congestion. A significant benefit of the approach is that it is of low cost, requiring only a software upgrade of the cameras.
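The adaptation mechanism can be sketched as a quality-ladder controller: step a stream's quality down when link quality drops or loss rises, and back up when conditions recover. The thresholds and bitrate ladder below are invented for illustration, not MeshVision's actual parameters.

```python
# Toy bandwidth adaptation in the spirit of the mechanism described above.
# Quality ladder and thresholds are illustrative assumptions.

QUALITY_LADDER = [256, 512, 1024, 2048]   # stream bitrates in kb/s

def adapt(level, link_quality, loss_rate):
    if link_quality < 0.4 or loss_rate > 0.05:    # signs of congestion
        return max(level - 1, 0)
    if link_quality > 0.8 and loss_rate < 0.01:   # headroom to upgrade
        return min(level + 1, len(QUALITY_LADDER) - 1)
    return level

level = 3
for lq, loss in [(0.3, 0.00), (0.5, 0.08), (0.9, 0.00)]:
    level = adapt(level, lq, loss)
    print(QUALITY_LADDER[level])   # steps down, down, then back up
```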

8.

In this paper, we propose a new video conferencing system that presents correct gaze directions of a remote user by switching among images obtained from multiple cameras embedded in a screen according to a local user’s position. Our proposed method reproduces a situation like that in which the remote user is in the same space as the local user. The position of the remote user to be displayed on the screen is determined so that the positional relationship between the users is reproduced. The system selects one of the embedded cameras whose viewing direction towards the remote user is the closest to the local user’s viewing direction to the remote user’s image on the screen. As a result of quantitative evaluation, we confirmed that, in comparison with the case using a single camera, the accuracy of gaze estimation was improved by switching among the cameras according to the position of the local user.
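The camera-selection rule can be sketched geometrically: pick the embedded camera whose direction toward the remote user makes the smallest angle with the local user's viewing direction. The sketch below works in 2D with synthetic positions; the actual system operates on tracked 3D positions.

```python
import math

def pick_camera(cameras, remote_pos, local_view_dir):
    """Choose the embedded camera whose direction toward the remote user
    is closest in angle to the local user's viewing direction (2D toy)."""
    def angle_between(a, b):
        dot = a[0] * b[0] + a[1] * b[1]
        na, nb = math.hypot(*a), math.hypot(*b)
        return math.acos(max(-1.0, min(1.0, dot / (na * nb))))

    best = min(cameras, key=lambda c: angle_between(
        (remote_pos[0] - c["pos"][0], remote_pos[1] - c["pos"][1]),
        local_view_dir))
    return best["id"]

cams = [{"id": "left",  "pos": (-1.0, 0.0)},
        {"id": "mid",   "pos": (0.0, 0.0)},
        {"id": "right", "pos": (1.0, 0.0)}]
print(pick_camera(cams, remote_pos=(1.0, 2.0), local_view_dir=(0.0, 1.0)))
```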


9.
It is widely accepted that the growth of the Internet and the improvement of its network conditions have helped real-time applications to flourish. The demand for Ultra-High Definition video is constantly increasing. Apart from video and sound, a new kind of real-time data is making its appearance: haptic data. The efficient synchronization of video, audio, and haptic data is a rather challenging effort. The new High-Efficiency Video Coding (HEVC) is quite promising for real-time ultra-high-definition video transfer over the Internet. This paper presents related work on High-Efficiency Video Coding and points out the challenges and the synchronization techniques that have been proposed for synchronizing video and haptic data. Comparative tests between H.264 and HEVC are undertaken, and measurements of Internet network conditions are carried out. The equations for the transfer delay of all the inter-prediction configurations of HEVC are defined. Finally, the paper proposes a new efficient algorithm for transferring a real-time HEVC stream with haptic data over the Internet.
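One common way to align video and haptic streams at the receiver (a generic technique, not necessarily the paper's algorithm) is timestamp-ordered playout: units from both streams are queued by presentation timestamp and released in order. The stream contents and timestamps below are invented.

```python
import heapq

# Minimal sketch of timestamp-based inter-media synchronization.
# Units from both streams are interleaved by presentation timestamp.

def playout_order(units):
    """units: list of (timestamp_ms, stream, payload).
    Returns the (timestamp, stream) presentation order."""
    heap = list(units)
    heapq.heapify(heap)
    return [(ts, stream)
            for ts, stream, _ in [heapq.heappop(heap) for _ in range(len(heap))]]

units = [(33, "video", "frame1"), (0, "video", "frame0"),
         (10, "haptic", "h0"), (20, "haptic", "h1"), (40, "haptic", "h2")]
print(playout_order(units))
```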

10.
We propose an approach to create plausible free-viewpoint relighting video using a multi-view camera array under general illumination. Given a multi-view video dataset recorded by a set of industrial cameras under general, uncontrolled, and unknown illumination, we first reconstruct a 3D model of the captured target using an existing multi-view stereo approach. Using the coarse geometry reconstruction, we estimate the spatially varying surface reflectance in the spherical harmonics domain, taking spatial and temporal coherence into account. With the estimated geometry and reflectance, the 3D target is relit to the novel illumination using the environment map of the target environment. The relighting result is enhanced using a flow- and quotient-based transfer strategy to achieve detailed and plausible performance relighting. Finally, the free-viewpoint video is generated using a view-dependent rendering strategy. Experimental results on various datasets show that our approach enables plausible free-view relighting and opens up a path towards relightable free-viewpoint video using less complex acquisition setups.
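The spherical-harmonics relighting step rests on a standard identity: under a Lambertian assumption, the outgoing color at a surface point is the albedo times the SH-projected irradiance, i.e. a dot product between the lighting coefficients and the SH basis evaluated at the surface normal. The sketch below uses only the first four (band 0 and 1) real SH basis functions; the albedo and lighting coefficients are invented.

```python
import numpy as np

def sh_basis(n):
    """First four real spherical-harmonic basis values at unit normal n."""
    x, y, z = n
    return np.array([0.282095, 0.488603 * y, 0.488603 * z, 0.488603 * x])

def relight(albedo, normal, light_coeffs):
    """Lambertian relighting sketch: albedo times SH-projected lighting."""
    return albedo * float(sh_basis(normal) @ np.asarray(light_coeffs))

# Relighting the same surface point under two assumed environments:
n = (0.0, 0.0, 1.0)
print(relight(0.8, n, [1.0, 0.0, 0.0, 0.0]))   # ambient-only environment
print(relight(0.8, n, [1.0, 0.0, 0.5, 0.0]))   # light biased toward +z
```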

11.
Light field imaging can capture both spatial and angular information of a 3D scene and is considered a prospective acquisition and display solution for more natural, fatigue-free 3D visualization. However, one central problem in handling light field data is its sheer volume, so efficient coding schemes for this particular type of image are needed. In this paper, we propose a hybrid light field image codec architecture, based on the High Efficiency Video Coding Screen Content Coding extensions (HEVC SCC) standard, that combines linear weighted prediction and intra block copy to compress light field image data effectively. To improve prediction accuracy, a linear weighted prediction method is integrated into the HEVC SCC standard, where a locally corrected weighting method is used to derive the weight coefficient vector. For non-homogeneous texture areas, however, the best match found by linear weighted prediction does not necessarily yield a good prediction of the coding block. To alleviate this shortcoming, the proposed hybrid codec architecture explores the idea of using the intra block copy scheme to find the best prediction of the coding block based on rate-distortion optimization. Because the "try all then select best" intra-mode decision is time-consuming, we further propose a fast mode decision scheme for the hybrid codec architecture to reduce computational complexity. Experimental results demonstrate the advantage of the proposed hybrid codec architecture in terms of different quality metrics as well as the visual quality of views rendered from decompressed light field content, compared with the HEVC intra-prediction method and several other prediction methods in this field.
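The locally derived weight coefficients can be sketched as a least-squares fit over the already-decoded template pixels: find the weight vector w minimizing ||A·w − y||², where each column of A is one reference candidate's template and y is the current block's template. This is a simplified sketch of the idea with synthetic data, not the codec's exact derivation.

```python
import numpy as np

def lwp_weights(template_refs, template_cur):
    """Least-squares weight derivation over decoded template pixels."""
    A = np.stack(template_refs, axis=1).astype(float)  # one column per ref
    y = np.asarray(template_cur, float)
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def lwp_predict(block_refs, w):
    """Apply the derived weights to the co-located reference blocks."""
    return np.tensordot(w, np.stack(block_refs), axes=1)

refs = [np.array([1.0, 2.0, 3.0]), np.array([2.0, 0.0, 1.0])]
cur = 0.5 * refs[0] + 0.25 * refs[1]   # synthetic "current" template
w = lwp_weights(refs, cur)
print(np.round(w, 3))                  # recovers the mixing weights
```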

12.
Multi-view stereoscopic video synthesis involves large data volumes, demanding image-processing speeds, and limited stereo viewing angles; these problems have never been well solved and have become a bottleneck for the industrialization of multi-view stereoscopic video. To address this, a stereoscopic video processing system based on a stereo image fusion algorithm and an eye-tracking algorithm is proposed. First, each frame of the stereo video is read in sequence; the stereo image fusion algorithm then synthesizes each frame, and the fused images are displayed and played back in their original order. An eye-tracking algorithm is added so that the image for the corresponding viewing zone is presented in real time according to the position of the viewer's eyes. Combining image fusion with eye tracking effectively enlarges the stereo viewing angle. Experimental results show that the method presents multi-view video on a stereoscopic display as autostereoscopic 3D, allowing viewers to move freely in front of the screen without degrading the stereo effect, while playback remains smooth, giving audiences a fairly realistic stereoscopic experience.
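The eye-tracking-to-viewing-zone mapping can be sketched as a simple quantization: the tracked horizontal eye position in front of an autostereoscopic display is mapped to one of N repeating viewing zones, and the fused image for that zone is shown. The zone width and positions below are illustrative assumptions.

```python
# Toy mapping from tracked eye position to autostereoscopic viewing zone.
# Zone width (~interocular distance) and zone count are assumptions.

def view_zone(eye_x_mm, zone_width_mm=65.0, num_zones=4):
    return int(eye_x_mm // zone_width_mm) % num_zones

for x in (10.0, 70.0, 140.0, 270.0):
    print(view_zone(x))   # zones repeat as the viewer moves sideways
```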

13.
In many scenarios a dynamic scene is filmed by multiple video cameras located at different viewing positions. Visualizing such multi-view data on a single display raises an immediate question: which cameras capture better views of the scene? Typically (e.g., in TV broadcasts), a human producer manually selects the best view. In this paper we wish to automate this process by evaluating the quality of the view captured by each camera. We regard human actions as three-dimensional shapes induced by their silhouettes in the space-time volume. The quality of a view is then evaluated based on features of the space-time shape, which correspond to limb visibility. Resting on these features, two view-quality approaches are proposed: one is generic, while the other can be trained to fit any preferred action recognition method. Our experiments show that the proposed view selection provides intuitive results that match common conventions. We further show that it improves action recognition results.

14.
Cloud Mobile 3D Display Gaming has recently been proposed: 3D video rendering and encoding are performed on cloud servers, and the resulting 3D video is streamed wirelessly to mobile devices with 3D displays. This approach relieves mobile devices of the high computation, power, and storage requirements of 3D display gaming, while enabling game developers to focus on a single rich version of the game that can be experienced from any mobile device and platform. However, it is challenging to stream 3D video over dynamically fluctuating and often constrained mobile networks. In this paper, we propose a novel technique called Asymmetric and Selective Object Rendering (ASOR), which proves more powerful than previous solutions for cloud-based mobile 3D display gaming. Specifically, this technique enables the rendering engine to decide intelligently whether or not to render an individual object and how good the corresponding texture detail will be if rendered, and the settings can be asymmetric for the two views. Thus, unimportant objects can trade quality for reduced bitrate while important objects remain high quality, so that the overall user experience is optimized under given bandwidth constraints. To quantitatively measure the user experience and bitrate under different rendering settings, we develop a user experience model and a bitrate model. We further propose an optimization algorithm that uses these two models to automatically decide the optimal rendering settings for the left and right views, ensuring the best user experience given the network conditions. Experiments conducted using real 4G-LTE network profiles on a commercial cloud service with different genres of games demonstrate a significant improvement in user experience when the proposed optimization algorithm is applied.
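The optimization can be sketched in miniature as a brute-force search: pick a texture-quality level per object (level 0 meaning "not rendered") that maximizes importance-weighted quality subject to a bitrate budget. The objects, per-level bitrates, quality values, and importance weights below are invented for illustration, not the paper's models.

```python
from itertools import product

# Tiny brute-force version of the per-object rendering-settings search.
# Bitrate/quality tables and importances are illustrative assumptions.

LEVEL_BITRATE = [0.0, 1.0, 2.5, 5.0]   # Mb/s per object at levels 0..3
LEVEL_QUALITY = [0.0, 0.5, 0.8, 1.0]

def best_settings(importances, budget):
    best, best_score = None, -1.0
    for levels in product(range(4), repeat=len(importances)):
        rate = sum(LEVEL_BITRATE[l] for l in levels)
        score = sum(w * LEVEL_QUALITY[l] for w, l in zip(importances, levels))
        if rate <= budget and score > best_score:
            best, best_score = levels, score
    return best, best_score

# An important object keeps top quality; the unimportant one is degraded.
settings, score = best_settings(importances=[3.0, 1.0], budget=6.0)
print(settings, score)
```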

15.
Event cameras are bio-inspired vision sensors that output pixel-level brightness changes instead of standard intensity frames. They offer significant advantages over standard cameras, namely a very high dynamic range, no motion blur, and a latency in the order of microseconds. However, because the output is composed of a sequence of asynchronous events rather than actual intensity images, traditional vision algorithms cannot be applied, so that a paradigm shift is needed. We introduce the problem of event-based multi-view stereo (EMVS) for event cameras and propose a solution to it. Unlike traditional MVS methods, which address the problem of estimating dense 3D structure from a set of known viewpoints, EMVS estimates semi-dense 3D structure from an event camera with known trajectory. Our EMVS solution elegantly exploits two inherent properties of an event camera: (1) its ability to respond to scene edges—which naturally provide semi-dense geometric information without any pre-processing operation—and (2) the fact that it provides continuous measurements as the sensor moves. Despite its simplicity (it can be implemented in a few lines of code), our algorithm is able to produce accurate, semi-dense depth maps, without requiring any explicit data association or intensity estimation. We successfully validate our method on both synthetic and real data. Our method is computationally very efficient and runs in real-time on a CPU.
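The core of the ray-voting idea can be shown in a stripped-down 2D setting: each event back-projects to a ray of candidate scene points, which vote into a disparity-space grid; true structure emerges as a local maximum where rays from different camera positions intersect. The camera path, focal length, and scene point below are synthetic; this is an illustration of the principle, not the paper's implementation.

```python
import numpy as np

# Stripped-down EMVS-style ray voting: a camera slides along x with focal
# length f; one hidden scene point (X0, Z0) generates an event per position.

f = 100.0
depths = np.arange(1.0, 11.0)          # candidate depths 1..10
x_step = 0.25
n_x = 21                               # world-x cells covering [0, 5]
grid = np.zeros((len(depths), n_x))

X0, Z0 = 2.0, 4.0                      # hidden scene point (ground truth)
for cam_x in [0.0, 1.0, 3.0]:          # camera positions along its path
    u = f * (X0 - cam_x) / Z0          # pixel where this event fires
    for zi, Z in enumerate(depths):    # back-project the event's ray
        X = cam_x + u * Z / f
        xi = int(round(X / x_step))
        if 0 <= xi < n_x:
            grid[zi, xi] += 1

zi, xi = np.unravel_index(np.argmax(grid), grid.shape)
print(depths[zi], xi * x_step)         # the vote maximum recovers (4.0, 2.0)
```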

16.
Compared with the previously dominant H.264 video coding standard, HEVC can reduce the bit rate by nearly 50% at the same reconstructed video quality, saving transmission bandwidth. Even so, given certain network bandwidth constraints, further improving HEVC's compression efficiency remains an active research topic. This paper proposes a new video compression algorithm that combines standard HEVC coding with frame-rate conversion. At the encoder, an adaptive frame-skipping method lowers the frame rate of the original video, reducing the amount of data to transmit, and the low-frame-rate video is encoded and decoded. At the decoder, the motion information of the dropped frames is estimated by combining motion information extracted from the HEVC bitstream with the block-partitioning mode information of specific HEVC-coded frames. Finally, the video is reconstructed using the proposed improved block-overlapped bidirectional motion-compensated frame interpolation. Experimental results confirm the effectiveness of the proposed algorithm.
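The bidirectional motion-compensated interpolation step can be sketched for a single global motion vector: the skipped frame is reconstructed block by block as the average of the previous frame fetched at −mv/2 and the next frame fetched at +mv/2. A single global vector and a translating synthetic ramp are assumed here; the paper's method uses per-block vectors and overlapped windows.

```python
import numpy as np

def bimc_interpolate(prev, nxt, mv, bs=4):
    """Bidirectional motion-compensated interpolation with one global MV:
    average the block at -mv/2 in the previous frame and at +mv/2 in the
    next frame (toy version; real use needs per-block MVs and overlap)."""
    h, w = prev.shape
    out = np.zeros_like(prev)
    dx, dy = mv[0] // 2, mv[1] // 2
    for y in range(0, h, bs):
        for x in range(0, w, bs):
            py, px = np.clip(y - dy, 0, h - bs), np.clip(x - dx, 0, w - bs)
            ny, nx_ = np.clip(y + dy, 0, h - bs), np.clip(x + dx, 0, w - bs)
            out[y:y+bs, x:x+bs] = 0.5 * (prev[py:py+bs, px:px+bs]
                                         + nxt[ny:ny+bs, nx_:nx_+bs])
    return out

prev = np.tile(np.arange(12.0), (4, 1))       # ramp translating 2 px/frame
nxt = prev - 4.0
mid = bimc_interpolate(prev, nxt, mv=(4, 0))  # reconstruct the skipped frame
print(mid[0, 4:8])                            # interior matches prev - 2
```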

17.
Research on a multi-view video coding scheme based on the H.264 standard   (Cited 2 times: 0 self-citations, 2 by others)
To develop a new, efficient multi-view video coding method that improves coding efficiency and effectively supports random switching between views, this work exploits new H.264 tools — multiple reference frames, SP/SI frames, and hierarchical B-frame coding — within a spatio-temporal prediction structure, and proposes a hierarchical-B-based multi-view video coding scheme that facilitates random inter-view switching. Experimental results show that the scheme improves coding efficiency while, when many views are present, effectively improving the ability to switch randomly between views.
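The hierarchical-B prediction structure mentioned above implies a characteristic coding order: within a GOP, the key picture is coded first, then the midpoint B frame, then the quarter points, and so on, so that each B frame can reference already-coded frames on both sides. A GOP of size 8 is assumed below for illustration.

```python
# Sketch of the coding order induced by a hierarchical-B GOP structure.

def hierarchical_b_order(lo, hi):
    """Coding order of frames strictly between lo and hi (already coded):
    code the midpoint first, then recurse into both halves."""
    if hi - lo < 2:
        return []
    mid = (lo + hi) // 2
    return [mid] + hierarchical_b_order(lo, mid) + hierarchical_b_order(mid, hi)

gop = [0, 8] + hierarchical_b_order(0, 8)   # key pictures first, then Bs
print(gop)   # [0, 8, 4, 2, 1, 3, 6, 5, 7]
```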

18.
Videos and other multimedia content have become increasingly popular among Internet users. With improvements to the Internet's underlying infrastructure, users can now enjoy video content at much higher quality than a decade ago. Content delivery networks (CDNs) are a content-hosting solution widely used across the Internet: content providers offload the task of content hosting to CDN providers and redirect users' requests to the CDN. Video content, especially high-quality real-time video, now occupies a major share of Internet traffic, and such workloads are challenging even for a large-scale CDN. Load balancing algorithms are critical to addressing this issue. However, traditional load balancing algorithms such as round-robin and randomization are unaware of user-side requirements, so it is not uncommon for requests for high-quality real-time video to go unsatisfied. In this paper, we try to fulfill such requests by integrating software-defined networking technology with CDN infrastructure. We also propose revised load balancing algorithms and develop simulations to verify our approach. The results show that the proposed algorithms achieve much higher user satisfaction in bandwidth-idle environments.
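The contrast with round-robin can be sketched with a requirement-aware dispatcher: instead of rotating blindly, route each request to the least-loaded server that still has enough idle bandwidth for the requested video quality. The server capacities and request rates below are invented; this illustrates the general idea, not the paper's exact algorithms.

```python
# Toy requirement-aware load balancer: prefer the least-loaded server
# that can satisfy the request's bandwidth requirement.

def dispatch(servers, required_mbps):
    """servers: dict name -> (capacity, used). Returns chosen name or None,
    and books the request's bandwidth on the chosen server."""
    candidates = [(used / cap, name) for name, (cap, used) in servers.items()
                  if cap - used >= required_mbps]
    if not candidates:
        return None                     # no server can satisfy the request
    _, name = min(candidates)
    cap, used = servers[name]
    servers[name] = (cap, used + required_mbps)
    return name

pool = {"edge-a": (100.0, 90.0), "edge-b": (100.0, 40.0)}
print(dispatch(pool, 25.0))   # edge-b (edge-a lacks headroom)
print(dispatch(pool, 25.0))   # edge-b again (still has room)
print(dispatch(pool, 25.0))   # None: no server has 25 Mb/s headroom left
```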

19.
Inter-camera registration in multi-view systems with overlapping views has a particularly long and sophisticated research history within the computer vision community. Moreover, when applied to Distributed Video Coding in systems with at least one moving camera, it represents a real challenge, because the decoder must generate the side information without any a priori knowledge of each camera's instantaneous position. This paper proposes a solution to this problem based on successive multi-view registration and motion-compensated extrapolation, re-correlating two views on the fly at the decoder. This novel technique for side information generation is codec-independent, robust, and flexible with regard to any free motion of the cameras. Furthermore, it requires no additional information from the encoders, no communication between cameras, and no offline training stage. We also propose a metric for objective assessment of the multi-view correlation performance.

20.
Automated virtual camera control has been widely used in animation and interactive virtual environments. We have developed a multiple sparse camera based free view video system prototype that allows users to control the position and orientation of a virtual camera, enabling the observation of a real scene in three dimensions (3D) from any desired viewpoint. Automatic camera control can be activated to follow selected objects by the user. Our method combines a simple geometric model of the scene composed of planes (virtual environment), augmented with visual information from the cameras and pre-computed tracking information of moving targets to generate novel perspective corrected 3D views of the virtual camera and moving objects. To achieve real-time rendering performance, view-dependent textured mapped billboards are used to render the moving objects at their correct locations and foreground masks are used to remove the moving objects from the projected video streams. The current prototype runs on a PC with a common graphics card and can generate virtual 2D views from three cameras of resolution 768×576 with several moving objects at about 11 fps.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号