Similar Documents
20 similar documents retrieved (search time: 15 ms)
1.
In this paper, we propose an estimation method that estimates the throughput of upcoming video segments based on variations in the network throughput observed during the download of previous video segments. We then propose a rate-adaptive algorithm for Hypertext Transfer Protocol (HTTP) streaming. The proposed algorithm selects the video quality based on the estimated throughput and playback buffer occupancy. It selects high-quality video segments while minimizing video quality changes and the risk of playback interruption, improving the user's experience. We evaluate the algorithm in single- and multi-user environments and demonstrate that it performs remarkably well under varying network conditions. Furthermore, we determine that it efficiently utilizes network resources to achieve a high video rate, and that competing HTTP clients achieve equitable video rates. We also confirm that variations in the playback buffer size and segment duration do not affect the performance of the proposed algorithm.
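The abstract does not spell out the decision rule, but the combination of a throughput estimate with buffer occupancy can be sketched as follows; the EWMA smoothing, buffer thresholds, and safety factor below are illustrative assumptions, not the authors' actual design.

```python
# Illustrative sketch of throughput-estimated, buffer-aware bitrate selection.
# The EWMA smoothing, buffer thresholds, and safety factor are assumptions;
# the paper's actual estimator and decision rules may differ.

def estimate_throughput(samples, alpha=0.8):
    """EWMA over per-segment throughput measurements (bits/s)."""
    est = samples[0]
    for s in samples[1:]:
        est = alpha * est + (1 - alpha) * s
    return est

def select_bitrate(ladder, throughput_est, buffer_s,
                   low_buffer=10.0, high_buffer=30.0, safety=0.9):
    """Pick the highest bitrate sustainable by the throughput estimate,
    biased down when the playback buffer is low and up when it is full."""
    budget = throughput_est * safety
    if buffer_s < low_buffer:       # risk of stall: be conservative
        budget *= buffer_s / low_buffer
    elif buffer_s > high_buffer:    # ample buffer: allow a higher rate
        budget *= 1.1
    feasible = [r for r in ladder if r <= budget]
    return max(feasible) if feasible else min(ladder)

# Example: 5 recent segment throughputs (bits/s), 12 s of buffered video.
ladder = [400_000, 1_000_000, 2_500_000, 5_000_000]
est = estimate_throughput([2.1e6, 2.4e6, 1.9e6, 2.2e6, 2.0e6])
print(select_bitrate(ladder, est, buffer_s=12.0))
```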

2.
Seamless streaming of high-quality video under unstable network conditions is a big challenge. HTTP adaptive streaming (HAS) provides a solution that adapts the video quality to the network conditions. Traditionally, the HAS algorithm runs at the client side, where clients are unaware of bottlenecks in the radio channel and of competing clients. Traditional adaptation strategies do not explicitly coordinate between clients, servers, and cellular networks, and this lack of coordination has been shown to lead to suboptimal user experience. As a response, multi-access edge computing (MEC)-assisted adaptation techniques have emerged to take advantage of the computing and content storage capabilities of mobile networks. In this study, we investigate the performance of both MEC-assisted and client-side adaptation methods in a multi-client cellular environment. Evaluation and comparison are performed in terms of not only the video rate and the dynamics of the playback buffer but also fairness and bandwidth utilization. We conduct extensive experiments to evaluate the algorithms under varying client, server, dataset, and network settings. The results demonstrate that the MEC-assisted algorithms improve fairness and bandwidth utilization compared to the client-based algorithms for most settings. They also reveal that the buffer-based algorithms achieve high quality of experience; however, these algorithms perform poorly compared with throughput-based algorithms at protecting the playback buffer under rapidly varying bandwidth. In addition, we observe that the preparation of the representation sets affects the performance of the algorithms, as do the playback buffer size and segment duration. Finally, we provide suggestions based on the behaviors of the algorithms in a multi-client environment.

3.
With the prevalence of accessible depth sensors, dynamic skeletons have attracted much attention as a robust modality for action recognition. Convolutional neural networks (CNNs) excel at modeling local relations within local receptive fields but are typically inefficient at capturing global relations. In this article, we first view the dynamic skeletons as a spatio-temporal graph (STG) and then learn the localized correlated features that generate the embedded nodes of the STG by message passing. To better extract global relational information, a novel model called spatial-temporal graph interaction networks (STG-INs) is proposed, which performs long-range temporal modeling of human body parts. In this model, human body parts are mapped to an interaction space where graph-based reasoning can be efficiently implemented via a graph convolutional network (GCN). After reasoning, global relation-aware features are distributed back to the embedded nodes of the STG. To evaluate our model, we conduct extensive experiments on three large-scale datasets. The experimental results demonstrate the effectiveness of our proposed model, which achieves state-of-the-art performance.
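The graph-based reasoning step relies on standard GCN propagation, H' = σ(D^(-1/2)(A+I)D^(-1/2) H W). A minimal NumPy sketch of one such propagation step over a toy skeleton graph (shapes and weights are illustrative; this is not the full STG-IN architecture):

```python
import numpy as np

def gcn_layer(H, A, W):
    """One GCN propagation step: H' = relu(D^-1/2 (A+I) D^-1/2 H W).
    H: (N, F) node features, A: (N, N) adjacency, W: (F, F_out) weights."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(d ** -0.5)           # symmetric normalization
    H_next = D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W
    return np.maximum(H_next, 0.0)            # ReLU

# Toy skeleton: 5 joints in a chain, 8-dim features, 16-dim output.
N, F, F_out = 5, 8, 16
A = np.zeros((N, N))
for i in range(N - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0           # bone connections
rng = np.random.default_rng(0)
H = rng.normal(size=(N, F))
W = rng.normal(size=(F, F_out)) * 0.1
print(gcn_layer(H, A, W).shape)               # (5, 16)
```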

4.
Video transcoding creates multiple representations of a video for content adaptation and is deemed a core technique in Adaptive BitRate (ABR) streaming. How video transcoding is managed affects the performance of ABR streaming in various aspects, including operational cost, streaming delays, and Quality of Experience (QoE). Therefore, the problems of implementing video transcoding in ABR streaming must be systematically studied to improve the overall performance of streaming services. These problems become even more worthy of investigation with the emergence of the edge-cloud continuum, which makes resource allocation for video transcoding more complicated. To this end, this paper investigates the main technical problems related to video transcoding in ABR streaming, including designing a rate profile for video transcoding, provisioning resources for video transcoding in clouds, and caching multi-bitrate video contents in networks. We analyze these problems from the perspective of resource allocation in the edge-cloud continuum and cast them as resource and Quality of Service (QoS) optimization problems. The goal is to minimize resource consumption while guaranteeing the QoS of ABR streaming. We also discuss some promising research directions for ABR streaming services.

5.
Maintaining the quality of videos in resource-intensive IPTV services is challenging due to the nature of packet-based content distribution networks (CDNs). Network impairments are unpredictable and highly detrimental to the quality of video content, and quality of experience (QoE) has become a critical service differentiator. An efficient real-time quality assessment service in distribution networks is the foundation of service quality monitoring and management. The perceptual impact of individual impairments varies significantly and is influenced by complex factors. Without differentiating the impact of quality violation events on the user experience, existing assessment methodologies based on network QoS measures such as the packet loss rate cannot provide adequate support for IPTV service assessment. A discrete perceptual impact evaluation quality assessment (DEQA) framework is introduced in this paper. The proposed framework enables a real-time, non-intrusive assessment service by efficiently recognising and assessing individual quality violation events in the IPTV distribution network. The discrete perceptual impacts on a media session are aggregated for the overall user-level quality evaluation. With its deployment scheme, the DEQA framework also facilitates efficient network diagnosis and QoE management. To realise the key assessment function of the framework and investigate the proposed advanced packet inspection mechanism, we also introduce a dedicated evaluation testbed, the LA2 system. A subjective experiment with data analysis is also presented to demonstrate the development of the perceptual impact assessment functions using analytical inference, the tools of the LA2 system, subjective user tests, and statistical modelling.

6.
The proposed work analyzes the quality perceived by the user when streaming video on tablet devices. The contributions of this paper are: (i) to analyze the results of subjective quality assessments to determine which Quality of Service (QoS) parameters mainly affect the users' Quality of Experience (QoE) in video streaming on tablet devices; (ii) to define a parametric quality model useful for system control and optimization in the considered scenarios; and (iii) to compare the performance of the proposed model with subjective quality results obtained in alternative state-of-the-art studies and to investigate whether other models could be applied to our case and vice versa.

7.
Motivated by the powerful feature-learning capability of deep neural networks, a new graph-based neural network is proposed to learn local and global relational information on skeleton sequences represented as spatio-temporal graphs (STGs). The pipeline of our network architecture consists of three main stages. In the first stage, spatial-temporal sub-graphs (sub-STGs) are projected into a latent space in which every point is represented as a linear subspace. The second stage is based on message passing to acquire the localized correlated features of the nodes in the latent space. The third stage relies on graph convolutional networks (GCNs) to reason about the long-range spatio-temporal dependencies through a graph representation of the latent space. Finally, an average pooling layer and a softmax classifier are employed to predict the action categories based on the extracted local and global correlations. We validate our model on action recognition using three challenging datasets: the NTU RGB+D, Kinetics Motion, and SBU Kinect Interaction datasets. The experimental results demonstrate the effectiveness of our approach and show that our proposed model outperforms the state-of-the-art methods.

8.
Detection of salient objects in images and video is of great importance in many computer vision applications. Although the state of the art in saliency detection for still images has changed substantially over the last few years, there have been few improvements in video saliency detection. This paper proposes a novel deep non-local fully convolutional network architecture for capturing global dependencies more efficiently and investigates the use of the recently introduced non-local neural networks in video salient object detection. The effect of non-local operations is studied separately on static and dynamic saliency detection in order to exploit both appearance and motion features. The proposed architecture is tested on two well-known datasets, DAVIS and FBMS. The experimental results show that the proposed algorithm outperforms state-of-the-art video saliency detection methods.
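The non-local operation referred to here computes each position's response as a similarity-weighted sum over all positions. A minimal NumPy sketch of the embedded-Gaussian form on a flattened feature map, assuming illustrative shapes and weights (not the paper's full saliency network):

```python
import numpy as np

def nonlocal_block(x, W_theta, W_phi, W_g):
    """Embedded-Gaussian non-local operation on flattened features.
    x: (N, C) features for N spatial(-temporal) positions.
    Each output position attends to every other position, capturing
    the global dependencies that stacked local convolutions miss."""
    theta, phi, g = x @ W_theta, x @ W_phi, x @ W_g
    sim = theta @ phi.T                       # (N, N) pairwise similarities
    sim -= sim.max(axis=1, keepdims=True)     # numerical stability
    attn = np.exp(sim)
    attn /= attn.sum(axis=1, keepdims=True)   # softmax over all positions
    return x + attn @ g                       # residual connection

rng = np.random.default_rng(1)
N, C = 64, 32                                 # e.g. an 8x8 feature map, 32 channels
x = rng.normal(size=(N, C))
W = [rng.normal(size=(C, C)) * 0.05 for _ in range(3)]
print(nonlocal_block(x, *W).shape)            # (64, 32)
```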

9.
The performance of computer vision algorithms can degrade severely in the presence of a variety of distortions. While image enhancement algorithms have evolved to optimize image quality as measured according to human visual perception, their relevance in maximizing the success of computer vision algorithms operating on the enhanced image has been much less investigated. We consider the problem of image enhancement to combat Gaussian noise and low resolution with respect to the specific application of image retrieval from a dataset. We define the notion of image quality as determined by the success of image retrieval and design a deep convolutional neural network (CNN) to predict this quality. This network is then cascaded with a deep CNN designed for image denoising or super-resolution, allowing the enhancement CNN to be optimized to maximize retrieval performance. This framework allows us to couple enhancement to the retrieval problem. We also consider the problem of adapting image features for robust retrieval performance in the presence of distortions. We show through experiments on distorted images of the Oxford and Paris buildings datasets that our algorithms yield improved mean average precision compared to enhancement methods that are oblivious to the task of image retrieval.

10.
In this paper, we study the impact of quantization, frame dropping, and spatial down-sampling on the perceived quality of compressed video streams. Based on the analysis of quality ratings obtained from extensive subjective tests, we propose a no-reference metric (named MDVQM) for video quality estimation in the presence of both spatial and temporal quality impairments. The proposed metric is based on the per-pixel bitrate of the encoded stream and on selected spatial and temporal activity measures extracted from the video content. All the values required to compute the metric can be obtained without the original reference video, which makes the metric useful, for instance, for making transcoding decisions in a wireless video transmission scenario. Unlike comparable metrics in the literature, we also consider the case when the frame rate and frame size are changed simultaneously. The validation results show that the proposed metric provides a more accurate estimate of video quality than state-of-the-art metrics.
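The abstract does not define the activity measures; a common convention is ITU-T P.910's SI/TI, sketched below as an assumption about what such spatial and temporal activity features could look like (not the exact MDVQM features):

```python
import numpy as np
from scipy.ndimage import sobel

def spatial_activity(frame):
    """SI in the spirit of ITU-T P.910: std of the Sobel gradient magnitude."""
    gx, gy = sobel(frame, axis=0), sobel(frame, axis=1)
    return float(np.std(np.hypot(gx, gy)))

def temporal_activity(frames):
    """TI: max over time of the std of successive luma frame differences."""
    diffs = [np.std(b - a) for a, b in zip(frames, frames[1:])]
    return float(max(diffs))

# Toy example on random luma frames (float64, 0..255 scale).
rng = np.random.default_rng(2)
frames = [rng.uniform(0, 255, size=(144, 176)) for _ in range(5)]
print(spatial_activity(frames[0]), temporal_activity(frames))
```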

11.
This paper proposes a no-reference (NR) stereoscopic image quality assessment method based on deep feature learning. Unlike traditional hand-crafted feature extraction, a convolutional neural network (CNN) is used to extract image features automatically, and the evaluation process is divided into a training stage and a testing stage. In the training stage, images are split into patches to train the CNN; the CNN extracts patch-level features, which are combined through different aggregation schemes into global image features, and support vector regression (SVR) is used to build a regression model between subjective quality and the global features. In the testing stage, the trained CNN and the regression model yield the quality of the left image, the right image, and the cyclopean image. Finally, according to the binocular characteristics of human vision, the qualities of the left image, the right image, and the cyclopean image are fused to obtain the stereoscopic image quality. The proposed method achieves a Spearman rank-order correlation coefficient (SROCC) of 0.94 on the LIVE-I and LIVE-II databases; the evaluation results are accurate and consistent with human subjective perception.
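The final regression step (global CNN features to subjective quality via SVR) can be sketched as follows; the feature dimensionality, random stand-in data, and SVR hyperparameters are assumptions for illustration only:

```python
import numpy as np
from sklearn.svm import SVR

# Illustrative stand-in: pooled per-image CNN features (here random 64-D
# vectors) regressed onto subjective scores with epsilon-SVR, mirroring the
# "global feature -> SVR -> quality" step. Feature dimensions and SVR
# hyperparameters are assumptions, not the paper's settings.
rng = np.random.default_rng(3)
X_train = rng.normal(size=(200, 64))          # pooled patch features per image
y_train = rng.uniform(0, 100, size=200)       # subjective quality scores (MOS)

svr = SVR(kernel="rbf", C=10.0, epsilon=1.0)
svr.fit(X_train, y_train)

X_test = rng.normal(size=(5, 64))             # e.g. left/right/cyclopean features
print(svr.predict(X_test))                    # predicted quality scores
```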

12.
Most person re-identification algorithms based on triplet convolutional neural networks use the Euclidean distance to measure the similarity between pedestrians and train the network with a hinge loss. This practice has two shortcomings: the Euclidean distance is not sufficiently discriminative as a similarity measure, and the margin parameter of the hinge loss must be set manually in advance and cannot adapt during training. To address these two shortcomings, this paper proposes a person re-identification algorithm based on a new triplet convolutional neural network that improves re-identification accuracy. First, a normalized hybrid metric function is proposed to replace the traditional metric for computing pedestrian similarity, improving the discriminative power of the similarity measure. Second, a log-logistic function is adopted in place of the hinge function, removing the need to set the margin manually and improving the joint optimization of the features and the metric. Experimental results show that the proposed algorithm achieves significantly higher accuracy on both the Auto Detected CUHK03 and VIPeR databases, validating its superiority.
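The margin-free log-logistic loss adopted in place of the hinge can be written as log(1 + exp(d_pos - d_neg)). A minimal sketch contrasting it with the hinge triplet loss (the distances and margin value are illustrative; the paper's normalized hybrid metric is not reproduced here):

```python
import numpy as np

def hinge_triplet(d_pos, d_neg, margin=0.5):
    """Conventional hinge triplet loss: needs a hand-tuned margin."""
    return np.maximum(0.0, margin + d_pos - d_neg)

def log_logistic_triplet(d_pos, d_neg):
    """Soft-margin alternative: log(1 + exp(d_pos - d_neg)).
    Smooth, margin-free, and never saturates to exactly zero,
    so every triplet keeps providing a gradient."""
    return np.log1p(np.exp(d_pos - d_neg))

# Anchor-positive vs anchor-negative distances for a few triplets.
d_pos = np.array([0.2, 0.8, 1.5])
d_neg = np.array([1.0, 1.0, 1.0])
print(hinge_triplet(d_pos, d_neg))        # [0.  0.3 1. ]
print(log_logistic_triplet(d_pos, d_neg)) # smooth, margin-free losses
```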

13.
The increasing popularity of video gaming competitions, so-called eSports, has contributed to the rise of a new type of end user: the passive game video streaming (GVS) user. This user acts as a passive spectator of the gameplay rather than actively interacting with the content. The content, which is streamed over the Internet, can suffer from disturbing network and encoding impairments. Therefore, assessing the user's perceived quality, i.e., the Quality of Experience (QoE), in real time becomes fundamental. For natural video content, several approaches already exist that tackle client-side real-time QoE evaluation. The intrinsically different expectations of the passive GVS user, however, call for new real-time quality models for these streaming services. Therefore, this paper presents a real-time Reduced-Reference (RR) quality assessment framework based on a low-complexity psychometric curve-fitting approach. The proposed solution selects the most relevant low-complexity objective feature. Afterwards, the relationship between this feature and the ground-truth quality is modelled based on the psychometric perception of the human visual system (HVS). This approach is validated on a publicly available dataset of streamed game videos and is benchmarked against both subjective scores and objective models. As a side contribution, a thorough accuracy analysis of existing Objective Video Quality Metrics (OVQMs) applied to passive GVS is provided. Furthermore, this analysis has led to interesting insights into the accuracy of low-complexity client-based metrics as well as to the creation of a new Full-Reference (FR) objective metric for GVS, the Game Video Streaming Quality Metric (GVSQM).
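Psychometric curve fitting of this kind typically fits a logistic function between an objective feature and ground-truth quality. A minimal sketch using SciPy, with an assumed feature (bitrate) and made-up MOS values for illustration:

```python
import numpy as np
from scipy.optimize import curve_fit

def psychometric(x, a, b, c, d):
    """4-parameter logistic often used to map a feature to perceived quality."""
    return a + (b - a) / (1.0 + np.exp(-c * (x - d)))

# Illustrative data: one low-complexity feature (e.g. bitrate) vs. MOS.
feature = np.array([0.5, 1.0, 2.0, 4.0, 8.0])     # Mbit/s (assumed)
mos     = np.array([1.8, 2.6, 3.5, 4.2, 4.6])     # ground-truth quality

params, _ = curve_fit(psychometric, feature, mos,
                      p0=[1.0, 5.0, 1.0, 2.0], maxfev=10000)
print(psychometric(3.0, *params))                 # predicted MOS at 3 Mbit/s
```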

14.
In January 2014, the new ITU-T P.913 recommendation for measuring subjective video, audio, and multimedia quality in any environment was published. This document does not contain any time-continuous subjective method. However, environmental parameter values change continuously in a majority of outdoor and most indoor environments. To assess their impact on perceived quality, a time-continuous quality assessment methodology is necessary. Previous standards, targeting laboratory-based test settings, recommend a desk-mounted slider of substantial size. Unfortunately, there are many environments where such a device cannot be used. In this paper, new feedback tools for mobile time-continuous rating are presented and analysed. We developed several alternatives to the generally adopted desk-mounted slider as a rating device. In order to compare the tools, we defined a number of performance measures that can be used in further studies. The suitability and efficacy of the rating schemes are compared based on measurable parameters as well as user opinions. One method, the finger count, appears to outperform the others from all points of view. It was judged to be easy to use, with low potential for distraction. Furthermore, it reaches a precision level similar to that of the slider, while requiring lower user reaction and scoring times. Low reaction times are particularly important for time-continuous quality assessment, where the reliability of the mapping between impairments and user ratings plays an essential role.

15.
We examine the effect that variations in the temporal quality of videos have on global video quality. We also propose a general framework for constructing temporal video quality assessment (QA) algorithms that seek to assess transient temporal errors, such as packet losses. The proposed framework modifies simple frame-based quality assessment algorithms by incorporating a temporal quality variance factor. We use packet loss from channel errors as a specific case study of practical significance. Using the PSNR and the SSIM index as exemplars, we show that the new video QA algorithms are highly responsive to packet loss errors.
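A simple instance of such a temporal quality variance factor is to penalize the mean frame-level score by its standard deviation; the linear mean-minus-std form and the weight below are assumptions, not the paper's exact formulation:

```python
import numpy as np

def pooled_quality(frame_scores, k=1.0):
    """Pool per-frame quality (e.g. per-frame PSNR or SSIM) into a video
    score, penalizing temporal variability: mean - k * std. The linear
    mean/std form and the weight k are assumptions for illustration."""
    s = np.asarray(frame_scores, dtype=float)
    return float(s.mean() - k * s.std())

steady = [0.92] * 10                       # constant quality
bursty = [0.99] * 8 + [0.60, 0.61]         # packet-loss glitch at the end
print(pooled_quality(steady), pooled_quality(bursty))
# The bursty clip scores lower despite a similar mean, matching the
# intuition that transient drops are disproportionately annoying.
```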

16.
A deep learning method called PTR-CNN (Predicted frame with Transform unit partition and prediction Residual aided CNN) is proposed for in-loop filtering in video compression. To reduce the computational complexity of an end-to-end CNN in-loop filter, a non-learning reference frame selection method is designed that selects the highest-quality frame based on the frame's blurriness and smoothness scores. The transform unit (TU) partition and the prediction residual (PR) of the current frame are used as extra inputs to the neural network for filtering guidance. The selected similar, high-quality reference frame (RF) and the current unfiltered frame (CUF) are input to a CNN-based motion compensation module to generate a predicted frame (PF). Finally, the PF, the CUF, and the CUF's TU partition and PR are input to the main CNN to reconstruct the filtered frame. The model is implemented in TensorFlow and tested in HEVC and AV1. Experimental results show that the proposed PTR-CNN is less complex than state-of-the-art CNN-based reference-aided in-loop filtering methods and slightly outperforms them in RD performance, while introducing a complexity overhead of 7% on the encoder. In particular, for random access, the proposed model achieves an 11.78% coding gain over HEVC with DBF/SAO off and a 4.76% gain over HEVC with DBF/SAO on. An ablation study demonstrates that the RF contributes about 10% of the total gain and the TU and PR contribute over 4%, proving the effectiveness of each module. Moreover, we observe that the proposed method can restore detailed structures and textures and hence improve subjective quality.
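The blurriness and smoothness scores are not specified in the abstract; as an illustrative stand-in, the sketch below selects the reference frame by the variance-of-Laplacian sharpness measure:

```python
import numpy as np
from scipy.ndimage import laplace

def sharpness(frame):
    """Variance of the Laplacian: a common no-reference blurriness proxy.
    The paper's actual blurriness/smoothness scores are not specified here,
    so this is an illustrative stand-in."""
    return float(np.var(laplace(frame.astype(float))))

def select_reference(decoded_frames):
    """Pick the highest-quality (least blurry) decoded frame as the RF."""
    return max(decoded_frames, key=sharpness)

rng = np.random.default_rng(4)
sharp = rng.uniform(0, 255, size=(64, 64))            # textured = high Laplacian var
blurry = np.full((64, 64), 128.0)                     # flat = zero Laplacian var
ref = select_reference([blurry, sharp])
print(ref is sharp)                                   # True
```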

17.
In this paper, we study quality of experience (QoE) issues in scalable video coding (SVC) for its adaptation in video communications. A QoE assessment database is developed according to the SVC scalabilities. Based on the subjective evaluation results, we derive the optimal scalability adaptation track for each individual video and further summarize common scalability adaptation tracks for videos according to their spatial information (SI) and temporal information (TI). Based on the summarized adaptation tracks, we derive general guidelines for effective SVC video adaptation, and a rate-QoE model for SVC adaptation is obtained accordingly. Experimental results show that the proposed QoE-aware scalability adaptation scheme significantly outperforms conventional adaptation schemes in terms of QoE. Moreover, the proposed QoE model reflects the rate-QoE relationship in SVC adaptation and thus provides a useful methodology for estimating video QoE, which is important for QoE-aware scalable video streaming.

18.
Objective image quality metrics attempt to estimate the perceptual quality of a given image by considering the characteristics of the human visual system. However, the metrics may produce different quality scores even for two images that are perceptually indistinguishable to human viewers, an issue that has not been considered in existing studies on objective quality assessment. In this paper, we address this ambiguity of objective image quality assessment. We propose an approach for obtaining the ambiguity interval of an objective metric, within which a quality score difference is not perceptually significant. In particular, we use the visual difference predictor, which can account for viewing conditions that are important for visual quality perception. To demonstrate the usefulness of the proposed approach, we conduct experiments with 33 state-of-the-art image quality metrics, examining their accuracy and ambiguity on three image quality databases. The results show that the ambiguity intervals can be applied as an additional figure of merit when conventional performance measurements do not determine superiority between metrics. The effect of the viewing distance on the ambiguity interval is also shown.

19.
The involvement of external vendors in the semiconductor industry increases the chance of hardware Trojan (HT) insertion in different phases of integrated circuit (IC) design. Recently, several partial reverse engineering (RE) based HT detection techniques have been reported, which attempt to reduce the time and complexity of the full RE process by applying machine learning or image processing techniques to IC images. However, these techniques fail to extract the relevant image features, are not robust to image variations, are complicated, generalize poorly, and exhibit low detection rates. To overcome these limitations, this paper proposes a new partial RE based HT detection technique that detects Trojans in IC layout images using a Deep Convolutional Neural Network (DCNN). The proposed DCNN model stacks several convolutional and pooling layers. It automatically extracts and selects the most relevant and robust features layer by layer from the IC images, eliminating the need for a separate feature extraction algorithm. To prevent over-training of the DCNN model, a new stopping condition and two new metrics, the Accuracy difference measure (ADM) and the Loss difference measure (LDM), are proposed, which halt training only when the performance of the model genuinely drops. Further, to combat process variations and fabrication noise generated during the RE process, we include noisy images with varying parameters in the training of the model. We also apply data augmentation and regularization techniques to address underfitting and overfitting. Experimental evaluation shows that the proposed technique achieves 99% and 97.4% accuracy on the Trust-Hub and synthetic ISCAS datasets, respectively, which is on average 15.83% and 21.69% higher than existing partial RE based techniques.
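The exact ADM/LDM formulations are not given in the abstract; the sketch below assumes them to be train/validation accuracy and loss gaps and stops training when both gaps persist, which captures the stated intent of halting only on genuine performance drops:

```python
def should_stop(history, adm_thresh=0.05, ldm_thresh=0.10, patience=3):
    """Illustrative stopping rule in the spirit of ADM/LDM: halt when the
    train/validation accuracy gap (ADM) and loss gap (LDM) both stay above
    their thresholds for `patience` consecutive epochs, i.e. when the model
    genuinely starts to overfit. The gap definitions and thresholds are
    assumptions; the paper's exact formulations may differ.
    history: list of dicts with train_acc, val_acc, train_loss, val_loss."""
    if len(history) < patience:
        return False
    recent = history[-patience:]
    return all(abs(e["train_acc"] - e["val_acc"]) > adm_thresh and
               abs(e["val_loss"] - e["train_loss"]) > ldm_thresh
               for e in recent)

# Toy training log: the gaps widen from epoch 3 onward.
log = [
    {"train_acc": 0.80, "val_acc": 0.79, "train_loss": 0.50, "val_loss": 0.52},
    {"train_acc": 0.88, "val_acc": 0.85, "train_loss": 0.35, "val_loss": 0.42},
    {"train_acc": 0.95, "val_acc": 0.86, "train_loss": 0.20, "val_loss": 0.45},
    {"train_acc": 0.98, "val_acc": 0.86, "train_loss": 0.10, "val_loss": 0.50},
    {"train_acc": 0.99, "val_acc": 0.85, "train_loss": 0.05, "val_loss": 0.55},
]
print(should_stop(log))  # True: sustained overfitting gap
```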

20.
Multidimensional video scalability refers to the possibility that a video sequence can be adapted to given conditions of video consumption by adjusting one or more of its features, such as frame size, frame rate, and spatial quality. An important issue in implementing an adaptive video distribution scheme using scalability is how to maximize the quality of experience for the delivered contents, which raises a more fundamental issue: how to estimate the perceived quality of scalable video contents. This paper evaluates existing state-of-the-art objective quality metrics, including both generic image/video metrics and ones developed specifically for scalable video, on the problem of quality assessment of multidimensional video scalability. It is shown that, on the whole, some recently developed metrics targeting scalability perform best. The results are thoroughly discussed in relation to the nature of the problem, in comparison to what has been reported in existing studies of other problems.
