This paper presents a novel No-Reference Video Quality Assessment (NR-VQA) model that utilizes proposed 3D steerable wavelet transform-based Natural Video Statistics (NVS) features as well as human perceptual features. Additionally, we propose a novel two-stage regression scheme that significantly improves the overall performance of quality estimation. In the first stage, transform-based NVS and human perceptual features are separately passed through the proposed hybrid regression scheme: Support Vector Regression (SVR) followed by polynomial curve fitting. The two visual quality scores predicted in the first stage are then used as features for a similar second stage, which predicts the final quality scores of distorted videos and thereby achieves score-level fusion. Extensive experiments were conducted using five authentic and four synthetic distortion databases. Experimental results demonstrate that the proposed method outperforms other published state-of-the-art benchmark methods on synthetic distortion databases and is among the top performers on authentic distortion databases. The source code is available at https://github.com/anishVNIT/two-stage-vqa.
Sorting-based reversible data hiding (RDH) methods like pixel-value-ordering (PVO) can predict pixel values accurately and achieve extremely low distortion in the embedded image. However, the excellent performance of these methods was not well explained in previous works, and there are unexploited common points among them. In this paper, we propose a general multi-predictor (GMP) framework to summarize PVO-based RDH methods and explain their high prediction accuracy. Moreover, by utilizing the proposed GMP framework, a more efficient sorting-based RDH method is given as an example to show the generality and applicability of our framework. Compared with other PVO-based methods, the proposed example method achieves a significant improvement in embedding performance. We hope that more efficient sorting-based RDH algorithms can be designed according to our proposed framework by designing better predictors and better ways of combining them.
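For readers unfamiliar with PVO, the following is a minimal sketch of the classical maximum-side PVO step on a single block, which is the kind of sorting-based predictor the abstract refers to. It is not the GMP framework or the example method proposed in the paper. The function names `pvo_embed_block` and `pvo_extract_block` are hypothetical, and the sketch assumes no pixel is at the top of its value range, so overflow handling is omitted.

```python
def _max_and_second(block):
    # Sort indices by (value, index) so that ties are broken deterministically.
    order = sorted(range(len(block)), key=lambda i: (block[i], i))
    return order[-1], order[-2]


def pvo_embed_block(block, bit):
    """Embed at most one bit into the largest pixel of a block.

    The largest pixel is predicted by the second largest one.
    Returns (marked_block, used), where used says whether the bit was consumed.
    Assumption: pixel values stay in range, so overflow is not handled.
    """
    i_max, i_second = _max_and_second(block)
    error = block[i_max] - block[i_second]
    marked = list(block)
    if error == 1:
        marked[i_max] += bit      # expandable error: carries the payload bit
        return marked, True
    if error > 1:
        marked[i_max] += 1        # shifted error: keeps the mapping invertible
    return marked, False          # error == 0 is left untouched


def pvo_extract_block(marked):
    """Invert pvo_embed_block. Returns (restored_block, bit or None)."""
    i_max, i_second = _max_and_second(marked)
    error = marked[i_max] - marked[i_second]
    restored = list(marked)
    if error == 1:
        return restored, 0        # bit 0 was embedded, pixel unchanged
    if error == 2:
        restored[i_max] -= 1
        return restored, 1        # bit 1 was embedded
    if error > 2:
        restored[i_max] -= 1      # undo the shift, no payload here
    return restored, None


marked, used = pvo_embed_block([52, 53, 51, 50], 1)
restored, bit = pvo_extract_block(marked)
```

Because the largest pixel only ever increases, the sorted order of the block is the same before and after embedding, which is what lets the decoder recompute the same prediction and recover both the bit and the original pixels.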
The efficiency of training visual attention in the central and peripheral visual field was investigated by means of a visual detection task that was performed in a naturalistic visual environment including numerous, time-varying visual distractors. We investigated the minimum number of repetitions of the training required to obtain the top performance and whether intra-day training improved performance as efficiently as inter-day training. Additionally, our research aimed to find out whether exposure to a demanding task such as a microsurgical intervention may cancel out the effects of training.
Results showed that performance in visual attention peaked within three (for tasks in the central visual field) to seven (for tasks in the periphery) days subsequent to training. Intra-day training had no significant effect on performance. When attention training was administered after exposure to stress, improvement of attentional performance was more pronounced than when training was completed before the exposure. Our findings support the implementation of training in situ at work for more efficient results.
Practitioner Summary: Visual attention is important in an increasing number of workplaces, such as surveillance, inspection, or driving. This study shows that it is possible to train visual attention efficiently within three to seven days. Because our study was executed in a naturalistic environment, the training results are more likely to reflect the effects in the real workplace.
The visual brain fuses the left and right images projected onto the two eyes from a stereoscopic 3D (S3D) display, perceives parallax, and rebuilds a sense of depth. In this process, the eyes adjust vergence and accommodation to adapt to the depths and parallax of the points they gaze at. Conflicts between accommodation and vergence when viewing S3D content potentially lead to visual discomfort. A variety of approaches have been taken towards understanding the perceptual bases of discomfort felt when viewing S3D, including extreme disparities or disparity gradients, negative disparities, dichoptic presentations, and so on. However, less effort has been applied towards understanding the role of eye movements as they relate to visual discomfort when viewing S3D. To study eye movements in the context of S3D viewing discomfort, a Shifted-S3D-Image-Database (SSID) was constructed using 11 original nature scene S3D images and their 6 shifted versions. We conducted eye-tracking experiments on humans viewing S3D images in SSID while simultaneously collecting their judgments of experienced visual discomfort. From the collected eye-tracking data, regions of interest (ROIs) were extracted by kernel density estimation using the fixation data, and an empirical formula was fitted between the disparities of salient objects marked by the ROIs and the mean opinion scores (MOS). Finally, the eye-tracking data were used to analyze the eye movement characteristics related to S3D image quality. Fifteen eye movement features were extracted, and a visual discomfort prediction model was learned using a support vector regressor (SVR). By analyzing the correlations between features and MOS, we conclude that angular disparity features have a strong correlation with human judgments of discomfort.
Transmitted-reference (TR) ultra-wideband (UWB) communication systems have gained increasing popularity for use in low-data-rate applications, owing to their non-coherent receiver structure. In a conventional TR system, non-coherency at the receiver is achieved by sending reference pulses prior to the data-bearing pulses. Then, at the receiver side, the reference pulses are used as template signals for correlation with the data-bearing pulses. Therefore, the orthogonality between reference and data pulses is obtained in time division multiple access (TDMA) fashion. However, the implementation of a wideband delay line is very difficult in current low-power integrated circuits. In this paper, a TR method called Chaos-Based TR (CB-TR) is proposed. In the proposed method, chaotic sequences are used to separate the reference and data pulses. This approach exploits the benefits of chaotic signals, such as non-periodicity, ease of generation, an impulse-like autocorrelation, and low cross-correlation. Furthermore, in order to decrease the influence of some negative properties of conventional chaotic maps, a modified chaotic generator (MCS) is proposed. Simulation results over the IEEE 802.15.4a channel model show bit error rate performance comparable to that of other TR methods.
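The correlation properties mentioned above can be illustrated with a conventional chaotic map. The sketch below uses the logistic map with parameter 4, which is an assumption chosen for illustration; the abstract does not say which map the paper uses, and this is not the modified generator it proposes. The names `chaotic_sequence` and `normalized_correlation`, the seeds, the burn-in length, and the sequence length are all hypothetical.

```python
import math


def chaotic_sequence(seed, length, burn_in=100):
    """Generate a zero-mean sequence in [-1, 1] from the logistic map.

    Assumption: x[n+1] = 4 * x[n] * (1 - x[n]) with a seed in (0, 1).
    Note that seeds s and 1 - s produce the same orbit after one step.
    """
    x = seed
    for _ in range(burn_in):          # discard the initial transient
        x = 4.0 * x * (1.0 - x)
    values = []
    for _ in range(length):
        x = 4.0 * x * (1.0 - x)
        values.append(1.0 - 2.0 * x)  # map (0, 1) to (-1, 1), mean near 0
    return values


def normalized_correlation(a, b, lag):
    """Correlation of a[n] with b[n + lag], normalized by the signal energies."""
    overlap = min(len(a), len(b) - lag)
    dot = sum(a[n] * b[n + lag] for n in range(overlap))
    energy = math.sqrt(sum(v * v for v in a) * sum(v * v for v in b))
    return dot / energy


reference = chaotic_sequence(0.31, 4096)
data = chaotic_sequence(0.37, 4096)

auto_peak = normalized_correlation(reference, reference, 0)   # close to 1
auto_side = normalized_correlation(reference, reference, 1)   # close to 0
cross = normalized_correlation(reference, data, 0)            # close to 0
```

A correlation receiver can tell two such sequences apart because the autocorrelation is concentrated at zero lag while the cross-correlation between sequences from different seeds stays small, which is the property the abstract relies on to separate reference and data pulses without a time offset.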