Similar literature
1.
For a variety of applications such as video surveillance and event annotation, the spatial–temporal boundaries between video objects are required for annotating visual content with high-level semantics. In this paper, we define spatial–temporal sampling as a unified process of extracting video objects and computing their spatial–temporal boundaries using a learnt video object model. We first provide a computational approach for learning an optimal key-object codebook sequence from a set of training video clips to characterize the semantics of the detected video objects. Then, dynamic programming with the learnt codebook sequence is used to locate the video objects with spatial–temporal boundaries in a test video clip. To verify the performance of the proposed method, a human action detection and recognition system is constructed. Experimental results show that the proposed method gives good performance on several publicly available datasets in terms of detection accuracy and recognition rate.

2.
The development described in this article proves that it is possible to trace dishonest users who upload videos with sensitive content to the YouTube service. To trace these traitors, fingerprint marks are embedded by a watermarking algorithm into each copy of the video before it is distributed. Our experiments show that if the watermarking algorithm is carefully configured and the fingerprints are correctly chosen, the traitor, or a member of a set of traitors who have performed a collusion attack, can be identified from a pirate video uploaded to the YouTube service.

3.
Video services have appeared in recent years due to advances in video coding and convergence to IP networks. As these emerging services mature, the ability to deliver adequate quality to end-users becomes increasingly important. However, the transmission of digital video over error-prone and bandwidth-limited networks may produce spatial and temporal visual distortions in the decoded video. Both types of impairments affect the perceived video quality. In this paper, we examine the impact of spatio–temporal artefacts in video and especially how both types of errors interact to affect the overall perceived video quality. We show that the impact of the spatial quality on overall video quality is dependent on the temporal quality and vice versa. We observe that the introduction of a degradation in one modality affects the quality perception in the other modality, and this change is larger for high-quality conditions than for low-quality conditions. The contribution of the spatial quality to the overall quality is found to be greater than the contribution of the temporal quality. Our results also indicate that low-motion talking-head content can be more negatively affected by temporal frame-freezing artefacts than other general types of content with higher motion. Based on the results of a subjective experiment, we propose an objective model to predict overall video quality by integrating the contributions of spatial quality and temporal quality. The non-linear model shows a very high linear correlation with subjective data.

4.
This paper proposes an Iterative Joint Source–Channel Decoding (IJSCD) scheme for error-resilient transmission of H.264 compressed video over noisy channels by using the available H.264 compression, e.g., Context-based Adaptive Binary Arithmetic Coding (CABAC), and channel coding, i.e., a rate-1/2 Recursive Systematic Convolutional (RSC) code, in transmission. At the receiver, the turbo decoding concept is explored to develop a joint source–channel decoding structure using a soft-in soft-out channel decoder in conjunction with the source decoding functions, e.g., CABAC-based H.264 semantic verification, in an iterative manner. Illustrative designs of the proposed IJSCD scheme for an Additive White Gaussian Noise (AWGN) channel, including the derivations of key parameters for soft information, are discussed. The performance of the proposed IJSCD scheme is shown for several video sequences. In the examples, for the same desired Peak Signal-to-Noise Ratio (PSNR), the proposed IJSCD scheme offers savings of up to 2.1 dB in required channel Signal-to-Noise Ratio (SNR) compared to a system using the same RSC code alone. The complexity of the proposed scheme is also evaluated. As the number of iterations is controllable, a tradeoff can be made between performance improvement and the overall complexity.
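
The PSNR figure used as the comparison point in the abstract above can be illustrated with a short sketch. It uses the standard definition PSNR = 10·log10(peak² / MSE); the 8-bit peak value of 255 and the flat-list frame representation are assumptions made for illustration, since the abstract does not state how the authors measured PSNR.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[int], decoded: Sequence[int], peak: int = 255) -> float:
    """Peak Signal-to-Noise Ratio in dB between two equally sized sample lists.

    Assumption: samples are 8-bit luminance values, so peak defaults to 255.
    """
    if len(reference) != len(decoded) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the reference and the decoded samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)
```

A higher value means the decoded frame is closer to the reference; holding this value fixed and comparing the channel SNR each system needs is the kind of comparison the abstract reports.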

5.
The performance of video tone-mapping operators is investigated in a rating experiment using two criteria: overall quality and fidelity to real-world experience. The study includes a tone-curve used in commercial cameras, rarely considered in tone-mapping evaluation studies. The quality is measured for a range of parameter settings, revealing the importance of parameter fine-tuning and the often unsatisfactory results of the default operator parameters. In order to explain what makes the best-performing operators better, the results are analysed in relation to image statistics and the characteristics of the tone-mapping function. Our observations are: state-of-the-art tone mapping produces measurably better results than a camera’s S-shaped curve for high dynamic range scenes with important content spanning a wide dynamic range; differences in colour reproduction strongly affect the results; fidelity and quality criteria produce similar results when no reference is present; and state-of-the-art operators produce results of comparable quality when their parameters are well selected.

6.
We have designed and implemented a controllable software architecture for a Video-on-Demand (VOD) server. With the proposed software architecture, many system design issues can be investigated. For example, we studied several disk striping schemes in the storage subsystem and examined the impact of the disk striping schemes on the utilization of system resources. In the processing component, we observed that additional concurrent video streams can be supported by using efficient memory interleaving. Buffering with a large memory space in the processing subsystem is a common technique to alleviate the latency variance of accessing different system components. By employing user-level control and scheduling, the variance can be decreased, thereby reducing the buffer space needed for each video stream. In the network subsystem, we adopted a server-driven approach for investigating MPEG-2 video delivery over Asynchronous Transfer Mode (ATM) networks. The VOD server controls the pace of video transmission and reduces the complexity of the client. Since the client has limited buffer space (due to cost considerations), we have reduced the buffer requirement by regulating the transmission based on timing information embedded in the MPEG-2 streams. Our research and experimental results are based on a VOD server which is currently under construction. The prototype server is based on an SGI shared-memory multiprocessor with a mass storage system consisting of RAID-3 disk arrays. Using 30 RAID-3 disk arrays, preliminary experimental results show that the prototype server can potentially support more than 360 high-quality video streams with careful design and coordination of the different system components.

7.
In the context of registration between videos and geographic information system (GIS)-based 3D building models, for instance in augmented reality applications, we propose a solution to one of the most critical problems, namely the registration initialization. Successful automatic 2D/3D matching is achieved by combining two context-dependent improvements. On the one hand, we associate semantic information with the low-level primitives we use, in order to reduce the problem complexity. On the other hand, we avoid false initial registration solutions by analyzing the convergence of the iterative pose computation in a RANSAC framework. We require that videos are acquired together with global positioning system measurements. We also present how such a registration can be exploited once it has been performed for the whole video. Textures of visible buildings are extracted from the images. A new algorithm for façade texture fusion based on statistical analysis of the texel colors is presented. It allows us to remove from the final textures all occluding objects in front of the viewed building façades.

8.
In this paper, we propose a novel Wyner–Ziv-based video compression scheme which supports encoding a new type of inter frame called the ‘M-frame’. Unlike traditional multi-hypothesis inter frames, the M-frame is compressed with its two neighboring frames as references at the encoder, but can be identically reconstructed by using either one of them as the prediction at the decoder. Based on this, the proposed Wyner–Ziv-based bidirectionally decodable video compression scheme supports decoding the frames in a video stream in both temporal order and reverse order. Unlike other schemes which support reverse playback, our scheme achieves reversibility at a low extra cost in storage and bandwidth. In error-resilience tests, our scheme outperforms H.264-based schemes by up to 3.5 dB at the same bit rate. The proposed scheme also provides more flexibility for stream switching.

9.
Network quality of service (NQoS) of IP networks is unpredictable and impacts the quality of networked multimedia services. Adaptive voice and video schemes are therefore vital for the provision of voice over IP (VoIP) services for optimised quality of experience (QoE). Traditional adaptation schemes based on NQoS do not take perceived quality into consideration even though the user is the best judge of quality. Additionally, uncertainties inherent in NQoS parameter measurements make the design of adaptation schemes difficult and their performance suboptimal. This paper presents a QoE-driven adaptation scheme for voice and video over IP to solve the optimisation problem to provide optimal QoE for networked voice and video applications. The adaptive VoIP architecture was implemented and tested both in NS2 and in an Open IMS Core network to allow extensive simulation and test-bed evaluation. Results show that the scheme was optimally responsive to available network bandwidth and congestion for both voice and video and optimised delivered QoE for different network conditions, and is friendly to TCP traffic.

10.
11.
Stopping to think becomes more important the harder it becomes to do. Television is very popular and has been used with great success for many decades all over the world. Videotelephony is relatively unpopular, despite long study and some recent advances. It seems obvious to adopt the pictorial culture of television as a guide to the development of videotelephony. The authors believe this assumption is not only fundamentally mistaken, but is partly responsible for the unpopularity of videotelephony. To encourage broader and deeper debate, some artistic and engineering aspects of pictorial culture are discussed in exploring how videotelephony might be made more appealing. The implications for future telepresence systems and for new virtual-world multimedia environments are discussed and topics for further work suggested.

12.
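iPanel, a provider of digital television service platforms and application development services, recently announced that the laboratory it has set up jointly with the US company Active video (formerly ICTV) has been established. The new laboratory is located in Shenzhen, China, where iPanel is headquartered.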

13.
Transform coding has been widely used in video coding standards, such as H.264 advanced video coding (H.264/AVC) and high efficiency video coding (HEVC). But the coded video sequences suffer from annoying coding artifacts, such as blocking and ringing artifacts. In this paper, we propose the quadtree-based non-local Kuan’s (QNLK) filter to suppress the quantization noise optimally and improve the objective and subjective quality of the reconstructed frame simultaneously. The proposed filter takes advantage of the non-local Kuan’s (NLK) filter to restore the quantized signal in the transform domain. Restored coefficients are then projected onto designed quantization constraint sets (QCS). A quadtree-based signaling strategy is used at the end of QNLK for adaptive filtering on/off control. Experimental results show that the proposed method achieves significant objective coding gain and visual quality improvement compared with both H.264/AVC high profile and HEVC.

14.
The transmission of wireless video in acceptable quality is only possible by following an end-to-end approach. WaveVideo is an integrated, adaptive video coding architecture designed for heterogeneous wireless networks. It includes basic video compression algorithms based on wavelet transformations, efficient channel coding, a filter architecture for receiver-based media scaling, and error-control methods to adapt video transmissions to the wireless environment. Using a joint source/channel coding approach, WaveVideo offers a high degree of error tolerance on noisy channels while remaining competitive in terms of compression. Adaptation to channel conditions and user requirements is implemented on three levels. The coding itself features spatial and temporal measures to conceal transmission errors. Additionally, the amount of introduced error-control information is controlled by feedback. The video stream coding, applied to multicast-capable networks, can serve different users' needs efficiently at the same time by scaling the video stream in the network according to receivers' quality requirements. The WaveVideo architecture is unique in terms of its capability to use QoS mapping and adaptation functions across all network nodes providing the same uniform interface.

15.
This paper describes a new algorithm for detecting cuts, thereby segmenting a video into shots. Our Web-based video library contains a large volume of news and documentary material; most of the transitions between shots in that type of programming are cuts, rather than dissolves or other complex transitions. We have developed an accurate multi-attribute algorithm for detecting cuts in video programs. The algorithm uses a motion metric to identify a set of cuts, then uses luminance histograms to eliminate false cuts. Our experimental results show that this algorithm is more accurate than previous motion-based transition detection algorithms.
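
The two-stage idea in the abstract above (propose candidate cuts with a motion metric, then reject false cuts with luminance histograms) can be sketched as follows. This is only an illustrative sketch, not the authors' algorithm: the abstract does not define the motion metric, so the mean absolute frame difference is used as a stand-in, and the function names, bin count, and both thresholds are hypothetical.

```python
from typing import List, Sequence

Frame = Sequence[int]  # assumption: a frame is a flat list of 8-bit luminance values


def luminance_histogram(frame: Frame, bins: int = 16) -> List[float]:
    """Normalized luminance histogram with `bins` equal-width bins."""
    hist = [0] * bins
    for value in frame:
        hist[value * bins // 256] += 1
    return [count / len(frame) for count in hist]


def histogram_distance(a: List[float], b: List[float]) -> float:
    """Half the L1 distance between two normalized histograms, in [0, 1]."""
    return 0.5 * sum(abs(x - y) for x, y in zip(a, b))


def mean_abs_difference(a: Frame, b: Frame) -> float:
    """Stand-in 'motion metric' (an assumption): mean absolute pixel difference."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def detect_cuts(frames: List[Frame],
                motion_threshold: float = 30.0,
                hist_threshold: float = 0.5) -> List[int]:
    """Return indices i such that a cut is declared between frames i-1 and i."""
    cuts = []
    for i in range(1, len(frames)):
        prev, cur = frames[i - 1], frames[i]
        # Stage 1: the stand-in motion metric proposes candidate cuts.
        if mean_abs_difference(prev, cur) < motion_threshold:
            continue
        # Stage 2: keep the candidate only if the luminance distribution also
        # changes; heavy motion within one shot leaves the histogram similar.
        distance = histogram_distance(luminance_histogram(prev),
                                      luminance_histogram(cur))
        if distance >= hist_threshold:
            cuts.append(i)
    return cuts
```

In this sketch, a frame pair with large pixel differences but the same luminance distribution (for example, a pattern that merely shifts position) passes the first stage and is discarded by the second, which is the role the abstract assigns to the histogram check.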

16.
In this paper, a novel multidimensional underwater video dehazing method is presented to restore and enhance degraded underwater videos. Underwater videos suffer from medium scattering and light absorption. The absorption of light traveling through water makes hazy underwater videos different from hazy atmospheric videos. In order to dehaze underwater videos, a spatial–temporal information fusion method is proposed which includes two main parts. One is transmission estimation, which is based on the correlation between adjacent video frames to keep the color consistent, where fast tracking and the least squares method are used to reduce the influence of camera motion, object motion, and water flow. The other part is background light estimation, which keeps the atmospheric light values consistent within a video. Extensive experimental results demonstrate that the proposed algorithm has superior haze removal and color balancing capabilities.

17.
18.
1 Introduction With the ubiquitous use of the Internet and the deployment of the next generation of networks, video communications are increasingly becoming the major services in demand. Unlike data transmission, video communication is essentially time-sensitiv…

19.
Stereoscopic video coding (SSVC) plays an important role in various 3D video applications. In SSVC, robust stereoscopic video transmission over error-prone networks is still a challenging problem to be solved. In this paper, we propose a joint encoder–decoder error control framework for SSVC, where error-resilient source coding, transmission network conditions, and the error concealment scheme are jointly considered to achieve better error robustness. The proposed joint encoder–decoder error control framework includes two parts: an error concealment algorithm at the decoder side and a rate–distortion optimized error resilience algorithm at the encoder side. For error concealment at the decoder side, an error concealment scheme based on overlapped block motion and disparity compensation is proposed to adaptively utilize inter-view correlations and temporal correlations. For error resilience at the encoder side, first, inter-view refreshment is proposed for SSVC to suppress error propagation. Then, an end-to-end distortion model for SSVC is derived, which jointly considers the transmission network conditions, inter-view refreshment, and error concealment tools at the decoder side. Finally, based on the derived end-to-end distortion model, the rate–distortion optimized error resilience algorithm is presented to adaptively select inter-view, inter- or intra-coding for SSVC. The experimental results show that the proposed joint encoder–decoder error control framework has superior error robustness for stereoscopic video transmission over error-prone networks.

20.
This paper presents the design of a fifth-order low-pass elliptic filter that employs a parallel connection of two all-pass sections to satisfy specifications commonly used in video frequency applications. Operating with a sampling frequency of 16 MHz, the IC prototype was implemented in a standard double-poly CMOS 0.8 μm process. The experimental verification showed a passband frequency deviation smaller than 0.08 dB up to the passband edge frequency of 3.4 MHz, and an output noise power of 0.97 $\mu\mathrm{V}_{\mathrm{RMS}}/\sqrt{\mathrm{Hz}}$, resulting in a dynamic range of 49.1 dB. The filter structure enables multiple fault detection and suits modern automated testing configurations to allow accurate estimation of the actually implemented transfer function parameters, an issue of increasing importance in VLSI circuit design. The relative area required for testing the fifth-order filter is only 8% of the total filter area, and decreases as the filter order increases. Jorge Morales Cañive was born in Cienfuegos, Cuba, in 1963. He received the B.Sc. and M.Sc. degrees from the Technical University of Saint Petersburg, Russia, in 1986 and 1988, respectively, and the D.Sc. degree from the Federal University of Rio de Janeiro, Brazil, in 1991, all in electrical engineering. From 1988 to 1994, he worked at CEADEN, in Havana, Cuba, on the development of nuclear equipment. 
From 1994 to 1997, he worked at INOR, in Havana, Cuba, on the research and development of acquisition systems and image processing for nuclear medicine. His research interests are in the areas of analog and digital signal processing. Antonio Petraglia (S’89-M’91-SM’99) received the Engineer and M.Sc. degrees from the Federal University of Rio de Janeiro (UFRJ), Brazil, in 1977 and 1982, respectively, and the Ph.D. degree from the University of California, Santa Barbara (UCSB), in 1991, all in electrical engineering. In 1979, he joined the Faculty of UFRJ as an Associate Professor of electrical engineering, where he served as a Co-Chair in the Department of Electronic Engineering from 1982 to 1984. During the second semester of 1991, he was a post-Doctoral researcher with the Department of Electrical and Computer Engineering at UCSB. Since 1992 he has been on the faculty of the Program for Post-Graduate Engineering at UFRJ, where in 1997 he established the Laboratory for the Processing of Analog and Digital Signals. From March 2001 through March 2002 he was a Visiting Scholar with the Electrical Engineering Department at the University of California, Los Angeles. He has been involved in teaching and research activities in the areas of analog and digital signal processing, and in mixed analog-digital integrated circuit design. He is a distinguished member of the Brazilian Millenium Group in Nanoelectronics and Microelectronics in 2006–2008. Dr. Petraglia served as an Associate Editor for the IEEE Transactions on Circuits and Systems-II: Analog and Digital Signal Processing in 2002–2003. Mariane Rembold Petraglia (M’97) received the B.Sc. degree in electronic engineering from the Federal University of Rio de Janeiro, Brazil, in 1985, and the M.Sc. and Ph.D. degrees in electrical engineering from the University of California, Santa Barbara, in 1988 and 1991, respectively. 
From 1992 to 1993, she was with the Department of Electrical Engineering, Catholic University of Rio de Janeiro, Brazil. Since 1993, she has been with the Department of Electronic Engineering and with the Program of Electrical Engineering, COPPE, at the Federal University of Rio de Janeiro, where she is presently an Associate Professor. From March 2001 to February 2002, she was a Visiting Researcher with the Adaptive Systems Laboratory, at the University of California, Los Angeles. Her research interests are in adaptive signal processing, multirate systems, and image processing. Dr. Petraglia is a member of Tau Beta Pi, and a distinguished member of the Brazilian Millenium Group in Nanoelectronics and Microelectronics in 2006–2008. She has served as an Associate Editor for the IEEE Transactions on Signal Processing since Nov. 2004.
