Similar Literature
20 similar documents found (search time: 31 ms)
1.
2.
Automatic text segmentation and text recognition for video indexing (cited 13 times: 0 self-citations, 13 by others)
Efficient indexing and retrieval of digital video is an important function of video databases. One powerful index for retrieval is the text that appears in videos, since it enables content-based browsing. We present new methods for automatic segmentation of text in digital videos. The proposed algorithms exploit typical characteristics of text in videos to enable and improve segmentation performance. The unique features of our approach are the tracking of characters and words over their complete duration of occurrence in a video, and the integration of a character's multiple bitmaps over time into a single bitmap. The output of the text segmentation step is passed directly to a standard OCR software package to translate the segmented text into ASCII. A straightforward indexing and retrieval scheme is also introduced; it is used in the experiments to demonstrate that the proposed text segmentation algorithms, together with existing text recognition algorithms, are suitable for indexing and retrieving relevant video sequences from a video database. Our experimental results are very encouraging and suggest that these algorithms can be used in video retrieval applications as well as to recognize higher-level semantics in videos.
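The temporal integration of a character's multiple bitmaps can be sketched as a per-pixel majority vote across frames. This is an illustrative reconstruction, not the authors' exact algorithm; the function name `integrate_bitmaps` and the 0/1-matrix representation are assumptions.

```python
def integrate_bitmaps(bitmaps, threshold=0.5):
    """Combine multiple noisy bitmaps of the same character into one.

    Each bitmap is a 2-D list of 0/1 pixels; a pixel survives in the
    integrated bitmap if it is set in at least `threshold` of the frames,
    so transient background noise is voted out over time.
    """
    n = len(bitmaps)
    rows, cols = len(bitmaps[0]), len(bitmaps[0][0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            votes = sum(b[r][c] for b in bitmaps)
            out[r][c] = 1 if votes / n >= threshold else 0
    return out
```

The integrated bitmap would then be handed to OCR in place of any single noisy frame.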

3.
For the last half century the most widely used video storage devices have been magnetic tapes, on which information is stored in analog form based on electromagnetic principles. Once digital techniques became dominant, analog information had to be converted to digital format in order to preserve the data. Unfortunately, analog videos may be affected by drops that produce visual defects, and these defects can be carried over during the digitization process. Although much hardware exists to perform digitization, only a small fraction of it implements automatic correction of these defects. In some cases drop removal is possible on the analog device; however, when one owns a damaged, already-converted video, correction based on image processing techniques is the only way to enhance it. In this paper the drop, also known as “Tracking Error” or “Mistracking,” is analyzed. We propose an algorithm to detect the drops' visual artifacts in converted videos, as well as a digital restoration method.

4.
Automatic video segmentation plays a vital role in annotating sports videos. This paper presents a fully automatic and computationally efficient algorithm for the analysis of sports videos. Various methods of automatic shot boundary detection have been proposed to perform automatic video segmentation; these investigations mainly concentrate on detecting fades and dissolves for fast processing of the entire video scene, without providing any additional feedback on object relativity within the shots. The goal of the proposed method is to identify regions that perform certain activities in a scene. The model uses low-level video processing algorithms to extract shot boundaries from a video scene and to identify dominant colours within these boundaries. An object classification method is used to cluster the seed distributions of the dominant colours into homogeneous regions. Using a simple tracking method, these regions are classified as active or static. The efficiency of the proposed framework is demonstrated on a standard video benchmark covering numerous types of sports events, and the experimental results show that our algorithm can be used with high accuracy for the automatic annotation of active regions in sports videos.

5.
赵奇, 刘皎瑶, 徐敬东 《计算机工程》2007, 33(22): 134-136
A new method that combines audio and visual features to detect video content is proposed, in order to improve the accuracy of video content recognition. The method analyzes visual features and audio features separately, introduces a support vector machine (SVM) to classify audio segments, and combines the analysis results from both the audio and visual domains to judge the video content. Analysis of particular video clips shows that the combined audio-visual approach is feasible and effective, and can be applied to video content monitoring as well as to the retrieval and segmentation of specific video clips.

6.
Production model based digital video segmentation (cited 20 times: 1 self-citation, 19 by others)
Effective and efficient tools for segmenting and content-based indexing of digital video are essential to allow easy access to video-based information. Most existing segmentation techniques do not use explicit models of video. The approach proposed here is inspired and influenced by well-established video production processes, and computational models of these processes are developed. The video models are used to classify the transition effects used in video and to design automatic edit-effect detection algorithms; video segmentation is thus formulated as a production-model-based classification problem. The video models are also used to define segmentation error measures. Experimental results from applying the proposed technique to commercial cable television programming are presented.
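The simplest edit effect in such a production model is the hard cut, which shows up as a large frame-to-frame histogram distance. The sketch below assumes pre-computed normalised colour histograms and a hand-picked threshold; it illustrates the idea only and is not the paper's classifier.

```python
def detect_cuts(histograms, threshold=0.5):
    """Flag frame indices where the histogram distance to the previous
    frame exceeds `threshold` (a hard cut in the simplest edit model).

    `histograms` is a list of normalised per-frame colour histograms;
    the L1 distance between two normalised histograms lies in [0, 2].
    """
    cuts = []
    for i in range(1, len(histograms)):
        prev, cur = histograms[i - 1], histograms[i]
        d = sum(abs(p - c) for p, c in zip(prev, cur))
        if d > threshold:
            cuts.append(i)
    return cuts
```

Gradual transitions such as fades and dissolves need the fuller production models the abstract describes, since their per-frame distances stay below any single-cut threshold.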

7.

Athlete detection and action recognition in sports video is a very challenging task due to the dynamic and cluttered background. Several attempts at automatic analysis focusing on athletes in various sports videos have been made; however, taekwondo video analysis remains an unstudied field. In light of this, a novel framework for automatic technique analysis in broadcast taekwondo video is proposed in this paper. For an input video, in the first stage athlete tracking and body segmentation are performed with a modified Structure Preserving Object Tracker. In the second stage, the de-noised frames that completely contain the body of the analyzed athlete are used to train a deep learning network, PCANet, to predict the athlete's action in each single frame. Since one technique is composed of many consecutive actions and each action corresponds to a video frame, it makes sense to analyze techniques over video sequences. In the last stage, a linear SVM is trained on the predicted action frames to obtain a technique classifier. To evaluate the performance of the proposed framework, extensive experiments on a real broadcast taekwondo video dataset are provided. The results show that the proposed method achieves state-of-the-art results for complex technique analysis in taekwondo video.


8.
In the MPEG-4 video coding standard, to enable content-based interactive functions, each frame of a video sequence is represented by video object planes. Generating video object planes requires effective segmentation of the moving objects in the sequence and tracking of their changes over time. Among video segmentation methods, interactive segmentation of video objects can satisfy both efficiency and quality requirements, so a method combining interactive segmentation with automatic tracking is proposed for segmenting semantic video objects. In the initial segmentation, video object contours are extracted by combining user interaction with the morphological watershed segmentation algorithm, and an improved contour-tracking method effectively increases the accuracy of the object contours. For subsequent frames, a six-parameter affine transform tracks the changes in the moving object's contour, using motion vectors from translation estimation as initial values to compute the six affine parameters. Experimental results show that the method can effectively segment and track moving video objects.
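The six-parameter affine tracking step can be illustrated directly on contour points. The function name `affine_transform` and the parameter ordering below are assumptions for illustration, not the paper's code.

```python
def affine_transform(points, params):
    """Apply a six-parameter affine transform to 2-D contour points.

    With params = (a, b, c, d, tx, ty):
        x' = a*x + b*y + tx
        y' = c*x + d*y + ty
    which covers translation, rotation, scaling and shear of a contour.
    """
    a, b, c, d, tx, ty = params
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in points]
```

In the tracking loop described above, the translation part (tx, ty) would be initialised from the estimated motion vector before the remaining parameters are refined.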

9.
10.
Graph-based multilevel temporal video segmentation (cited 1 time: 0 self-citations, 1 by others)
This paper presents a graph-based multilevel temporal video segmentation method. At each level of the segmentation, a weighted undirected graph structure is built, and the graph is partitioned into clusters that represent the segments of a video. Three low-level features are used to calculate the similarity of temporal segments: visual content, motion content and shot duration. Our strength-factor approach improves the efficiency of the proposed method. Experiments show that the proposed video scene detection method gives promising results for organizing videos without human intervention.

11.
Motion pattern segmentation and dynamic texture classification based on chaotic features (cited 1 time: 0 self-citations, 1 by others)
王勇  胡士强 《自动化学报》2014,40(4):604-614
Chaos theory is applied to model the pixel-value sequences in dynamic textures: relevant feature quantities are extracted from those sequences, and the video is represented as a matrix of feature vectors. The feature vectors in the matrix are clustered with the mean shift algorithm to segment the motion patterns in the video. The earth mover's distance (EMD) is then used to measure the differences between videos and to classify dynamic-texture videos. Tests on multiple databases show that: 1) the segmentation algorithm can separate the different motion patterns in a video; 2) the proposed feature vector describes dynamic-texture systems well; 3) the classification algorithm can classify dynamic-texture videos and is somewhat robust to noise in the videos.
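For 1-D normalised histograms on a uniform grid, the earth mover's distance used for classification reduces to the L1 distance between the two cumulative distributions. The sketch below illustrates only that special case and is not the paper's implementation, which compares feature-vector signatures.

```python
def emd_1d(p, q):
    """Earth mover's distance between two 1-D normalised histograms
    defined on the same bins with unit spacing.

    In 1-D, EMD equals the sum of absolute differences of the running
    cumulative sums (the CDFs) of the two histograms.
    """
    total, cum = 0.0, 0.0
    for pi, qi in zip(p, q):
        cum += pi - qi        # running CDF difference
        total += abs(cum)     # mass that must still be moved one bin
    return total
```

Moving all mass from the first bin to the third, for example, costs 2 bin-widths of work, which the running-CDF formulation recovers directly.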

12.
This paper targets the problem of automatic semantic indexing of news videos by presenting a video annotation and retrieval system that performs automatic semantic annotation of news video archives and provides access to the archives via these annotations. The presented system relies on video texts as the information source and exploits several information extraction techniques on these texts to arrive at representative semantic information about the underlying videos. These techniques include named entity recognition, person entity extraction, coreference resolution, and semantic event extraction. Apart from the information extraction components, the proposed system also encompasses modules for news story segmentation, text extraction, and video retrieval, along with a news video database, making it a full-fledged system to be employed in practical settings. The proposed system is a generic one, employing a wide range of techniques to automate the semantic video indexing process and to bridge the semantic gap between what can be automatically extracted from videos and what people perceive as video semantics. Based on the proposed system, a novel automatic semantic annotation and retrieval system is built for Turkish and evaluated on a broadcast news video collection, providing evidence of its feasibility and convenience for news videos with satisfactory overall performance.

13.
14.
To address the lack of paired data samples and the inconsistency between consecutive frames in virtual-to-real driving scene translation, a video translation model based on generative adversarial networks is proposed. To cope with the lack of paired samples, the model adopts a "dual-network" architecture, using semantic segmentation scenes as an intermediate representation to build separate front-end and back-end networks. The front-end network uses a convolution/deconvolution framework together with an optical-flow network that extracts dynamic information between consecutive frames, achieving continuous video translation from virtual scenes to semantic segmentation scenes. The back-end network uses a conditional generative adversarial framework with a generator, an image discriminator and a video discriminator, again combined with the optical-flow network, to achieve continuous video translation from semantic segmentation scenes to real scenes. Experiments trained and tested on data collected from an autonomous-driving simulator together with public datasets show that the model can translate virtual scenes to real ones across a variety of driving scenarios, with translation quality clearly better than that of the comparison algorithms. The results show that the proposed model effectively resolves inter-frame discontinuity and the blurring of dynamic objects, producing smoother translated videos and adapting to a variety of complex driving scenes.

15.
16.
In this work we are concerned with detecting non-collaborative videos in video-sharing social networks. Specifically, we investigate how much visual content-based analysis can aid in detecting ballot stuffing and spam videos in threads of video responses. That is a very challenging task because of the high-level semantic concepts involved; because of the assorted nature of social networks, which prevents the use of constrained a priori information; and, above all, because of the context-dependent nature of non-collaborative videos. Content filtering for social networks is an increasingly demanded task: as their popularity grows, the number of abuses also tends to grow, annoying users and disrupting services. We propose two approaches, each better adapted to a specific non-collaborative action: ballot stuffing, which tries to inflate the popularity of a given video by posting "fake" responses to it, and spamming, which tries to insert a non-related video as a response to popular videos. We endorse the use of low-level features combined into higher-level feature representations, such as bag-of-visual-features and latent semantic analysis. Our experiments show the feasibility of the proposed approaches.
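The bag-of-visual-features representation the authors endorse can be sketched as nearest-codeword assignment of local descriptors against a learned codebook. The function `bow_histogram`, the toy codebook, and the Euclidean assignment are illustrative assumptions; in practice the codebook would come from clustering descriptors sampled from training videos.

```python
def bow_histogram(descriptors, codebook):
    """Assign each local descriptor to its nearest codeword (squared
    Euclidean distance) and return the normalised codeword-frequency
    histogram, which serves as the video's fixed-length feature vector.
    """
    hist = [0] * len(codebook)
    for d in descriptors:
        best = min(range(len(codebook)),
                   key=lambda k: sum((a - b) ** 2
                                     for a, b in zip(d, codebook[k])))
        hist[best] += 1
    n = len(descriptors)
    return [h / n for h in hist]
```

The resulting histograms could then feed a latent semantic analysis step or a classifier that separates legitimate responses from spam.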

17.
The human eye cannot see subtle motion signals that fall outside human visual limits, due either to the limited resolution of intensity variations or to a lack of sensitivity to lower spatial and temporal frequencies. Yet these invisible signals can be highly informative when amplified to be observable by a human operator or an automatic machine vision system. Many video magnification techniques have recently been proposed to magnify and reveal such signals in videos and image sequences. Limitations, including noise level, video quality and long execution time, are associated with the existing techniques, so there is value in developing a new magnification method with these issues as the main consideration. This study presents a new magnification method that outperforms other techniques in noise removal, video quality at large magnification factors, and execution time. The proposed method is compared with four methods: Eulerian video magnification, phase-based video magnification, Riesz pyramid for fast phase-based video magnification, and enhanced Eulerian video magnification. The experimental results demonstrate the superior performance of the proposed method on all video quality metrics used. Our method is also 60–70% faster than Eulerian video magnification, whereas the other competing methods take longer to execute than Eulerian video magnification.
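The Eulerian-style idea of amplifying small temporal variations can be sketched on a single pixel's intensity series by boosting its deviation from a moving-average baseline. This omits the spatial pyramids and temporal band-pass filters of the methods compared above and is only a conceptual illustration; the function name and parameters are assumptions.

```python
def magnify(signal, alpha=10.0, window=5):
    """Amplify small temporal variations in a 1-D intensity series.

    Each sample's deviation from a centred moving-average baseline is
    multiplied by `alpha` and added back, so tiny oscillations that
    would be invisible in the raw series become large.
    """
    half = window // 2
    out = []
    for i, x in enumerate(signal):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        baseline = sum(signal[lo:hi]) / (hi - lo)
        out.append(baseline + alpha * (x - baseline))
    return out
```

A constant signal passes through unchanged, while a small bump is scaled up by roughly the magnification factor, mirroring how subtle motion becomes visible after amplification.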

18.
19.
Multimodal detection methods are an effective means of filtering adult videos, but existing methods lack an accurate semantic representation of audio. This paper therefore proposes an adult-video detection method that fuses audio words with visual features. First, a periodicity-based segmentation algorithm for energy envelope units (EEs) is proposed, which accurately segments the audio stream into a sequence of EEs. Next, an audio semantic representation based on EEs and bag-of-words (BoW) is proposed, which describes the features of an EE as occurrence probabilities of audio words. A composite weighting method then fuses the detection results from audio words and visual features. A periodicity-based adult-video discrimination algorithm is also proposed, which works in tandem with the periodicity-based EE segmentation so that periodicity is fully exploited in detection. Experimental results show that, compared with methods based on visual features alone, the proposed method significantly improves detection performance: at a false positive rate of 9.76%, the detection rate reaches 94.44%.
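The EE segmentation idea can be illustrated as thresholding short-time frame energy and taking maximal above-threshold runs of frames. The function `energy_envelope_units`, the frame length, and the threshold are assumptions for illustration; the paper's algorithm additionally exploits periodicity, which this sketch omits.

```python
def energy_envelope_units(samples, frame_len=4, threshold=0.1):
    """Split an audio sample stream into energy-envelope units (EEs).

    The stream is cut into non-overlapping frames of `frame_len`
    samples; an EE is a maximal run of consecutive frames whose mean
    energy exceeds `threshold`. Returns (start_frame, end_frame)
    pairs with the end index exclusive.
    """
    energies = []
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[i:i + frame_len]
        energies.append(sum(s * s for s in frame) / frame_len)
    units, start = [], None
    for j, e in enumerate(energies):
        if e > threshold and start is None:
            start = j                       # EE begins
        elif e <= threshold and start is not None:
            units.append((start, j))        # EE ends
            start = None
    if start is not None:
        units.append((start, len(energies)))
    return units
```

Each detected unit would then be described by audio-word probabilities before fusion with the visual detector.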

20.
This paper tackles the problem of surveillance video content modelling. Given a set of surveillance videos, the aims of our work are twofold: first, a continuous video is segmented according to the activities captured in it; second, a model is constructed for the video content, based on which an unseen activity pattern can be recognised and any unusual activities can be detected. To segment a video based on activity, we propose a semantically meaningful video content representation method and two segmentation algorithms, one offline, offering high segmentation accuracy, and the other online, enabling real-time performance. Our video content representation method is based on automatically detected visual events (i.e. 'what is happening in the scene'). This contrasts with most previous approaches, which represent video content at the signal level using image features such as colour, motion and texture. Our segmentation algorithms are based on detecting breakpoints on a high-dimensional video content trajectory, which differs from most previous approaches based on shot change detection and shot grouping. Having segmented continuous surveillance videos based on activity, the activity patterns contained in the video segments are grouped into activity classes, and a composite video content model is constructed that is capable of generalising from a small training set to accommodate variations in unseen activity patterns. A run-time accumulative unusual activity measure is introduced to detect unusual behaviour, while usual activity patterns are recognised based on an online likelihood ratio test (LRT) method. This ensures robust and reliable activity recognition and unusual activity detection at the shortest possible time once sufficient visual evidence has become available. Comparative experiments have been carried out using over 10 h of challenging outdoor surveillance video footage to evaluate the proposed segmentation algorithms and modelling approach.
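The online likelihood ratio test for deciding between usual and unusual activity can be sketched as an accumulated log-likelihood ratio with symmetric decision thresholds, in the spirit of a sequential test. The model functions and the threshold below are illustrative assumptions, not the paper's actual activity models.

```python
import math

def online_lrt(observations, p_normal, p_unusual, decision_threshold=2.0):
    """Accumulate the log-likelihood ratio of an 'unusual' model versus
    a 'normal' model over a stream of observations, deciding as soon as
    the accumulated ratio crosses either threshold.

    Returns ('unusual' | 'normal' | 'undecided', index of the deciding
    observation), so a decision is made at the earliest possible time
    once sufficient evidence has accumulated.
    """
    llr = 0.0
    for t, x in enumerate(observations):
        llr += math.log(p_unusual(x)) - math.log(p_normal(x))
        if llr >= decision_threshold:
            return 'unusual', t
        if llr <= -decision_threshold:
            return 'normal', t
    return 'undecided', len(observations) - 1
```

With well-separated models, a single strongly informative observation can already trigger a decision, which is exactly the early-detection behaviour the abstract highlights.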

