Similar literature
 Found 19 similar documents (search time: 125 ms)
1.
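Text in video data is an important source of information for the semantic understanding and retrieval of video. This paper studies the detection, localization, extraction, enhancement, and recognition of text in video. It proposes a wavelet modulus maxima algorithm to detect where text lies in a video frame, and uses a coarse-to-fine multi-level localization method together with a pyramid model to extract static and scrolling Chinese and English text at multiple scales. Finally, the text regions are binarized. Experiments show that the method achieves good results.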

2.
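Text detection method based on fuzzy homogeneity mapping   Total citations: 2, self-citations: 0, citations by others: 2
Text in video images is highly effective information for describing video image content at the semantic level, and text detection provides the basis for semantics-based image retrieval. This paper proposes a text detection method that combines fuzzy logic with homogeneity mapping. First, the original image is fuzzified using the maximum information entropy criterion. Next, image homogeneity is constructed from edge and texture information and used to map the image into a fuzzy homogeneity space. Finally, text regions are detected in the fuzzy homogeneity space through texture analysis. Compared with text detection methods that extract features directly in the image spatial domain, this method achieves better results on video images with complex backgrounds and applies to text detection in many types of video images.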

3.
邵晨曦  李海波  王李忠 《电子技术》2009,36(11):24-24,23
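The title captions in news video contain rich semantic information and are one of the important information sources for automated video retrieval, analysis, and understanding. Based on an analysis of the characteristics of news captions, this article proposes a news caption detection method based on the Prewitt operator. The algorithm first converts the image to grayscale, then performs edge detection with the Prewitt operator, and finally detects and merges caption regions to locate the captions. Experiments on different news video frames yielded a high detection accuracy, showing that the proposed method performs the news caption detection task well.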
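
The three steps named in the abstract above (grayscale input, Prewitt edge detection, region detection and merging) can be sketched roughly as follows. This is a minimal pure-Python illustration, not the authors' implementation: the function names, the `edge_threshold` and `min_edge_fraction` parameters, and the row-density rule used to merge rows into caption bands are all assumptions introduced here.

```python
def prewitt_magnitude(gray):
    """Return the |Gx| + |Gy| Prewitt response of a 2D list of gray levels.

    Border pixels are left at 0 because the 3x3 kernel does not fit there.
    """
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = sum(gray[y + d][x + 1] - gray[y + d][x - 1] for d in (-1, 0, 1))
            # Vertical gradient: bottom row minus top row.
            gy = sum(gray[y + 1][x + d] - gray[y - 1][x + d] for d in (-1, 0, 1))
            out[y][x] = abs(gx) + abs(gy)
    return out


def caption_rows(gray, edge_threshold=100, min_edge_fraction=0.3):
    """Return inclusive (start, end) row bands that are dense in strong edges.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    edges = prewitt_magnitude(gray)
    w = len(gray[0])
    dense = [sum(v >= edge_threshold for v in row) / w >= min_edge_fraction
             for row in edges]
    bands, start = [], None
    for y, flag in enumerate(dense):
        if flag and start is None:
            start = y                      # a new band begins
        elif not flag and start is not None:
            bands.append((start, y - 1))   # merge adjacent dense rows into one band
            start = None
    if start is not None:
        bands.append((start, len(dense) - 1))
    return bands
```

Because the 3x3 kernel also responds one row above and below a text stripe, the returned bands tend to be slightly taller than the text itself; a real detector would refine them afterwards.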

4.
吴进 《电视技术》2011,35(11):118-120
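Extracting the corresponding text from film and television subtitles provides an important means of content retrieval for film and television programs. For frame data captured from video, this paper discusses methods for segmenting subtitles in single color video frames after grayscale conversion and proposes a processing scheme based on a region detection algorithm. Experimental results show that obtaining text information with this method is reasonable and effective.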

5.
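Detecting and extracting text information in video images is of great significance for understanding video images and their content. Building on existing text detection algorithms, this paper proposes a text detection algorithm that combines corner points with a BP neural network. The algorithm first applies a multi-scale corner algorithm to extract text corner information and coarsely locate text lines, then extracts text features, and finally applies a BP neural network to locate the text precisely. Experimental results show that, compared with classical methods, the algorithm has higher accuracy and robustness, with a correct detection rate of 90.3% for text in video.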

6.
《信息技术》2015,(9):118-120
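This paper designs a multi-script video subtitle overlay tool based on the open-source libraries OpenCV, FreeType, and VLC. The tool first reads the timing and text information from a subtitle file, then decodes the video frames that correspond to the timing information, and finally draws the subtitle text onto the video frames. It overcomes the limitation that OpenCV supports drawing only Western characters and not Chinese characters, and it supports drawing in multiple fonts. The tool can overlay subtitles in different fonts onto video screenshots in batches, which makes it convenient to generate large amounts of subtitle material for designing or evaluating video subtitle detection algorithms.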

7.
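Improved news caption detection algorithm based on spatio-temporal distribution features   Total citations: 2, self-citations: 0, citations by others: 2
Based on an analysis of the spatio-temporal distribution features of video captions, the method extracts caption candidate regions by inter-frame differencing, binarizes the caption regions with a bimodal threshold method, and finally segments the caption regions into rows and columns with an integral projection method, producing single-character images for an OCR system. Experiments show that the algorithm has high accuracy and can meet the need to obtain semantic information in video retrieval.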
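
The last step in the abstract above, row and column segmentation by integral projection, can be illustrated with a short sketch. It assumes the caption region has already been binarized (1 = text pixel); the function names and the `min_count` parameter are hypothetical, and the inter-frame differencing and bimodal thresholding steps are not shown.

```python
def runs(profile, min_count=1):
    """Return inclusive (start, end) index runs where the projection >= min_count."""
    spans, start = [], None
    for i, value in enumerate(profile):
        if value >= min_count and start is None:
            start = i
        elif value < min_count and start is not None:
            spans.append((start, i - 1))
            start = None
    if start is not None:
        spans.append((start, len(profile) - 1))
    return spans


def segment_characters(binary):
    """Split a binary image into character boxes (top, bottom, left, right).

    Rows are separated using the horizontal projection (sum of each row);
    inside each text row, characters are separated using the vertical
    projection (sum of each column restricted to that row band).
    """
    boxes = []
    width = len(binary[0])
    row_profile = [sum(row) for row in binary]
    for top, bottom in runs(row_profile):
        col_profile = [sum(binary[y][x] for y in range(top, bottom + 1))
                       for x in range(width)]
        for left, right in runs(col_profile):
            boxes.append((top, bottom, left, right))
    return boxes
```

Each returned box could then be cropped and passed to an OCR engine. Touching characters would need a more careful split than the simple zero-gap rule assumed here.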

8.
《现代电子技术》2014,(9):50-52
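A video image caption segmentation algorithm based on the CDF9-7 wavelet and an adaptive Otsu algorithm is proposed. First, images are captured from the video and preprocessed, for example by grayscale conversion. Next, a CDF9-7 wavelet transform is applied to the preprocessed image to obtain its horizontal and vertical high-frequency component HH. The adaptive Otsu algorithm is then used to segment the caption regions in this high-frequency component image. Experiments show that the algorithm segments well and has a degree of robustness and adaptivity. These regions can serve as the basis for further image analysis, understanding, and recognition operations such as video caption recognition.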
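
For readers unfamiliar with Otsu thresholding, the sketch below shows the standard global form: choose the threshold that maximizes the between-class variance of the gray-level histogram. It is not the adaptive variant the abstract refers to, whose details are not given here, and it omits the CDF9-7 wavelet step. The function names and the flat pixel-list input are assumptions made for illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    `pixels` is a flat sequence of integer gray levels in [0, levels).
    Pixels <= t form one class and pixels > t form the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0                      # pixel count and gray-level sum of class 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue                     # class 0 is still empty
        if w1 == 0:
            break                        # class 1 is empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


def binarize(pixels, threshold):
    """Map pixels above the threshold to 1 and the rest to 0."""
    return [1 if p > threshold else 0 for p in pixels]
```

In a pipeline like the one described, the input would be the (quantized) high-frequency coefficient image rather than raw gray levels.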

9.
任通 《电视技术》2014,38(5):190-193
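Caption segmentation means segmenting a detected and localized video caption image so that character pixels are separated from background pixels, binarizing it into a caption image that OCR software can recognize. To overcome the over-segmentation and under-segmentation that readily occur in caption image segmentation, a caption image segmentation algorithm is proposed that decides based on the ratio of the increase in the number of "white pixels" in the caption region to that in an expanded surrounding region. The algorithm gradually changes the segmentation threshold and uses an analysis of the segmentation result as feedback to judge the quality of the current segmentation, thereby determining the optimal threshold. Extensive results show that the algorithm performs well and that its segmentation is far better than that of classical algorithms such as the traditional OTSU algorithm and K-means clustering.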

10.
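New video caption extraction algorithm based on fast 8-connected component labeling   Total citations: 1, self-citations: 1, citations by others: 0
A new video caption extraction method is proposed based on a color Roberts edge operator, morphology, and fast 8-connected region labeling. First, the color edge operator extracts the edge information of the captions. Morphological processing then produces connected caption regions. Finally, a new fast 8-connected region labeling method is used for connected-region analysis to obtain the final caption regions. Experiments show that the algorithm locates caption regions quickly, with high localization accuracy and good robustness.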
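
The connected-region analysis step above can be illustrated with a plain 8-connected labeling pass. The sketch uses a breadth-first flood fill, which is a textbook approach and not the paper's fast labeling method; the function name and the bounding-box output format are assumptions introduced here.

```python
from collections import deque


def label_8_connected(binary):
    """Label 8-connected foreground regions of a binary image.

    Returns (labels, boxes): `labels` has the same shape as `binary`, with 0
    for background and 1..n for regions; `boxes[k - 1]` is the bounding box
    (top, bottom, left, right) of region k.
    """
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or labels[y][x]:
                continue
            label = len(boxes) + 1
            labels[y][x] = label
            queue = deque([(y, x)])
            top, bottom, left, right = y, y, x, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                # Visit all 8 neighbours, diagonals included.
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = label
                            queue.append((ny, nx))
            boxes.append((top, bottom, left, right))
    return labels, boxes
```

Candidate caption regions could then be filtered by box size or aspect ratio, although the abstract does not say which criteria the authors use.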

11.
This paper presents a system for providing interactive broadcast services for live soccer video that is based on instant semantics acquisition. Currently, we have implemented two such interactive services: live event alert and on-the-fly language selection. The live event alert service has a small time lag of about 30 s for a short video clip to reach its final viewer and at most 1.5 min for a long clip of the live event. The on-the-fly language selection service allows users to choose their preferred contents and preferred language. The motivation for this work is that such interactive services will greatly increase the value of live soccer video. Currently, similar systems attempt to derive the semantics of a soccer game from a gamelog in freestyle text format and from low-level features of the video, which is a challenging task. In this paper, we tackle this challenge with a combination of a gamelog input tool and a targeted algorithm, both proposed in this paper. Our system is powered by our proposed semantic gamelog input tool, which facilitates fast and accurate input of a semantic gamelog that contains basic semantic information about atomic events. When an interesting event occurs, our system performs boundary detection of the event by combining features extracted from the video with additional information from the semantic gamelog. This additional information enables our system to achieve accurate and very fast boundary detection of these events to support our live event alert service. Our system also implements a gamelog translation machine, which translates the semantic gamelog (encoded in a game-specific code) into any natural language, provided that there is a configuration file for that language. Combining our gamelog translation machine with existing text-to-speech technology, we provide the on-the-fly language selection service. (Currently, our system supports English, Chinese, and Malay.)

12.
褚晶辉  董越  吕卫 《电视技术》2014,38(3):188-191
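The text contained in a video correlates strongly with the video's semantic content. Extracting that text for analysis and processing makes it possible to understand the semantics of television video effectively and thereby to monitor video content for safety. For text detection, an effective method based on the wavelet transform, corner feature images, and statistical features is proposed, and a color-space-based text extraction method is used to obtain binary images, which better supports subsequent OCR text recognition.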

13.
Overlay text provides important semantic clues in video content analysis tasks such as video information retrieval and summarization, since the content of the scene or the editor's intention can be well represented by inserted text. Most previous approaches to extracting overlay text from videos are based on low-level features, such as edge, color, and texture information. However, existing methods have difficulty handling text with varying contrast or text inserted in a complex background. In this paper, we propose a novel framework to detect and extract the overlay text from the video scene. Based on our observation that there exist transient colors between inserted text and its adjacent background, a transition map is first generated. Then candidate regions are extracted by a reshaping method, and the overlay text regions are determined based on the occurrence of overlay text in each candidate. The detected overlay text regions are localized accurately using the projection of overlay text pixels in the transition map, and the text extraction is finally conducted. The proposed method is robust to different character sizes, positions, contrasts, and colors. It is also language independent. Overlay text region update between frames is also employed to reduce the processing time. Experiments are performed on diverse videos to confirm the efficiency of the proposed method.

14.
As social networks become more popular day by day, large numbers of users are becoming constantly active on them, and these users produce a huge amount of data. While social networking sites and their dynamic applications are actively used by people, social network analysis is also receiving increasing interest. Moreover, semantic understanding of the text, images, and video shared in a social network has become a significant topic in network analysis research. To the best of the author's knowledge, there has not been any comprehensive survey of social networks that includes semantic analysis. In this survey, we have reviewed over 200 contributions in the field, most of which appeared in recent years. This paper not only aims to provide a comprehensive survey of the research and application of social network analysis based on semantic analysis but also summarizes the state-of-the-art techniques for analyzing social media data. First, social networks and the basic concepts and components related to social network analysis are examined. Second, semantic analysis methods for text, image, and video in social networks are explained, and various studies on these topics in the literature are examined. Then, the emerging approaches in social network analysis research, especially in semantic social network analysis, are discussed. Finally, the trending topics and applications for future research directions are emphasized, and information is given on what kinds of studies may be carried out in this area.

15.
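Among the data carriers handled in data processing, video is the most comprehensive and has the broadest content. A video's title is expressed in text and its content in consecutive image frames; some videos also contain background music or voice-over narration. Video processing is therefore multimodal processing of text, images, and sound. Focusing on multimodal processing techniques, this paper proposes a video multimodal content analysis framework based on a unified representation in semantic space. Deep neural networks with several architectures process the video's text, images, and audio separately. To obtain a unified result, the differently structured deep neural networks are mapped into a semantic space, where integrated cognition is performed. The proposed architecture is clear and well layered and offers guidance for modeling video understanding.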

16.
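Based on the storage structure of high-speed television image files and the real-time motion data of a tracking turntable, measurement software was developed in the VC 6.0 environment. It solves two problems encountered in practical engineering applications when fusing and replaying image information together with real-time tracking turntable information: reading and writing large (4G) files, and time synchronization when fusing images with real-time data. A general method for fusing and replaying high-speed television images with real-time information is proposed, and it has achieved good results in engineering applications.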

17.
Among the human users of the Internet of Things, the hearing-impaired are a special group for whom normal forms of information expression, such as voice and video, are inaccessible, and most of them have some difficulty understanding information in text form. The hearing-impaired are accustomed to receiving information expressed in sign language. To address this situation, this paper proposes a new form of information expression for the Internet of Things that is oriented toward the hearing-impaired and based on sign language video synthesis. Under the sign synthesis framework, three modules are necessary: constructing a database, searching for appropriate sign language video units and transition units, and generating interpolated frames. With this method, text information can be transformed into sign language expression for the hearing-impaired.

18.
李妍 《移动信息》2024,46(2):216-219
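Text analysis is an important task in natural language processing. Its significance lies in dividing large volumes of text data into categories so that information can be better understood and managed. Text analysis is used very widely, for example in spam filtering, sentiment analysis, and news classification, and it has an important influence on information organization and retrieval. However, text analysis faces challenges such as the high dimensionality of text data, semantic complexity, and insufficient labeled data. To address these problems, this paper studies in depth the application of machine learning techniques to text analysis, with the aim of improving the performance and efficiency of text classification.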

19.
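To address the scarcity of training data and the problem of text representation in graph-based classification algorithms, this paper proposes LSASGT, a text classification method based on latent semantic analysis and a transductive spectral graph algorithm. The method combines these two techniques, both grounded in spectral analysis theory, builds a unified model over all training and test data, and mines the multiple kinds of latent structural information in the data. LSASGT introduces latent semantic analysis to construct a graph representation model of text, describing the semantic relatedness between texts in a latent semantic feature space that can reflect human classification criteria. On top of this text representation, it classifies text with a semi-supervised transductive spectral graph algorithm. Experimental results on the benchmark English text classification dataset Reuters21578 and the Chinese text classification dataset Tan-Corp show that the LSASGT method achieves good classification results.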
