A Fast Caption Text Detection Algorithm on MPEG Compressed Video and Its Application

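Cite this article:	WANG Wei-Qiang, GAO Wen. A Fast Caption Text Detection Algorithm on MPEG Compressed Video and Its Application[J]. Chinese Journal of Computers, 2001, 24(6): 620-626.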

Authors:	WANG Wei-Qiang, GAO Wen

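Affiliations:	1. Institute of Computing Technology, Chinese Academy of Sciences 2. Department of Computer Science and Engineering, Harbin Institute of Technology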

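Funding:	Key Project of the National Natural Science Foundation of China (69789301); National "863" High-Tech Research and Development Program (863-306-ZT03-01-2); Hundred Talents Program of the Chinese Academy of Sciences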

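Abstract:	This paper proposes a fast, model-based algorithm for detecting caption text in MPEG compressed streams. Analysis of the caption overlay model shows that, within the caption text region, each component value of the video stream falls within a specific range. Based on this property, the paper gives a fast algorithm that detects captions in the compressed domain using statistical features of the chrominance components, and discusses how to speed up the algorithm and how to build the model. The technique was successfully applied to the automatic creation of pictorial catalogues, letting users quickly browse a full day of news programming through a very small number of pictures. Experimental results show that the algorithm achieves an accuracy of 96.6% and a recall of 100%, with a detection speed faster than real time.
Keywords:	video coding; MPEG; compressed domain; caption text detection algorithm; video retrieval
Revised manuscript received:	May 25, 2000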
A Fast Caption Text Detection Algorithm on MPEG Compressed Video and Its Application

WANG Wei-Qiang, GAO Wen. A Fast Caption Text Detection Algorithm on MPEG Compressed Video and Its Application[J]. Chinese Journal of Computers, 2001, 24(6): 620-626.

Authors:	WANG Wei-Qiang, GAO Wen

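Affiliation:	WANG Wei-Qiang 1), GAO Wen 1),2); 1) Institute of Computing Technology, Chinese Academy of Sciences; 2) Department of Computer Science and Engineering, Harbin Institute of Technology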

Abstract:	In the field of content-based visual information retrieval (CBVIR), automatic detection of high-level visual features is a significant topic. Text present in video frames, especially captions, plays an important role in understanding video content. In this paper, we present a fast algorithm that automatically detects caption text in MPEG compressed video. Analysis of the model used to overlay captions on video frames shows that the component values of each pixel in the caption region fall within a specific range. Based on this observation, our approach quickly detects captions in the compressed domain by exploiting statistical features of the caption text's chrominance components. First, the approach extracts the DC image of the current frame. Then the Cb and Cr values of the pixels in the caption region are checked to determine whether they agree with the model. Lastly, consistency verification is applied to eliminate noise. Besides the details of the algorithm, we discuss a mechanism to speed up the detection process. In this mechanism, frames are sampled at a coarser granularity, so that fewer frames need to be checked. We formulate a method to choose an appropriate granularity, so that the system neither misses any caption nor mistakes two captions for a single one. Before the algorithm can run, a model must be constructed to characterize the distribution of the caption text's chrominance component values, so the paper also describes how to construct the model semi-automatically. We successfully applied the algorithm to automatically generate pictorial catalogues, which summarize the essence of news content with a small number of keyframes containing captions. In the evaluation experiments, the algorithm achieves not only an accuracy of 96.6% and a recall of 100%, but also a detection speed faster than real time, which makes it very attractive.
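The three detection steps and the sampling idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the model's form, the decision rule, or the granularity formula, so `ChromaModel`, its inclusive Cb/Cr ranges, the `min_ratio` threshold, the `min_run` persistence rule, and the `max_sampling_step` bound are all hypothetical stand-ins. DC image extraction from the MPEG stream is assumed to have been done already, with each chroma plane given as a row-major list of rows.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Plane = Sequence[Sequence[int]]  # one chroma plane of a DC image, row-major


@dataclass(frozen=True)
class ChromaModel:
    """Hypothetical caption model (assumption, not taken from the paper)."""
    cb_range: Tuple[int, int]  # inclusive Cb range expected for caption text
    cr_range: Tuple[int, int]  # inclusive Cr range expected for caption text
    min_ratio: float           # fraction of region pixels that must match


def matches_model(cb: Plane, cr: Plane,
                  region: Tuple[int, int, int, int],
                  model: ChromaModel) -> bool:
    """Check whether enough DC pixels in region (top, left, bottom, right)
    have Cb and Cr values inside the model ranges. bottom/right are exclusive."""
    top, left, bottom, right = region
    hits = 0
    total = 0
    for y in range(top, bottom):
        for x in range(left, right):
            total += 1
            cb_ok = model.cb_range[0] <= cb[y][x] <= model.cb_range[1]
            cr_ok = model.cr_range[0] <= cr[y][x] <= model.cr_range[1]
            if cb_ok and cr_ok:
                hits += 1
    return total > 0 and hits / total >= model.min_ratio


def verify_consistency(flags: Sequence[bool], min_run: int) -> List[bool]:
    """Assumed noise filter: keep a detection only if it persists for at
    least min_run consecutive sampled frames."""
    out = [False] * len(flags)
    start = None
    for i, flag in enumerate(list(flags) + [False]):  # sentinel closes last run
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_run:
                for j in range(start, i):
                    out[j] = True
            start = None
    return out


def max_sampling_step(min_caption_frames: int, min_gap_frames: int) -> int:
    """Assumed bound on the sampling step: at least one sample must fall
    inside the shortest caption (so none is missed) and inside the shortest
    gap between captions (so two captions are not merged)."""
    return max(1, min(min_caption_frames, min_gap_frames))
```

For example, with a model of `cb_range=(90, 110)`, `cr_range=(140, 160)`, and `min_ratio=0.8`, a region whose DC pixels all have Cb = 100 and Cr = 150 is accepted, while a region of neutral chroma (Cb = Cr = 128) is rejected. A single isolated positive frame is then dropped by `verify_consistency` when `min_run` is 2.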

Keywords:	content-based visual information retrieval; caption text; index of news video
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.