Two-stage Method for Video Caption Detection and Extraction
Cite this article: WANG Zhi-hui, LI Jia-tong, XIE Si-yan, ZHOU Jia, LI Hao-jie, FAN Xin. Two-stage Method for Video Caption Detection and Extraction[J]. Computer Science, 2018, 45(8): 50-53, 62.
Authors: WANG Zhi-hui, LI Jia-tong, XIE Si-yan, ZHOU Jia, LI Hao-jie, FAN Xin
Affiliation: Department of International Information and Software Technology, Dalian University of Technology, Dalian, Liaoning 116621, China (WANG Zhi-hui, LI Hao-jie, FAN Xin); Department of Software Technology, Dalian University of Technology, Dalian, Liaoning 116621, China (LI Jia-tong, XIE Si-yan, ZHOU Jia)
Fund program: Supported by the National Natural Science Foundation of China (61472059, 61772108).
Abstract: Video caption detection and extraction is one of the key technologies for video understanding. This paper proposes a two-stage caption detection and extraction algorithm that detects caption frames and caption regions separately, thereby improving detection efficiency and accuracy. The first stage detects caption frames: motion detection is first performed with an inter-frame difference algorithm to make an initial judgment on captions and obtain a binarized image sequence; the sequence is then screened a second time according to the dynamic characteristics of ordinary and scrolling captions to obtain the caption frames. The second stage detects and extracts caption regions from the caption frames: text regions are first located with the Sobel edge detection algorithm; the background is then eliminated with constraints such as height, and vertical and horizontal captions are distinguished by aspect ratio, yielding all captions in the caption frames, namely static, ordinary and scrolling captions. The method reduces the number of frames that must be examined and improves caption detection efficiency by about 11%. Comparative experiments show that it improves the F score by about 9% over using the inter-frame difference or edge detection alone.
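The abstract describes the first stage only in prose. The following minimal sketch, which is not part of the paper, illustrates the idea under the assumption of a plain OpenCV/NumPy pipeline; the difference threshold, the changed-pixel-ratio bounds that stand in for the paper's dynamic-characteristic screening, and the function name caption_frame_candidates are illustrative choices rather than the authors' implementation.

import cv2
import numpy as np

# Hypothetical parameter values -- the paper does not publish the thresholds it uses.
DIFF_THRESH = 25           # per-pixel gray-level difference threshold for binarization
MIN_CHANGED_RATIO = 0.002  # below this, nothing caption-like has changed
MAX_CHANGED_RATIO = 0.20   # above this, the change is treated as scene motion, not a caption

def caption_frame_candidates(video_path):
    """Stage 1 (sketch): inter-frame difference -> binarized motion mask ->
    keep frames whose amount of change is consistent with an appearing
    (ordinary) or moving (scrolling) caption."""
    cap = cv2.VideoCapture(video_path)
    ok, prev = cap.read()
    if not ok:
        return []
    prev = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
    candidates = []
    index = 1
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        diff = cv2.absdiff(gray, prev)                        # inter-frame difference
        _, mask = cv2.threshold(diff, DIFF_THRESH, 255, cv2.THRESH_BINARY)
        changed = np.count_nonzero(mask) / mask.size          # fraction of changed pixels
        if MIN_CHANGED_RATIO < changed < MAX_CHANGED_RATIO:   # second screening (simplified)
            candidates.append((index, frame, mask))
        prev = gray
        index += 1
    cap.release()
    return candidates

In this sketch the second screening is reduced to a changed-pixel-ratio test; the paper's actual screening separates ordinary from scrolling captions by their dynamic characteristics, which would additionally require tracking how the changed region evolves across consecutive frames.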

Key words: Video caption; Detection and extraction; Inter-frame difference; Dynamic characteristics; Sobel edge detection
Received: 2017-10-24
Revised: 2017-12-11

Two-stage Method for Video Caption Detection and Extraction
WANG Zhi-hui, LI Jia-tong, XIE Si-yan, ZHOU Jia, LI Hao-jie, FAN Xin. Two-stage Method for Video Caption Detection and Extraction[J]. Computer Science, 2018, 45(8): 50-53, 62.
Authors: WANG Zhi-hui, LI Jia-tong, XIE Si-yan, ZHOU Jia, LI Hao-jie, FAN Xin
Affiliation: Department of International Information and Software Technology, Dalian University of Technology, Dalian, Liaoning 116621, China (WANG Zhi-hui, LI Hao-jie, FAN Xin); Department of Software Technology, Dalian University of Technology, Dalian, Liaoning 116621, China (LI Jia-tong, XIE Si-yan, ZHOU Jia)
Abstract: Video caption detection and extraction is one of the key technologies for video understanding. This paper proposes a two-stage approach that divides the process into caption frame detection and caption area detection, improving both detection efficiency and accuracy. In the first stage, caption frames are detected and extracted. First, motion detection is performed with the gray correlation frame difference, an initial judgment on captions is made, and a new binary image sequence is obtained. Then, according to the dynamic characteristics of ordinary captions and scrolling captions, the sequence is screened a second time to obtain the caption frames. In the second stage, caption areas are detected and extracted. First, the Sobel edge detection algorithm is used to locate candidate caption regions, and the background is eliminated with a height constraint. Then, vertical and horizontal captions are distinguished by their aspect ratios, and all captions in the caption frame are obtained, including static captions, ordinary captions and scrolling captions. This method reduces the number of frames that need to be examined and improves caption detection efficiency by about 11%. The experimental results show that the proposed method improves the F score by about 9% compared with methods that use the gray correlation frame difference or edge detection alone.
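No reference implementation accompanies the abstract, so the second stage is likewise only sketched below under the assumption of an OpenCV 4.x pipeline; the structuring-element size, the size limits standing in for the paper's height constraint, and the aspect-ratio cut-offs are illustrative values rather than the authors' settings.

import cv2

# Hypothetical geometric constraints -- the paper does not publish its exact values.
MIN_SIDE = 12            # candidate boxes thinner than this are treated as noise
MAX_HEIGHT_RATIO = 0.9   # boxes nearly as tall as the frame are treated as background
ASPECT_HORIZONTAL = 2.0  # width/height above this -> horizontal caption
ASPECT_VERTICAL = 0.5    # width/height below this -> vertical caption

def detect_caption_regions(frame):
    """Stage 2 (sketch): Sobel edge map -> merge edges into candidate boxes ->
    reject background with size constraints -> classify boxes by aspect ratio."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Sobel gradients in both directions; text regions have dense, strong edges.
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
    edges = cv2.convertScaleAbs(cv2.magnitude(gx, gy))
    _, edge_bin = cv2.threshold(edges, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    # Dilate so the edges of adjacent characters merge into one text block.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9, 9))
    merged = cv2.dilate(edge_bin, kernel)
    contours, _ = cv2.findContours(merged, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    regions = []
    for contour in contours:
        x, y, w, h = cv2.boundingRect(contour)
        # Size constraints (standing in for the paper's height constraint) drop background blobs.
        if min(w, h) < MIN_SIDE or h > MAX_HEIGHT_RATIO * frame.shape[0]:
            continue
        aspect = w / float(h)
        if aspect >= ASPECT_HORIZONTAL:
            regions.append(("horizontal", (x, y, w, h)))
        elif aspect <= ASPECT_VERTICAL:
            regions.append(("vertical", (x, y, w, h)))
    return regions

In a complete pipeline this function would run only on the caption frames returned by the first-stage screening, which is the source of the reported efficiency gain of roughly 11%.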
Keywords: Video caption; Detection and extraction; Gray correlation frame difference; Dynamic characteristics; Sobel edge detection