Mosaicing-by-recognition for video-based text recognition

Authors:	Seiichi Uchida, Hiromitsu Miyazaki, Hiroaki Sakoe

Affiliation:	1. NTT Communication Science Laboratories, NTT Corporation, 3-1 Morinosato-Wakamiya, Atsugi-shi, Kanagawa 243-0198, Japan;
2. Faculty of Information Science and Electrical Engineering, Kyushu University, Fukuoka-shi, Fukuoka 819-0395, Japan;
3. NTT Communication Science Laboratories, NTT Corporation, Seika-cho, Kyoto 619-0237, Japan;
1. Department of Computer Science and Technology, Ocean University of China, Qingdao 266100, China;
2. Synchromedia Laboratory for Multimedia Communication in Telepresence, École de Technologie Supérieure, Montréal H3C 1K3, Canada;
1. Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia;
2. Institute of Mathematical Sciences, University of Malaya, Kuala Lumpur, Malaysia;
3. Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, India;
4. National Key Lab for Novel Software Technology, Nanjing University, Nanjing, China;
1. University of Illinois at Urbana-Champaign, 201 N. Goodwin Avenue, Urbana, IL 61801, USA;
2. Inha University, 1103 High-tech Center, Yonghyun-dong 253, Nam-gu, Incheon, Republic of Korea;
3. Samsung Research America - Silicon Valley, 75 West Plumeria Drive, San Jose, CA 95134, USA

Abstract:	Recognizing text captured in multiple frames by a hand-held video camera is a challenging task with a clear payoff: a longer line of text can be captured and recognized, and the quality of the text image can be improved by exploiting the redundancy of the overlapping areas between frames. For this task, the video frames must be registered, i.e., mosaiced, after compensating for the distortions caused by camera shake. This paper proposes a mosaicing-by-recognition technique in which video mosaicing and text recognition are formulated as a single unified optimization problem and solved simultaneously and collaboratively by a dynamic programming-based optimization algorithm. Experimental results indicate that, even when the frames undergo various distortions such as rotation, scaling, translation, and nonlinear fluctuation in the speed of camera movement, the proposed technique produces a fine mosaic image through accurate distortion estimation (around 90% of perfect estimation) and achieves a character recognition accuracy of over 95%.

Keywords:
This article is indexed in ScienceDirect and other databases.