Recognizing text captured across multiple frames by a hand-held video camera is a challenging task, but it makes it possible to capture and recognize a longer line of text while improving the quality of the text image by utilizing the redundancy of the overlapping areas between the frames. For this task, the video frames should be registered, i.e., mosaiced, after compensating for their distortions due to camera shake. In this paper, a mosaicing-by-recognition technique is proposed in which the problems of video mosaicing and text recognition are formulated as a unified optimization problem and solved simultaneously and collaboratively by a dynamic programming-based optimization algorithm. Experimental results indicate that, even if the frames undergo various distortions such as rotation, scaling, translation, and nonlinear speed fluctuation of camera movement, the proposed technique produces a fine mosaic image through accurate distortion estimation (around 90% of perfect estimation) and achieves a character recognition accuracy of over 95%.

This paper describes an outdoor positioning system for vehicles that can be applied to an urban canyon by using an omnidirectional infrared (IR) camera and a digital surface model (DSM). By means of omnidirectional IR images, this system enables robust positioning in urban areas where satellite invisibility caused by buildings hampers high-precision GPS measurements. The omnidirectional IR camera can generate IR images covering elevations of 20–70° over the full 360° of the surrounding area. The image captured by the camera is highly robust to light disturbances in the outdoor environment. Through the IR camera, the sky appears distinctively dark, whereas buildings appear white, because of the difference in the atmospheric transmittance rate between visible light and IR rays; this enables easy detection of the border between the sky and the buildings. The omnidirectional image, which includes several building profiles, is compared with building-restoration images produced by the corresponding DSM in order to determine the self-position. Field experiments in an urban area show that the proposed outdoor positioning method is valid and effective, even if high-rise buildings cause satellite blockage that affects GPS measurements.

Multimedia Tools and Applications - Localization of text in camera-captured images with complex backgrounds is nowadays a growing demand of modern IT-enabled services. Most of the current text...

Cheap, ubiquitous, high-resolution digital cameras have led to opportunities that demand camera-based text understanding, such as wearable computing or assistive technology. Perspective distortion is one of the main challenges for text recognition in camera captured images since the camera may often not have a fronto-parallel view of the text. We present a method for perspective recovery of text in natural scenes, where text can appear as isolated words, short sentences or small paragraphs (as found on posters, billboards, shop and street signs etc.). It relies on the geometry of the characters themselves to estimate a rectifying homography for every line of text, irrespective of the view of the text over a large range of orientations. The horizontal perspective foreshortening is corrected by fitting two lines to the top and bottom of the text, while the vertical perspective foreshortening and shearing are estimated by performing a linear regression on the shear variation of the individual characters within the text line. The proposed method is efficient and fast. We present comparative results with improved recognition accuracy against the current state-of-the-art.
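
The line-fitting step mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names `fit_line` and `intersect` and the sample points are hypothetical, and it only shows how two least-squares lines fitted to the top and bottom of a text line meet at a point that could serve as a horizontal vanishing point.

```python
def fit_line(points):
    """Least-squares fit of y = m*x + c to a list of (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    m = sxy / sxx  # assumes the points do not all share one x value
    return m, mean_y - m * mean_x


def intersect(line_a, line_b):
    """Intersection of two (m, c) lines, or None if they are parallel."""
    m1, c1 = line_a
    m2, c2 = line_b
    if abs(m1 - m2) < 1e-12:
        return None  # parallel lines: no horizontal foreshortening
    x = (c2 - c1) / (m1 - m2)
    return x, m1 * x + c1


# Hypothetical samples along the top and bottom of one text line
# (image coordinates, y grows downward).
top_points = [(0, 10.0), (50, 12.5), (100, 15.0), (150, 17.5)]
bottom_points = [(0, 40.0), (50, 40.0), (100, 40.0), (150, 40.0)]

top = fit_line(top_points)
bottom = fit_line(bottom_points)
vanishing_point = intersect(top, bottom)
```

A full rectifying homography would also need the vertical foreshortening and shear estimate that the abstract describes; this sketch covers only the horizontal part.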

Document recognition is a lively research area with much effort concentrated on optical character recognition. Less attention is paid to locating and extracting text from the general (non-desktop, non-scanner) environment. Such contact-free extraction of text from a general scene has applications in the context of wearable computing, robotic vision, point-and-click document capture, or as an aid for visually handicapped people. Here, a novel automatic text reading system is introduced using an active camera focused on text regions already located in the scene (using our recent work). Initially, a located region of text is analysed to determine the optimal zoom that would foveate onto it. Then a number of images are captured over the text region to construct a high-resolution mosaic composite of the whole region. This magnified image of the text is suitable for reading by humans, for recognition by OCR, or even for text-to-speech synthesis. Although we employed a low-resolution camera, we still obtained very good results. Correspondence and offprint requests to: Dr M. Mirmehdi, Department of Computer Science, University of Bristol, Bristol BS8 1UB, UK. Email: majid@cs.bris.ac.uk

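Forest fire detection has long been an important research topic worldwide, with great significance for protecting the global environment and human safety. Fire detection based on video surveillance places high demands on real-time performance and correctness, which sensor-based detectors and conventional image-based detectors cannot meet. A new image-based fire detection algorithm based on dynamic texture feature analysis is proposed, which is especially suitable for complex large-space scenes such as forests. From the video images captured by a CCD camera, a linear dynamical system (LDS) model is built and its dynamic texture features are analysed; finally, an Adaboost classifier is used to decide whether a fire is present. Experimental results show that the algorithm achieves a detection accuracy of 95% and has good application prospects.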

Documents may be captured at any orientation when viewed with a hand-held camera. Here, a method of recovering fronto-parallel views of perspectively skewed text documents in single images is presented, useful for ‘point-and-click’ scanning or when generally seeking regions of text in a scene. We introduce a novel extension to the commonly used 2D projection profiles in document recognition to locate the horizontal vanishing point of the text plane. Following further analysis, we segment the lines of text to determine the style of justification of the paragraphs. The change in line spacings exhibited due to perspective is then used to locate the document's vertical vanishing point. No knowledge of the camera focal length is assumed. Using the vanishing points, a fronto-parallel view is recovered which is then suitable for OCR or other high-level recognition. We provide results demonstrating the algorithm's performance on documents over a wide range of orientations.
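
For readers unfamiliar with projection profiles, the following is a minimal sketch of the conventional version that the abstract above extends. It is not the paper's vanishing-point method. The function names `row_profile` and `segment_lines` and the tiny binary image are hypothetical; the sketch only sums ink per row and splits the page into text lines wherever the profile drops to zero.

```python
def row_profile(binary_image):
    """Horizontal projection profile: number of ink pixels in each row."""
    return [sum(row) for row in binary_image]


def segment_lines(profile, min_ink=1):
    """Return (first_row, last_row) pairs for runs of rows with ink."""
    lines, start = [], None
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y                      # a text line begins
        elif ink < min_ink and start is not None:
            lines.append((start, y - 1))   # the text line ended on the previous row
            start = None
    if start is not None:                  # a line touching the bottom edge
        lines.append((start, len(profile) - 1))
    return lines


# Hypothetical 6x4 binary page: 1 = ink, 0 = background.
page = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

profile = row_profile(page)
text_lines = segment_lines(profile)
```

The row ranges found this way give line positions, from which line spacings of the kind the abstract analyses could be measured.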

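Document images captured by hand-held cameras exhibit varying degrees of lens distortion. Based on the text-line information in the document image, a lens correction algorithm based on mathematical morphology is proposed. First, the document image is segmented using an adaptive thresholding method, and connected components are clustered into text lines by a morphological closing operation. Then a quadratic polynomial model is used to fit the centre line of each text line, and an objective function for radial distortion correction is established. This objective function maps the curves corresponding to the centre lines to straight lines, which yields the lens distortion parameters of the document image. Experimental results show that the correction algorithm can effectively correct radial distortion of various degrees in document images.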
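
The quadratic centre-line fit described in the abstract above can be sketched as follows. This is an illustrative least-squares fit only, not the paper's radial-distortion objective. The function name `fit_quadratic` and the sample points are hypothetical. The fitted coefficient `a` indicates how strongly a centre line is bent, which is the quantity a correction would try to drive towards zero.

```python
def fit_quadratic(points):
    """Least-squares fit of y = a*x^2 + b*x + c via the normal equations.

    Needs at least three points with distinct x values.
    """
    # Power sums of x (orders 0..4) and of y * x^k (orders 0..2).
    s = [sum(x ** k for x, _ in points) for k in range(5)]
    t = [sum(y * x ** k for x, y in points) for k in range(3)]
    # Augmented 3x4 system for the unknowns (a, b, c).
    rows = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(3):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    return tuple(rows[i][3] / rows[i][i] for i in range(3))


# Hypothetical centre-line samples lying on y = 0.01*x^2 - 0.5*x + 20.
centre_line = [(x, 0.01 * x * x - 0.5 * x + 20.0) for x in (-20, -10, 0, 10, 20)]
a, b, c = fit_quadratic(centre_line)
```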

This paper presents a survey of the latest methods for moving object detection in video sequences captured by a moving camera. Although many excellent works have reviewed methods of object detection and background subtraction for a fixed camera, no survey presents a complete review of the existing methods for the moving-camera case. Most methods in this field can be classified into four categories: modeling-based background subtraction, trajectory classification, low-rank and sparse matrix decomposition, and object tracking. We discuss each category in detail and present the main methods that proposed improvements to the general concept of each technique. We also present challenges and main concerns in this field, as well as performance metrics and some benchmark databases available for evaluating the performance of different moving object detection algorithms.


Analyzing videos and images captured by unmanned aerial vehicles or aerial drones is an emerging application attracting significant attention from researchers in various areas of computer vision. Currently, the major challenge is the development of autonomous operations to complete missions and replace human operators. In this paper, we review these applications by categorizing them into three groups according to how the videos and images captured by drones are analyzed in computer vision. The first group is related to remote sensing, with challenges such as camera calibration, image matching, and aerial triangulation. The second group is related to autonomous drone navigation, in which computer vision methods are designed to address challenges such as flight control, visual localization and mapping, and target tracking and obstacle detection. The third group is dedicated to using images and videos captured by drones in various applications, such as surveillance, agriculture and forestry, animal detection, disaster detection, and face recognition. Most of the computer vision methods in these three categories have been designed for real-world conditions, yet providing real conditions based on drones is impossible; we therefore aim to explore papers that provide a database for these purposes. For the first two groups, some recent survey papers exist, but they have not aimed to explore any databases. This paper presents a complete review of the databases in the first two groups and of the works that used those databases to apply their methods. Vision-based intelligent applications and their databases are explored in the third group, and we discuss open problems and avenues for future research.


