Similar literature
Found 20 similar documents; search time: 31 ms
1.
Automatic text segmentation and text recognition for video indexing   Total citations: 13, self-citations: 0, citations by others: 13
Efficient indexing and retrieval of digital video is an important function of video databases. One powerful index for retrieval is the text appearing in the videos. It enables content-based browsing. We present our new methods for automatic segmentation of text in digital videos. The algorithms we propose make use of typical characteristics of text in videos in order to enable and enhance segmentation performance. The unique features of our approach are the tracking of characters and words over their complete duration of occurrence in a video and the integration of the multiple bitmaps of a character over time into a single bitmap. The output of the text segmentation step is then directly passed to a standard OCR software package in order to translate the segmented text into ASCII. Also, a straightforward indexing and retrieval scheme is introduced. It is used in the experiments to demonstrate that the proposed text segmentation algorithms together with existing text recognition algorithms are suitable for indexing and retrieval of relevant video sequences in and from a video database. Our experimental results are very encouraging and suggest that these algorithms can be used in video retrieval applications as well as to recognize higher level semantics in videos.
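The abstract does not say how the multiple bitmaps of a tracked character are combined. As a rough sketch only, one plausible approach is a per-pixel majority vote over the aligned binary bitmaps. The function name `integrate_bitmaps` and the voting rule are assumptions for illustration, not the authors' method.

```python
def integrate_bitmaps(bitmaps):
    """Combine aligned binary bitmaps (lists of rows of 0/1) of one character.

    Assumption: a pixel is kept as text (1) when it is set in more than
    half of the frames; the source does not specify the actual rule.
    """
    n = len(bitmaps)
    height, width = len(bitmaps[0]), len(bitmaps[0][0])
    return [
        [1 if 2 * sum(b[r][c] for b in bitmaps) > n else 0 for c in range(width)]
        for r in range(height)
    ]


frames = [
    [[1, 0], [0, 1]],
    [[1, 1], [0, 1]],  # one noisy pixel in the top-right corner
    [[1, 0], [0, 0]],  # one dropped pixel in the bottom-right corner
]
print(integrate_bitmaps(frames))
```

Voting across frames suppresses pixels that flicker with the moving background while keeping strokes that are stable over the character's lifetime.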

2.
A labelling approach for the automatic recognition of tables of contents (ToC) is described in this paper. A prototype is used for the electronic consulting of scientific papers in a digital library system named Calliope. This method operates on a roughly structured ASCII file, produced by OCR. The recognition approach operates by text labelling without using any a priori model. Labelling is based on part-of-speech (PoS) tagging, which is initiated by a primary labelling of text components using some specific dictionaries. Significant tags are first grouped into homogeneous classes according to their grammar categories and then reduced to canonical forms corresponding to article fields: “title” and “authors”. Non-labelled tokens are integrated into one field or the other by either applying PoS correction rules or using a structure model generated from well-detected articles. The designed prototype operates very well on different ToC layouts and character recognition qualities. Without manual intervention, a 96.3% rate of correct segmentation was obtained on 38 journals, including 2,020 articles, accompanied by a 93.0% rate of correct field extraction. Received April 5, 2000 / Revised February 19, 2001

3.
4.
Performance evaluation is crucial for improving the performance of OCR systems. However, it is tedious and intricate work to do by hand. Therefore, we have developed an automatic performance evaluation system for a printed Chinese character recognition (PCCR) system. Our system is characterized by using real-world data as test data and automatically obtaining the performance of the PCCR system by comparing the correct text and the recognition result of the document image. In addition, our performance evaluation system also provides some evaluation of performance for the segmentation module, the classification module, and the post-processing module of the PCCR system. For this purpose, a segmentation error-tolerant character-string matching algorithm is proposed to obtain the correspondence between the correct text and the recognition result. The experiments show that our performance evaluation system is an accurate and powerful tool for studying deficiencies in the PCCR system. Although our approach is aimed at the PCCR system, the idea can also be applied to other OCR systems.
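The abstract does not give the details of the segmentation error-tolerant matching algorithm. As a simplified stand-in, the sketch below aligns the correct text with the OCR output using plain Levenshtein alignment and counts matches, substitutions, deletions, and insertions. The function name `align_counts` is hypothetical, and the plain edit-distance alignment does not model split or merged characters the way the proposed algorithm presumably does.

```python
def align_counts(truth: str, ocr: str) -> dict:
    """Align ground truth with OCR output and count each kind of edit."""
    n, m = len(truth), len(ocr)
    # dist[i][j] is the edit distance between truth[:i] and ocr[:j].
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if truth[i - 1] == ocr[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitution
                dist[i - 1][j] + 1,         # character missing from OCR output
                dist[i][j - 1] + 1,         # spurious character in OCR output
            )
    # Walk back from the end to recover one optimal alignment.
    counts = {"match": 0, "substitution": 0, "deletion": 0, "insertion": 0}
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + (truth[i - 1] != ocr[j - 1]):
            counts["match" if truth[i - 1] == ocr[j - 1] else "substitution"] += 1
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            counts["deletion"] += 1
            i -= 1
        else:
            counts["insertion"] += 1
            j -= 1
    return counts


result = align_counts("abcd", "abd")
accuracy = result["match"] / len("abcd")
print(result, accuracy)
```

Dividing the match count by the length of the correct text gives a character-level accuracy; the per-edit counts hint at which module (segmentation or classification) is failing.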

5.
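TBL is a transformation-based machine learning algorithm widely used in natural language processing. This work extends the algorithm to the OCR domain, where it is used for character segmentation. Character segmentation is one step of character recognition, and its accuracy affects the final recognition result to some degree. To compare experimental results, a large number of handwriting samples were collected; by extracting and applying rules, satisfactory results were also obtained in detecting segment boundaries.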

6.
During the last forty years, Human Handwriting Processing (HHP) has most often been investigated under the frameworks of character (OCR) and pattern recognition. In recent years considerable progress has been made, and to date HHP can be viewed much more as an automatic Handwriting Reading (HR) task for the machine. In this paper we propose the use of handwriting invariants, a physical model for a first segmentation, a logical model for segmentation and recognition, a fundamental equation of handwriting, and the integration of several sources of perception and knowledge in order to design Handwriting Reading Systems (HRS), which would be more universal systems than is currently the case. At the dawn of the 3rd millennium, we expect that HHP will be considered more as a perceptual and interpretation task requiring knowledge gained from studies on human language. This paper gives some guidelines and presents examples to design systems able to perceive and interpret, i.e., read, handwriting automatically. Received October 30, 1998 / Revised January 30, 1999

7.
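OCR software has limited ability to handle characters set against image backgrounds, so the characters must be preprocessed to improve the OCR recognition rate. This paper uses the SUSAN corner detection algorithm to generate a corner response map of the character regions in an image, then applies a corner filtering algorithm to remove false corner responses and produce candidate character regions, and finally applies mathematical morphology transforms to separate the character strokes precisely. Experiments show that the algorithm extracts character strokes well and is an effective way to improve the recognition rate of OCR software.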

8.
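This paper presents NewsBR, a content-based news video browsing and retrieval system built on highly accurate news story segmentation and topic caption text extraction. Its main features include category-based news story browsing, keyframe-based video summarization, and keyword-based news story retrieval. The paper describes in detail the news story segmentation, the topic caption text extraction, and the content-based video browsing and retrieval built on them. The system is helpful and effective for gaining a comprehensive understanding of news video content.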

9.
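To address the problem that some characters on instrument nameplates are closely spaced, which makes traditional segmentation methods inaccurate and lowers the character recognition rate, an adaptive localization, segmentation, reconstruction, and recognition algorithm for touching nameplate characters is proposed. First, the nameplate image is preprocessed with median filtering, binarization, and similar steps. Next, mathematical morphology (opening and erosion) is applied to the preprocessed image to remove useless information between characters and enlarge the character spacing. A centroid algorithm then finds the geometric center coordinates of each character, and the Sobel edge detection operator is used, based on those coordinates, to obtain each character's bounding box and establish an ROI. The ROIs are applied to the original nameplate image to segment the characters, a 5-pixel-wide rectangular spacer is added after each segmented character to rebuild the character image, and OCR is then performed. In character recognition experiments on 993 nameplates, the algorithm achieved a recognition rate of 95.7%, indicating that it is an effective algorithm for nameplate character recognition.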
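Two of the steps in this abstract, finding a character's geometric center and rebuilding the image with a 5-pixel spacer after each segmented character, can be sketched as below. This is only an illustration on binary images stored as lists of rows; the function names `centroid` and `rebuild_with_gaps` are hypothetical, and the filtering, morphology, and Sobel-based ROI steps are not shown.

```python
def centroid(img):
    """Return the (row, col) mean position of foreground (1) pixels."""
    points = [(r, c) for r, row in enumerate(img) for c, v in enumerate(row) if v]
    rows = sum(p[0] for p in points) / len(points)
    cols = sum(p[1] for p in points) / len(points)
    return rows, cols


def rebuild_with_gaps(char_images, gap=5):
    """Concatenate character crops left to right, adding `gap` blank
    columns after each one (5 pixels, as stated in the abstract)."""
    height = max(len(img) for img in char_images)
    rows = [[] for _ in range(height)]
    for img in char_images:
        width = len(img[0])
        for r in range(height):
            # Pad shorter crops with background rows at the bottom.
            rows[r].extend(img[r] if r < len(img) else [0] * width)
            rows[r].extend([0] * gap)
    return rows


glyph = [[1, 1], [1, 1]]
rebuilt = rebuild_with_gaps([glyph, glyph])
print(centroid(glyph), len(rebuilt[0]))
```

The fixed spacer guarantees that characters which touched on the nameplate are clearly separated in the image passed to OCR.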

10.
11.
In the literature, many feature types are proposed for document classification. However, an extensive and systematic evaluation of the various approaches has not yet been done. In particular, evaluations on OCR documents are very rare. In this paper we investigate seven text representations based on n-grams and single words. We compare their effectiveness in classifying OCR texts and the corresponding correct ASCII texts in two domains: business letters and abstracts of technical reports. Our results indicate that the use of n-grams is an attractive technique which can even compare to techniques relying on a morphological analysis. This holds for OCR texts as well as for correct ASCII texts. Received February 17, 1998 / Revised April 8, 1998
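To illustrate why n-gram representations can tolerate OCR noise, here is a minimal sketch of character n-gram counting with a cosine similarity between two texts. The seven representations studied in the paper are not detailed in the abstract, so the normalization, the choice of n = 3, and the function names are assumptions.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams after lowercasing and
    collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


clean = char_ngrams("invoice number")
noisy = char_ngrams("invo1ce number")  # one OCR substitution: i -> 1
print(cosine_similarity(clean, noisy))
```

A single misrecognized character changes only the few n-grams that overlap it, whereas a single-word representation would lose the whole word "invoice".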

12.
Detection and recognition of textual information in an image or video sequence is important for many applications. The increased resolution and capabilities of digital cameras and faster mobile processing allow for the development of interesting systems. We present an application based on the capture of information presented at a slide-show presentation or at a poster session. We describe the development of a system to process the textual and graphical information in such presentations. The application integrates video and image processing, document layout understanding, optical character recognition (OCR), and pattern recognition. The digital imaging device captures slides/poster images, and the computing module preprocesses and annotates the content. Various problems related to metric rectification, key-frame extraction, text detection, enhancement, and system integration are addressed. The results are promising for applications such as a mobile text reader for the visually impaired. By using powerful text-processing algorithms, we can extend this framework to other applications, e.g., document and conference archiving, camera-based semantics extraction, and ontology creation. Received: 18 December 2003, Revised: 1 November 2004, Published online: 2 February 2005

13.
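News caption detection based on corner detection and adaptive thresholding   Total citations: 3, self-citations: 2, citations by others: 1
Zhang Yang (张洋), Zhu Ming (朱明). Computer Engineering (《计算机工程》), 2009, 35(13): 186-187
Current methods for extracting captions from news video frames generally have low accuracy and detection speed, and they perform especially poorly on title text with low resolution and contrast. To address these problems, a caption detection method based on corner detection and adaptive thresholding is proposed. The method uses corner detection to locate the text region in a title frame and applies a gray-level transformation, then binarizes the region with an adaptive threshold to obtain a text image that OCR can recognize. Experiments show that the method quickly and effectively extracts news video title captions with low resolution and contrast.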
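The abstract does not state which adaptive thresholding rule is used. As one common data-driven choice, the sketch below picks the threshold with Otsu's method (maximizing between-class variance of the gray-level histogram) and binarizes a text region. The use of Otsu's method and the function names are assumptions for illustration, not the paper's procedure.

```python
def otsu_threshold(pixels):
    """Return the gray level t (0..255) that maximizes the between-class
    variance when pixels <= t form one class and pixels > t the other."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_low, sum_low = 0, 0
    for t in range(256):
        w_low += hist[t]
        sum_low += t * hist[t]
        if w_low == 0:
            continue  # no pixels in the lower class yet
        w_high = total - w_low
        if w_high == 0:
            break  # all pixels are already in the lower class
        mean_low = sum_low / w_low
        mean_high = (sum_all - sum_low) / w_high
        var_between = w_low * w_high * (mean_low - mean_high) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(region):
    """Binarize a grayscale region (list of rows of 0..255 integers)."""
    t = otsu_threshold([p for row in region for p in row])
    return [[1 if p > t else 0 for p in row] for row in region]


region = [[10, 10, 200], [10, 200, 200]]
print(binarize(region))
```

Because the threshold is derived from each region's own histogram, low-contrast caption regions are not forced through a single global cut-off.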

14.
Wang Zuhui (王祖辉), Jiang Wei (姜维). Computer Engineering (《计算机工程》), 2009, 35(13): 188-189,

15.
An architecture for handwritten text recognition systems   Total citations: 1, self-citations: 1, citations by others: 0
This paper presents an end-to-end system for reading handwritten page images. Five functional modules included in the system are introduced in this paper: (i) pre-processing, which concerns introducing an image representation for easy manipulation of large page images and image handling procedures using the image representation; (ii) line separation, concerning text line detection and extracting images of lines of text from a page image; (iii) word segmentation, which concerns locating word gaps and isolating words from a text-line image efficiently and intelligently; (iv) word recognition, concerning handwritten word recognition algorithms; and (v) linguistic post-processing, which concerns the use of linguistic constraints to intelligently parse and recognize text. Key ideas employed in each functional module, which have been developed for dealing with the diversity of handwriting in its various aspects with a goal of system reliability and robustness, are described in this paper. Preliminary experiments show promising results in terms of speed and accuracy. Received October 30, 1998 / Revised January 15, 1999

16.
An optical character recognition (OCR) framework is developed and applied to handprinted numeric field recognition. The numeric fields were extracted from binary images of VISA credit card application forms. The images include personal identity numbers and telephone numbers. The proposed OCR framework is a cascade of neural networks. The first stage is a self-organizing feature map algorithm. The second stage maps distance values into allograph membership values using a gradient descent learning algorithm. The third stage is a multi-layer feedforward network. In this paper, we present experimental results which demonstrate the ability to read handprinted numeric fields. Experiments were performed on a test data set from the CCL/ITRI database which consists of over 90,390 handwritten numeric digits.

17.
Optical Character Recognition (OCR) is the process of recognizing printed or handwritten text on paper documents. This paper proposes an OCR system for Arabic characters. In addition to the preprocessing phase, the proposed recognition system consists mainly of three phases. In the first phase, we employ word segmentation to extract characters. In the second phase, Histograms of Oriented Gradients (HOG) are used for feature extraction. The final phase employs a Support Vector Machine (SVM) for classifying characters. As a case study, we have applied the proposed method to the recognition of Jordanian city, town, and village names, along with many other words that supply the character shapes not covered by the Jordanian place names. The set has been carefully selected to include every Arabic character in all four of its forms. To this end, we have built our own dataset consisting of more than 43,000 handwritten Arabic words (30,000 used in the training stage and 13,000 used in the testing stage). Experimental results show that our recognition method compares favorably with state-of-the-art techniques, achieving recognition rates exceeding 99%.

18.
Text displayed in a video is an essential part of the high-level semantic information of the video content. Therefore, video text can be used as a valuable source for automated video indexing in digital video libraries. In this paper, we propose a workflow for video text detection and recognition. In the text detection stage, we have developed a fast localization-verification scheme, in which an edge-based multi-scale text detector first identifies potential text candidates with a high recall rate. Then, detected candidate text lines are refined by using an image entropy-based filter. Finally, Stroke Width Transform (SWT)- and Support Vector Machine (SVM)-based verification procedures are applied to eliminate false alarms. For text recognition, we have developed a novel skeleton-based binarization method in order to separate text from complex backgrounds and make it processable by standard OCR (Optical Character Recognition) software. The operability and accuracy of the proposed text detection and binarization methods have been evaluated by using publicly available test data sets.

19.
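English character feature extraction system   Total citations: 1, self-citations: 0, citations by others: 1
Pang Donghu (庞东虎), Jin Weijie (金伟杰). Computer Simulation (《计算机仿真》), 2007, 24(12): 208-210
English character recognition is an important branch of pattern recognition with a wide range of applications. Character recognition mainly consists of document segmentation, word segmentation, character recognition, and post-processing. The English character recognition system described in this paper implements the entire process from image scanning to obtaining the recognition result, with character feature extraction as the focus of the paper. Taking the 52 English characters as the object of study, the work covers image preprocessing, feature extraction, template construction, classifier design, and post-processing. The paper compares and analyzes features widely used in OCR, such as grid features, peripheral features, and crossing features, together with several distance classifiers, and reports extensive experiments. The experimental results show good performance in both recognition accuracy and processing time.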

20.
Dot-matrix text recognition is a difficult problem, especially when characters are broken into several disconnected components. We present a dot-matrix text recognition system which uses the fact that dot-matrix fonts are fixed-pitch, in order to overcome the difficulty of the segmentation process. After finding the most likely pitch of the text, a decision is made as to whether the text is written in a fixed-pitch or proportional font. Fixed-pitch text is segmented using a pitch-based segmentation process that can successfully segment both touching and broken characters. We report performance results for the pitch estimation, fixed-pitch decision and segmentation, and recognition processes. Received October 18, 1999 / Revised April 21, 2000
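The abstract does not describe how the most likely pitch is found. One simple way to estimate it, shown here as an assumption rather than the authors' method, is to autocorrelate the vertical projection profile of a text line (ink count per pixel column) and take the smallest lag with the strongest response. The function name `estimate_pitch` and its parameters are hypothetical.

```python
def estimate_pitch(profile, min_pitch=2, max_pitch=None):
    """Estimate character pitch (in columns) from a projection profile.

    `profile` holds the amount of ink in each pixel column of a text line.
    Multiples of the true pitch score equally well, so the smallest lag
    within a small tolerance of the best score is returned.
    """
    n = len(profile)
    if max_pitch is None:
        max_pitch = n // 2
    mean = sum(profile) / n
    centered = [v - mean for v in profile]
    scores = {}
    for lag in range(min_pitch, max_pitch + 1):
        overlap = n - lag
        # Average product of the profile with itself shifted by `lag`.
        scores[lag] = sum(centered[i] * centered[i + lag] for i in range(overlap)) / overlap
    best = max(scores.values())
    return min(lag for lag, score in scores.items() if score >= best - 1e-9)


# Six characters, each three inked columns followed by two blank columns.
profile = [3, 3, 3, 0, 0] * 6
print(estimate_pitch(profile))
```

With the pitch known, cut positions can be placed at regular intervals, which is what lets a pitch-based segmenter separate touching characters and keep broken ones together.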
