1.
An Approach for Recognition and Interpretation of Mathematical Expressions in Printed Document (cited by: 3 in total; self-citations: 0; citations by others: 3)
In this paper, we propose an approach for understanding Mathematical Expressions (MEs) in a printed document. The system
is divided into three main components: (i) detection of MEs in a document; (ii) recognition of the symbols present in each
ME; and (iii) arrangement of the recognised symbols. The MEs printed in separate lines are detected without any character
recognition whereas the embedded expressions (mixed with normal text) are detected by recognising the mathematical symbols
in text. Some structural features of the MEs are used for both cases. The mathematical symbols are grouped into two classes
for convenience. At first, the frequently occurring symbols are recognised by a stroke-feature analysis technique. Recognition
of less frequent symbols involves a hybrid of feature-based and template-based techniques. The bounding-box coordinates and
the size information of the symbols help to determine the spatial relationships among the symbols. A set of predefined rules
is used to form the meaningful symbol groups so that a logical arrangement of the mathematical expression can be obtained.
Experiments conducted using this approach on a large number of documents show high accuracy.
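The role of bounding boxes in deciding spatial relationships can be illustrated with a minimal Python sketch. It is not the rule set used in the paper: the `Box` type, the `relation` function, and the `shift` and `shrink` thresholds are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned bounding box; y grows downward, as in image coordinates."""
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def height(self) -> float:
        return self.y1 - self.y0

    @property
    def cy(self) -> float:
        # Vertical centre of the box.
        return (self.y0 + self.y1) / 2


def relation(base: Box, other: Box, shift: float = 0.25, shrink: float = 0.8) -> str:
    """Classify `other` (a symbol to the right of `base`) by position and size.

    `shift` and `shrink` are illustrative thresholds, not values from the paper.
    """
    # Vertical offset of the centres, normalised by the base symbol's height.
    dy = (other.cy - base.cy) / base.height
    smaller = other.height <= shrink * base.height
    if smaller and dy < -shift:
        return "superscript"  # smaller and raised
    if smaller and dy > shift:
        return "subscript"    # smaller and lowered
    return "horizontal"


base = Box(0, 10, 10, 30)
print(relation(base, Box(11, 5, 17, 15)))   # superscript
print(relation(base, Box(11, 25, 17, 35)))  # subscript
print(relation(base, Box(12, 10, 22, 30)))  # horizontal
```

A real system would need more relations (above, below, inside a radical) and thresholds tuned on data.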
2.
3.
A new technique for automatic input of printed circuit layout data to a computer graphics system, providing a powerful alternative to the existing manual digitization method, is described. The technique is based on optical scanning and graphics recognition. Substantial reduction in the digitized data is made possible by using the properties of printed circuit art work. The technique has been extended to encompass some areas of engineering documentation.
4.
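In electronic publishing and in much music research, printed scores must be converted into computer-readable data. This paper presents a score image recognition system that uses a character recognition approach. Based on a structure-decomposition technique, it converts the original score into a local structure graph and segments the musical symbols, which effectively reduces the amount of data and is unaffected by image warping and skew. The paper gives the text output of the recognition results and the image restored from it.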
5.
A multi-step recognition process is developed for extracting compound forest cover information from manually produced scanned historical topographic maps of the 19th century. This information is a unique data source for GIS-based land cover change modeling. Based on salient features in the image the steps to be carried out are character recognition, line detection and structural analysis of forest symbols. Semantic expansion implying the meanings of objects is applied for final forest cover extraction. The procedure resulted in high accuracies of 94% indicating a potential for automatic and robust extraction of forest cover from larger areas.
6.
7.
Osamu Hori Shigeyoshi Shimotsuji Fumihiko Hoshino Toshiaki Ishii 《Machine Vision and Applications》1993,6(2-3):100-109
This paper shows that probabilistic relaxation is an effective method in the automatic interpretation of line drawings consisting
of lines, symbols, and characters, such as electricity distribution diagrams superimposed on maps. The line interpretation
problem has been newly formulated as a labeling problem in which probabilistic relaxation is used to obtain globally consistent
results. The proposed automatic interpretation method consists of two stages. The first is segmentation and recognition of
primitive components, such as symbols, characters, and long lines. The second is long-line interpretation, where probabilistic
relaxation is introduced.
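For illustration, one iteration of a classic probabilistic relaxation update (in the style of Rosenfeld, Hummel, and Zucker) can be sketched in Python as follows. The abstract does not give the paper's exact formulation, so the function name `relax_step`, the two labels, the compatibility matrix `r`, and the neighbour weights `w` are assumptions.

```python
def relax_step(p, r, w):
    """One synchronous relaxation-labeling update.

    p[i][a]: current probability that object i carries label a
    r[a][b]: compatibility in [-1, 1] of label a with label b on a neighbour
    w[i][j]: influence of object j on object i (each row sums to 1)
    """
    n, m = len(p), len(p[0])
    new_p = []
    for i in range(n):
        # Support for each label of object i, gathered from its neighbours.
        q = [
            sum(w[i][j] * sum(r[a][b] * p[j][b] for b in range(m)) for j in range(n))
            for a in range(m)
        ]
        raw = [p[i][a] * (1 + q[a]) for a in range(m)]
        total = sum(raw)
        # Renormalise; keep the old row if every label was fully contradicted.
        new_p.append([v / total for v in raw] if total > 0 else list(p[i]))
    return new_p


# Hypothetical toy problem: two adjacent segments, labels 0 = "line", 1 = "character".
r = [[1.0, -1.0], [-1.0, 1.0]]   # neighbours prefer to share a label
w = [[0.0, 1.0], [1.0, 0.0]]     # each segment listens only to the other
p = [[0.9, 0.1], [0.5, 0.5]]     # segment 1 is initially ambiguous
for _ in range(5):
    p = relax_step(p, r, w)
print(p)
```

In this toy case the confident segment pulls the ambiguous one toward the "line" label, which is the kind of global consistency the abstract refers to.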
8.
9.
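A pen-based music score editor is designed and implemented. The user enters score gesture symbols with a pen and tablet; a single-stroke gesture recognition algorithm based on grid coding recognizes the gestures and generates the corresponding score, which can be played back in real time. Compared with score editors that use conventional interfaces, the system's interaction style better matches how people write and read scores, making score input simple, natural, and efficient.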
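The abstract does not describe the grid coding itself. One plausible reading, sketched below in Python purely as an assumption, overlays an `n` by `n` grid on the stroke's bounding box and records the sequence of cells the pen passes through. The function name `grid_code`, the grid size, and the template dictionary are hypothetical.

```python
def grid_code(points, n=3):
    """Encode a single stroke as the sequence of grid cells it visits.

    points: list of (x, y) pen samples in drawing order.
    Cells are numbered row by row, from 0 to n*n - 1.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    min_x, min_y = min(xs), min(ys)
    # Avoid division by zero for perfectly horizontal or vertical strokes.
    w = max(max(xs) - min_x, 1e-9)
    h = max(max(ys) - min_y, 1e-9)
    code = []
    for x, y in points:
        col = min(int((x - min_x) / w * n), n - 1)
        row = min(int((y - min_y) / h * n), n - 1)
        cell = row * n + col
        # Record a cell only when the pen enters a new one.
        if not code or code[-1] != cell:
            code.append(cell)
    return code


# Hypothetical template: an L-shaped gesture drawn downward, then to the right.
templates = {"L-shape": [0, 3, 6, 7, 8]}
stroke = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2)]
code = grid_code(stroke)
print(code)  # [0, 3, 6, 7, 8]
print([name for name, t in templates.items() if t == code])  # ['L-shape']
```

Exact matching of codes is brittle; a practical recognizer would compare codes with some tolerance, for example an edit distance.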
10.
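A recognition system for printed mathematical formulas is presented. It consists of two parts: formula character recognition and structure analysis. Character recognition uses several special processing methods suited to formula characters. Structure analysis follows the layout of the formula and combines top-down and bottom-up strategies, which makes the recognized formulas reusable. Experiments show that this method achieves good recognition results.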
11.
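Discriminating superscript/subscript relations in printed mathematical formulas based on statistical features (cited by: 6 in total; self-citations: 2; citations by others: 6)
Printed mathematical formulas differ from ordinary text in many ways. Their two-dimensional structure means that formula recognition involves not only character recognition but, more importantly, analysis of that structure. Superscript/subscript relations are special structures that occur frequently in formulas, are hard to resolve, and are easily confused with horizontal relations. This paper proposes two methods, both based on statistical features, for discriminating superscript/subscript relations in printed mathematical formulas: one directly analyzes the bounding rectangles of the symbols, and the other uses the symbol recognition results. Experimental results show that both methods improve on comparable methods. The method that uses recognition results not only separates superscript/subscript relations from horizontal relations well but also yields a large between-class distance.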
12.
The recognition of Indian and Arabic handwriting has drawn increasing attention in recent years. To test the promise of existing handwritten numeral recognition methods and provide new benchmarks for future research, this paper presents some results of handwritten Bangla and Farsi numeral recognition on binary and gray-scale images. For recognition on gray-scale images, we propose a process with proper image pre-processing and feature extraction. In experiments on three databases, ISI Bangla numerals, CENPARMI Farsi numerals, and IFHCDB Farsi numerals, we have achieved very high accuracies using various recognition methods. The highest test accuracies on the three databases are 99.40%, 99.16%, and 99.73%, respectively. We justified the benefit of recognition on gray-scale images against binary images, and compared some implementation choices of gradient direction feature extraction and some advanced normalization and classification methods.
13.
Utpal Garain B. B. Chaudhuri 《International Journal on Document Analysis and Recognition》2005,7(4):241-259
This paper is concerned with research on OCR (optical character recognition) of printed mathematical expressions. Construction
of a representative corpus of technical and scientific documents containing expressions is discussed. A statistical investigation
of the corpus is presented, and usefulness of this analysis is demonstrated in the related research problems, namely, (i)
identification and segmentation of expression zones from the rest of the document, (ii) recognition of expression symbols,
(iii) interpretation of expression structures, and (iv) performance evaluation of a mathematical expression recognition system.
Moreover, a groundtruthing format has been proposed to facilitate automatic evaluation of expression recognition techniques.
Received: 10 July 2003, Accepted: 22 November 2004, Published online: 18 March 2005
Correspondence to: Utpal Garain
14.
15.
16.
17.
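Symbol extraction is a key step in the automatic input of topographic maps, since it yields much of the important information a map contains. To extract topographic map symbols automatically, a symbol extraction algorithm based on an improved morphological erosion operation is proposed. The algorithm has a high recognition rate and runs fairly fast, and it overcomes the misrecognition caused by symbols touching line-drawing elements. It is suitable for practical automatic topographic map input systems and also shows good prospects in other fields that require similar symbol extraction from line drawings.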
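As a point of reference, plain binary erosion, the operation that the proposed algorithm improves on, can be sketched in Python as below. This is the textbook operator, not the paper's improved variant; the function name `erode`, the toy image, and the 3×3 structuring element are assumptions.

```python
def erode(img, se):
    """Binary erosion of `img` by structuring element `se`.

    img: list of rows of 0/1 pixels.
    se:  odd-sized list of rows of 0/1, with its origin at the centre.
    Pixels outside the image are treated as background (0).
    """
    h, w = len(img), len(img[0])
    sh, sw = len(se), len(se[0])
    oy, ox = sh // 2, sw // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # A pixel survives only if the whole element fits in the foreground.
            fits = all(
                0 <= y + dy - oy < h
                and 0 <= x + dx - ox < w
                and img[y + dy - oy][x + dx - ox]
                for dy in range(sh)
                for dx in range(sw)
                if se[dy][dx]
            )
            out[y][x] = 1 if fits else 0
    return out


# A 3x3 blob (a stand-in for a map symbol) crossed by a one-pixel-wide line.
image = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
square = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
for row in erode(image, square):
    print(row)
```

Only the centre of the blob survives, so the thin line touching the symbol disappears while the symbol's location is preserved. This is one way erosion can separate symbols from the lines they touch.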
18.
19.
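The calibration symbols in a metal fracture image are the basis for computing the real physical distances that the image corresponds to. These symbols are usually printed, so locating them accurately is the prerequisite and key to recognizing them correctly. This work studies an algorithm for locating calibration symbols in metal fracture images with strong noise and complex backgrounds. The scale bar, which has pronounced straight-line features, is located first, using a block-wise modification of the Radon transform that markedly improves both speed and accuracy. Characters are then located by combining coarse localization through mathematical morphology, based on the texture features of the symbols, with fine localization through image edge template matching; the position and length of the scale bar are used to narrow the search region. Experimental results show that the algorithm achieves a localization accuracy of 94%.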
20.
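Application of neural networks to vehicle license plate character recognition (cited by: 7 in total; self-citations: 0; citations by others: 7)
In automatic vehicle license plate recognition systems, natural factors or image acquisition factors distort printed characters that were originally regular, which makes character recognition very difficult. Building on feature extraction, this paper uses a BP network for classification and adds a linear perceptron to recognize individual characters effectively. The method is algorithmically simple, achieves a high recognition rate, and is applicable to printed character recognition in a variety of high-noise environments.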