Similar literature
 Found 20 similar documents; search took 15 ms
1.
This paper presents the theoretical basis and describes the implementation of a term rewriting system based on algebraic specifications. The input to the system is written in an algebraic specification language, which defines not only the set of axioms but also the sorts, variables, operators and terms of a specific simulated theory or application. Rewriting and matching mechanisms provide the formal methodology for evaluating terms and proving assertions in an algebraic theory. Specifications are evaluated by interpreting terms by means of rewrite rules. The rules are given by the axioms of the specifications, for which the finite termination and congruence properties are assumed. A term rewriting system that recognizes handwritten Hindu numerals is introduced as a case study. Besides rewriting, a robust algorithm is proposed to segment the numeral's image into strokes based on feature points and to identify cavity features. A syntactic representation (term) of the input image is matched and rewritten against a set of rules. Experimental results show that the proposed system tolerates a variety of numeral shapes, with a 96% recognition rate.
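
The idea of evaluating terms by matching and rewriting can be illustrated with a minimal sketch. It is not the system described in the abstract: the term encoding (nested tuples, variables written as strings starting with "?"), the function names, and the two Peano addition axioms are all assumptions chosen for illustration, and termination of the rules is assumed, as in the abstract.

```python
# Terms are nested tuples ("op", arg1, ...); constants are 1-tuples like ("zero",).
# Strings beginning with "?" are rule variables. All names are illustrative.

def match(pattern, term, env):
    """Return an extended binding dict if pattern matches term, else None."""
    if isinstance(pattern, str):  # a variable: bind it or check the binding
        if pattern in env:
            return env if env[pattern] == term else None
        return {**env, pattern: term}
    if pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        env = match(p, t, env)
        if env is None:
            return None
    return env

def substitute(template, env):
    """Instantiate a rule's right-hand side with the bindings in env."""
    if isinstance(template, str):
        return env[template]
    return (template[0],) + tuple(substitute(a, env) for a in template[1:])

def normalize(term, rules):
    """Innermost rewriting to a normal form; assumes the rules terminate."""
    term = (term[0],) + tuple(normalize(a, rules) for a in term[1:])
    for lhs, rhs in rules:
        env = match(lhs, term, {})
        if env is not None:
            return normalize(substitute(rhs, env), rules)
    return term

ZERO = ("zero",)

def succ(t):
    return ("succ", t)

# Axioms of addition used as left-to-right rewrite rules.
RULES = [
    (("add", ZERO, "?y"), "?y"),
    (("add", ("succ", "?x"), "?y"), ("succ", ("add", "?x", "?y"))),
]

one, two = succ(ZERO), succ(succ(ZERO))
result = normalize(("add", two, one), RULES)
print(result)
```

Here the term add(2, 1) is rewritten step by step until no rule applies, leaving the normal form for 3.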

2.
3.
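Xu Haiyan (徐海燕), Jiang Ying (姜瑛). Journal of Software (《软件学报》), 2021, 32(7): 2183-2203
As developer communities and code hosting platforms become the main channels through which programmers obtain code, the number of user reviews of code has grown rapidly. The reviews that users write after using code contain information on many static and dynamic code quality attributes, but because these reviews are mostly complex sentences, the quality attributes they contain are hard to determine. Determining code quality attributes for complex user reviews will help analyze the code quality information in those reviews, and will help developers, in understanding users' ...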

4.
Chinese characters are constructed from strokes according to structural rules. Therefore, the geometric configurations of characters are important features for character recognition. In handwritten characters, stroke shapes and their spatial relations may vary to some extent. The attribute value of a structural identification is then a fuzzy quantity rather than a binary quantity. Recognizing these facts, we propose a fuzzy attribute representation (FAR) to describe the structural features of handwritten Chinese characters for an on-line Chinese character recognition (OLCCR) system. With a FAR, a fuzzy attribute graph for each handwritten character is created, and the character recognition process is thus transformed into a simple graph matching problem. This character representation and our proposed recognition method allow us to relax the constraints on stroke order and stroke connection. The graph model provides a generalized character representation that can easily incorporate newly added characters into an OLCCR system with an automatic learning capability. The fuzzy representation can describe the degree of structural deformation in handwritten characters. The character matching algorithm is designed to tolerate structural deformations to some extent. Therefore, even input characters with deformations can be recognized correctly once the reference dictionary of the recognition system has been trained using a few representative learning samples. Experimental results are provided to show the effectiveness of the proposed method.

5.
This paper deals with an optical character recognition (OCR) system for handwritten Gujarati numerals. Much work exists for Indian languages such as Hindi, Kannada, Tamil, Bangla, Malayalam and Gurumukhi, but hardly any work is traceable for Gujarati, especially for handwritten characters. In this work a neural network is proposed for identifying Gujarati handwritten digits. A multilayer feed-forward neural network is suggested for classifying the digits. The features of Gujarati digits are extracted from four different profiles of each digit. Thinning and skew correction are also applied to the handwritten numerals as preprocessing before classification. This work achieved a success rate of approximately 82% for Gujarati handwritten digit identification.
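
The abstract does not define its four profiles. A common reading, assumed here, is that they are the left, right, top and bottom profiles of a binary digit image: for each row or column, the distance from that side of the image to the first foreground pixel. The sketch below computes them under that assumption; the function name and the tiny test image are illustrative only.

```python
def profiles(img):
    """Compute four side profiles of a binary image.

    img is a list of equal-length rows of 0/1 values. For each row (left,
    right) or column (top, bottom) the profile value is the number of
    background pixels before the first foreground pixel seen from that side,
    or the full width/height if the line is empty.
    """
    h, w = len(img), len(img[0])

    def first(seq, default):
        for i, v in enumerate(seq):
            if v:
                return i
        return default

    cols = [[img[r][c] for r in range(h)] for c in range(w)]
    return {
        "left": [first(row, w) for row in img],
        "right": [first(row[::-1], w) for row in img],
        "top": [first(col, h) for col in cols],
        "bottom": [first(col[::-1], h) for col in cols],
    }

digit = [
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 1, 0],
]
features = profiles(digit)
print(features)
```

Concatenating the four lists gives a fixed-length vector (for a fixed image size) that could be fed to a feed-forward classifier.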

6.
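Hu Zhengping (胡正平), Song Shufen (宋淑芬). Acta Automatica Sinica (《自动化学报》), 2012, 38(9): 1420-1427
To build a fast and robust image recognition algorithm, a maximum likelihood sparse representation image recognition algorithm based on class-related nearest-neighbor subspaces is proposed. Considering the different distribution characteristics of each test sample and the principle that selected training samples should be representative of their class, the method no longer uses all training samples as the dictionary for sparse representation. Instead, it selects a suitable subspace according to a distance-proximity criterion, choosing an adaptive number of local neighbors from each class to form a new dictionary, which reduces the number of training samples while preserving the original subspace structure of the sparse representation. Then, based on the maximum likelihood sparse representation recognition method, the fidelity of the sparse representation is expressed as the maximum likelihood function of the residual, and the recognition problem is converted into a weighted sparse optimization problem. Experiments on public face and digit recognition databases demonstrate the soundness of the algorithm: it improves recognition speed while maintaining recognition accuracy and robustness, and it adapts particularly well to occluded and corrupted images.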

7.
Analysis of stroke structures of handwritten Chinese characters   Total citations: 3, self-citations: 0, citations by others: 3
Most handwritten Chinese character recognition systems suffer from the variations in geometrical features for different writing styles. The stroke structures of different styles have proved to be more consistent than geometrical features. In an on-line recognition system, the stroke structure can be obtained according to the sequences of writing via a pen-based input device such as a tablet. But in an off-line recognition system, the input characters are scanned optically and saved as raster images, so the stroke structure information is not available. In this paper, we propose a method to extract strokes from an off-line handwritten Chinese character. We have developed four new techniques: 1) a new thinning algorithm based on Euclidean distance transformation and gradient oriented tracing, 2) a new line approximation method based on curvature segmentation, 3) artifact removal strategies based on geometrical analysis, and 4) stroke segmentation rules based on splitting, merging and directional analysis. Using these techniques, we can extract and trace the strokes in an off-line handwritten Chinese character accurately and efficiently.

8.
Handwritten digit recognition has long been an open problem in pattern classification and is of great importance in industry. The heart of the problem lies in designing an efficient algorithm that can recognize digits written and submitted by users via a tablet, scanner, or other digital device. From an engineering point of view, it is desirable to achieve good performance within limited resources. To this end, we have developed a new approach to handwritten digit recognition that uses a small number of patterns in the training phase. To improve overall classification performance, the literature suggests combining the decisions of multiple classifiers rather than using the output of the best classifier in the ensemble; in this new approach, therefore, an ensemble of classifiers is used to recognize handwritten digits. The classifiers used in the proposed system are based on the singular value decomposition (SVD) algorithm. The experimental results and the literature show that the SVD algorithm is well suited to sparse matrices such as handwritten digit images. The decisions obtained by the SVD classifiers are combined by a novel combination rule that we name reliable multi-phase particle swarm optimization. We call the method "reliable" because we have introduced a novel reliability parameter that tackles the problem of PSO being trapped in local minima. Compared with previous methods, one significant advantage of the proposed method is that it is not sensitive to the size of the training set. The proposed method uses just 15% of the dataset as a training set, while other methods usually use 60–75% of the whole dataset. To evaluate the proposed method, we tested our algorithm on a Farsi/Arabic handwritten digit dataset.
What makes the recognition of handwritten Farsi/Arabic digits more challenging is that some of the digits can legitimately be written in different shapes. Therefore, 6000 hard samples (600 samples per class) are chosen by the K-nearest neighbor algorithm from the HODA dataset, which is a standard Farsi/Arabic digit dataset. Experimental results show that the proposed method is fast, accurate, and robust against the local minima of PSO. Finally, the proposed method is compared with state-of-the-art methods and with several ensemble classifiers based on MLP, RBF, and ANFIS with various combination rules.

9.
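Handwritten Chinese character recognition is the basis of handwritten Chinese character input. Current handwriting input methods on smart devices cannot dynamically adjust the recognition model to a user's writing habits in order to raise recognition accuracy. Based on a study of recent deep learning algorithms and trained models, a design method is proposed for a personalized handwritten Chinese character input system based on real-time collection of the user's handwriting samples. The method collects the user's handwritten characters as incremental samples and retrains the recognition model generated by server-side training, so that the model adapts better to the user's writing habits and the recognition rate of the input system improves. Finally, on the basis of this method and a newly designed deep residual network, comparative experiments on handwritten Chinese character recognition were carried out. The results show that retraining with samples collected in real time raises the recognition rate of the model substantially and better meets users' needs for handwritten Chinese character input on smart devices.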

10.
We present a new method for blind document bleed-through removal based on separate Markov Random Field (MRF) regularization for the recto and for the verso side, where separate priors are derived from the full graph. The segmentation algorithm is based on Bayesian Maximum a Posteriori (MAP) estimation. The advantages of this separate approach are the adaptation of the prior to the content creation process (e.g., superimposing two handwritten pages) and the improvement of the estimation of the recto pixels through an estimation of the verso pixels covered by recto pixels; moreover, the formulation as a binary labeling problem with two hidden labels per pixel naturally leads to an efficient optimization method based on the minimum cut/maximum flow in a graph. The proposed method is evaluated on scanned document images from the 18th century, showing an improvement in character recognition results compared to other restoration methods.

11.
12.
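Online handwritten Chinese characters based on Scalable Vector Graphics (SVG) are displayed and stored with SVG as the character image format and the SVG path object as the basic storage unit for strokes; stroke outlines are determined by the coordinate values recorded during writing, which serve as feature values. Based on this SVG storage and representation of handwritten characters, this paper proposes a graph-theoretic multi-step segmentation method for online continuous handwritten Chinese characters. The method builds an undirected graph model over the handwritten stroke sequence according to the coordinate relations between strokes, and uses breadth-first search on the graph to split the original stroke sequence into mutually disconnected stroke components, so that characters with widely separated radicals and non-touching characters are segmented correctly. It then uses an improved Tarjan algorithm to segment touching characters within components. Finally, based on the gaps between stroke components, an iterative two-class classification algorithm classifies the gaps to find the globally best segmentation positions, and over-segmented components are recombined and merged. Experimental results show that the method is effective and feasible for segmenting online handwritten Chinese characters.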
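
The first step of the method, grouping strokes into disconnected components by breadth-first search over an undirected graph, can be sketched as follows. The abstract does not state the exact coordinate relation used to connect two strokes, so this sketch assumes a simple hypothetical criterion: two strokes are adjacent when their horizontal extents overlap (optionally within a `gap` tolerance). The function names and sample strokes are illustrative.

```python
from collections import deque

def x_extent(stroke):
    """Horizontal extent (min_x, max_x) of a stroke given as (x, y) points."""
    xs = [x for x, _ in stroke]
    return min(xs), max(xs)

def stroke_components(strokes, gap=0):
    """Group stroke indices into connected components using BFS.

    Assumed adjacency rule (not taken from the paper): two strokes are
    connected when their x-extents overlap or lie within `gap` of each other.
    """
    n = len(strokes)
    ext = [x_extent(s) for s in strokes]
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            (a0, a1), (b0, b1) = ext[i], ext[j]
            if a0 <= b1 + gap and b0 <= a1 + gap:
                adj[i].append(j)
                adj[j].append(i)

    seen = [False] * n
    groups = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        queue, group = deque([start]), []
        while queue:
            u = queue.popleft()
            group.append(u)
            for v in adj[u]:
                if not seen[v]:
                    seen[v] = True
                    queue.append(v)
        groups.append(sorted(group))
    return groups

# Two strokes forming a cross, then two more strokes further to the right.
strokes = [
    [(0, 0), (4, 0)],
    [(2, -2), (2, 3)],
    [(10, 0), (14, 0)],
    [(12, 1), (13, 5)],
]
groups = stroke_components(strokes)
print(groups)
```

Each resulting group is a candidate component; the later steps in the abstract (splitting touching characters, merging over-segmented parts) would operate on these groups and are not shown.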

13.
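A template-based segmentation algorithm for handwritten digit strings and its application   Total citations: 3, self-citations: 0, citations by others: 3
Zhang Honggang (张洪刚), Wu Ming (吴铭), Liu Gang (刘刚), Guo Jun (郭军). Chinese Journal of Computers (《计算机学报》), 2003, 26(7): 819-824
A template-based segmentation algorithm for handwritten digit strings is proposed. By summarizing how characters connect in handwritten digit strings, the algorithm derives a reasonable set of segmentation curve types and designs a variety of segmentation templates from them, turning character segmentation into a process of trying out and selecting among templates. Its application in a bank bill OCR system verified the effectiveness of the algorithm.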

14.
The retrieval of information from scanned handwritten documents is becoming vital with the rapid increase of digitized documents, and word spotting systems have been developed to search for words within documents. These systems can be either template matching algorithms or learning based. This paper presents a coherent learning-based Arabic handwritten word spotting system that adapts to the nature of Arabic handwriting, which may have no clear boundaries between words. Consequently, the system recognizes Pieces of Arabic Words (PAWs), then reconstructs and spots words using language models. The proposed system produced promising results for Arabic handwritten word spotting when tested on the CENPARMI Arabic documents database.

15.
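In object tracking, changes in the target's environment strongly affect tracking performance. In view of this, a sparse representation model based on the elastic net structure is proposed, and an interference-resistant dynamic elastic net tracking algorithm that applies the sparse representation model is designed within the particle filter framework. A method is also designed that dynamically updates the parameters of the sparse representation model according to the degree of environmental change, to overcome the effect of disturbances such as illumination changes on tracking quality. In addition, the proposed algorithm uses an anisotropic kernel function to compute the probability that each candidate region is the target's location, which improves tracking accuracy, and it improves the dictionary template update method to ensure accurate and timely template updates and maintain tracking quality. Experiments show that, compared with other tracking algorithms, the proposed dynamic elastic net tracker performs better under disturbances such as illumination changes and also maintains tracking accuracy under occlusion and fast motion.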

16.
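Current English grammatical error correction models often ignore the syntactic knowledge of text that benefits error correction, which limits their correction ability. To address this problem, an English grammatical error correction model based on differential fusion of syntactic features is proposed. First, the proposed syntactic encoder not only generates dependency graph and constituency tree information directly from text in an unsupervised manner, but also fuses these two heterogeneous syntactic structures and encodes them into a high-dimensional syntactic representation. Second, to exploit both the semantic and the syntactic information in the text, the differential fusion module first uses differential regularization to strengthen the semantic encoder's capture of semantic features that the syntactic encoder fails to generate, and then uses co-attention to further fuse the syntactic and semantic representations into the output features of the Transformer encoder, which are finally fed to the decoder to generate grammatically correct text. In comparative experiments on the CoNLL-2014 English error correction dataset, the method outperforms the grammatical error correction model based on the Copy-Augmented Transformer in precision and F0.5, with F0.5 improved by 5.2 percentage points; moreover, the syntactic knowledge mitigates the problem of scarce annotated data, yielding better text correction.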

17.
18.
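Unconstrained handwriting is hard to recognize because of large variation in writing style, lack of context, and demanding accuracy requirements. In view of the characteristics and requirements of handwritten digit recognition, a new unconstrained handwritten digit recognition algorithm based on combined structural features is proposed. An extended character structural feature recognition algorithm automatically and robustly extracts multiple structural features of handwritten digit characters, such as endpoints, branch points, and horizontal lines, and these features are combined to build a decision tree that recognizes handwritten characters automatically. Experimental results show that the robustness and recognition rate of the algorithm based on combined structural features are clearly better than those of traditional methods.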

19.
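Yang Guangzheng (杨光正). Acta Automatica Sinica (《自动化学报》), 1993, 19(5): 625-628
Building on the representation of knowledge by grammar productions, this paper discusses inference methods for syntactic knowledge systems. The Earley algorithm is an efficient parsing algorithm that can be used successfully as the search strategy of a syntactic knowledge system. The paper also discusses heuristic search strategies for syntactic knowledge systems and proposes an efficient depth-first search strategy.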
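
For readers unfamiliar with the Earley algorithm mentioned above, the following is a minimal recognizer sketch. It is a textbook-style illustration, not the paper's inference system: the grammar encoding (a dict from nonterminal to tuples of symbols), the function name, and the toy arithmetic grammar are assumptions, and empty productions are assumed absent to keep the completion step simple.

```python
def earley_recognize(grammar, start, tokens):
    """Return True if tokens can be derived from start.

    grammar maps each nonterminal to a list of right-hand sides (tuples of
    symbols). A symbol is a nonterminal if it is a key of grammar, otherwise
    a terminal. Assumes the grammar has no empty productions.
    """
    n = len(tokens)
    # chart[i] holds items (lhs, rhs, dot, origin) ending at position i.
    chart = [set() for _ in range(n + 1)]
    for rhs in grammar[start]:
        chart[0].add((start, rhs, 0, 0))

    for i in range(n + 1):
        agenda = list(chart[i])
        while agenda:
            lhs, rhs, dot, origin = agenda.pop()
            if dot < len(rhs):
                sym = rhs[dot]
                if sym in grammar:
                    # Predict: expand the nonterminal after the dot.
                    for prod in grammar[sym]:
                        item = (sym, prod, 0, i)
                        if item not in chart[i]:
                            chart[i].add(item)
                            agenda.append(item)
                elif i < n and tokens[i] == sym:
                    # Scan: consume the matching terminal.
                    chart[i + 1].add((lhs, rhs, dot + 1, origin))
            else:
                # Complete: advance items that were waiting for lhs.
                for plhs, prhs, pdot, porigin in list(chart[origin]):
                    if pdot < len(prhs) and prhs[pdot] == lhs:
                        item = (plhs, prhs, pdot + 1, porigin)
                        if item not in chart[i]:
                            chart[i].add(item)
                            agenda.append(item)

    return any(lhs == start and dot == len(rhs) and origin == 0
               for lhs, rhs, dot, origin in chart[n])

# Toy left-recursive grammar: E -> E + T | T, T -> n
GRAMMAR = {
    "E": [("E", "+", "T"), ("T",)],
    "T": [("n",)],
}
accepted = earley_recognize(GRAMMAR, "E", ["n", "+", "n"])
rejected = earley_recognize(GRAMMAR, "E", ["n", "+"])
print(accepted, rejected)
```

The chart-based search handles left-recursive productions such as E -> E + T directly, which is one reason the algorithm is attractive as a general search strategy over grammar productions.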

20.

This paper proposes an application of speaker identification in the automobile industry. The work is divided into two main parts. The first part deals with speaker identification, where a system is trained and tested for multiple users using a database of isolated Hindi digits and Hindi sentences. A new hybrid algorithm is used for speaker identification that captures the benefits of both the LPC and MFCC feature extraction techniques. The proposed technique shows an improvement of 2.05% over conventional MFCC features for isolated Hindi digits and 12.41% for Hindi sentences. It also shows an improvement of 53.26% over LPC for Hindi sentences and 32.51% over LPC for isolated Hindi digits. The proposed features were also tested in a real-time noisy environment by adding speech and F16 noise to test voice samples, with the degree of distortion varying from 0 to 20 dB. The second part describes the interfacing techniques and the design of the hardware configuration for seat adjustment. The proposed model is designed using MATLAB. Speech samples from users are recorded through a microphone. Different features of the resulting wav file are evaluated and fed into the model generated during the testing phase. Depending on the outcome of the classifier, a user is identified. Once the user is successfully identified, signals are sent to the servo motor through an Arduino microcontroller interfaced with MATLAB to automatically adjust the driver's seat.
