Similar literature
20 similar documents found; search took 125 ms
1.
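A High-Accuracy Recognition System for Printed Simplified and Traditional Chinese Text   Total citations: 2, self-citations: 0, citations by others: 2
This paper describes a high-accuracy recognition system for printed simplified and traditional Chinese text, based on an improved "feature point method for Chinese character recognition". Introducing a direction attribute for the feature points markedly raises the character recognition rate of the method. The paper explains the principles behind the main stages of the system. In rigorous tests on real printed text totalling one million characters, the system reached a character recognition rate of 97.84%. On higher-quality printed text, the recognition rate exceeded 99%.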

2.
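This paper introduces a method for programming Chinese characters in Turbo C; programs written this way run under a variety of Chinese-language and Western-language versions of DOS. Techniques for writing Chinese-character menus are also discussed.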

3.
王建平  陈军  徐晓冰  王熹徽 《微机发展》2006,16(10):104-107
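A feature extraction method for offline handwritten Chinese characters based on fuzzy statistics is proposed. Features are extracted by combining the wavelet grid method with the stroke density feature method, and a support vector machine (SVM) is then trained to recognize offline handwritten characters. Simulation experiments show that the SVM performs well in offline handwritten Chinese character recognition and that the fuzzy statistical method is effective.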

4.
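To address the low recognition rates and long recognition times for license-plate Chinese character images in complex environments, an improved probabilistic neural network (PNN) method for license-plate Chinese character recognition is proposed, based on pseudo-Zernike moments and independent component analysis (ICA). The pseudo-Zernike moments of the character image are reduced in dimension by ICA, and the reduced features are fed into the proposed improved PNN, which is based on representative points, for training and recognition. Experiments on license-plate character images taken in complex environments show that the method effectively reduces the feature dimension, shortens recognition time, and significantly improves the recognition rate.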
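
The classification stage described above can be illustrated with a plain probabilistic neural network, which is essentially a Parzen-window classifier with Gaussian kernels. The sketch below is only that baseline: the pseudo-Zernike feature extraction, the ICA reduction, and the paper's representative-point improvement are not reproduced, and the function name `pnn_classify`, the `train` layout, and the value of `sigma` are assumptions made for illustration.

```python
import math


def pnn_classify(x, train, sigma=1.0):
    """Return the label whose Gaussian kernel density at x is highest.

    train maps each label to a list of feature vectors (hypothetical layout).
    sigma is the kernel width, an assumed smoothing parameter.
    """
    scores = {}
    for label, points in train.items():
        # Pattern layer: one Gaussian kernel per stored training vector.
        total = 0.0
        for p in points:
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, p))
            total += math.exp(-sq_dist / (2.0 * sigma ** 2))
        # Summation layer: average kernel response for this class.
        scores[label] = total / len(points)
    # Decision layer: pick the class with the largest averaged response.
    return max(scores, key=scores.get)


train = {"A": [[0.0, 0.0], [0.0, 1.0]], "B": [[5.0, 5.0], [6.0, 5.0]]}
label = pnn_classify([0.2, 0.3], train)
```

Because every stored vector contributes a kernel, the cost of a plain PNN grows with the training set; reducing the stored set to a few representative points per class is one plausible way to cut recognition time, though the abstract does not say how the points are chosen.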

5.
陈静  穆志纯  方新  杜大鹏 《计算机工程》2007,33(11):170-172
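Chinese character recognition is an important research area in the study of Chinese language and character cognition. This paper proposes a model based on a multilayer self-organizing neural network and makes a preliminary exploration of the cognition-based character recognition process from the perspectives of glyph clustering and component decomposition. Simulation results show that, through learning, the model can identify the structural types and components of Chinese characters and discover regularities in character recognition, so that it simulates Chinese character recognition to some extent.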

6.
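Cluster Recognition of Handwritten Chinese Characters   Total citations: 3, self-citations: 0, citations by others: 3
To reduce the resolution needed for each individual character, the paper analyzes the general Chinese character recognition model and builds on it a cluster recognition model suited to multi-character recognition. To substantiate the cluster recognition model, support is drawn from both theoretical proof and experiment. The experimental results show that cluster recognition based on the multi-character model reliably improves the recognition of continuous text and is a promising direction for handwritten Chinese character recognition.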

7.
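This paper proposes a method for recognizing offline handwritten Chinese characters within the hidden Markov model (HMM) framework, describes the whole process of modeling and recognizing offline handwritten characters with HMMs, and reports experimental results: on the 3,755 characters of the national standard level-1 set, recognition rates of 96.4% and 91.5% were reached on two test sets respectively.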

8.
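Research on an SVM-Based Machine Learning Method for Offline Handwritten Chinese Character Recognition   Total citations: 3, self-citations: 1, citations by others: 3
A feature extraction method for offline handwritten Chinese characters based on fuzzy statistics is proposed. Features are extracted by combining the wavelet grid method with the stroke density feature method, and a support vector machine (SVM) is then trained to recognize offline handwritten characters. Simulation experiments show that the SVM performs well in offline handwritten Chinese character recognition and that the fuzzy statistical method is effective.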

9.
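This paper closely examines the features of Chinese characters on map backgrounds, in particular the relationship between the characters and the background, and proposes several dedicated preprocessing techniques, including algorithms for separating text from graphics, line skew correction, "slanted character" correction, and character segmentation. After this preprocessing, the characters can be separated from the map background and converted into individual standard dot-matrix blocks, which lays the groundwork for character recognition and ultimately enables automated input of Chinese characters from maps.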

10.
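Implications of Cognitive Psychology Research on Chinese Characters for Automatic Machine Recognition   Total citations: 4, self-citations: 1, citations by others: 3
Several cognitive psychology experiments have consistently confirmed, from different angles, that the four equal quadrants of a square Chinese character carry different amounts of glyph feature information and play different roles when humans recognize characters. The upper-left quadrant is the most important, while the lower-right quadrant plays a much weaker role. Taking into account the quadrant position frequencies of components, this paper discusses some implications of these results for machine recognition of Chinese characters.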

11.
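Research on Parallel Algorithms for Image Signal Recognition   Total citations: 2, self-citations: 0, citations by others: 2
Treating Chinese character recognition as a large-category pattern recognition problem, the article concludes that the running speed of offline handwritten Chinese character image recognition is hard to improve on a single machine and that parallel algorithms are needed; the authors made an effective attempt at this. The article describes the design of a parallel implementation of an effective image recognition algorithm and analyzes its performance.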

12.
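Research on Algorithms for Identifying Chinese Character Internal Codes in Multilingual Environments   Total citations: 9, self-citations: 4, citations by others: 9
The transition of Chinese character internal codes to ISO/IEC 10646 is an inevitable trend toward a unified character encoding for computers, but for some time multiple internal codes will continue to coexist, so automatic identification of the internal code is key to supporting their coexistence. This paper explores how to identify Chinese character internal codes automatically in a multilingual environment where several internal codes coexist, and presents several identification algorithms, including ones based on code distribution, punctuation features, character frequency features, and semantic features. The algorithms are then analyzed and evaluated. In tests on the target samples, the highest recognition rate among these algorithms exceeded 99.9%.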
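
As a rough illustration of identification by code distribution, the sketch below scores a byte string against the two-byte ranges of GB2312 and Big5 and picks the better fit. The range table is a simplified approximation chosen for illustration, the names `RANGES`, `pair_score`, and `detect_encoding` are hypothetical, and the paper's actual algorithms (including the punctuation, frequency, and semantic variants) are not reproduced here.

```python
# Simplified lead/trail byte ranges (an approximation, not the full standards).
RANGES = {
    "gb2312": {"lead": (0xA1, 0xF7), "trail": [(0xA1, 0xFE)]},
    "big5": {"lead": (0xA1, 0xF9), "trail": [(0x40, 0x7E), (0xA1, 0xFE)]},
}


def pair_score(data: bytes, name: str) -> float:
    """Fraction of non-ASCII start positions that begin a valid two-byte pair."""
    lo, hi = RANGES[name]["lead"]
    trails = RANGES[name]["trail"]
    i = good = total = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:  # ASCII byte outside a pair: neutral, skip it
            i += 1
            continue
        total += 1
        nxt = data[i + 1] if i + 1 < len(data) else None
        if lo <= b <= hi and nxt is not None and any(a <= nxt <= z for a, z in trails):
            good += 1
            i += 2  # consume the whole pair
        else:
            i += 1  # invalid here; resynchronize on the next byte
    return good / total if total else 0.0


def detect_encoding(data: bytes) -> str:
    """Pick the candidate with the highest score; ties favor the narrower GB2312."""
    scores = {name: pair_score(data, name) for name in ("gb2312", "big5")}
    return max(scores, key=scores.get)


guess = detect_encoding("汉字内码识别".encode("gb2312"))
```

Byte ranges alone cannot separate the two encodings reliably, because most GB2312 byte pairs are also valid Big5 pairs; that ambiguity is presumably why the paper adds punctuation, character frequency, and semantic features on top of the distribution test.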

13.
A constrained approach to multifont Chinese character recognition   Total citations: 1, self-citations: 0, citations by others: 1
The constraint graph is introduced as a general character representation framework for recognizing multifont, multiple-size Chinese characters. Each character class is described by a constraint graph model. Sampling points on a character skeleton are taken as nodes in the graph. Connection constraints and position constraints are taken as arcs in the graph. For patterns of the same character class, the model captures both the topological invariance and the geometrical invariance in a general and uniform way. Character recognition is then formulated as a constraint-based optimization problem. A cooperative relaxation matching algorithm that solves this optimization problem is developed. A practical optical character recognition (OCR) system that is able to recognize multifont, multiple-size Chinese characters with satisfactory performance was implemented.

14.
赵彦斌  李庆华 《计算机应用》2006,26(6):1396-1397
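Text similarity analysis, clustering, and classification are mostly based on feature words. Because Chinese words are not separated by delimiters, groundwork such as word segmentation and the handling of a high-dimensional feature space inevitably incurs high computational cost. This paper explores an approach to text similarity analysis that uses the relationships between Chinese characters instead of feature words. It first defines a way to quantify the relationship between characters in a text and proposes the concept of character association degree. It then constructs a character association degree matrix to represent a Chinese text and designs a text similarity measure based on that matrix. Experimental results show that character association degree outperforms statistics such as two-character word frequency, mutual information, and the t-test. Because no word segmentation is needed, the algorithm is suitable for processing massive amounts of Chinese text.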
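
To make the idea of a character association matrix concrete, here is a minimal sketch that needs no word segmentation. The abstract does not give the paper's definition of association degree, so this version assumes a distance-weighted co-occurrence count within a small window and compares two texts by cosine similarity; `association_matrix`, `cosine`, and the `window` parameter are all hypothetical names and choices.

```python
import math
from collections import Counter


def association_matrix(text: str, window: int = 2) -> Counter:
    """Sparse matrix mapping (char_a, char_b) to a weighted co-occurrence count."""
    # Keep only CJK unified ideographs; punctuation and Latin text are dropped,
    # so characters on either side of a punctuation mark count as neighbors.
    chars = [c for c in text if "\u4e00" <= c <= "\u9fff"]
    matrix = Counter()
    for i, a in enumerate(chars):
        for d in range(1, window + 1):
            if i + d < len(chars):
                # Assumed weighting: closer pairs contribute more.
                matrix[(a, chars[i + d])] += 1.0 / d
    return matrix


def cosine(m1: Counter, m2: Counter) -> float:
    """Cosine similarity between two sparse association matrices."""
    dot = sum(v * m2.get(k, 0.0) for k, v in m1.items())
    n1 = math.sqrt(sum(v * v for v in m1.values()))
    n2 = math.sqrt(sum(v * v for v in m2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0


sim = cosine(association_matrix("汉字识别方法"), association_matrix("汉字识别系统"))
```

A sparse dictionary keyed by character pairs keeps memory proportional to the pairs actually observed rather than to the square of the character set, which matters for the large-scale processing the abstract has in mind.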

15.
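On Database Management Techniques for Chinese Character Codebooks   Total citations: 2, self-citations: 1, citations by others: 2
Research on any Chinese input method runs into the problem of handling the codebook. In different periods, differing application requirements have given the codebook different forms. This paper first proposes the concept of a Chinese character codebook database: a data structure that maps Chinese character information to its corresponding attributes. It then discusses two kinds of codebook at different levels: the database codebook and the binary codebook. Based on practical experience, the paper divides codebook databases at different stages into text-file, database-codebook, and binary-file forms, and discusses the management techniques for each.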

16.
A Dialectal Chinese Speech Recognition Framework   Total citations: 2, self-citations: 0, citations by others: 2
A framework for dialectal Chinese speech recognition is proposed and studied, in which a relatively small dialectal Chinese (in other words, Chinese influenced by the native dialect) speech corpus and dialect-related knowledge are adopted to transform a standard Chinese (or Putonghua, abbreviated as PTH) speech recognizer into a dialectal Chinese speech recognizer. Two kinds of knowledge sources are explored: one is expert knowledge and the other is a small dialectal Chinese corpus. These knowledge sources provide information at four levels: phonetic level, lexicon level, language level, and acoustic decoder level. This paper takes Wu dialectal Chinese (WDC) as an example target language. The goal is to establish a WDC speech recognizer from an existing PTH speech recognizer based on the Initial-Final structure of the Chinese language and a study of how dialectal Chinese speakers speak Putonghua. The authors propose to use context-independent PTH-IF mappings (where IF means either a Chinese Initial or a Chinese Final), context-independent WDC-IF mappings, and syllable-dependent WDC-IF mappings (obtained from either experts or data), and combine them with the supervised maximum likelihood linear regression (MLLR) acoustic model adaptation method. To reduce the size of the multi-pronunciation lexicon introduced by the IF mappings, which might also increase lexicon confusion and hence degrade performance, a Multi-Pronunciation Expansion (MPE) method based on the accumulated uni-gram probability (AUP) is proposed. In addition, some commonly used WDC words are selected and added to the lexicon. Compared with the original PTH speech recognizer, the resulting WDC speech recognizer achieves a 10-18% absolute Character Error Rate (CER) reduction when recognizing WDC, with only a 0.62% CER increase when recognizing PTH. The proposed framework and methods are expected to work not only for Wu dialectal Chinese but also for other dialectal Chinese languages and even other languages.
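
The results above are reported as Character Error Rate. For reference, CER is conventionally the character-level edit distance between the recognizer output and the reference transcript, divided by the reference length; the sketch below follows that standard definition and is not taken from the paper. The function name `cer` is chosen here for illustration.

```python
def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: Levenshtein distance divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, r in enumerate(reference, 1):
        cur = [i]
        for j, h in enumerate(hypothesis, 1):
            cur.append(min(
                prev[j] + 1,             # deletion of a reference character
                cur[j - 1] + 1,          # insertion of a hypothesis character
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = cur
    return prev[-1] / len(reference)


rate = cer("abcd", "abxd")  # one substitution in four characters: 0.25
```

An absolute reduction, as quoted in the abstract, is the difference between two such rates in percentage points rather than a ratio of them.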

17.
The Bees Algorithm is a population-based, computation-bound method inspired by the natural behavior of honey bees that finds a near-optimal solution to a search problem. Recently, many parallel swarm-based algorithms have been developed to run on GPUs (Graphics Processing Units), so developing a parallel Bees Algorithm for the GPU has become important. In this paper, we extend the Bees Algorithm to run on CUDA (Compute Unified Device Architecture); we call the result CUBA (CUDA-based Bees Algorithm). We evaluate the performance of CUBA through experiments on a number of well-known optimization problems. The results show that CUBA significantly outperforms the standard Bees Algorithm on many of these problems.
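
For readers unfamiliar with the method, the following is a minimal sequential sketch of a basic Bees Algorithm for minimization, not the CUBA implementation. The parameter names (`n` scouts, `m` selected sites, `e` elite sites, `nep` and `nsp` recruits, `ngh` neighborhood half-width) follow common descriptions of the algorithm, and all default values are assumptions.

```python
import random


def bees_algorithm(f, bounds, n=20, m=5, e=2, nep=7, nsp=3, ngh=0.5,
                   iters=50, seed=0):
    """Minimize f over the box `bounds`; return (best_point, best_value)."""
    rng = random.Random(seed)

    def scout():
        # Global search: a uniformly random point inside the bounds.
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def neighbor(site):
        # Local search: perturb each coordinate, then clamp to the bounds.
        return [min(hi, max(lo, x + rng.uniform(-ngh, ngh)))
                for x, (lo, hi) in zip(site, bounds)]

    sites = sorted((scout() for _ in range(n)), key=f)
    for _ in range(iters):
        next_sites = []
        for rank, site in enumerate(sites[:m]):
            # Elite sites receive more recruited bees than the other selected sites.
            recruits = nep if rank < e else nsp
            candidates = [site] + [neighbor(site) for _ in range(recruits)]
            next_sites.append(min(candidates, key=f))  # fittest bee keeps the site
        # The remaining bees scout new random sites.
        next_sites += [scout() for _ in range(n - m)]
        sites = sorted(next_sites, key=f)
    return sites[0], f(sites[0])


def sphere(x):
    return sum(v * v for v in x)


best, value = bees_algorithm(sphere, [(-5.0, 5.0)] * 2, seed=1)
```

The neighborhood searches around different sites are independent of one another, which makes the inner loop a natural candidate for GPU parallelism; how CUBA actually maps the work onto CUDA threads and blocks is not described in the abstract.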

18.
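A color target recognition method for robot vision based on an improved SSDA algorithm is proposed. The SSDA algorithm is improved by using weight coefficients for the color components, and the target's shape and size information is introduced during image feature extraction. Experiments show that these measures effectively reduce the amount of computation and improve recognition accuracy, with good real-time performance and robustness.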
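
The sketch below shows the general shape of template matching in the SSDA (sequential similarity detection algorithm) style with per-channel color weights: the weighted absolute difference is accumulated pixel by pixel, and a candidate window is abandoned as soon as its error reaches the best error found so far. It is a simplified variant, not the paper's method; the weight values, the function names, and the use of the running best error as the abort threshold are assumptions, and the shape and size features are omitted.

```python
def window_error(image, template, r, c, weights, limit):
    """Accumulate weighted color error; return None once it reaches `limit`."""
    err = 0.0
    for i, row in enumerate(template):
        for j, q in enumerate(row):
            p = image[r + i][c + j]
            err += sum(w * abs(a - b) for w, a, b in zip(weights, p, q))
            if err >= limit:  # early abort: this window cannot beat the best
                return None
    return err


def ssda_match(image, template, weights=(0.5, 0.3, 0.2), threshold=float("inf")):
    """Return (row, col, error) of the best window, or (None, None, threshold).

    image and template are 2-D lists of (r, g, b) tuples.
    """
    rows = len(image) - len(template) + 1
    cols = len(image[0]) - len(template[0]) + 1
    best = (None, None, threshold)
    for r in range(rows):
        for c in range(cols):
            err = window_error(image, template, r, c, weights, best[2])
            if err is not None:
                best = (r, c, err)
    return best


# A 4x4 black image with a 2x2 red patch whose top-left corner is at (1, 2).
image = [[(0, 0, 0)] * 4 for _ in range(4)]
for r in (1, 2):
    for c in (2, 3):
        image[r][c] = (255, 0, 0)
template = [[(255, 0, 0)] * 2 for _ in range(2)]
match = ssda_match(image, template)
```

The early abort is where the saving in computation comes from: most windows are rejected after a few pixels, and weighting the channels that best distinguish the target makes the error grow faster on mismatched windows.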

19.
Membrane systems are parallel distributed computing models that are used in a wide variety of areas. Use of a sequential machine to simulate membrane systems loses the advantage of parallelism in Membrane Computing. In this paper, an innovative classification algorithm based on a weighted network is introduced. Two new algorithms have been proposed for simulating membrane systems models on a Graphics Processing Unit (GPU). Communication and synchronization between threads and thread blocks in a GPU are time-consuming processes. In previous studies, dependent objects were assigned to different threads. This increases the need for communication between threads, and as a result, performance decreases. In previous studies, dependent membranes have also been assigned to different thread blocks, requiring inter-block communications and decreasing performance. The speedup of the proposed algorithm on a GPU that classifies dependent objects using a sequential approach, for example with 512 objects per membrane, was 82×, while for the previous approach (Algorithm 1), it was 8.2×. For a membrane system with high dependency among membranes, the speedup of the second proposed algorithm (Algorithm 3) was 12×, while for the previous approach (Algorithm 1) and the first proposed algorithm (Algorithm 2) that assign each membrane to one thread block, it was 1.8×.

20.
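Chinese characters have a long history. From the Shang and Zhou dynasties to the present, they have evolved over several thousand years, absorbing the essence of China's traditional culture along the way. They are not only visual symbols that convey meaning but also design elements with strong expressive and emotional power, highly decorative and distinctly national in character. They are in themselves aesthetically composed graphics, with a strong sense of symbolic identity and great artistic appeal. Understanding the rich cultural and emotional connotations of Chinese characters and applying them to modern design is significant not only for modern design but also for the inheritance and dissemination of ancient Chinese culture.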
