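Handwritten Chinese character recognition is an important research direction and application area in pattern recognition and machine learning. In recent years, as deep learning theory and methods have matured and new techniques have emerged in quick succession, deep neural networks have achieved breakthroughs in typical applications such as image recognition and classification and image generation. Among them, deep residual networks, a recent research result, have been applied successfully to handwritten digit recognition, image classification, and other fields. This paper studies how to apply deep residual networks to offline isolated handwritten Chinese character recognition: it improves the unit structure of the residual learning module to optimize network performance, and preprocesses the training set to improve the trained model's performance at the data level. Finally, experiments are designed to verify the feasibility of deep residual networks and the end-to-end approach for offline handwritten Chinese character recognition, and the remaining problems and future research directions are analyzed and summarized.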
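
For orientation, the residual learning idea can be sketched as follows. This is a minimal illustration of a standard residual unit, y = relu(F(x) + x), and not the modified unit structure proposed in the paper, whose details the abstract does not give. The helper names (`relu`, `linear`, `residual_unit`) and the choice of two dense layers for F are assumptions made for the example.

```python
def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]


def linear(v, weights, bias):
    """Dense layer; `weights` holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weights, bias)]


def residual_unit(x, w1, b1, w2, b2):
    """Compute relu(F(x) + x), where F is two dense layers with a ReLU between.

    The skip connection adds the input back, so the layers only have to
    learn the residual F(x) rather than the full mapping.
    """
    fx = linear(relu(linear(x, w1, b1)), w2, b2)
    return relu([f + xi for f, xi in zip(fx, x)])


# With all-zero weights F(x) = 0, so the unit passes a non-negative
# input through unchanged.
zeros = [[0.0, 0.0], [0.0, 0.0]]
identity_out = residual_unit([1.0, 2.0], zeros, [0.0, 0.0], zeros, [0.0, 0.0])
```

The zero-weight case shows why very deep stacks of such units remain trainable: a unit that has learned nothing behaves as an identity mapping instead of degrading the signal.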

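To address the low recognition accuracy of existing text recognition networks caused by insufficient depth, this paper proposes an improved end-to-end text recognition network architecture. First, the text is treated as a sequence, and residual modules are used to slice the text column by column into feature vectors that are fed to the recurrent layers. This residual structure increases the depth of the convolutional network, preserving the network's best representational capacity for text images and capturing the text information. In addition, the residual modules use stacked layers to learn a residual mapping, which improves convergence as the number of layers grows. Next, recurrent layers model the context of these text feature sequences, and the result is fed to a Softmax layer to predict the label for each element of the sequence, which allows text of arbitrary length to be recognized. The recurrent layers use long short-term memory (LSTM) networks to learn the dependencies within the text and to address the vanishing-gradient problem when training on long sequences. Finally, the text labels are transcribed with the best-path method, which finds the path with the highest probability and outputs the sequence corresponding to that path as the best sequence. The improved architecture is deeper, which improves both its ability to describe text image features and its stability under noise. The proposed algorithm is compared experimentally with typical existing algorithms on several test datasets (ICDAR2003, ICDAR2013, SVT, and IIIT5K); the results show that the architecture achieves higher scene text recognition accuracy, confirming its effectiveness.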
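
The best-path transcription step can be sketched as below. The abstract does not say which transcription layer is used; this example assumes a CTC-style output with a dedicated blank class, which is the usual setting for best-path decoding. The function name, the blank index, and the toy alphabet are all hypothetical.

```python
def best_path_decode(frame_probs, alphabet, blank=0):
    """Greedy best-path decoding under a CTC-style assumption.

    frame_probs: list of per-frame probability rows (one value per class).
    alphabet:    class index -> character; alphabet[blank] is a placeholder.
    """
    # The most probable single path takes the arg-max class in every frame.
    path = [max(range(len(row)), key=row.__getitem__) for row in frame_probs]

    # Collapse consecutive repeats, then drop blanks.
    decoded = []
    previous = None
    for k in path:
        if k != previous and k != blank:
            decoded.append(alphabet[k])
        previous = k
    return "".join(decoded)


alphabet = ["-", "a", "b"]  # index 0 is the assumed blank symbol
frame_probs = [
    [0.10, 0.80, 0.10],  # a
    [0.20, 0.70, 0.10],  # a (repeat, collapsed)
    [0.90, 0.05, 0.05],  # blank separates the two a's
    [0.10, 0.80, 0.10],  # a
    [0.10, 0.20, 0.70],  # b
    [0.20, 0.10, 0.70],  # b (repeat, collapsed)
]
decoded_text = best_path_decode(frame_probs, alphabet)  # "aab"
```

Because the blank frame sits between the two runs of `a`, the doubled letter survives the collapse step. This is what lets a fixed number of frames represent label sequences of arbitrary length.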

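This work studies how to improve handwriting recognition accuracy across different writing styles. A deep convolutional neural network is combined with an autoencoder, and the number of convolutional autoencoder layers is designed, to form a deep convolutional autoencoder neural network. First, bilinear interpolation is used to preprocess both the MNIST dataset and 10,000 self-collected images of digits handwritten by Chinese university students. The network is then trained and tested on the MNIST dataset alone. Finally, MNIST is mixed with 5,000 images from the self-collected dataset to retrain the network, which is tested on the other 5,000 images. The experimental data show that the proposed network reaches 99.37% accuracy on the MNIST test set, effectively improving accuracy, and 99.33% on the 5,000 self-collected test images, indicating that the algorithm is practical, achieves high recognition accuracy on digits in different writing styles, and yields an accurate and effective model.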
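
The bilinear interpolation used for preprocessing can be illustrated with a small pure-Python resize. This is a sketch under assumptions: the abstract does not give the target size or the coordinate convention, so the corner-aligned mapping and the function name `bilinear_resize` are choices made here for illustration.

```python
def bilinear_resize(img, out_h, out_w):
    """Resize a 2-D grayscale image (list of rows) by bilinear interpolation.

    Uses a corner-aligned mapping: the corner pixels of the output coincide
    with the corner pixels of the input.
    """
    in_h, in_w = len(img), len(img[0])
    out = []
    for i in range(out_h):
        # Fractional source row for this output row.
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        dy = y - y0
        row = []
        for j in range(out_w):
            # Fractional source column for this output column.
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            dx = x - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = img[y0][x0] * (1 - dx) + img[y0][x1] * dx
            bottom = img[y1][x0] * (1 - dx) + img[y1][x1] * dx
            row.append(top * (1 - dy) + bottom * dy)
        out.append(row)
    return out


small = [[0.0, 2.0],
         [4.0, 6.0]]
resized = bilinear_resize(small, 3, 3)
# resized == [[0.0, 1.0, 2.0], [2.0, 3.0, 4.0], [4.0, 5.0, 6.0]]
```

In practice the same operation would bring the self-collected digit images to the same size as the MNIST images, so that both datasets can share one network input layer.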

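Traditional recurrent neural networks have complex structures, need large amounts of data to be trained properly for continuous speech recognition, take a long time to train, and place heavy demands on hardware performance. To address these problems, an algorithm based on residual networks and gated convolutional neural networks is proposed and combined with the connectionist temporal classification (CTC) algorithm to build an end-to-end Chinese speech recognition model. The model takes spectrograms as input, extracts high-level abstract features with a residual network, and then, by stacking gated convolutional neural...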

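Traffic sign recognition is one of the key technologies indispensable to driverless and intelligent driving systems. To improve traffic sign recognition accuracy and further improve the safety of driverless vehicles on the road, a traffic sign recognition method based on deep residual networks is proposed. Residual modules of different sizes are stacked to build a network model with 100 convolutional layers. With the Belgian traffic sign dataset BTSC as the experimental data, the recognition accuracy obtained after the network model was optimized...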

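Extracting and detecting road networks from high-resolution remote sensing imagery has long been both a focus and a difficulty in computer vision research. At present, most deep-learning-based road network detection methods for remote sensing imagery are convolutional neural networks built on the standard convolution operation. Depthwise separable convolutional neural networks, built on the depthwise separable convolution operation, are an alternative that not only surpasses standard convolutional neural networks in feature extraction capability but also requires fewer parameters and less computation. Accordingly, this paper replaces the convolution operation with depthwise separable convolution and introduces residual modules to construct a depthwise separable residual network for automatic road network detection in remote sensing imagery. Experimental results on the RRSI and CHN6-CUG datasets show that, although the accuracy and loss of the depthwise separable residual network differ little from those of the corresponding convolutional neural network and residual network, its training time is far shorter and its actual road network detection results are also better.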
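
The parameter saving from depthwise separable convolution can be checked with a short calculation. The layer sizes below (3x3 kernel, 64 input channels, 128 output channels) are illustrative assumptions rather than values from the paper, and bias terms are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution: every output channel
    has its own k x k filter over every input channel."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise separable convolution: one k x k filter per
    input channel (depthwise), then a 1 x 1 convolution that mixes
    channels (pointwise)."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


std = standard_conv_params(3, 64, 128)   # 73728
sep = separable_conv_params(3, 64, 128)  # 576 + 8192 = 8768
ratio = sep / std                        # equals 1/c_out + 1/k**2, about 0.119
```

For this layer the separable form needs roughly 12% of the weights, which is consistent with the shorter training time the abstract reports, although the measured speedup depends on the full architecture and hardware.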

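Handwritten text recognition is mainly used in text input technology and plays a key role in the development of human-computer interaction. Because most online input methods cannot recognize mixed Chinese-English handwriting, an online recognition method for mixed Chinese-English handwritten text is proposed. Text strokes are merged according to rules based on horizontal relative position, vertical overlap ratio, and area overlap ratio, and connected strokes are split, yielding a series of character fragments. Geometric features such as stroke count, aspect ratio, center deviation, and smoothness, together with recognition confidence, are then used to classify the fragments as Chinese or English. On this basis, according to the classification results and combined with path evaluation by a natural language model and a dynamic programming search algorithm, the candidate Chinese and English character fragments are merged separately to obtain the Chinese and English character sequences to be recognized, which are fed to the Chinese and English recognition models of a convolutional neural network, respectively, to produce the handwritten text recognition result. Experimental results show a recognition accuracy of 93.67% for online mixed Chinese-English handwritten text; the method not only segments online handwritten Chinese text lines but also segments well online handwritten Chinese-English text lines that contain connected characters.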

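To address the complex network structure and high time cost of deep residual networks in face recognition on small mobile devices, a lightweight model based on deep residual networks is proposed. First, the structure of the deep residual network is simplified and optimized and, combined with a knowledge transfer method, a lightweight residual network (the student network) is reconstructed from the deep residual network (the teacher network), reducing structural complexity while preserving accuracy. Then, standard convolutions in the student network are decomposed to reduce the number of model parameters, lowering the time complexity of the feature extraction network. Experimental results show that on four datasets (LFW, VGG-Face, AgeDB, and CFP-FP) the proposed model achieves recognition accuracy close to that of mainstream face recognition methods, with a single-image inference time of 16 ms, a speedup of 10% to 20%. The proposed model thus improves inference speed effectively with essentially no loss in recognition accuracy.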
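
The abstract does not specify which knowledge transfer method links the teacher and student networks. One common choice is soft-target distillation, sketched here purely as an assumption: the student is trained to match the teacher's temperature-softened output distribution. The function names and the temperature value are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence from the softened teacher distribution to the
    softened student distribution (assumed soft-target distillation)."""
    p = softmax(teacher_logits, temperature)  # teacher targets
    q = softmax(student_logits, temperature)  # student predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


teacher = [4.0, 1.0, 0.5]
matched = distillation_loss([4.0, 1.0, 0.5], teacher)     # student agrees
mismatched = distillation_loss([0.5, 1.0, 4.0], teacher)  # student disagrees
```

The loss is zero when the student reproduces the teacher's logits and grows as the two distributions diverge. In a full training setup it would typically be combined with the ordinary classification loss on the ground-truth labels.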

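In recent years, the emergence of many small intelligent detection devices (such as drones and small intelligent vehicles) has posed great challenges for traditional radar target recognition methods. When radar is used to detect such small targets, the energy of the returned echo is usually low, which makes the targets hard to identify with traditional constant false alarm rate (CFAR) detection under complex environmental noise and clutter. To address these problems, a multi-class radar target recognition model based on a long short-term memory (LSTM) network with residual connections is proposed using deep learning. The dataset is built from echo sequences at adjacent time points in the same range gate; a multi-layer LSTM network extracts the temporal information in the radar echo samples, and residual connections are added to avoid the degradation that occurs as the number of layers grows. The categorical cross-entropy (CCE) function, used for multi-class classification, serves as the loss function for training the network, which recognizes and classifies four classes of targets: drones, intelligent vehicles, pedestrians, and noise. Test results show that the residual-connection LSTM model achieves higher recognition accuracy and F1 score than the traditional CFAR detection method.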
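
The categorical cross-entropy loss for the four-class problem can be written out directly. This sketch assumes the network emits one logit per class and that the label is a single class index; the class order in `CLASSES` is an assumption made for the example.

```python
import math

# Assumed class order; the abstract names the four classes but not their indices.
CLASSES = ["drone", "intelligent vehicle", "pedestrian", "noise"]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(logits, target_index):
    """CCE for one sample with a one-hot target: -log p(target class)."""
    probs = softmax(logits)
    return -math.log(probs[target_index])


# A network with no preference among four classes incurs a loss of log(4).
uniform_loss = categorical_cross_entropy([0.0, 0.0, 0.0, 0.0], CLASSES.index("drone"))

# A confident, correct prediction drives the loss toward zero.
confident_loss = categorical_cross_entropy([8.0, 0.0, 0.0, 0.0], CLASSES.index("drone"))
```

The uniform case gives a useful reference point during training: a loss that stays near log(4), about 1.386, means the model has not yet learned to separate the classes.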

This paper presents a handwritten text biometric recognition system suitable for short sequences of text (words). Strokes are considered the structural units of handwriting, with words regarded as two separate sequences: one of pen-down strokes and one of pen-up strokes. Unsupervised categorization by means of a self-organized map allows strokes to be mapped to integers and the sequences to be compared efficiently by means of dynamic time warping. Measures obtained from each sequence are combined in a later step. This separation gives us the opportunity to show that pen-up strokes possess a surprisingly high discriminative power, while the performance of the combination suggests they may carry non-redundant information with respect to pen-down strokes. A writer identification rate of 92.38% and a minimum detection cost function of 0.046 (4.6%) are achieved with 370 users and just one word. Results improve to 96.46% and 0.033 (3.3%) when two words are combined.
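
Comparing two integer-coded stroke sequences by dynamic time warping can be sketched as follows. The abstract does not state the local cost between two stroke codes, so this example assumes a 0/1 mismatch cost; a distance between units on the self-organized map grid would be another plausible choice. The function name and sample sequences are hypothetical.

```python
def dtw_distance(a, b, cost=lambda x, y: 0.0 if x == y else 1.0):
    """Dynamic time warping distance between two sequences.

    d[i][j] holds the cheapest alignment cost of a[:i] with b[:j].
    The default local cost is an assumed 0/1 mismatch between stroke codes.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = cost(a[i - 1], b[j - 1])
            # Extend the cheapest of: a advances, b advances, or both advance.
            d[i][j] = step + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


# Same stroke codes written at different "speeds": warping absorbs the repeats.
same_writer = dtw_distance([3, 3, 7, 1], [3, 7, 7, 1])  # 0.0

# One stroke code differs, so one mismatch is unavoidable.
one_mismatch = dtw_distance([1, 2, 3], [1, 5, 3])       # 1.0
```

In the setting described above, one such distance would be computed for the pen-down sequence and one for the pen-up sequence, and the two measures combined afterwards.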

This paper presents an end-to-end system for reading handwritten page images. Five functional modules included in the system are introduced in this paper: (i) pre-processing, which concerns introducing an image representation for easy manipulation of large page images and image handling procedures using the image representation; (ii) line separation, concerning text line detection and extracting images of lines of text from a page image; (iii) word segmentation, which concerns locating word gaps and isolating words from a line of text image obtained efficiently and in an intelligent manner; (iv) word recognition, concerning handwritten word recognition algorithms; and (v) linguistic post-processing, which concerns the use of linguistic constraints to intelligently parse and recognize text. Key ideas employed in each functional module, which have been developed for dealing with the diversity of handwriting in its various aspects with a goal of system reliability and robustness, are described in this paper. Preliminary experiments show promising results in terms of speed and accuracy. Received October 30, 1998 / Revised January 15, 1999

Multimedia Tools and Applications - A significant issue in the domain of optical character recognition is handwritten text recognition. Here, two novel feature extraction techniques are proposed...

Multimedia Tools and Applications - Offline Handwritten Text Recognition (HTR) has been an active area of research due to its wide range of applications and challenges. Recently, many offline HTR...

This paper presents a new Bayesian-based method of unconstrained handwritten offline Chinese text line recognition. In this method, a sample of a real character or non-character in realistic handwritten text lines is jointly recognized by a traditional isolated character recognizer and a character verifier, which requires just a moderate number of handwritten text lines for training. To improve its ability to distinguish between real characters and non-characters, the isolated character recognizer is negatively trained using a linear discriminant analysis (LDA)-based strategy, which employs the outputs of a traditional MQDF classifier and the LDA transform to re-compute the posterior probability of isolated character recognition. In tests with 383 text lines in the HIT-MW database, the proposed method achieved character-level recognition rates of 71.37% without any language model and 80.15% with a bi-gram language model. These promising results show the effectiveness of the proposed method for unconstrained handwritten offline Chinese text line recognition.

A database for handwritten text recognition research (total citations: 4; self-citations: 0; citations by others: 4)
An image database for handwritten text recognition research is described. Digital images of approximately 5000 city names, 5000 state names, 10000 ZIP Codes, and 50000 alphanumeric characters are included. Each image was scanned from mail in a working post office at 300 pixels/in in 8-bit gray scale on a high-quality flat bed digitizer. The data were unconstrained for the writer, style, and method of preparation. These characteristics help overcome the limitations of earlier databases that contained only isolated characters or were prepared in a laboratory setting under prescribed circumstances. Also, the database is divided into explicit training and testing sets to facilitate the sharing of results among researchers as well as performance comparisons.

This paper deals with the problem of off-line handwritten text recognition. It presents a text recognition system that exploits an original principle of adaptation to the handwriting to be recognized. The adaptation principle is based on automatically learning, during recognition, the graphical characteristics of the handwriting. This on-line adaptation of the recognition system relies on iterating two steps: a word recognition step that labels the writer's representations (allographs) across the whole text, and a step that re-evaluates the character models. Tests carried out on a sample of 15 writers, all unknown to the system, demonstrate the value of the proposed adaptation scheme: recognition rates improve over the iterations at both the letter and the word levels.

This paper investigates rejection strategies for unconstrained offline handwritten text line recognition. The rejection strategies depend on various confidence measures that are based on alternative word sequences. The alternative word sequences are derived from specific integration of a statistical language model in the hidden Markov model based recognition system. Extensive experiments on the IAM database validate the proposed schemes and show that the novel confidence measures clearly outperform two baseline systems which use normalised likelihoods and local n-best lists, respectively.

