Similar literature
1.
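User interfaces built on recognition technology are error-prone because recognition rates are limited, so providing natural and efficient error-correction methods for such interfaces is important. Handwritten mathematical formulas have a two-dimensional structure, which makes them hard to recognize and to correct. This paper proposes a multimodal technique for correcting recognition errors in handwritten mathematical formulas. It lets users correct segmentation errors with the pen, and correct symbol-recognition and expression-structure-analysis errors with pen and speech. The core of the technique is a multimodal fusion algorithm. The algorithm takes the pen-selected symbols and the speech as input, chooses a fusion method according to whether the speech input is a mathematical term or a mathematical symbol, and finally revises the handwritten formula and outputs the most likely recognition result. Experimental results show that the technique corrects errors in handwritten formula recognition effectively and is more efficient than pen-only unimodal correction.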

2.
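Post-processing is an important step for detecting and correcting errors in text produced by character recognition. Lao character recognition output contains many substitution errors between similar characters, as well as insertion and deletion errors caused by broken or touching characters. After analyzing this problem, the paper proposes a multi-task post-recognition correction method for Lao that incorporates character shape features. The method adopts a seq2seq architecture based on long short-term memory networks and feeds Lao glyph features into the model to help it correct similar-character substitution errors. To handle insertion and deletion errors, the encoder is combined with a multi-scale convolutional network that extracts local text features with different kernel sizes. A language model then re-ranks the text sequences predicted by the decoder against the original text to obtain the best candidate. In addition, multi-task learning is used, with error detection as an auxiliary task that improves correction, and the dataset is expanded through data augmentation. Experimental results show that the method reduces the character error rate of Lao text recognition to 7.94%.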

3.
李宇霞, 孙永奇, 闫茹, 朱卫国. 《计算机工程》 (Computer Engineering), 2021, 47(1): 255-263, 274
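Optical character recognition can effectively improve the efficiency of entering bill information in bill-processing applications. To address the drop in recognition accuracy caused by complex bill backgrounds and irregular handwritten characters, this paper combines convolutional neural network image recognition with semantic reliability and proposes a reliability-first path search method that reduces the interference of ambiguous characters on the search path. A prefix and suffix inference strategy based on the structural characteristics of company names effectively resolves recognition errors in company-name prefixes and suffixes. Jieba Chinese word segmentation and character position information are used to check the recognition results for errors, and correction combines a long short-term memory language model with a Chinese character component similarity measure introduced on top of traditional glyph similarity. Experimental results show that combining the correction strategy with the proposed method raises company-name recognition accuracy to 93.08%.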

4.
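Handwritten bill recognition is one of the difficult research problems in pattern recognition; varied handwriting styles and complex bill backgrounds keep its accuracy low. The amount written in Chinese capital numerals is the most important part of a bill, and recognizing it accurately is key to automatic handwritten bill recognition. This paper studies segmentation-based recognition and processing of handwritten capital-numeral amounts and proposes a recognition method based on a convolutional neural network (CNN) and a finite state automaton. Single characters are obtained by over-segmentation and by combining over-segmented pieces, and are then recognized by the CNN. A finite state automaton for grammar checking is built by classifying the characters and defining the logical relations between the classes. The grammar automaton selects strings that satisfy the grammar rules from the recognition results and is also used to optimize the path search. On this basis, the grammar automaton predicts ambiguous characters to correct CNN recognition errors. Experimental results show accuracies of 98.2% and 96.6% for single characters and text lines of capital-numeral amounts, respectively.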

5.
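Online handwritten mathematical formula recognition faces uncertainty in the written characters, complex formula structure, and writing styles that vary from person to person. In particular, when the writing contains accidental errors or complex structures, existing machine-only recognition algorithms have low accuracy. To address this, a human-in-the-loop handwritten formula recognition method is proposed. It brings the human in mainly at the structure analysis stage: the user modifies ambiguous strokes in a structure and adds supplementary structural strokes, which completes and delimits the information about structural strokes and the strokes inside a structure. To evaluate the method, it was compared with a baseline recognition method without user participation in terms of structure recognition rate and expression recognition rate. The results show that the method effectively encourages users to take part in the recognition process. On the handwritten formula data collected for the experiment, introducing user participation improved the structure and expression recognition rates by 9.26% and 13.99%, respectively.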

6.
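Chinese spelling correction is the task of detecting and correcting spelling errors in text. Most Chinese spelling errors are misused characters that are similar in meaning, pronunciation, or shape, so a common approach is to extract and model features from these different modalities. However, fusing the features directly or summing them with fixed weights ignores the relative importance of the modalities and biases the model when it identifies errors, which prevents it from learning effectively. To improve on this, a new model is proposed: a Chinese correction algorithm based on the fusion of text-sequence error probability and Chinese spelling error probability. The method uses the text-sequence error probability as a dynamic weight and the probability of common Chinese spelling errors as a fixed weight to fuse semantic, phonetic, and glyph information efficiently. The model can reasonably control how much information from each modality flows into the mixed-modality representation and focuses learning on the positions where errors occur. Experiments on the SIGHAN benchmark show that the proposed model improves all evaluation scores across the datasets, which verifies the feasibility of the algorithm.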

7.
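In character recognition, recognizing touching characters is a widely studied technical difficulty, and segmenting touching characters is one of the main causes of recognition errors. To segment characters quickly and accurately, and after summarizing the characteristics and shortcomings of existing methods, a segmentation algorithm based on word-fragment recognition is proposed and implemented for electronic reading pen systems, taking into account how such systems work and their real-time requirements. Because the method recognizes letter combinations, it relaxes the demands that traditional isolated-character recognition places on character segmentation. It uses the center growth method and an improved peak-valley function as segmentation tools, which is simple and practical. As a result, it reduces recognition errors caused by mis-segmented touching characters, lowers computational complexity, and is suitable for embedded devices such as reading pens. Experiments show that the algorithm is efficient, simple to implement, and reduces the recognition errors caused by segmentation errors.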

8.
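Handwritten character recognition method based on compressed sensing   Cited by: 1 in total (self-citations: 0, citations by others: 1)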
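Based on the recently developed theory of compressed sensing, a robust handwritten character recognition method is proposed that recognizes noisy characters well. The method represents the test character sparsely and uses an l1-norm minimization algorithm to find the sparsest coefficient solution. The resulting coefficients carry clear class information, which makes the test character easy to classify. Experimental results show that the method is highly robust to noise.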

9.
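To address the effect of random noise on the recognition rate of handwritten Chinese character images, a new algorithm combining deep learning with noise suppression is proposed. The algorithm targets handwritten Chinese character images that contain random noise. In a Python environment, a model combining noise suppression with a convolutional neural network is built on the Caffe platform; the model removes the noise and correctly recognizes the handwritten characters. The algorithm removes noise without altering the shape of the characters, which preserves the original information of the Chinese characters. Experiments were repeated under two kinds of noise (Gaussian noise and salt-and-pepper noise) with gradually increasing noise intensity and compared with other methods, giving an average recognition rate of 97.05%. The experimental results show that the model and algorithm are efficient and have strong recognition ability.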

10.
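To address the recognition and evaluation of handwritten mathematical formulas, a character training method based on a convolutional neural network is proposed. Computer vision is used to preprocess the formula image, a convolutional neural network performs the two-dimensional matrix conversion to obtain the corresponding character symbols, and the recognized result is evaluated through a postfix expression. The character model is trained with the Softmax function, and recognition and evaluation results for several types of formulas are tallied and analyzed. The experimental results show that training on characters effectively improves accuracy, and the method can serve as a reference for recognizing and evaluating complex handwritten mathematical formulas.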
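
The postfix-evaluation step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the authors' implementation, so everything here is an assumption: the token format, the four supported operators, and the hypothetical helper names `to_postfix` and `eval_postfix`. It assumes the recognizer has already produced a flat list of symbol tokens.

```python
from typing import List

# Operator precedence; all four operators are treated as left-associative.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def to_postfix(tokens: List[str]) -> List[str]:
    """Convert infix tokens to postfix order with the shunting-yard algorithm."""
    output: List[str] = []
    stack: List[str] = []
    for tok in tokens:
        if tok in PRECEDENCE:
            # Pop operators of equal or higher precedence before pushing.
            while (stack and stack[-1] in PRECEDENCE
                   and PRECEDENCE[stack[-1]] >= PRECEDENCE[tok]):
                output.append(stack.pop())
            stack.append(tok)
        elif tok == "(":
            stack.append(tok)
        elif tok == ")":
            # Flush operators back to the matching opening parenthesis.
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()  # discard the "("
        else:
            output.append(tok)  # operand
    while stack:
        output.append(stack.pop())
    return output


def eval_postfix(postfix: List[str]) -> float:
    """Evaluate a postfix token list using an operand stack."""
    stack: List[float] = []
    for tok in postfix:
        if tok in PRECEDENCE:
            right = stack.pop()
            left = stack.pop()
            if tok == "+":
                stack.append(left + right)
            elif tok == "-":
                stack.append(left - right)
            elif tok == "*":
                stack.append(left * right)
            else:
                stack.append(left / right)
        else:
            stack.append(float(tok))
    return stack[0]


# Hypothetical recognizer output for the handwritten formula 3 + 4 * (2 - 1).
recognized = ["3", "+", "4", "*", "(", "2", "-", "1", ")"]
result = eval_postfix(to_postfix(recognized))  # 7.0
```

Postfix order removes the need for parentheses and precedence checks at evaluation time, which is presumably why it suits a pipeline whose input is a flat sequence of recognized symbols.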

11.
12.
In a noisy environment speech recognizers make mistakes. In order that these errors can be detected the system can synthesize the word recognized and the user can respond by saying “correction” when the word was not recognized correctly. The mistake can then be corrected. Two error-correcting strategies have been investigated. In one, repetition-with-elimination, when a mistake has been detected the system eliminates its last response from the active vocabulary and then the user repeats the word that has been misrecognized. In the other, elimination-without-repetition, the system suggests the next-most-likely word based on the output of its pattern-matching algorithm. It was found that the former strategy, with the user repeating the word, required fewer trials to correct the recognition errors. A model which relates the average number of corrections to the recognition rate has been developed which provides a good fit to the data.
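
The elimination-without-repetition strategy described in this abstract can be sketched as follows. This is an illustrative reading only: the ranked candidate list stands in for the pattern matcher's output, and the helper names `next_hypothesis` and `corrections_needed` are hypothetical, not taken from the paper.

```python
from typing import List, Optional, Set


def next_hypothesis(ranked: List[str], rejected: Set[str]) -> Optional[str]:
    """Return the best-ranked candidate that the user has not yet rejected."""
    for word in ranked:
        if word not in rejected:
            return word
    return None


def corrections_needed(ranked: List[str], intended: str) -> Optional[int]:
    """Count how many times the user must say "correction" before the
    system proposes the intended word, or None if it is never proposed."""
    rejected: Set[str] = set()
    corrections = 0
    while True:
        guess = next_hypothesis(ranked, rejected)
        if guess is None:
            return None  # vocabulary exhausted without a match
        if guess == intended:
            return corrections
        rejected.add(guess)  # eliminate the rejected word and try the next
        corrections += 1


# Assumed matcher output, best candidate first.
ranked_candidates = ["nine", "five", "fine"]
needed = corrections_needed(ranked_candidates, "five")  # 1
```

Repetition-with-elimination would differ in one step: after each rejection the user speaks again, so the ranked list is recomputed from the new utterance over the reduced vocabulary rather than reused.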

13.
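To enable error correction for natural speech and improve the accuracy of natural speech recognition and pronunciation, this work studies the application of artificial intelligence to the design of a natural speech error-correction and feedback system. The system consists of a front-end learning unit and a back-end support unit. Collected natural speech segments are preprocessed, and features are extracted from them based on inter-segment distance partitioning factors. A hidden Markov model recognizes the speech; based on a B2-standard corpus, dynamic time warping is used to correct and score the recognized speech; and a feedback module returns the recognition, correction, and scoring results to the user. Comparative experiments show that the system's speech recognition rate exceeds 95%, its correction results match the actual errors, and it can improve the accuracy of natural speech pronunciation.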
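
Dynamic time warping, which this abstract uses for correction and scoring, can be sketched in a few lines. The version below is a textbook formulation, not the system's own code; it assumes each utterance has been reduced to a one-dimensional feature sequence and uses absolute difference as the local cost, and the name `dtw_distance` is hypothetical.

```python
from typing import List


def dtw_distance(a: List[float], b: List[float]) -> float:
    """Return the minimum cumulative alignment cost between two sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor cells.
            cost[i][j] = local + min(cost[i - 1][j],      # a advances
                                     cost[i][j - 1],      # b advances
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]


# A learner's utterance that stretches one frame still matches the reference.
reference = [1.0, 2.0, 3.0]
stretched = [1.0, 2.0, 2.0, 3.0]
distance = dtw_distance(reference, stretched)  # 0.0
```

Because the alignment tolerates differences in speaking rate, a low distance indicates similar content even when durations differ, which makes the distance usable as the basis for a pronunciation score.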

14.
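In natural human-computer dialogue, speech recognition errors caused by environmental noise, dialects, and accents, together with insufficient semantic analysis, lead the computer to misjudge the user's interaction intent. The computer then responds on the wrong topic and the dialogue breaks down. Taking casual chat-style human-computer dialogue on the theme of coffee as an example, dialogue breakdowns are divided into three cases: breakdowns caused by inappropriate topic feedback, and, when the topic is correct, breakdowns caused by inappropriate vague feedback and by inappropriate precise feedback. Breakdowns in the three cases are analyzed and compared using records of user-computer dialogues. The statistics show that inappropriate topic feedback is the most prominent cause, accounting for 60.1% of breakdowns. When the topic feedback is correct, inappropriate vague answers and inappropriate precise answers account for 22.2% and 21.6% of breakdowns, respectively. When speech recognition fails, semantic analysis produces an even larger number of feedback errors. The analysis of the experimental data shows that, when speech recognition fails, using contextual information to improve the accuracy of the computer's topic feedback effectively reduces breakdowns and makes the dialogue more natural. The work provides data analysis and experimental evidence for the importance of intent classification in natural human-computer dialogue.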

15.
Humans use a combination of gesture and speech to interact with objects and usually do so more naturally without holding a device or pointer. We present a system that incorporates user body-pose estimation, gesture recognition and speech recognition for interaction in virtual reality environments. We describe a vision-based method for tracking the pose of a user in real time and introduce a technique that provides parameterized gesture recognition. More precisely, we train a support vector classifier to model the boundary of the space of possible gestures, and train Hidden Markov Models (HMM) on specific gestures. Given a sequence, we can find the start and end of various gestures using a support vector classifier, and find gesture likelihoods and parameters with an HMM. A multimodal recognition process is performed using rank-order fusion to merge speech and vision hypotheses. Finally we describe the use of our multimodal framework in a virtual world application that allows users to interact using gestures and speech.
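
Rank-order fusion of speech and vision hypotheses can be illustrated with a simple rank-sum scheme. The abstract does not specify the fusion rule, so this sketch makes several assumptions: each recognizer returns a best-first list of command labels, a label missing from a list gets the worst possible rank for that list, and ties are broken alphabetically. The name `rank_order_fusion` is hypothetical.

```python
from typing import Dict, List


def rank_order_fusion(speech: List[str], vision: List[str]) -> List[str]:
    """Merge two best-first hypothesis lists by summing per-list ranks."""

    def rank(hypotheses: List[str], label: str) -> int:
        # Labels absent from a list are ranked just past its last entry.
        return hypotheses.index(label) if label in hypotheses else len(hypotheses)

    candidates = set(speech) | set(vision)
    totals: Dict[str, int] = {
        label: rank(speech, label) + rank(vision, label) for label in candidates
    }
    # Lower rank sum is better; the label itself breaks ties deterministically.
    return sorted(candidates, key=lambda label: (totals[label], label))


# Assumed recognizer outputs, best hypothesis first.
speech_hypotheses = ["select", "delete", "rotate"]
vision_hypotheses = ["delete", "rotate", "select"]
fused = rank_order_fusion(speech_hypotheses, vision_hypotheses)
# fused == ["delete", "select", "rotate"]
```

Fusing ranks rather than raw scores avoids having to calibrate the speech and vision likelihoods onto a common scale, at the cost of discarding how confident each recognizer was.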

16.
李海霞, 张擎. 《计算机应用》 (Journal of Computer Applications), 2015, 35(10): 2789-2792
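To address the inconvenience and inefficiency of the parallel fusion mode in multimodal biometric recognition systems, a multi-biometric recognition framework combining parallel and serial fusion is proposed, building on existing serial multimodal biometric systems. In the framework, recognition first uses different combinations of three biometric traits (gait, face, and fingerprint), fused at the score level by a weighted-sum algorithm. Second, online semi-supervised learning is used to improve the recognition performance of the weak traits, which further enhances the system's convenience and reliability. Theoretical analysis and experimental results show that, under this framework, the system improves the performance of its weak classifiers through online learning as usage time accumulates, and both user convenience and recognition accuracy improve further.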
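
Weighted-sum score-level fusion over a varying subset of traits can be sketched as below. The weights, the score range, and the renormalization over the traits actually captured are all assumptions made for illustration; the abstract only states that gait, face, and fingerprint scores are combined by weighted addition. The name `fuse_scores` is hypothetical.

```python
from typing import Dict


def fuse_scores(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-trait match scores into one score by weighted sum.

    Only traits present in `scores` are used, and their weights are
    renormalized so that the fused score stays on the same scale.
    """
    used = [trait for trait in scores if trait in weights]
    total_weight = sum(weights[trait] for trait in used)
    if total_weight == 0:
        raise ValueError("no weighted traits available for fusion")
    weighted = sum(weights[trait] * scores[trait] for trait in used)
    return weighted / total_weight


# Assumed weights: the stronger trait (fingerprint) counts the most.
trait_weights = {"gait": 0.2, "face": 0.3, "fingerprint": 0.5}

# Only gait and face were captured for this attempt.
fused = fuse_scores({"gait": 0.6, "face": 0.8}, trait_weights)  # about 0.72
```

Renormalizing over the available traits lets the same function serve each combination of traits, so a decision can be made early from gait and face alone and refined only if a fingerprint is requested.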

17.
Ergonomics, 2012, 55(11): 1943-1957
Abstract

Errors, whether created by the user, the recognizer, or inadequate systems design, are an important consideration in the more widespread and successful use of automatic speech recognition (ASR). An experiment is described in which recognition errors are studied under different types of feedback. Subjects entered data verbally to a microcomputer according to four experimental conditions: namely, orthogonal combinations of spoken and visual feedback presented concurrently or terminally after six items. Although no significant differences in terms of error rates or speed of data entry were shown across the conditions, analysis of the time penalty for error correction indicated that as a general rule, there is a small timing advantage for terminal feedback, when the error rate is low. It was found that subjects do not monitor visual feedback with the same degree of accuracy as spoken, as a larger number of incorrect data entry strings was being confirmed as correct. Further evidence for the use of ‘second best’ recognition data is given, since correct recognition on re-entry could be increased from 83·0% to 92·4% when the first choice recognition was deleted from the second attempt. Finally, the implications for error correction protocols in system design are discussed.

18.
In this paper we describe the design concepts and prototype implementation of a situation aware ubiquitous computing system using multiple modalities such as National Marine Electronics Association (NMEA) data from global positioning system (GPS) receivers, text, speech, environmental audio, and handwriting inputs. While most mobile and communication devices know where and who they are, by accessing context information primarily in the form of location, time stamps, and user identity, the concept of sharing of this information in a reliable and intelligent fashion is crucial in many scenarios. A framework which takes the concept of context aware computing to the level of situation aware computing by intelligent information exchange between context aware devices is designed and implemented in this work. Four sensual modes of contextual information like text, speech, environmental audio, and handwriting are augmented to conventional contextual information sources like location from GPS, user identity based on IP addresses (IPA), and time stamps. Each device derives its context not necessarily using the same criteria or parameters but by employing selective fusion and fission of multiple modalities. The processing of each individual modality takes place at the client device followed by the summarization of context as a text file. Exchange of dynamic context information between devices is enabled in real time to create multimodal situation aware devices. A central repository of all user context profiles is also created to enable self-learning devices in the future. Based on the results of simulated situations and real field deployments it is shown that the use of multiple modalities like speech, environmental audio, and handwriting inputs along with conventional modalities can create devices with enhanced situational awareness.

19.
This paper proposes a new technique to increase the robustness of spoken dialogue systems employing an automatic procedure that aims to correct frames incorrectly generated by the system’s component that deals with spoken language understanding. To do this the technique carries out a training that takes into account knowledge of previous system misunderstandings. The correction is transparent for the user as he is not aware of some mistakes made by the speech recogniser and thus interaction with the system can proceed more naturally. Experiments have been carried out using two spoken dialogue systems previously developed in our lab: Saplen and Viajero, which employ prompt-dependent and prompt-independent language models for speech recognition. The results obtained from 10,000 simulated dialogues show that the technique improves the performance of the two systems for both kinds of language modelling, especially for the prompt-independent language model. Using this type of model the Saplen system increases sentence understanding by 19.54%, task completion by 26.25%, word accuracy by 7.53%, and implicit recovery of speech recognition errors by 20.3%, whereas for the Viajero system these figures increase by 14.93%, 18.06%, 6.98% and 15.63%, respectively.
