期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Isolated structural error analysis of printed mathematical expressions

P. Pavan Kumar Arun Agarwal Chakravarthy Bhagvati 《Pattern Analysis & Applications》2018,21(4):1097-1107

相似文献

2.

Precision in geodetic correction of TM data as a function of the number, spatial distribution,and success in matching of control points: a simulation 总被引：1，自引：0，他引：1

M.L. Labovitz J.W. Marvin 《Remote sensing of environment》1986,20(3):237-252

The sources of error affecting geodetic correction of TM data have been enumerated in NASA specifications and cited in the research literature. A useful typology of these errors is the geometric errors remaining after applying systematic correction data and those errors arising from the location, generation, and application of ground control points. The relative magnitudes of the effects of these errors are studied via a simulation of the geodetic rectification process which is constructed such that operationalized (quantified) forms of errors are the simulation parameters. The kernel of this simulation is a rectification procedure called GED (geodetic error determination) which uses control point distortion data to adjust the orbit/attitude data, which in turn is used to adjust transformations between the satellite focal plane and the ground. Errors associated with base map digitization and control point generation produce the greatest impact on rectification. In the absence of such errors few control points are needed to geodetically correct the data close to National Map Accuracy Standards for 1:24,000 series maps. Evenness in the distribution of control points is not critical. 相似文献

3.

Learning to make errors: evidence from a driving task simulation

《Ergonomics》2012,55(10-11):1241-1250

Human errors represent a mismatch between the demands of an operational system and what the operator does. If they cannot be reversed, their consequences may be severe. Errors are frequently classified as design-or operator-induced. A third class of errors may also be identified, namely process-induced errors. Such errors arise out of on-going processes which typically extend over time. One such process is that of learning. In relation to the acquisition of skills, for example, learning frequently involves a trial-and-error component. Accidents by inexperienced drivers may represent a severe consequence of such errors. Errors may also arise out of particular learning experiences which provide a distorted underestimate of objective risk and/or motivate high risk behaviour. These phenomena are investigated in a computer simulation of the driving task. The relationship is discussed between various kinds of learning experience and the development of situations in which the possibility of error recovery declines. Some suggestions for reducing the frequency of irreversible errors and for increasing the data base for human error in vehicle driving are made. 相似文献

4.

中文文本自动校对综述

李云汉施运梅李宁田英爱《中文信息学报》2022,36(9):1

文本校对在新闻发布、书刊出版、语音输入、汉字识别等领域有着极其重要的应用价值,是自然语言处理领域中的一个重要研究方向。该文对中文文本自动校对技术进行了系统性的梳理,将中文文本的错误类型分为拼写错误、语法错误和语义错误,并对这三类错误的校对方法进行了梳理,对中文文本自动校对的数据集和评价方法进行了总结,最后展望了中文文本自动校对技术的未来发展。相似文献

5.

Errors and error correction in automatic speech recognition systems

《Ergonomics》2012,55(11):1943-1957

Abstract

Errors, whether created by the user, the recognizer, or inadequate systems design, are an important consideration in the more widespread and successful use of automatic speech recognition (ASR). An experiment is described in which recognition errors are studied under different types of feedback. Subjects entered data verbally to a microcomputer according to four experimental conditions: namely, orthogonal combinations of spoken and visual feedback presented concurrently or terminally after six items. Although no significant differences in terms of error rates or speed of data entry were shown across the conditions, analysis of the time penalty for error correction indicated that as a general rule, there is a small timing advantage for terminal feedback, when the error rate is low. It was found that subjects do not monitor visual feedback with the same degree of accuracy as spoken, as a larger number of incorrect data entry strings was being confirmed as correct. Further evidence for the use of ‘second best’ recognition data is given, since correct recognition on re-entry could be increased from 83·0% to 92·4% when the first choice recognition was deleted from the second attempt. Finally, the implications for error correction protocols in system design are discussed. 相似文献

6.

Effect of Character Animacy and Preparatory Motion on Perceptual Magnitude of Errors in Ballistic Motion

P. S. A. Reitsma J. Andrews N. S. Pollard 《Computer Graphics Forum》2008,27(2):201-210

An increasing number of projects have examined the perceptual magnitude of visible artifacts in animated motion. These studies have been performed using a mix of character types, from detailed human models to abstract geometric objects such as spheres. We explore the extent to which character morphology influences user sensitivity to errors in a fixed set of ballistic motions replicated on three different character types. We find user sensitivity responds to changes in error type or magnitude in a similar manner regardless of character type, but that users display a higher sensitivity to some types of errors when these errors are displayed on more human‐like characters. Further investigation of those error types suggests that being able to observe a period of preparatory motion before the onset of ballistic motion may be important. However, we found no evidence to suggest that a mismatch between the preparatory phase and the resulting ballistic motion was responsible for the higher sensitivity to errors that was observed for the most humanlike character. 相似文献

7.

Scene Text Recognition Using Similarity and a Lexicon with Sparse Belief Propagation

Weinman Jerod J. Learned-Miller Erik Hanson Allen R. 《IEEE transactions on pattern analysis and machine intelligence》2009,31(10):1733-1746

Scene text recognition (STR) is the recognition of text anywhere in the environment, such as signs and storefronts. Relative to document recognition, it is challenging because of font variability, minimal language context, and uncontrolled conditions. Much information available to solve this problem is frequently ignored or used sequentially. Similarity between character images is often overlooked as useful information. Because of language priors, a recognizer may assign different labels to identical characters. Directly comparing characters to each other, rather than only a model, helps ensure that similar instances receive the same label. Lexicons improve recognition accuracy but are used post hoc. We introduce a probabilistic model for STR that integrates similarity, language properties, and lexical decision. Inference is accelerated with sparse belief propagation, a bottom-up method for shortening messages by reducing the dependency between weakly supported hypotheses. By fusing information sources in one model, we eliminate unrecoverable errors that result from sequential processing, improving accuracy. In experimental results recognizing text from images of signs in outdoor scenes, incorporating similarity reduces character recognition error by 19 percent, the lexicon reduces word recognition error by 35 percent, and sparse belief propagation reduces the lexicon words considered by 99.9 percent with a 12X speedup and no loss in accuracy. 相似文献

8.

一种基于词片识别的字符分割算法

下载免费PDF全文

岳思聪王庆赵荣椿《中国图象图形学报》2006,11(1):8-12

在字符识别领域,对粘连字符的识别是一个被广泛关注的技术难点,而且粘连字符的分割更是产生识别错误的主要原因之一.为了快速准确地进行字符分割,在总结已有方法的特点及不足的基础上,针对电子阅读笔系统的工作特点和实时性要求,提出并实现了一种面向电子阅读笔系统的基于词片识别的分割算法.该方法由于通过对字母组合的识别,降低了传统的基于孤立字符识别方法对于字符切分的要求,而且以中心生长法和改进的峰谷函数为切分工具来进行字符分割,简单实用,因而其在减少因粘连字符切分错误引起的识别错误的同时,不仅降低了运算复杂度,而且适合在阅读笔等嵌入式设备上应用.实验证明,该算法不仅效率高,而且实现简单,还能够降低分割错误带来的识别错误. 相似文献

9.

A paradigm for handwriting-based intelligent tutors

Lisa Anthony Jie Yang Kenneth R. Koedinger 《International journal of human-computer studies》2012,70(11):866-887

This paper presents the interaction design of, and demonstration of technical feasibility for, intelligent tutoring systems that can accept handwriting input from students. Handwriting and pen input offer several affordances for students that traditional typing-based interactions do not. To illustrate these affordances, we present evidence, from tutoring mathematics, that the ability to enter problem solutions via pen input enables students to record algebraic equations more quickly, more smoothly (fewer errors), and with increased transfer to non-computer-based tasks. Furthermore our evidence shows that students tend to like pen input for these types of problems more than typing. However, a clear downside to introducing handwriting input into intelligent tutors is that the recognition of such input is not reliable. In our work, we have found that handwriting input is more likely to be useful and reliable when context is considered, for example, the context of the problem being solved. We present an intelligent tutoring system for algebra equation solving via pen-based input that is able to use context to decrease recognition errors by 18% and to reduce recognition error recovery interactions to occur on one out of every four problems. We applied user-centered design principles to reduce the negative impact of recognition errors in the following ways: (1) though students handwrite their problem-solving process, they type their final answer to reduce ambiguity for tutoring purposes, and (2) in the small number of cases in which the system must involve the student in recognition error recovery, the interaction focuses on identifying the student’s problem-solving error to keep the emphasis on tutoring. Many potential recognition errors can thus be ignored and distracting interactions are avoided. This work can inform the design of future systems for students using pen and sketch input for math or other topics by motivating the use of context and pragmatics to decrease the impact of recognition errors and put user focus on the task at hand. 相似文献

10.

基于多通道融合的连续手写识别纠错方法 总被引：1，自引：0，他引：1

敖翔王绪刚戴国忠王宏安《软件学报》2007,18(9):2162-2173

在基于识别的界面中,用户的满意度不但由识别准确度决定,而且还受识别错误的纠正过程的影响.提出一种基于多通道融合的连续手写笔迹识别错误的纠正方法.该方法允许用户通过口述书写内容纠正手写识别中的字符提取和识别的错误.该纠错方法的核心是一种多通道融合算法.该算法通过利用语音输入约束最优手写识别结果的搜索,可纠正手写字符的切分错和识别错.实验评估结果表明,该融合算法能够有效纠正错误,计算效率高.与另外两种手写识别错误纠正方法相比,该方法具有更高的纠错效率. 相似文献

11.

Development and application of a white box approach to integration testing

Allen Haley Stuart Zweben 《Journal of Systems and Software》1984,4(4):309-315

Program testing techniques can be classified in many ways. One classification is that of “black box” vs. “white box” testing. In black box testing, test data are selected according to the purpose of the program independent of the manner in which the program is actually coded. White box testing, on the other hand, makes use of the properties of the source code to guide the testing process. A white box testing strategy, which involves integrating a previously validated module into a software system is described. It is shown that, when doing the integration testing, it is not enough to treat the module as a “black box,” for otherwise certain integration errors may go undetected. For example, an error in the calling program may cause an error in the module's input which only results in an error in the module's output along certain paths through the module. These errors can be classified as Integration Domain Errors, and Integration Computation Errors. The results indicate that such errors can be detected by the module by retesting a set of paths whose cardinality depends only on the dimensionality of the module's input for integration domain errors, and on the dimensionality of the module's inputs and outputs for integration computation errors. In both cases the number of paths that need be retested do not depend on the module's path complexity. An example of the strategy as applied to the path testing of a COBOL program is presented. 相似文献

12.

An image-based automatic Arabic translation system

Yi Chang^{Author Vitae} Datong Chen Author VitaeAuthor Vitae Jie Yang Author Vitae 《Pattern recognition》2009,42(9):2127-1138

In this paper, we present a system that automatically translates Arabic text embedded in images into English. The system consists of three components: text detection from images, character recognition, and machine translation. We formulate the text detection as a binary classification problem and apply gradient boosting tree (GBT), support vector machine (SVM), and location-based prior knowledge to improve the F1 score of text detection from 78.95% to 87.05%. The detected text images are processed by off-the-shelf optical character recognition (OCR) software. We employ an error correction model to post-process the noisy OCR output, and apply a bigram language model to reduce word segmentation errors. The translation module is tailored with compact data structure for hand-held devices. The experimental results show substantial improvements in both word recognition accuracy and translation quality. For instance, in the experiment of Arabic transparent font, the BLEU score increases from 18.70 to 33.47 with use of the error correction module. 相似文献

13.

Reducing Data Cache Susceptibility to Soft Errors 总被引：1，自引：0，他引：1

Vilas Sridharan Hossein Asadi Mehdi B. Tahoori David Kaeli 《Dependable and Secure Computing, IEEE Transactions on》2006,3(4):353-364

Data caches are a fundamental component of most modern microprocessors. They provide for efficient read/write access to data memory. Errors occurring in the data cache can corrupt data values or state, and can easily propagate throughout the memory hierarchy. One of the main threats to data cache reliability is soft (transient, nonreproducible) errors. These errors can occur more often than hard (permanent) errors, and most often arise from single event upsets (SEUs) caused by strikes from energetic particles such as neutrons and alpha particles. Many protection techniques exist for data caches; the most common are ECC (error correcting codes) and parity. These protection techniques detect all single bit errors and, in the case of ECC, correct them. To make proper design decisions about which protection technique to use, accurate design-time modeling of cache reliability is crucial. In addition, as caches increase in storage capacity, another important goal is to reduce the failure rate of a cache, to limit disruption to normal system operation. In this paper, we present our modeling approach for assessing the impact of soft errors using architectural simulators. We also describe a new technique for reducing the vulnerability of data caches: refetching. By selectively refetching cache lines from the ECC-protected L2 cache, we can significantly reduce the vulnerability of the L1 data cache. We discuss and present results for two different algorithms that perform selective refetch. Experimental results show that we can obtain an 85 percent decrease in vulnerability when running the SPEC2K benchmark suite while only experiencing a slight decrease in performance. Our results demonstrate that selective refetch can cost-effectivety decrease the error rate of an L1 data cache 相似文献

14.

Error Detection and Prediction Algorithms: Application in Robotics

Xin W. Chen Shimon Y. Nof 《Journal of Intelligent and Robotic Systems》2007,48(2):225-252

This article presents research on error detection and prediction algorithms in robotics. Errors, defined as either agent error or Co-net error, are analyzed and compared. Three new error detection and prediction algorithms (EDPAs) are then developed, and validated by detecting and predicting errors in typical pick and place motions of an Adept Cobra 800 robot. A laser Doppler displacement meter (LDDM™) MCV-500 is used to measure the position of robot gripper in 105 experiment runs. Results show that combined EDPAs are preferred to detect and predict displacement errors in sequential robot motions. 相似文献

15.

三位一体字标注的汉语词法分析

于江德胡顺义余正涛《中文信息学报》2015,29(6):1-7

针对汉语词法分析中分词、词性标注、命名实体识别三项子任务分步处理时多类信息难以整合利用,且错误向上传递放大的不足,该文提出一种三位一体字标注的汉语词法分析方法,该方法将汉语词法分析过程看作字序列的标注过程,将每个字的词位、词性、命名实体三类信息融合到该字的标记中,采用最大熵模型经过一次标注实现汉语词法分析的三项任务。并在Bakeoff2007的PKU语料上进行了封闭测试,通过对该方法和传统分步处理的分词、词性标注、命名实体识别的性能进行大量对比实验,结果表明,三位一体字标注方法的分词、词性标注、命名实体识别的性能都有不同程度的提升,汉语分词的F值达到了96.4%,词性标注的标注精度达到了95.3%,命名实体识别的F值达到了90.3%,这说明三位一体字标注的汉语词法分析性能更优。相似文献

16.

Does the cost function matter in Bayes decision rule?

Schlü ter R Nussbaum-Thom M Ney H 《IEEE transactions on pattern analysis and machine intelligence》2012,34(2):292-301

In many tasks in pattern recognition, such as automatic speech recognition (ASR), optical character recognition (OCR), part-of-speech (POS) tagging, and other string recognition tasks, we are faced with a well-known inconsistency: The Bayes decision rule is usually used to minimize string (symbol sequence) error, whereas, in practice, we want to minimize symbol (word, character, tag, etc.) error. When comparing different recognition systems, we do indeed use symbol error rate as an evaluation measure. The topic of this work is to analyze the relation between string (i.e., 0-1) and symbol error (i.e., metric, integer valued) cost functions in the Bayes decision rule, for which fundamental analytic results are derived. Simple conditions are derived for which the Bayes decision rule with integer-valued metric cost function and with 0-1 cost gives the same decisions or leads to classes with limited cost. The corresponding conditions can be tested with complexity linear in the number of classes. The results obtained do not make any assumption w.r.t. the structure of the underlying distributions or the classification problem. Nevertheless, the general analytic results are analyzed via simulations of string recognition problems with Levenshtein (edit) distance cost function. The results support earlier findings that considerable improvements are to be expected when initial error rates are high. 相似文献

17.

印刷体汉字识别后处理方法的研究

张宏涛龙翀朱小燕孙俊《中文信息学报》2009,23(6):67-72

高阶N-gram语言模型在OCR后处理方面有着广泛的应用,但也面临着因模型复杂度大导致的数据稀疏,以及耗费较多的时空资源等问题。该文针对印刷体汉字识别的后处理,提出了一种基于字节的语言模型的后处理算法。通过采用字节作为语言模型的基本表示单位,模型的复杂度大大降低,从而数据稀疏问题得到很大程度上缓解。实验证明,采用基于字节的语言模型的后处理系统能够以极少的时空开销获取很好的识别性能。在有部分分割错误的测试集上,正确率从88.67%提高到了98.32%,错误率下降了85.18%,运行速度较基于字以及基于词的系统有了大幅的提升,提高了后处理系统的综合性能;与目前常用的基于词的语言模型后处理系统相比,新系统能够节省95%的运行时间和98%的内存资源,但系统识别率仅降低了1.11%。相似文献

18.

OCR error correction using correction patterns and self-organizing migrating algorithm

Nguyen Quoc-Dung Le Duc-Anh Phan Nguyet-Minh Zelinka Ivan 《Pattern Analysis & Applications》2021,24(2):701-721

Optical character recognition (OCR) systems help to digitize paper-based historical achieves. However, poor quality of scanned documents and limitations of text recognition techniques result in different kinds of errors in OCR outputs. Post-processing is an essential step in improving the output quality of OCR systems by detecting and cleaning the errors. In this paper, we present an automatic model consisting of both error detection and error correction phases for OCR post-processing. We propose a novel approach of OCR post-processing error correction using correction pattern edits and evolutionary algorithm which has been mainly used for solving optimization problems. Our model adopts a variant of the self-organizing migrating algorithm along with a fitness function based on modifications of important linguistic features. We illustrate how to construct the table of correction pattern edits involving all types of edit operations and being directly learned from the training dataset. Through efficient settings of the algorithm parameters, our model can be performed with high-quality candidate generation and error correction. The experimental results show that our proposed approach outperforms various baseline approaches as evaluated on the benchmark dataset of ICDAR 2017 Post-OCR text correction competition.

相似文献

19.

An analysis of the divergence problem in the Kalman filter

Price C. 《Automatic Control, IEEE Transactions on》1968,13(6):699-702

The effect of modeling errors in a linear discrete stochastic system upon the Kalman filter state estimates is investigated. Errors in both plant dynamics and noise covariances are permitted. The errors are characterized in such a manner that a linear recursion relation for the actual estimation error covariances can be derived. Conditions which guarantee that the covariance matrix remains bounded are described in terms of the asymptotic stability of the homogeneous part of the covariance equation and the boundedness of the forcing terms in the inhomogeneous equation. 相似文献

20.

What size test set gives good error rate estimates? 总被引：1，自引：0，他引：1

Guyon I. Makhoul J. Schwartz R. Vapnik V. 《IEEE transactions on pattern analysis and machine intelligence》1998,20(1):52-64

We address the problem of determining what size test set guarantees statistically significant results in a character recognition task, as a function of the expected error rate. We provide a statistical analysis showing that if, for example, the expected character error rate is around 1 percent, then, with a test set of at least 10,000 statistically independent handwritten characters (which could be obtained by taking 100 characters from each of 100 different writers), we guarantee, with 95 percent confidence, that: (1) the expected value of the character error rate is not worse than 1.25 E, where E is the empirical character error rate of the best recognizer, calculated on the test set; and (2) a difference of 0.3 E between the error rates of two recognizers is significant. We developed this framework with character recognition applications in mind, but it applies as well to speech recognition and to other pattern recognition problems 相似文献