Found 20 similar documents; search took 15 ms
1.
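A simple and effective method is proposed for extracting caption information from video images in real time. Text-event detection is performed first, followed by edge detection, threshold computation, and edge-size constraints; non-text regions are then filtered out according to the range of text pixel density. The proposed superposition of horizontal and vertical edges strengthens the edges of detected text, and the size constraint removes edges that do not match text dimensions. A projection method then determines the final caption region. Finally, OCR is applied to the extracted text regions to complete the extraction of text from the video. Together, these steps ensure the accuracy and robustness of the proposed algorithm.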
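The projection step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the edge map is already binarized (1 for an edge pixel, 0 otherwise); the function names `find_runs` and `locate_text_regions` and both thresholds are hypothetical and are not taken from the paper.

```python
def find_runs(profile, threshold):
    """Return (start, end) pairs, end exclusive, where profile >= threshold."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value >= threshold and start is None:
            start = i                      # a run begins here
        elif value < threshold and start is not None:
            runs.append((start, i))        # the run ended at the previous index
            start = None
    if start is not None:                  # run reaches the end of the profile
        runs.append((start, len(profile)))
    return runs


def locate_text_regions(edge_map, row_threshold=3, col_threshold=1):
    """Locate candidate caption boxes in a binary edge map.

    Rows are grouped by the horizontal projection (edge pixels per row);
    inside each row band the vertical projection gives the column extent.
    Returns (top, bottom, left, right) tuples, bottom and right exclusive.
    """
    boxes = []
    row_profile = [sum(row) for row in edge_map]
    for top, bottom in find_runs(row_profile, row_threshold):
        band = edge_map[top:bottom]
        col_profile = [sum(col) for col in zip(*band)]
        runs = find_runs(col_profile, col_threshold)
        if runs:
            # Merge all column runs in the band into one caption-wide box.
            boxes.append((top, bottom, runs[0][0], runs[-1][1]))
    return boxes
```

Merging the column runs of a band into a single box treats one text line as one region; splitting at wide gaps would be an equally reasonable choice for multi-column captions.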
2.
Sign language in the Arab world has only recently been recognized and documented. There have been no serious attempts to develop a recognition system that can be used as a means of communication between hearing-impaired and other people. This paper introduces the first automatic Arabic sign language (ArSL) recognition system based on hidden Markov models (HMMs). A large set of samples has been used to recognize 30 isolated words from the Standard Arabic sign language. The system operates in different modes including offline, online, signer-dependent, and signer-independent modes. Experimental results on real ArSL data collected from deaf people demonstrate that the proposed system achieves a high recognition rate in all modes. For the signer-dependent case, the system obtains word recognition rates of 98.13%, 96.74%, and 93.8% on the training data in offline mode, the test data in offline mode, and the test data in online mode, respectively. For the signer-independent case, the system obtains word recognition rates of 94.2% and 90.6% for offline and online modes, respectively. The system does not rely on data gloves or other input devices, and it allows deaf signers to perform gestures freely and naturally.
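As a rough illustration of HMM-based isolated-word recognition in general, the sketch below scores an observation sequence against one discrete HMM per word with the forward algorithm and picks the most likely word. The discrete observation symbols, the function names, and the `models` dictionary layout are assumptions made for this example; the abstract does not specify the paper's features or model topology.

```python
def forward_likelihood(obs, start, trans, emit):
    """Return P(obs | model) for a discrete HMM via the forward algorithm.

    start[s]    : probability of starting in state s
    trans[p][s] : probability of moving from state p to state s
    emit[s][o]  : probability of emitting symbol o in state s
    """
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][symbol]
            for s in range(n)
        ]
    return sum(alpha)


def classify(obs, models):
    """Pick the word whose HMM assigns the observation the highest likelihood.

    models maps a word label to a (start, trans, emit) tuple.
    """
    return max(models, key=lambda word: forward_likelihood(obs, *models[word]))
```

For long sequences a practical system would work with log probabilities or scaled forward variables to avoid numerical underflow.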
3.
The Journal of Supercomputing - With the recent advancements in information and communication technologies, the creation and storage of documents has become digitalized. Therefore, many documents...
4.
5.
For the first time, a genetic framework using contextual knowledge is proposed for segmentation and recognition of unconstrained handwritten numeral strings. New algorithms have been developed to locate feature points on the string image, and to generate possible segmentation hypotheses. A genetic representation scheme is utilized to show the space of all segmentation hypotheses (chromosomes). For the evaluation of segmentation hypotheses, a novel evaluation scheme is introduced, in order to improve the outlier resistance of the system. Our genetic algorithm tries to search and evolve the population of segmentation hypotheses, and to find the one with the highest segmentation/recognition confidence. The NIST NSTRING SD19 and CENPARMI databases were used to evaluate the performance of our proposed method. Our experiments showed that proper use of contextual knowledge in segmentation, evaluation and search greatly improves the overall performance of the system. On average, our system was able to obtain correct recognition rates of 95.28% and 96.42% on handwritten numeral strings using neural network and support vector classifiers, respectively. These results compare favorably with the ones reported in the literature.
6.
Offline recognition of unconstrained handwritten texts using HMMs and statistical language models (total citations: 2; self-citations: 0; citations by others: 2)
Vinciarelli A, Bengio S, Bunke H 《IEEE transactions on pattern analysis and machine intelligence》2004,26(6):709-720
This paper presents a system for the offline recognition of large vocabulary unconstrained handwritten texts. The only assumption made about the data is that it is written in English. This allows the application of Statistical Language Models in order to improve the performance of our system. Several experiments have been performed using both single and multiple writer data. Lexica of variable size (from 10,000 to 50,000 words) have been used. The use of language models is shown to improve the accuracy of the system (when the lexicon contains 50,000 words, the error rate is reduced by approximately 50 percent for single writer data and by approximately 25 percent for multiple writer data). Our approach is described in detail and compared with other methods presented in the literature to deal with the same problem. An experimental setup to correctly deal with unconstrained text recognition is proposed.
7.
《Expert systems with applications》2007,32(3):832-840
We have created a diagnostic/prognostic software tool for the analysis of complex systems, such as monitoring the “running health” of helicopter rotor systems. Although our software is not yet deployed for real-time in-flight diagnosis, we have successfully analyzed the data sets of actual helicopter rotor failures supplied to us by the US Navy. In this paper, we discuss both critical techniques supporting the design of our stochastic diagnostic system as well as issues related to its full deployment. We also present four examples of its use. Our diagnostic system, called DBAYES, is composed of a logic-based, first-order, and Turing-complete set of software tools for stochastic modeling. We use this language for modeling time-series data supplied by sensors on mechanical systems. The inference scheme for these software tools is based on a variant of Pearl’s loopy belief propagation algorithm [Pearl, J. (1988). Probabilistic reasoning in intelligent systems: Networks of plausible inference. San Francisco, CA: Morgan Kaufmann]. Our language contains variables that can capture general classes of situations, events, and relationships. A Turing-complete language is able to reason about potentially infinite classes and situations, similar to the analysis of dynamic Bayesian networks. Since the inference algorithm is based on a variant of loopy belief propagation, the language includes expectation maximization type learning of parameters in the modeled domain. In this paper we briefly present the theoretical foundations for our first-order stochastic language and then demonstrate time-series modeling and learning in the context of fault diagnosis.
8.
9.
Jérôme Schmid, José A. Iglesias Guitián, Enrico Gobbetti, Nadia Magnenat-Thalmann 《The Visual computer》2011,27(2):85-95
Despite the ability of current GPU processors to handle heavy parallel computation tasks, their use for solving medical image segmentation problems is still not fully exploited and remains challenging. Many difficulties may arise related to, for example, the different image modalities, noise and artifacts of source images, or the shape and appearance variability of the structures to segment. Motivated by practical problems of image segmentation in the medical field, we present in this paper a GPU framework based on explicit discrete deformable models, implemented over the NVidia CUDA architecture, aimed at the segmentation of volumetric images. The framework supports the segmentation in parallel of different volumetric structures as well as interaction during the segmentation process and real-time visualization of the intermediate results. Promising results in terms of accuracy and speed on a real segmentation experiment have demonstrated the usability of the system.
10.
Kumar Pradeep, Saini Rajkumar, Roy Partha Pratim, Dogra Debi Prosad 《Multimedia Tools and Applications》2018,77(7):8823-8846
Multimedia Tools and Applications - Sign language is the only means of communication for speech and hearing impaired people. Using machine translation, Sign Language Recognition (SLR) systems...
11.
Automatic understanding and recognition of human shopping behavior has many potential applications, attracting an increasing interest in the marketing domain. The reliability and performance of the automatic recognition system is highly influenced by the adopted theoretical model of behavior. In this work, we address the analogy between human shopping behavior and a natural language. The adopted methodology associates low-level information extracted from video data with semantic information using the proposed behavior language model. Our contribution on the action recognition level consists of proposing a new feature set which fuses Histograms of Optical Flow (HOF) with directional features. On the behavior level we propose combining smoothed bi-grams with the maximum dependency in a chain of conditional probabilities. The experiments are performed on both laboratory and real-life datasets. The introduced behavior language model achieves an accuracy of 87% on the laboratory data and 76% on the real-life dataset, an improvement of 11% and 8% respectively over the baseline model, by incorporating semantic knowledge and capturing correlations between the basic actions.
12.
《Computer Speech and Language》2001,15(2):127-148
The aim of this work is to show the ability of stochastic regular grammars to generate accurate language models which can be well integrated, allocated and handled in a continuous speech recognition system. For this purpose, a syntactic version of the well-known n-gram model, called k-testable language in the strict sense (k-TSS), is used. The complete definition of a k-TSS stochastic finite state automaton is provided in the paper. One of the difficulties arising in representing a language model through a stochastic finite state network is that the recursive schema involved in the smoothing procedure must be adopted in the finite state formalism to achieve an efficient implementation of the backing-off mechanism. The use of the syntactic back-off smoothing technique applied to k-TSS language modelling allowed us to obtain a self-contained smoothed model integrating several k-TSS automata in a unique smoothed and integrated model, which is also fully defined in the paper. The proposed formulation leads to a very compact representation of the model parameters learned at training time: probability distribution and model structure. The dynamic expansion of the structure at decoding time allows an efficient integration in a continuous speech recognition system using a one-step decoding procedure. An experimental evaluation of the proposed formulation was carried out on two Spanish corpora. These experiments showed that regular grammars generate accurate language models (k-TSS) that can be efficiently represented and managed in real speech recognition systems, even for high values of k, leading to very good system performance.
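To make the backing-off idea concrete, here is a minimal sketch of a generic bigram model that backs off to unigrams using absolute discounting. It is not the syntactic k-TSS formulation defined in the paper; the class name `BackoffBigram`, the discount value, and the sentence-boundary tokens are assumptions chosen for illustration.

```python
from collections import Counter, defaultdict


class BackoffBigram:
    """Bigram model with absolute discounting and back-off to unigrams."""

    def __init__(self, sentences, discount=0.5):
        self.discount = discount
        self.unigrams = Counter()
        self.bigrams = defaultdict(Counter)
        for words in sentences:
            tokens = ["<s>"] + list(words) + ["</s>"]
            self.unigrams.update(tokens[1:])   # "<s>" is never predicted
            for prev, word in zip(tokens, tokens[1:]):
                self.bigrams[prev][word] += 1
        self.total = sum(self.unigrams.values())

    def unigram_prob(self, word):
        return self.unigrams[word] / self.total

    def prob(self, prev, word):
        """Return P(word | prev)."""
        followers = self.bigrams.get(prev)
        if not followers:                      # unseen history: use unigrams
            return self.unigram_prob(word)
        history_count = sum(followers.values())
        if word in followers:                  # seen bigram: discounted estimate
            return (followers[word] - self.discount) / history_count
        # Unseen bigram: share the mass freed by discounting among the words
        # never seen after prev, in proportion to their unigram probability.
        reserved = self.discount * len(followers) / history_count
        unseen_mass = 1.0 - sum(self.unigram_prob(w) for w in followers)
        if unseen_mass <= 0:
            return 0.0
        return reserved * self.unigram_prob(word) / unseen_mass
```

The renormalization over unseen words keeps the conditional distribution summing to one for each seen history, which is the property a finite-state encoding of back-off also has to preserve.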
13.
14.
Sharif Muhammad, Khan Muhammad Attique, Zahid Farooq, Shah Jamal Hussain, Akram Tallha 《Pattern Analysis & Applications》2020,23(1):281-294
Pattern Analysis and Applications - Human action recognition from a video sequence has received much attention lately in the field of computer vision due to its range of applications in...
15.
In this paper a general fuzzy hyperline segment neural network is proposed [P.M. Patil, Pattern classification and clustering using fuzzy neural networks, Ph.D. Thesis, SRTMU, Nanded, India, January 2003]. It combines supervised and unsupervised learning in a single algorithm so that it can be used for pure classification, pure clustering and hybrid classification/clustering. The method is applied to handwritten Devanagari numeral character recognition and also to the Fisher Iris database. High recognition rates are achieved with less training and recall time per pattern. The algorithm is rotation, scale and translation invariant. The recognition rate with ring data features is found to be 99.5%.
16.
John F. Pitrelli, Amit Roy 《International Journal on Document Analysis and Recognition》2003,5(2-3):126-137
We discuss development of a word-unigram language model for online handwriting recognition. First, we tokenize a text corpus into words, contrasting with tokenization methods designed for other purposes. Second, we select for our model a subset of the words found, discussing deviations from an N-most-frequent-words approach. From a 600-million-word corpus, we generated a 53,000-word model which eliminates 45% of word-recognition errors made by a character-level-model baseline system. We anticipate that our methods will be applicable to offline recognition as well, and to some extent to other recognizers, such as speech recognizers and video retrieval systems.
Received: November 1, 2001 / Revised version: July 22, 2002
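The N-most-frequent-words baseline that this abstract refers to can be sketched as follows. The regular-expression tokenizer and the function name `build_unigram_model` are assumptions for illustration only; the paper's own tokenization rules and its deviations from the pure frequency cutoff are not reproduced here.

```python
import re
from collections import Counter


def build_unigram_model(text, max_words):
    """Keep the max_words most frequent words and return their
    relative frequencies, renormalized over the kept words."""
    # Naive tokenizer (an assumption): lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    kept = Counter(tokens).most_common(max_words)
    total = sum(count for _, count in kept)
    return {word: count / total for word, count in kept}
```

Renormalizing over the kept words means out-of-vocabulary words get zero probability, so a recognizer using such a model would need a separate fallback, for instance a character-level model.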
17.
18.
F. Zamora-Martínez, V. Frinken, S. España-Boquera, M.J. Castro-Bleda, A. Fischer, H. Bunke 《Pattern recognition》2014
Unconstrained off-line continuous handwritten text recognition is a very challenging task which has been recently addressed by different promising techniques. This work presents our latest contribution to this task, integrating neural network language models in the decoding process of three state-of-the-art systems: one based on bidirectional recurrent neural networks, another based on hybrid hidden Markov models and, finally, a combination of both. Experimental results obtained on the IAM off-line database demonstrate that consistent word error rate reductions can be achieved with neural network language models when compared with statistical N-gram language models on the three tested systems. The best word error rate, 16.1%, reported with ROVER combination of systems using neural network language models significantly outperforms current benchmark results for the IAM database.
19.
Automatic road marking recognition is a key problem within the domain of automotive vision that lends support to both autonomous urban driving and augmented driver assistance such as situationally aware navigation systems. Here we propose an approach to this problem based on the extraction of robust road marking features via a novel pipeline of inverse perspective mapping and multi-level binarisation. A trained classifier combined with additional rule-based post-processing then facilitates the real-time delivery of road marking information as required. The approach is shown to operate successfully over a range of lighting, weather and road surface conditions.
20.
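Traditional fingerspelling recognition relies on convolutional neural networks, whose uniform model structure discards much information in the pooling layers. A capsule is a sub-network constructed and abstracted within a neural network; each capsule focuses on a separate task while preserving the spatial features of the image. This work analyzes the characteristics of fingerspelling in Chinese Sign Language, builds and augments a training set of fingerspelling images, and attempts to solve the fingerspelling recognition task with a CapsNet (capsule network) model. The recognition rates of CapsNet under different parameters are compared with each other and with the classic GoogLeNet convolutional network. Experimental results show that CapsNet achieves good recognition performance on the sign language recognition task.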