Similar literature
Found 20 similar documents; search took 31 ms
1.
This paper proposes a fuzzy classification system to perform word indexing in ancient printed documents. The indexing system receives a word selected by a user. The word is preprocessed using an aspect ratio filter, ensuring that only interesting word candidates are considered. The image is classified by oriented feature extraction using Gabor filter banks. The oriented features are used to generate membership functions that characterize the selected word. This target word image is then compared to the potential matches using a similarity matrix. The indexing system is flexible and lightweight when compared to other optimal recognizers, which allows its use in "real-time" applications. A significant test revealed that the indexer achieved very good results in terms of precision and recall on texts from the 17th century.

2.
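Corpus-Based Automatic Chinese Word Segmentation in a Text-to-Speech System   Total citations: 9, self-citations: 0, citations by others: 9
Based on a practical text-to-speech system, this paper introduces several processing methods. It adopts an improved maximum matching method that can segment out all overlapping ambiguities, and proposes an algorithm based on a statistical model to handle fields containing multiple overlapping ambiguities. From a practical standpoint, it combines exhaustive enumeration with some simple rules to resolve the variant readings of polyphonic characters and to recognize Chinese personal names automatically. Together these address segmentation ambiguity, polyphonic word handling, and automatic recognition of Chinese names, with the goal of realizing text-to-speech conversion.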
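
As an illustration of the maximum matching idea mentioned in this abstract, here is a minimal Python sketch of plain forward maximum matching plus a simple detector for overlapping ambiguities (two lexicon words that partially overlap). This is not the paper's improved method: the function names, the `max_len` parameter, and the toy lexicon are all assumptions made for the example.

```python
def forward_max_match(text, lexicon, max_len=4):
    """Greedy forward maximum matching: take the longest lexicon word at each position."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a lexicon word is found.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback token.
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


def find_overlap_ambiguities(text, lexicon, max_len=4):
    """Return pairs of (start, end) spans of lexicon words that partially overlap."""
    spans = [
        (i, j)
        for i in range(len(text))
        for j in range(i + 2, min(len(text), i + max_len) + 1)
        if text[i:j] in lexicon
    ]
    # Two words overlap when the second starts inside the first and ends after it.
    return [(a, b) for a in spans for b in spans if a[0] < b[0] < a[1] < b[1]]


# Hypothetical toy lexicon, for illustration only.
LEXICON = {"研究", "研究生", "生命", "起源"}
SENTENCE = "研究生命起源"
```

With this toy lexicon, greedy matching commits to "研究生" and leaves "命" stranded, while the detector flags the overlap between "研究生" and "生命". That is the kind of field a statistical model would then have to disambiguate.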

3.
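Application of OLE and the Word Object Model in the Development of a Question Bank Management System   Total citations: 13, self-citations: 0, citations by others: 13
Taking a completed test question bank management system as a basis, and centering on the entry of compound text and the generation of exam papers, this paper examines in depth the application of OLE technology and the Word object model in developing such systems. It first outlines the OLE techniques commonly used in developing such systems and the Word object model, then analyzes and compares in detail three ways of applying object embedding to data entry, with their advantages and disadvantages. Finally, focusing on templates, it discusses in depth the application of OLE Automation and the Word object model to exam paper generation. Basic implementation code in VB is given for some of the applications.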

4.
For part I see ibid. vol. 8, no. 1 (2000). This paper presents an application of the generalized hidden Markov models to handwritten word recognition. The system represents a word image as an ordered list of observation vectors by encoding features computed from each column in the given word image. Word models are formed by concatenating the state chains of the constituent character hidden Markov models. The novel work presented includes the preprocessing, feature extraction, and the application of the generalized hidden Markov models to handwritten word recognition. Methods for training the classical and generalized (fuzzy) models are described. Experiments were performed on a standard data set of handwritten word images obtained from the US Post Office mail stream, which contains real-word samples of different styles and qualities.

5.
A recognition system for general isolated off-line handwritten words using an approximate segment-string matching algorithm is described. The fundamental paradigm employed is a character-based segment-then-recognize/match strategy. Additional user-supplied contextual information in the form of a lexicon guides a graph search to estimate the most likely word image identity. This system is designed to operate robustly in the presence of document noise, poor handwriting, and lexicon errors. A pre-processing step is initially applied to the image to remove noise artifacts and normalize the handwriting. An oversegmentation approach is used to improve the likelihood of capturing the individual characters embedded in the word. A directed graph is constructed that contains many possible interpretations of the word image, many of them implausible. The most likely graph path and associated confidence are computed for each lexicon word to produce a final lexicon ranking. Experiments highlighting the characteristics of this algorithm are given.

6.
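Generating Japanese from Chinese Case-Relation Representations   Total citations: 3, self-citations: 1, citations by others: 3
This paper describes the design and implementation of the Japanese generation subsystem of a transfer-based Chinese-Japanese machine translation system. It first describes a Chinese dependency parse tree based on case relations, whose nodes record syntactic and semantic information as well as case-relation information. Then, in view of the characteristics of Japanese, it analyzes the main problems in Japanese generation, including target word selection, determining the conjugated forms of inflecting words, and particle insertion. It presents the organization of the rule-based Japanese generation system, with emphasis on the design and implementation of the generation rule system. Finally, it gives examples of rule descriptions and of translations, and offers preliminary ideas for further improving the system.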

7.
Marking Estimation of Petri Nets With Silent Transitions   Total citations: 1, self-citations: 0, citations by others: 1
In this paper, we deal with the problem of estimating the marking of a labeled Petri net system based on the observation of transition labels. In particular, we assume that a certain number of transitions are labeled with the empty string, while unique labels taken from a given alphabet are assigned to each of the other transitions. Transitions labeled with the empty string are called silent because their firing cannot be observed. Under some technical assumptions on the structure of the unobservable subnet, we formally prove that the set of markings consistent with the observed word can be represented by a linear system with a fixed structure that does not depend on the length of the observed word.

8.
This paper proposes a new automatic speech summarization method. In this method, a set of words maximizing a summarization score is extracted from automatically transcribed speech. This extraction is performed according to a target compression ratio using a dynamic programming (DP) technique. The extracted set of words is then connected to build a summarization sentence. The summarization score consists of a word significance measure, a confidence measure, linguistic likelihood, and a word concatenation probability. The word concatenation score is determined by a dependency structure in the original speech given by stochastic dependency context-free grammar (SDCFG). Japanese broadcast news speech transcribed using a large-vocabulary continuous-speech recognition (LVCSR) system is summarized using our proposed method and compared with manual summarization by human subjects. The manual summarization results are combined to build a word network. This word network is used to calculate the word accuracy of each automatic summarization result using the most similar word string in the network. Experimental results show that the proposed method effectively extracts relatively important information by removing redundant and irrelevant information.
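
The DP extraction step described in this abstract can be sketched roughly as follows, in Python. This is a simplified illustration, not the authors' scoring: it assumes each word has one combined per-word score, that `concat_score(i, j)` is a caller-supplied stand-in for the concatenation score between consecutive kept words, and that the target compression ratio has already been converted into a word count `keep`. All of these names are hypothetical.

```python
def select_words(word_scores, concat_score, keep):
    """Pick `keep` word indices, in order, maximizing word plus concatenation scores."""
    n = len(word_scores)
    if not 1 <= keep <= n:
        raise ValueError("keep must be between 1 and the number of words")
    neg_inf = float("-inf")
    # best[k][j]: best score of a summary of k + 1 words whose last word is j.
    best = [[neg_inf] * n for _ in range(keep)]
    back = [[-1] * n for _ in range(keep)]
    best[0] = list(word_scores)
    for k in range(1, keep):
        for j in range(k, n):
            for i in range(k - 1, j):
                score = best[k - 1][i] + concat_score(i, j) + word_scores[j]
                if score > best[k][j]:
                    best[k][j] = score
                    back[k][j] = i
    # Backtrack from the best-scoring final word.
    j = max(range(keep - 1, n), key=lambda idx: best[keep - 1][idx])
    chosen = [j]
    for k in range(keep - 1, 0, -1):
        j = back[k][j]
        chosen.append(j)
    return chosen[::-1]


# Made-up scores for a five-word transcript; the concatenation score is zero here.
words = ["the", "cat", "sat", "on", "mat"]
kept = select_words([0.1, 1.0, 0.8, 0.1, 0.9], lambda i, j: 0.0, keep=3)
summary = [words[i] for i in kept]
```

The table has `keep` rows and one column per word, so the search costs O(keep · n²) calls to `concat_score`, which is why a DP formulation is practical for sentence-length inputs.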

9.
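Recognizing textual entailment is directed inference about the semantic relation between two texts, and word senses play an important role in understanding the semantics of a text and in inferring semantic entailment relations between texts. Therefore, to make effective use of word sense information when inferring entailment relations between texts, this paper proposes a textual entailment recognition method that incorporates word sense information. The method is the first to propose converting the original words into their corresponding target senses, and then uses the word sense information to improve the semantic representation of the texts and the inter-text...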

10.
Insertion and deletion are two of the most important operations in molecular computation, and they have recently attracted interest in that context. In the proposed system, single contextual insertion and usual (with two contexts) contextual deletions are used. In addition, it is assumed that the deletion operation has higher precedence than insertion if both are possible at a given moment. Given a word t, called a context, the single contextual insertion is performed as follows: for each occurrence of the context t in the given string y, the word t is replaced by txt. In practice, this can be achieved by a replicative transposition process. Similarly, given a pair of words (u, v), called a context, the (u, v)-contextual deletion removes the word between u and v. Finally, the power of this system and some closure properties are studied.
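
To make the two operations in this abstract concrete, here is a small Python sketch on ordinary strings. The abstract leaves several details open, so the following are assumptions of the example rather than the paper's definitions: insertion rewrites all non-overlapping occurrences of t from left to right, deletion acts on the first u and the first v after it, and a deletion that removes nothing counts as "not possible". The function names are hypothetical.

```python
def contextual_insert(y, t, x):
    """Replace each (non-overlapping) occurrence of the context t in y by t + x + t."""
    return y.replace(t, t + x + t)


def contextual_delete(y, u, v):
    """Remove the word between the first u and the first v that follows it."""
    start = y.find(u)
    if start == -1:
        return y
    inner = start + len(u)
    end = y.find(v, inner)
    if end == -1:
        return y
    return y[:inner] + y[end:]


def step(y, t, x, u, v):
    """One rewriting step in which deletion takes precedence over insertion."""
    deleted = contextual_delete(y, u, v)
    if deleted != y:
        return deleted
    return contextual_insert(y, t, x)
```

For example, `step("aubv", "a", "x", "u", "v")` deletes the `b` between `u` and `v`, and only a second call on the result performs the insertion around `a`.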

11.
With any word over the alphabet Π = {r, r̄, u, ū}, we associate a connected picture in the following manner: the reading of each letter of this word induces a unit line, where r (r̄, u, ū respectively) stands for a right (left, up, down respectively) move. We present a rewriting system which can yield, from any word over Π, all the words describing the same picture. In particular, we give an algorithm to find a minimal word describing a given picture: this word represents the shortest way to draw this picture without 'penup'.
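
The word-to-picture mapping in this abstract can be sketched in Python as below. The ASCII letters `l` and `d` are stand-ins chosen here for the barred letters (left and down moves). Treating a picture as the set of unit segments drawn from a fixed origin is an assumption of this sketch, since the paper may identify pictures differently, for instance up to translation.

```python
# Stand-in letters: "l" and "d" replace the barred letters for left and down moves.
MOVES = {"r": (1, 0), "l": (-1, 0), "u": (0, 1), "d": (0, -1)}


def picture(word):
    """Return the set of undirected unit segments drawn by reading the word."""
    x, y = 0, 0
    segments = set()
    for letter in word:
        dx, dy = MOVES[letter]
        nxt = (x + dx, y + dy)
        # A segment is undirected, so retracing it adds nothing new.
        segments.add(frozenset({(x, y), nxt}))
        x, y = nxt
    return segments


def same_picture(w1, w2):
    """Two words describe the same picture when they draw the same segments."""
    return picture(w1) == picture(w2)
```

Under this definition `rlr` and `r` draw the same single segment, which hints at why many different words can describe one picture and why a minimal word is worth searching for.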

12.
A Knuth–Bendix-style completion procedure for groups is presented that, instead of working with sets of string-rewriting rules, manipulates finite sets of word cycles. A characterization is given for the resulting sets of persistent word cycles, from which it follows that the completion procedure terminates successfully if and only if the reduced word problem of the finite group presentation considered is a finite set. In this case the resulting set of persistent word cycles yields a finite canonical string-rewriting system for every linear reduction ordering.

13.
J. D. Bishop  G. J. Smith 《Software》1981,11(12):1315-1329
Describes the STATUS free-text information retrieval system and experiences with its use for a bibliographic database. Factors involved in the growth of the database such as updating times, frequency of word occurrence, and file sizes are discussed in detail, and reactions to the system's performance and reliability are given.

14.
In this paper, a survey of works on word sense disambiguation is presented, and the method used in the Texterra system [1] is described. The method is based on calculation of semantic relatedness of Wikipedia concepts. Comparison of the proposed method and the existing word sense disambiguation methods on various document collections is given.

15.
In this paper, we present a new off-line word recognition system that is able to recognize unconstrained handwritten words using grey-scale images. This is based on structural and relational information in the handwritten word. We use Gabor filters to extract features from the words, and then use an evidence-based approach for word classification. A solution to the Gabor filter parameter estimation problem is given, enabling the Gabor filter to be automatically tuned to the word image properties. We also developed two new methods for correcting the slope of the handwritten words. Our experiments show that the proposed method achieves good recognition rates compared to standard classification methods.

16.
Research on Update and Forgetting Mechanisms for User Interest Models   Total citations: 1, self-citations: 0, citations by others: 1
单蓉 《微型电脑应用》2011,27(7):10-11,69
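Based on the characteristics of HTML documents, feature words delimited by different tags are assigned corresponding weight coefficients, from which the weight of each feature word is computed, and the interest model is updated according to the user's browsing speed. Next, the time of each feature word's most recent update is introduced into the feature vector of the user model and combined with a forgetting factor to correct the feature word's weight, thereby implementing forgetting in the model. Experiments show that a personalized recommendation system with the forgetting mechanism achieves higher recommendation efficiency.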
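
One possible reading of the tag-weighted scoring and forgetting correction described in this abstract is sketched below in Python. The abstract gives no formulas, so everything concrete here is an assumption: the tag coefficients are made up, the weight is taken to be a coefficient-weighted occurrence count, and forgetting is modeled as exponential decay in the time since the last update.

```python
import math

# Hypothetical coefficients: words in more prominent tags count for more.
TAG_COEFFICIENTS = {"title": 3.0, "h1": 2.0, "body": 1.0}


def term_weight(tag_counts):
    """Weight of a feature word from its occurrence counts per enclosing HTML tag."""
    return sum(
        TAG_COEFFICIENTS.get(tag, 1.0) * count for tag, count in tag_counts.items()
    )


def forget(weight, last_update, now, forgetting_factor=0.1):
    """Decay a weight by the time elapsed since the word was last updated."""
    # Exponential decay is an assumed form; the paper may use a different curve.
    return weight * math.exp(-forgetting_factor * (now - last_update))


# A word seen once in the title and twice in the body, last updated 10 time units ago.
weight = term_weight({"title": 1, "body": 2})
current = forget(weight, last_update=0, now=10)
```

Storing the last update time next to each weight means decay can be applied lazily, only when a word is read or updated, rather than by rescanning the whole profile.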

17.
张立昂 《计算机学报》1994,17(5):338-346
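This paper studies the computational complexity of the word problem for several restricted kinds of semi-Thue processes. The main results are: 1. the word problem for semi-Thue processes whose alphabet contains only one symbol is NP-complete; 2. for any fixed semi-Thue process whose alphabet contains only one symbol, the word problem is in P; 3. the word problem for monotone semi-Thue processes is PSPACE-complete; 4. the word problem with bounded derivation length is NEXPTIME-complete.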

18.
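A method for recognizing Chinese compound words and correcting word segmentation is proposed. The method first uses part-of-speech detection to extract word strings from the text, and then builds a directed word co-occurrence graph from the extracted strings. Drawing on the idea of the Bellman-Ford algorithm, an algorithm is designed that runs on the word co-occurrence graph to recognize compound words: it searches, from multiple sources, for the longest paths whose weights satisfy given conditions, and the word string corresponding to such a path is taken to be a compound word. Finally, core attribute percolation theory is used to assign parts of speech to the compound words while correcting the segmentation result. Experimental results show that compound word recognition accuracy reaches 91.60% and that segmentation correction works well.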

19.
The study reported here examined perceptions of ability at computer programming and at word processing. Factors relating to personal attributes and experience were rated for importance for programming and for word processing by 117 undergraduates. Experience factors, “enjoying working with machines”, “liking technology”, and “enjoying solving complex problems” were given the highest importance ratings for programming ability, while “being good at maths” and “being good at science” were given lower ratings. Previous training and having keyboard skills were given the highest ratings for word processing. Experience of the application resulted in lower ratings being given to the importance of formal training and, in the case of experience of word processing, lower ratings on a number of other factors describing abilities which might relate to word-processing skill. The results are discussed in the context of previous work on computing stereotypes.

20.
蔚润义  江弘 《自动化学报》1997,23(5):684-688
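Sampled-data control of a class of linear time-varying uncertain systems is studied. The structural properties of the generalized hold and of the discretized system are analyzed, an adaptive robust sampled-data control scheme that takes the microprocessor's clock frequency and word length into account is proposed, and the stability of the closed-loop system is proved. Computer simulations are carried out for an inverted pendulum system.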