Similar Literature
Found 20 similar documents (search time: 125 ms)
1.
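Accurately identifying keywords in Chinese documents with a variable-length language model is an important problem in Chinese information processing. Because such a model does not have the truncation effect of n-gram language models, the length of a retrieved keyword is unbounded, which makes keyword identification harder. An improved variable-length statistical language model is designed with the PAT-tree technique to identify keywords in Chinese documents, and relevance detection experiments are carried out on this model. The results show that the improved PAT-tree-based language model identifies keywords better.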

2.
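An HMM-based keyword spotting system   Total citations: 4, self-citations: 0, citations by others: 4
Keyword spotting is an important research direction in speech recognition. This paper proposes a keyword spotting method based on hidden Markov models (HMMs). It studies and implements a new decoding algorithm that uses iterative Viterbi decoding and reaches the accuracy of an approach without garbage (filler) models, thereby improving the recognition efficiency of the keyword spotting system.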
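The abstract above does not describe the iterative variant in detail, so the following is only a minimal sketch of the standard Viterbi recursion that such a decoder builds on. The two states (`kw`, `bg`), the two observation symbols, and all probabilities are made-up toy values, not taken from the paper.

```python
import math


def logs(table):
    """Convert a dict of probabilities into log-probabilities."""
    return {key: math.log(value) for key, value in table.items()}


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_state_path) for an observation sequence."""
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            # Pick the predecessor state that maximizes the path score into s
            prev, score = max(
                ((p, best[t - 1][p] + log_trans[p][s]) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score + log_emit[s][obs[t]]
            back[t][s] = prev
    # Trace the best path backwards from the best final state
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return best[-1][last], path


# Hypothetical toy model: "kw" (keyword) and "bg" (background) states
STATES = ["kw", "bg"]
LOG_START = logs({"kw": 0.5, "bg": 0.5})
LOG_TRANS = {"kw": logs({"kw": 0.8, "bg": 0.2}), "bg": logs({"kw": 0.2, "bg": 0.8})}
LOG_EMIT = {"kw": logs({"x": 0.9, "y": 0.1}), "bg": logs({"x": 0.1, "y": 0.9})}

log_prob, path = viterbi(["x", "x", "y"], STATES, LOG_START, LOG_TRANS, LOG_EMIT)
print(path, math.exp(log_prob))
```

Working in log space avoids numerical underflow on long observation sequences, which is why the toy probabilities are converted with `logs` before decoding.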

3.
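Keyword spotting is an important research direction in speech recognition. This paper proposes a keyword spotting method based on hidden Markov models (HMMs). It studies and implements a new decoding algorithm that uses iterative Viterbi decoding and reaches the accuracy of an approach without garbage (filler) models, thereby improving the recognition efficiency of the keyword spotting system.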

4.
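This work examines core technologies in interactive spontaneous speech recognition, including keyword speech recognition and interactive natural language understanding. Keyword models, non-keyword models, and multiple-keyword grammar rules are established. Training uses a bigram statistical model.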

5.
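Keyword spotting is an important research direction in speech recognition, yet research on Uyghur keyword spotting has only just begun. Taking into account the characteristics of Uyghur syllables and the factors that affect keyword spotting, a method is proposed that builds garbage models for non-keywords on top of HMMs to improve keyword spotting efficiency.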

6.
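Keyword spotting is an important research direction in speech recognition, yet research on Uyghur keyword spotting has only just begun. Taking into account the characteristics of Uyghur syllables and the factors that affect keyword spotting, a method is proposed that builds garbage models for non-keywords on top of HMMs to improve keyword spotting efficiency.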

7.
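To identify application-layer DoS attacks quickly and effectively, an HMM-based detection method is proposed. The method takes application-layer protocol keywords and the time intervals between keywords as input and uses a hidden Markov model to detect application-layer DoS attacks quickly. Experimental results show that the method achieves a high detection rate and a low false alarm rate for multiple kinds of application-layer DoS attacks.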

8.
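A Chinese text feature extraction method based on flexible matching   Total citations: 2, self-citations: 0, citations by others: 2
To filter objectionable content that contains deformed (obfuscated) keywords, a Chinese text feature extraction method based on flexible matching is proposed. The method uses flexible matching to identify and extract deformed keywords and improves the feature-term weighting of the vector space model by assigning higher weights to keywords that appear in deformed forms, which improves the efficiency of feature extraction. Experimental results show that the method achieves a high recall while maintaining filtering precision.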

9.
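To improve keyword extraction accuracy, a naive Bayes keyword extraction algorithm based on improved word statistics is proposed. It builds on identifying compound words from the co-occurrence frequency of the words that precede and follow the same word in a text. The algorithm uses word length, part of speech, position, and TF-IDF value as features, and improves how word length, TF-IDF, and word frequency are computed so that long words and words with high TF-IDF receive higher probabilities; when counting word frequency, it accounts for containment relations between words. A naive Bayes model is then trained on texts with labeled keywords to obtain the probability of each feature, which is used to extract keywords from a text. Experiments show that, compared with traditional keyword extraction algorithms based on word frequency and on the C4.5 decision tree, the keywords extracted by this method are more accurate and more readable.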

10.
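Ji Yan (季燕). 《计算机科学》 (Computer Science), 2013, 40(7): 129-130, 161
Application-layer DDoS attacks currently pose a serious threat to Internet security. Existing detection methods each target one specific kind of application-layer DDoS attack and cannot identify the others. To identify multiple kinds of application-layer DDoS attacks quickly and effectively, a detection method based on request keywords is proposed. It takes the difference in the frequency distribution of request keywords per unit time, together with their count, as input and uses a hidden Markov model to detect application-layer DDoS attacks. Experimental results show that the method achieves a high detection rate and a low false alarm rate for multiple kinds of application-layer DDoS attacks.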

11.
In this paper, we study the effect of taking the user into account in a query-by-example handwritten word spotting framework. Several off-the-shelf query fusion and relevance feedback strategies have been tested in the handwritten word spotting context. The increase in terms of precision when the user is included in the loop is assessed using two datasets of historical handwritten documents and two baseline word spotting approaches both based on the bag-of-visual-words model. We finally present two alternative ways of presenting the results to the user that might be more attractive and suitable to the user's needs than the classic ranked list.

12.
13.
We present a wearable input system which enables interaction through 3D handwriting recognition. Users can write text in the air as if they were using an imaginary blackboard. The handwriting gestures are captured wirelessly by motion sensors (accelerometers and gyroscopes) attached to the back of the hand. We propose a two-stage approach for spotting and recognition of handwriting gestures. The spotting stage uses a support vector machine to identify those data segments which contain handwriting. The recognition stage uses hidden Markov models (HMMs) to generate a text representation from the motion sensor data. Individual characters are modeled by HMMs and concatenated to word models. Our system can continuously recognize arbitrary sentences, based on a freely definable vocabulary. A statistical language model is used to enhance recognition performance and to restrict the search space. We show that continuous gesture recognition with inertial sensors is feasible for gesture vocabularies that are several orders of magnitude larger than traditional vocabularies for known systems. In a first experiment, we evaluate the spotting algorithm on a realistic data set including everyday activities. In a second experiment, we report the results from a nine-user experiment on handwritten sentence recognition. Finally, we evaluate the end-to-end system on a small but realistic data set.

14.
Sign language spotting is the task of detecting and recognizing signs in a signed utterance, in a set vocabulary. The difficulty of sign language spotting is that instances of signs vary in both motion and appearance. Moreover, signs appear within a continuous gesture stream, interspersed with transitional movements between signs in a vocabulary and nonsign patterns (which include out-of-vocabulary signs, epentheses, and other movements that do not correspond to signs). In this paper, a novel method for designing threshold models in a conditional random field (CRF) model is proposed which performs an adaptive threshold for distinguishing between signs in a vocabulary and nonsign patterns. A short-sign detector, a hand appearance-based sign verification method, and a subsign reasoning method are included to further improve sign language spotting accuracy. Experiments demonstrate that our system can spot signs from continuous data with an 87.0 percent spotting rate and can recognize signs from isolated data with a 93.5 percent recognition rate versus 73.5 percent and 85.4 percent, respectively, for CRFs without a threshold model, short-sign detection, subsign reasoning, and hand appearance-based sign verification. Our system can also achieve a 15.0 percent sign error rate (SER) from continuous data and a 6.4 percent SER from isolated data versus 76.2 percent and 14.5 percent, respectively, for conventional CRFs.

15.
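Chinese text has complex layouts, a large character set, and highly variable handwriting, so spotting handwritten Chinese characters is a challenging problem. This paper proposes a segmentation-free character spotting method for handwritten Chinese documents. The method uses SIFT to locate candidate keypoints in the text, determines candidate character positions from the keypoint locations and the size of the query character, and finally filters the candidates with a two-directional Dynamic Time Warping (DTW) algorithm. Experimental results show that the method accurately finds the query character without segmenting the text into characters and outperforms traditional DTW-based character spotting methods.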
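For readers unfamiliar with DTW, the following is a minimal sketch of the classic DTW distance between two 1-D feature sequences. The abstract does not state which features are compared or how the two directions are combined, so the plain numeric sequences and the absolute-difference local cost used here are assumptions for illustration only.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j]: minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local distance
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also matches an earlier element of b
                cost[i][j - 1],      # b[j-1] also matches an earlier element of a
                cost[i - 1][j - 1],  # one-to-one step
            )
    return cost[n][m]


# A stretched copy of a sequence aligns at zero cost
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

A candidate filter in this spirit would keep only candidates whose DTW distance to the query falls below some threshold; the threshold itself would have to be tuned on data.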

16.
In keyword spotting from handwritten documents by text query, the word similarity is usually computed by combining character similarities, which should approximate the logarithm of the character probabilities. In this paper, we propose to directly estimate the posterior probability (also called confidence) of candidate characters based on the N-best paths from the candidate segmentation-recognition lattice. After the candidate segmentation-recognition paths are evaluated by combining multiple contexts, the scores of the N-best paths are transformed to posterior probabilities using soft-max. The parameter of soft-max (confidence parameter) is estimated from the character confusion network, which is constructed by aligning different paths using a string matching algorithm. The posterior probability of a candidate character is the summation of the probabilities of the paths that pass through the candidate character. We compare the proposed posterior probability estimation method with some reference methods including the word confidence measure and the text line recognition method. Experimental results of keyword spotting on a large database CASIA-OLHWDB of unconstrained online Chinese handwriting demonstrate the effectiveness of the proposed method.
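The two steps named in this abstract, soft-max over N-best path scores and summation over the paths that pass through a candidate character, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the `(start, end, label)` representation of a candidate character, and the fixed `alpha` argument are assumptions (the paper estimates the confidence parameter from a character confusion network, which is not reproduced here).

```python
import math
from collections import defaultdict


def character_posteriors(paths, alpha=1.0):
    """Estimate candidate-character posteriors from N-best paths.

    paths: list of (score, candidates), where each candidate is a hashable
           identifier such as (start, end, label). Hypothetical format.
    alpha: soft-max confidence parameter, assumed to be given.
    """
    # Subtract the top score before exponentiating for numerical stability
    top = max(score for score, _ in paths)
    weights = [math.exp(alpha * (score - top)) for score, _ in paths]
    total = sum(weights)
    posteriors = defaultdict(float)
    for weight, (_, candidates) in zip(weights, paths):
        # Each path contributes its soft-max probability once per candidate
        for candidate in set(candidates):
            posteriors[candidate] += weight / total
    return dict(posteriors)


# Two equally scored paths that share their first candidate character
n_best = [
    (-3.0, [(0, 2, "A"), (2, 5, "B")]),
    (-3.0, [(0, 2, "A"), (2, 5, "C")]),
]
print(character_posteriors(n_best))
```

A candidate that appears on every N-best path receives posterior 1.0, while candidates that compete for the same span split the probability mass according to their path scores.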

17.
18.
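An unsupervised spoken query-by-example detection method based on acoustic segment models is proposed. The method first uses a Gaussian mixture model (GMM) to convert the spectral parameters of the training data into posterior probability feature vectors and applies hierarchical clustering to determine boundaries in the posteriors, yielding acoustic segments. The segments are then clustered and labeled with the k-means algorithm to build posterior-based acoustic segment models. At retrieval time, the sequences obtained by decoding the query example and the search documents with the models replace the measurement matrix to reduce retrieval time, and query terms are retrieved by dynamic matching based on minimum edit distance, whose cost function is adjusted by the distance matrix of model similarities. Experimental results show that, compared with GMM and traditional acoustic segment models, the proposed method performs better and retrieves significantly faster.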
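The matching step can be illustrated with a minimal sketch of minimum edit distance in which the substitution cost comes from a model-similarity distance table. The function names, the dictionary format of the table, the default cost of 1.0 for unlisted pairs, and the unit insertion/deletion cost are all assumptions; the sketch also compares two whole label sequences, whereas the paper's dynamic matching over documents is not specified in the abstract.

```python
def substitution_cost(a, b, dist):
    """Cost of replacing label a with label b, looked up symmetrically."""
    if a == b:
        return 0.0
    # Unlisted pairs fall back to a full-cost substitution (assumed default)
    return dist.get((a, b), dist.get((b, a), 1.0))


def weighted_edit_distance(query, doc, dist, indel_cost=1.0):
    """Minimum edit distance between two decoded label sequences."""
    n, m = len(query), len(doc)
    # d[i][j]: minimal cost of turning query[:i] into doc[:j]
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * indel_cost
    for j in range(1, m + 1):
        d[0][j] = j * indel_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + indel_cost,  # delete query[i-1]
                d[i][j - 1] + indel_cost,  # insert doc[j-1]
                d[i - 1][j - 1] + substitution_cost(query[i - 1], doc[j - 1], dist),
            )
    return d[n][m]


# Hypothetical table: segment models 1 and 3 are acoustically close
similarity_dist = {(1, 3): 0.2}
print(weighted_edit_distance([0, 1, 2], [0, 3, 2], similarity_dist))
```

Because acoustically similar segment models are cheap to substitute, two decodings of the same word that differ only in confusable labels stay close under this distance.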

19.
In this paper, we propose a keyword retrieval system for locating words in historical Mongolian document images. Based on word spotting technology, a collection of historical Mongolian document images is converted into a collection of word images by word segmentation, and a number of profile-based features are extracted to represent word images. For each word image, a fixed-length feature vector is formed by taking an appropriate number of complex coefficients of the discrete Fourier transform of each profile feature. The system supports online image-to-image matching by calculating similarities between a query word image and each word image in the collection, and a ranked result is returned in descending order of similarity. The query word image can be generated at retrieval time by synthesizing a sequence of glyphs. Experimental evaluations confirm the performance of the system.

20.
Pattern Analysis and Applications - In this paper, we present a segmentation-free word spotting method based on Wave Kernel Signature (WKS) under the foundation of quantum mechanics. The query word...
