Similar Literature
19 similar documents found (search time: 296 ms)
1.
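With the rapid development of Internet technology, the volume of online information is growing geometrically, and automatically extracting structured information from this mass of online unstructured text has become an important research topic. This paper studies a Web information extraction algorithm based on the hidden Markov model (HMM), focusing on how HMMs should be applied to text information extraction and how the data should be labeled. It proposes several improvements to the use of HMMs in text information extraction, builds an HMM-based Web information extraction model, and analyzes and compares the extracted data to verify the effectiveness of the improved algorithm.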

2.
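To improve the accuracy and speed of Web document services, Web information extraction has become an important means of handling massive amounts of online information, yet effectively extracting large volumes of heterogeneous information is very difficult. To improve recall and precision when extracting information from massive heterogeneous Web pages, a new information extraction method is proposed. The algorithm exploits the strength of the hidden Markov model in handling rule knowledge: it builds an HTML tree for each page, uses Shannon entropy to locate the data fields, and then constructs the hidden Markov model by the maximum likelihood method to extract the Web information. Simulation results on the extraction of header structure information from a large number of academic papers show that the algorithm markedly improves both recall and precision.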
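The maximum likelihood construction step mentioned in this abstract can be illustrated with a minimal sketch. It assumes fully labeled training sequences (the abstract does not say how the training data is labeled), in which case the estimates reduce to normalized counts. The function name `estimate_hmm` and the toy tokens and field labels are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict

def estimate_hmm(sequences):
    """Maximum likelihood HMM estimates from labeled (token, state) sequences.

    With fully observed states, the ML estimates are relative frequencies.
    No smoothing is applied, so unseen events get probability zero.
    """
    start = Counter()
    trans = defaultdict(Counter)
    emit = defaultdict(Counter)
    for seq in sequences:
        if not seq:
            continue
        start[seq[0][1]] += 1
        for i, (token, state) in enumerate(seq):
            emit[state][token] += 1
            if i + 1 < len(seq):
                trans[state][seq[i + 1][1]] += 1

    def normalize(counter):
        total = sum(counter.values())
        return {k: v / total for k, v in counter.items()}

    pi = normalize(start)                              # start probabilities
    A = {s: normalize(c) for s, c in trans.items()}    # transition probabilities
    B = {s: normalize(c) for s, c in emit.items()}     # emission probabilities
    return pi, A, B

# Hypothetical toy data: paper-header tokens labeled with their field.
train = [
    [("hidden", "title"), ("markov", "title"), ("smith", "author")],
    [("web", "title"), ("smith", "author"), ("jones", "author")],
]
pi, A, B = estimate_hmm(train)
```

In practice some form of smoothing is usually added, since any token or transition absent from the training data otherwise receives zero probability.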

3.
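Information Extraction from Chinese Research Papers Based on Hidden Markov Models   Cited by: 1 (self-citations: 1, citations by others: 0)
As a large number of research papers appear on the Internet, accurately extracting their header and citation information becomes very important. This paper proposes an HMM-based algorithm for extracting header and citation information from Chinese research papers and analyzes methods for learning the model structure and estimating its parameters. During extraction, format information such as delimiters and specific identifiers is used to split the text into blocks, and a hidden Markov model is used to extract the specified fields. Experimental results show that the algorithm achieves good precision and recall.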

4.
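Han Pu (韩普), Jiang Jie (姜杰). Microcomputer Development (微机发展), 2010, (2): 245-248, 252
The hidden Markov model (HMM) is a powerful statistical machine learning technique that has been applied successfully to continuous speech recognition and online handwriting recognition, and is widely used in bioinformatics. Thanks to its strong learning ability, it has gradually been adopted in natural language processing. This paper analyzes the key issues in applying HMMs to part-of-speech tagging, named entity recognition, and information extraction, with emphasis on the main points and difficulties of using HMMs for information extraction, in the hope of helping more researchers understand HMMs. Finally, it discusses the shortcomings of HMMs in practice and research on improvements.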

5.
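Research on the Application of HMM in Natural Language Processing   Cited by: 2 (self-citations: 1, citations by others: 1)
Han Pu (韩普), Jiang Jie (姜杰). Computer Technology and Development (计算机技术与发展), 2010, 20(2): 245-248, 252
The hidden Markov model (HMM) is a powerful statistical machine learning technique that has been applied successfully to continuous speech recognition and online handwriting recognition, and is widely used in bioinformatics. Thanks to its strong learning ability, it has gradually been adopted in natural language processing. This paper analyzes the key issues in applying HMMs to part-of-speech tagging, named entity recognition, and information extraction, with emphasis on the main points and difficulties of using HMMs for information extraction, in the hope of helping more researchers understand HMMs. Finally, it discusses the shortcomings of HMMs in practice and research on improvements.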

6.
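Web information extraction is the process of extracting data from massive semi-structured Web data according to user requirements and turning it into relevant, usable structured data. This paper studies several key problems in data extraction with hidden Markov models (HMM). It proposes generating the HMM by a model-merging method based on data-mining clustering, so that the HMM can be produced automatically from the data. It also extends the ordinary HMM by generating one HMM per extraction field in order to capture more useful information.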

7.
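Web Information Extraction Based on Hidden Markov Models   Cited by: 1 (self-citations: 1, citations by others: 0)
Liu Yaqing (刘亚清), Chen Rong (陈荣). Computer Engineering (计算机工程), 2009, 35(18): 25-27
To address the "missing item" and "unordered item" problems in Web information extraction, an HMM-based Web information extraction method is proposed. The Web document is parsed into an extended DOM tree; the information items to be extracted are mapped to states, and the paths of those items in the extended DOM tree are mapped to the vocabulary; an induction algorithm is then used to construct the hidden Markov model. Experimental results show that the method achieves better extraction performance.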

8.
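Hidden Markov Models and Their Latest Applications and Developments   Cited by: 2 (self-citations: 0, citations by others: 2)
The hidden Markov model is an important probabilistic model for sequential data processing and statistical learning and has been applied successfully to many engineering tasks. This paper first introduces the basic principles of the HMM, then surveys its latest applications in human behavior analysis, network security, and information extraction. Finally, it summarizes the principles and latest developments of the recently proposed infinite-state hidden Markov model.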

9.
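Hidden Markov Models and Their Latest Applications and Developments   Cited by: 1 (self-citations: 0, citations by others: 1)
The hidden Markov model is an important probabilistic model for sequential data processing and statistical learning and has been applied successfully to many engineering tasks. This paper first introduces the basic principles of the HMM, then surveys its latest applications in human behavior analysis, network security, and information extraction. Finally, it summarizes the principles and latest developments of the recently proposed infinite-state hidden Markov model.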

10.
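The hidden Markov model is an important probabilistic model for sequential data processing and statistical learning, and in recent years it has been applied successfully to many natural language processing tasks. This paper briefly introduces the HMM and discusses in detail the difficulties of applying it to part-of-speech tagging, the construction of the model, and the Viterbi algorithm. It then presents the HMM-based process for extracting header information from Chinese research papers and gives solutions to key problems such as learning the model structure and training the parameters.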

11.
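Most current facial expression recognition algorithms extract only a single type of image feature, so the feature parameters cannot fully reflect the emotional information in the face. To address this, a facial expression recognition method based on feature fusion and discrete hidden Markov model (HMM) recognition is proposed. For the same image sequence, texture information is extracted with both the discrete wavelet transform (DWT) and orthonormal non-negative matrix factorization (ONMF), and geometric deformation information is extracted with an improved active appearance model (AAM). Canonical correlation analysis (CCA) for high-dimensional, small-sample data is then used to fuse the two kinds of features, and a discrete HMM performs the final expression classification. Experimental results show that, after feature fusion, the method achieves a high recognition rate and fast recognition with relatively low-dimensional feature vectors.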

12.
The identification and assessment of the environmental impacts of engineering projects is an essential step in studies on environmental impact (IES). There are methods that allow both tasks to be performed and methods that allow each of them to be carried out separately. Normally, traditional methods are used to identify and evaluate environmental impacts, such as matrices, cause-effect network diagrams or checklists. Here we report the configuration of an expert system as a tool that allows environmental impacts to be identified. The expert system is based on a geographic information system to configure the knowledge base, the inference engine and the user interface. The knowledge base comprises declarative knowledge (structured in an alphanumeric and spatial database from official cartographic information) and procedural knowledge (via heuristic rules that superimpose project actions over environmental factors). We then describe the application of the expert system to the study of the environmental impact of the R-3 motorway in the Community of Madrid, Spain. Running the expert system allows the identification of environmental impacts on environmental factors defined at the 1:5000 and 1:25000 cartographic scales. Finally, analysis of the results allows the validity of using graphic expert systems to identify environmental impacts to be assessed.

13.
We address the problem of recognizing the visual focus of attention (VFOA) of meeting participants based on their head pose. To this end, the head pose observations are modeled using a Gaussian mixture model (GMM) or a hidden Markov model (HMM) whose hidden states correspond to the VFOA. The novelties of this paper are threefold. First, contrary to previous studies on the topic, in our setup, the potential VFOA of a person is not restricted to other participants only. It includes environmental targets as well (a table and a projection screen), which increases the complexity of the task, with more VFOA targets spread in the pan as well as tilt gaze space. Second, we propose a geometric model to set the GMM or HMM parameters by exploiting results from cognitive science on saccadic eye motion, which allows the prediction of the head pose given a gaze target. Third, an unsupervised parameter adaptation step not using any labeled data is proposed, which accounts for the specific gazing behavior of each participant. Using a publicly available corpus of eight meetings featuring four persons, we analyze the above methods by evaluating, through objective performance measures, the recognition of the VFOA from head pose information obtained either using a magnetic sensor device or a vision-based tracking system. The results clearly show that in such complex but realistic situations, the VFOA recognition performance is highly dependent on how well the visual targets are separated for a given meeting participant. In addition, the results show that the use of a geometric model with unsupervised adaptation achieves better results than the use of training data to set the HMM parameters.

14.
Spatial analysis is at the basis of several decision processes in fields such as urban planning and environmental management. To carry it out effectively, more flexible Geographic Information Systems are needed that can represent and manage the imperfection that invariably affects geographic information. Starting from the consideration that experts usually deal with imperfect spatial information through linguistic terms (e.g., they identify the approximate position of a phenomenon on a map and classify spatial properties through linguistic labels), this contribution proposes the use of linguistic granules of information to represent and manage imperfect spatial information. A fuzzy object-based data model is proposed as a tool for supporting spatial analysis based on the management of linguistic granules. In particular, the problem of defining methods to manage imperfect information depending on the type of imperfection is discussed. Finally, an example of spatial analysis applied to support a decision problem in environmental impact assessment is described.

15.
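In the conventional hidden Markov model, the probability that the model stays in a given state for a certain duration decays exponentially as the duration grows. This paper uses time-dependent state transition probabilities to characterize state dwell time. First, the modified HMM and the conventional HMM are compared and analyzed using the same feature vectors. Second, comparative experiments are run on combinations of different feature vectors. In addition, when combining different parameters, the paper considers how the different feature parameters and their dimensionality affect the observation-vector output probabilities.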
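The exponential decay noted in this abstract follows directly from a constant self-transition probability. As a brief worked derivation, let $a_{ii}$ be the self-transition probability of state $i$ and $d$ the number of consecutive time steps spent in that state (this notation is assumed here and is not taken from the paper):

```latex
% Conventional HMM: geometric dwell-time distribution
P_i(d) = a_{ii}^{\,d-1}\,(1 - a_{ii}), \qquad d = 1, 2, \dots

% Expected dwell time
\mathbb{E}[d] = \sum_{d=1}^{\infty} d\, a_{ii}^{\,d-1}\,(1 - a_{ii}) = \frac{1}{1 - a_{ii}}

% With a duration-dependent self-transition probability a_{ii}(\tau)
P_i(d) = \Bigl[\,\prod_{\tau=1}^{d-1} a_{ii}(\tau)\Bigr]\bigl(1 - a_{ii}(d)\bigr)
```

For example, $a_{ii} = 0.9$ gives an expected dwell time of 10 steps, yet the single most probable duration is still $d = 1$. The third line shows one possible reading of time-dependent transition probabilities, in which the dwell-time distribution is no longer forced to be geometric; the paper's exact formulation may differ.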

16.
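A classification method is proposed that uses a hidden Markov model and a support vector machine as a two-stage classifier to classify five kinds of environmental sounds: speech, clinking cups and dishes, doors opening and closing, whistling, and telephone ringing. For the collected and preprocessed environmental sound signals, the first stage uses HMMs for a preliminary classification that finds the two most probable classes, identifying the categories each sound most likely belongs to; the second-stage SVM classifier then makes the final decision. Experimental results show that the method recognizes environmental sounds more accurately than using either classifier alone.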
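The control flow of such a two-stage scheme can be sketched as follows. This is only an outline under stated assumptions: `hmm_scores` and `pairwise_decide` are hypothetical stand-ins for trained per-class HMM scoring and a pairwise SVM decision, and the stub scores and the `energy` feature are invented for illustration.

```python
def two_stage_classify(sample, hmm_scores, pairwise_decide):
    """Stage 1 ranks classes by HMM score; stage 2 decides between the top two."""
    scores = hmm_scores(sample)  # dict: class label -> log-likelihood
    top_two = sorted(scores, key=scores.get, reverse=True)[:2]
    if len(top_two) < 2:
        return top_two[0]
    return pairwise_decide(sample, top_two[0], top_two[1])

# Hypothetical stubs standing in for trained models.
def stub_hmm_scores(sample):
    # Pretend per-class HMM log-likelihoods for the given sample.
    return {"speech": -10.0, "door": -12.0, "whistle": -30.0}

def stub_pairwise_decide(sample, first, second):
    # Pretend binary classifier: a made-up energy threshold picks the class.
    return second if sample["energy"] > 0.5 else first

loud = two_stage_classify({"energy": 0.8}, stub_hmm_scores, stub_pairwise_decide)
quiet = two_stage_classify({"energy": 0.1}, stub_hmm_scores, stub_pairwise_decide)
```

The second stage only ever sees the two candidates the first stage ranked highest, so it can be a simple binary classifier trained per class pair.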

17.
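During image/video acquisition and transmission, limitations of the physical environment and of algorithm performance inevitably cause unpredictable quality degradation, which restricts use in real-world scenarios and noticeably affects the human visual experience. Image/video quality assessment therefore emerged as an important task in computer vision. Its goal is to build computational mathematical models that measure the distortion in an image or video and judge its quality, so that quality can be predicted automatically. It has broad application prospects in urban life, traffic surveillance, live multimedia streaming, and other scenarios. Research on image/video quality assessment has made considerable progress and has benefited other computer vision tasks. Based on an extensive survey of prior work, this paper reviews the development of the whole field, lists milestone and influential algorithms among both traditional and deep learning methods, and then surveys the literature from three perspectives: full-reference, reduced-reference, and no-reference assessment. The methods covered include those based on structural information, on the human visual system, and on natural image statistics. On LIVE (laboratory for image&video engineering), CSIQ (categorical subjective image quality database), T...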

18.
At the central energy management center in a power system, the real-time controls continuously track the load changes and endeavor to match the total power demand with total generation in such a manner that the operating cost is minimized while all the operating constraints are satisfied. However, due to the strict government regulations on environmental protection, operation at minimum cost is no longer the only criterion for dispatching electrical power. The idea behind the environmentally constrained economic dispatch formulation is to estimate the optimal generation schedule of generating units in such a manner that fuel cost and harmful emission levels are both simultaneously minimized for a given load demand. Conventional optimization techniques become very time-consuming and computationally intensive for such complex optimization tasks. These methods are hence not suitable for on-line use. Neural networks and fuzzy systems can be trained to generate accurate relations among variables in complex non-linear dynamical environments, as both are model-free estimators. The existing synergy between these two fields has been exploited in this paper for solving the economic and environmental dispatch problem on-line. A multi-output modified neo-fuzzy neuron (NFN), capable of real-time training, is proposed for economic and environmental power generation allocation. This model is found to achieve accurate results and the training is observed to be faster than other popular neural networks. The proposed method has been tested on medium-sized sample power systems with three and six generating units and found to be suitable for on-line combined environmental economic dispatch (CEED).

19.
Learning the influence of environmental parameters, such as additive noise and channel distortions, from training data is an effective approach to robust speech recognition. Most previous methods are based on the maximum likelihood estimation criterion. However, these methods do not lead to a minimum error rate result. In this paper, a novel discriminative learning method for environmental parameters, based on the Minimum Classification Error (MCE) criterion, is proposed. In the method, a simple classifier and the Generalized Probabilistic Descent (GPD) algorithm are adopted to iteratively learn the environmental parameters. The clean speech features are then estimated from the noisy speech features using the estimated environmental parameters, and these estimates are used in the back-end HMM classifier. Experiments on a task of 18 isolated confusable Korean words show a best error rate reduction of 32.1% relative to a conventional HMM system.
