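Intelligent Methods for Named Entity Recognition in Biomedical Text
Cite this article: Wang Haochang, Zhao Tiejun, Liu Yanli, Yu Hao. Intelligent Method for Name Entity Recognition from Biomedical Text [J]. Journal of Beijing University of Posts and Telecommunications, 2006, 29(22):54-58.
Authors: Wang Haochang  Zhao Tiejun  Liu Yanli  Yu Hao
Affiliations: 1. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001; 2. Communication Company, Liaohe Petroleum Exploration Bureau, Panjin 124010
Abstract: This paper describes machine learning techniques for named entity recognition in biomedical text, including the Generalized Winnow algorithm, support vector machines, and conditional random fields. In keeping with the characteristics of each learning algorithm, the recognition process uses a rich feature set comprising local features, full-text features, and external-resource features. Optimized combinations of the different feature types, together with post-processing of the recognition output (abbreviation recognition, nested entity recognition, and boundary correction), all improved the performance of the named entity recognition systems. Experimental results show that, with these strategies applied, the systems achieved good recognition results.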

Keywords: named entity recognition  feature selection  support vector machine  conditional random fields
Received: 2006-09-20

Intelligent Method for Name Entity Recognition from Biomedical Text
Affiliation:1. School of Computer Science and Technology, Harbin Institute of Technology, 150001, Harbin;
2. Liaohe Petroleum Reconnoitering Bureau, Communication Corp, 124010, Panjin
Abstract: Machine learning methods for named entity recognition in biomedical text are presented, including the Generalized Winnow algorithm, support vector machines, and conditional random fields. Depending on the characteristics of each algorithm, these methods draw on a diverse set of features: local features, full-text features, and features derived from external resources. All the features are integrated into the recognition systems, and the impact of different feature sets on system performance is evaluated. To improve performance further, a post-processing module handles abbreviations and nested named entities and corrects boundary errors. The experimental results show that both the feature selection strategies and the post-processing module contribute substantially to the systems' output.
Keywords: named entity recognition  feature selection  support vector machine  conditional random fields
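
The abstract mentions "local features" without listing them. As a rough illustration only, the sketch below shows the kind of token-level features commonly used in biomedical named entity recognition: orthographic cues, affixes, a word shape, and a small context window. The function name `local_features`, the feature names, and the window size are assumptions made for this example and are not taken from the paper.

```python
import re


def local_features(tokens, i, window=1):
    """Return a dict of illustrative local features for tokens[i].

    Hypothetical feature set: the paper's actual features are not
    given in the abstract.
    """
    tok = tokens[i]
    # Word shape: uppercase -> "A", lowercase -> "a", digit -> "0".
    shape = re.sub(r"[A-Z]", "A", tok)
    shape = re.sub(r"[a-z]", "a", shape)
    shape = re.sub(r"[0-9]", "0", shape)

    feats = {
        "word.lower": tok.lower(),
        "prefix3": tok[:3],
        "suffix3": tok[-3:],
        "has_digit": any(c.isdigit() for c in tok),
        "has_hyphen": "-" in tok,
        "all_caps": tok.isupper(),
        "init_cap": tok[:1].isupper(),
        "shape": shape,
    }

    # Neighbouring words within the window, padded at sentence edges.
    for d in range(1, window + 1):
        left, right = i - d, i + d
        feats[f"-{d}:word.lower"] = tokens[left].lower() if left >= 0 else "<BOS>"
        feats[f"+{d}:word.lower"] = (
            tokens[right].lower() if right < len(tokens) else "<EOS>"
        )
    return feats


tokens = ["IL-2", "gene", "expression"]
features = local_features(tokens, 0)
```

A dictionary per token like this can be passed to a sequence labeller or classifier of the kinds named in the abstract; full-text and external-resource features (for example, dictionary lookups) would be added as further keys.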