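Research and Implementation of a POS Tagging Method for Answering Text
Cite this article: Wang Zhaojing, Zheng Qinghua. Research and Implementation of a POS Tagging Method for Answering Text [J]. Computer Engineering and Applications, 2004, 40(16): 57-60, 74.
Authors: Wang Zhaojing, Zheng Qinghua
Affiliation: Department of Computer Science, Xi'an Jiaotong University, Xi'an 710049
Funding: National Natural Science Foundation of China (No. 60103022); "Tenth Five-Year Plan" Major Science and Technology Project (No. 2001BA101A01); Ministry of Education Fund for Excellent Young Teachers
Abstract: To address the shortcomings of existing POS tagging methods when applied to web answering text, this paper proposes a POS tagging method for natural-language answering text. Based on the characteristics of answering text and the needs of subsequent key information extraction, the method extends an existing POS tag set. Real answering text is tagged with a statistical method, and the result is compared with the correct tagging to derive POS disambiguation rules; this makes the rules strongly text-specific and improves the precision of rule-based disambiguation. The rules are classified and optimized, which improves tagging speed. Rules are applied first and statistics afterwards, which handles the combination of rule-based and statistical methods in answering text fairly well. The method has been implemented and put into preliminary use in the Natural Language Oriented Web Answer System (NL_WAS).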

Keywords: answering text; POS tagging; domain words; question feature words
Article ID: 1002-8331-(2004)16-0057-04

An Approach of POS Tagging Oriented Answering Text
Wang Zhaojing, Zheng Qinghua. An Approach of POS Tagging Oriented Answering Text [J]. Computer Engineering and Applications, 2004, 40(16): 57-60, 74.
Authors: Wang Zhaojing, Zheng Qinghua
Abstract: To address the shortcomings of existing methods when tagging web answering text, this paper proposes a part-of-speech (POS) tagging approach for answering text. The approach extends the POS tag set according to the characteristics of answering text and the needs of key information extraction. Answering text is first tagged with a statistics-based method, and the result is compared with the corresponding correct tagging in the corpus; POS disambiguation rules are derived from this comparison, so the rules are closely targeted at answering text. Classifying and optimizing the rules improves tagging speed. To combine the rule-based and statistics-based methods, the paper applies rules first and statistics last. The approach has been implemented and is in preliminary use in the Natural Language Oriented Web Answer System (NL_WAS).
Keywords: answering text; POS tagging; field words; question characteristic words
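
The abstract describes deriving disambiguation rules from the errors of a statistical tagger and then applying rules first and statistics last. The sketch below illustrates that flow under loud assumptions: the abstract does not specify the statistical model, the rule format, or the tag set, so the most-frequent-tag model, the (previous tag, word) rule key, the toy tags, and all function names here are hypothetical stand-ins, not the authors' implementation.

```python
from collections import Counter, defaultdict

def train_unigram(tagged_sents):
    """Count how often each word carries each tag in a hand-tagged corpus."""
    counts = defaultdict(Counter)
    for sent in tagged_sents:
        for word, tag in sent:
            counts[word][tag] += 1
    return counts

def statistical_tag(words, counts, default="n"):
    """Stand-in statistical tagger (assumption): most frequent tag per word."""
    return [counts[w].most_common(1)[0][0] if w in counts else default
            for w in words]

def learn_rules(tagged_sents, counts):
    """Compare statistical output with the correct tags; every mismatch
    yields a rule (previous correct tag, word) -> correct tag."""
    rules = {}
    for sent in tagged_sents:
        words = [w for w, _ in sent]
        gold = [t for _, t in sent]
        guess = statistical_tag(words, counts)
        for i, (g, s) in enumerate(zip(gold, guess)):
            if g != s:
                prev = gold[i - 1] if i > 0 else "<s>"
                rules[(prev, words[i])] = g
    return rules

def tag(words, rules, counts):
    """Rule first, statistics last: a matching rule decides the tag,
    otherwise the statistical tagger's choice is kept."""
    fallback = statistical_tag(words, counts)
    tags = []
    for i, w in enumerate(words):
        prev = tags[i - 1] if i > 0 else "<s>"
        tags.append(rules.get((prev, w), fallback[i]))
    return tags

# Toy corpus with made-up tags: d = determiner, n = noun, v = verb,
# r = pronoun, x = other.
corpus = [
    [("the", "d"), ("book", "n")],
    [("a", "d"), ("book", "n")],
    [("please", "x"), ("book", "v"), ("it", "r")],
]
counts = train_unigram(corpus)
rules = learn_rules(corpus, counts)
result = tag(["please", "book", "the", "book"], rules, counts)
```

In this toy run the statistical tagger always prefers the noun reading of "book", and the single learned rule corrects it to a verb after "please". Note that the sketch keys rules on the correct previous tag during learning but on the predicted previous tag during tagging, a simplification a fuller system would need to handle.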
This article is indexed in CNKI, VIP (维普), Wanfang Data, and other databases.