
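Construction of an HNC Semantic Tagging Model
Cite this article: XIE Fa-kui, ZHANG Quan. Construction of an HNC Semantic Tagging Model [J]. Computer Science, 2009, 36(5): 238-240.
Authors: XIE Fa-kui  ZHANG Quan
Affiliations: 1. Graduate School of the Chinese Academy of Sciences, Beijing 100039; Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190
2. Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190
Funding: National Key Basic Research and Development Program of China (973 Program); Director's Merit-Based Fund of the Institute of Acoustics, Chinese Academy of Sciences
Abstract: This paper introduces a semantic tagging model for Chinese corpora that is based on HNC theory and combines human and machine effort. It first analyzes what HNC semantic tagging covers and, on that basis, defines the tagging workflow. Because the tagging is highly complex, machine tagging is used at the main stages of the workflow to assist manual tagging. Specifically, a maximum entropy model is used for semantic chunk segmentation, reaching a precision of 83.78% and a recall of 91.17%; an instance-based model is used for sentence category judgment, reaching an accuracy of 51.64%. An HNC semantically tagged corpus has been built with this tagging model, and it currently contains 400,000 characters.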
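The precision and recall figures reported for semantic chunk segmentation can be read through a minimal sketch like the one below. It assumes that segmentation is scored by comparing predicted chunk boundary offsets against gold-standard offsets; the abstract does not state the exact scoring unit, so the function name `boundary_precision_recall` and the sample offsets are hypothetical and purely illustrative.

```python
def boundary_precision_recall(predicted, gold):
    """Return (precision, recall) for two collections of boundary offsets.

    Assumption: a predicted boundary counts as correct only if the same
    offset appears in the gold annotation.
    """
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical sentence: the tagger proposes one boundary too many.
gold = [2, 5, 9, 12]
predicted = [2, 5, 7, 9, 12]
p, r = boundary_precision_recall(predicted, gold)
print(f"precision={p:.2%} recall={r:.2%}")  # precision=80.00% recall=100.00%
```

In this toy case the spurious boundary lowers precision while recall stays at 100%. Recall higher than precision, as in the reported 91.17% versus 83.78%, is the pattern one would expect if the segmenter tends to over-segment, though the abstract itself does not say this.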

Keywords: Hierarchical Network of Concepts (HNC); corpus; maximum entropy model
Received: 2008/6/25

Novel HNC Conceptual Tagging Model for Corpus
XIE Fa-kui, ZHANG Quan. Novel HNC Conceptual Tagging Model for Corpus [J]. Computer Science, 2009, 36(5): 238-240.
Authors: XIE Fa-kui  ZHANG Quan
Affiliation: Graduate School of the Chinese Academy of Sciences, Beijing 100039, China; Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China
Abstract: This paper introduces a novel conceptual tagging model for corpora that is based on the Hierarchical Network of Concepts (HNC) theory and combines manual work with automatic machine tagging. First, the contents of tagging are given and the tagging process is defined. Because the process is complex, machine tagging methods are used to assist the manual work. A maximum entropy model is adopted for semantic chunk segmentation, and the test precision and recall are 83.78% and 91.17 ...
Keywords:HNC  Corpus  Maximum entropy model  
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.