
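Multi-Strategy Semantic Role Labeling of Tibetan
Cite this article:LONG Congjun, KANG Caijun, LI Lin, JIANG Di. Multi-Strategy Semantic Role Labeling of Tibetan[J]. Journal of Chinese Information Processing, 2014, 28(5): 176-181.
Authors:LONG Congjun  KANG Caijun  LI Lin  JIANG Di
Affiliation:1. Institute of Ethnology & Anthropology, Chinese Academy of Social Sciences, Beijing 100081;
2. Computer College, Qinghai Normal University, Xining, Qinghai 810004;
3. Institute of Software, Chinese Academy of Sciences, Beijing 100190
Funding:National Natural Science Foundation of China (61132009)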
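Abstract:Semantic role labeling is of great importance to natural language processing. Research on semantic role labeling for English and Chinese has already produced many results. For Tibetan, however, little has been reported on either resource construction or labeling techniques. Tibetan has relatively rich syntactic markers, which naturally divide a sentence into semantic chunks with different functions, and these chunks correspond in certain ways to semantic roles. Based on this property, the paper proposes a semantic role labeling strategy that works on semantic chunks and combines rules with statistics. To implement the labeling, the paper first classifies Tibetan semantic roles to obtain a classification scheme. It then discusses how the labeling rules are obtained, including a hand-crafted initial rule set and an extended rule set acquired through error-driven learning. On the statistical side, a conditional random field model is chosen and effective linguistic features are added. The final labeling results reach a precision, recall, and F-score of 82.78%, 85.71%, and 83.91%, respectively.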

Keywords:Tibetan  semantic role labeling  TBL  CRFs

Multi-Strategy Semantic Role Labeling of Tibetan
LONG Congjun, KANG Caijun, LI Lin, JIANG Di. Multi-Strategy Semantic Role Labeling of Tibetan[J]. Journal of Chinese Information Processing, 2014, 28(5): 176-181.
Authors:LONG Congjun  KANG Caijun  LI Lin  JIANG Di
Affiliation:1. Institute of Ethnology & Anthropology, Chinese Academy of Social Sciences, Beijing 100081, China;
2. The Computer College of Qinghai Normal University, Xining, Qinghai 810004, China;
3. Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
Abstract:Semantic role labeling is of great significance for natural language processing. Substantial progress has been made on this task for both English and Chinese. For Tibetan, however, both resource construction and labeling techniques are still at an initial stage. Tibetan has rich syntactic markers that naturally segment a sentence into different semantic chunks, and there is a certain correspondence between these chunks and semantic roles. Accordingly, this paper proposes a semantic role labeling strategy for Tibetan that is based on semantic chunking and combines rules with statistics. To realize the labeling, a classification system of Tibetan semantic roles is designed, and then the acquisition of rules is discussed, including a manually built initial rule set and an expanded rule set obtained through Transformation-Based Error-driven Learning (TBL). For the statistical component, a Conditional Random Fields (CRFs) model is adopted. Experimental results show that the proposed semantic role labeling method achieves 82.78% precision, 85.71% recall, and an F-measure of 83.91%.
Keywords:Tibetan  Semantic Role Labeling  TBL  CRFs  
This article is indexed in CNKI and other databases.