
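Chinese Dependency Parsing Model Based on Lexical Governing Degree
Cite this article: LIU Ting, MA Jin-Shan, LI Sheng. Chinese Dependency Parsing Model Based on Lexical Governing Degree [J]. Journal of Software (软件学报), 2006, 17(9): 1876-1883.
Authors: LIU Ting, MA Jin-Shan, LI Sheng
Affiliation: Information Retrieval Laboratory, Harbin Institute of Technology, Harbin 150001, Heilongjiang, China
Funding: National Natural Science Foundation of China
Abstract: How to exploit syntactic structure and how to lexicalize the model are the two main problems in building a syntactic parsing model, and this work makes a preliminary exploration of both for Chinese dependency parsing. First, lexical dependency information is acquired by statistical learning over a large-scale dependency treebank, and a lexicalized probabilistic parsing model is built from it. Then the concept of lexical governing degree is introduced to make full use of the structural information in a sentence. Lexicalization effectively compensates for the overly coarse granularity of the part-of-speech information used in earlier work. At the same time, the lexical governing degree strengthens the recognition of syntactic structure and effectively avoids generating ill-formed structures. On a test set of 4,000 sentences, the dependency parser achieves an accuracy of about 74%.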

Keywords: dependency grammar; syntactic parsing; governing degree; dynamic programming
Received: 2005-04-28
Revised: 2005-10-10

Chinese Dependency Parsing Model Based on Lexical Governing Degree
LIU Ting, MA Jin-Shan, LI Sheng. Chinese Dependency Parsing Model Based on Lexical Governing Degree [J]. Journal of Software, 2006, 17(9): 1876-1883.
Authors: LIU Ting, MA Jin-Shan, LI Sheng
Affiliation: Information Retrieval Laboratory, Harbin Institute of Technology, Harbin 150001, China
Abstract: How to use structural information and how to lexicalize the model are two of the main challenges in syntactic parsing, and this paper investigates both for Chinese dependency parsing. First, lexical dependency probabilities are estimated from a large-scale dependency treebank and used to build a lexicalized model. Second, the governing degree of words is introduced to exploit structural information. Lexicalization overcomes the coarse granularity of the part-of-speech (POS) dependencies used in earlier work, while the governing degree helps discriminate between syntactic structures so that some ill-formed structures are avoided. On a test set of 4,000 sentences, the parser achieves about 74% accuracy.
Keywords: dependency grammar; parsing; governing degree; dynamic programming
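
The abstract gives no formulas, so the following is only a rough sketch of the two ingredients it names, not the authors' model. It assumes that lexical dependency probabilities are plain relative frequencies of head-dependent word pairs, and that a word's "governing degree" can be approximated by the distribution over how many dependents it takes. The toy treebank, the function names, and the product-form score are all hypothetical, and the dynamic-programming search listed in the keywords is omitted.

```python
from collections import Counter, defaultdict

# Hypothetical toy treebank: each sentence is a list of (word, head_index),
# with head_index = -1 marking the root.
# Glosses: 他 "he", 她 "she", 喜欢 "like", 苹果 "apple", 音乐 "music", 来 "come".
TREEBANK = [
    [("他", 1), ("喜欢", -1), ("苹果", 1)],
    [("她", 1), ("喜欢", -1), ("音乐", 1)],
    [("他", 1), ("来", -1)],
]

def train(treebank):
    """Collect head-dependent counts and per-word dependent-count histograms."""
    pair_counts = Counter()               # (head_word, dep_word) -> count
    head_counts = Counter()               # head_word -> total dependents seen
    degree_counts = defaultdict(Counter)  # word -> {number of dependents: count}
    for sent in treebank:
        n_deps = [0] * len(sent)
        for word, head in sent:
            if head >= 0:
                head_word = sent[head][0]
                pair_counts[(head_word, word)] += 1
                head_counts[head_word] += 1
                n_deps[head] += 1
        for (word, _), n in zip(sent, n_deps):
            degree_counts[word][n] += 1
    return pair_counts, head_counts, degree_counts

def dep_prob(model, head_word, dep_word):
    """Relative-frequency estimate of P(dep_word | head_word); 0.0 if unseen."""
    pair_counts, head_counts, _ = model
    total = head_counts[head_word]
    return pair_counts[(head_word, dep_word)] / total if total else 0.0

def degree_prob(model, word, n):
    """Assumed stand-in for governing degree: P(word governs n dependents)."""
    _, _, degree_counts = model
    total = sum(degree_counts[word].values())
    return degree_counts[word][n] / total if total else 0.0

def tree_score(model, sent):
    """Score a candidate tree as the product of arc and degree probabilities."""
    score = 1.0
    n_deps = [0] * len(sent)
    for word, head in sent:
        if head >= 0:
            score *= dep_prob(model, sent[head][0], word)
            n_deps[head] += 1
    for (word, _), n in zip(sent, n_deps):
        score *= degree_prob(model, word, n)
    return score

model = train(TREEBANK)
good = [("他", 1), ("喜欢", -1), ("苹果", 1)]  # both nouns attach to the verb
bad = [("他", 1), ("喜欢", -1), ("苹果", 0)]   # 苹果 wrongly attached to 他
print(tree_score(model, good), tree_score(model, bad))
```

In this toy setting the wrongly attached tree scores zero because 他 never governs a dependent in the training data. A real model would need smoothing for unseen word pairs, which this sketch leaves out.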
This article is indexed in CNKI, VIP (维普), Wanfang Data, and other databases.