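Automatic Parsing of Chinese Functional Chunks (汉语功能块自动分析)
Cite this article: ZHOU Qiang, ZHAO Ying-ze. Automatic Parsing of Chinese Functional Chunks (汉语功能块自动分析)[J]. Journal of Chinese Information Processing (中文信息学报), 2007, 21(5): 18-24.
Authors: ZHOU Qiang (周强), ZHAO Ying-ze (赵颖泽)
Affiliation: Department of Computer Science, Tsinghua University; State Key Laboratory of Intelligent Technology and Systems, Beijing 100084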
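Abstract (translated from the Chinese): Chinese functional chunks describe the basic skeleton of a sentence and serve as an important bridge between syntactic structure and semantic description. This paper proposes two functional chunk parsing models, a boundary recognition model and a sequence labeling model, and simulates each computationally with a different machine learning method. Merging the outputs of the two models exploits their complementarity, bringing automatic recognition performance for the four typical functional chunks of Chinese sentences (subject, predicate, object, and adverbial) to above 80%. The experimental results show that machine learning algorithms based on local lexical context can accurately recognize most functional chunks from different perspectives, and that complex clauses and serial verb constructions are the main recognition difficulties.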

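Keywords: computer application; Chinese information processing; Chinese functional chunk; boundary recognition model; sequence labeling model; model merging
Article ID: 1003-0077(2007)05-0018-07
Received: 2007-04-15
Revised: 2007-06-25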

Automatic Parsing of Chinese Functional Chunks
ZHOU Qiang, ZHAO Ying-ze. Automatic Parsing of Chinese Functional Chunks[J]. Journal of Chinese Information Processing, 2007, 21(5): 18-24.
Authors: ZHOU Qiang, ZHAO Ying-ze
Affiliation: State Key Laboratory of Intelligent System and Technology, Dept. of Computer Science and Technology, Tsinghua University, Beijing 100084, China
Abstract: Chinese functional chunks are defined as a series of non-overlapping, non-nested skeleton segments of a sentence, representing the implicit grammatical relations between the sentence-level predicates and their arguments. In this paper, we propose two statistical models for parsing the four main functional chunks in a sentence. In the chunk boundary detection model, we focus on building SVM-based sub-models for detecting SP (subject-predicate) and PO (predicate-object) boundaries. In the sequence labeling model, we formulate the chunking task as a sequence labeling problem and base our model on the CRF algorithm. By introducing some revision rules, we build a combined parsing model that integrates the advantages of both statistical models and achieves best F-scores of 82.93%, 86.58%, 78.46% and 86.64% for subject, predicate, object and adverbial functional chunks, respectively. Experimental results show that complex clauses and serial verb structures are the main recognition difficulties.
Keywords: computer application; Chinese information processing; functional chunk; boundary recognition model; sequence labeling model; model merging
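
The abstract describes functional chunks as non-overlapping, non-nested segments and casts chunking as a sequence labeling problem scored by per-type F-score. The following is a minimal Python sketch of that framing only, not the authors' implementation: the B-/I-/O tag scheme, the chunk labels "S", "P", "O", "D", the exact-match scoring rule, and all function names are assumptions made for illustration, and no CRF or SVM is trained here.

```python
from typing import List, Tuple

# (start word index, end word index exclusive, chunk type); hypothetical format
Chunk = Tuple[int, int, str]


def chunks_to_tags(n_words: int, chunks: List[Chunk]) -> List[str]:
    """Encode non-overlapping, non-nested chunks as one B-/I-/O tag per word."""
    tags = ["O"] * n_words
    for start, end, label in chunks:
        if not 0 <= start < end <= n_words:
            raise ValueError("chunk span out of range")
        if any(t != "O" for t in tags[start:end]):
            raise ValueError("chunks must not overlap or nest")
        tags[start] = "B-" + label
        for i in range(start + 1, end):
            tags[i] = "I-" + label
    return tags


def tags_to_chunks(tags: List[str]) -> List[Chunk]:
    """Decode a tag sequence back into (start, end, label) chunks."""
    chunks: List[Chunk] = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel tag closes the last chunk
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            chunks.append((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
        elif tag.startswith("I-") and start is None:
            # A stray I- tag is treated as the start of a new chunk.
            start, label = i, tag[2:]
    return chunks


def chunk_f_score(gold: List[Chunk], pred: List[Chunk], label: str) -> float:
    """F-score for one chunk type; a chunk counts only if both boundaries match."""
    g = {c for c in gold if c[2] == label}
    p = {c for c in pred if c[2] == label}
    correct = len(g & p)
    precision = correct / len(p) if p else 0.0
    recall = correct / len(g) if g else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative five-word sentence: subject, adverbial, predicate, two-word object.
gold = [(0, 1, "S"), (1, 2, "D"), (2, 3, "P"), (3, 5, "O")]
pred = [(0, 1, "S"), (1, 2, "D"), (2, 3, "P"), (3, 4, "O")]  # object cut short
print(chunks_to_tags(5, gold))
print({lab: chunk_f_score(gold, pred, lab) for lab in "SPOD"})
```

Under this exact-match rule, a single misplaced boundary makes the whole chunk count as wrong, which is one way to see why long object chunks containing clauses would be harder to score well on than short predicate chunks.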
This article is indexed in databases including CNKI, VIP (维普), and Wanfang Data (万方数据).