
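Automatic Chinese sentence group segmentation method based on multiple discriminant analysis
Cite this article: WANG Rongbo, LI Jie, HUANG Xiaoxi, ZHOU Changle, CHEN Zhiqun, WANG Xiaohua. Automatic Chinese sentence group segmentation method based on multiple discriminant analysis [J]. Journal of Computer Applications, 2015, 35(5): 1314-1319.
Authors: WANG Rongbo  LI Jie  HUANG Xiaoxi  ZHOU Changle  CHEN Zhiqun  WANG Xiaohua
Affiliations: 1. Institute of Cognitive and Intelligent Computing, Hangzhou Dianzi University, Hangzhou 310018; 2. Department of Intelligent Science and Technology, Xiamen University, Xiamen, Fujian 361005
Funding: National Natural Science Foundation of China; Youth Fund of the Ministry of Education Humanities and Social Sciences Research Program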
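Abstract: Existing work on sentence group segmentation lacks support from computational-linguistics data and ignores discourse connectives, and current discourse analysis rarely studies the sentence group as a grammatical unit. To address these problems, an automatic Chinese sentence group segmentation method is proposed. Guided by Chinese sentence group theory, the method constructs an annotated evaluation corpus for Chinese sentence group segmentation and designs a set of evaluation functions J based on Multiple Discriminant Analysis (MDA) to segment Chinese sentence groups automatically. Experimental results show that introducing the segment-length factor and the discourse-connective factor improves segmentation performance, and that the Skip-Gram model works better than the traditional Vector Space Model (VSM): the correct segmentation rate Pμ reaches 85.37% and the segmentation error rate WindowDiff falls to 24.08%. The method is also better suited to the sentence group segmentation task and segments sentence groups better than the traditional MDA method.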

Keywords: Chinese sentence group segmentation  multiple discriminant analysis  discourse analysis  Skip-Gram model  discourse cohesion
Received: 2014-12-05
Revised: 2014-12-24

Automatic Chinese sentences group method based on multiple discriminant analysis
WANG Rongbo, LI Jie, HUANG Xiaoxi, ZHOU Changle, CHEN Zhiqun, WANG Xiaohua. Automatic Chinese sentences group method based on multiple discriminant analysis [J]. Journal of Computer Applications, 2015, 35(5): 1314-1319.
Authors:WANG Rongbo  LI Jie  HUANG Xiaoxi  ZHOU Changle  CHEN Zhiqun  WANG Xiaohua
Affiliation: 1. Institute of Cognitive and Intelligent Computing, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China;
2. Department of Intelligent Science and Technology, Xiamen University, Xiamen, Fujian 361005, China
Abstract: Existing work on Chinese sentence grouping lacks computational-linguistics data and ignores discourse connectives, and sentence groups are rarely treated as a grammatical unit in discourse analysis. To address these problems, this paper proposes an automatic Chinese sentence grouping method based on Multiple Discriminant Analysis (MDA). An annotated evaluation corpus for Chinese sentence grouping was constructed according to Chinese sentence group theory, and a set of evaluation functions J was designed based on the MDA method to perform the grouping automatically. The experimental results show that taking into account the length of a segmented unit and the discourse connectives improves grouping performance, and that the Skip-Gram model performs better than the traditional Vector Space Model (VSM): the evaluation measure Pμ reaches 85.37% and WindowDiff falls to 24.08%. The proposed method also achieves better grouping performance than the original MDA method.
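The WindowDiff figure quoted in the abstract is a segmentation error measure: a window of k sentences slides over the text, and a window counts as an error whenever the reference and the hypothesized segmentation place a different number of boundaries inside it. The following is a minimal sketch of the standard WindowDiff definition, not the authors' evaluation code. The function names, the representation of a segmentation as a list of segment lengths (in sentences), and the default choice of k as half the average reference segment length are assumptions made for illustration.

```python
from typing import List, Optional


def boundary_gaps(segment_lengths: List[int]) -> List[int]:
    """Map segment lengths to a 0/1 list over the gaps between adjacent
    sentences, where 1 marks a segment boundary (hypothetical helper)."""
    total = sum(segment_lengths)
    gaps = [0] * (total - 1)
    position = 0
    for length in segment_lengths[:-1]:
        position += length
        gaps[position - 1] = 1  # boundary after sentence number `position`
    return gaps


def window_diff(reference: List[int], hypothesis: List[int],
                k: Optional[int] = None) -> float:
    """Standard WindowDiff: fraction of length-k windows in which the two
    segmentations disagree on the number of boundaries (lower is better)."""
    n = sum(reference)
    if n != sum(hypothesis):
        raise ValueError("both segmentations must cover the same sentences")
    if k is None:
        # Assumed default: half the average reference segment length.
        k = max(1, round(n / (2 * len(reference))))
    if not 0 < k < n:
        raise ValueError("window size k must satisfy 0 < k < n")
    ref_gaps = boundary_gaps(reference)
    hyp_gaps = boundary_gaps(hypothesis)
    errors = sum(
        1
        for i in range(n - k)
        if sum(ref_gaps[i:i + k]) != sum(hyp_gaps[i:i + k])
    )
    return errors / (n - k)


if __name__ == "__main__":
    # Nine sentences: reference groups of 3, 2 and 4; hypothesis of 5 and 4.
    print(window_diff([3, 2, 4], [5, 4], k=2))
```

Under this definition an identical segmentation scores 0, and missing one of the two reference boundaries in the example above is penalized in every window that covers the missed boundary.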
Keywords: Chinese sentence grouping  Multiple Discriminant Analysis (MDA)  discourse analysis  Skip-Gram model  discourse cohesion
This article is indexed in Wanfang Data and other databases.