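Prediction of Mandarin Verbs-Event Types Based on Linguistic Features Vectors and Word Embedding Vectors
Cite this article: LIU Hongchao, HUANG Churen, HOU Renkui, LI Hongzheng. Prediction of Mandarin Verbs-Event Types Based on Linguistic Features Vectors and Word Embedding Vectors[J]. Journal of Chinese Information Processing, 2018, 32(1): 26-33
Authors: LIU Hongchao  HUANG Churen  HOU Renkui  LI Hongzheng
Affiliations: 1. Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong; 2. School of Chinese Language and Literature, Ludong University, Yantai, Shandong 264001; 3. Institute of Chinese Information Processing, Beijing Normal University, Beijing 100875
Funding: National Social Science Fund of China (16BYY110)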
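Abstract: This paper addresses the prediction of event types of Mandarin Chinese verbs. Event type is an important classification of Chinese verbs according to their internal temporal structure, comprising state, activity, and transition (accomplishment and achievement). In theory, predicting the event types of Chinese verbs makes it possible to validate the features proposed in earlier linguistic research; in application, it can serve tasks such as machine translation. The paper builds word vectors in two ways to predict event types: one constructs vectors in a supervised manner from linguistic features, and the other uses word2vec to build word embedding vectors in an unsupervised manner. Event types are predicted with multinomial logistic regression, support vector machine, and artificial neural network classifiers, reaching an overall accuracy of 73.6%.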

Keywords: event type; Mandarin verbs; linguistic features; word embedding; classification; prediction

Prediction of Mandarin Verbs-Event Types Based on Linguistic Features Vectors and Word Embedding Vectors
LIU Hongchao,HUANG Churen,HOU Renkui,LI Hongzheng. Prediction of Mandarin Verbs-Event Types Based on Linguistic Features Vectors and Word Embedding Vectors[J]. Journal of Chinese Information Processing, 2018, 32(1): 26-33
Authors: LIU Hongchao  HUANG Churen  HOU Renkui  LI Hongzheng
Affiliation: 1. CBS, The Hong Kong Polytechnic University, Hong Kong, China; 2. School of Chinese Language and Literature, Ludong University, Yantai, Shandong 264001, China; 3. Institute of Chinese Information Processing, Beijing Normal University, Beijing 100875, China
Abstract: This paper investigates the prediction of event types of Mandarin verbs, which are divided either into three classes (state, activity and transition) or into four (state, activity, accomplishment and achievement). Previous linguistic studies have proposed various features for distinguishing these event types, but none of these features has been validated by statistical or computational methods. Two kinds of vectors are examined for prediction: supervised vectors built from linguistic features and unsupervised embedding vectors produced by word2vec. Using multinomial logistic regression, support vector machine and neural network classifiers, we achieve an overall accuracy of 73.6%.
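
As a rough illustration of the classification setup the abstract describes, the following minimal sketch trains a multinomial (softmax) logistic regression on per-verb feature vectors and predicts one of three event types. It is not the authors' implementation. The three binary features, the toy training rows, the learning rate and the epoch count are all assumptions made up for this example; the abstract does not list the paper's actual feature set or data. Word2vec embeddings could in principle be passed in as the feature vectors instead.

```python
import math

LABELS = ["state", "activity", "transition"]

# Hypothetical binary features per verb (NOT the paper's feature set):
# [accepts progressive marker, accepts durative adverbial, accepts degree adverb]
# Toy, invented training rows: (feature vector, index into LABELS)
TRAIN = [
    ([0, 0, 1], 0), ([0, 1, 1], 0),   # state
    ([1, 1, 0], 1), ([0, 1, 0], 1),   # activity
    ([0, 0, 0], 2), ([1, 0, 0], 2),   # transition
]

def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def scores(W, b, x):
    """Linear score of each class for feature vector x."""
    return [sum(w * xi for w, xi in zip(W[k], x)) + b[k] for k in range(len(W))]

def train(data, n_classes=3, lr=0.5, epochs=5000):
    """Full-batch gradient descent on the mean cross-entropy loss."""
    n_feat = len(data[0][0])
    W = [[0.0] * n_feat for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        gW = [[0.0] * n_feat for _ in range(n_classes)]
        gb = [0.0] * n_classes
        for x, y in data:
            p = softmax(scores(W, b, x))
            for k in range(n_classes):
                err = p[k] - (1.0 if k == y else 0.0)
                gb[k] += err
                for j in range(n_feat):
                    gW[k][j] += err * x[j]
        for k in range(n_classes):
            b[k] -= lr * gb[k] / len(data)
            for j in range(n_feat):
                W[k][j] -= lr * gW[k][j] / len(data)
    return W, b

def predict(W, b, x):
    """Return the event-type label with the highest score."""
    s = scores(W, b, x)
    return LABELS[s.index(max(s))]

W, b = train(TRAIN)
print(predict(W, b, [0, 1, 0]))
```

The toy rows are linearly separable by construction, so the sketch only shows the mechanics of the classifier and says nothing about the 73.6% accuracy reported above.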