
Extraction Method of Vietnamese News Event Elements Based on Maximum Entropy
Cite this article: Zhou Feng, Miao Jiepu, Pan Qingqing, Yan Xin, Yu Zhengtao. Extraction Method of Vietnamese News Event Elements Based on Maximum Entropy[J]. Journal of Data Acquisition & Processing, 2017, 32(4): 838-843.
Authors: Zhou Feng, Miao Jiepu, Pan Qingqing, Yan Xin, Yu Zhengtao
Affiliation: Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
Abstract: Vietnam borders China and is an important neighbor for political, military and economic cooperation, yet research on extracting the elements of Vietnamese news events remains scarce. This paper proposes a method for extracting Vietnamese news event elements based on the maximum entropy model. Taking into account the characteristics of Vietnamese sentence structure and lexical semantics, the method applies the maximum entropy algorithm, selects the context, adjacent trigger words and adjacent entities as features, defines feature templates, and trains a Vietnamese news event model that is then used to extract news event elements. Experimental results show that the proposed method extracts news event elements with an accuracy of more than 80%.

Keywords: Vietnamese; maximum entropy; machine learning; news event element extraction
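
The feature design described in the abstract (context, adjacent trigger words, adjacent entities) can be illustrated with a small sketch. A maximum entropy classifier models p(y|x) as proportional to exp(Σ λ_k f_k(x, y)), which is equivalent to multinomial logistic regression over binary features. Everything below is an assumption made for illustration and is not taken from the paper: the label set (PARTICIPANT, PLACE, TIME, NONE), the trigger lexicon, the exact feature template, the plain gradient-ascent training, and the invented English placeholder tokens that stand in for segmented Vietnamese words.

```python
import math

# Hypothetical trigger lexicon; the paper's actual trigger list is not given here.
TRIGGERS = {"visited", "met"}
LABELS = ["PARTICIPANT", "PLACE", "TIME", "NONE"]  # assumed element types


def features(tokens, ents, i):
    """Assumed feature template for candidate token i:
    context words, nearest trigger word, and nearest other entity."""
    n = len(tokens)
    feats = [
        "w0=" + tokens[i],
        "w-1=" + (tokens[i - 1] if i > 0 else "<s>"),
        "w+1=" + (tokens[i + 1] if i + 1 < n else "</s>"),
        "ent0=" + ents.get(i, "O"),
    ]
    trig = [j for j in range(n) if tokens[j] in TRIGGERS]
    if trig:
        j = min(trig, key=lambda k: abs(k - i))
        side = "left" if j < i else "right" if j > i else "self"
        feats += ["trig=" + tokens[j], "trig_side=" + side]
    near = [j for j in ents if j != i]
    if near:
        j = min(near, key=lambda k: abs(k - i))
        feats.append("near_ent=" + ents[j])
    return feats


class MaxEnt:
    """Conditional maximum entropy model trained by batch gradient ascent."""

    def __init__(self, labels):
        self.labels = list(labels)
        self.w = {}  # weight per (feature, label) pair

    def predict_proba(self, feats):
        scores = {y: sum(self.w.get((f, y), 0.0) for f in feats)
                  for y in self.labels}
        m = max(scores.values())  # subtract the max for numerical stability
        exps = {y: math.exp(s - m) for y, s in scores.items()}
        z = sum(exps.values())
        return {y: e / z for y, e in exps.items()}

    def predict(self, feats):
        probs = self.predict_proba(feats)
        return max(probs, key=probs.get)

    def fit(self, data, epochs=500, lr=0.2, l2=0.01):
        for _ in range(epochs):
            grad = {}
            for feats, gold in data:
                probs = self.predict_proba(feats)
                for f in feats:
                    for y in self.labels:
                        # observed count minus expected count of the feature
                        g = (1.0 if y == gold else 0.0) - probs[y]
                        grad[(f, y)] = grad.get((f, y), 0.0) + g
            for k, g in grad.items():
                w = self.w.get(k, 0.0)
                self.w[k] = w + lr * (g / len(data) - l2 * w)


# Invented toy sentences: (tokens, entity types by index, gold label per token).
SENTS = [
    (["president", "visited", "Hanoi", "on", "Monday"],
     {0: "PER", 2: "LOC", 4: "DATE"},
     ["PARTICIPANT", "NONE", "PLACE", "NONE", "TIME"]),
    (["minister", "met", "delegates", "in", "Hue", "yesterday"],
     {0: "PER", 2: "PER", 4: "LOC", 5: "DATE"},
     ["PARTICIPANT", "NONE", "PARTICIPANT", "NONE", "PLACE", "TIME"]),
]

train = [(features(toks, ents, i), gold[i])
         for toks, ents, gold in SENTS for i in range(len(toks))]
model = MaxEnt(LABELS)
model.fit(train)

# Label every token of an unseen toy sentence.
test_tokens = ["delegates", "visited", "Hue", "on", "Friday"]
test_ents = {0: "PER", 2: "LOC", 4: "DATE"}
for i, tok in enumerate(test_tokens):
    print(tok, model.predict(features(test_tokens, test_ents, i)))
```

In practice the entity annotations and trigger words would come from upstream Vietnamese word segmentation and named entity recognition, and an off-the-shelf maximum entropy or logistic regression trainer would likely replace the hand-written gradient loop shown here.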