
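Automatic Event Labeling for Traffic Information Extraction from Microblogs
Cite this article: QIU Peiyuan, ZHANG Hengcai, YU Li, LU Feng. Automatic Event Labeling for Traffic Information Extraction from Microblogs[J]. Journal of Chinese Information Processing, 2017, 31(2): 107-116.
Authors: QIU Peiyuan  ZHANG Hengcai  YU Li  LU Feng
Affiliation: 1. State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China;
2. University of Chinese Academy of Sciences, Beijing 100101, China
Funding: National Natural Science Foundation of China (41631177); National Natural Science Foundation of China (41401460)
Abstract: Microblog text contains rich real-time traffic event information and can supplement existing traffic information collection methods. However, current event extraction methods lack a step for judging the relations between geographic entities. They perform poorly when extracting geospatial elements that involve multiple geographic entities and relation expressions, and therefore struggle to identify the location descriptions in traffic event information accurately. This paper proposes an automatic labeling method that addresses the problem by introducing geographic entity relation recognition into the event extraction process. The method uses a conditional random field model to label traffic event roles and support vector machine models to label role relations and element relations, thereby identifying the spatial elements of traffic event information. Experiments using Sina Weibo as the data source show that the proposed method reaches 90% in both precision and recall, outperforming existing extraction methods based on pattern matching.

Keywords: microblog  information extraction  traffic event  conditional random field  support vector machine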

Automatic Event Labeling for Traffic Information Extraction from Microblogs
QIU Peiyuan, ZHANG Hengcai, YU Li, LU Feng. Automatic Event Labeling for Traffic Information Extraction from Microblogs[J]. Journal of Chinese Information Processing, 2017, 31(2): 107-116.
Authors: QIU Peiyuan  ZHANG Hengcai  YU Li  LU Feng
Affiliation: State Key Lab of Resources and Environmental Information System,
IGSNRR, CAS, Beijing 100101, China;
University of Chinese Academy of Sciences, Beijing 100101, China
Abstract: Microblog messages often contain a large amount of real-time traffic information, which can complement sensor-based traffic information collection technologies. In this paper, we propose an automatic event labeling method for extracting traffic information from microblog messages. Specifically, we incorporate the identification of spatial relations between geographic entities into event extraction to determine the spatial elements of traffic event messages. First, a conditional random field model labels the event roles in the message text. Second, SVM models tag the relations between roles and the relations between elements. Experiments on Sina microblogs show that the precision and recall of the proposed approach are both over 90%, outperforming the well-known pattern matching method.
Keywords: microblog  information extraction  traffic event  conditional random fields  support vector machine
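
The abstract reports results as precision and recall. As a reminder of how these two measures are typically computed for an extraction task, here is a minimal sketch in Python. It is not the authors' evaluation code: the function name `precision_recall`, the representation of each extracted event as an (event type, location) pair, and the sample data are all assumptions made for illustration.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for two collections of extracted items.

    An item counts as correct only if it matches a gold item exactly.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Guard against empty inputs to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical (event type, location) pairs, invented for illustration only.
gold = {
    ("congestion", "Road A"),
    ("accident", "Road B"),
    ("closure", "Road C"),
    ("congestion", "Road D"),
}
predicted = {
    ("congestion", "Road A"),
    ("accident", "Road B"),
    ("closure", "Road E"),  # wrong location, so it counts as an error
}

p, r = precision_recall(predicted, gold)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy data, two of the three predicted pairs are correct and two of the four gold pairs are found. Exact matching on the location is a strict choice; an evaluation could also credit partially correct location descriptions, and the abstract does not say which convention the paper uses.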