
Extracting Traffic Information from Micro Blog Based on D-S Evidence Theory
Cite this article: ZHANG Hengcai, LU Feng, QIU Peiyuan. Extracting Traffic Information from Micro Blog Based on D-S Evidence Theory[J]. Journal of Chinese Information Processing, 2015, 29(2): 170-178
Authors: ZHANG Hengcai  LU Feng  QIU Peiyuan
Affiliation: State Key Lab of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China
Funding: National 863 Program (2012AA12A211, 2013AA120305); National Natural Science Foundation of China (41271408)
Abstract: Micro-blog messages often contain a large amount of real-time traffic information, which could complement existing methods of collecting real-time traffic data. To address the semantic ambiguity of micro-blog messages and the differences in how users describe traffic, this paper proposes a method based on D-S evidence theory for extracting the traffic information contained in micro-blog messages. First, an evaluation index system for the traffic state information contained in micro-blog messages is built, and its accuracy is improved with a Wikipedia semantic model. Second, a basic probability assignment function is defined for the micro-blog message sources with the help of word similarity. Finally, evidence combination and evidence decision are used to judge and fuse the extracted real-time traffic information. An experiment on the Beijing road network and the Sina Micro-blog platform shows that the approach can effectively judge the reliability of the traffic information contained in micro-blog messages and can make the fullest use of the message content posted by different micro-blog users. It is also more accurate than a traditional text-clustering fusion method.

Keywords: micro-blog   traffic information   text clustering   D-S evidence theory   Wikipedia
This article is indexed in CNKI and other databases.
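
The evidence combination and decision steps named in the abstract can be illustrated with a minimal sketch of Dempster's rule of combination. This is not the authors' implementation: the frame of discernment (`congested`, `slow`, `free`), the mass values, and the function names `combine` and `decide` are all assumptions made for illustration, and the paper's own basic probability assignment (derived from word similarity and the evaluation index system) is not reproduced here.

```python
from itertools import product

# Hypothetical frame of discernment for one road segment (an assumption,
# not taken from the paper).
THETA = frozenset({"congested", "slow", "free"})


def combine(m1, m2):
    """Fuse two mass functions with Dempster's rule of combination.

    Each mass function maps a frozenset of hypotheses to its mass,
    and the masses of one function sum to 1.
    """
    fused = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            fused[common] = fused.get(common, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses.
            conflict += mass_a * mass_b
    if 1.0 - conflict < 1e-12:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Renormalise by the non-conflicting mass.
    return {h: v / (1.0 - conflict) for h, v in fused.items()}


def decide(m):
    """Pick the single hypothesis with the largest fused mass."""
    singletons = {h: v for h, v in m.items() if len(h) == 1}
    best = max(singletons, key=singletons.get)
    return next(iter(best)), singletons[best]


# Two illustrative messages about the same road segment (made-up masses).
# Mass on THETA expresses "this message does not say".
m_user1 = {frozenset({"congested"}): 0.6, frozenset({"slow"}): 0.1, THETA: 0.3}
m_user2 = {frozenset({"congested"}): 0.5, frozenset({"free"}): 0.2, THETA: 0.3}

fused = combine(m_user1, m_user2)
state, belief = decide(fused)
print(state, round(belief, 3))
```

In this sketch two messages that each lean toward congestion reinforce one another after fusion, while the mass left on the whole frame keeps some uncertainty. A decision rule with explicit thresholds could replace the simple maximum used in `decide`.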