
Research on Preprocessing Techniques for Massive Taxi Trajectory Data Using Hadoop
Citation: Lv Jiangbo, Zhang Yongzhong. Based on the Hadoop Massive Taxi Trajectory Data Preprocessing Technology Research[J]. Urban Geotechnical Investigation & Surveying, 2016(3): 46-49.
Authors: Lv Jiangbo, Zhang Yongzhong
Affiliations: Lanzhou Jiaotong University, Lanzhou, Gansu 730070; Lanzhou Geotechnical Investigation and Surveying Research Institute, Lanzhou, Gansu 730030
Abstract: Preprocessing massive taxi trajectory data is a prerequisite for trajectory data mining and its applications. Taxi trajectory data is typical big data, and traditional data processing techniques cannot handle error analysis and error correction at that scale. Starting from an analysis of the sources and types of error in trajectory data, the paper studies methods for error statistics and data processing on massive trajectory datasets and proposes a Hadoop-based preprocessing model for massive taxi trajectory data: Hive is used for statistical analysis of trajectory data errors, and a parallel MapReduce program is designed to carry out the preprocessing. Experimental results show that the model handles large-scale taxi trajectory data preprocessing effectively, that the processing is fairly reliable, and that preprocessing efficiency improves greatly, which lays a foundation for later in-depth mining and analysis of trajectory data.

Keywords: trajectory data; Hadoop; big data; data preprocessing; parallel computing
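
The abstract names a MapReduce program for trajectory preprocessing but gives no details. As a rough illustration only, the sketch below emulates a map step and a reduce step locally in plain Python. Everything specific in it is an assumption rather than the authors' method: the record layout (`taxi_id,unix_time,lon,lat`), the bounding box, the speed threshold, and the three error rules (malformed or out-of-range points, duplicate timestamps, implausible jumps).

```python
from collections import defaultdict
from math import radians, sin, cos, asin, sqrt

# Hypothetical thresholds for illustration; none of these come from the paper.
LON_RANGE = (102.5, 104.6)   # assumed study-area longitude bounds
LAT_RANGE = (35.5, 37.0)     # assumed study-area latitude bounds
MAX_SPEED_KMH = 150.0        # assumed upper bound on plausible taxi speed


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two lon/lat points."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def map_record(line):
    """Map step: parse one line, drop malformed or out-of-range records,
    and emit (taxi_id, (time, lon, lat)) for the rest."""
    parts = line.strip().split(",")
    if len(parts) != 4:
        return None
    try:
        taxi_id, t, lon, lat = parts[0], int(parts[1]), float(parts[2]), float(parts[3])
    except ValueError:
        return None
    in_box = LON_RANGE[0] <= lon <= LON_RANGE[1] and LAT_RANGE[0] <= lat <= LAT_RANGE[1]
    return (taxi_id, (t, lon, lat)) if in_box else None


def reduce_taxi(points):
    """Reduce step for one taxi: sort by time, then drop duplicate timestamps
    and points that would imply a speed above MAX_SPEED_KMH."""
    cleaned = []
    for t, lon, lat in sorted(points):
        if cleaned:
            prev_t, prev_lon, prev_lat = cleaned[-1]
            if t == prev_t:
                continue  # duplicate timestamp
            hours = (t - prev_t) / 3600.0
            if haversine_km(prev_lon, prev_lat, lon, lat) / hours > MAX_SPEED_KMH:
                continue  # implausible jump
        cleaned.append((t, lon, lat))
    return cleaned


def preprocess(lines):
    """Local stand-in for the shuffle phase: group mapped records by taxi_id,
    then apply the reduce step to each group."""
    groups = defaultdict(list)
    for line in lines:
        kv = map_record(line)
        if kv is not None:
            groups[kv[0]].append(kv[1])
    return {taxi_id: reduce_taxi(pts) for taxi_id, pts in groups.items()}
```

On a real cluster the grouping done by `preprocess` would be handled by the framework's shuffle phase, keyed on the taxi identifier, so that each reducer sees one vehicle's points together.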
This article is indexed in CNKI, Wanfang Data, and other databases.