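Application of a Spark-based deep neural decision tree to star/galaxy classification
Cite this article:Huang Zhichang, Wang Junyi, Zheng Lin, Fu Jielin. Application of a Spark-based deep neural decision tree to star/galaxy classification[J]. Application Research of Computers, 2017, 34(3).
Authors:Huang Zhichang  Wang Junyi  Zheng Lin  Fu Jielin
Affiliation:Guilin University of Electronic Technology; Key Laboratory of Cognitive Radio and Information Processing (Ministry of Education), Guilin University of Electronic Technology; Guangxi Key Laboratory of Cryptography and Information Security; Key Laboratory of Information Transmission and Distribution Technology of Communication Networks; Guangxi Key Laboratory of Wireless Wideband Communication and Signal Processing, Guilin University of Electronic Technology
Funding:National Natural Science Foundation of China (61362006, 61261017, 61571143, 61561014); Fund of the Key Laboratory of Information Transmission and Distribution Technology of Communication Networks, CETC 54th Research Institute (ITD-U14008/KX142600015(2014)); Fund of the Key Laboratory of Ubiquitous Networks (Ministry of Education), Beijing University of Posts and Telecommunications (KFKT-2014102); Fund of the Key Laboratory of Cognitive Radio and Information Processing (Ministry of Education) (2013ZR08, CRKL150112); Fund of the Guangxi Key Laboratory of Wireless Wideband Communication and Signal Processing (GXKL0614202, GXKL0614101, GXKL061501); Guangxi Natural Science Foundation (2013GXNSFAA019334); Graduate Education Program of Guilin University of Electronic Technology (YJCXS201517)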
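Abstract:Traditional decision tree classifiers depend on manually constructed features to classify data, and they run into processing speed and resource allocation bottlenecks when handling massive astronomical data. To address both problems, this work combines the feature learning ability of deep learning with the efficient data processing of Spark and proposes a parallelized deep neural decision tree algorithm on the Spark platform, which is applied to the astronomical star/galaxy classification problem. The results show that the algorithm scales well: adding compute nodes to the Spark cluster reduces the training time of the classification model and increases its capacity to process massive astronomical data. In addition, because the algorithm combines feature learning with classification, it achieves higher classification accuracy on the star/galaxy classification problem than a traditional decision tree.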

Keywords:Spark  deep learning  decision tree  parallelization  star/galaxy  classification
Received:2016/1/31
Revised:2017/1/17

The research on star/galaxy classification based on Spark deep neural decision tree
Huang Zhichang, Wang Junyi, Zheng Lin and Fu Jielin. The research on star/galaxy classification based on Spark deep neural decision tree[J]. Application Research of Computers, 2017, 34(3).
Authors:Huang Zhichang  Wang Junyi  Zheng Lin and Fu Jielin
Affiliation:Key Lab of Cognitive Radio and Information Processing, Ministry of Education, Guilin University of Electronic Technology, Guilin, Guangxi; Science and Technology on Information Security, Guilin University of Electronic Technology, Guilin, Guangxi; Guangxi Key Lab of Wireless Wideband Communication and Signal Processing, Guilin University of Electronic Technology
Abstract:A traditional decision tree needs predefined features before it can classify data, and it faces bottlenecks in processing speed and resource allocation when dealing with massive astronomical data. Considering the strong representation learning of deep learning and the good performance of Spark on huge amounts of data, this paper proposes a parallel deep neural decision tree based on Spark and applies it to the astronomical star/galaxy separation problem. The results show that the algorithm scales well with cluster size: increasing the number of compute nodes in the Spark cluster decreases the training time of the model and enhances its ability to process massive astronomical data. Moreover, the decision tree learns the representations of the input data and the final classifier jointly, which yields better classification accuracy on the star/galaxy separation problem than a traditional decision tree.
Keywords:Spark  deep learning  decision tree  parallelization  star/galaxy  classification