

Tests and variables selection on regression analysis for massive datasets
Authors:Tsai-Hung   Kuang-Fu
Affiliation: Graduate Institute of Statistics, National Central University, Chungli, Taiwan, ROC

Abstract: According to Lindley’s paradox, most point null hypotheses will be rejected when the sample size is too large. In this paper, a two-stage block testing procedure is proposed for regression analysis of massive data. New variable selection criteria, incorporated into the classical stepwise procedure, are also developed to select significant explanatory variables. The approach is computationally simple for massive data, and a simulation study confirms that it is more accurate in the sense of achieving the nominal significance level for huge data sets. A real example with a moderate sample size verifies that the proposed procedure is accurate compared with the classical method, and the procedure is also applied to a huge real data set to select appropriate regressors.
Keywords: Massive data; Hypothesis testing; Lindley’s paradox; Regression analysis; Stepwise selection
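
The motivating issue in the abstract, that a classical test of a point null hypothesis tends to reject once the sample becomes very large, can be illustrated with a small simulation. The sketch below is not the two-stage block procedure proposed in the paper, whose details are not given here. It only fits a simple linear regression by ordinary least squares and tests the slope against zero at increasing sample sizes. The true slope of 0.01, the sample sizes, the random seed, the normal approximation to the t distribution, and the helper name `slope_t_test` are all assumptions made for illustration.

```python
import math
import numpy as np


def slope_t_test(x, y):
    """Classical test of H0: slope = 0 in simple linear regression.

    Returns (slope estimate, t statistic, two-sided p-value).
    The p-value uses a normal approximation to the t distribution,
    which is adequate for the sample sizes used below.
    """
    n = len(x)
    xc = x - x.mean()
    sxx = np.dot(xc, xc)
    slope = np.dot(xc, y - y.mean()) / sxx
    intercept = y.mean() - slope * x.mean()
    resid = y - intercept - slope * x
    sigma2 = np.dot(resid, resid) / (n - 2)  # residual variance estimate
    t = slope / math.sqrt(sigma2 / sxx)
    p = math.erfc(abs(t) / math.sqrt(2))     # two-sided normal tail area
    return slope, t, p


rng = np.random.default_rng(0)   # fixed seed, assumed for reproducibility
true_slope = 0.01                # hypothetical, practically negligible effect

results = {}
for n in (100, 10_000, 1_000_000):
    x = rng.standard_normal(n)
    y = true_slope * x + rng.standard_normal(n)
    results[n] = slope_t_test(x, y)
    slope, t, p = results[n]
    print(f"n={n:>9,d}  slope={slope:+.4f}  t={t:+.2f}  p={p:.3g}")
```

Because the standard error of the slope shrinks roughly like 1/sqrt(n), the same negligible slope yields a t statistic near 10 at one million observations, so the classical test rejects decisively even though the effect has little practical meaning. This is the behaviour that block-based testing procedures for massive data aim to control.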
This article is indexed in ScienceDirect and other databases.