Sparse partial least-squares regression and its applications to high-throughput data analysis
Authors: Donghwan Lee, Woojoo Lee, Youngjo Lee, Yudi Pawitan
Affiliation:
  • a Department of Statistics, Seoul National University, Seoul, South Korea
  • b Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
Abstract:
    The partial least-squares (PLS) method is designed for prediction problems in which the number of predictors exceeds the number of training samples. PLS is based on latent components that are linear combinations of all the original predictors, so it automatically uses every predictor regardless of its relevance. This can compromise predictive performance and also makes the results difficult to interpret. In this paper, we propose a new formulation of the sparse PLS (SPLS) procedure that allows both sparse variable selection and dimension reduction. We use the standard L1-penalty and the unbounded penalty of [1]. We develop a computing algorithm for SPLS by modifying the nonlinear iterative partial least-squares (NIPALS) algorithm, and illustrate the method with an analysis of a cancer dataset. In numerical studies, we find that our SPLS method generally performs better than standard PLS and other existing methods in variable selection and prediction.
Keywords: Lasso, Modeling, Prediction, Regression analyses, Variable selection
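
As a rough illustration of the idea summarized in the abstract, the following minimal Python sketch combines a NIPALS-style deflation loop with L1 soft-thresholding of the weight vector, for a single response. It is not the authors' SPLS algorithm: the function names `soft_threshold` and `sparse_pls1`, the tuning parameter `eta`, and the simulated data are all assumptions made for illustration, and the unbounded penalty of [1] is not implemented.

```python
import numpy as np


def soft_threshold(v, lam):
    """Elementwise L1 soft-thresholding: shrink toward zero by lam."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)


def sparse_pls1(X, y, n_components=2, eta=0.5):
    """Illustrative sparse PLS for one response (hypothetical helper).

    eta in [0, 1) sets the threshold as a fraction of the largest
    absolute covariance; eta = 0 gives ordinary PLS1 weights.
    """
    Xk = X - X.mean(axis=0)          # centre predictors
    yk = y - y.mean()                # centre response
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                # covariance direction (PLS weight)
        w = soft_threshold(w, eta * np.max(np.abs(w)))
        norm = np.linalg.norm(w)
        if norm == 0.0:              # nothing left to extract
            break
        w = w / norm
        t = Xk @ w                   # latent component (score)
        tt = t @ t
        p = Xk.T @ t / tt            # X loading
        c = (yk @ t) / tt            # y loading
        Xk = Xk - np.outer(t, p)     # NIPALS-style deflation of X
        yk = yk - c * t              # deflation of y
        W.append(w)
        P.append(p)
        q.append(c)
    if not W:
        return np.zeros(X.shape[1])
    W = np.column_stack(W)
    P = np.column_stack(P)
    q = np.array(q)
    # Coefficients on the centred predictors: W (P^T W)^{-1} q
    return W @ np.linalg.solve(P.T @ W, q)


# Simulated example with more predictors (100) than samples (40);
# only the first three predictors carry signal.
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 100))
y = 3.0 * X[:, :3].sum(axis=1) + 0.1 * rng.standard_normal(40)
beta = sparse_pls1(X, y, n_components=2, eta=0.8)
print("non-zero coefficients:", np.count_nonzero(beta))
```

Because a predictor whose weight is zero in every component also receives a zero regression coefficient, the thresholding step performs variable selection while the latent components provide dimension reduction. In practice `eta` and the number of components would be chosen by cross-validation.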
This article is indexed in ScienceDirect and other databases.