Genetic algorithms for outlier detection and variable selection in linear regression models
Authors: J. Tolvi (email: jussi.tolvi@utu.fi)
Affiliation: (1) Department of Economics, University of Turku, 20014 Turku, Finland
Abstract: This article addresses two problems in outlier detection and variable selection in linear regression models. First, outlier detection suffers from smearing and masking. Smearing means that one outlier makes another, non-outlying observation appear to be an outlier; masking means that one outlier prevents another from being detected. Detecting outliers one at a time may therefore give misleading results. This article presents a genetic algorithm that considers different possible groupings of the data into outlier and non-outlier observations, so that all outliers are detected at the same time. Second, outlier detection and variable selection are known to influence each other, and different results may be obtained depending on the order in which the two tasks are performed. It may therefore be useful to perform them simultaneously, and a genetic algorithm for simultaneous outlier detection and variable selection is suggested. Two real data sets are used to illustrate the algorithms, which are shown to work well. In addition, the scalability of the algorithms is examined in an experiment using generated data.
Acknowledgements: I would like to thank Dr Tero Aittokallio and an anonymous referee for useful comments.
Keywords: Variable selection; Model selection; Outlier; Outlier detection; Genetic algorithm
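
The idea of searching over groupings of the data into outlier and non-outlier observations can be illustrated with a minimal sketch. It is not the article's algorithm: the abstract does not specify the fitness function or the genetic operators, so everything below is an assumption made for illustration. Each candidate is a 0/1 mask over the observations (1 = treated as an outlier), and the fitness is a BIC-style criterion in which every flagged observation counts as one extra parameter. The `penalty` factor, tournament selection, uniform crossover, the mutation rate, the single-regressor model and the helper names (`rss_clean`, `criterion`, `detect_outliers`, `make_data`) are all hypothetical choices.

```python
import math
import random


def rss_clean(x, y, mask):
    """RSS of a simple OLS line fitted only to observations with mask[i] == 0.
    Flagged points behave as if each had its own dummy variable (zero residual)."""
    xs = [xi for xi, m in zip(x, mask) if m == 0]
    ys = [yi for yi, m in zip(y, mask) if m == 0]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((xi - mx) ** 2 for xi in xs)
    if sxx == 0.0:
        return math.inf  # slope not identifiable
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(xs, ys))


def criterion(x, y, mask, penalty=3.0):
    """BIC-style score (lower is better). Assumption: each flagged outlier
    costs one parameter; `penalty` inflates the usual log(n) term."""
    n, m = len(y), sum(mask)
    if n - m < 3:
        return math.inf  # too few clean points to fit a line
    rss = max(rss_clean(x, y, mask), 1e-12)  # guard against log(0)
    return n * math.log(rss / n) + penalty * (2 + m) * math.log(n)


def detect_outliers(x, y, pop_size=40, generations=60, p_mut=0.03, seed=0):
    """Evolve 0/1 masks; returns the best mask found (1 = flagged outlier)."""
    rng = random.Random(seed)
    n = len(y)

    def score(mask):
        return criterion(x, y, mask)

    # Start from the "no outliers" mask plus sparse random masks.
    pop = [[0] * n] + [[int(rng.random() < 0.1) for _ in range(n)]
                       for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=score)
        nxt = [pop[0][:]]  # elitism: the best mask always survives
        while len(nxt) < pop_size:
            a = min(rng.sample(pop, 3), key=score)  # tournament selection
            b = min(rng.sample(pop, 3), key=score)
            child = [ai if rng.random() < 0.5 else bi for ai, bi in zip(a, b)]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            nxt.append(child)
        pop = nxt
    return min(pop, key=score)


def make_data(n=30, outliers=(4, 11, 22), shift=20.0, seed=1):
    """Synthetic line y = 1 + 2x + noise with a few shifted observations."""
    rng = random.Random(seed)
    x = [i / 2 for i in range(n)]
    y = [1.0 + 2.0 * xi + rng.gauss(0, 0.5) for xi in x]
    for i in outliers:
        y[i] += shift
    return x, y


if __name__ == "__main__":
    x, y = make_data()
    best = detect_outliers(x, y)
    print("flagged observations:", [i for i, g in enumerate(best) if g])
```

Because a whole mask is scored at once, a group of outliers that mask each other can be flagged together rather than one by one. Under the same assumptions, simultaneous variable selection could be sketched by appending one bit per candidate regressor to the mask and counting the selected regressors in the penalty term.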
This article is indexed in SpringerLink and other databases.