

Imputing manufacturing material in data mining
Authors: Ruey-Ling Yeh, Ching Liu, Ben-Chang Shia, Yu-Ting Cheng, Ya-Fang Huwang
Affiliation: (1) Division of Biometrics, Graduate Institute of Agronomy, National Taiwan University, Taipei, Taiwan; (2) Department of Statistics and Information Science, Fu Jen Catholic University, Taipei, Taiwan; (3) Department of Statistics Science, National Chengchi University, Taipei, Taiwan
Abstract: Data is a vital source of information for organizations, especially in an era of information technology. In practice, databases are often imperfect: values are missing, and results obtained from such a database may be biased or misleading. Imputing missing data is therefore regarded as one of the major steps in data mining. This research used different data mining methods to construct imputation models according to the type of missing data. When the missing data is continuous, regression models and neural networks are used to build the imputation models. For categorical missing data, logistic regression, neural networks, C5.0, and CART are employed. The results showed that the regression model provided the best estimates for continuous missing data, while the C5.0 model performed best for categorical missing data.
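As an illustration of the continuous case described in the abstract, the following is a minimal sketch of regression-based imputation: a model is fitted on the complete records and then used to predict the missing values. It is not the authors' implementation. The field names (`thickness`, `weight`), the toy records, and the helper functions are hypothetical, and a single-predictor least-squares fit stands in for whatever regression specification the paper actually used.

```python
from statistics import mean


def fit_simple_regression(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("predictor has no variance; cannot fit a slope")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def impute_continuous(rows, predictor, target):
    """Return copies of rows in which missing (None) target values are
    replaced by predictions from a regression fitted on complete rows."""
    complete = [
        r for r in rows
        if r[target] is not None and r[predictor] is not None
    ]
    a, b = fit_simple_regression(
        [r[predictor] for r in complete],
        [r[target] for r in complete],
    )
    filled = []
    for r in rows:
        r = dict(r)  # copy so the input records are left untouched
        if r[target] is None and r[predictor] is not None:
            r[target] = a + b * r[predictor]
        filled.append(r)
    return filled


# Hypothetical toy records; None marks a missing continuous value.
records = [
    {"thickness": 1.0, "weight": 2.0},
    {"thickness": 2.0, "weight": 4.0},
    {"thickness": 3.0, "weight": 6.0},
    {"thickness": 4.0, "weight": None},
]
completed = impute_continuous(records, "thickness", "weight")
```

The same fit-on-complete-rows, predict-the-gaps pattern would apply to the categorical case, with a classifier such as a decision tree in place of the regression.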
Keywords: Data mining, C5.0, Regression, BPNN, Missing data, Imputation
This article is indexed in SpringerLink and other databases.