Imputing manufacturing material in data mining |
| |
Authors: | Ruey-Ling Yeh, Ching Liu, Ben-Chang Shia, Yu-Ting Cheng, Ya-Fang Huwang |
| |
Affiliation: | (1) Division of Biometrics, Graduate Institute of Agronomy, National Taiwan University, Taipei, Taiwan; (2) Department of Statistics and Information Science, Fu Jen Catholic University, Taipei, Taiwan; (3) Department of Statistics Science, National Chengchi University, Taipei, Taiwan |
| |
Abstract: | Data is a vital source of information for organizations, especially in the age of information technology. In practice, databases are often imperfect and contain missing data, and results obtained from such a database may be biased or misleading. Imputing missing data is therefore regarded as one of the major steps in data mining. The present research used different data mining methods to construct imputation models for different types of missing data. When the missing data is continuous, regression models and neural networks are used to build the imputation models. For categorical missing data, logistic regression, neural networks, C5.0, and CART are employed. The results showed that the regression model provided the best estimates of continuous missing data, while for categorical missing data the C5.0 model proved to be the best method. |
| |
Keywords: | Data mining; C5.0; Regression; BPNN; Missing data; Imputation |
This article is indexed in SpringerLink and other databases. |
|
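The abstract reports that a regression model gave the best estimates for continuous missing data. As a rough illustration of that idea only, the following minimal sketch fits ordinary least squares on the complete rows and predicts the missing entries. It is not the authors' implementation: the function name `impute_by_regression`, the array layout, and the toy data are all assumptions made for this example.

```python
import numpy as np

def impute_by_regression(X, y):
    """Fill NaN entries of y with predictions from a linear model.

    X: (n, p) array of fully observed predictors (hypothetical layout).
    y: (n,) continuous variable, with NaN marking missing values.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    missing = np.isnan(y)
    # Add an intercept column so the fit is y ~ b0 + X @ b.
    design = np.column_stack([np.ones(len(X)), X])
    # Ordinary least squares using only the rows where y is observed.
    coef, *_ = np.linalg.lstsq(design[~missing], y[~missing], rcond=None)
    filled = y.copy()
    # Replace each missing value with the model's prediction.
    filled[missing] = design[missing] @ coef
    return filled

# Toy data invented for illustration: y = 2 + 3*x, with two values removed.
x = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y = np.array([2.0, 5.0, np.nan, 11.0, np.nan])
y_filled = impute_by_regression(x, y)
```

Because the toy data lies on an exact line, the imputed values recover the removed ones; with real manufacturing data the predictions would carry the usual regression error, and a categorical variable would need a classifier (such as the decision trees named in the abstract) instead.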