Use of biplot technique for the comparison of the missing value imputation methods


Alkan B. B., ALKAN N., ATAKAN C., Terzi Y.

International Journal of Data Analysis Techniques and Strategies, vol.7, no.3, pp.217-230, 2015 (Scopus)

Abstract

© 2015 Inderscience Enterprises Ltd. This study assessed the effects of different imputation methods on the performance of the biplot technique. We selected Fisher's iris data as our reference dataset. Elements of the iris data were deleted at different rates under the missing at random (MAR) assumption to generate incomplete datasets with 3.5%, 7%, 15% and 20% missing values. The datasets with missing values were completed by four imputation methods: mean imputation, regression imputation, the expectation maximisation (EM) algorithm and multiple imputation (MI). The imputed datasets were analysed by the biplot technique, and the results were compared with the biplot of the original complete data. Under the MAR assumption, the biplot results were similar for all imputation methods when the missing rate was low. Even when the missing rate was greater than 10%, the results of the EM and MI methods were similar to the real values and to the graphical representation of the original data. For multivariate methods, we also propose filling in each missing value with the arithmetic mean of the imputed estimates obtained by multiple imputation. This paper also indicates that the biplot technique provides a useful visual tool for comparing missing value imputation methods.
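The workflow described in the abstract (delete values, impute, compute the biplot, compare with the complete data) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a synthetic 150 x 4 matrix as a stand-in for the iris measurements, uses uniformly random deletion (a simplification of the MAR mechanism), covers only mean imputation, and builds a principal component biplot from the singular value decomposition. The function names `delete_at_random`, `mean_impute` and `biplot_coords` are hypothetical.

```python
import numpy as np

def delete_at_random(X, rate, rng):
    """Return a copy of X with roughly `rate` of its cells set to NaN.

    Assumption: cells are removed uniformly at random, which is a
    simplification of the MAR mechanism named in the abstract.
    """
    X_miss = X.astype(float).copy()
    mask = rng.random(X.shape) < rate
    X_miss[mask] = np.nan
    return X_miss

def mean_impute(X_miss):
    """Replace each NaN with the mean of the observed values in its column."""
    col_means = np.nanmean(X_miss, axis=0)
    return np.where(np.isnan(X_miss), col_means, X_miss)

def biplot_coords(X):
    """Rank-2 biplot markers from the SVD of the column-centred data."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    rows = U[:, :2] * s[:2]  # row markers (principal component scores)
    cols = Vt[:2].T          # column markers (variable directions)
    return rows, cols

rng = np.random.default_rng(0)
# Synthetic stand-in for a 150 x 4 dataset (hypothetical data, not iris).
X = rng.normal(size=(150, 4)) @ rng.normal(size=(4, 4))

X_imp = mean_impute(delete_at_random(X, rate=0.07, rng=rng))

rows_full, cols_full = biplot_coords(X)
rows_imp, cols_imp = biplot_coords(X_imp)
```

Plotting `rows_imp` and `cols_imp` next to `rows_full` and `cols_full` gives the kind of visual comparison the abstract refers to. Singular vectors are only defined up to sign, so an axis may need to be flipped before the two plots are overlaid.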