Processing of missing values in survey data using Principal Component Analysis and probabilistic Principal Component Analysis methods

Abstract

The idea of studying incomplete data arose from the circumstances of our country and the horrors of war, which caused the loss of much important data in all aspects of life: economic, natural, health, scientific, and others. Data go missing for different reasons. Some lie outside the control of those concerned, while others are deliberate, planned because of cost, risk, or a lack of means for inspection. In this study, missing data were processed by simulation using Principal Component Analysis (PCA) and probabilistic Principal Component Analysis (PPCA). The variables considered were child health and two variables that affect it: breastfeeding and maternal health. The maternal health variable contained missing values, which were processed in MATLAB R2015a using both methods. The methods were then compared using the root mean square error (RMSE). PCA was the better method for processing the missing values.
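As a rough illustration of the workflow summarized in the abstract, the sketch below imputes missing entries with an iterative PCA reconstruction and scores the result with the root mean square error on the hidden entries. It is written in Python with NumPy rather than MATLAB and is not the study's implementation. The function names `pca_impute` and `rmse`, the number of components, the stopping rule, and the synthetic three-column data are all assumptions made for this example. PPCA is not shown.

```python
import numpy as np


def pca_impute(X, n_components=1, n_iter=100, tol=1e-8):
    """Fill NaN entries by iterating a low-rank PCA reconstruction.

    Hypothetical helper: starts from column means, then repeatedly
    replaces the missing cells with their rank-k PCA reconstruction.
    """
    X = np.asarray(X, dtype=float)
    mask = np.isnan(X)
    # Initial guess: column means computed from the observed values.
    filled = np.where(mask, np.nanmean(X, axis=0), X)
    for _ in range(n_iter):
        mu = filled.mean(axis=0)
        U, s, Vt = np.linalg.svd(filled - mu, full_matrices=False)
        k = n_components
        recon = (U[:, :k] * s[:k]) @ Vt[:k] + mu
        # Observed cells are kept as they are; only missing cells change.
        updated = np.where(mask, recon, X)
        converged = np.max(np.abs(updated - filled)) < tol
        filled = updated
        if converged:
            break
    return filled


def rmse(true, imputed, mask):
    """Root mean square error over the entries flagged in mask."""
    diff = np.asarray(true)[mask] - np.asarray(imputed)[mask]
    return float(np.sqrt(np.mean(diff ** 2)))


# Synthetic stand-in data (assumption): three correlated columns,
# with gaps simulated in the third column only.
rng = np.random.default_rng(0)
scores = rng.normal(size=(50, 1))
loadings = np.array([[1.0, 0.8, 0.6]])
complete = scores @ loadings + 0.1 * rng.normal(size=(50, 3))

mask = np.zeros(complete.shape, dtype=bool)
mask[rng.choice(50, size=10, replace=False), 2] = True
incomplete = np.where(mask, np.nan, complete)

imputed = pca_impute(incomplete, n_components=1)
print("RMSE on simulated gaps:", rmse(complete, imputed, mask))
```

Because the gaps are simulated from complete data, the true values are known and the RMSE can be computed directly. A second imputation method could be scored with the same `rmse` call and the same mask to allow a like-for-like comparison.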