CART-Based Approach for Discovering Emerging Patterns in an Iraqi Biochemical Dataset

Abstract

This paper applies data mining techniques to a real Iraqi biochemical dataset to discover hidden patterns in the relationships among tests. Preprocessing required considerable effort because the dataset is raw and contains a very large share of null values: the null ratio was 94.8% before preprocessing and 0% after it. Because the dataset was unlabeled, several tests were treated as classes so that the Classification and Regression Tree (CART) algorithm could be applied. This enabled the discovery of patterns in the relationships among tests, which in turn bears on patients' health: the patterns help determine test values by performing only the relevant tests, thereby reducing the number of tests that patients must undergo.