Feature Selection based on Genetic Algorithm for Classification of Mammogram Using K-means, k-NN and Euclidean Distance

Abstract

Several supervised approaches to mammogram classification have been proposed in recent years, but very few studies have examined unsupervised classification to explore its potential and weaknesses. In this paper, I use unsupervised clustering to classify malignant and benign mammogram samples. The MiniMIAS database contains 322 mammogram images, of which 64 are benign and 51 are malignant. I used these 115 images for experimentation, with approximately 60% for training and 40% for testing: 39 of the 64 benign images and 31 of the 51 malignant images were used for training, and the rest for testing. Classification was based on features selected using a Genetic Algorithm (GA), and the contribution of each GA-selected feature to classification performance was also studied. The clusters initially identified by K-means are used to classify 60 unknown samples with k-NN. The proposed method achieved 96.23% accuracy on malignant samples and 95.37% on benign samples. It can serve radiologists and oncologists as a second opinion during screening sessions for early detection.
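The cluster-then-classify idea described above can be illustrated with a minimal sketch. It is not the implementation used in this work: the two-dimensional feature vectors are synthetic stand-ins for GA-selected features, the GA step itself is omitted, and the function names (`euclidean`, `kmeans`, `knn_predict`) and the deterministic centroid initialisation are assumptions made only for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iters=50):
    """Plain K-means; returns one cluster label per point and the centroids.

    Assumption: centroids are initialised from evenly spaced points in the
    input list so that the sketch is deterministic.
    """
    centroids = [points[i * len(points) // k] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: euclidean(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                updated.append(centroids[c])  # keep an empty cluster in place
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def knn_predict(train_points, train_labels, query, k=3):
    """Assign the majority cluster label among the k nearest training points."""
    nearest = sorted(
        zip(train_points, train_labels),
        key=lambda pair: euclidean(pair[0], query),
    )[:k]
    return Counter(lab for _, lab in nearest).most_common(1)[0][0]


# Synthetic, well-separated feature vectors (hypothetical data).
group_a = [(0.1 * i, 0.1 * j) for i in range(3) for j in range(3)]
group_b = [(5 + 0.1 * i, 5 + 0.1 * j) for i in range(3) for j in range(3)]
points = group_a + group_b

# Step 1: K-means finds the clusters without using any class labels.
labels, centroids = kmeans(points, k=2)

# Step 2: k-NN with Euclidean distance assigns unknown samples to a cluster.
pred_near_a = knn_predict(points, labels, (0.2, 0.3))
pred_near_b = knn_predict(points, labels, (4.9, 5.2))
```

In this sketch the cluster labels produced by K-means act as the training labels for k-NN, so an unknown sample is assigned to whichever cluster dominates among its nearest neighbours; mapping clusters to benign or malignant would require a separate step that is not shown here.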