SVM-BT-RFE: An improved gene selection framework using Bayesian T-test embedded in support vector machine (recursive feature elimination) algorithm

Abstract

Gene Regulatory Networks (GRNs) have long attracted considerable attention from bioinformaticians and systems biologists seeking to understand biological processes. The foremost difficulty, however, remains the appropriate selection of the genes from which such a network is expressed. An elementary requirement of the framework is therefore mining relevant and informative genes so that distinguishable biological facts can be obtained. In an endeavor to discover these genes in several datasets, we propose a gene selection algorithm called Support Vector Machine Bayesian T-Test Recursive Feature Elimination (SVM-BT-RFE), which extends the support vector machine recursive feature elimination (SVM-RFE) algorithm and the support vector machine t-test recursive feature elimination (SVM-T-RFE) algorithm. Our algorithm aims to attain maximum classification accuracy with smaller gene subsets of high-dimensional data. Each dataset contains approximately 5000 to 40,000 genes, from which a subset can be selected that delivers the highest level of classification accuracy. The proposed SVM-BT-RFE algorithm was compared with the existing SVM-T-RFE and SVM-RFE algorithms and was found to outperform both. It provides an improvement of approximately 25% over SVM-T-RFE and of more than 40% over SVM-RFE. The comparison was made with regard to classification accuracy as a function of the number of genes selected and the classification error rate over five runs of the algorithm.

© 2015 The Authors. Production and hosting by Elsevier B.V. on behalf of University of Kerbala. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
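For readers unfamiliar with recursive feature elimination, the loop shared by SVM-RFE-style methods can be illustrated with a minimal Python sketch. This is not the SVM-BT-RFE algorithm itself: the Bayesian t-test term is omitted because the abstract does not state how it is combined with the SVM weights, and features are ranked by the squared SVM weight only. The function names (`train_linear_svm`, `svm_rfe`), the bias-free sub-gradient trainer, the hyperparameters, and the synthetic data are all assumptions made for illustration.

```python
import random


def train_linear_svm(X, y, lam=0.01, epochs=200):
    """Fit a bias-free linear SVM (hinge loss + L2 penalty) by batch
    sub-gradient descent. X is a list of rows; y holds labels in {-1, +1}.
    Illustrative trainer only, not the solver used in the paper."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for t in range(1, epochs + 1):
        eta = 1.0 / (lam * t)  # decaying step size
        # Samples that violate the margin: y_i * (w . x_i) < 1
        violators = [i for i in range(n)
                     if y[i] * sum(wj * xj for wj, xj in zip(w, X[i])) < 1.0]
        for j in range(d):
            grad = sum(y[i] * X[i][j] for i in violators) / n
            w[j] = (1.0 - eta * lam) * w[j] + eta * grad
    return w


def svm_rfe(X, y, n_keep):
    """Repeatedly retrain on the surviving features and drop the one with
    the smallest squared weight until n_keep features remain."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while len(remaining) > n_keep:
        X_sub = [[row[j] for j in remaining] for row in X]
        w = train_linear_svm(X_sub, y)
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    return remaining, eliminated


# Hypothetical toy data: 20 samples, 6 "genes"; only genes 0 and 3 carry
# class signal, the rest are small uniform noise.
rng = random.Random(0)
y = [1 if i % 2 == 0 else -1 for i in range(20)]
signal = {0: 2.0, 3: -1.5}
X = [[signal.get(j, 0.0) * label + rng.uniform(-0.1, 0.1) for j in range(6)]
     for label in y]

kept, eliminated = svm_rfe(X, y, n_keep=2)
print("kept:", kept)
print("eliminated (first removed first):", eliminated)
```

On this toy data the two signal-carrying features (indices 0 and 3) are the ones retained. A t-test-based variant would replace the `w[k] ** 2` ranking score with a criterion that also accounts for a per-gene test statistic; the exact form of that criterion is not given in the abstract.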