LUNG CANCER RELAPSE PREDICTION USING PARALLEL XGBOOST
Abstract
Lung cancer has been the most common form of cancer for decades. Surgery offers patients with non-small-cell lung cancer (NSCLC) the best hope of a cure when the cancer is diagnosed at an early stage. However, many patients eventually die of their disease because it relapses after surgery. Because lung cancer causes no symptoms in its early stage, many researchers have sought better methods for predicting relapse early. This study proposes a method to predict lung cancer relapse more accurately. The method has three stages: feature selection, parallel eXtreme Gradient Boosting (XGBoost) classification with different hyperparameters, and a selection stage. It was evaluated on two gene expression microarray datasets covering different lung cancer types, together with their clinical information. The proposed model achieves accuracies of 0.88 and 0.83 on the two datasets, which is higher than the machine learning methods it was compared against. Running several XGBoost classifiers in parallel gives the system the flexibility to handle a broader range of datasets without hyperparameter tuning and within a short time.
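The three stages named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, the ANOVA F-test filter, the hyperparameter grid in the hypothetical `CONFIGS` list, and the use of validation accuracy as the selection criterion are all illustrative choices that the abstract does not specify. The sketch uses the scikit-learn interface of `xgboost` and falls back to scikit-learn's `GradientBoostingClassifier` as a stand-in if `xgboost` is not installed.

```python
from concurrent.futures import ThreadPoolExecutor

from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

try:
    from xgboost import XGBClassifier as Booster
except ImportError:
    # Stand-in (not XGBoost) so the sketch still runs without the xgboost package.
    from sklearn.ensemble import GradientBoostingClassifier as Booster

# Synthetic stand-in for a gene expression matrix (samples x genes).
# This is NOT the data used in the study.
X, y = make_classification(
    n_samples=200, n_features=500, n_informative=20, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Stage 1: feature selection. The ANOVA F-test and k=50 are assumptions.
# The selector is fitted on the training split only to avoid leakage.
selector = SelectKBest(f_classif, k=50).fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_val_sel = selector.transform(X_val)

# Stage 2: several boosted classifiers with different hyperparameters.
# These values are hypothetical, chosen only for illustration.
CONFIGS = [
    {"n_estimators": 50, "max_depth": 2, "learning_rate": 0.3},
    {"n_estimators": 100, "max_depth": 3, "learning_rate": 0.1},
    {"n_estimators": 100, "max_depth": 5, "learning_rate": 0.05},
]


def fit_and_score(params):
    """Train one classifier and return (validation accuracy, params)."""
    model = Booster(random_state=0, **params)
    model.fit(X_train_sel, y_train)
    accuracy = accuracy_score(y_val, model.predict(X_val_sel))
    return accuracy, params


# Train the classifiers concurrently, one worker per configuration.
with ThreadPoolExecutor(max_workers=len(CONFIGS)) as pool:
    results = list(pool.map(fit_and_score, CONFIGS))

# Stage 3: selection. Keeping the configuration with the highest
# validation accuracy is an assumed criterion.
best_accuracy, best_params = max(results, key=lambda r: r[0])
print(f"selected {best_params} with validation accuracy {best_accuracy:.2f}")
```

Because every configuration is trained on the same selected features, the selection stage reduces to comparing one score per classifier. That is one way to read how a fixed set of parallel models could stand in for per-dataset hyperparameter tuning.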