[Article] Feature Selection for Outcome Prediction in Oesophageal Cancer using Genetic Algorithm and Random Forest Classifier

Feature selection for outcome prediction in oesophageal cancer using genetic algorithm and random forest classifier

Desbordes Paul (a,b), Ruan Su (a), Modzelewski Romain (a,c), Vauclin Sebastien (b), Vera Pierre (a,c), Gardin Isabelle (a,c)
a Litis – QuantIF, University of Rouen, 22, boulevard Gambetta, 76000 Rouen, France
b Dosisoft, 45/47, avenue Carnot, 94230 Cachan, France
c Henri Becquerel Centre, 1, rue d’Amiens, 76038 Rouen Cedex, France

ABSTRACT
Predicting patient outcome can greatly help to personalize cancer treatment. A large number of quantitative features (clinical exams, imaging, etc.) are potentially useful for assessing patient outcome. The challenge is to choose the most predictive subset of features. In this paper, we propose a new feature selection strategy called GARF (Genetic Algorithm based on Random Forest), applied to features extracted from Positron Emission Tomography (PET) images and clinical data. The most relevant features, either predictive of the therapeutic response or prognostic of patient survival 3 years after the end of treatment, were selected using GARF on a cohort of 65 patients with locally advanced oesophageal cancer eligible for chemo-radiation therapy. The best predictive results were obtained with a subset of 9 features, leading to a random forest misclassification rate of 18 ± 4% and an Area Under the Receiver Operating Characteristic (ROC) Curve (AUC) of 0.823 ± 0.032. The best prognostic results were obtained with 8 features, leading to an error rate of 20 ± 7% and an AUC of 0.750 ± 0.108. Both predictive and prognostic results show better performance using GARF than using the 4 other methods studied.
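
To make the general idea concrete, the following is a minimal sketch of wrapper feature selection in which a genetic algorithm searches over binary feature masks and a random forest scores each mask. It is not the GARF implementation: the abstract does not specify the fitness function or the genetic operators, so the out-of-bag error criterion, truncation selection, one-point crossover, bit-flip mutation, and all names and parameter values (`genetic_feature_selection`, `rf_oob_error`, `toy_fitness`, `pop_size`, `mutation_rate`, and so on) are assumptions made for illustration. The random forest fitness assumes scikit-learn and a NumPy feature matrix; the toy fitness is a stand-in that lets the search run without any data.

```python
import random


def genetic_feature_selection(n_features, fitness, pop_size=20, generations=30,
                              mutation_rate=0.05, seed=0):
    """Search binary feature masks that minimise fitness(mask).

    Hypothetical operators: truncation selection, one-point crossover,
    bit-flip mutation, and elitism. Requires n_features >= 2.
    """
    rng = random.Random(seed)
    # Random initial population of boolean masks (True = feature kept).
    population = [[rng.random() < 0.5 for _ in range(n_features)]
                  for _ in range(pop_size)]
    best_mask, best_score = None, float("inf")
    for _ in range(generations):
        scored = sorted(((fitness(m), m) for m in population),
                        key=lambda pair: pair[0])
        if scored[0][0] < best_score:
            best_score, best_mask = scored[0][0], list(scored[0][1])
        # Truncation selection: the better half of the population become parents.
        parents = [m for _, m in scored[:pop_size // 2]]
        children = [list(best_mask)]  # elitism: carry the best mask forward
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation: each gene flips with probability mutation_rate.
            child = [(not g) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
    return best_mask, best_score


def rf_oob_error(mask, X, y, n_trees=200, seed=0):
    """Out-of-bag misclassification rate of a random forest on the kept columns.

    Assumption: X is a 2-D NumPy array and scikit-learn is installed.
    """
    from sklearn.ensemble import RandomForestClassifier
    cols = [i for i, keep in enumerate(mask) if keep]
    if not cols:
        return 1.0  # an empty subset cannot be scored; treat it as worst case
    forest = RandomForestClassifier(n_estimators=n_trees, oob_score=True,
                                    random_state=seed)
    forest.fit(X[:, cols], y)
    return 1.0 - forest.oob_score_


def toy_fitness(mask, informative=(0, 3, 7)):
    """Stand-in fitness: penalise missing informative features and extra ones."""
    missed = sum(1 for i in informative if not mask[i])
    extra = sum(1 for i, keep in enumerate(mask)
                if keep and i not in informative)
    return missed + 0.1 * extra


if __name__ == "__main__":
    mask, score = genetic_feature_selection(10, toy_fitness)
    print("selected features:", [i for i, keep in enumerate(mask) if keep])
    print("fitness:", score)
```

With real data, the search would be run as `genetic_feature_selection(X.shape[1], lambda m: rf_oob_error(m, X, y))`. On a cohort as small as the one described here, the selected subset should then be assessed on data not used during the search, since a wrapper method can otherwise overfit the selection itself.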