Cancer recurrence prediction using machine learning

Bibliographic Details
Main Authors: Shoon Lei, Win; Htike@Muhammad Yusof, Zaw Zaw; Yusof, Faridah; Noorbatcha, Ibrahim Ali
Format: Article
Language: English
Published: AIRCC Publishing Corporation 2014
Online Access: http://irep.iium.edu.my/37773/
http://irep.iium.edu.my/37773/1/2214ijcsity02.pdf
Description
Summary: Cancer is one of the deadliest diseases in the world, responsible for around 13% of all deaths worldwide, and its incidence is growing at an alarming rate. Although cancer is preventable and curable in its early stages, the vast majority of patients are diagnosed very late. Furthermore, cancer commonly recurs after years of treatment, so it is of paramount importance to predict recurrence so that specific treatments can be sought. Conventional methods of predicting cancer recurrence, however, rely solely on histopathology, and their results are not very reliable. Microarray gene expression technology, which allows researchers to examine the expression of thousands of genes simultaneously, is a promising alternative that could predict recurrence by analyzing the gene expression of sample cells. This paper describes a state-of-the-art machine learning approach, averaged one-dependence estimators with subsumption resolution, for predicting from DNA microarray gene expression data whether a particular cancer will recur within a specific timeframe, usually 5 years. To lower the computational complexity, we employ an entropy-based gene selection approach to select relevant prognostic genes that directly inform recurrence prediction. The proposed system achieved an average accuracy of 98.9% in predicting cancer recurrence over 3 datasets. The experimental results demonstrate the efficacy of our framework.
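The summary does not say which entropy criterion the authors use for gene selection. As a rough illustration only, the sketch below ranks genes by information gain after splitting each gene's expression values at the median. The median split, the function names (`entropy`, `information_gain`, `select_genes`) and the toy data are all assumptions made for this example, not details taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Entropy reduction from splitting the samples at the median value.

    Assumption: a single median split per gene; the paper may discretize
    expression values differently.
    """
    ordered = sorted(values)
    median = ordered[len(ordered) // 2]
    low = [y for v, y in zip(values, labels) if v < median]
    high = [y for v, y in zip(values, labels) if v >= median]
    n = len(labels)
    conditional = (len(low) / n) * entropy(low) + (len(high) / n) * entropy(high)
    return entropy(labels) - conditional


def select_genes(expression, labels, k):
    """Return the indices of the k genes with the highest information gain."""
    n_genes = len(expression[0])
    gains = [
        information_gain([row[g] for row in expression], labels)
        for g in range(n_genes)
    ]
    return sorted(range(n_genes), key=lambda g: gains[g], reverse=True)[:k]


# Hypothetical toy data: 4 samples (rows) x 3 genes (columns).
expression = [
    [0.1, 5.0, 2.0],
    [0.2, 1.0, 2.1],
    [0.9, 4.0, 1.9],
    [0.8, 2.0, 2.2],
]
labels = [0, 0, 1, 1]  # 1 = recurrence, 0 = no recurrence

top = select_genes(expression, labels, k=1)
print(top)
```

In this toy data only the first gene separates the two classes, so it is the one selected. The reduced gene set would then be passed to the classifier, which is how a selection step of this kind lowers the computational cost of training.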