Scientists have developed an innovative machine learning method that can accurately identify epigenetic factors in cancers.
The collaborative study by researchers from Weill Cornell Medicine, NewYork-Presbyterian and the New York Genome Center (NYGC), published in Cancer Discovery, describes a machine learning method that can accurately identify the DNA methylation changes that drive cancer.
The method grew out of the study of DNA methylation, a chemical modification of DNA that is known to suppress neighbouring genes, and it relies heavily on this phenomenon. The researchers were able to analyse thousands of DNA methylation changes in tumour cells and determine which ones are most likely to trigger tumour growth.
Why is methylation important in cancer research?
Methylation is an epigenetic process that regulates the activity of genes throughout the genome. It chemically modifies the DNA without changing the information contained in the genes. Observing this process is crucial to the study of cancer, because excessive methylation (hypermethylation) around a tumour suppressor gene can silence that gene and so permit the uncontrolled cell division that causes cancer.
The new method will allow a large number of tumours to be profiled. Mapping the epigenetic changes that contribute to tumour growth in particular cancers should improve understanding of the origins of cancer and could ultimately help optimise treatment for each patient, as explained by Dr Dan Landau, lead author of the study and an oncologist at NewYork-Presbyterian/Weill Cornell Medical Center.
“The challenge with this new technique is similar to the one cancer researchers have faced with DNA mutations: how to distinguish driver mutations from the more abundant passenger mutations that do not affect cancer. Although there are now sophisticated methods for distinguishing between genetic mutations, the techniques for distinguishing driver methylation changes from passenger methylation changes are not nearly as sophisticated,” he said.
To analyse methylation behaviour, the team developed an algorithm called MethSig, which accounts for the background level of methylation across the genome in order to estimate when a methylation change is likely to be a cancer driver. The algorithm was then applied to DNA methylation maps of different tumour types, where it found only a small number of cancer-driving events, about a dozen per tumour, among thousands of passenger methylation changes. The results were consistent across patients and tumour types, indicating a substantial improvement in performance over other methods.
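The idea of judging a methylation change against its expected background can be illustrated with a short Python sketch. This is not MethSig's actual statistical model, which the article does not describe. It is a simplified stand-in that uses a binomial test, and the gene names, patient counts and background probabilities are invented for illustration. The point it shows is that two genes hypermethylated in the same number of patients can be judged very differently once their background rates are taken into account.

```python
from math import comb


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def candidate_drivers(observed, background, n_patients, alpha=0.05):
    """Flag genes hypermethylated in more patients than background predicts.

    observed:   gene -> number of patients with promoter hypermethylation
    background: gene -> assumed per-patient probability of hypermethylation
                arising by chance (hypothetical values; estimating this
                background well is the hard part of the real problem)
    """
    threshold = alpha / len(observed)  # Bonferroni correction for multiple genes
    hits = {}
    for gene, k in observed.items():
        p_value = binomial_tail(k, n_patients, background[gene])
        if p_value < threshold:
            hits[gene] = p_value
    return hits


# Toy, made-up data: GENE_A and GENE_B are hypermethylated in the same
# number of patients, but GENE_B sits in a region with a high background rate.
observed = {"GENE_A": 18, "GENE_B": 18, "GENE_C": 3}
background = {"GENE_A": 0.05, "GENE_B": 0.40, "GENE_C": 0.05}

drivers = candidate_drivers(observed, background, n_patients=40)
print(sorted(drivers))  # → ['GENE_A']
```

In this toy example only GENE_A is flagged: 18 of 40 patients is far above the 2 expected at a 5% background rate, whereas the same count is unremarkable for GENE_B, where about 16 would be expected by chance.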
In addition, several of the cancer drivers related to DNA methylation were confirmed experimentally in chronic lymphocytic leukaemia (CLL) cells: knocking out the affected gene promoted the growth of untreated cells. The researchers concluded that the method is more accurate than previously used techniques. The team then demonstrated the clinical relevance of the algorithm by applying it to a set of CLL samples to predict the aggressiveness of each patient’s cancer.
The classifier developed using MethSig produced an estimated risk for each patient, and patients with higher estimated risks were more likely to have worse outcomes, said Dr Heng Pan, senior research associate at Weill Cornell Medicine. Dr Landau explained that the team’s vision is to map the full range of cancer-causing DNA methylation changes, for different tumour types and in the context of different treatments. The aim is to expand the scope of precision medicine beyond genetics to include the critical dimension of epigenetic changes in cancer, and so to improve patient outcomes.