Comprehensive Summary
This study presents an explainable AI framework that reveals how the Heidelberg DNA methylation classifier distinguishes between 82 brain tumor types and 9 normal tissue classes. Using Random Forest models trained on 2,801 samples and 428,799 probes, the researchers found that only 2.3% of probes accounted for over 61% of the classifier’s decisions. These high-usage probes were concentrated in biologically relevant regions such as CpG islands, enhancer elements, heterochromatin, and lamina-associated domains, and their usage patterns differed across tumor types. IDH-mutant gliomas and ETMR relied on hypermethylated CpG island probes, whereas tumors such as LIPN and PITAD predominantly used hypomethylated probes in open sea regions.

A streamlined version of the classifier using 10,000 probes revealed 88 distinct probe clusters, over 80% of which were specific to individual tumor classes. These clusters were spread throughout the genome, indicating redundancy that enhances classifier robustness. Unsupervised clustering and dimensionality reduction showed clear class-specific patterns, such as clusters 27 and 78 for ETMR and cluster 1 for IDH-mutant gliomas.

To make these findings accessible, the authors created a web application called “shinyMNP,” which allows users to explore class-informative probes and associated gene expression. The study demonstrated the tool’s utility through four tumor-specific examples: SHPRH hypermethylation in ETMR, PWWP3A hypomethylation and overexpression in HGNET_MN1, TBX19 promoter hypomethylation in PITAD_ACTH, and RET hypermethylation with high expression in HGNET_BCOR.

The framework’s outputs were consistent with those of other interpretability methods such as SHAP, and the approach generalized to other cancer types such as sarcoma, supporting its broad applicability.
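The concentration result in the summary (2.3% of probes accounting for over 61% of the classifier’s decisions) is a statement about cumulative usage. The sketch below shows one way such a figure could be computed from per-probe usage counts. It is an illustration only: the function name, the toy counts, and the assumption that "usage" means how often a probe is chosen as a split in the trees are hypothetical and are not taken from the study.

```python
def smallest_fraction_covering(usage_counts, target_share):
    """Find the fewest top-ranked probes whose summed usage reaches target_share.

    usage_counts: one non-negative number per probe (assumed here to be the
                  number of times the probe is used by the forest).
    target_share: share of total usage to cover, between 0 and 1.
    Returns (fraction_of_probes_needed, share_actually_covered).
    """
    total = sum(usage_counts)
    if total <= 0:
        raise ValueError("usage counts must sum to a positive value")

    # Rank probes from most to least used, then accumulate until the target is met.
    ranked = sorted(usage_counts, reverse=True)
    running = 0
    for n_used, count in enumerate(ranked, start=1):
        running += count
        if running / total >= target_share:
            return n_used / len(ranked), running / total
    return 1.0, 1.0


# Toy data (hypothetical): 20 heavily used probes and 980 rarely used ones.
toy_usage = [49] * 20 + [1] * 980
fraction, share = smallest_fraction_covering(toy_usage, 0.5)
print(f"{fraction:.1%} of probes cover {share:.0%} of usage")
```

In this toy example, 2% of the probes cover half of all usage. Applied to real per-probe counts, the same calculation would yield a figure comparable in form to the one reported in the study.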
Outcomes and Implications
By identifying the specific DNA methylation features used for classification, this work makes the classifier’s decisions more transparent, which can strengthen clinical trust in machine learning diagnostics. It also highlights tumor-specific epigenetic changes that could serve as biomarkers or therapeutic targets, such as RET in HGNET_BCOR. The shinyMNP tool supports clinical and research use by letting users interactively explore class-informative probes and their associated gene expression across tumor types.