INTEGRATED SYSTEMS BIOLOGY AND MACHINE LEARNING FRAMEWORK FOR THE IDENTIFICATION OF MITOCHONDRIAL AND SIGNALING SIGNATURES IN ISCHEMIC AND DILATED CARDIOMYOPATHY

Authors

  • Mahak, Department of Life Sciences, School of Science, Garden City University, Bangalore, India
  • Shaik Mubarak Basha, Department of Life Sciences, School of Science, Garden City University, Bangalore, India
  • Janani Sp, Department of Life Sciences, School of Science, Garden City University, Bangalore, India
  • Ms. Kesiya Joy, Department of Life Sciences, School of Science, Garden City University, Bangalore, India

DOI:

https://doi.org/10.69980/1b77hc04

Keywords:

Systems Biology, Machine Learning, Cardiomyopathy, Mitochondrial Dysfunction, Signal Transduction Pathways, Biomarker Identification

Abstract

Cardiovascular diseases (CVDs), including ischemic cardiomyopathy (ICM) and dilated cardiomyopathy (DCM), are the leading cause of morbidity and mortality worldwide, driven by a complex interplay of genetic, metabolic, and environmental factors. Traditional clinical diagnostic modalities, while effective for hemodynamic assessment, often fail to capture the molecular heterogeneity that early intervention and personalized therapy require. High-throughput transcriptomics provides a systems-level view of these pathologies, yet translating high-dimensional genomic data into robust clinical biomarkers remains hindered by the "curse of dimensionality" and the noise inherent in biological datasets. This study presents an integrated bioinformatics and machine learning pipeline to identify a compact, biologically meaningful gene signature for CVD classification. Drawing conceptual inspiration from the "Moana" classification framework, which prioritizes biological structure over raw feature variance, we applied a pathway-guided feature selection strategy to RNA-sequencing data (GSE55296) and a microarray validation dataset (GSE57338). Functional enrichment analysis using the KEGG and Reactome databases revealed pronounced dysregulation of mitochondrial metabolism, oxidative phosphorylation, and cardiomyopathy-related pathways. Subsequent protein–protein interaction (PPI) network analysis, refined by intersecting the top-ranked nodes from multiple topological centrality algorithms (Degree, Maximal Clique Centrality, and Betweenness), distilled the transcriptome into a core signature of six genes: COX4I1, NDUFA5, NDUFV2, NDUFS3, NDUFS4, and AKT1. These features were used to train and evaluate supervised machine learning models.
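
The hub-gene step described above, intersecting the top-ranked nodes under several centrality measures, can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes the networkx library, computes Maximal Clique Centrality with the usual definition (the sum of (|C| - 1)! over the maximal cliques C containing a node), and runs on a hypothetical toy network whose node names (G1 to G8), edges, and cutoff k are invented for the example rather than taken from the study's PPI data.

```python
import math

import networkx as nx


def mcc_scores(graph):
    """Maximal Clique Centrality: per node, sum (|C| - 1)! over maximal cliques C containing it."""
    scores = {node: 0 for node in graph}
    for clique in nx.find_cliques(graph):
        weight = math.factorial(len(clique) - 1)
        for node in clique:
            scores[node] += weight
    return scores


def top_k(scores, k):
    """Return the k highest-scoring nodes; ties are broken alphabetically for determinism."""
    ranked = sorted(scores, key=lambda node: (-scores[node], node))
    return set(ranked[:k])


def consensus_hubs(graph, k):
    """Keep only nodes ranked in the top k by Degree, MCC, and Betweenness simultaneously."""
    degree = dict(graph.degree())
    betweenness = nx.betweenness_centrality(graph)
    mcc = mcc_scores(graph)
    return top_k(degree, k) & top_k(mcc, k) & top_k(betweenness, k)


# Hypothetical toy network (not real interaction data): a 4-clique with a few
# peripheral nodes attached.
toy_edges = [
    ("G1", "G2"), ("G1", "G3"), ("G1", "G4"),
    ("G2", "G3"), ("G2", "G4"), ("G3", "G4"),
    ("G4", "G5"), ("G5", "G6"), ("G5", "G7"), ("G1", "G8"),
]
toy_network = nx.Graph()
toy_network.add_edges_from(toy_edges)

hubs = consensus_hubs(toy_network, k=3)
print(sorted(hubs))  # ['G1', 'G4']
```

Requiring agreement across all three rankings is deliberately conservative: in the toy network, G5 scores highly on betweenness but is dropped because it does not rank in the top three by degree or MCC.
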
The Random Forest classifier showed the most stable performance, achieving a mean cross-validation accuracy of approximately 69.5% and outperforming a Deep Neural Network (DNN) in this limited-sample regime. The findings underscore the critical role of mitochondrial Complex I and IV dysfunction in heart failure and support integrating systems biology with ensemble learning to develop interpretable molecular diagnostics.
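
As an illustration of how a mean cross-validation accuracy for a six-gene Random Forest classifier might be computed, the sketch below uses scikit-learn with stratified k-fold splitting. It is a hedged example only: the expression values are synthetic random numbers generated for the demonstration, and the sample size, class shift, fold count, and number of trees are assumptions, not parameters reported by the study. The accuracy it prints has no bearing on the 69.5% figure above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# The six signature genes named in the abstract, used here only as column labels.
GENES = ["COX4I1", "NDUFA5", "NDUFV2", "NDUFS3", "NDUFS4", "AKT1"]


def make_synthetic_expression(n_per_class=30, shift=0.8, seed=0):
    """Simulate standardized expression: cases are shifted downward relative to controls.

    All values are synthetic placeholders, not data from GSE55296 or GSE57338.
    """
    rng = np.random.default_rng(seed)
    controls = rng.normal(0.0, 1.0, size=(n_per_class, len(GENES)))
    cases = rng.normal(-shift, 1.0, size=(n_per_class, len(GENES)))
    X = np.vstack([controls, cases])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return X, y


def cross_validated_accuracy(X, y, n_splits=5, seed=0):
    """Return per-fold accuracies from stratified k-fold cross-validation."""
    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    return cross_val_score(model, X, y, cv=folds, scoring="accuracy")


X, y = make_synthetic_expression()
scores = cross_validated_accuracy(X, y)
print(f"mean accuracy: {scores.mean():.3f} (sd {scores.std():.3f})")
```

Stratified folds keep the case-to-control ratio the same in every split, which matters when the cohort is small; reporting the standard deviation alongside the mean gives a rough sense of fold-to-fold stability.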

References

1.Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G. S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., … Zheng, X. (2016). TensorFlow: Large-scale machine learning on heterogeneous distributed systems. arXiv preprint arXiv:1603.04467. https://arxiv.org/abs/1603.04467

2.Barabási, A.-L., Gulbahce, N., & Loscalzo, J. (2011). Network medicine: A network-based approach to human disease. Nature Reviews Genetics, 12(1), 56–68. https://doi.org/10.1038/nrg2918

3.Bellman, R. E. (1961). Adaptive control processes: A guided tour. Princeton University Press.

4.Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B (Methodological), 57(1), 289–300. https://doi.org/10.1111/j.2517-6161.1995.tb02031.x

5.Braunwald, E. (2013). Heart failure. JACC: Heart Failure, 1(1), 1–20. https://doi.org/10.1016/j.jchf.2012.10.002

6.Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32. https://doi.org/10.1023/A:1010933404324

7.Chin, C.-H., Chen, S.-H., Wu, H.-H., Ho, C.-W., Ko, M.-T., & Lin, C.-Y. (2014). cytoHubba: Identifying hub objects and sub-networks from complex interactome. BMC Systems Biology, 8(Suppl 4), S11. https://doi.org/10.1186/1752-0509-8-S4-S11

8.Edgar, R., Domrachev, M., & Lash, A. E. (2002). Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Research, 30(1), 207–210. https://doi.org/10.1093/nar/30.1.207

9.Fabregat, A., Jupe, S., Matthews, L., Sidiropoulos, K., Gillespie, M., Garapati, P., Haw, R., Jassal, B., Korninger, F., May, B., Milacic, M., Roca, C. D., Rothfels, K., Sevilla, C., Shamovsky, V., Viteri, G., Weiser, J., Wu, G., Stein, L., … D’Eustachio, P. (2018). The Reactome pathway knowledgebase. Nucleic Acids Research, 46(D1), D649–D655. https://doi.org/10.1093/nar/gkx1132

10.Fukuda, R., Zhang, H., Kim, J.-W., Shimoda, L., Dang, C. V., & Semenza, G. L. (2007). HIF-1 regulates cytochrome oxidase subunits to optimize efficiency of respiration in hypoxic cells. Cell, 129(1), 111–122. https://doi.org/10.1016/j.cell.2007.01.047

11.Hirst, J. (2013). Mitochondrial complex I. Annual Review of Biochemistry, 82, 551–575. https://doi.org/10.1146/annurev-biochem-070511-103700

12.Kanehisa, M., Furumichi, M., Tanabe, M., Sato, Y., & Morishima, K. (2017). KEGG: New perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Research, 45(D1), D353–D361. https://doi.org/10.1093/nar/gkw1092

13.LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444. https://doi.org/10.1038/nature14539

14.Libbrecht, M. W., & Noble, W. S. (2015). Machine learning applications in genetics and genomics. Nature Reviews Genetics, 16(6), 321–332. https://doi.org/10.1038/nrg3920

15.Liu, Y., Morley, M., Brandimarto, J., Hannenhalli, S., Hu, Y., Ashley, E. A., Tang, W. H. W., Moravec, C. S., Margulies, K. B., Cappola, T. P., & Li, M. (2015). RNA-Seq identifies novel myocardial gene expression signatures of heart failure. Genomics, 105(2), 83–89. https://doi.org/10.1016/j.ygeno.2014.12.002

16.Love, M. I., Huber, W., & Anders, S. (2014). Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology, 15(12), 550. https://doi.org/10.1186/s13059-014-0550-8

17.Maisel, A. S., Krishnaswamy, P., Nowak, R. M., McCord, J., Hollander, J. E., Duc, P., Omland, T., Storrow, A. B., Abraham, W. T., Wu, A. H. B., Clopton, P., Steg, P. G., Westheim, A., Knudsen, C. W., Perez, A., Kazanegra, R., Herrmann, H. C., & McCullough, P. A. (2002). Rapid measurement of B-type natriuretic peptide in the emergency diagnosis of heart failure. New England Journal of Medicine, 347(3), 161–167. https://doi.org/10.1056/NEJMoa020233

18.Margulies, K. B., Bednarik, D. P., & Dries, D. L. (2009). Genomics, transcriptional profiling, and heart failure. Journal of the American College of Cardiology, 53(19), 1752–1759. https://doi.org/10.1016/j.jacc.2008.12.064

19.Matsui, T., Tao, J., del Monte, F., Lee, K.-H., Li, L., Picard, M., Force, T. L., Franke, T. F., Hajjar, R. J., & Rosenzweig, A. (2002). Akt activation preserves cardiac function and prevents injury after transient cardiac ischemia in vivo. Circulation, 104(22), 2607–2612. https://doi.org/10.1161/hc4702.099485

20.McKenna, W. J., & Judge, D. P. (2021). Epidemiology of the cardiomyopathies and their genetic architecture. Nature Reviews Cardiology, 18(3), 158–174. https://doi.org/10.1038/s41569-020-0428-2

21.Murphy, M. P. (2009). How mitochondria produce reactive oxygen species. Biochemical Journal, 417(1), 1–13. https://doi.org/10.1042/BJ20081386

22.Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, É. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830.

23.Rosca, M. G., & Hoppel, C. L. (2013). Mitochondrial dysfunction in heart failure. Heart Failure Reviews, 18(5), 607–622. https://doi.org/10.1007/s10741-012-9340-0

24.Shannon, P., Markiel, A., Ozier, O., Baliga, N. S., Wang, J. T., Ramage, D., Amin, N., Schwikowski, B., & Ideker, T. (2003). Cytoscape: A software environment for integrated models of biomolecular interaction networks. Genome Research, 13(11), 2498–2504. https://doi.org/10.1101/gr.1239303

25.Shiojima, I., & Walsh, K. (2006). Regulation of cardiac growth and coronary angiogenesis by the Akt/PKB signaling pathway. Genes & Development, 20(24), 3347–3365. https://doi.org/10.1101/gad.1492806

26.Stanley, W. C., Recchia, F. A., & Lopaschuk, G. D. (2005). Myocardial substrate metabolism in the normal and failing heart. Physiological Reviews, 85(3), 1093–1129. https://doi.org/10.1152/physrev.00006.2004

27.Szklarczyk, D., Gable, A. L., Nastou, K. C., Lyon, D., Kirsch, R., Pyysalo, S., Doncheva, N. T., Legeay, M., Fang, T., Bork, P., Jensen, L. J., & von Mering, C. (2021). The STRING database in 2021: Customizable protein–protein networks, and functional characterization of user-uploaded gene/measurement sets. Nucleic Acids Research, 49(D1), D605–D612. https://doi.org/10.1093/nar/gkaa1074

28.Taegtmeyer, H., Sen, S., & Vela, D. (2010). Return to the fetal gene program: A suggested metabolic link to gene expression in the heart. Annals of the New York Academy of Sciences, 1188(1), 191–198. https://doi.org/10.1111/j.1749-6632.2009.05100.x

29.Tarazón, E., Roselló-Lletí, E., Rivera, M., Ortega, A., Molina-Navarro, M. M., Triviño, J. C., Martínez-Dolz, L., Lago, F., González-Juanatey, J. R., Salvador, A., & Portolés, M. (2014). RNA sequencing analysis of endomyocardial biopsies reveals new molecular mechanisms of human ischemic and dilated cardiomyopathy. Circulation: Heart Failure, 7(4), 622–631. https://doi.org/10.1161/CIRCHEARTFAILURE.113.001044

30.Wagner, F., & Yanai, I. (2018). Moana: A robust and scalable cell type classification framework for single-cell RNA-Seq data. bioRxiv. https://doi.org/10.1101/456129

31.Wang, Z., Gerstein, M., & Snyder, M. (2009). RNA-Seq: A revolutionary tool for transcriptomics. Nature Reviews Genetics, 10(1), 57–63. https://doi.org/10.1038/nrg2484

32.World Health Organization. (2023). Cardiovascular diseases (CVDs). https://www.who.int/news-room/fact-sheets/detail/cardiovascular-diseases-(cvds)

Published

2026-04-10