INTEGRATIVE TRANSCRIPTOMIC, PROTEOMIC, AND MACHINE LEARNING ANALYSIS OF COVID-19 SEVERITY AND ITS IMPLICATIONS FOR POST-VIRAL OUTCOMES
DOI:
https://doi.org/10.69980/w53n3j86
Keywords:
Transcriptomics, Proteomics, Machine Learning, COVID-19 Severity, Multi-Omics Integration, Post-Viral Outcomes
Abstract
COVID-19, caused by SARS-CoV-2, exhibits wide variability in clinical severity, ranging from asymptomatic infection to critical respiratory failure. Understanding the molecular determinants of this severity is essential for biomarker discovery, risk stratification, and improved clinical management. This study presents an integrative multi-omics framework combining transcriptomic analysis, proteomic enrichment profiling, and machine learning-based feature prioritization to identify molecular signatures associated with COVID-19 severity and their potential implications for post-viral outcomes.
Transcriptomic analysis of the publicly available whole-blood dataset GSE213313 (Agilent microarray) identified 1,544 significantly differentially expressed genes (FDR < 0.05) in critical versus healthy samples, with enrichment of innate immune response, leukocyte activation, antigen processing, T-cell differentiation, and cytokine-mediated signalling pathways. Proteomic analysis using a published peer-reviewed dataset comparing mild (n = 3) and severe (n = 5) COVID-19 cases identified 91 severity-associated proteins, with significant enrichment of acute-phase response, complement cascade activation, platelet degranulation, and coagulation-related processes.
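The FDR < 0.05 threshold above can be illustrated with a minimal sketch of Benjamini-Hochberg adjustment. This assumes the FDR values were obtained with the Benjamini-Hochberg procedure, which the abstract does not state; the gene names and p-values are hypothetical placeholders, not values from GSE213313.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-gene p-values (illustrative only, not from the study).
p_values = {
    "GENE_A": 0.0004,
    "GENE_B": 0.012,
    "GENE_C": 0.04,
    "GENE_D": 0.2,
    "GENE_E": 0.6,
}

names = list(p_values)
adj = benjamini_hochberg([p_values[g] for g in names])

# Keep genes whose adjusted p-value passes the FDR < 0.05 cutoff.
significant = [g for g, q in zip(names, adj) if q < 0.05]
print(significant)
```

In this toy input, GENE_C has a raw p-value below 0.05 but does not pass after adjustment, which is the reason a count of differentially expressed genes is reported at an FDR cutoff instead of a raw p-value cutoff.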
Multi-omics integration identified 8 molecules shared between the transcriptomic and proteomic datasets, with an inverse RNA–protein fold-change correlation (r = −0.58) that is consistent with post-transcriptional regulatory complexity. A Random Forest classifier achieved perfect discrimination between mild and severe cases in the proteomic cohort of 8 samples (AUC = 1.0; LOOCV accuracy = 100%) and prioritized proteins including LIPG, CLEC11A, KDR, FAIM3, and BST1 as key severity-associated features.
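The RNA-protein comparison described above can be sketched as a Pearson correlation over the molecules present in both datasets. This is a minimal illustration assuming the reported r is a Pearson coefficient on log fold changes, which the abstract does not specify; the identifiers `M1` to `M6` and all fold-change values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


# Hypothetical log2 fold changes keyed by molecule identifier.
rna_lfc = {"M1": 1.5, "M2": -0.8, "M3": 2.1, "M4": 0.4, "M5": 0.9}
protein_lfc = {"M1": -0.9, "M2": 0.7, "M3": -0.2, "M4": -1.1, "M6": 0.3}

# Only molecules measured in both layers enter the correlation.
shared = sorted(set(rna_lfc) & set(protein_lfc))
r = pearson_r(
    [rna_lfc[m] for m in shared],
    [protein_lfc[m] for m in shared],
)
print(shared, round(r, 2))
```

A negative coefficient here means that molecules with higher transcript fold changes tend to have lower protein fold changes. With only a handful of shared molecules, a single pair can shift the estimate substantially.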
Collectively, this study demonstrates that severe COVID-19 is characterized by coordinated dysregulation of immune activation, inflammatory signalling, complement amplification, and coagulation mechanisms across multiple molecular layers. The identified molecular signatures provide a foundation for future biomarker validation and therapeutic targeting in COVID-19 and related post-viral conditions.


