A Study to Analyse Covid-19 Outbreak Using Multiple Linear Regression: A Supervised Machine Learning Approach


  • Jayanti Semwal, Himalayan Institute of Medical Sciences, Swami Rama Himalayan University, Dehradun, India
  • Abhinav Bahuguna, Himalayan Institute of Medical Sciences, Swami Rama Himalayan University, Dehradun, India
  • Akanksha Uniyal, Himalayan Institute of Medical Sciences, Swami Rama Himalayan University, Dehradun, India
  • Shaili Vyas, Himalayan Institute of Medical Sciences, Swami Rama Himalayan University, Dehradun, India




Keywords: COVID-19, Supervised machine learning, Linear Regression model, Multiple Linear Regression, Forecast, Uttarakhand


Introduction: COVID-19 has affected people's quality of life worldwide. Machine learning methods have recently become popular for making predictions because of their precision and adaptability in identifying diseases. This study aims to identify significant predictors of daily active cases and to visualise trends in daily active cases, positive cases, and immunisations.

Material and methods: This study used secondary data from the COVID-19 health bulletin of Uttarakhand. Multiple linear regression, a supervised machine learning technique, was used to analyse the dataset.
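The comparison between a single-predictor and a multiple-predictor regression can be sketched as follows. This is a minimal illustration, not the authors' code: the data are synthetic, the coefficients used to generate them are invented, and the helper names (`solve`, `fit_ols`, `r_squared`) are hypothetical. It fits ordinary least squares through the normal equations and compares in-sample R² for one predictor against three.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ...]."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(XtX, Xty)

def r_squared(X, y, beta):
    """Coefficient of determination of the fitted model on (X, y)."""
    preds = [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - p) ** 2 for yi, p in zip(y, preds))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Synthetic daily counts (assumption: NOT the Uttarakhand bulletin data).
rng = random.Random(0)
X, y = [], []
for _ in range(120):
    positive = rng.uniform(20, 200)
    cured = rng.uniform(10, 180)
    deceased = rng.uniform(0, 5)
    active = 50 + 0.9 * positive - 0.7 * cured - 1.5 * deceased + rng.gauss(0, 10)
    X.append((positive, cured, deceased))
    y.append(active)

# Multiple regression: active ~ positive + cured + deceased
beta_multi = fit_ols(X, y)
r2_multi = r_squared(X, y, beta_multi)

# Simple regression: active ~ positive only (choice of predictor is an assumption)
X_simple = [(r[0],) for r in X]
beta_simple = fit_ols(X_simple, y)
r2_simple = r_squared(X_simple, y, beta_simple)

print(f"R2 simple = {r2_simple:.3f}, R2 multiple = {r2_multi:.3f}")
```

Because the simple model is nested in the multiple model, in-sample R² cannot decrease when predictors are added, so a higher R² alone does not show that the extra predictors generalise; a held-out set or adjusted R² would be a stricter check.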

Results: The Multiple Linear Regression model is more accurate, with a higher R² (0.90) than the Linear Regression model (R² = 0.88). The daily numbers of positive, cured, and deceased cases are significant predictors of daily active cases (p < 0.001). Using a time series linear regression approach, the cumulative number of active cases is forecast to be 6695 (95% CI: 6259-7131) on the 93rd day since 18 Sep 2022, if a similar trend continues over the upcoming 3 weeks in Uttarakhand.
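A forecast of this kind, a point estimate from a linear trend with a 95% interval, might be computed along the following lines. This is a sketch under stated assumptions rather than the study's procedure: the series is synthetic, the function name `linear_forecast` is hypothetical, the interval is for the mean response on the forecast day, and a normal quantile is used in place of the Student t quantile (reasonable only when the number of observed days is large).

```python
import random
from statistics import NormalDist, mean

def linear_forecast(y, t_new, level=0.95):
    """Fit y ~ a + b*t on days t = 1..n and forecast the mean at day t_new.

    Returns (point, lower, upper). The interval uses a normal
    approximation to the t distribution (an assumption of this sketch).
    """
    n = len(y)
    t = list(range(1, n + 1))
    t_bar, y_bar = mean(t), mean(y)
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    slope = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y)) / sxx
    intercept = y_bar - slope * t_bar
    # Residual standard error with n - 2 degrees of freedom
    resid = [yi - (intercept + slope * ti) for ti, yi in zip(t, y)]
    s = (sum(e * e for e in resid) / (n - 2)) ** 0.5
    # Standard error of the fitted mean at t_new
    se_mean = s * (1.0 / n + (t_new - t_bar) ** 2 / sxx) ** 0.5
    z = NormalDist().inv_cdf(0.5 + level / 2)
    point = intercept + slope * t_new
    return point, point - z * se_mean, point + z * se_mean

# Synthetic cumulative series over 72 observed days (illustrative values only).
rng = random.Random(1)
series = [1000 + 60 * day + rng.gauss(0, 80) for day in range(1, 73)]

# Forecast 21 days past the last observation, i.e. day 93.
point, lower, upper = linear_forecast(series, t_new=93)
print(f"day 93 forecast: {point:.0f} (95% CI: {lower:.0f} - {upper:.0f})")
```

The interval widens as the forecast day moves further from the centre of the observed period, which is why a forecast three weeks ahead carries a wider band than an in-sample fit. It also rests on the trend staying linear, the same condition the abstract states.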

Conclusion: Regression models are useful for forecasting COVID-19 cases, which can help governments and health organisations address the pandemic in future and establish appropriate policies and recommendations for regular prevention.






How to Cite

Semwal J, Bahuguna A, Uniyal A, Vyas S. A Study to Analyse Covid-19 Outbreak Using Multiple Linear Regression: A Supervised Machine Learning Approach. Natl J Community Med [Internet]. 2023 Feb. 28 [cited 2023 Mar. 25];14(02):82-9. Available from: https://njcmindia.com/index.php/file/article/view/2656



Original Research Articles
