Prediction of premature all-cause mortality: a prospective general population cohort study comparing machine-learning and standard epidemiological approaches
Weng, Stephen F; Vaz, Luis; Qureshi, Nadeem; Kai, Joe
Authors
Stephen F Weng
Luis Vaz
Professor Nadeem Qureshi (nadeem.qureshi@nottingham.ac.uk), Clinical Professor
Professor Joe Kai (joe.kai@nottingham.ac.uk), Professor of Primary Care
Abstract
Background: Prognostic modelling using standard methods is well-established, particularly for predicting risk of single diseases. Machine-learning may offer potential to explore outcomes of even greater complexity, such as premature death. This study aimed to develop novel prediction algorithms using machine-learning, in addition to standard survival modelling, to predict premature all-cause mortality.
Methods: A prospective population cohort of 502,628 participants aged 40-69 years was recruited to the UK Biobank from 2006 to 2010 and followed up until 2016. Participants were assessed on a range of demographic, biometric, clinical and lifestyle factors. Mortality data coded by ICD-10 were obtained through linkage to the Office for National Statistics. Models were developed using deep learning, random forest and Cox regression. Calibration was assessed by comparing observed to predicted risks, and discrimination by the area under the receiver operating characteristic curve (AUC).
Findings: 14,418 deaths (2.9%) occurred over a total follow-up time of 3,508,454 person-years. A simple age and gender Cox model was the least predictive (AUC 0.689, 95% CI 0.681–0.699). A multivariate Cox regression model significantly improved discrimination by 6.2% (AUC 0.751, 95% CI 0.748–0.767). The application of machine-learning algorithms further improved discrimination by 3.2% using random forest (AUC 0.783, 95% CI 0.776–0.791) and 3.9% using deep learning (AUC 0.790, 95% CI 0.783–0.797). These machine-learning algorithms improved discrimination by 9.4% and 10.1% respectively over the simple age and gender Cox regression model. All improvements are quoted as absolute differences in AUC. Random forest and deep learning achieved similar levels of discrimination with no significant difference. Machine-learning algorithms were well-calibrated, while Cox regression models consistently over-predicted risk.
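The improvements quoted in the Findings can be reproduced from the AUC point estimates alone, which suggests they are absolute differences in AUC rather than relative changes. The short Python sketch below checks that arithmetic, together with the crude death proportion and a crude rate per 1,000 person-years derived from the stated totals. The dictionary keys and the helper name `gain_in_points` are illustrative labels, not names from the paper, and the crude rate is a derived figure that the abstract itself does not report.

```python
# AUC point estimates as quoted in the Findings paragraph.
# The keys are illustrative labels, not identifiers from the paper.
auc = {
    "cox_age_gender": 0.689,
    "cox_multivariate": 0.751,
    "random_forest": 0.783,
    "deep_learning": 0.790,
}


def gain_in_points(model: str, baseline: str) -> float:
    """Absolute AUC difference between two models, in percentage points."""
    return round((auc[model] - auc[baseline]) * 100, 1)


comparisons = [
    ("cox_multivariate", "cox_age_gender"),
    ("random_forest", "cox_multivariate"),
    ("deep_learning", "cox_multivariate"),
    ("random_forest", "cox_age_gender"),
    ("deep_learning", "cox_age_gender"),
]
for model, baseline in comparisons:
    print(f"{model} vs {baseline}: +{gain_in_points(model, baseline)} points")

# Cohort totals as quoted in the Methods and Findings paragraphs.
participants = 502_628
deaths = 14_418
person_years = 3_508_454

# Crude proportion of participants who died during follow-up.
death_percent = round(deaths / participants * 100, 1)

# Crude mortality rate per 1,000 person-years (derived here, not quoted).
rate_per_1000_py = deaths / person_years * 1000

print(f"deaths: {death_percent}% of participants")
print(f"crude rate: {rate_per_1000_py:.2f} per 1,000 person-years")
```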
Conclusions: Machine-learning significantly improved accuracy of prediction of premature all-cause mortality in this middle-aged population, compared to standard methods. This study illustrates the value of machine-learning for risk prediction within a traditional epidemiological study design, and how this approach might be reported to assist scientific verification.
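The two evaluation measures named in the Methods, discrimination by AUC and calibration by comparing observed to predicted risks, can be illustrated with a minimal Python sketch. It treats death as a simple binary outcome and ignores censoring and follow-up time, so it is not the procedure used in the study. The function names `auc_score` and `calibration_table` and the ten-person dataset are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def auc_score(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Probability that a random case is scored above a random non-case.

    Ties count as half a win. This equals the area under the ROC curve
    for a binary outcome.
    """
    cases = [s for y, s in zip(y_true, y_score) if y == 1]
    non_cases = [s for y, s in zip(y_true, y_score) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(non_cases))


def calibration_table(
    y_true: Sequence[int], y_score: Sequence[float], n_groups: int
) -> List[Tuple[float, float]]:
    """Return (mean predicted risk, observed event rate) per risk group.

    Individuals are sorted by predicted risk and split into n_groups
    groups of near-equal size; empty groups are skipped.
    """
    pairs = sorted(zip(y_score, y_true))
    total = len(pairs)
    rows = []
    for g in range(n_groups):
        group = pairs[g * total // n_groups:(g + 1) * total // n_groups]
        if not group:
            continue
        mean_predicted = sum(score for score, _ in group) / len(group)
        observed_rate = sum(outcome for _, outcome in group) / len(group)
        rows.append((mean_predicted, observed_rate))
    return rows


# Synthetic toy data (not from UK Biobank): 1 = died, 0 = survived.
toy_outcome = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]
toy_risk = [0.05, 0.10, 0.20, 0.25, 0.30, 0.40, 0.55, 0.60, 0.80, 0.90]

toy_auc = auc_score(toy_outcome, toy_risk)
toy_calibration = calibration_table(toy_outcome, toy_risk, n_groups=2)

print(f"AUC: {toy_auc:.2f}")
for predicted, observed in toy_calibration:
    print(f"predicted {predicted:.2f} vs observed {observed:.2f}")
```

In a calibration table of this kind, a model that over-predicts risk shows mean predicted values above the observed rates in most groups.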
Citation
Weng, S. F., Vaz, L., Qureshi, N., & Kai, J. (2019). Prediction of premature all-cause mortality: a prospective general population cohort study comparing machine-learning and standard epidemiological approaches. PLOS ONE, 14(3), Article e0214365. https://doi.org/10.1371/journal.pone.0214365
Field | Value |
---|---|
Journal Article Type | Article |
Acceptance Date | Mar 12, 2019 |
Online Publication Date | Mar 27, 2019 |
Publication Date | Mar 27, 2019 |
Deposit Date | Mar 26, 2019 |
Publicly Available Date | Mar 29, 2019 |
Journal | PLOS ONE |
Electronic ISSN | 1932-6203 |
Publisher | Public Library of Science |
Peer Reviewed | Peer Reviewed |
Volume | 14 |
Issue | 3 |
Article Number | e0214365 |
Pages | 1-22 |
DOI | https://doi.org/10.1371/journal.pone.0214365 |
Keywords | premature all-cause mortality; machine-learning; risk prediction |
Public URL | https://nottingham-repository.worktribe.com/output/1669986 |
Publisher URL | https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0214365 |
Contract Date | Mar 29, 2019 |
Publisher Licence URL
https://creativecommons.org/licenses/by/4.0/