A population-based study exploring phenotypic clusters and clinical outcomes in stroke using unsupervised machine learning approach

Akyea, Ralph K.; Ntaios, George; Kontopantelis, Evangelos; Georgiopoulos, Georgios; Soria, Daniele; Asselbergs, Folkert W.; Kai, Joe; Weng, Stephen F.; Qureshi, Nadeem

Authors

George Ntaios

Evangelos Kontopantelis

Georgios Georgiopoulos

Daniele Soria

Folkert W. Asselbergs

Stephen F. Weng



Abstract

Individuals who develop stroke vary in their clinical, demographic, and biochemical profiles. This phenotypic heterogeneity can affect cardiovascular disease (CVD) morbidity and mortality outcomes. This study uses a novel clustering approach to stratify individuals with incident stroke into phenotypic clusters and evaluates the differential burden of recurrent stroke and other cardiovascular outcomes. We used linked clinical data from primary care, hospitalisations, and death records in the UK. A data-driven clustering analysis (kamila algorithm) was applied to 48,114 patients aged ≥ 18 years who had an incident stroke between 1-Jan-1998 and 31-Dec-2017 and no prior history of serious vascular events. Cox proportional hazards regression was used to estimate hazard ratios (HRs) for subsequent adverse outcomes in each of the generated clusters. Adverse outcomes included coronary heart disease (CHD), recurrent stroke, peripheral vascular disease (PVD), heart failure, CVD-related mortality, and all-cause mortality. Four distinct phenotypes with varying underlying clinical characteristics were identified in patients with incident stroke. Compared with cluster 1 (n = 5,201, 10.8%), the risk of composite recurrent stroke and CVD-related mortality was higher in the other three clusters (cluster 2 [n = 18,655, 38.8%]: HR, 1.07; 95% CI, 1.02–1.12; cluster 3 [n = 10,244, 21.3%]: HR, 1.20; 95% CI, 1.14–1.26; and cluster 4 [n = 14,014, 29.1%]: HR, 1.44; 95% CI, 1.37–1.50). Similar trends in risk were observed for the composite outcome of recurrent stroke and all-cause mortality, and for subsequent recurrent stroke alone. However, results were not consistent for subsequent risk of CHD, PVD, heart failure, CVD-related mortality, and all-cause mortality.
In this proof-of-principle study, we demonstrated how a heterogeneous population of patients with incident stroke can be stratified into four relatively homogeneous phenotypes with differential risk of recurrent and major cardiovascular outcomes. This offers an opportunity to revisit the stratification of care for patients with incident stroke to improve patient outcomes.
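
The cluster sizes and hazard ratios reported in the abstract can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' analysis code: the names `clusters`, `share`, and `log_hr_se` are hypothetical, and the standard-error back-calculation assumes the reported intervals are symmetric Wald intervals on the log-hazard scale (the usual convention for Cox models, but not stated in the abstract).

```python
import math

# Figures copied from the abstract; cluster 1 is the reference group.
TOTAL = 48_114
clusters = {
    1: {"n": 5_201, "hr": None},
    2: {"n": 18_655, "hr": (1.07, 1.02, 1.12)},  # (HR, lower, upper)
    3: {"n": 10_244, "hr": (1.20, 1.14, 1.26)},
    4: {"n": 14_014, "hr": (1.44, 1.37, 1.50)},
}


def share(n, total=TOTAL):
    """Percentage of the cohort in a cluster, rounded to one decimal."""
    return round(100 * n / total, 1)


def log_hr_se(lower, upper, z=1.96):
    """Approximate SE of log(HR), ASSUMING a symmetric 95% Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# The four cluster sizes should account for the whole cohort.
assert sum(c["n"] for c in clusters.values()) == TOTAL

shares = {k: share(c["n"]) for k, c in clusters.items()}
for k, c in clusters.items():
    if c["hr"] is None:
        print(f"cluster {k}: {shares[k]}% of cohort (reference)")
    else:
        hr, lo, hi = c["hr"]
        print(f"cluster {k}: {shares[k]}% of cohort, HR {hr}, "
              f"approx. SE(log HR) {log_hr_se(lo, hi):.3f}")
```

The recomputed shares match the percentages given in the abstract, and the four cluster sizes sum to the stated cohort of 48,114. Because the published interval bounds are rounded to two decimals, the back-calculated standard errors are approximate.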

Citation

Akyea, R. K., Ntaios, G., Kontopantelis, E., Georgiopoulos, G., Soria, D., Asselbergs, F. W., …Qureshi, N. (2023). A population-based study exploring phenotypic clusters and clinical outcomes in stroke using unsupervised machine learning approach. PLOS Digital Health, 2(9), Article e0000334. https://doi.org/10.1371/journal.pdig.0000334

Journal Article Type Article
Acceptance Date Aug 10, 2023
Online Publication Date Sep 13, 2023
Publication Date Sep 13, 2023
Deposit Date Sep 4, 2023
Publicly Available Date Sep 13, 2023
Journal PLOS Digital Health
Electronic ISSN 2767-3170
Publisher Public Library of Science (PLoS)
Peer Reviewed Peer Reviewed
Volume 2
Issue 9
Article Number e0000334
DOI https://doi.org/10.1371/journal.pdig.0000334
Public URL https://nottingham-repository.worktribe.com/output/25060559
Publisher URL https://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0000334
