George Ntaios
Data‐driven machine‐learning analysis of potential embolic sources in embolic stroke of undetermined source
Ntaios, George; Weng, Stephen F.; Perlepe, Kalliopi; Akyea, Ralph; Condon, Laura; Lambrou, Dimitrios; Sirimarco, Gaia; Strambo, Davide; Eskandari, Ashraf; Karagkiozi, Efstathia; Vemmou, Anastasia; Korompoki, Eleni; Manios, Efstathios; Makaritsis, Konstantinos; Vemmos, Konstantinos; Michel, Patrik
Authors
Stephen F. Weng
Kalliopi Perlepe
Ralph Akyea
Laura Condon
Dimitrios Lambrou
Gaia Sirimarco
Davide Strambo
Ashraf Eskandari
Efstathia Karagkiozi
Anastasia Vemmou
Eleni Korompoki
Efstathios Manios
Konstantinos Makaritsis
Konstantinos Vemmos
Patrik Michel
Abstract
Background: Hierarchical clustering, a common “unsupervised” machine‐learning algorithm, is well suited to exploring potential underlying aetiologies in particularly heterogeneous diseases. We investigated potential embolic sources in embolic stroke of undetermined source (ESUS) using a data‐driven, machine‐learning method, and explored variation in stroke recurrence between clusters.
Methods: We applied a hierarchical k‐means clustering algorithm to patients’ baseline data. The algorithm assigned each individual to a single cluster, using a minimum‐variance method to calculate the similarity between ESUS patients based on all baseline features. Potential embolic sources were categorised into atrial cardiopathy, atrial fibrillation, arterial disease, left ventricular disease, cardiac valvulopathy, patent foramen ovale (PFO) and cancer.
Results: Among 800 consecutive ESUS patients (43.3% women, median age 67 years), the optimal number of clusters was 4. Left ventricular disease was most prevalent in cluster 1 (present in all patients) and was perfectly associated with cluster 1. PFO was most prevalent in cluster 2 (38.9% of patients) and was significantly associated with an increased likelihood of assignment to cluster 2 (adjusted odds ratio: 2.69, 95% CI: 1.64‐4.41). Arterial disease was most prevalent in cluster 3 (57.7%) and was associated with an increased likelihood of assignment to cluster 3 (adjusted odds ratio: 2.21, 95% CI: 1.43‐3.13). Atrial cardiopathy was most prevalent in cluster 4 (100%) and was perfectly associated with cluster 4. Cluster 3 was the largest cluster, comprising 53.7% of patients. Atrial fibrillation was not significantly associated with any cluster.
Conclusions: This data‐driven machine‐learning analysis identified 4 clusters of ESUS that were strongly associated with left ventricular disease, PFO, arterial disease and atrial cardiopathy, respectively. More than half of patients were assigned to the cluster associated with arterial disease.
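The hierarchical k‐means approach named in the Methods can be illustrated with a minimal sketch. This is not the authors' code: it assumes the common two-step variant in which a Ward (minimum‐variance) hierarchy is cut at k clusters and the resulting cluster means seed a standard k‐means refinement. The function names (`ward_clusters`, `hkmeans`) and the six toy points are hypothetical and stand in for standardised baseline features; they are not patient data.

```python
from itertools import combinations


def mean(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ward_clusters(points, k):
    """Agglomerative clustering with Ward's minimum-variance criterion.

    Returns k lists of point indices. Brute force, for small inputs only.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            ca, cb = clusters[a], clusters[b]
            ma = mean([points[i] for i in ca])
            mb = mean([points[i] for i in cb])
            # Increase in within-cluster sum of squares if a and b are merged.
            cost = len(ca) * len(cb) / (len(ca) + len(cb)) * sq_dist(ma, mb)
            if best is None or cost < best[0]:
                best = (cost, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # a < b, so index a stays valid
        del clusters[b]
    return clusters


def hkmeans(points, k, max_iter=100):
    """Hierarchical k-means: Ward clusters seed the k-means centres."""
    centres = [mean([points[i] for i in c]) for c in ward_clusters(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assign every point to its nearest centre.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centres[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centres are too
        labels = new_labels
        # Recompute each centre as the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centres[j] = mean(members)
    return labels, centres


# Hypothetical toy data: two well-separated groups in two dimensions.
toy = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
toy_labels, toy_centres = hkmeans(toy, 2)
print(toy_labels)
```

Seeding k‐means from a hierarchy removes the dependence on random starting centres, which is one reason this combination is used; how the study chose 4 as the optimal number of clusters is not stated in the abstract and is not covered by this sketch.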
Citation
Ntaios, G., Weng, S. F., Perlepe, K., Akyea, R., Condon, L., Lambrou, D., Sirimarco, G., Strambo, D., Eskandari, A., Karagkiozi, E., Vemmou, A., Korompoki, E., Manios, E., Makaritsis, K., Vemmos, K., & Michel, P. (2021). Data‐driven machine‐learning analysis of potential embolic sources in embolic stroke of undetermined source. European Journal of Neurology, 28(1), 192-201. https://doi.org/10.1111/ene.14524
Journal Article Type | Article |
---|---|
Acceptance Date | Aug 31, 2020 |
Online Publication Date | Sep 11, 2020 |
Publication Date | Jan 1, 2021 |
Deposit Date | Sep 15, 2020 |
Publicly Available Date | Sep 12, 2021 |
Journal | European Journal of Neurology |
Print ISSN | 1351-5101 |
Electronic ISSN | 1468-1331 |
Publisher | Wiley |
Peer Reviewed | Peer Reviewed |
Volume | 28 |
Issue | 1 |
Pages | 192-201 |
DOI | https://doi.org/10.1111/ene.14524 |
Keywords | embolic stroke of undetermined source; stroke; potential embolic source; machine learning; hierarchical clustering |
Public URL | https://nottingham-repository.worktribe.com/output/4904921 |
Publisher URL | https://onlinelibrary.wiley.com/doi/10.1111/ene.14524 |
Additional Information | This is the peer reviewed version of the following article: Ntaios, G., Weng, S.F., Perlepe, K., Akyea, R., Condon, L., Lambrou, D., Sirimarco, G., Strambo, D., Eskandari, A., Karagkiozi, E., Vemmou, A., Korompoki, E., Manios, E., Makaritsis, K., Vemmos, K. and Michel, P. (2020), Data‐driven machine‐learning analysis of potential embolic sources in embolic stroke of undetermined source. Eur J Neurol. Accepted Author Manuscript, which has been published in final form at https://doi.org/10.1111/ene.14524. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Use of Self-Archived Versions. |
Files
Ntaios Euro J Neurology 2020 AAM (PDF, 466 Kb)