Lucy Bennett
Using a machine learning model to risk stratify for the presence of significant liver disease in a primary care population
Bennett, Lucy; Mostafa, Mohamed; Hammersley, Richard; Purssell, Huw; Patel, Manish; Street, Oliver; Athwal, Varinder S.; Hanley, Karen Piper; The ID-LIVER Consortium; Hanley, Neil A.; Morling, Joanne R.; Guha, Indra Neil
Authors
Mohamed Mostafa
Richard Hammersley
Huw Purssell
Manish Patel
Oliver Street
Varinder S. Athwal
Karen Piper Hanley
The ID-LIVER Consortium
Neil A. Hanley
Professor Joanne Morling joanne.morling@nottingham.ac.uk
Professor of Public Health and Epidemiology
Professor Neil Guha neil.guha@nottingham.ac.uk
Professor of Hepatology
Abstract
Background: Current strategies for detecting significant chronic liver disease (CLD) in the community are based on the extrapolation of diagnostic tests used in secondary care settings. Whilst this approach provides clinical utility, it has limitations: diagnostic accuracy depends on disease prevalence and is subject to spectrum bias, both of which will differ in the community. Machine learning (ML) techniques provide a novel way of identifying significant variables without preconceived bias. As a proof-of-concept study, we examined the performance of nine different ML models based on both risk factors and abnormal liver enzyme tests in a large community cohort.

Methods: Routine demographic and laboratory data were collected on 1,453 patients with risk factors for CLD, including high alcohol consumption, diabetes and obesity, in a community setting in Nottingham (UK) as part of the Scarred Liver project. A total of 87 variables were extracted. Transient elastography (TE) was used to define clinically significant liver fibrosis. The data were split into a training set and a hold-out set. The median age of the cohort was 59, mean body mass index (BMI) was 29.7 kg/m², median TE was 5.5 kPa, 49.2% had type 2 diabetes and 20.3% had a TE >8 kPa.

Results: The nine ML models, which included a Random Forest classifier, Support Vector classification and a Gradient Boosting classifier, had area under the curve (AUC) statistics ranging from 0.5 to 0.75. The Ensemble Stacker model showed the best performance, and this was replicated in the testing dataset (AUC 0.72). Recursive feature elimination found that eight variables had a significant impact on model output. The model had higher sensitivity (74%) than specificity (60%).

Conclusions: ML shows encouraging performance and highlights variables that may have bespoke value for diagnosing community liver disease. Optimising how ML algorithms are integrated into clinical pathways of care and exploring new biomarkers will further enhance diagnostic utility.
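The performance measures quoted in the abstract (AUC, sensitivity, specificity) and the point that diagnostic accuracy depends on prevalence can be illustrated with a minimal sketch. This is not the authors' code or pipeline: the function names and the toy labels and scores are hypothetical, and combining the reported sensitivity (74%) and specificity (60%) with the cohort's 20.3% rate of TE >8 kPa to derive predictive values is an assumption made purely for illustration.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are called positive."""
    pairs = list(zip(y_true, y_score))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outscores a
    random negative case, counting ties as one half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def predictive_values(
    sens: float, spec: float, prevalence: float
) -> Tuple[float, float]:
    """Positive and negative predictive values from Bayes' rule."""
    ppv = sens * prevalence / (
        sens * prevalence + (1 - spec) * (1 - prevalence)
    )
    npv = spec * (1 - prevalence) / (
        spec * (1 - prevalence) + (1 - sens) * prevalence
    )
    return ppv, npv


if __name__ == "__main__":
    # Hypothetical toy data: 1 = significant fibrosis, 0 = no significant fibrosis.
    labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    scores = [0.9, 0.8, 0.4, 0.35, 0.7, 0.5, 0.3, 0.2, 0.1, 0.05]
    print("AUC:", auc(labels, scores))
    print("Sens/spec at 0.5:", sensitivity_specificity(labels, scores, 0.5))

    # Assumption: abstract's sensitivity/specificity applied at cohort prevalence.
    print("PPV/NPV at 20.3%:", predictive_values(0.74, 0.60, 0.203))
    # Same test characteristics at a hypothetical lower prevalence of 5%.
    print("PPV/NPV at 5%:", predictive_values(0.74, 0.60, 0.05))
```

Under these assumptions the positive predictive value is roughly 0.32 and the negative predictive value roughly 0.90 at 20.3% prevalence. Lowering the prevalence reduces the positive predictive value further, which is the dependence on prevalence that the Background describes.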
Citation
Bennett, L., Mostafa, M., Hammersley, R., Purssell, H., Patel, M., Street, O., Athwal, V. S., Hanley, K. P., The ID-LIVER Consortium, Hanley, N. A., Morling, J. R., & Guha, I. N. (2023). Using a machine learning model to risk stratify for the presence of significant liver disease in a primary care population. Journal of Medical Artificial Intelligence, 6, Article 27. https://doi.org/10.21037/jmai-23-35
Journal Article Type | Article |
---|---|
Acceptance Date | Sep 22, 2023 |
Online Publication Date | Nov 21, 2023 |
Publication Date | Nov 30, 2023 |
Deposit Date | Nov 27, 2023 |
Publicly Available Date | Nov 27, 2023 |
Journal | Journal of Medical Artificial Intelligence |
Electronic ISSN | 2617-2496 |
Publisher | AME Publishing Company |
Peer Reviewed | Peer Reviewed |
Volume | 6 |
Article Number | 27 |
DOI | https://doi.org/10.21037/jmai-23-35 |
Keywords | Liver disease; machine learning (ML); diagnosis; community |
Public URL | https://nottingham-repository.worktribe.com/output/27590043 |
Publisher URL | https://jmai.amegroups.org/article/view/8267/ |
Files
8267-PB1-6171-R1 (PDF, 2.6 Mb)
Publisher Licence URL
https://creativecommons.org/licenses/by-nc-nd/4.0/