Forced vital capacity trajectories in patients with idiopathic pulmonary fibrosis: a secondary analysis of a multicentre, prospective, observational cohort

Fainberg, Hernan P; Oldham, Justin M; Molyneau, Philip L; Allen, Richard J; Kraven, Luke M; Fahy, William A; Porte, Joanne; Braybrooke, Rebecca; Saini, Gauri; Karsdal, Morten A; Leeming, Diane J; Triguero, Isaac; Sand, Jannie M B; Oballa, Eunice; Wells, Athol U; Renzoni, Elisabetta; Wain, Louise V; Noth, Imre; Maher, Toby M; Stewart, Iain D; Jenkins, R. Gisli



Background: Idiopathic Pulmonary Fibrosis (IPF) is a progressive fibrotic lung disease with a variable clinical trajectory. Decline in Forced Vital Capacity (FVC) is the main indicator of progression; however, missingness in longitudinal data prevents long-term analysis of lung function patterns. We used Machine Learning (ML) techniques to identify patterns of lung function trajectory.

Methods: Longitudinal FVC data were collected from 415 participants with IPF. The performance of conventional and ML techniques for imputing missing data was evaluated, and the fully imputed dataset was then analysed by unsupervised clustering using Self-Organizing Maps (SOM). Anthropometrics, genomic associations, blood biomarkers and clinical outcomes were compared between clusters. Replication was performed using an independent dataset.
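
The two steps described in the Methods (scoring imputation methods, then clustering the completed trajectories with a SOM) can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, column-mean imputation stands in for a "conventional" technique, the masking-based error score and the 2x2 map size are assumptions chosen for illustration, and every function name below is hypothetical.

```python
import numpy as np

def mask_and_score(data, impute, frac=0.2, seed=0):
    """Hide a random fraction of observed values, impute them, and
    return the RMSE between imputed and true values on the hidden cells."""
    rng = np.random.default_rng(seed)
    observed = ~np.isnan(data)
    hide = observed & (rng.random(data.shape) < frac)
    masked = data.copy()
    masked[hide] = np.nan
    filled = impute(masked)
    return float(np.sqrt(np.mean((filled[hide] - data[hide]) ** 2)))

def mean_impute(x):
    """Baseline: replace each missing value with the mean of its visit (column)."""
    col_means = np.nanmean(x, axis=0)
    return np.where(np.isnan(x), col_means, x)

def train_som(data, rows=2, cols=2, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small self-organizing map; returns one weight vector per node."""
    rng = np.random.default_rng(seed)
    # Grid coordinates of the map nodes (assumed 2x2, giving four nodes).
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise node weights from randomly chosen trajectories.
    weights = data[rng.choice(len(data), rows * cols, replace=False)].copy()
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for i in rng.permutation(len(data)):
            t = step / n_steps
            lr = lr0 * (1 - t)                 # learning rate decays linearly
            sigma = sigma0 * (1 - t) + 1e-3    # neighbourhood radius shrinks
            # Best-matching unit: node whose weights are closest to the sample.
            bmu = np.argmin(np.linalg.norm(weights - data[i], axis=1))
            dist2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
            h = np.exp(-dist2 / (2 * sigma ** 2))
            # Pull the BMU and its grid neighbours towards the sample.
            weights += lr * h[:, None] * (data[i] - weights)
            step += 1
    return weights

def assign_clusters(data, weights):
    """Label each trajectory with the index of its nearest SOM node."""
    d = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
    return np.argmin(d, axis=1)

# Synthetic example: 40 participants, 7 visits over 3 years, about 10% missing.
rng = np.random.default_rng(1)
visits = np.linspace(0, 3, 7)
slopes = rng.choice([-8.0, -4.0, 0.0], size=40)
fvc = 80 + slopes[:, None] * visits + rng.normal(0, 1, (40, 7))
fvc[rng.random(fvc.shape) < 0.1] = np.nan

score = mask_and_score(fvc, mean_impute)   # imputation error of the baseline
complete = mean_impute(fvc)                # fully imputed dataset
som_weights = train_som(complete)
labels = assign_clusters(complete, som_weights)
```

Comparing `mask_and_score` across several candidate imputers is one way to rank them before clustering. The node weight vectors in `som_weights` can then be read as prototype trajectories for each cluster.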

Results: An unsupervised ML algorithm had the lowest imputation error among the tested methods, and SOM identified four distinct clusters (CL1 to CL4), confirmed by sensitivity analysis. CL1 (n=140): linear decline over three years; CL2 (n=100): initial improvement in FVC before declining; CL3 (n=113): initial FVC decline before stabilisation; CL4 (n=62): stable lung function. Median survival was shortest in CL1 (2.87; 95% CI: 2.29–3.40) and longest in CL4 (5.65; 95% CI: 5.18–6.62). Baseline FEV1/FVC ratio and biomarker SPD levels were significantly higher in clusters CL1 and CL3. Similar lung function clusters with some shared anthropometric characteristics were identified in the replication dataset.

Conclusions: Using a data-driven unsupervised approach, we identified four clusters of lung function trajectory with distinct clinical and biochemical features. Enriching or stratifying longitudinal spirometric data into clusters may optimise evaluation of intervention efficacy during clinical trials and patient management.


Fainberg, H. P., Oldham, J. M., Molyneau, P. L., Allen, R. J., Kraven, L. M., Fahy, W. A., … Jenkins, R. G. (2022). Forced vital capacity trajectories in patients with idiopathic pulmonary fibrosis: a secondary analysis of a multicentre, prospective, observational cohort. The Lancet Digital Health, 4(12), e862–e872.

Journal Article Type: Article
Acceptance Date: Aug 25, 2022
Online Publication Date: Nov 1, 2022
Publication Date: Dec 1, 2022
Deposit Date: Sep 5, 2022
Publicly Available Date: Nov 1, 2022
Journal: The Lancet Digital Health
Print ISSN: 2589-7500
Electronic ISSN: 2589-7500
Peer Reviewed: Peer Reviewed
Volume: 4
Issue: 12
Pages: e862-e872
Keywords: Health Information Management; Decision Sciences (miscellaneous); Health Informatics; Medicine (miscellaneous)

