Markus Helmer
On the stability of canonical correlation analysis and partial least squares with application to brain-behavior associations
Helmer, Markus; Warrington, Shaun; Mohammadi-Nejad, Ali-Reza; Ji, Jie Lisa; Howell, Amber; Rosand, Benjamin; Anticevic, Alan; Sotiropoulos, Stamatios N.; Murray, John D.
Authors
Dr SHAUN WARRINGTON Shaun.Warrington1@nottingham.ac.uk
Research Fellow
Dr. ALIREZA MOHAMMADINEZHAD KISOMI ALIREZA.MOHAMMADINEZHADKISOMI@NOTTINGHAM.AC.UK
Research Fellow
Jie Lisa Ji
Amber Howell
Benjamin Rosand
Alan Anticevic
STAMATIOS SOTIROPOULOS STAMATIOS.SOTIROPOULOS@NOTTINGHAM.AC.UK
Professor of Computational Neuroimaging
John D. Murray
Abstract
Associations between datasets can be discovered through multivariate methods like Canonical Correlation Analysis (CCA) or Partial Least Squares (PLS). A requisite property for interpretability and generalizability of CCA/PLS associations is stability of their feature patterns. However, stability of CCA/PLS in high-dimensional datasets is questionable, as found in empirical characterizations. To study these issues systematically, we developed a generative modeling framework to simulate synthetic datasets. We found that when sample size is relatively small, but comparable to typical studies, CCA/PLS associations are highly unstable and inaccurate, both in their magnitude and, importantly, in the feature pattern underlying the association. We confirmed these trends across two neuroimaging modalities and in independent datasets with n ≈ 1000 and n = 20,000, and found that only the latter comprised sufficient observations for stable mappings between imaging-derived and behavioral features. We further developed a power calculator to provide sample sizes required for stability and reliability of multivariate analyses. Collectively, we characterize how to limit detrimental effects of overfitting on CCA/PLS stability, and provide recommendations for future studies.
Citation
Helmer, M., Warrington, S., Mohammadi-Nejad, A., Ji, J. L., Howell, A., Rosand, B., …Murray, J. D. (2024). On the stability of canonical correlation analysis and partial least squares with application to brain-behavior associations. Communications Biology, 7(1), Article 217. https://doi.org/10.1038/s42003-024-05869-4
Journal Article Type | Article |
---|---|
Acceptance Date | Jan 28, 2024 |
Online Publication Date | Feb 21, 2024 |
Publication Date | 2024 |
Deposit Date | Jan 29, 2024 |
Publicly Available Date | Feb 22, 2024 |
Journal | Communications Biology |
Electronic ISSN | 2399-3642 |
Publisher | Nature Publishing Group |
Peer Reviewed | Peer Reviewed |
Volume | 7 |
Issue | 1 |
Article Number | 217 |
DOI | https://doi.org/10.1038/s42003-024-05869-4 |
Keywords | Cognitive neuroscience; Computational neuroscience; Statistical methods |
Public URL | https://nottingham-repository.worktribe.com/output/30509209 |
Publisher URL | https://www.nature.com/articles/s42003-024-05869-4 |
Additional Information | Received: 7 May 2023; Accepted: 28 January 2024; First Online: 21 February 2024. The authors declare the following competing interests: M.H. and J.L.J. are currently employed by Manifest Technologies. A.A. and J.D.M. hold equity with Neumora Therapeutics (formerly BlackThorn Therapeutics) and are co-founders of Manifest Technologies. J.D.M. and A.A. are co-inventors on the patent Methods and tools for detecting, diagnosing, predicting, prognosticating, or treating a neurobehavioral phenotype in a subject, U.S. Application No.16/149,903, filed on October 2, 2018, U.S. Application for PCT International Application No.18/054, 009 filed on October 2, 2018. A.A., J.D.M. and J.L.J. are co-inventors on the patent Systems and Methods for Neuro-Behavioral Relationships in Dimensional Geometric Embedding (N-BRIDGE), PCT International Application No.PCT/US2119/022110, filed March 13, 2019. A.A., J.D.M., M.H. and J.L.L. are co-inventors on the patent Methods of Identifying Subjects for Inclusion and/or Exclusion in a Clinical Trial, Application No.: 63/533,888, filed August 21, 2023. All other authors declare no competing interests. |
Files
s42003-024-05869-4
(3.7 Mb)
PDF
Publisher Licence URL
https://creativecommons.org/licenses/by/4.0/