Large language models surpass human experts in predicting neuroscience results

Luo, Xiaoliang; Rechardt, Akilles; Sun, Guangzhi; Nejad, Kevin K.; Yáñez, Felipe; Yilmaz, Bati; Lee, Kangjoo; Cohen, Alexandra O.; Borghesani, Valentina; Pashkov, Anton; Marinazzo, Daniele; Nicholas, Jonathan; Salatiello, Alessandro; Sucholutsky, Ilia; Minervini, Pasquale; Razavi, Sepehr; Rocca, Roberta; Yusifov, Elkhan; Okalova, Tereza; Gu, Nianlong; Ferianc, Martin; Khona, Mikail; Patil, Kaustubh R.; Lee, Pui Shee; Mata, Rui; Myers, Nicholas E.; Bizley, Jennifer K.; Musslick, Sebastian; Bilgin, Isil Poyraz; Niso, Guiomar; Ales, Justin M.; Gaebler, Michael; Ratan Murty, N. Apurva; Loued-Khenissi, Leyla; Behler, Anna; Hall, Chloe M.; Dafflon, Jessica; Bao, Sherry Dongqi; Love, Bradley C.

Abstract

Scientific discoveries often hinge on synthesizing decades of research, a task that potentially outstrips human information processing capacities. Large language models (LLMs) offer a solution. LLMs trained on the vast scientific literature could potentially integrate noisy yet interrelated findings to forecast novel results better than human experts. Here, to evaluate this possibility, we created BrainBench, a forward-looking benchmark for predicting neuroscience results. We find that LLMs surpass experts in predicting experimental outcomes. BrainGPT, an LLM we tuned on the neuroscience literature, performed better yet. Like human experts, when LLMs indicated high confidence in their predictions, their responses were more likely to be correct, which presages a future where LLMs assist humans in making discoveries. Our approach is not neuroscience specific and is transferable to other knowledge-intensive endeavours.

Citation

Luo, X., Rechardt, A., Sun, G., Nejad, K. K., Yáñez, F., Yilmaz, B., Lee, K., Cohen, A. O., Borghesani, V., Pashkov, A., Marinazzo, D., Nicholas, J., Salatiello, A., Sucholutsky, I., Minervini, P., Razavi, S., Rocca, R., Yusifov, E., Okalova, T., Gu, N., …Love, B. C. (2024). Large language models surpass human experts in predicting neuroscience results. Nature Human Behaviour, https://doi.org/10.1038/s41562-024-02046-9

Journal Article Type: Article
Acceptance Date: Oct 3, 2024
Online Publication Date: Nov 27, 2024
Publication Date: Jan 1, 2024
Deposit Date: Dec 5, 2024
Publicly Available Date: Dec 18, 2024
Journal: Nature Human Behaviour
Electronic ISSN: 2397-3374
Publisher: Nature Publishing Group
Peer Reviewed: Yes
DOI: https://doi.org/10.1038/s41562-024-02046-9
Public URL: https://nottingham-repository.worktribe.com/output/42595003
Publisher URL: https://www.nature.com/articles/s41562-024-02046-9
