Jobie Budd
A large-scale and PCR-referenced vocal audio dataset for COVID-19
Budd, Jobie; Baker, Kieran; Karoune, Emma; Coppock, Harry; Patel, Selina; Payne, Richard; Tendero Cañadas, Ana; Titcomb, Alexander; Hurley, David; Egglestone, Sabrina; Butler, Lorraine; Mellor, Jonathon; Nicholson, George; Kiskin, Ivan; Koutra, Vasiliki; Jersakova, Radka; McKendry, Rachel; Diggle, Peter; Richardson, Sylvia; Schuller, Björn; Gilmour, Steven; Pigoli, Davide; Roberts, Stephen; Packham, Josef; Thornley, Tracey; Holmes, Chris
Authors
Kieran Baker
Emma Karoune
Harry Coppock
Selina Patel
Richard Payne
Ana Tendero Cañadas
Alexander Titcomb
David Hurley
Sabrina Egglestone
Lorraine Butler
Jonathon Mellor
George Nicholson
Ivan Kiskin
Vasiliki Koutra
Radka Jersakova
Rachel McKendry
Peter Diggle
Sylvia Richardson
Björn Schuller
Steven Gilmour
Davide Pigoli
Stephen Roberts
Josef Packham
Tracey Thornley (Tracey.Thornley1@nottingham.ac.uk)
Professor of Health Policy
Chris Holmes
Abstract
The UK COVID-19 Vocal Audio Dataset is designed for the training and evaluation of machine learning models that classify SARS-CoV-2 infection status or associated respiratory symptoms using vocal audio. The UK Health Security Agency recruited voluntary participants through the national Test and Trace programme and the REACT-1 survey in England from March 2021 to March 2022, during dominant transmission of the Alpha and Delta SARS-CoV-2 variants and some Omicron variant sublineages. Audio recordings of volitional coughs, exhalations, and speech were collected in the ‘Speak up and help beat coronavirus’ digital survey alongside demographic, symptom and self-reported respiratory condition data. Digital survey submissions were linked to SARS-CoV-2 test results. The UK COVID-19 Vocal Audio Dataset represents the largest collection of SARS-CoV-2 PCR-referenced audio recordings to date. PCR results were linked to 70,565 of 72,999 participants and 24,105 of 25,706 positive cases. Respiratory symptoms were reported by 45.6% of participants. This dataset has additional potential uses for bioacoustics research, with 11.3% of participants self-reporting asthma and 27.2% having linked influenza PCR test results.
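The linkage counts quoted in the abstract can be turned into coverage percentages with a few lines of arithmetic. The sketch below is illustrative only: the function and constant names are hypothetical, and the only inputs are the four counts stated in the abstract. It does not read or reproduce any part of the dataset itself.

```python
def percentage(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Counts as quoted in the abstract (names are illustrative assumptions).
PARTICIPANTS = 72_999
PARTICIPANTS_WITH_PCR = 70_565
POSITIVE_CASES = 25_706
POSITIVE_CASES_WITH_PCR = 24_105

# Share of all participants whose submission was linked to a PCR result.
pcr_linkage_all = percentage(PARTICIPANTS_WITH_PCR, PARTICIPANTS)

# Share of positive cases whose result was PCR-referenced.
pcr_linkage_positive = percentage(POSITIVE_CASES_WITH_PCR, POSITIVE_CASES)

if __name__ == "__main__":
    print(f"PCR-linked participants:   {pcr_linkage_all:.1f}%")
    print(f"PCR-linked positive cases: {pcr_linkage_positive:.1f}%")
```

Both ratios come out above 90%, which is consistent with the abstract's description of the dataset as PCR-referenced.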
Field | Value |
---|---|
Journal Article Type | Article |
Acceptance Date | Jun 10, 2024 |
Online Publication Date | Jun 27, 2024 |
Publication Date | Jun 27, 2024 |
Deposit Date | Apr 10, 2024 |
Publicly Available Date | Jul 8, 2024 |
Electronic ISSN | 2052-4463 |
Publisher | Nature Publishing Group |
Peer Reviewed | Peer Reviewed |
Volume | 11 |
Article Number | 700 |
DOI | https://doi.org/10.1038/s41597-024-03492-w |
Public URL | https://nottingham-repository.worktribe.com/output/33561144 |
Publisher URL | https://www.nature.com/articles/s41597-024-03492-w |
Files
A large-scale and PCR-referenced vocal audio dataset for COVID-19 (PDF, 2.1 MB)
Publisher Licence URL: https://creativecommons.org/licenses/by/4.0/