Dr GAVIN LONG Gavin.Long@nottingham.ac.uk
Research Fellow
Machine learning on national shopping data reliably estimates childhood obesity prevalence and socio-economic deprivation
Long, Gavin; Nica-Avram, Georgiana; Harvey, John; Lukinova, Evgeniya; Mansilla, Roberto; Welham, Simon; Engelmann, Gregor; Dolan, Elizabeth; Makokoro, Kuzivakwashe; Thomas, Michelle; Powell, Edward; Goulding, James
Authors
Mrs GEORGIANA NICA-AVRAM GEORGIANA.NICA-AVRAM1@NOTTINGHAM.AC.UK
TRANSITIONAL ASSISTANT PROFESSOR
Dr JOHN HARVEY John.Harvey2@nottingham.ac.uk
ASSOCIATE PROFESSOR
Dr EVGENIYA LUKINOVA EVGENIYA.LUKINOVA@NOTTINGHAM.AC.UK
ASSISTANT PROFESSOR
Mr ROBERTO MANSILLA LOBOS Roberto.MansillaLobos@nottingham.ac.uk
ASSISTANT PROFESSOR
Simon Welham simon.welham@nottingham.ac.uk
ASSISTANT PROFESSOR
Gregor Engelmann
Mrs ELIZABETH DOLAN Elizabeth.Dolan@nottingham.ac.uk
Research Associate in Health Data Science
Kuzivakwashe Makokoro
Michelle Thomas
Edward Powell
Dr JAMES GOULDING JAMES.GOULDING@NOTTINGHAM.AC.UK
PROFESSOR OF DATA SCIENCE
Abstract
Deprivation pushes people to choose cheap, calorie-dense foods instead of nutritious but expensive alternatives. Diseases, such as obesity, cardiovascular disease, and diabetes, resulting from these poor dietary choices place a significant burden on public health systems. Measuring nutritional insecurity is difficult to achieve at scale and so the ability to study the relationship between nutritional outcomes and deprivation at a national level is very challenging. This makes it difficult to understand the effect of new policies or track changes over time. To address this challenge, we develop a machine learning approach using massive anonymised transactional data (4 million members and 2.5 billion transactions) in partnership with the retailer The Cooperative Group UK. We engineer a series of variables related to obesogenic diets, including a new measure called 'Calorie-oriented purchasing'. These variables help illustrate how large-scale transactional data can discriminate between neighbourhoods most affected by deprivation and childhood obesity. Through comparative assessment of machine learning approaches, we find better performance from tree-based models (Random Forest, XGBoost) with the best-achieving accuracy of 0.88 for predicting deprivation and an accuracy of 0.79 for childhood obesity. Calorie-oriented purchasing emerges as a robust predictor of deprivation and childhood obesity at the census area level. Results show this approach can help summarise nutritional insecurity, and support its spatio-temporal monitoring. We conclude with policy implications and recommend retailers adopt new measures for measuring national nutrition insecurity.
Citation
Long, G., Nica-Avram, G., Harvey, J., Lukinova, E., Mansilla, R., Welham, S., Engelmann, G., Dolan, E., Makokoro, K., Thomas, M., Powell, E., & Goulding, J. (2025). Machine learning on national shopping data reliably estimates childhood obesity prevalence and socio-economic deprivation. Food Policy, 131, Article 102826. https://doi.org/10.1016/j.foodpol.2025.102826
Journal Article Type | Article |
---|---|
Acceptance Date | Feb 10, 2025 |
Online Publication Date | Feb 27, 2025 |
Publication Date | 2025-02 |
Deposit Date | Feb 19, 2025 |
Publicly Available Date | Mar 3, 2025 |
Journal | Food Policy |
Print ISSN | 0306-9192 |
Publisher | Elsevier |
Peer Reviewed | Peer Reviewed |
Volume | 131 |
Article Number | 102826 |
DOI | https://doi.org/10.1016/j.foodpol.2025.102826 |
Keywords | Deprivation; Obesity; Machine learning; Dietary Monitoring; Digital Footprints; Food Security |
Public URL | https://nottingham-repository.worktribe.com/output/45595723 |
Publisher URL | https://www.sciencedirect.com/science/article/pii/S0306919225000302?via%3Dihub |
Additional Information | This article is maintained by: Elsevier; Article Title: Machine learning on national shopping data reliably estimates childhood obesity prevalence and socio-economic deprivation; Journal Title: Food Policy; CrossRef DOI link to publisher maintained version: https://doi.org/10.1016/j.foodpol.2025.102826; Content Type: article; Copyright: © 2025 The Authors. Published by Elsevier Ltd. |
Files
1-s2.0-S0306919225000302-main
(6.2 Mb)
PDF
Publisher Licence URL
https://creativecommons.org/licenses/by/4.0/
Copyright Statement
© 2025 The Authors. Published by Elsevier Ltd.