Research Repository

Outputs (38)

Beyond the Walls: Patterns of Child Labour, Forced Labour, and Exploitation in a New Domestic Workers Dataset (2024)
Journal Article

The new Domestic Workers Dataset is the largest single set of surveys (n = 11,759) of domestic workers to date. Our analysis of this dataset reveals features about the lives and work of this “hard-to-find” population in India, a country estimated to h...

Assessing relative contribution of Environmental, Behavioural and Social factors on Life Satisfaction via mobile app data (2023)
Conference Proceeding

Life satisfaction significantly contributes to wellbeing and is linked to positive outcomes for individual people and society more broadly. However, previous research demonstrates that many factors contribute to the life satisfaction of an individua...

Who consumes anthocyanins and anthocyanidins? Mining national retail data to reveal the influence of socioeconomic deprivation and seasonality on polyphenol dietary intake (2023)
Conference Proceeding

Anthocyanins are a class of polyphenols that have received widespread recent attention due to their potential health benefits. However, estimating the dietary intake of anthocyanins at a population level is a challenging task, due to the difficulty o...

Assessing the value of integrating national longitudinal shopping data into respiratory disease forecasting models (2023)
Journal Article

The COVID-19 pandemic led to unparalleled pressure on healthcare services. Improved healthcare planning in relation to diseases affecting the respiratory system has consequently become a key concern. We investigated the value of integrating sales of...

Data donation of individual shopping data to help predict the occurrence of disease: A pilot study linking individual loyalty card and health survey data to investigate COVID-19 (2023)
Journal Article

Introduction & Background: Previous studies have found shopping data could increase the predictive accuracy of disease surveillance systems and illuminate behavioural responses in the self-management of symptoms of disease. Yet, accessing individual...

Expert perspectives on how educational technology may support autonomous learning for remote out-of-school children in low-income contexts (2023)
Journal Article

Across Sub-Saharan Africa, 98 million children are illiterate and innumerate and do not attend school. Educational technologies (EdTech) that promote autonomous learning may ameliorate this learning poverty. Yet, little is known if or how these tech...

Using Shopping Data to Improve the Diagnosis of Ovarian Cancer: Computational Analysis of a Web-Based Survey (2023)
Journal Article

Background: Shopping data can be analysed using machine learning techniques to study population health. It is unknown if use of such methods can successfully investigate pre-diagnosis purchases linked to self-medication of symptoms of ovarian cance...

Bundle entropy as an optimized measure of consumers' systematic product choice combinations in mass transactional data (2022)
Conference Proceeding

Understanding and measuring the predictability of consumer purchasing (basket) behaviour is of significant value. While predictability measures such as entropy have been well studied and leveraged in other sectors, their development and application t...

Informing action for United Nations SDG target 8.7 and interdependent SDGs: Examining modern slavery from space (2021)
Journal Article

This article provides an example of the ways in which remote sensing, Earth observation, and machine learning can be deployed to provide the most up to date quantitative portrait of the South Asian ‘Brick Belt’, with a view to understanding the exten...

The unbanked and poverty: predicting area-level socio-economic vulnerability from M-Money transactions (2018)
Conference Proceeding

Emerging economies around the world are often characterized by governments and institutions struggling to keep key demographic data streams up to date. A demographic of interest particularly linked to social vulnerability is that of poverty and socio...

Exploring the capabilities of Projection Augmented Relief Models (PARM) (2017)
Presentation / Conference Contribution
Priestnall, G., Goulding, J., Smith, A., & Arss, N. (2017). Exploring the capabilities of Projection Augmented Relief Models (PARM).

This paper explores the broad capabilities of physical landscape models when augmented by projection, termed Projection Augmented Relief Models (PARM). This includes experiences of developing PARM displays in public settings such as museums and visit...

Event series prediction via non-homogeneous Poisson process modelling (2016)
Presentation / Conference Contribution
Goulding, J., Preston, S. P., & Smith, G. (2016). Event series prediction via non-homogeneous Poisson process modelling. In 2016 IEEE 16th International Conference on Data Mining (ICDM). https://doi.org/10.1109/ICDM.2016.0027

Data streams whose events occur at random arrival times rather than at the regular, tick-tock intervals of traditional time series are increasingly prevalent. Event series are continuous, irregular and often highly sparse, differing greatly in nature...

Cross-system Recommendation: User-modelling via Social Media versus Self-Declared Preferences (2016)
Presentation / Conference Contribution
Alanazi, S., Goulding, J., & McAuley, D. (2016). Cross-system Recommendation: User-modelling via Social Media versus Self-Declared Preferences. In HT '16: Proceedings of the 27th ACM Conference on Hypertext and Social Media (183-188). https://doi.org/10.1145/2914586.2914640

© 2016 ACM. It is increasingly rare to encounter a Web service that doesn't engage in some form of automated recommendation, with Collaborative Filtering (CF) techniques being virtually ubiquitous as the means for delivering relevant content. Yet seve...

A novel symbolization technique for time-series outlier detection (2015)
Presentation / Conference Contribution
Smith, G., & Goulding, J. (2015). A novel symbolization technique for time-series outlier detection. In 2015 IEEE International Conference on Big Data (Big Data). https://doi.org/10.1109/BigData.2015.7364037

The detection of outliers in time series data is a core component of many data-mining applications and is broadly applied in industrial applications. In large data sets, algorithms that are efficient in both time and space are required. One area where sp...

AMP: a new time-frequency feature extraction method for intermittent time-series data (2015)
Presentation / Conference Contribution
Barrack, D. S., Goulding, J., Hopcraft, K., Preston, S., & Smith, G. (2015). AMP: a new time-frequency feature extraction method for intermittent time-series data.

The characterisation of time-series data via their most salient features is extremely important in a range of machine learning tasks, not least with regard to classification and clustering. While there exist many feature extraction techniques...

A refined limit on the predictability of human mobility (2014)
Presentation / Conference Contribution
Smith, G., Wieser, R., Goulding, J., & Barrack, D. (2014). A refined limit on the predictability of human mobility. In 2014 IEEE International Conference on Pervasive Computing and Communications (PerCom). https://doi.org/10.1109/PerCom.2014.6813948

It has been recently claimed that human movement is highly predictable. While an upper bound of 93% predictability was shown, this was based upon human movement trajectories of very high spatiotemporal granularity. Recent studies reduced this spatiot...

A data driven approach to mapping urban neighbourhoods (2014)
Presentation / Conference Contribution
Brindley, P., Goulding, J., & Wilson, M. L. (2014). A data driven approach to mapping urban neighbourhoods.

Neighbourhoods have been described by the UK Secretary of State for Communities and Local Government as the “building blocks of public service society”. Despite this, difficulties in data collection combined with the concept’s subjective nature have...

Towards optimal symbolization for time series comparisons (2013)
Presentation / Conference Contribution
Smith, G., Goulding, J., & Barrack, D. (2013). Towards optimal symbolization for time series comparisons. In 2013 IEEE 13th International Conference on Data Mining Workshops. https://doi.org/10.1109/ICDMW.2013.59

The abundance and value of mining large time series data sets has long been acknowledged. Ubiquitous in fields ranging from astronomy and biology to web science, the size and number of these datasets continue to increase, a situation exacerbated by the...