Research Repository

The unbanked and poverty: predicting area-level socio-economic vulnerability from M-Money transactions (2018)
Conference Proceeding
Engelmann, G., Smith, G., & Goulding, J. (2018). The unbanked and poverty: predicting area-level socio-economic vulnerability from M-Money transactions

Emerging economies around the world are often characterized by governments and institutions struggling to keep key demographic data streams up to date. A demographic of interest particularly linked to social vulnerability is that of poverty and socio...

Generating vague neighbourhoods through data mining of passive web data (2017)
Journal Article
Brindley, P., Goulding, J., & Wilson, M. L. (in press). Generating vague neighbourhoods through data mining of passive web data. International Journal of Geographical Information Science, 32(3), doi:10.1080/13658816.2017.1400549. ISSN 1365-8816

Neighbourhoods have been described as "the building blocks of public services society". Their subjective nature, however, and the resulting difficulties in collecting data, mean that in many countries there are no officially defined neighbourhoods e...

Seasonal variation in collective mood via Twitter content and medical purchases (2017)
Journal Article
Dzogang, F., Goulding, J., Lightman, S., & Cristianini, N. (in press). Seasonal variation in collective mood via Twitter content and medical purchases. Lecture Notes in Artificial Intelligence, 10584, doi:10.1007/978-3-319-68765-0_6. ISSN 0302-9743

The analysis of sentiment contained in vast amounts of Twitter messages has reliably shown seasonal patterns of variation in multiple studies, a finding that can have great importance in the understanding of seasonal affective disorders, particularly...

Event series prediction via non-homogeneous Poisson process modelling (2016)
Conference Proceeding
Goulding, J., Preston, S. P., & Smith, G. (2016). Event series prediction via non-homogeneous Poisson process modelling. In 2016 IEEE 16th International Conference on Data Mining (ICDM). doi:10.1109/ICDM.2016.0027

Data streams whose events occur at random arrival times rather than at the regular, tick-tock intervals of traditional time series are increasingly prevalent. Event series are continuous, irregular and often highly sparse, differing greatly in nature...

Cross-system recommendation: user-modelling via social media versus self-declared preferences (2016)
Conference Proceeding
Cross-system recommendation: user-modelling via social media versus self-declared preferences. In HT '16: Proceedings of the 27th ACM Conference on Hypertext and Social Media, 183-188. doi:10.1145/2914586.2914640

It is increasingly rare to encounter a Web service that doesn’t engage in some form of automated recommendation, with Collaborative Filtering (CF) techniques being virtually ubiquitous as the means for delivering relevant content. Yet several key iss...

A novel symbolization technique for time-series outlier detection (2015)
Conference Proceeding
Smith, G., & Goulding, J. (2015). A novel symbolization technique for time-series outlier detection. In 2015 IEEE International Conference on Big Data (Big Data). doi:10.1109/BigData.2015.7364037

The detection of outliers in time series data is a core component of many data-mining applications and is broadly applied in industrial applications. In large data sets, algorithms that are efficient in both time and space are required. One area where sp...

A refined limit on the predictability of human mobility (2014)
Conference Proceeding
Smith, G., Wieser, R., Goulding, J., & Barrack, D. (2014). A refined limit on the predictability of human mobility. In 2014 IEEE International Conference on Pervasive Computing and Communications (PerCom). doi:10.1109/PerCom.2014.6813948

It has been recently claimed that human movement is highly predictable. While an upper bound of 93% predictability was shown, this was based upon human movement trajectories of very high spatiotemporal granularity. Recent studies reduced this spatiot...

Towards optimal symbolization for time series comparisons (2013)
Conference Proceeding
Smith, G., Goulding, J., & Barrack, D. (2013). Towards optimal symbolization for time series comparisons. In 2013 IEEE 13th International Conference on Data Mining Workshops. doi:10.1109/ICDMW.2013.59

The abundance and value of mining large time series data sets has long been acknowledged. Ubiquitous in fields ranging from astronomy and biology to web science, the size and number of these datasets continue to increase, a situation exacerbated by the...