
Using Shopping Data to Improve the Diagnosis of Ovarian Cancer: Survey Study

Dolan, E.; Goulding, J.; Tata, L.; Lang, A.





Shopping data can be analysed using machine learning techniques to study population health. It is unknown if use of such methods can successfully investigate pre-diagnosis purchases linked to self-medication of symptoms of ovarian cancer.


To gain new domain knowledge from women’s experiences, to better understand how women’s shopping behaviour relates to their pathway to diagnosis of ovarian cancer, and to inform research on computational analysis of shopping data for insights into population health.


An online survey about individuals’ shopping patterns prior to an ovarian cancer diagnosis was analysed to identify key knowledge about healthcare purchases. Logistic regression and random forest models were used to examine how purchases of products linked to potential symptoms were associated with presentation to healthcare and timing of diagnosis.


Of 101 women with ovarian cancer who were surveyed, 58% bought non-prescription healthcare products, including pain relief and abdominal products, before diagnosis, in some cases for more than a year. General Practitioner advice was the primary reason for purchases (40%), with 51% occurring because a participant’s doctor believed their health problems were due to a condition other than ovarian cancer. Associations were shown between purchases made because a participant’s doctor believed their health problems were due to a condition other than ovarian cancer and the following variables: health problems for longer than a year prior to diagnosis (OR 7.33; 95% CI 1.58 – 33.97), buying healthcare products for more than 6 months and up to a year (OR 3.82; 95% CI 1.04 – 13.98) or for more than a year (OR 7.64; 95% CI 1.38 – 42.33), and the number of healthcare product types purchased (OR 1.54; 95% CI 1.13 – 2.11). Purchasing patterns are shown to be potentially predictive of a participant’s doctor thinking their health problems were due to some condition other than ovarian cancer, with nested cross-validation of random forest classification models achieving an overall in-sample accuracy score of 89.1% and an out-of-sample score of 70.1%.
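For readers unfamiliar with the "OR; 95% CI" notation used above, the following is a minimal sketch of how an unadjusted odds ratio and its Wald confidence interval can be computed from a 2x2 table. It is not the authors' analysis: the study reports odds ratios from logistic regression models, and the function name `odds_ratio_with_ci` and the cell counts below are hypothetical, chosen only for illustration.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval for a 2x2 table.

    a: exposed, outcome present      b: exposed, outcome absent
    c: unexposed, outcome present    d: unexposed, outcome absent
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only; NOT data from the survey
or_, lo, hi = odds_ratio_with_ci(12, 8, 6, 24)
print(f"OR {or_:.2f}; 95% CI {lo:.2f} to {hi:.2f}")
```

Because the interval is built on the log scale, it is asymmetric around the odds ratio, and with small cell counts it becomes wide, which is the same pattern visible in the intervals reported above.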


Women in the survey were seven times more likely to have had a duration of more than a year of health problems prior to a diagnosis of ovarian cancer if they were self-medicating based on advice from a doctor, rather than having made the decision to self-medicate independently. Predictive modelling indicates that women in such situations, who are self-medicating because their doctor believes their health problems may be due to a condition other than ovarian cancer, exhibit distinct shopping behaviours that may be identifiable within purchasing data. Further investigation is required to determine if receiving such advice from a doctor might disproportionately increase the time women self-manage symptoms prior to re-seeking help, leading to a longer duration of health problems prior to diagnosis. By combining women’s own accounts of their behaviours prior to diagnosis with computational analysis of these data, this exploratory study demonstrates that women’s shopping data could potentially be useful for earlier ovarian cancer detection.


Dolan, E., Goulding, J., Tata, L., & Lang, A. (in press). Using Shopping Data to Improve the Diagnosis of Ovarian Cancer: Survey Study. JMIR Cancer.

Journal Article Type Article
Acceptance Date Jun 23, 2022
Deposit Date Jul 8, 2022
Journal JMIR Cancer
Electronic ISSN 2369-1999
Publisher JMIR Publications
Peer Reviewed Peer Reviewed