Tracking the Consumption of Home Essentials

Predictions of people's behaviour increasingly drive interactions with a new generation of IoT services designed to support everyday life in the home, from shopping to heating. Based on the premise that such automation is difficult due to the contingent nature of people's practices, in this work we explore the nature of these contingencies in depth. We have designed and conducted a technology probe that made use of simple linear predictions as a provocation, and invited people to track the life of their household essentials over a two-month period. Through a mixed-method approach we demonstrate the challenges of simple predictions, and in turn identify eight categories of contingencies that influenced prediction accuracy. We discuss strategies for how designers of future predictive IoT systems may take the contingencies into account by removing, hiding, revealing, managing, or exploiting the system uncertainty at the core of the issue.


INTRODUCTION
Research in HCI is beginning to examine how people experience emerging 'proactive' Internet of Things (IoT) technologies in their everyday lives [20], for example to automate elements of food shopping [57]. Current approaches for managing grocery shopping vary, relying on various degrees of automation and technology, including buttons for instant re-ordering (e.g. Amazon Dash), and 'meal kit' subscription services. Yet these do not account for existing products in the home, and 'smart' fridges that attempt to monitor groceries as yet do not feature automated re-ordering. Prospectively, future technologies will combine elements of all of these approaches but must address the challenge of predicting shopping needs for such a system to be adopted at scale.
For these IoT systems to deliver the expected economic and societal benefits [37], they must fit into the contingent practices that shape the lived experience of household shopping and consumption [30]. In this work, our primary aim is to investigate these contingencies to develop an understanding of household product consumption that can be used to inform the design of IoT-based services that leverage consumption data. One example is 'automated replenishment' services, which make use of IoT-enabled prediction of essential product use in the home to trigger automated repurchasing and delivery. Managing essential products in the home is an opportunity for HCI to address numerous challenges, including reducing food waste [19,50] and accessibility challenges for less-abled people [17]. However, while approaches to predict household consumption are emerging [56], what is lacking is an understanding of the contingent nature of actual consumption, which shapes the uncertainty that designers of predictive systems should take into account.
To address this gap, we have conducted a two-month long technology probe deployment in ten UK households. Our quantitative findings demonstrate shortcomings of simple predictions of consumption; and our qualitative analysis of participant feedback gathered throughout the study and semi-structured interviews with participant households reveals eight categories of contingencies that participants used to explain why predictions were wrong. In turn, we discuss strategies that designers can adopt to address the uncertainty effected by the contingent nature of product consumption.

RELATED WORK
We now situate our work within the literature on the Internet of Things and on domestic grocery and household shopping.

The Proactive Internet of Things (IoT)
The IoT paradigm promises an era in which everyday objects are interconnected with each other and the Internet, enabling technologies to "disappear from the consciousness of the user" [24]. The economic scale and impact of IoT is set to touch nearly every industry ranging from homes [31,44], retail [46], health [32,35], smart cities [3], and as with our focus, automated shopping and delivery [9]. Yet, the notion that our lives will be filled with devices generating and collecting data engenders new challenges in how future systems could adequately act upon this data to be of value to users [40]. The idea is that future systems will be equipped to autonomously, or rather, proactively, deal with this potential 'deluge' of data, enabling them to act on behalf of their users [20].
The Proactive IoT extends the idea of autonomous systems. By including sufficient sensory input, systems will be able to self-direct and reveal the consequences of automated decision making, and allow users to delegate consent appropriately [43], such that the users can rely on and trust the system to perform as expected [8]. While the importance of understanding how to design an interactive system for the situated and contingent nature of everyday life is long understood in HCI [48], work to make sense of how a proactive system might meet this challenge is relatively nascent. Recent literature in this vein has drawn on a variety of traditions, ranging from ethnomethodological inquiry (e.g. [30]), and in-home breaching experiments (e.g. [57]), to algorithmic Machine Learning-driven approaches (e.g. [1]). Regardless of the approach taken, predicting the needs of end-users is demonstrably an unresolved challenge, even with existing commercial systems [60], and is complicated by the multi-person, multi-activity collaborative environments such technologies are destined for, such as the home [42].
Therefore, to design a proactive system that is capable of predicting the essential grocery needs of homes, designers must not only model individual consumers [16], but they should also allow the system to be shaped by the on-going contingencies of everyday life. We deploy a technology probe to track existing product use in the home and conduct focused contextual interviews with participant households to examine the contingencies that shape their consumption of the home essentials.

Domestic grocery shopping
Shopping provides a ripe opportunity for the development of proactive IoT technologies that can predict and deliver items to the home prior to depletion [37]. The activity of grocery shopping has been found to have two distinct tasks: planning and preparation [51], and fulfilment [6]. Work has explored the practices of creating shopping lists as memory aids [6], how such lists might be digitised [25], and how stores might use customer data to predict future shopping lists, increasing their revenue [15]. Hyland et al. [30] conducted ethnographic fieldwork and identified how participants anticipate their food needs. The authors found shopping to comprise both incidental (i.e., opportunistic) and intentional (i.e., planful) practices. Data collection methods have included photo diary studies of participants preparing and consuming food [28,57], and wearable cameras to reveal food consumption cycles in the home [39]. IoT technologies, however, hold the potential to allow such data to be collected automatically [40], using devices such as smart fridges to track how products in the home are used [9].
Tsubakida et al. [56] made use of food logging and found that seven days sufficed to coarsely predict an individual's eating habits. However, others have shown shopping choice to be shaped by the contingent social and economic factors of everyday life [41], motivating a closer look at what these contingencies are in our work.
Research on the use of grocery items in the home has focused on features such as environmental and financial impact of food waste [5,47]. Of the work that considers other benefits, such as convenience to users, Hong et al. [26] proposed a semi-autonomous system that prompted users to re-order items that are running low, based on sensors in cupboard shelves. However their approach stopped short of satisfying household needs. Verame et al. identified the "challenges of making agency delegation accountable to meal planning, persons' schedules, food-centred values, adaptation and innovation, and the social division of labour in which computational agency will ultimately be embedded" [57, p. 10]. Thus, they elaborate on Crabtree and Tolmie's point that "sensing domestic activity at a local level . . . raises real challenges for machine learning" [13, p. 1747]. It is this very challenge that our study seeks to explicate.
In sum, a proactive IoT system that supports grocery shopping must address: how and which data to collect to make sense of the product life cycle [39], how to compute metrics to drive a system that reliably reorders food 'in time' while considering human desires for variety [57], and how to deal with the 'out-of-stock substitution' problem without generating food waste [50]. The gap we address in this work is the lack of an examination of the contingencies that shape the uncertainty of everyday routine, and of how this uncertainty could be systematically handled in and through design.

TRACKING HOUSEHOLD ESSENTIALS
Following a home deployment approach [54], we now introduce the probe we designed to track household products in order to "consider food-related behaviours within the social environments in which they occur" [11]. We frame our deployment as a technology probe [29], in that it is designed to gather data through technology use in a real-world setting. Our probe consists of a barcode scanner and a web application to gather consumption data and participant feedback by means of the probe and contextual semi-structured interviews [7]. We chose the barcode scanner as it is familiar to most UK shoppers from self-checkouts in supermarkets. Our probe is in line with other home deployments that were used to understand the socio-technical challenges of how future IoT systems should be designed to meet the needs of everyday life [2,12]. Our work, however, does not propose a solution to the 'tracking problem'; the primary purpose of our deployment is to capture data about the consumption of products in the home. Thus, our work should be understood as an exploration of the problem space, supporting future work that rises to the challenge. Our work focuses on the contingencies that (potentially) explain variability in everyday use to support the design of proactive IoT systems that predict the use of products in the home, for example to 'replenish' items before they run out.

Home Essentials probe design
To develop an understanding of how different types of items are consumed in each household, we designed a probe allowing households to track the use of 'essential' consumables, from when they were brought home to when they were consumed or thrown away; this we deemed the 'cycle' of the product for our purposes. In order to practically constrain the scope of the study, we asked households to identify 10-20 items they considered 'essential' (these could be food and also other consumable products such as beauty or cleaning products). The probe deployed in each household consisted of a hand-held barcode scanner, connected via Bluetooth to an iPad with an external keyboard for easy data entry, running the Home Essentials web app (see Figure 1); the probe had the following key features.
Scanning. Participants use a standard EAN barcode scanner to 'scan in' items when they enter the home, and 'scan out' items once they are consumed or disposed of. A sheet with custom barcodes was provided for items that do not have a barcode. Each time a barcode is scanned, the item is looked up via an open API [49] and our own product database. If the product is not found, the user is prompted to enter the product name, brand, and size/weight, and the data is added to our database.
Cycle prediction. The probe tracks each item's cycle to compute an average (mean) consumption time for each product, updated after each cycle. Average consumption was calculated when an item was 'scanned out'; we paired each scan out with the oldest 'in stock' item so that 'top up' shops completed prior to an item running out did not erroneously skew average consumption times. The purpose of using a mean-based algorithm is not to propose this as a solution, but to prompt participant feedback and to support interviews. We projected that certain items may be purchased more regularly than others, and that for these a mean-based prediction may be sufficient. However, other products might fall afoul of everyday variances in routine. Explicating the contingencies that establish the irregularity of purchasing and consuming products is the key objective for us.
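The pairing rule and running mean described in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the probe's actual code: the class name `CycleTracker`, its methods, the whole-day granularity, and the example dates are all hypothetical.

```python
from collections import deque
from datetime import date
from statistics import mean


class CycleTracker:
    """Hypothetical tracker for one product (or one group of equivalent products)."""

    def __init__(self):
        self.in_stock = deque()  # scan-in dates, oldest first
        self.cycle_days = []     # completed time-to-consume values, in days

    def scan_in(self, when: date) -> None:
        self.in_stock.append(when)

    def scan_out(self, when: date) -> None:
        # Pair the scan-out with the OLDEST in-stock item, so that a 'top up'
        # bought before the previous pack ran out does not shorten the cycle.
        if not self.in_stock:
            raise ValueError("scan out without a matching scan in")
        started = self.in_stock.popleft()
        self.cycle_days.append((when - started).days)

    def mean_days(self):
        """Mean time-to-consume, or None before the first completed cycle."""
        return mean(self.cycle_days) if self.cycle_days else None


# Illustrative dates only (assumed, not taken from the study data).
milk = CycleTracker()
milk.scan_in(date(2018, 3, 1))
milk.scan_in(date(2018, 3, 4))    # top-up shop before the first pack ran out
milk.scan_out(date(2018, 3, 5))   # paired with 1 March: 4 days
milk.scan_out(date(2018, 3, 10))  # paired with 4 March: 6 days
print(milk.cycle_days, milk.mean_days())
```

Pairing with the most recent scan in instead would record the first cycle as a single day, which is the skew the oldest-first rule is meant to avoid.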
User feedback. In the case of our predictions being 'incorrect' (i.e. more than one day between the predicted and actual consumption date), the probe prompts the user for an explanation by generating a request for feedback via the 'Inbox' tab of the web app. This feedback provides a critical data source in this work to understand the contingencies that make the computation of predictions of items in the home a challenge.
Grouping items. We grouped equivalent products, or 'substitutes', for each household, drawing on the way that, in online grocery shopping, items which are out of stock are substituted for items which are deemed to be equivalent by a system. We used a heuristic to make decisions such as "is whole milk the same as semi-skimmed milk for this household?"; any future system that delivers predictions based on consumption may have to address this too. The difficulty in deciding whether and how similar products are equivalent is exacerbated by people's propensity to buy items from different shops, in different quantities, varieties, brands, and flavours. As we were not aware of any criteria that define equivalency, we opted to take a relatively simple approach and only grouped items that had a different brand, but the same flavour/variety and the same amount. For example, we would not group whole milk with skimmed milk, nor 4pts of whole milk with 2pts of whole milk. We used this approach to provoke critical reflection from participants, allowing us to gain insight into how different households themselves made sense of equivalency when items were or were not grouped.
Predictive shopping list. We also designed a proactive 'shopping list' that brought together items that had been consumed and that would be consumed within the next seven days. In the latter case, this list was generated using predictions based on previous average time-to-consume of the tracked essentials. The list was optimised for mobile screens and it allowed households to add additional items manually, so that it could be used as an actual shopping list on the go. The purpose of introducing the shopping list feature was to encourage participants to engage with the predictive elements to support the later interviews.
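To make the grouping rule concrete, the sketch below treats two barcodes as equivalent when everything except the brand matches. It is an illustration only: the text does not state whether grouping was automated, and the `Product` fields (in particular `kind`), the function names, and the example barcodes are assumptions.

```python
from collections import defaultdict
from typing import NamedTuple


class Product(NamedTuple):
    barcode: str
    brand: str
    kind: str     # assumed field, e.g. "milk"
    variety: str  # e.g. "whole", "skimmed"
    size: str     # e.g. "4pt"


def group_key(p: Product) -> tuple:
    # Brand is deliberately ignored; variety and size must match exactly.
    return (p.kind.lower(), p.variety.lower(), p.size.lower())


def group_products(products):
    """Map each equivalence key to the list of barcodes that share it."""
    groups = defaultdict(list)
    for p in products:
        groups[group_key(p)].append(p.barcode)
    return dict(groups)


# Hypothetical products for illustration.
items = [
    Product("0001", "Brand A", "milk", "whole", "4pt"),
    Product("0002", "Brand B", "milk", "whole", "4pt"),
    Product("0003", "Brand A", "milk", "skimmed", "4pt"),
    Product("0004", "Brand A", "milk", "whole", "2pt"),
]
groups = group_products(items)
print(groups)
```

Only the two 4pt whole milks from different brands end up in the same group; the skimmed and 2pt items remain separate, mirroring the conservative rule described above.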
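One way to read the list-building rule is sketched below: an item is listed if it is already out of stock, or if its in-stock date plus its mean time-to-consume falls within the seven-day horizon. The function name, the dictionary layout, and the example values are assumptions made for illustration rather than details of the deployed web app.

```python
from datetime import date, timedelta


def shopping_list(items, today, horizon_days=7):
    """Return names of items already consumed or predicted to run out within the horizon.

    Each item is a dict with 'name', 'in_stock_since' (date, or None if out of
    stock) and 'mean_days' (mean time-to-consume, or None if no cycle yet).
    """
    cutoff = today + timedelta(days=horizon_days)
    needed = []
    for item in items:
        if item["in_stock_since"] is None:
            needed.append(item["name"])  # already consumed
        elif item["mean_days"] is not None:
            predicted_out = item["in_stock_since"] + timedelta(days=item["mean_days"])
            if predicted_out <= cutoff:
                needed.append(item["name"])  # predicted to run out within the horizon
    return needed


# Hypothetical household state for illustration.
today = date(2018, 3, 10)
items = [
    {"name": "bread", "in_stock_since": None, "mean_days": 3.0},
    {"name": "milk", "in_stock_since": date(2018, 3, 8), "mean_days": 5.0},
    {"name": "rice", "in_stock_since": date(2018, 3, 1), "mean_days": 30.0},
]
result = shopping_list(items, today)
print(result)
```

Items that are overdue (predicted to have run out before today) are also listed under this reading, since their predicted date lies before the cutoff.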

Implementation
We built a tablet-optimised web application. The website, shown in Figure 1, consisted of a tabbed interface covering the four main features: two tabs to scan in and scan out items respectively, an Essentials tab that listed items currently in stock, and those which were previously in stock, a Calendar view that showed when items were expected to run out, and an Inbox tab for collecting feedback when predictions were incorrect. The shopping list was accessible both from the main probe as a separate webpage, and through a special link that was accessible from mobile devices to allow participants to use the list while shopping.

THE STUDY
After our two-month study was approved by the university's Ethics Committee we hired a market research agency to recruit ten households from the local community. We instructed the agency to recruit a broad range of participants in the wider Nottingham area, who 'live and eat together', mostly families, couples, but also singles. We had to exclude shared homes as our probe was not designed to cope with multiple households in the same home.

Participants
Our participants' demographics are shown in Table 1. H07 was withdrawn after the interim interview due to insufficient engagement in the study, and thus was excluded from analysis.

Procedure
Prior to the commencement of the study, each household was asked to take part in a study that consisted of three interviews plus the use of our probe to track essential products in the home over two months. Each household was reimbursed in cash with £50 plus up to a further £50 based upon their use of the probe (each item scanned in or scanned out added £0.10 to the reimbursement total). During our first visit to each home, we collected demographic information; asked questions about existing shopping routines, and which items members considered 'essential' in the home. Following this, we provided an interactive demonstration of the probe. Periodically we reminded participants via SMS or WhatsApp to reply to questions sent to them in the 'Inbox' tab of our probe. We conducted an 'interim' interview after one month, focusing on the use of the probe within participants' routines, and to elicit initial feedback on the ideas of an autonomous system that predicted the cycle of essential items. At this stage we also introduced the 'shopping list' feature to participants. The final interview focused on the predictive elements of the probe including the shopping list, and factors that might have led to their routine varying through the deployment period. In this, we elicited household perceptions of which types of items they consider to be equivalent, and which they did not. Following the final interview, each household was debriefed and reimbursed.

Analysis
We employed a mixed methods approach [14] to combine our statistical analysis of data collected through our probe with insights from contextual interviews and feedback about predictions. We extracted event logs as well as computed predictions from our data, which we used to analyse participant engagement and prediction accuracy. We transcribed interviews and performed a thematic analysis, with three researchers working together. As discussed, our probe solicited users for feedback when a prediction was wrong. The 538 feedback statements collected through the 'Inbox' tab were iteratively categorised through affinity diagramming by two of the authors [34], with an experienced third author critically checking and advising on progress periodically. We strictly followed our participants' own categorisations of whether using the item up early or late can be best explained through, for example, 'routine change' or 'normal use', testing each statement against our broad categories to determine best fit.

FINDINGS
We briefly present the findings on how each household described their shopping practices, and the use of the Home Essentials probe throughout the deployment. We then examine the contingent nature of household consumption, including the (lacking) accuracy of simple predictions, and the contingencies that householders used to explain why predictions were wrong. Finally, we examine the 'equivalency problem'.

Existing shopping practices
Across our participant households, the female adult was responsible for shopping, although three homes mentioned another household member's help. The variety of types of shops included supermarkets (all), cash-and-carries (H03, H08), convenience stores (H04, H10), specialist stores (H01), and markets and independent traders (H09). All households had 'club cards' for various saving schemes. Eight went shopping multiple times per week, especially to 'top up' when items were running low. Trips would frequently take place after work/dropping children off at school (H02, H03, H04). Three households said they had a weekly routine (H05, H06, H08). Most households used lists, except H01 and H10; only H08 made use of a digital list on their phone (others used paper lists).
Regarding routines, there was a range of patterns: H03 check their stock to produce a list, H02, H05, and H08 maintain a list in a shared space (e.g. on the fridge), whereas H04 and H06 plan a week's meals before shopping; H09 and H10 use "mental lists". Some would simply buy items when they ran out (H01, H02, H04, H10), some stated they overstocked on certain items to ensure they didn't run out (H02, H03, H08, H09, H10), and some said they frequently checked whether items were needed before shopping (H03, H05, H06, H08).

Engagement with the Home Essentials probe
We logged all interactions with the probe to gauge engagement. As well as providing an insight into the use of the probe, the log also served as a point of reference for the later interviews. The probe was least used in H06 (every 1.56 days) and most in H02 (0.03 days) and H08 (0.29 days). A linear regression confirmed that larger households used the probe more frequently: the number of occupants and the mean time between interactions correlated significantly (F(1, 7) = 12.51, p = 0.010), with an R² of 0.641. A paired-samples t-test was performed to check whether engagement (in terms of number of scans) changed from the first half of the deployment to the second half. There was no significant difference in the mean number of scans for the first half (M = 172.89, SD = 78.89) and the second half.
We asked participants to input 10-20 items into the probe, but because products of different varieties and brands have unique barcodes, we recorded a total of 630 products across all households, from 1549 scan ins and 1304 scan outs. Table 2 lists the number of scan events and items (in total, with at least one cycle, and with at least two cycles) per household. The number of items is derived after grouping based on the heuristic stated above. We now consider key elements of the probe: whether and how participants made use of the predictions and the predictive shopping list.
Did participants use predictions? Predictions were presented to users in the 'Essentials' view that listed 'in stock' and 'out of stock' items, and in the 'Calendar' view that listed the dates by which items were predicted to be consumed (Figure 1). As expected, participants concurred that using the probe to track the stock level of items in the home would be too onerous to be a viable alternative to their existing routine. This routine often included 'looking and checking' which items needed buying prior to a shopping trip, echoing others who have studied shopping practices (e.g. [30]).
Some participants did make use of the stock level in the probe as an aide-mémoire: "it reminds me of what I might have in that I might've forgotten about" (H03). Yet, purchasing decisions were handled with business as usual without the probe predictions: "it doesn't replace actually walking around the store and having ideas when you have special offers or something new in just for a few weeks" (H10). When asked about the accuracy of predictions, participants felt that the probe worked better for the most regularly used items, for fresh fruits and vegetables (H05, H08), for items used quickly or consumed "on the go" (H04), or for items bought at the same place (H08). On the other hand, participants reported that predictions didn't work well for items that take longer to consume or for items that were not used frequently, e.g. lentils in large packages (H03), for toiletries, for frozen items, and for tinned food. Routine changes also had an impact on the accuracy, e.g. increased consumption due to being at home during holidays (H10). We analyse the accuracy of predictions after reflecting on the predictive shopping list.
Was the predictive shopping list used? During the interim interview, we introduced the predictive shopping list generated from participants' prior use of the probe. All households made use of the feature at least once, although there was consensus amongst the participants that the list was of limited value to them compared to their pre-existing routines (of no list, paper list, or a note stored on a mobile phone; as discussed above). In one instance the predictive list was used as a 'backstop' to identify forgotten items; a participant "didn't bring a list that time and . . . thought: oh, what else do I need?" (H05). H02 also remarked how they used it prior to a shop to check for any items they had forgotten from their paper list. On the whole, however, the shopping list was mostly not used.

Understanding household consumption
We now turn to examine the contingent nature of household consumption. First, we demonstrate shortcomings of simple predictions. Second, we unpack participants' statements why predictions were wrong to reveal the contingencies driving consumption. Prediction accuracy was computed in terms of prediction error for the 214 items that had at least two 'cycles', i.e. scanned in and out two or more times. Following the first scan out of an item, the predicted next time-to-consume was equal to the time-to-consume of the item. On each successive scan out, the prediction was updated based on the mean of all prior time-to-consumes. The prediction error is calculated as the measure of how many days early or late the scan out is compared to when it was predicted to occur (using the mean of all time-to-consumes up to, but not including the scan out).
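The error computation described above can be illustrated with a short sketch. It assumes a list of completed cycle lengths in days for a single item; the function names and the sign convention (positive meaning later than predicted) are our assumptions, and the one-day tolerance mirrors the threshold at which the probe requested feedback.

```python
from statistics import mean


def prediction_errors(cycle_days):
    """Signed error (actual minus predicted, in days) for each cycle after the first.

    The prediction for a cycle is the mean of all PRIOR cycles only, so the
    cycle being evaluated never contributes to its own prediction.
    """
    errors = []
    for i in range(1, len(cycle_days)):
        predicted = mean(cycle_days[:i])
        errors.append(cycle_days[i] - predicted)
    return errors


def needs_feedback(error_days, tolerance_days=1):
    """True when predicted and actual consumption differ by more than the tolerance."""
    return abs(error_days) > tolerance_days


# Hypothetical item consumed in 4, then 6, then 8 days.
errors = prediction_errors([4, 6, 8])
print(errors)
print([needs_feedback(e) for e in errors])
```

For the second cycle the prediction equals the first cycle's length, which matches the bootstrapping step described above; an item therefore needs at least two cycles before any error can be computed.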
How accurate were the predictions? We first consider how the probe fared across households. The average error across all products was 4.43 days, but this varied considerably between households. Figure 2 shows box plots of the distribution of prediction errors by household (whiskers denoting interquartile range). H01, a single-occupant household, had the greatest mean error of 11.29 days (σ = 6.75, N = 34). When asked about the disadvantages of using the probe, the participant remarked that she doesn't do ". . . a regular shop and it's just me, then I just do as and when" (H01). This suggests a limitation of a future system: some people's lives provide limited regularity in their shopping and consumption habits, which may lead to less accurate predictions. Conversely, H09, a family household, was the one in our small sample that engaged with the probe the most in terms of number of products, yielding a mean error of 2.23 days (σ = 2.80, N = 60). This household used a paper list of items scanned into and out of the probe, marking items off when they were consumed to ensure data validity. A linear regression also confirmed participants' statements that the accuracy was better for products used up more quickly, for the items that have at least two cycles in our data (N = 214). A significant regression equation was found (F(1, 212) = 193, p < 0.01), with an R² of 0.48, a strong correlation suggesting that 48% of the variability in the prediction error can be accounted for by the mean time-to-consume in our data.
Does accuracy increase with product cycles? To answer the question of whether prediction accuracy increases with more product cycles, we ran a simple linear regression, again on the items that have at least two cycles in our data (N = 214). A significant regression equation was found (F(1, 212) = 4.92, p = 0.028), with an R² of 0.02, meaning only 2% of the variability in the accuracy (i.e. prediction error) can be accounted for by the number of cycles in our data.
The error decreases by 0.2 days for each additional product cycle. This result can be interpreted to mean that even with 'more data' (i.e. more product cycles) the prediction accuracy improves only marginally over time; thus this result supports our premise that simplistic computational predictions of this kind are generally inaccurate.
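For readers who wish to relate the reported F statistics to the reported R² values, the standard identity for a regression with a single predictor, F = R²(n - 2) / (1 - R²), can be evaluated directly. The helper below is our own illustration (the function name is hypothetical) and takes the rounded values from the text as input.

```python
def f_from_r2(r_squared, n_obs):
    """F statistic on (1, n_obs - 2) degrees of freedom for a one-predictor regression."""
    df_resid = n_obs - 2
    return r_squared * df_resid / (1.0 - r_squared)


# Rounded values as reported in the text: nine households, 214 items.
print(round(f_from_r2(0.641, 9), 2))
print(round(f_from_r2(0.48, 214), 1))
print(round(f_from_r2(0.02, 214), 2))
```

The results are close to the reported F values of 12.51, 193, and 4.92; the remaining differences are what one would expect from R² being rounded to two or three decimals in the text.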
Why were predictions wrong? To understand the contingencies that contributed to why predictions were wrong, we prompted users to provide feedback when items were scanned out earlier or later than the predicted time. After cleaning the data of duplicate comments on grouped items, we obtained 538 feedback statements; 386 (72%) of these related to items scanned out earlier than predicted. The other 152 (28%) of the statements were obtained for items scanned out later than predicted. We now present the categories that emerged through affinity diagramming of the statements, with associated products and participants' comments by subcategory. While eight categories emerged, we only have space to show details for the five most frequent.

Table 3: Contingencies for products scanned out earlier than predicted, with households, associated products, and participants' comments by subcategory.
Sporadic events (115)
  Guests visiting (32). Products: apples, carrots, cheese, coffee, easy-peelers, egg, juice, lettuce, milk, mushrooms, soft drinks, sugar, tinned tomatoes. Comments: "Kids had a friends sleepover so had big brekkie" (bread, H03), "Had a week of staying home with friends visiting so more hot drinks than normal" (sugar, H09)
  Snacking (32). Households: H02, H03, H04, H08. Products: apples, bananas, bread, carrots, cucumber, grapes, milk, olives, pears, sweet chilli, tomatoes, yogurt. Comments: "Used for overnight porridge" (yogurt, H06), "All had chopped pear as snacks" (pears, H02), "Easy snack and used for lunch" (apples, H04)
  Cleaning day (16). Households: H02, H03, H05, H08. Products: cif, dryer sheets, kitchen towels, liquid, soap pads, washing up, wipes. Comments: "More mess to clean up with children at home!" (wipes, H08), "Used for cleaning up builders' mess" (wipes, H02)
  Shared at work (11). Households: H02, H06, H10. Products: cherry tomatoes, coffee, coleslaw, cucumber, milk. Comments: "I take coffee to work and had used it up there" (coffee, H10), "Took to work and used for lunch with colleagues" (coleslaw, H06)
Routine change (70)
  Away/Holidays (16). Households: H01, H02, H08, H09. Products: baby wipes, bananas, cider, juice, kitchen towels, peanut butter, soya milk, toilet paper, toilet wipes, yogurt. Comments: "They are being eaten by my son too as he is off school." (yogurt, H08)
  Packed lunch (16). Households: H02, H05. Products: apples, bananas, bread, cereal bars, cucumber, grapes, tomatoes, yogurt. Comments: "Lots of packed lunches" (bread, H02)
  Irregular work patterns (10). Households: H02, H04. Products: bacon, baked beans, apple, bread, bananas, yogurt. Comments: "Busy days and something quick and easy" (baked beans, H04), "Mum working, packed lunches early shift pattern sandwiches" (bread, H02)
  Seasonal changes (10). Households: H02, H04, H09. Products: bread, chicken, cucumber, kitchen, milk, olives, shower gel, towels, water. Comments: "Humid weather = thirsty household" (milk, H09), "Hot weather, more salad eaten" (cucumber, H02)
Preference (23)
  Enjoyed (5). Households: H01, H02, H04, H09, H10. Products: milk, muesli, pasta, peanut butter, yogurt. Comments: "Enjoyed it last time we had the meal and easy to cook" (pasta, H04), "I ate this in one sitting" (yogurt, H09), "We just like drinking it" (soya milk, H10)
  Favourite items (16). Households: H02, H06, H09, H10. Products: cheestrings, juice, ketchup, malties, milk, muesli, pears, sausages, yogurt. Comments: "My son drinks a lot of this so always use early" (juice, H06), "Favourites eaten at breakfast and tea" (yogurt, H02), "Daughter is addicted" (cheestrings, H09)
Location (14)
  Convenient (14). Households: H02, H09. Products: wipes and toilet tissue. Comments: "This pack kept in downstairs loo which girls use more frequently" (wipes, H02)

Table 3 shows the top five categories of contingencies for products scanned out earlier than predicted. The most cited reasons for early use really just explain 'normal use' (133). This category captures explanations of ordinary day-to-day use by our participants. The most frequent subcategories included references to use in a recipe (43), unsure why (40), using more than usual (26), and used up in batch cooking (13). 'Sporadic events' was the second-most frequent category (115).
Although these could be considered day-to-day too, participants distinguished these as irregular occurrences, such as guests visiting (32), snacking (32), cleaning day (16), and sharing at work (11). Comments explaining early use by 'routine changes' fell into the third most frequent category (70), capturing changes to routines that commonly follow a pattern, with occasional packed lunches (16), irregular work patterns (10), and seasonal changes (10) being the most cited reasons. Routine changes are a particular challenge for systems in multi-occupancy households. For example, some members may work full time and take a packed lunch every day, some work part-time or flexible hours and sometimes make a packed lunch, and kids sometimes take a packed lunch or may rely on school meals. 23 statements expressing personal 'preference' included reference to enjoyment of items (5) and favourites (16). 14 statements referred to a convenient 'location' contributing to items' early use. Not in the table are statements pertaining to the (non-)use of the probe (17), product freshness (9), and product quality (5). Overall, an increase in consumption typically accounts for earlier-than-predicted use, but a few statements relate to imprecise use of the probe (17).
Table 4 shows the five most frequent (sub-)categories of contingencies for products scanned out later than predicted. The most cited reasons explain a decrease in consumption by reference to routine change (49). Being away / on holiday accounted for the most frequent single subcategory (30), followed by illness (5) and fasting (5), as one of our families was observing Ramadan. Statements referencing normal use came second (48), including: unsure why (17), using products alongside others (9) or less than normal (8). People also referenced personal preferences (19), such as 'not wanting' items (11), or changing their mind (8).
Reasons relating to the freshness of products were also cited (17), with people deciding to freeze products to prolong their lifetime (6), and people still consuming products after their usual lifetime (6). People also referred to the location of products to explain non-consumption (7); an inconvenient location contributed to using the product less (5) and to forgetting about the product (2). Further categories not captured in the table due to space include statements on sporadic events (7), the probe (5), and product quality (1). Overall, the contingencies people bring up, and our categorisation of them, reveal and systematise the reasons why everyday use of household products is variable. The statements together are testament to the need for variety when it comes to food, echoing previous findings [57]. The following set of results sheds more light on this topic, by examining the aforementioned equivalency problem.
"Not eaten as normal as been away" (cheese, H05), "Spent two nights away from home so partner forgot to feed him" (cat food, H09)
Location (7)
Loc. inconvenient (5) | H02, H09 | salt & vinegar squares, toilet wipes | "Kept in upstairs loo which kids use less often" (wipes, H02), "Hidden at the back of cupboard" (squares, H09)
Forgotten about (2) | H04, H05 | apples, sweetcorn | "Forgot to take them to work" (apples, H04)
Which products are 'the same' as others?
There is a mismatch between how people categorise their products in terms of classes and a barcode-based approach that knows only individual items (defined by brand, flavour, and size). To recap, we followed a heuristic to group equivalent items for the participants to enable predictions on equivalent products (e.g. 'same' products, but different brand/source), but we did not go as far as to group classes of product (e.g. milk, tomatoes, etc.). We interviewed people about our grouping not only to check our assumptions, but to learn about the equivalency problem in more detail. Specifically, we picked out two items from their history and asked whether they saw them as equivalent. Generally, our heuristic approach was more conservative than people's views on equivalency. A thematic analysis of the transcribed interview data revealed four overall features individuals use to consider items as equivalent.
→ Product class. People were more lenient than our heuristic in treating products as belonging to the same class. Six households' statements suggest that people often (but not always) see items as equivalent despite variation in some of their attributes, including colour, brand, variety/type, and size. In other words, some vegetables and fruits are considered equivalent substitutes, as are some items that differ only in amount, for example: "I think grapes should be in a whole sort of grape group. I don't think the colour of them should make any difference to what it actually is . . . if we went to a shop and it was my only shop I was going into, if they didn't have red grapes I'd still probably buy green grapes" (H04). When asked about different kinds of lettuce, H08 replied: "So they're, kind of, interchangeable for me, but maybe not for someone else." Similarly, it seems that any apples are apples for H02: "Yes, those, they are different, obviously, but I don't really care which, as long as we've got some apples, but they're the two I would go for" (pink lady and braeburn apples, H02). However, there is also a clear preference for just these two kinds of apples at play.
→ Product attributes. Three households identified particular features of products that are relevant for considering items as equivalent (or not). Apart from the ones mentioned above, these include (1) purpose of use, e.g. tomatoes for snacking are different from tomatoes for daily use, and 1pt of milk can be for sharing at work while 4pts of milk is for the home; (2) source, e.g. products from the market are different to the same products from the supermarket; (3) use by date; (4) freshness of products; and (5) packaging, e.g. there is a difference between items bought from the market and the supermarket. Products from the supermarket usually last longer than items from the market: "In Tesco you would buy your grapes out of a fridge whereas on the market they're just sat there. So, to group them as one wouldn't give a true reflection in the long run . . . because the market ones . . . need to be eaten within a day or two . . . I think the ones from the supermarket are a lot fresher. They're in a proper container, . . . from the market you just sort of eat them on the day and whatever's left [you] get rid of." (H09)
→ Shop/store. Four households mentioned that the shop where items are bought affects whether substitute products that are seen as equivalent are bought: ". . . it just depends which shop I'm going to that week. So, I normally shop at Aldi, but I might go into Tesco if I hear that they've got good offers on and then just pick up a bag of carrots. So, it's . . . no difference in the product or anything like that" (H05). As in H05's statement, 'shopping around' is a common feature of shopping, which can be influenced by a shop having 'offers on' as in this case, alongside myriad other factors including location, specific product needs, etc.
→ Brand loyalty. One household explained the differences by reference to brand loyalty, such as the perceived quality of brands. Previous experience of product quality is a factor commonly invoked in the perception of item equivalency, which can be moderated by price, offers, discounts, and budget: "That's one thing with the capsules, I never buy their own brand. Because I've bought in the past and I don't like them, so I always buy a brand that I've used and I like." (H03)
Overall then, our brief foray to shed light on the equivalency problem shows that whether products are the same is contingent on a range of factors, such as whether people judge the items to be of the same product class, which shop the items were purchased from, and a host of product attributes, including brand, purpose of use, source, use by date, freshness, and packaging. Judgements of equivalency are idiosyncratic and are perhaps better left for households to decide. This has implications that we discuss in turn.
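To make the idea of leaving equivalency judgements to households concrete, the following is a minimal sketch of a household-editable grouping of barcodes. It is not the heuristic used in the probe; the class `EquivalenceGroups`, its methods, and the barcode strings are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class EquivalenceGroups:
    """Household-defined groups of barcodes treated as 'the same' product.

    Hypothetical illustration; the probe's own grouping heuristic is not reproduced.
    """
    # Maps each grouped barcode to the label of the group it belongs to.
    group_of: Dict[str, str] = field(default_factory=dict)

    def merge(self, label: str, *barcodes: str) -> None:
        """Record that the household sees these barcodes as interchangeable."""
        for code in barcodes:
            self.group_of[code] = label

    def split(self, barcode: str) -> None:
        """Record that this item is not equivalent to the rest of its group."""
        self.group_of.pop(barcode, None)

    def key(self, barcode: str) -> str:
        """Key used to pool consumption history; ungrouped items stand alone."""
        return self.group_of.get(barcode, barcode)


groups = EquivalenceGroups()
# A judgement like H04's: the colour of grapes does not matter.
groups.merge("grapes", "red-grapes-500g", "green-grapes-500g", "market-grapes")
# A judgement like H09's: market grapes should not be pooled with supermarket grapes.
groups.split("market-grapes")

print(groups.key("red-grapes-500g"))  # grapes
print(groups.key("market-grapes"))    # market-grapes
```

A design along these lines keeps the barcode as the default unit and pools histories only where a household has said so, which matches the observation that equivalency is idiosyncratic.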

DISCUSSION
The findings of our study provide an understanding of household product consumption that can be used to inform the design of IoT-based services that leverage consumption data (predictions on lifecycles, groupings of items, etc.), for example to automatically replenish products. Our findings demonstrate the challenges of simple linear predictions, although prediction accuracy is higher for more quickly consumed products. Our analysis found that prediction errors were frequent. The finding that only 2% of the variability in accuracy can be explained by the number of cycles in our data shows how important it is to consider the contingent nature of variability. The most frequent explanations for prediction errors related to 'normal use' (34%), 'routine changes' (22%), and 'sporadic events' (23%), which strongly suggests that variability, and the contingencies that drive it, is part of the usual everyday consumption of household goods. The range of factors that shape perceived product equivalency lends further weight to this finding. When it comes to food, the saying 'variety is the spice of life' really does ring true. What our probe has revealed is a spectrum of contingencies that may not have been apparent in prior work that has exposed users to predictions on aggregate data from single sources, such as electricity [2,21]. While electricity consumption is (mostly) needs-driven, food consumption, while just 'subsistence' at its most basic, is arguably more driven by experiential, sensual, and physical desires, which are personal, varied, intricate, and changeable.
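As an illustration of why such predictions are fragile, the sketch below predicts a product's run-out date from the mean of its past cycle lengths and labels a scan-out as earlier or later than predicted. This section does not specify the probe's algorithm, so the mean-based rule, the function names, the one-day tolerance, and all dates and durations are assumptions made for the example.

```python
from datetime import date, timedelta
from statistics import mean
from typing import List


def predict_run_out(scan_in: date, past_cycle_days: List[int]) -> date:
    """Predict run-out as the scan-in date plus the mean past cycle length (assumed rule)."""
    return scan_in + timedelta(days=round(mean(past_cycle_days)))


def classify(predicted: date, actual: date, tolerance_days: int = 1) -> str:
    """Label a scan-out as 'earlier', 'later', or 'as predicted' within a tolerance."""
    delta = (actual - predicted).days
    if delta < -tolerance_days:
        return "earlier"
    if delta > tolerance_days:
        return "later"
    return "as predicted"


# Hypothetical history: a product lasted 4, 5, and 6 days in its past cycles.
predicted = predict_run_out(date(2020, 6, 1), [4, 5, 6])
print(predicted)                               # 2020-06-06
print(classify(predicted, date(2020, 6, 3)))   # earlier, e.g. guests visiting
print(classify(predicted, date(2020, 6, 12)))  # later, e.g. away on holiday
```

A single sporadic event or routine change shifts the actual date well outside the tolerance, which is the kind of error participants were asked to explain.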
For the designers of predictive systems then, the variability and contingencies of product consumption are at the root of uncertainty in the system: uncertainty about when products are really used up, about preference changes or sporadic events, about when products will be replenished (restocked), about what counts as a substitute, etc. The question of interest to HCI is, what can we do by design in the face of this uncertainty? There are specific solutions, such as displaying system confidence levels to the user [58], but the aim here is to discuss what can be done strategically. Uncertainty has been a long-standing challenge for the design of interactive systems (e.g. [4,23]), and thus, we turn to the literature to reflect on suitable design strategies.

Design strategies for dealing with uncertainty
We adopt Benford et al.'s design strategies for dealing with uncertainty [4], the only framework we found that brings together design strategies with a user perspective in this space. Thus, it provides a perspective that system designers should be able to adopt to build better systems. These five strategies were proposed in the context of a location-based system, but we find them applicable to our work. The five design strategies are removing, hiding, managing, revealing, and exploiting uncertainty. We reflect and speculate, using the contingencies identified as examples, on how each applies to the design of systems that leverage these predictions to drive services such as automated replenishment of goods.
→ Removing uncertainty. One strategy is to try to remove uncertainty from the user's perspective by including more sensors, data sources, and algorithms. For example, weighing scales could allow for more fine-grained predictions through continuous sensing [10], and connecting to calendars could make those 'routine changes' available for system reasoning, for example "school holiday starting next week", or sensing "location of products" inside cabinets, fridge or containers could help with identifying items forgotten about or hidden. While Machine Learning (ML) solutions could improve the 'smarts' of the system, one of the challenges for ML-based solutions to be viable for idiosyncratic use contexts such as household consumption is to take into account the contingencies that drive consumption, such as personal preferences for substitutes, brands, variety etc. It would seem then, that many problems remain to be solved before fully removing the system uncertainty by means of ML alone.
→ Revealing uncertainty. A further design strategy to deal with the system uncertainty would be to reveal it to the user. Design solutions following this strategy would emphasise the user's responsibility to act on the uncertainty. Designers may find inspiration in related bodies of work on intelligibility and explanations in context-aware systems [33], recommender systems [52], and explainable AI [45]. This work shows, however, that providing explanations does not always make the system intelligible, and that the nature of the explanations matters [33]. In the context of an auto-replenishment system for household goods, surfacing the system's uncertainty about whether an item should be reordered may nevertheless help address the problem; for example, the system could ask "school holidays coming up, I checked your travel booking and you won't be at home next week, do we need to cancel the order, I'm not sure?".
→ Hiding uncertainty. Alternatively, the uncertainty could be hidden, for example in instructions, or in subtle suggestions, or in declarations of intent. For example, the system could state "school holidays coming, I'm going to order more yogurt and fruits". A prior study of home automation for religious reasons has shown that people were happy to surrender control of their routines to the home automation system [59]. However, it is unlikely that one could simply 'instruct away' uncertainty; prior work has shown that instructions in informal settings such as location-based games have led to tensions, misunderstandings, mistrust, deliberate disobedience, and thus an overall lack of compliance [38,53]. Although compliance may of course be different in more formal, professional, or operational settings, people at home are first and foremost accountable to the social fabric of the home, and not to some external agency [55].
→ Managing uncertainty. A further design strategy is to let the user manage the uncertainty, by enabling the user to review and intervene in system actions; for example, addressing "Preferences" for favourite items, "I've ordered extra cheestrings for Ana, let me know if you want to change it". Such an approach is common in mixed-initiative systems [27], and more recently in interactive ML [18]. An interactive system is central to this approach; related exemplars range from online settings that systems take into account when automatically switching energy tariffs [2], to a booking system so that an agent can optimise battery charging for the time-of-use [12]. These works implicitly acknowledge that bringing the user's reasoning into the loop is the key strategy to respond to the contingencies of everyday life.
→ Exploiting uncertainty. The idea that uncertainty, or ambiguity, can be exploited in creative ways in design has been an influential one in HCI [23]. As the focus of design in HCI has moved from the workplace to less formal environments, design values have broadened from utility, accuracy and performance, to include values such as experience, creativity, and surprise. While the notion of automation in the home may resonate with a utilitarian perspective, this does not need to be its exclusive focus. What if, for example, the system simply suggested new recipes based on available items? Prior work found participants enjoyed the experience of receiving unknown items in a vegetable box, motivating them to experience new flavours, new recipes, or be creative with their existing repertoire [57]. In a similar way, a service providing automated replenishment of household goods may, to some extent, provide unexpected items, potentially contributing to joyful experiences of surprise that lead to creative use.
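The hiding, managing, and revealing strategies above could, speculatively, be selected per reorder decision according to how variable a product's past cycles have been. The sketch below is one hypothetical way to do so; the function `choose_strategy`, the use of the coefficient of variation, and the thresholds (0.2 and 0.5) are illustrative assumptions, not findings of the study or part of Benford et al.'s framework.

```python
from statistics import mean, pstdev
from typing import List


def choose_strategy(cycle_days: List[int], routine_change_ahead: bool = False) -> str:
    """Pick how to handle one reorder decision; thresholds are illustrative assumptions."""
    if len(cycle_days) < 3:
        return "reveal: too little history, ask the household"
    if routine_change_ahead:
        # e.g. a calendar entry for a holiday, as in the 'routine changes' category.
        return "reveal: routine change ahead, ask before ordering"
    # Coefficient of variation: spread of cycle lengths relative to their mean.
    cv = pstdev(cycle_days) / mean(cycle_days)
    if cv < 0.2:
        return "hide: order and state the intent"
    if cv < 0.5:
        return "manage: order, but invite the household to amend"
    return "reveal: consumption too variable, ask before ordering"


print(choose_strategy([5, 5, 5]))        # hide: order and state the intent
print(choose_strategy([3, 5, 8]))        # manage: order, but invite the household to amend
print(choose_strategy([1, 5, 14]))       # reveal: consumption too variable, ask before ordering
print(choose_strategy([5, 5, 5], True))  # reveal: routine change ahead, ask before ordering
```

Removing and exploiting uncertainty are left out of the sketch because they concern which data sources a system has and which experiences it aims for, not a per-item decision.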

Reflections on the probe
Our findings on engagement with the probe show higher prediction accuracy for those who use the system more regularly; we found it worked better for households with more than one person, and even better for households with children. Regular engagement with a tracking system is important as it affects the accuracy of the predictions, which highlights the trade-off between accuracy and cost (e.g., in terms of time spent). An analysis of engagement over the 2-month period confirms that engagement did not decrease significantly; however, our findings are limited in that we employed probe-specific measures such as reminders and payments-per-scan. Thus, future work to develop viable tracking solutions is likely to encounter user engagement as a key challenge. Designers would benefit from knowing their audience, incentivising regular use, and communicating the system's scope and limitations. Our findings also confirmed that scanning barcodes would be too onerous a way to track products, suggesting that prior efforts to develop alternative 'sensing' approaches should continue, such as scanning shopping receipts [36], and placing cameras in bins [50] and the fridge [22]; initiatives could also build upon supermarket loyalty card datasets (while safeguarding privacy).

CONCLUSION
This paper has presented a mixed-methods technology probe study in ten homes to track the consumption of home essentials. Our findings reveal the underlying contingencies of everyday life that shape the 'cycle' of household goods in the home. Our analysis has identified eight categories of contingencies participants used frequently to explain why items were used up earlier or later than predicted by a simple linear algorithm; these categories were routine changes (e.g. holidays), sporadic events (e.g. guests visiting), preferences (e.g. enjoyment), location (e.g. forgotten about), normal use (e.g. batch cooking), probe (e.g. forgot to scan), product quality (e.g. different brand), and freshness (e.g. frozen). We also found that whether products are seen as equivalent to others is contingent on a range of factors including product class, where the product was bought, and a host of product attributes. A take-away is that these contingencies drive the variability that is part and parcel of everyday product use and thus need considering in design. We discussed the implications of the fundamental uncertainty these contingencies create for systems aiming to predict people's consumption. We suggest and outline five strategies, namely removing, hiding, revealing, managing, and exploiting uncertainty, that designers can adopt to develop proactive IoT services for the home, such as automatic replenishment of groceries.