To Please in a Pod: Employing an Anthropomorphic Agent-Interlocutor to Enhance Trust and User Experience in an Autonomous, Self-Driving Vehicle

Recognising that one of the aims of conversation is to build, maintain and strengthen positive relationships with others, the study explores whether passengers in an autonomous vehicle display similar behaviour during transactions with an on-board conversational agent-interface; moreover, whether related attributes (e.g. trust) transcend to the vehicle itself. Employing a counterbalanced, within-subjects design, thirty-four participants were transported in a self-driving pod using an expansive testing arena. Participants undertook three journeys with an anthropomorphic agent-interlocutor (via Wizard-of-Oz), a voice-command interface, or a traditional touch-surface; each delivered equivalent task-related information. Results show that the agent-interlocutor was the most preferred interface, attracting the highest ratings of trust, and significantly enhancing the pleasure and sense of control over the journey experience, despite the inclusion of 'trust challenges' as part of the design. The findings can help support the design and development of in-vehicle agent-based voice interfaces to enhance trust and user experience in autonomous cars.


INTRODUCTION
Autonomous, self-driving vehicles are expected to revolutionise everyday travel with anticipated benefits of improved road safety, efficiency, comfort and mobility. First experiences are likely to be in driverless 'pods' that operate in contained, 'geo-fenced' environments (university campuses, airports etc.) [1], with several examples already in existence. Nevertheless, major concerns have been expressed regarding the public's willingness to adopt the technology [2], in particular, relating to issues of trust and the overall user experience [3,4].

Trust and Driver Acceptance
Trust in technology is considered to be the extent to which people believe that technology will perform effectively and without a negative or injurious outcome [5]. Trust therefore shapes an individual's attitudes and ultimately determines their behaviour, such as their intention to use the system [6,7], the extent to which they rely upon the technology and their operational strategies towards its use [8,9]. Intertwined with trust is the concept of acceptance. Acceptance has been defined in the automotive domain as "the degree to which an individual incorporates the system in his/her driving, or if the system is not available, intends to use it" [10] (p.18). The determinants of drivers' trust and acceptance are thus complex and interrelated, and derive from various factors, including the individual's understanding of the system limits and the context in which it is implemented [11].
Various theoretical models have been proposed to define and evaluate acceptance (for example, the Technology Acceptance Model, TAM) [11]. While these were originally developed in the information technology domain, they have been widely adapted for other contexts, such as driving, with most models now incorporating additional factors. In the context of autonomous driving, relevant factors include: the degree to which users can predict and understand the operation of the vehicle (system transparency) and the degree of user perception on the performance of the vehicle or technological component (technical competence) [12]. In addition, factors such as reliability and dependability can affect trust and acceptance in relation to automated vehicle technologies: reliability is defined as the ability of a device or system to perform a required function under stated conditions for a specified period, whereas dependability refers to the frequency of automation breakdowns or errors [13]. Errors (such as false alarms) can therefore have a profound effect on users' perception of reliability and dependability. For example, a reduction in the occurrence of errors notably increases drivers' perception of the reliability of the system and positively impacts trust development [14].
Privacy and security concerns have also been highlighted as potential barriers to the trust and acceptance of autonomous vehicles. Privacy is linked to the handling of personal data, for example, ensuring that the user knows which data are being collected and how they will be used [15], and typically centres around two aspects: unauthorized access due to security breaches or the lack of internal controls, and the risk of secondary use, that is, the re-use of personal data for unrelated purposes without the user's consent [16]. In contrast, security refers to the technical guarantees that ensure measures against threat of intentional attack on systems/software, and the legal requirements and good practices with regard to privacy, are met [15].
A common ground in trust and acceptance research is that human behaviour is not determined by objective factors, but rather by the user's subjective perceptions, based on their individual attitudes, expectations and experience [7]. Thus, even a well-designed system that evidently performs effectively and without inflicting a negative or injurious outcome, may not necessarily warrant a user's trust or acceptance. As such, we look for guidance to social psychology (from which our understanding and operationalisation of trust-in-technology originate). Here, trust is defined as: "a psychological state comprising the intention to accept vulnerability based upon positive expectations of the intentions or behaviour of another" [17] (p.395). In other words, trust is a belief by a person in the integrity of another, and centres around 'human' factors such as benevolence and honesty [18]. In practice, this means that humans calibrate trust in another by making attributions based on personal qualities and characteristics, often identified through speech and conversation.

Speech, Conversation and Talking Technology
Philosophical debates identify speech as one of the quintessential markers of humanness [19]. It is the primary means of social identification amongst humans, and implicates more parts of the brain than any other function [20]. Speech is peppered with salient, socially-relevant cues, above and beyond the lexical content, that humans quickly become experts at extracting and comprehending based on vocal characteristics such as pitch, cadence, speech rate and volume: these are subsequently used to provide systematic guidance for determining gender, personality and emotion-specific actions, such as who to like and trust [21,19].
When formed as conversation ("any interactive spoken exchange between two or more people"), speech serves many purposes, which can be broadly categorised as either transactional (task-based) or social (interactional) [22]. Although these may overlap within natural conversation [23], transactional conversations pursue a practical goal, whereas the more social features of conversation aim to build, maintain and strengthen positive relationships with the other interlocutor(s) [24,25]. Social conversation therefore includes aspects such as greetings and small talk that can help develop common ground [26], trust and rapport between interlocutors [23].
A common proposition in HCI is that humans appear to lack the wherewithal to overcome instinctive behaviours, and interact with a talking computer in a similar manner to talking to another human, demonstrating humanlike behaviours and making similar attributions [19]. For example, different digital 'personalities', created by varying the vocal characteristics and language content of spoken language interfaces, have been shown to influence trust, performance, learning and even consumers' buying habits during research studies [19]. Similar effects have been noted in the automotive domain, with participants recognising unique 'personalities' associated with different voices employed to deliver navigational instructions, even though the content remained the same: this influenced their attitudes towards the navigational device, including how much they liked it, their preferences for use, and the level of trust that they associated with it [27]. In social robotics, conversational interaction has been highlighted as an important factor in ensuring long-term rapport building and use [28]. The accommodation of social conversations therefore appears to be a critical factor in developing trust in a social agent [29].

Anthropomorphism
Attributing human motivations, characteristics or behaviour to inanimate objects, such as talking technology (as observed in the aforementioned examples), and building expectations on the basis of this, is evidence of anthropomorphism [30]. Although the term has been used pejoratively in science when novelty features (such as a human face) have been added to non-human entities, anthropomorphism is not simply the titivation of artefacts with superficial human features and characteristics, but rather the process of inductive inference whereby people are inspired to believe (at some level) that the artefact has capacity for rational thought (agency) and conscious feeling (experience) [31]. This is typically inspired by people's experience of these features and characteristics (voice, social behaviour etc.) in humans.
Indeed, passengers of an autonomous vehicle that was anthropomorphised (given a human name, gender and voice) rated their vehicle as having more humanlike cognitive capacities than those who occupied a vehicle with the same autonomous features but without the associated anthropomorphic cues [32]. Participants in the study also reported trusting their vehicle more, were more relaxed in an accident, and blamed their vehicle and related entities less for an accident caused by another driver [32]. In addition, increases in the level of trust in automation were noted following the addition of anthropomorphic features (voice and gender) to an existing audio-visual Human-Machine Interface (HMI) during conditionally-automated driving [33]. In this simulator-based study, the speech interface explained what the vehicle was intending to do during handovers. The value of using anthropomorphism to improve system transparency (that is, what the vehicle is doing and why) was also highlighted by Miglani, Diels, and Terken [34] who predicted associated increases in the level of trust.
Other authors, notably Eriksson and Stanton [35], have postulated the importance of employing a conversational agent HMI (their so-called "chatty co-pilot"), drawing upon the aviation literature, as an effective and natural means of providing calibrated trust to human users of a conditionally-automated vehicle. The suggestion here is that the chatty co-pilot encourages appropriate levels of trust relevant to the system capability and reliability, thereby ensuring the system is used (appropriately) and complacency effects are minimised.

Study Aims and Scope
Evidently, there is considerable research exploring the use of human characteristics, especially conversational speech, within automotive HMIs to leverage human qualities and capabilities and embody these within the technology. This has been inspired to a large extent by the recent proliferation and popularity of personal devices like Amazon Echo and Google Home. While the research has generally revealed positive effects associated with talking technology during simulated driving (to enhance the subjective evaluation of the vehicle, and improve users' behaviour and performance), this study is the first to investigate how an anthropomorphised agent-interlocutor employing natural, conversational language could influence trust and the user experience in a genuinely autonomous 'pod' vehicle. It builds on the work of Antrobus et al. [36], who explored the use of a natural language interface to improve the trust and acceptance of an autonomous taxi in a driving simulator. It is worth highlighting at the outset that this is not an evaluation of technology that has been deliberately designed to hold emotional intelligence or influence users' emotions. Instead, it is a controlled experiment motivated by the psychological construct of anthropomorphism, which is subsequently used as a theoretical basis to interpret the observed behaviour. The study therefore aims to expose natural, instinctive behaviours and opinions that were motivated by the presence of a voice, and the interaction style it affords, and it is therefore also highly relevant to current trends and research interest in conversational user interfaces.

Use-Case and Script Development
Prior to undertaking the driving study itself, a series of focus groups were conducted to determine the content and delivery of the 'anthropomorphic agent' dialogue, and to explore any concerns that people may have when using such an interface in an autonomous vehicle. In light of the findings from the focus groups, and in consideration of the limitations of the experimental setup, we decided to focus on three use-cases: entertainment (news, sporting headlines, music etc.), office/scheduling (managing and creating to-do lists, calendars and emails) and system notifications (requests for vehicle diagnostics etc.). For each of these use-cases, we devised a variety of predetermined opening gambits and responses in order to form the basis of the anthropomorphic agent's conversational exchanges, and the style with which to deliver these.

Participants
A representative sample of experienced drivers were recruited to take part (n=34), comprising 17 male and 17 female participants. Ages ranged from 21-58 years with a mean age of 40, driving experience from 3.5-40 years with a mean of 20.6 years, and self-reported annual mileage from 6,000 to 60,000 with a mean of 15,691. All participants were employees of Jaguar Land Rover (JLR), but primarily from non-technical roles, i.e. administrative/support.

Apparatus
The study was conducted in the Urban Development Lab (UDL) indoor testing environment in Coventry using a driverless pod supplied by the RDM Group (Figure 1). The pod operated fully-autonomously throughout the study. The area was presented as an urban scenario, with shop fronts and similar features projected onto timber constructions and perimeter screens to emulate commercial premises. The layout enabled multiple routes to be followed. During the study, participants were recorded using a GoPro camera for subsequent analysis, as well as a second camera for immediate streaming and observation. A small booth on the edge of the testing area housed a professional actor, who delivered the anthropomorphic agent dialogue in real time, for example in response to conversation initiated by participants, using a Wizard-of-Oz approach [37]. Participants were told that they were interacting with a prototype highly-capable, conversational interface; they were not aware that they were conversing with another human. In addition, a Microsoft Surface touch-screen tablet with a bespoke PowerPoint presentation was installed in the vehicle, although this was not used with the conversational user interface.

Experimental Design
To ensure a thorough investigation, three interfaces were evaluated in a counterbalanced, within-subjects design: touchscreen, voice command and the anthropomorphic agent interlocutor. The voice command condition was included to explore whether any observed differences were as a result of the communication modality (voice versus touch) or due to the 'anthropomorphic' nature of the speech interface. Each interface provided equivalent task-related information relevant to the use-case under examination at set intervals throughout each drive, although the scope to engage in further interactions naturally differed between interfaces.

Touchscreen
A touchscreen interface was installed in the centre of the pod. Participants were able to interact with this using a bespoke PowerPoint presentation navigable via embedded hyperlinks, to appear as a fully-functioning, interactive HMI.

Voice Command ('AutoCab')
Participants were required to interact using specific voice commands relevant to each use-case. System responses were delivered in real-time by the actor using a Wizard-of-Oz approach. However, the actor was restricted in the range of responses available, and was instructed only to respond to correctly formatted commands (as would be expected in commercially-available voice-command technology). To aid participants, the range of usable commands was provided as a static, visual prompt on the touchscreen, which also remained present during this drive.

Anthropomorphic Agent-Interlocutor ('UltraCab')
Participants were able to interact with the anthropomorphic agent using free-flowing, conversational language. System responses were conversational in nature, based on the script informed by the focus groups, and also included more personable qualities (use of the first-person pronoun, participant's name, politeness etc.) in line with findings from both the focus groups and previous work [38,32]. Responses were composed and delivered in real-time by the actor using the Wizard-of-Oz approach.

Procedure
Following an initial briefing and the collection of demographic and background data (including anthropomorphic tendency [39]), participants completed three drives, with each drive relating exclusively to a different HMI. Each drive lasted approximately eight minutes and followed one of three different routes (HMIs and route-selection were counterbalanced between participants). To explore issues of reliability, participants were presented with a 'trust challenge' towards the end of each journey, inspired by the methodology employed by Antrobus et al. [36]. For example, the pod abruptly stopped, claiming to have detected a pedestrian in the road ahead, and informed the participant that the journey would resume once the pedestrian had moved; in practice, the roadway ahead was clear. Security and privacy 'challenges' were also apparent throughout each journey in the form of requests to access participants' personal email accounts and diaries etc.
Following each drive, participants completed the trust-in-automation rating scale [5]. To measure the user (affective) experience, they completed the Self-Assessment Manikin (SAM) [40]. The SAM is a non-verbal, pictorial assessment technique that operationalises user experience through the constructs of pleasure, arousal and dominance (Figure 3). Finally, participants rated the perceived anthropomorphism of each interface [41].
After experiencing journeys with all three HMIs, participants were invited to undertake a fourth drive using the HMI of their choice (later interpreted as an objective indication of preference): no 'trust challenges' were present during this 'free-choice' drive. In addition, participants ranked the three HMIs in terms of their overall preference. Finally, a structured, post-study interview was conducted to elucidate participants' ratings. Each trial lasted approximately 1½ hours, including briefing and debriefing.
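The counterbalancing of HMIs and routes described above can be illustrated with a short sketch. The exact allocation scheme is not reported here, so the following is only one plausible approach under stated assumptions: full permutation counterbalancing of the three HMI orders, cycled across participants, with the route order offset on each pass. The route labels and the function name `build_schedule` are hypothetical.

```python
from itertools import permutations

HMIS = ("touchscreen", "voice command", "anthropomorphic agent")
ROUTES = ("route A", "route B", "route C")  # hypothetical labels, not from the study


def build_schedule(n_participants):
    """Assign each participant an HMI order and a route order.

    Assumption: the six possible orders are cycled through in turn, and the
    route order is shifted by one on each full pass so that HMI/route
    pairings vary across participants.
    """
    hmi_orders = list(permutations(HMIS))      # 3! = 6 possible orders
    route_orders = list(permutations(ROUTES))  # 3! = 6 possible orders
    schedule = []
    for p in range(n_participants):
        hmi_order = hmi_orders[p % len(hmi_orders)]
        offset = p // len(hmi_orders)
        route_order = route_orders[(p + offset) % len(route_orders)]
        # Each drive pairs one HMI with one route.
        schedule.append(list(zip(hmi_order, route_order)))
    return schedule


schedule = build_schedule(34)
for participant_id, drives in enumerate(schedule[:3], start=1):
    print(participant_id, drives)
```

Because 34 is not a multiple of six, a scheme of this kind cannot be perfectly balanced: four of the six HMI orders would be used six times and the remaining two five times.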

RESULTS
A repeated-measures ANOVA with post-hoc pairwise comparisons was conducted for each measure, unless otherwise stated. Post-study interviews were transcribed and analysed in a systematic qualitative interpretation using inductive thematic analysis [42]. In addition, conversational exchanges with UltraCab were transcribed and analysed, although this is reported elsewhere (see: [43] for details).

Trust-in-Automation
Responses to Jian et al.'s [5] trust in automation scale indicated significant differences in the level of Trust that participants placed in the different HMIs (Figure 4) (F(2,66) = 11.80, p < .001, ηp² = .26). Pairwise comparisons revealed that the trust placed in the anthropomorphic agent (mean: 67.1) was significantly higher than trust placed in the voice command (mean: 63.2, p = .041) and touchscreen interfaces (mean: 54.9, p < .001). Significant differences were also observed between the voice command and touchscreen (p = .007).

Self-Assessment Manikin: Pleasure
There were statistically significant differences between all three conditions for ratings of Pleasure (F(2,66) = 33.3, p < .001, ηp² = .50) (Figure 5). Pairwise comparisons revealed that mean ratings for Pleasure were highest (most positive affect) in response to the anthropomorphic agent (mean: 4.5), compared to the voice command (mean: 3.8, p = .009) and touchscreen interfaces (mean: 2.5, p < .001). Significant differences were also observed between the voice command and touchscreen (p < .001).

Perceived Anthropomorphism of the Interfaces
Statistically significant differences were also noted for the Perceived Anthropomorphism of the interfaces (F(2,66) = 56.2, p < .001, ηp² = .63), with the anthropomorphic agent rated significantly higher than both the touchscreen and voice command interfaces (mean ratings: 91.4, 77.9 and 57.6, respectively, all p < .001) (Figure 8).
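As a consistency check on the reported statistics, the effect sizes can be related to the F values. For a one-way repeated-measures ANOVA with k conditions and n participants (sphericity assumed), the degrees of freedom are (k - 1, (k - 1)(n - 1)), and partial eta squared can be recovered as F·df_effect / (F·df_effect + df_error). The sketch below applies this to the F values reported above; the function names are illustrative, and no sphericity correction is assumed since none is reported.

```python
def rm_anova_df(k_conditions, n_participants):
    """Degrees of freedom for a one-way repeated-measures ANOVA (sphericity assumed)."""
    df_effect = k_conditions - 1
    df_error = (k_conditions - 1) * (n_participants - 1)
    return df_effect, df_error


def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Three HMIs, thirty-four participants.
df_effect, df_error = rm_anova_df(k_conditions=3, n_participants=34)

# F values as reported for each measure.
reported_f = {
    "trust": 11.80,
    "pleasure": 33.3,
    "perceived anthropomorphism": 56.2,
}

effect_sizes = {
    measure: round(partial_eta_squared(f, df_effect, df_error), 2)
    for measure, f in reported_f.items()
}

print(df_effect, df_error)
print(effect_sizes)
```

Under these assumptions the degrees of freedom are (2, 66), and the recovered values of .26, .50 and .63 agree with the reported partial eta squared for each measure.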

Preferences
Given the ability to choose, thirty participants (88%) chose the anthropomorphic agent to accompany them during their final, 'free-choice' drive. In addition, rankings indicated a strong preference for voice interfaces in general, and the anthropomorphic agent in particular. A pairwise rank analysis confirmed a clear overall preference for the anthropomorphic agent (Figure 9).

Post-Study Interviews: Thematic Analysis
The following six themes emerged during the analysis of the post-study interview transcripts. These are presented and discussed below.
[…] ("… it made you feel in control" P23), though it was also noted that the feelings of control did not necessarily transcend to the vehicle itself: one participant mentioned that they were unable to stop the vehicle by asking UltraCab, for example (this was in fact true for all interfaces). Those participants who felt more in control with the touchscreen or voice command interface commented that they understood how these operated (i.e. were able to quickly form an accurate mental model).

"I think once I'd figured out the touch screen a bit more I felt a bit more in control of that." P20
Conversely, difficulty in forming a coherent mental model for the anthropomorphic agent (for example, how it arrived at each decision or formulated each response) was seen as a limiting factor by those participants who were more critical of this particular HMI. This is likely to be influenced by factors such as system transparency, which was also mentioned by some, and already highlighted as a potential barrier to trust and technology acceptance [11].
Challenging Trust. Trust was frequently mentioned during the post-study interviews, particularly in relation to the trust challenges. Although several participants recognised the need to develop trust over time ("I think trust is gained by experience, so the more it happens the more trust is built up" P27), the anthropomorphic agent appeared to attract inherently higher levels of initial trust, from the outset, even despite the erroneous declaration:

"I trust it, I trusted [UltraCab] from the first go round to be honest, it told me why it stopped" P21
As part of the experimental design, all HMIs provided the same explanation following the stoppages. It is therefore possible that participants were more reassured by the agent's explanation because they felt that they could question it to gain clarification (given the dubious nature of the stoppage), though there was little evidence within the conversational exchanges (also transcribed, see: [43]) that any participants actually sought further explanation or clarification. An alternative explanation is that an element of trust had been established through the social aspects of the conversation [22]. Indeed, some participants felt that the agent exclusively warranted blame (in that it had contravened their shared

"…that made me not trust it [UltraCab], because I was getting an error message that didn't relate to the conditions, and it made it feel less reliable… It felt like this isn't working properly." P30
"[UltraCab] was the most frustrating probably because your eyes were telling you something and the system was telling you another." P20
Nevertheless, some participants were more forgiving of the technology generally ('better safe than sorry'), irrespective of the manner in which information was delivered:
Privacy and Security. Similar to the findings of the earlier focus groups, many participants expressed concerns about the privacy and security of their personal information. These included concerns about financial information, targeted advertising, knowledge of their home address and potential security breaches made by the autonomous vehicle: interestingly, such concerns were targeted mostly at the anthropomorphic agent, suggesting perhaps that participants felt it knew them better, given the conversational exchanges that had taken place. Participants also indicated that they would be reticent about sharing their personal information (through conversation) had they been travelling in a communal vehicle, compared to had it been their own personal transport:

"I have one concern and that's for sure, and that's the security aspects of it." P19

"What's not clear to me is when you go in, what's it got access to." P30
Even so, privacy and security of information were not necessarily seen as barriers, although participants indicated that they would have remained in control of which data they shared (and how they shared it):

"I would have programmed in things, like my emails…I would have programmed in my reminders and it will remind me of those things, but I'll still be very much in control of them." P27
Participants therefore tended to concur that the role of UltraCab was as an assistant, meaning it could conduct tasks on their behalf. This suggests that participants wanted to maintain agency and responsibility over decision-making (and the sanctity of their data), but delegate routine task execution to the agent.
On the Edge of the Uncanny Valley. Several participants expressed concerns associated with the anthropomorphic agent attempting to deceive them into believing it was a human, when it was not; moreover, this had a negative effect on users' perception of the development of trust:

"I think anything it did deliberately to build my trust would have the opposite effect. An attempt to pretend to be anything other than what it is, i.e. trying to be more human, is not necessarily a good thing in my view." P32
This raises potential concerns associated with the agent being viewed as too human: the so-called uncanny valley effect. This can result in a sense of eeriness and suspicion due to a perceptual tension arising from mismatched stimuli and causing incongruence between users' expectations of a system and its actual capabilities [44]. This has already been recognised as an area of concern associated with intelligent conversational agents, with researchers proposing potential language-based solutions [45,46].
Natural and Easy to Use. Several participants recognised the benefits of using natural language as an interaction mechanism, commenting that it was quick and intuitive, and that they did not need to learn a new HMI or technique:

"The conversation was better, I think because often you can't remember how you're supposed to say a command…I could interact just by talking, it felt a lot more natural seemed to work well." P26
The benefits of using verbal interactions more generally (i.e. including the command-based interface) were also recognised:

"I thought the voice command was very good, it's much easier to interact and find information that you'd need whilst staying aware of what's going on around you." P19
Nevertheless, some participants felt that conversational interactions with UltraCab required more effort. This may reflect the fact that using spoken conversation inspired participants to actively engage in the co-creation of common ground, trust and rapport (as they would with another human conversational partner), rather than just the transactional (functional) aspects required by the interaction. Conversely, a single button press on the touchscreen, for example, was quick and easy and required no emotional attachment or commitment. Even so, participants generally agreed that the anthropomorphic agent provided the most 'natural' interaction. In particular, being able to express the same information in different ways, and the ability to build and maintain common ground and mutual understanding within the interaction (so that this could be referenced in later interactions) were actually seen as most useful; again, this reflects common practices employed during human-human conversation [26].
The Personable Touch. Participants frequently commented on the 'personable' aspects of the anthropomorphic agent ("I think it brought together a slightly personal touch, without sounding too automated." P30), and recognised that these were inspired by the use of conversational dialogue, specifically mentioning the use of social etiquette (politeness, apology, etc.). Comments suggest that participants believed this enhanced the perception of the agent as being 'friendly', 'helpful' and 'intelligent'. Moreover, these qualities were seen to reinforce trust, comfort and even companionship:

"I trusted it more…when it prompted questions and was a bit friendlier in its answers…I instantly felt more comfortable." P34
"you could ask questions, also felt less lonely, because you're in there by yourself, you could relax" P23
"I think he is helpful and makes the journey pleasant, so it doesn't feel so, like you are in a machine. So it feels more intelligent, you can trust it." P31
Similar observations have also been made during studies in social robotics, in which people spoke openly about personal matters with a conversational robot, using it as an emotional outlet that reduced feelings of loneliness [47].

DISCUSSION
The study explored the efficacy of using an anthropomorphic agent employing conversational language to engender trust and enhance the user experience in an autonomous, self-driving 'pod' vehicle. The approach was motivated by our understanding of the role of conversation in human-human interactions, which is used in part to build trust and maintain and strengthen a positive relationship with the other interlocutor(s) [24,25]. Overall, results show that the anthropomorphic agent-interlocutor was the most preferred interface. It invited the highest ratings of trust, and significantly increased the pleasure and sense of dominance (or control) over the journey experience. Ratings associated with 'arousal' were contrary to initial expectations, suggesting (on face value at least) that both the touchscreen and voice command interface were more 'exciting' and 'engaging' than the anthropomorphic agent. Nevertheless, while the pleasure and dominance scales have distinct positive and negative valences ('happy' versus 'sad', and 'under control' versus 'in control'), the semantic anchors utilised by the SAM [40] arousal scale lack unique associations. For example, the first picture (rating 1) shows an individual who is very calm (Figure 3). As such, this could be interpreted as 'relaxed', but 'bored' or even 'lazy' may be equally applied. In contrast, the last picture (rating 6) shows an individual who is literally 'bursting' with arousal. Thus, interpretations associated with a 6-rating could include extreme emotional states of 'excitation' and 'euphoria', but this could equally be interpreted as severe rage, agitation or anger. It is therefore feasible that the low ratings associated with the agent suggest a positive result in that participants were more relaxed when interacting using conversation, and conversely, that participants were highly agitated when interacting with the touchscreen. This is important in a driving-related context, particularly if drivers may be required to take control of the vehicle at some point.
As expected, participants associated higher levels of anthropomorphism with the agent. Indeed, the 'personable' aspects were a strong feature of the post-study interview, with participants positively reinforcing elements such as the use of social etiquette (politeness, apology, etc.). Participants also commented that they found the free-flowing conversational nature of the agent (as opposed to the strict voice-command approach) to be more 'natural', enabling them to simply say what they wanted, rather than having to interact in a predefined or command-based manner. Participants also commented that using conversational exchanges made the interaction friendlier and more pleasant, and that the anthropomorphic agent could therefore potentially provide companionship on longer journeys which may otherwise feel isolating. Nevertheless, some participants notably commented that interacting with the natural language interface required more effort. This is thought to be due to the perceived additional effort to engage with elaborative, contextual social talk (to build common ground, trust and rapport) that is seen in human conversation [22,23].
The significantly higher reported levels of trust associated with the anthropomorphic agent are likely to be a factor of the perceived humanness, or anthropomorphism, as also observed by Waytz et al. [32]. It is suggested that this helped to overcome potential reliability issues, such as the 'trust challenges' explored as part of the experiment. Nevertheless, it is also worth noting that these challenges attracted concerns about 'deception' more generally, and this was relevant for all interfaces. While this is perhaps unsurprising, given the nature of the perturbation, it is interesting to note that this generally had less of an impact on trust when using the anthropomorphic agent, suggesting that people are more likely to accept such fallibility from an entity that they perceive to be more humanlike; there was even evidence of participants blaming the agent.
Deception was also mentioned in the context of the anthropomorphic agent pretending to be human or appearing too humanlike. This was implicit in the design of the HMI and its loquacious style of interaction, and is an interesting irony, given the Wizard-of-Oz methodology employed here, whereby a human was actually impersonating the technology. However, it is also a word of warning to designers of conversational user interfaces that care must be taken to ensure that these do not descend into the 'uncanny valley' [44,45], as this can result in deleterious effects on human perception and performance, and encourage inappropriate assertions of trust and reliance.
A potential criticism associated with the anthropomorphic agent (raised by several participants) was a lack of system transparency, for example, understanding how it arrived at each decision, calculated risk or formulated responses. This difficulty in establishing a coherent mental model for this HMI may have contributed to feelings of being out of control among these participants (although the agent notably attracted the highest ratings of control/dominance overall). Many participants also expressed concerns about the privacy and security of their personal information. These included concerns about financial information, targeted advertising, knowledge of their home address, and potential security breaches (or 'hacking') of the autonomous vehicle. Participants also indicated that they would be less willing to share information with the vehicle when undertaking a journey shared with other passengers. Even so, participants expressed a strong desire that the system learn their preferences, for example their tastes in music, and perform the functions of a personal organiser, thereby improving the overall user experience. Overall, participants were therefore not entirely unwilling to share information with the autonomous vehicle (to achieve this goal), although they did indicate that they would feel much more comfortable doing so if they had greater knowledge of how their data was being collected and used. This has clear implications for trust.
Although the study revealed positive benefits associated with using an anthropomorphic agent-interlocutor to support passengers in an autonomous pod (compared to a touchscreen and voice-command interface), care should be taken when generalising from the findings. For example, while the pod did indeed operate autonomously, the experience was rather restrained, in that the pod travelled very slowly (at a fast walking pace), braking could be abrupt, and the physical design of the vehicle restricted participants' vision. In addition, the overall journey experience was limited (there were no other vehicles present, for example) and participants could exert no control over the pod: they could not modify the route or stop the vehicle. This could potentially have influenced their subjective ratings of the experience. Indeed, some participants indicated that their ratings of trust may have been different had the pod travelled more quickly and in the presence of other traffic.
In addition, although all three interfaces delivered the same task-related information for each of the use-cases, the anthropomorphic agent arguably offered greater interactivity, in that participants were able to engage in conversational dialogue to seek further clarification, request music tracks etc. While this is in fact a particular benefit of employing this type of HMI over the more constrained and task-oriented experiences offered by traditional touchscreen or voice-command systems, the number of interactions and the information exchanged during the study may therefore have differed between conditions, and this could have affected comparative ratings. Nevertheless, the decision and propensity to engage further with the agent was at the behest of each participant. As a consequence, individual differences (age, gender, personality etc.) and cultural differences may have influenced their propensity to engage; these factors could be explored in future work. In addition, recruiting a broader range of participants (i.e. not limited to employees of Jaguar Land Rover) would be beneficial in future work, although this was a restriction imposed during the current study.
Finally, although the agent's responses were guided by a script that incorporated appropriate language and phrasing (informed by the focus groups and previous studies), no restrictions were placed on how much elaboration was possible. Moreover, our 'wizard' was instructed to respond to all enquiries, avoiding clinical, out-of-domain responses, such as, "Sorry. I don't understand". As such, the interface exceeded current state-of-the-art agent-interlocutors (such as Alexa, Siri and Google Home), which promise much through their implied humanness, but fall short of the reflexive and adaptive interactivity that occurs in most human-human conversation [48]. However, this was an important part of the experimental design to ensure that the results support the long-term development of human-agent interaction, rather than simply commenting on current limitations of the technology; even so, it may have discombobulated some participants.

CONCLUSION
While other scholars have explored the effects of using a speech-based interface during simulated driving, this is the first study to consider the impact of using an anthropomorphic agent (employing conversational speech) on the development of trust and the overall user experience in a fully-autonomous pod vehicle. Based on the results, we can conclude that using an anthropomorphic agent with two-way conversational interactions increases users' perceived trust and pleasure; moreover, passengers felt more in control of the journey experience when accompanied by the agent. It was also evident that using anthropomorphism in the design of the agent created a more 'forgiving' experience (compared to other, more traditional interfaces), in which passengers were apparently more willing to accept reliability and dependability indiscretions, such as those revealed by the trust challenges. Nevertheless, issues of security and privacy remained, particularly where the agent appeared to hold or have access to personal, and even intimate, knowledge about the passenger. Although the approach is inspired by aspects of human-human interpersonal relationships (and results support the view that participants imbued the agent with similar, humanlike qualities and capabilities), the study also reveals potential challenges facing designers of 'intelligent' conversational user interfaces, such as system transparency and the development of an appropriate mental model, particularly where users perceived the agent's humanlike qualities to be approaching perfection. This suggests that future work should focus on tailoring the experience to engender appropriate human-likeness, and thereby avoid the perils of the uncanny valley [45]. Results of the study can also inform the design, and ultimate uptake and acceptance, of autonomous, self-driving 'pod' vehicles more generally.