Reality-based interaction affecting mental workload in virtual reality mental arithmetic training

The concept of digital game-based learning (DGBL) evolves rapidly together with technological enhancements of virtual reality (VR) and smart phones. However, the mental workload (MWL) that VR-training applications demand and motivational qualities originating from user experience (UX) should be identified in order to create effective and enjoyable training/learning challenges that fit with individual users’ capabilities. This study examined the effects of reality-based interaction (RBI) and VR on measures of student motivation and MWL, in a mental arithmetic game for secondary school pupils. In a randomized controlled trial with sixty school children a mental arithmetic game was tested with three different interaction and two different presentation methods – VR RBI, VR head-mounted-display tapping and tablet flick-gesture. Results found a significant effect of RBI on MWL but no differences in enjoyment of training were found between VR-experience and tablet training-experience. In fact, adding the gaming-context to the mental arithmetic task created an enjoyable, motivating experience regardless of presentation or interaction-style.


Introduction and purpose
User Experience (UX): Perceptions and responses resulting from the use and/or anticipated use of a product, system or service.
Reality-based interaction (RBI): Concept of interacting in VR in the same way as in reality (e.g. throwing gesture to throw a ball).
Mental Workload (MWL): Cognitive demand that is additively generated through intrinsic, germane and extraneous cognitive load during performance of a task (Sweller, Ayres, and Kalyuga 2011).
Working Memory (WM): Limited cognitive capacity of humans during task performance (Baddeley 2002; Kahneman 1973).

Virtual reality training and mental workload
Virtual Reality (VR) can be defined as a concept of total immersion of an individual in a computed, synthetic environment. Key features are real-time response of the computer simulation to user movement and interaction (Burdea and Coiffet 2003). Fully immersive systems use body-tracking sensors to ensure natural response of the virtual world to user movement (Rothbaum et al. 2001). The Oxford Dictionary denotes VR as: "A computer-generated simulation of a lifelike environment that can be interacted with in a seemingly real or physical way by a person, esp. by means of responsive hardware such as a visor with screen or gloves with sensors…" (OED online, 2018). Therefore, VR is not only a visual experience but can encompass all human senses. It must be regarded holistically, as perceptions via any sensory channel may affect the user experience of presence and immersion. The potential of VR for training and education is enormous and has been well investigated in recent studies focusing on knowledge/skill acquisition, knowledge/skill training and learning outcome. VR training has been investigated, for example, in surgery (Rahm et al. 2016; Alaraj et al. 2015; Jensen et al. 2015; Piedra et al. 2016), military (Bhagat, Liou, and Chang 2016), and business/social training with adults (Kiss et al. 2015; Froese, Iizuka, and Ikegami 2014), as well as in education contexts with students or children (Ijaz, Bogdanovych, and Trescak 2017). However, most of these studies utilized VR in a non-immersive form, relied on specific stationary simulators/hardware for training, or did not consider immersive VR-interaction or cognitive demands (e.g. Alhalabi 2016; Jimeno-Morenilla et al. 2016).
In immersive VR training contexts the concepts of reality-based interaction (RBI) or natural user interfaces (NUI) are implicitly involved in three-dimensional virtual experiences (Jacob et al. 2008; Wigdor and Wixon 2010), as touch and gestures are essential human behaviours in any reality, virtual or other. Although it has long been known that educational applications demand a clear understanding of the cognitive factors that may impair or enhance learning-outcome and learning-experience (Wickens 1992) … (Paas, Renkl, and Sweller 2003). It is therefore imperative to investigate the scientific gap in VR-interaction regarding educational application and the effects on MWL.

Working memory and mental arithmetic
Specifying MWL as a measure implies that mental efforts are measurable on a model of mental resources. For this study we follow the suggestions of Kahneman (1973) that human cognitive resources are limited and task performance requires capacity (Norman and Bobrow 1979), in accordance with the established working memory (WM) model by Baddeley and Hitch (1974) in its latest conception (Baddeley 2002). In this respect, MWL cannot be regarded as a single dimension since not all tasks compete for one resource-pool (Wickens 2008). In fact, data originating from divided attention tasks (Kantowitz 2000; Kantowitz and Knight 1974; Kantowitz and Simsek 2001; Wickens 1976) provide evidence of differentiated resource-pools for auditory and visuospatial subsystems in WM. Consequently, the current understanding of WM involves two separate subsystems that are both suggested to have a certain temporary storage capacity, the phonological loop and the visuospatial sketchpad, which will be referred to as auditory and visuospatial WM in this paper. Besides those independent subsystems, WM is believed to have a central executive (CE) component, which has no storage capability but functions as attention control where attentional capacity is focused, switched or divided, and a component termed the episodic buffer, which addresses questions of long-term memory (LTM) retrieval into WM and interaction between subsystems (Baddeley 2002). Further, the episodic buffer is believed to play a role in facilitating chunking using LTM access (Baddeley 2002; Miller 1956). Naturally, when measuring MWL in such a distinct framework of WM, it has to be assured that tasks compete for the same resources if specific impacts are to be assessed. The key elements to consider for MWL-assessment and resource distribution in this study of MWL in a VR mental arithmetic experience are VR-perception/interaction on the one hand and mental arithmetic processing on the other.
As studies on spatial sound perception in VR have shown positive effects on MWL, attributed to a load reduction concerning visuospatial WM overload (Flanagan et al. 1998; Nelson et al. 1998), it can be inferred that both auditory and visuospatial WM are substantially involved in MWL with VR-perception.
Additionally, Best and colleagues (2011) argue that different types of mathematical tasks are related differently to WM-components. For instance, calculating is suggested to be linked more strongly to fact retrieval and would therefore require less attentional control through the CE (Best, Miller, and Naglieri 2011). However, fact retrieval is associated with demand on the episodic buffer, which would suggest a shift of demand in WM rather than a reduction (Baddeley 2003). In this regard, it also matters whether the mental arithmetic task is one of production or verification (i.e. the result has to be calculated, or a proposed result is presented and has to be judged true or false), as a proposed answer facilitates fact retrieval (Lemaire 1996). It can be concluded from the present evidence that auditory WM, visuospatial WM, CE and episodic buffer are all to varying degrees implicated in mental arithmetic, a conclusion also drawn by DeStefano and LeFevre (2004).

Motivation and digital game-based learning
MWL not only represents a determinant for learning outcome regarding efficiency but may also be linked to the motivational capacity of VR training. The connection between MWL and intrinsic motivation has been demonstrated through empirically tested theories of Csíkszentmihályi (2014), who found that a state of intrinsic motivation resulting in ideal concentration and holistic immersion (flow-experience) is likely to be reached when a task meets the appropriate level of challenge between overload and boredom. Thus, it requires a specifically designed MWL-level that represents a person or group specific challenge-skill balance to induce intrinsic motivation.
Related studies (Bhagat, Liou, and Chang 2016; Smith and Ericson 2009) indeed found that applying VR in a training or learning context results in added motivational benefits (i.e. an enjoyable experience) but did not assess the origin of motivational effects or any cognitive demands. However, which specific factors contribute to motivational benefits is debatable, as interaction style within different VR training studies is variable and scenarios are frequently entangled with gaming elements or include game-based interaction-sequences. Whereas some scholars argue motivational effects transpire from immersion and feelings of presence in virtual environments (Psotka 2013), serious gaming literature shows that digital game-based learning (DGBL) (Prensky 2003) represents in itself a concept which is proven to be capable of inducing intrinsic motivation (Dörner et al. 2016). Empirical studies have established a fair evidence-base that DGBL improves student engagement and motivation (Hamari et al. 2016; Papastergiou 2009; Hung, Huang, and Hwang 2014; Erhel and Jamet 2013) and also content learning (Lester et al. 2014; Perrotta et al. 2013) … (2010) points out, emotion, cognition, motivation and action are inseparably entwined in a user experience (UX). Consequently, the emotional experience resulting from RBI with virtual environments may create a different level of intrinsic motivation than interacting in a non-RBI way with buttons on a gamepad.

Empirical research approach
Recent related studies have focused on comparing traditional training-methods with VR (Chao et al. 2017; Ijaz, Bogdanovych, and Trescak 2017) or usability in VR with non-RBI interaction (Xu and Ke 2016; Xiong et al. 2016; Shin, Biocca, and Choo 2013). Our view is that, to create a holistic, motivating and also effective VR learning experience, the effects of RBI in VR specifically with respect to MWL and UX must be identified. Our study, therefore, in addressing the MWL of RBI and RBI effects on UX, respectively, explored the potential connection between physical and psychological challenge and demand as illustrated in Figure 1. The purpose of the empirical study consequently was twofold:
(1) assess the MWL of RBI in VR game-based training by comparing measures of MWL (perceived difficulty of the mental arithmetic task and mental arithmetic performance) when performing a mental arithmetic game (MAG) under one of three presentation/interaction-configurations (between-subjects study design):
(a) virtual reality display with reality-based, natural interaction (VRMAG-RBI);
(b) virtual reality display with interaction controlled by tapping the VR headset (VRMAG-HMD);
(c) non-VR conventional tablet display with touchscreen interaction (TMAG);
(2) identify differences in learner-motivation through joy-of-use originating from the presentation/interaction-differences.
A game-based mobile-app was developed for training mental arithmetic skills for secondary education level at key stage 3. The mental arithmetic game was produced in both mobile VR and a tablet-computer format. The VR mental arithmetic training game (VRMAG) offers two different interaction-modalities: one representing a natural ball-throwing gesture (VRMAG-RBI) and the other representing conventional head-mounted display (HMD) interaction by tapping at the designated touch-area on the HMD to 'throw' a ball (VRMAG-HMD). The tablet-variant (TMAG) was visually identical except that it is non-VR and presented on a 2D-screen with touchscreen input such as swiping/flicking to 'throw' a ball.

Evaluation methods and hypothesis
There are four basic empirical strategies to assess MWL (Wickens et al. 2015, Fourth edition:530-548): primary task (PT) performance, secondary task (ST) performance, psychophysiological measurement or subjective rating scales (SRS). As this study involved intensive movement by executing throwing gestures repeatedly, validity impairment of psychophysiological measures (e.g. heart rate variability, electrodermal activity, eye-blink rate) would be unavoidable and so these measures were not applied.
ST-measurement requires a clear distinction of the WM subsystem (i.e. auditory or visual) on which the demands are competing. As the task in this study was not channel-exclusive, this would provide insufficient insight into the total added MWL originating from the three different presentation/interaction-models (Sharples and Megaw 2015, 532).
In consequence, this study followed the recommendations of Sharples and Megaw (2015, 543) by combining subjective MWL rating with objective MWL performance measurement. As argued by DeStefano and LeFevre (2004), primary-task measurement is advisable due to involvement of all WM components in cognitive arithmetic. Therefore, primary task metrics including total points (correct solutions), wrong solutions, balls thrown, balls missed, falsely thrown balls and others were logged within the game. In order to measure MWL effects in performance, the PT has to be of high cognitive demand, as otherwise reserve capacity (Wickens et al. 2013, 350) and coping strategies (Hancock and Warm 2003) will prevent performance decrease. Thus, for the mental arithmetic tasks in this study teachers were asked to provide appropriate levels of calculation demands suitable for the education level of the children in the study (see Table 2).
With respect to measurement of ease-of-application, we applied a simple post-task question, the subjective mental effort question (SMEQ) (Salvendy 2012; Sauro and Dumas 2009), focusing on the difficulty of the mental arithmetic calculation.
To assess perceived challenge and motivation, 14 bipolar adjective pairs of the hedonic qualities of the AttrakDiff2 questionnaire (Hassenzahl et al. 2003; Hassenzahl 2010) were rated on a 7-point semantic differential scale. Perceived physical effort and perceived fun, as well as willingness to train with the game, were additionally evaluated on a 7-point rating scale. All questions were presented to participants on a printed questionnaire sheet post-task.
To investigate MWL and UX this study established three null hypotheses:
H0a: 'There are no significant differences between the three interaction/presentation variants in the mental arithmetic performance.'
H0b: 'There are no significant differences between the three interaction/presentation variants in perceived difficulty of the mental arithmetic task.'
H0c: 'There are no significant differences between the three interaction/presentation variants in perceived user experience.'

Concept of mental arithmetic game
The underlying concept of the mental arithmetic game was a mathematical verification task with one correct result and two false distractor results. For example, Figure 2 displays a mathematical calculation task (147+50=) and three given answers, only one of which is correct. The player is required to select the correct answer by throwing a ball into the gate with the correct answer. Table 1 shows the resulting types and level of cognitive math-tasks included in the game. The two false results were computed by randomly adding or subtracting a number between 1 and 5 to or from the correct result. The game-environment displays an outdoor setting at a lake surrounded by mountains where three goals are presented with one correct result and two obfuscated results for a mental arithmetic task shown above the gates (Figure 2). Acoustic atmosphere and effects for the ball-throwing interaction were implemented to form an enjoyable experience and provide auditory feedback for the throwing interaction. The player is required to throw a ball through the gate in which the correct result is shown.
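The distractor rule described above can be sketched as follows. This is a minimal illustration, not the game's actual code: the function names are hypothetical, and forcing the two false results to differ from each other is an assumption, since the text does not say how duplicates were handled.

```python
import random


def make_distractors(correct, rng=random):
    """Return two false results, each 1 to 5 above or below the correct one."""
    offsets = [d for d in range(-5, 6) if d != 0]
    # Sampling without replacement keeps the two false results distinct
    # (an assumption; the paper does not state this).
    first, second = rng.sample(offsets, 2)
    return [correct + first, correct + second]


def make_task(a, b, rng=random):
    """Build one addition task with three answers shown in random gate order."""
    correct = a + b
    answers = [correct] + make_distractors(correct, rng)
    rng.shuffle(answers)
    return {"prompt": f"{a}+{b}=", "answers": answers, "correct": correct}


if __name__ == "__main__":
    print(make_task(147, 50, random.Random(1)))
```

With the example task from Figure 2 (147+50=), every generated answer set contains 197 exactly once and two values within 5 of it.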
The primary goal for the game was to throw a ball through the gate with the correct solution to each arithmetic task presented. A timer displayed on the screen counted down from 5 minutes to zero and the player was instructed to achieve a high score by correctly answering as many arithmetic tasks as possible in the time limit. For every ball thrown into the correct gate, the player was awarded 10 points; whereas for each ball thrown into an incorrect gate 10 points were deducted and missing all gates had no impact on the score. It was therefore possible to end with a negative score. Throwing the ball into an incorrect gate could result from selection of an incorrect answer or a poor throw of the ball hitting an incorrect answer by accident. Furthermore, it was possible to miss the gate or hit the border requiring the player to throw another ball.
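The scoring rule can be written compactly. The sketch below uses hypothetical outcome labels ("correct", "incorrect", "miss") to stand in for whatever the game logged internally; only the point values are taken from the description above.

```python
# Point values as described in the text; outcome labels are hypothetical.
POINTS = {"correct": 10, "incorrect": -10, "miss": 0}


def score_throw(outcome):
    """Return the score change for a single thrown ball."""
    if outcome not in POINTS:
        raise ValueError(f"unknown outcome: {outcome!r}")
    return POINTS[outcome]


def total_score(outcomes):
    """Sum the score changes over a whole session; the total may be negative."""
    return sum(score_throw(o) for o in outcomes)


if __name__ == "__main__":
    session = ["correct", "incorrect", "incorrect", "miss"]
    print(total_score(session))
```

A session with one correct hit, two incorrect hits and one miss ends at -10 points, which illustrates how a negative final score can arise.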
The mental arithmetic game was created in the Unity 3D game development environment (Version 5.3.2f1) and programmed in C# (Mono .NET 2.0). The same game was used in all three experimental conditions with differences only in 2D-screen/3D-VR-display presentation. Whereas the visual design on a smartphone or tablet is presented from a static (camera) viewpoint in 2D, the smartphone can be inserted in a portable HMD mount such as the Samsung GEAR VR, which subsequently enables an immersive 3D-VR-experience of the same visual presentation, allowing free rotation of the viewpoint through head-tracking. In order to assess the influence of the interaction method on MWL, the interaction mechanisms described below were used.

Experimental condition A: VR reality-based interaction (VRMAG-RBI)
This throwing gesture should match the natural interaction in having a controllable release moment, so that, as in the real experience, the ball is released with the appropriate velocity.
A gesture ring by Nod, Inc., Mountain View, USA was used to develop the RBI.
The version used for this study was an unobtrusive finger-ring with a touch-sensitive area. It connects to the smartphone via Bluetooth LE and is capable of tracking acceleration and rotations of the finger/hand through integrated inertial sensors with millimetre resolution. However, the ring was designed for adults and was too big for the children in the study. … releasing it at the desired moment, they hold the touch-area with the thumb and release it at the same moment they would release the ball in reality. Figure 4 shows a participant performing the throwing sequence. The speed of the throwing gesture was tracked using the accelerometer sensor data; thus, the velocity of the ball was determined from the intensity of the throwing motion. The physics engine of Unity 3D calculated the flight curve of the virtual ball in VR according to its mass and the tracked acceleration. Gaze direction of the player was assessed through head-tracking of the GEAR VR and was taken as the desired throwing direction.
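To illustrate how gaze direction and throw speed can define a flight curve, the following sketch computes a drag-free ballistic trajectory. It is a simplified stand-in for what a physics engine does, not the game's code: the function names, the y-up axis convention and the gravity constant are assumptions, and the mapping from accelerometer data to a launch speed is not specified in the text, so the speed is taken as an input. Without drag, the ball's mass does not affect the curve.

```python
import math

# Gravity vector in m/s^2 with y pointing up (assumed convention).
GRAVITY = (0.0, -9.81, 0.0)


def launch_velocity(gaze_direction, speed):
    """Scale the normalised gaze direction by the launch speed."""
    norm = math.sqrt(sum(c * c for c in gaze_direction))
    if norm == 0:
        raise ValueError("gaze direction must not be the zero vector")
    return tuple(speed * c / norm for c in gaze_direction)


def ball_position(start, velocity, t):
    """Position after t seconds of drag-free projectile motion."""
    return tuple(
        p + v * t + 0.5 * g * t * t
        for p, v, g in zip(start, velocity, GRAVITY)
    )


if __name__ == "__main__":
    v = launch_velocity((0.0, 0.5, 1.0), speed=6.0)
    for step in range(5):
        print(ball_position((0.0, 1.5, 0.0), v, step * 0.2))
```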

Experimental condition B: VR HMD conventional tap interaction (VRMAG-HMD)
Game variant B utilised a more conventional VR-interaction paradigm using the

Experimental condition C: Tablet computer swipe/flick gesture (TMAG)
Game variant C was presented on a tablet computer. The Samsung Galaxy Note 8 features an LCD-TFT display with 1280 x 800 px resolution resulting in 189 ppi. While the game on the tablet was presented from a 2D fixed point-of-view, all other visual and auditory properties were identical to the VR variants described above. To "throw" a ball, players had to perform a flick-gesture on the touchscreen of the tablet from the centre of the viewpoint in the direction of the intended goal (Figure 6). The speed of the thrown ball was derived from the speed of the executed flick-gesture. More specifically, the time-span between start and end of the flick-gesture was tracked and correlated with the speed of the thrown ball. Just as with the other variants, balls could miss the gates or hit the posts and bounce back according to the physics calculation of the game engine.
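One plausible way to derive ball speed from the flick time-span is an inverse relation with clamping, sketched below. The text only states that the two were correlated, so the formula, the function name and all constants here are assumptions chosen for illustration.

```python
def flick_to_ball_speed(start_time, end_time,
                        min_speed=2.0, max_speed=12.0,
                        reference_duration=0.1):
    """Map a flick's duration (seconds) to a ball speed (hypothetical units).

    A flick lasting reference_duration or less yields max_speed; longer
    flicks yield proportionally slower balls, never below min_speed.
    """
    duration = end_time - start_time
    if duration <= 0:
        raise ValueError("flick must end after it starts")
    speed = max_speed * reference_duration / duration
    return max(min_speed, min(max_speed, speed))


if __name__ == "__main__":
    for d in (0.05, 0.1, 0.2, 0.4, 1.0):
        print(d, flick_to_ball_speed(0.0, d))
```

Clamping keeps very fast or very slow gestures from producing balls that leave the scene or never reach the gates.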

Participants
Participants were secondary school students (n = 60; 32 girls, 28 boys) from two different schools (henceforth referred to as school m and school g) aged between 12 and 14 years (mean age: 12.95 years) enrolled in various different classes at key stage 3 level. The two schools were differentiated by general socio-economic background with pupils in school g tending to come from families with a more academic background and pupils in school m originating from working/middle class families. Participating pupils were healthy and had no physical or psychological disabilities. All participating students gave their written consent as did their legal representatives and their form teacher and school management.
Participants did not receive financial compensation but were given the mental arithmetic game for home usage if desired. Ethical approval for the study was granted by the Faculty of Engineering Ethical Committee, University of Nottingham.

Procedure and experiment location
The study was designed as a randomised controlled trial (RCT) and took place at each of the two schools to allow the children to participate within an accustomed educational setting.
The students were randomly assigned to the experimental conditions, which resulted in the demographic distribution for the study as shown in Table 3. For assessing MWL, the game performance measures were logged as primary task measures (PTM) with implemented logging algorithms. The adapted AttrakDiff2 and SMEQ scales were assessed using a paper-based post-task questionnaire. Students were also asked to rate their experience with the game with respect to fun, physical demand and their willingness to train with the game on a 7-point rating scale (1, not at all; 7, very much) (Albert and Tullis 2013, 128) and had the option to add their own comments at the end of the questionnaire.
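Random allocation to three conditions can be sketched as a shuffle followed by round-robin assignment. This is an illustration only: the function name and seed are hypothetical, and equal group sizes are an assumption, since the actual distribution is given in Table 3.

```python
import random

CONDITIONS = ("VRMAG-RBI", "VRMAG-HMD", "TMAG")


def assign_conditions(participant_ids, seed=None):
    """Shuffle participants, then deal them to the conditions in turn."""
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    return {
        pid: CONDITIONS[i % len(CONDITIONS)]
        for i, pid in enumerate(shuffled)
    }


if __name__ == "__main__":
    allocation = assign_conditions(range(1, 61), seed=42)
    print(allocation)
```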
Participating students were called in the randomly assigned order from their ongoing classes to the test-rooms. Each room was supervised by a research assistant who instructed the pupil in the specifics of the interaction/presentation-variant and the assessment-procedure before the training. Ring-size and HMD were fitted according to the needs of each participant before starting the game.

Mental Workload - Primary Task Measurements
For analysis of PTM three participants (two in group A and one in group B) were excluded from the ball-throwing measures as the logging metrics could not be retrieved from their training session. Furthermore, two pupils (one of school m and one of school g) omitted rating one attribute-pair in the semantic differential, which was also considered in the analysis. Variance-analysis of PTM revealed significant differences between the interaction/presentation-variants in key performance metrics (Table 4). Considering the quota of correctly thrown balls, Figure 8 displays the significant (p = .003) mean decrease of 18.34% (95%-CI [-30.85, -5.83] … [-15.28, 9.15]) in hitting the correct solution as the tablet control-group C with no significant difference (p = .814).

Mental Workload - Perceived Mental Workload
While PTM as one part of the MWL-assessment has exposed several significant findings between the three variants, variance analysis of the perceived difficulty of the game also revealed significant differences, F(2, 57) = 5.46, p = .007, with a large effect size, ηp² = .161, in support of the PTM-results (Table 5).
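The reported effect size can be cross-checked from the F statistic and its degrees of freedom, using the standard identity that partial eta squared equals F times the effect degrees of freedom, divided by that product plus the error degrees of freedom. The helper name below is hypothetical; the input values are those reported above.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


if __name__ == "__main__":
    # F(2, 57) = 5.46 as reported for perceived difficulty.
    print(round(partial_eta_squared(5.46, 2, 57), 3))
```

For F(2, 57) = 5.46 this gives 10.92 / 67.92, which rounds to .161 and agrees with the reported value.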

… hoc tests and show that learners in RBI-group A perceived the arithmetic calculations as significantly (p = .010) more difficult than students in tablet control-group C, with a mean increase of 16.57 mm on the SMEQ-scale (… 29.56]). This represents a mean increase in the perceived MWL-scale from "not very hard to do" to "fairly hard to do". Furthermore, students in VR-group B controlling the game with HMD-tap also felt the arithmetic tasks were significantly more demanding than group C, with an average increase of 11.75 mm (p = .019, 95%-CI [1.681, 21.81]), which represents MWL-intensification from "not very hard to do" to "a bit hard to do" on the SMEQ-scale.
Supplementary statistical analysis between gender and schools on assessed MWL data with Welch's t-test revealed no significant differences of PTM or subjective MWL between boys and girls. However, school m did not perform as well as school g, with a 68.14 points (95%-CI [-130.80, -5.48
Post-hoc testing revealed that VR RBI variant A was perceived as significantly (p = .005) more challenging than control variant C (-1.78, 95%-CI [-3.1, -0.50]) but not significantly (p = .826) more challenging than VR-control-group B (-0.3, 95%-CI [-1.52, 0.93]). In general, statistical analysis suggests rejecting all three null hypotheses (H0a, H0b, H0c), as testing revealed significant differences among the three interaction/presentation-variants concerning mental arithmetic performance, perceived difficulty of the mental arithmetic task and perceived user experience.

Discussion and interpretation of the results
Observed behaviour, commentary of pupils as well as self-reported perception of excitement indicate that the participants of this randomised controlled trial generally enjoyed the training with all three presented variants to a very high degree. In fact, the question on perceived fun by training with the game was, on average, rated above 6 on a scale of 1 (not enjoyable) to 7 (very enjoyable) for all interaction/presentation-variants.
The same findings were obtained for reported willingness to train with the mental arithmetic game (Table 4), with only marginal differences between schools. Most notably in this regard, neither immersive VR nor RBI improved the fun experience significantly as compared to the 2D tablet. On the contrary, while most of the perceived UX-qualities did not differ between variants, the results of data analysis revealed that pupils evaluated the tablet-variant as more attractive on average (approx. 17%) than VR-variant A. It is not clear whether the children rated attractiveness solely on visual design/presentation or also included other factors in this judgement; however, it could be inferred that when totally immersed in a virtual experience with no other visual sensations, the visual design has to be more appealing than when presented on a 2D display. The intense magnification of the display in the VR HMD reveals a pixelation effect, which potentially adds to the reported difference. However, as the difference from control-group C to VR-variant B was not significant, this remains to be investigated further.
On the other hand, both VR-variants presented a significantly higher challenge (approx. 24% higher for A than C) to the pupils as rated on UX-dimension stimulation.
This would seem to suggest that immersion in VR does indeed have an isolated effect on stimulation/motivation (Psotka 2013) in representing a challenge. Immersive VR-content, even a solitary, passively observed 360° picture, would, according to this assumption, present a challenge to perceiving individuals. However, it is probable that the challenging effect in the case of this study originates from interaction rather than from visual perception, immersion or presence. In fact, both VR-variants included the challenge of aiming with the head/gaze while visually verifying the correct math solution from three possibilities. This competition on the visual WM channel could cause a higher demand or perceived challenge for the pupils. Participants in control-group C, on the other hand, could verify one result and flick the ball at another without regulating their gaze. This MWL-explanation of challenge is more concordant with the findings in this trial, as no further differences in joy-of-use of the three interaction/presentation-variants could be detected. Indeed, it can be concluded that neither immersive VR nor RBI provided a substantial negative or positive effect on UX as compared to a tablet mental arithmetic game. Adding the gaming-context to the mental arithmetic task, however, created an experience that students reported as fun and said they would like to train with, regardless of interaction-modality or presentation-form specifics. Thus, the results on joy-of-use clearly support the supermotivation theory of Spitzer (1996) in this respect.
As far as MWL is concerned, the trial results on this key question are very clear, with congruent findings in the students' performance and their subjective perception of the arithmetic task-difficulty. The learners' performance declined significantly between tablet-group C and VR RBI variant A, with a more than 43% lower average in total points, an approx. 18% lower mean in thrown balls hitting the correct arithmetic solution and an around 13% higher average in tossed balls missing all targets in the VR RBI version. At the same time, the subjectively perceived difficulty of solving the mental arithmetic task was assessed on average as being more than twice as high by students in the VR RBI-group, rising from "not very hard to do" to "fairly hard to do" compared to control-group C.
It is therefore inferred that RBI does indeed significantly increase MWL while training or learning in VR compared to a non-VR tablet-variant mental arithmetic training. This finding contrasts with the view of Wickens (1992) that cognitive effort could be reduced by a "natural" interface. In fact, the findings suggest RBI presents an additional extraneous cognitive load to the learners and should be avoided when aiming for effective learning outcomes (Paas, Renkl, and Sweller 2003; Sweller, Ayres, and Kalyuga 2011b) within a mental training task. Nonetheless, as studies on the cognitive impact of learning-media interaction have shown (Holst, Churchill, and Gilmore 1997), it is necessary to determine exactly which cognitive demands of interaction could be a beneficial part of intrinsic cognitive load. For instance, motor skills training such as VR surgery simulation requires RBI for interaction, as it is an essential training goal to perfect hand movements, and RBI cognitive processes are thereby part of the intrinsic cognitive load.
Notably, as with the reported level of fun experienced in all variants, the pupils did not perceive any difference in physical demand when interacting with the three variations. Although the throwing gesture could be considered to be much more demanding than flicking a finger, the students did not say so. Additionally, we observed that the RBI produced very engaged, active learner participation with holistic body movement, whereas variant C, and sometimes also B, were performed in physically unfavourable sitting postures.
However, we can assume that part of the extraneous overhead in MWL is attributable to immersive VR-presentation regardless of interaction-mode, as some results displayed significant differences in this regard. Calculation difficulty in variant B with HMD-tap interaction was subjectively perceived as more difficult compared to control-group C, and a significant difference, but to a lesser extent, was observed between RBI-variant A and control-group C, but there was no difference in perceived difficulty between variants A and B. However, PTM-parameters showed that students in variant B performed best at hitting targets and almost as well at hitting correct arithmetic solutions as tablet-group C.
Thus, the perceived cognitive overhead in variant B is likely to originate from other factors than interaction. It is possible that these reported higher demands are linked with the findings on higher challenge/lower attractiveness in UX and result from differences such as the visually more pixelated presentation. This needs further investigation to verify.

Study limitations
This study revealed and described significant differences in MWL and motivational effects regarding a mental arithmetic game with three different interaction/presentation-modes designed and developed for the research process. However, a few limiting aspects of these findings have to be considered. The assessment of MWL in this research is partly based on subjective perception as a post-task measure. Although supplemented and verified by PTM, it might still be advisable in further studies to utilize methods that can address real-time changes in MWL during training, such as psychophysiological measurements (e.g. eye-blink parameters). In this respect it is also worth considering methods that can clearly determine which subcomponent of WM is involved in the MWL overhead between VR and non-VR as well as RBI and conventional interaction.
The findings of this research also do not distinguish the composition of MWL, as the primary task, mental arithmetic, involves every WM-subcomponent to some degree. This should be altered for future VR RBI investigation to gather more detailed insight into the exact composition of MWL. Furthermore, although the findings in UX showed substantial perceived enjoyment within all groups and displayed congruence to the MWL-measures, more detailed examination of UX in VR RBI compared to other interaction-methods is required to determine specific influences.
The subjective post-task questioning in this study relied on a common understanding of the presented bipolar adjective pairs. However, we consider that this understanding may vary to some extent in a young group of participants. By its nature as a randomized controlled trial, the study did not address long-term effects in learning or possible habituation effects regarding the MWL of RBI. Furthermore, the study did not address extraneous MWL originating from game context, as all tested variants incorporated the same game mechanics / DGBL concept, but instead assessed MWL with a focus on the added cognitive load generated by RBI and VR-presentation.

Conclusions and implications for future research
This study explored the MWL and motivational effects of RBI and VR within an educational setting of mental arithmetic DGBL. The results clearly indicate a higher MWL for students with RBI-interaction when training with a mental arithmetic game in VR than when training with a tablet game and flick interaction. The findings further suggest that immersive VR-presentation itself represents an extraneous cognitive demand, albeit to a smaller extent than RBI. These effects have to be taken into account in instructional design if learning is to be effective, particularly if intrinsic cognitive load is already high due to challenging learning content. Importantly, those differences in MWL should further be considered when aiming for an appropriate challenge-skill balance to allow for flow-experience and intrinsic motivation. Moreover, the presented results suggest that neither VR-presentation nor RBI contributes significantly to the UX in mental arithmetic DGBL. In fact, the research outcome on UX leads to the conclusion that, in accordance with Spitzer (1996), adding an adequate gaming context to the task of mental arithmetic training creates a fun and motivating experience regardless of VR or non-VR presentation and RBI or conventional interaction. Nonetheless, the results of our study also reflected the basic model of Virtual Learning Environments of Dalgarno and Lee (2010), as both representational fidelity and learner interactions proved to be significant differentiators in the learning experience. This is an important finding concerning the soft issues of VR in education and pedagogy, as finding ideal levels in these characteristics can maximise learning (Fowler 2015).
Our investigation of interaction within a learning context clearly showed that DGBL approaches can result in positive training experiences on multiple platforms. We could demonstrate that a mental arithmetic application with a high user experience and motivation quality is easily adaptable to tablet computing and to mobile VR HMD, but MWL has to be taken into account with regard to the cognitive demands of interaction. As a key outcome for VR research, our investigation showed a significant impact of RBI on MWL in a VR training scenario and verified that a DGBL context can create fun learning experiences with little impact of presentation or interaction style. However, with respect to cognitive computing, big data and other emerging ICT research domains, our study was also able to display an easy and usable approach for digitalisation in education settings that displayed excellent technology acceptance with a young target group. Creating a VRMAG based on our outlined approach that is widely used or distributed over social media can thereby, for example, provide the basis for deep learning scenarios (Lytras, Raghavan, and Damiani 2017; Lytras et al. 2018) or social media analysis (Lytras et al. 2015) and in consequence lead to improved effectiveness through personalisation of mental arithmetic training.
Future research should investigate the effects of RBI on MWL in a VR training context to distinguish MWL composition according to the individual WM-components and allow for creating advanced challenge-skill balanced instruction designs.
Furthermore, familiarisation effects regarding MWL of RBI should be targeted in prospective investigations as well as long-term learning-outcomes with VR DGBL.
Equally, possible advantages of VR-presentation compared to other DGBL presentation forms should be explored in more detail with respect to motivational benefits, as this study did not indicate such effects. Ultimately, the results of the presented study strongly encourage educators to incorporate DGBL in their tutoring schedule, regardless of presentation form. However, identifying the appropriate challenge-skill balance in this endeavour remains imperative.

Disclosure statement
No potential conflict of interest was reported by the authors.

Funding
This work has been partially funded by the Austrian Research Promotion Agency FFG through the COMET programme (5th call) research project LiTech sponsored by the Austrian Ministry for Transport, Innovation and Technology (BMVIT) and the Austrian Ministry of Science, Research and Economy (BMWFW).