Module 2: Capabilities and limitations
-
Technology has become ubiquitous over recent decades. It is now part of everyday life and has inevitably affected education, including teaching and learning processes. Learning a Foreign Language (FL) is a crucial educational topic that can become more immersive and interactive with the help of emerging technologies, and language experts have been interested in using Extended Reality (XR) technologies in education, particularly in language learning, for the last two decades. XR encompasses all combined real-and-virtual environments and human-machine interactions generated by computer technology and wearables, and covers Virtual Reality (VR), Mixed Reality (MR) and Augmented Reality (AR). XR technologies can either fully immerse users in virtual worlds or integrate the real world with the virtual one (Pomerantz, 2018; Fast-Berglund et al., 2018).
The implementation of XR in education and language learning has increased within the past two decades (Parmaxi & Demetriou, 2020; Panagiotidis, 2021). The use of XR technology in education, especially for language learning, appears to have considerable potential at the cognitive, academic, and linguistic levels (Lan, 2020; Alfadil, 2020; Panagiotidis, 2021).
With high-immersion VR that uses a simulated realistic environment, language learning can become more engaging and more self-directed (Ou Yang et al., 2020). VR produces an entirely artificial, virtual environment (Hein et al., 2021) and gives learners the opportunity to practise their speaking skills (Li et al., 2020). AR, on the other hand, overlays virtual content on real-world objects and locations (Hein et al., 2021).
Compared to VR and AR, the term MR has not been used consistently. Speicher et al. (2019) interviewed AR/VR experts from academia and industry and reviewed sixty-eight papers in the field, concluding that there are multiple competing definitions of the term. In some contexts, it is synonymous with XR, a cover term that includes both AR and VR. In others, it represents the midpoint of the reality-virtuality continuum, the point where real and virtual elements are equally displayed to the user. A more distinct use of MR describes technology in which virtual and real elements supplement one another. Under this distinction, AR refers to technology whose imagery interacts only with the user and not with their real-world environment, while MR refers to technology whose virtual elements interact with both the user and their real-world environment. According to Hein et al. (2021), MR refers to human-machine interactions produced by wearables and computer technologies in merged real and virtual settings. Both AR and MR systems can change human perception, since both add virtual items to the real world. The distinction is that augmented reality takes place in the real world, whereas mixed reality creates new environments and representations in which physical and digital items co-exist and interact in real time (Panagiotidis, 2021).
-
2.1 Rationale
The use of MR in foreign language learning has positive effects on student motivation and learning outcomes (Ibáñez et al., 2011). The advantages of XR applications most frequently mentioned in the literature are increased motivation, engagement, participation and pleasure, together with decreased anxiety (Panagiotidis, 2021). The current study focuses on Mixed Reality in Foreign Language Education (FLE) for specific purposes globally. According to Agustina (2014), English for Specific Purposes (ESP) is defined in a variety of ways. Some define it as nothing more than the teaching of English for any possible purpose. Others are more specific, referring to it as the teaching of English for academic purposes, for professional or vocational purposes, or to non-native speakers who acquire English for certain goals (Agustina, 2014). For the purpose of this study, we define language learning for specific purposes as the learning of a language, native or second/foreign, for academic, professional, vocational and/or other specific purposes, as distinct from general language learning (English or other).
Although previous systematic literature reviews have explored the existing research on virtual and augmented reality in language learning in general (e.g. Dhimolea et al., 2022; Parmaxi, 2020; Huang et al., 2021), the recent use of virtual, augmented and mixed reality applications in language learning for specific purposes has not been reviewed. This calls for a summary of the XR applications that have been used to enhance and promote effective language learning for specific purposes. The current paper identifies the barriers, facilitators, promoting factors and acceptance related to the implementation and application of XR in FLE globally.
Setting out to address the issues above, the present systematic review is timely for two reasons. First, it provides very recent findings on the potential and direct impact of AR, VR and MR on language learning for specific purposes, thus supporting policy makers, researchers and practitioners in making evidence-based decisions about the deployment of XR in language teaching and learning. Second, it provides relevant parties with up-to-date insights: given the rapid advances in technology (Zhang & Zou, 2022) and the rise of XR, findings that are more than a few years old quickly become obsolete. In the following sections, the methodology of the study is described, followed by the findings, conclusions and implications for educators and future research.
-
2.2 Protocol
A predefined protocol following the PRISMA approach was used in order to minimise researcher bias (Kitchenham & Charters, 2007). Figure 19 presents the systematic literature review process. The review was conducted in five phases: (1) Plan, (2) Conduct, (3) Iterations, (4) Findings and (5) Conclusions. In the first phase, the need for a systematic literature review was identified and the research questions were defined. Then, the search databases were selected and the query string was defined. In the second phase, the database search was conducted and relevant papers were extracted. The inclusion and exclusion criteria were defined and applied, followed by data extraction and synthesis. In the iteration phase, forward and backward searches were used to find additional papers related to XR for language learning for specific purposes. In the fourth phase, the findings were synthesised and presented. In the final phase, the conclusions, the implications for practitioners and researchers, and the study limitations were presented.
Figure 19 Systematic Literature Review Protocol

-
2.3 Study scope
To capture the most recent and innovative MR applications, journal articles published during the period 2020-2022 were searched electronically.
-
2.4 Search strategy
Manuscripts published during the period 2020-2022 were searched electronically in order to retrieve recent, relevant literature on the topic. In the identification phase, previous research on MR and the terms used in the relevant literature were consulted to create a list of search keywords. Our keywords derive from Milgram and Kishino’s (1994) “reality-virtuality continuum”. Three online research databases (Scopus, Web of Science, ERIC) were used to find literature related to Mixed Reality applications for language learning and nursing. These databases were chosen because they provide access to quality, peer-reviewed journals related to education and technology. The Boolean strings used were based on combinations of terms for mixed reality and for language for specific purposes, such as:
("Virtual Environment" OR "immersive environment" OR "Virtual Reality Learning Environment" OR "Virtual Reality Environment" OR "virtual world" OR "VR" OR "VRLE" OR "virtual classroom" OR "virtual class" OR "augmenting reality" OR "mixed reality" OR "mixed reality environment" OR "mixed reality instruction" OR "mixed reality learning") AND ("language learning" OR "computer assisted language learning" OR "technology-enhanced language learning" OR "VRALL" OR "VR-assisted language learning" OR "language course" OR "language classroom" OR "Language education" OR "Foreign language" OR "Second language" OR "Language acquisition" OR "Language teaching" OR "Language learning" OR "Language classroom" OR "L2" OR "language teach*" AND "ESP" OR "language for specific purposes" OR "English for specific purposes" OR "English for specific academic purposes" OR "vocational English" OR "workplace communication" OR "communication competence" OR "English communication" OR "English for occupational purposes" OR "EOP")
The last search was conducted on the 30th of March 2022. Table 1 presents the number of results extracted from each database.
Table 1 Database search results
| Database | Notes | Results |
| --- | --- | --- |
| Scopus | Search on the fields “Abstract”, “Title” and “Keywords” | 93 |
| Web of Science | All-fields search | 357 |
| ERIC | All-fields search | 0 |
| Total | | 450 |
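As a rough illustration of how such a Boolean string can be assembled and how the totals in Table 1 add up, the sketch below joins quoted terms with OR inside each concept group and joins the groups with AND. It assumes that the three concept groups (technology, language learning, specific purposes) are meant to be combined with AND, which is our reading of the string rather than something the protocol states. The helper names `or_group` and `build_query` are hypothetical, and the term lists are abbreviated.

```python
def or_group(terms):
    """Quote each term and join the terms with OR inside one pair of parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(*groups):
    """Combine the parenthesised OR-groups with AND (assumed grouping)."""
    return " AND ".join(or_group(group) for group in groups)


# Abbreviated term lists taken from the search string; the full lists are longer.
technology_terms = ["mixed reality", "virtual world", "VR"]
language_terms = ["language learning", "Foreign language", "language teach*"]
purpose_terms = ["ESP", "English for specific purposes", "EOP"]

query = build_query(technology_terms, language_terms, purpose_terms)
print(query)

# Hits per database as reported in Table 1.
results = {"Scopus": 93, "Web of Science": 357, "ERIC": 0}
total = sum(results.values())
print(total)
```

Building the string programmatically keeps the parentheses balanced, which is easy to get wrong when a long query is typed by hand. Each database has its own field codes and wildcard rules, so the output would still need adapting before use.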
-
2.5 Inclusion and exclusion criteria
The search returned 450 results for the query string. A set of inclusion and exclusion criteria was used during this review to aid in the selection of relevant manuscripts (see Table 2).
Table 2 Inclusion and exclusion criteria
| No. | Inclusion criteria | Exclusion criteria |
| --- | --- | --- |
| 1 | The manuscript should have been published between the years 2020-2022 | The paper was published before 2020 |
| 2 | The manuscript should involve empirical data on XR applications for language learning for specific purposes (e.g. English for academic/specific academic purposes, professional English etc.) | The paper was a review or did not include empirical data on the use of XR applications for language learning in general |
| 3 | The manuscript presented sufficient data to identify how an MR application was used or implemented | The manuscript was a short paper that did not provide sufficient data (e.g. abstract papers, posters, presentations, scientific events programmes, tutorial slides, literature reviews, book reviews or editorials) |
| 4 | The manuscript was written in English | Publications written in a language other than English were excluded |
| 5 | Publications should be accessible | Publications that were not accessible (required payment or could not be accessed for any other reason) were excluded |
| 6 | The manuscript should refer to adult learners (adult learners, university students, professionals) | The manuscript refers to learners other than adults (e.g. middle school or high school students etc.) |
Automation tools were used to narrow down the results based on these criteria, and duplicate results were excluded. Subsequently, the titles and abstracts were reviewed to assess their appropriateness for the study's purposes. Studies were eligible for inclusion in the corpus if they met the criteria outlined in Table 2. A PRISMA diagram (Page et al., 2017) was used to depict the process and to present the number of papers retrieved, the reasons for exclusion, and the final number of papers included (see Figure 20).
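The screening logic of Table 2 can be sketched as a simple filter. This is a minimal illustration and not the tooling actually used in the review, which the protocol does not name. The `Record` fields, the function names and the sample records are all hypothetical, and duplicates are detected here by normalised title, which is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Record:
    """One search result, coded against the six criteria of Table 2 (hypothetical fields)."""
    title: str
    year: int
    language: str
    empirical_lsp: bool       # empirical data on XR for language learning for specific purposes
    sufficient_detail: bool   # enough data to see how the application was used
    accessible: bool
    adult_learners: bool


def exclusion_reasons(record: Record) -> List[str]:
    """Return every exclusion criterion the record triggers (empty list = eligible)."""
    reasons = []
    if not 2020 <= record.year <= 2022:
        reasons.append("published outside 2020-2022")
    if not record.empirical_lsp:
        reasons.append("no empirical data on XR for specific purposes")
    if not record.sufficient_detail:
        reasons.append("insufficient data")
    if record.language != "English":
        reasons.append("not written in English")
    if not record.accessible:
        reasons.append("not accessible")
    if not record.adult_learners:
        reasons.append("learners are not adults")
    return reasons


def screen(records):
    """Drop duplicates (by normalised title, an assumption), then apply the criteria."""
    seen_titles = set()
    included, excluded = [], []
    for record in records:
        key = record.title.strip().lower()
        if key in seen_titles:
            excluded.append((record, ["duplicate"]))
            continue
        seen_titles.add(key)
        reasons = exclusion_reasons(record)
        if reasons:
            excluded.append((record, reasons))
        else:
            included.append(record)
    return included, excluded


# Hypothetical records for illustration only.
records = [
    Record("Study A", 2021, "English", True, True, True, True),
    Record("study a ", 2021, "English", True, True, True, True),
    Record("Study B", 2018, "English", True, True, True, False),
]
included, excluded = screen(records)
```

Keeping the reasons for each exclusion, rather than a bare yes/no, makes it straightforward to fill in the per-reason counts that a PRISMA diagram reports.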
-
2.6 Iterations
Forward and backward searches were used to find additional papers related to XR for language learning for specific purposes. The references of the included papers were reviewed and relevant papers were added to the corpus. Through this process, six (6) new papers were found, of which three (3) met the inclusion and exclusion criteria and were included in the corpus.
Figure 20 PRISMA diagram

-
2.7 Synthesis
Three researchers reviewed the results to finalise the corpus and confirm the suitability of the manuscripts. Two researchers independently reviewed all manuscripts, and a third reviewed nearly 50% of the articles to confirm the suitability of the selected manuscripts. Disagreements between the three researchers were settled through discussion and a second look at the disputed studies. A total of 21 articles met the criteria for inclusion in the final review.
The retrieved data were qualitatively synthesised following the strategy of Spolaôr and Benitti (2017). The information extracted from the selected papers formed the basis for the synthesis and was grouped into four categories (see Table 3).
Table 3 Data categorisation based on Spolaôr and Benitti (2017)
| Group | Information elements (IE) |
| --- | --- |
| Group 1. Material identification | IE1. Material ID; IE2. Material title; IE3. Year of publication; IE4. Authors' name(s); IE5. Source of the material |
| Group 2. Activities reported in the material | IE6. Application name; IE7. Level of immersion; IE8. Context; IE9. Short description of the application |
| Group 3. Basis of the publication; Group 4. Evaluation of material | IE10. Type of hardware; IE11. Target group; IE12. Language; IE13. Technology used; IE14. Facilitating factors; IE15. Instructional design/Learning design experience; IE16. Classroom orchestration; IE17. Type of task design; IE18. Intended outcomes; IE19. Benefits; IE20. Barriers/Disadvantages/Potential pitfalls; IE21. Stakeholders’ acceptance |
-
2.8 Findings
This section provides an overview of the findings retrieved from our review.
2.8.1 Existing MR/VR/AR technologies for FLE
The literature demonstrated a variety of existing MR/AR/VR applications for language learning (see Table 4). Most of the manuscripts used VR (n=17), including two (2) that examined VR combined with 360° video (Lin et al., 2021; C. H. Chen et al., 2021) and two (2) that examined VR combined with 3D technology (Hara et al., 2021; Wang et al., 2021). Three (3) papers examined AR technologies (Lin et al., 2022; Sydorenko et al., 2021; Yeh & Tseng, 2020), while only one paper referred to MR technologies (Shih, 2020). All the applications were used with higher education students.
Table 4 Existing VR/AR/MR applications
| Technology | Application name | Number of manuscripts |
| --- | --- | --- |
| VR applications (n=17) | Second Life | 5 |
| | EduVenture VR | 4 |
| | Comunica-Enf | 1 |
| | Modern Operation Room (MOR) | 1 |
| | Google Tour | 1 |
| | Not specified VR environments | 5 |
| | Total VR applications | 17 |
| AR applications (n=3) | Unity Mobile App | 1 |
| | Tourist AR app | 1 |
| | ChronoOps AR app | 1 |
| | Total AR applications | 3 |
| MR applications (n=1) | Virtual English Classroom Augmented Reality (VECAR) | 1 |
| | Total MR applications | 1 |
| Total VR/AR/MR applications | | 21 |
The most popular VR application in the corpus was Second Life (SL) (J. C. C. Chen, 2020; Jehma, 2020; Kruk, 2021; Muñoz, 2021; Wang et al., 2021). SL is an internet-based, three-dimensional virtual world created by Linden Lab in San Francisco (Jehma, 2020). SL offers many opportunities that are suitable for teaching and learning other languages. Students can interact with native speakers of the language in a virtual setting, be exposed to a wealth of authentic input, take on different roles that help them use the language in a more natural context, and work together to accomplish challenging tasks using suitable media such as text, voice, and video (Yu et al., 2020).
The EduVenture VR app was the second most popular application in the corpus (C. H. Chen et al., 2021; C. Y. Chen et al., 2021; V. Lin et al., 2021; Yeh et al., 2020). EduVenture uses 360-degree (spherical) videos, which are produced with a camera that records the scene from all angles. Students can work with a variety of information modalities (including still photographs, panoramas, audio, videos, and texts) to create VR material.
Other VR applications in the corpus include Modern Operation Room (MOR) and the Google Tour application. MOR is a newly created VR teaching tool that gives nursing students the chance to experience a simulated operating room and to practise performing surgery before they become licensed nurses; it presents virtual healthcare scenes to the participants (Wu et al., 2021). The Google Tour application allows users to build tours on their computers using Google's street-view technology (Y. J. Lin & Wang, 2021).
Five (5) applications were not specified by name. Two of them were described as virtual reality learning environments (VRLE) (Barrett et al., 2021; Pack et al., 2020), while the others also involved VR environments and tasks, with no further details about the environments given (e.g. Taguchi, 2022; Khodabandeh, 2022).
The AR applications in the corpus include ChronoOps, an AR GPS-enabled place-based game (Sydorenko et al., 2021), and Tourist AR app, a location-based AR app (Yeh & Tseng, 2020). Unity was also used to create a mobile augmented-reality context-aware ubiquitous writing (ARCAUW) application (V. Lin et al., 2022).
The only MR application in the corpus is Virtual English Classroom Augmented Reality (VECAR), a virtual environment that offers 360-degree street-level footage of well-known locations around the world, enabling virtual access to context-sensitive environments for vocabulary learning. By using virtual avatars in conjunction with real street views, language learners can explore actual street-view scenery and interact with virtual objects or other avatars (Shih, 2020).
The papers were also categorised based on their educational context as formal or informal. According to Crompton et al. (2017), a formal setting refers to intended learning in a typical educational setting. Informal context refers to intended learning in an atypical setting such as a lesson taking place outside the classroom (Crompton et al., 2017). The majority of the applications were used in a formal setting while only one (1) of them was found to implement tasks outside of class and therefore classified as informal learning (Sydorenko et al., 2021).
With regard to hardware and level of immersion, the applications were classified as fully immersive (requiring specialised hardware such as head-mounted displays, HMDs), semi-immersive (increasing immersion while retaining some outside information), or non-immersive (using mobile and desktop screens), as described in Saeed Alqahtani et al. (2017). In our case, more than half of the applications (n=11) were classified as fully immersive because they could be used with HMDs such as Google Cardboard and other standalone VR headsets. Six (6) applications were classified as non-immersive, as they were XR applications used on mobile, desktop or tablet devices without HMDs (J. C. C. Chen, 2020; Jehma, 2020; Kruk, 2021; Wang et al., 2021; Yeh et al., 2020; Yu et al., 2020), while four (4) applications were semi-immersive, combining immersive and real-life elements (Lin et al., 2022; Shih, 2020; Sydorenko et al., 2021; Yeh & Tseng, 2020).
The language focus of the corpus was predominantly English (19 of the 21 manuscripts); one paper referred to Chinese and one did not specify the target language. Fifteen (15) manuscripts focused on language learning for academic purposes: 14 concerned English for Academic Purposes (EAP) and one (1) concerned Chinese. The remaining six (6) manuscripts referred to language learning for specific academic purposes. Five (5) of them concerned English for Specific Academic Purposes (ESAP), comprising English for tourism purposes (n=2), English for nursing (n=1), multimedia English (n=1) and English translation (n=1), while one (1) paper did not specify the target language but referred to language learning for nursing. Table 5 summarises the language focus of the manuscripts.
Table 5 Language Focus
| Language focus | Number of manuscripts |
| --- | --- |
| Language Learning for Academic Purposes | 15 |
| - English for Academic Purposes (EAP) | 14 |
| - Chinese | 1 |
| Language Learning for Specific Academic Purposes | 6 |
| - English for Specific Academic Purposes (ESAP) | 5 |
| · English for tourism purposes | 2 |
| · English for nursing | 1 |
| · Multimedia English | 1 |
| · English Translation | 1 |
| - Not specified language targeted for nurses | 1 |
| Total | 21 |
-
2.9 Benefits related to the implementation and application of MR in FLE
As the present corpus shows, the application of XR in FLE comes with many advantages. All studies reported relatively positive results for learners, mostly relating to the creation of authentic contexts for foreign language use and the initiation of “real-world” linguistic interactions. More specifically, Ma (2021) emphasised the importance of VR as an emerging technology that can overcome the limits of time and space and offer a “real” environment for students to immerse themselves in. Ma’s study demonstrated how the immersive environment can foster better cooperation and participation among students and their peers and help them become more “active” in the learning process. The realistic element of the immersive 3D serious game in the study of Hara et al. (2021) also yielded positive results, since the players could locate themselves within the game and interact freely in a standardised way. The studies also reported cases where students valued the experiential aspects of immersion (Wang et al., 2021) and improved their language production in specific subjects (Lin et al., 2021). Immersive VR also helped develop tasks that engage L2 learners in meaningful, real-life-like language use (Taguchi, 2022). Moreover, the use of AR significantly improved long-term memory, motivation, and self-regulation of the cognitive processes of writing in the study of Lin et al. (2022). Additionally, students' multimodal literacy increased significantly as a result of the content they created for a location-based AR app in the study of Yeh and Tseng (2020).

The corpus also reported improvements in oral proficiency and the development of students’ English communication skills in SL (Jehma, 2020). Moreover, the vivid EFL learning environment provided by a visual prompt scaffolding-based VR approach helped improve learners’ reading skills (Wang et al., 2021). Another benefit was the positive impact on students’ efficacy for creative thinking (Lin & Wang, 2021). Specifically, the hands-on opportunity for students in the L2 classroom to create content in the VR environment from scratch enabled them to take on the roles of English learner, user and VR creator, deal with multiple tasks, and brainstorm new ideas and solutions. Finally, the exploratory study by Wu et al. (2021) of Chinese nursing students learning with immersive reality demonstrated positive attitudes towards the use of Modern Operation Room for improving their medical content knowledge and enhancing their English vocabulary knowledge.
-
2.10 Facilitators, promoting factors and acceptance related to the implementation and application of MR in FLE
In this section, the facilitators, promoting factors and acceptance related to the implementation and application of MR in FLE are presented. They are categorised into factors encountered before and during the implementation of the activity, followed by stakeholders’ acceptance, as they emerged from the corpus of manuscripts.
2.10.1 Before the implementation of the activity
Prior to the start of the course, teachers need to clarify the tasks, the teaching points and difficulties, and the teaching activities, and prepare a teaching plan (Ma, 2021). Appropriate software and content should also be selected based on the lesson objectives (Ma, 2021; Hara et al., 2021). Additionally, students and teachers should be prepared for the tasks and given information on how to use and interact with the VR applications before the applications and activities are implemented (Ma, 2021; Hara et al., 2021; Yeh et al., 2020; Wang et al., 2021; Kruk, 2021; Taguchi, 2022; Barrett et al., 2021; Yu et al., 2020). For example, in the study of Lin and Wang (2021), students were given a handout with information on fundamental SL operations such as chatting, teleporting, looking at the SL map, and using their avatars, and they practised these for a week. The researchers provided online assistance if students ran into any issues (Lin & Wang, 2021). Software and equipment training is also important for both students and teachers (Ma, 2021; Yeh et al., 2020; Wang et al., 2021; Wu et al., 2021; Kruk, 2021; Yu et al., 2020; Barrett et al., 2021).

Training sessions prevent students from feeling overwhelmed and confused when moving within VR learning environments and make it easier for them to concentrate on the subject at hand (Yu et al., 2020). For example, Wu et al. (2021) allocated 20 minutes to training teachers and students on how to wear the VR headset and use the controllers before the teaching activity. It was also found that learners could use the technology in their own time and at their own pace in order to familiarise themselves with it. In the study of Lin et al. (2022), prior to starting the actual task at each level, the participants viewed augmented reality learning materials at the learning site and spent nine hours in total on AR-based practice and learning. Some students took the tablets and keyboards home to do the assignments (Lin et al., 2022).
2.10.2 During the activity implementation
While the VR activity is taking place, instructors play an important role in supporting and guiding their learners (Yu et al., 2020; C. H. Chen et al., 2021; Wu et al., 2021; Lin et al., 2022; Yeh & Tseng, 2020). In the study of Lin et al. (2022), the instructor and teaching assistant served as facilitators when participants needed assistance with the technology or wanted more detailed instructions on how to accomplish the AR-based learning tasks. During their VR experience, the participants in the study of Barrett et al. (2021) navigated the virtual environment by walking, exactly as they would in the real world. They were told not to cross the blue net-delineated boundaries of the lab room while navigating, and they were instructed to face the direction of an arrow placed adjacent to a virtual round frame/spotlight drawn on the ground. The participants were then asked whether they were ready to be automatically transferred to the environment and, as soon as they gave their consent, they were virtually transported to the simulated room (Barrett et al., 2021).
In the study of Yu et al. (2020), students were paired up and given the roles of tutors and tutees. A tutor was physically present to make sure students could use the specific application (EduVenture), to help them if they felt unwell, and to give them feedback on how they were doing on their tasks (Yu et al., 2020). Similarly, in the study of C. H. Chen et al. (2021), the instructor offered a set of guided questions to help the students resolve the challenges presented in the scenario during problem-based learning (PBL). In particular, the students were instructed to identify potential issues, look for reliable sources, come up with workable solutions, and reflect on their conclusions, while the instructor guided and monitored the procedure and held weekly progress meetings to ensure ongoing improvement.
To help students improve their English-speaking abilities, C. Y. Chen et al. (2021) proposed a progressive question prompt-based peer-tutoring strategy in VR situations (PQP-PTVR). The English question prompt-based strategy gives tutors guiding questions so that they can methodically direct students' exploration of the museum and aid their understanding of the exhibition through questioning and feedback (C. Y. Chen et al., 2021).
Furthermore, in the study of Hara et al. (2021), a tool for assessing health communication skills was used as a guide to improve faculty members' capacity to evaluate effective health communication behaviours and to promote multidisciplinary collaboration in nursing students' health communication education (Campbell et al., 2013).
2.10.3 Acceptance
Regarding acceptance of, and attitudes towards, the implementation and application of MR in FLE, the majority of the manuscripts (n=12) reported a positive attitude among stakeholders towards the technology implementation (e.g. Kruk, 2021; Wu et al., 2021). Only one paper stated that the participants had a neutral attitude, in this case towards the use of Second Life in a collaborative project (Yu et al., 2020). The remaining articles did not clearly specify whether the stakeholders had a positive or negative attitude.
-
2.11 Instructional Design
The implementation of MR technologies in the corpus demonstrated various learning design experiences, including experiences built on 360° videos and images. Taguchi (2022) used 360° videos that were recorded with an Insta360 camera, uploaded to YouTube and accessed by participants through the Oculus Go VR headset. Initially, the participants saw the written situational scenario displayed for 20 seconds and were then given 30 seconds in the video to familiarise themselves with their surroundings. After a person in the video delivered a spoken prompt, the participants produced their own spoken response to that person with the aid of a digital voice recorder. Lin et al. (2021) provided auditory, visual, and textual input through an immersive and interactive scenery-based virtual reality (SBVR) built on 360° video. Specifically, the participants received linguistic content in visual, auditory, and textual forms simultaneously in the SBVR script to improve their linguistic competence in English for tourism purposes.
Real-life tasks were also implemented in the corpus. J. C. C. Chen (2020) incorporated meaningful, real-life tasks into the syllabus for students who wished to practise their English with other speakers around the world in VIRTLANTIS in SL, and Yu et al. (2020) embedded five tasks in a Technology Assisted Language Course at a public university in northern Taiwan. Wang et al. (2021) provided undergraduate students of Chinese at an Australian university with access to a non-compulsory, extra-curricular, online activity space based on the principles of Task-Based Language Learning (TBLL). The activity space included customised language and culture learning tasks that were based on the textbook and designed on the Chinese Island (CI) in SL, together with a custom-built website, “Virtual Chinese”, on Moodle, with instructions, language quizzes and direct access links to locations on the CI. Similarly, the study of Sydorenko et al. (2021) required communication with various speakers/language learners in real-world contexts.
Other instructional design implementations in the corpus varied widely. Yeh et al. (2020) engaged EFL students in making VR content to enhance their intracultural learning through a VR application in an 18-week project. After choosing three tourist sites for their VR presentations, the students were taught how to use the application to make panoramas and interactive features. In the second stage, the students composed their content and put what they had learnt into practice with the aid of their instructors. During the final stage, the students engaged in a peer-review process and completed questionnaires to record their responses. In the study of Yeh and Tseng (2020), project-based learning (PBL) was adopted as an educational framework to support students' improvement in multimodal literacy. The objective of the PBL approach is to have students work with peers to create artifacts. The end product was an augmented reality (AR) application for tourists that allowed users to locate specific points of interest with their phones and interact with those areas once they were there (Yeh & Tseng, 2020).
As for the development of communication skills, the study of Jehma (2021) recorded the participation of students in SL after the teacher had informed them of the class objectives and the tasks to be assigned. Students engaged in oral, written and listening practice based on their tasks, and each class ended with the teacher’s feedback. Finally, the implementation of VR for nursing education revolved around communication purposes and improvement of vocabulary. Specifically, the nursing students working in Comunica-Enf in the study of Hara et al. (2021) interacted with a virtual scenario and communicated with the patient’s avatar, after a nurse educator executed the keyboard commands that gave access to the screens and the patient unit. The scenario closed once the nurse educator judged that the student had achieved the learning objectives or that the communicative attempt had been unsuccessful. If effective communication was established, the avatar patient agreed to undergo the procedure and the player saw a green light and a smile from the patient.
In the study of Sydorenko et al. (2021), the participants played an AR, GPS-enabled, place-based game in groups. The game required players to go around a college campus, stopping at each of five designated sites. Once they arrived at a given site, the game prompted participants to record a video report about the specific green technology displayed there (e.g., solar panels, electric vehicles, rain reclamation). The video reports were then added to the game, so that members of other groups could view them. In order to promote discussion between participants, the rules of the game were purposefully vague and open-ended. Participants engaged in extensive discussion and negotiation to reach their decisions during the game (Sydorenko et al., 2021).
Focusing more on the students’ improvement of medical vocabulary, Wu et al. (2021) described how teacher and student participants, wearing VR headsets and using handheld controllers, were immersed in the fully operational room for 80 minutes in a session designed by the English teacher to support vocabulary learning. All participants went through three stages: Input, Use and Output. In the Input stage, students entered the hospital scene and were asked to perform pre-surgery procedures, such as washing their hands and wearing their masks correctly. They were also instructed by their teacher to pick up surgical equipment in the Modern Operation Room (MOR) while the surgical operation was taking place. This aimed at developing the students’ spatial awareness of the surgical instruments. During the Use stage, students were instructed to participate more actively in the simulated surgery and to finish it under the monitoring and guidance of their teacher. In the Output stage, the students worked in pairs and took turns in the role of a nurse supervisor. This last stage enabled students to put into practice what they had learnt in the previous two stages and to gain a deeper understanding of the vocabulary related to their field of study.
To help participants produce a process analysis essay on global warming, the learning application described in the study of Lin et al. (2022) delivered a series of metacognitive scaffolding exercises at various levels of difficulty. Inquiry-based learning, genre pedagogy, metacognitive scaffolding, and a plot about rescuing an Icelandic polar bear village were among the design components. The mission required participants to explore the green building in order to gather data on global warming before writing their reports on tablets using Bluetooth portable keyboards (Lin et al., 2022).
-
2.12 Task design
A variety of task designs were identified in the manuscripts (see Table 6). Real-and-virtual communication was found to be the main task in the XR manuscripts (Kruk, 2021; Jehma, 2020; J. C. C. Chen, 2020; Sydorenko et al., 2021). For example, the study of Hara et al. (2021) focused on communication with patients in the form of role play, a task also mentioned in Taguchi (2022). Instruction-based tasks were also described, for example in the study of Wu et al. (2021), which focused on performing nursing tasks and on active participation in nursing procedures and actions. The study of Lin et al. (2022) likewise included tasks such as connecting pictures of green buildings using a photo collage, and then depicting the causes and impacts of global warming using a concept map based on YouTube videos.
More demanding tasks include content creation tasks in VR/AR (Yeh et al., 2020; Yeh & Tseng, 2020; Lin & Wang, 2021; C. H. Chen et al., 2021), such as narration in Google Tour Creator (Lin & Wang, 2021), creative tasks for VR content creation (Yeh et al., 2020), and AR content creation in which students work with peers to create artifacts in AR (Yeh & Tseng, 2020). Other tasks described include peer tutoring (C. Y. Chen et al., 2021) and VR/AR environment exploration (Shih, 2020; Wang et al., 2021; Yu et al., 2020; Sydorenko et al., 2021), where students navigate within a virtual environment (such as SL) and learn naturally.
Table 6. Summary of task design

| Task | Examples | Description | Intended outcomes | Potential pitfalls |
| --- | --- | --- | --- | --- |
| Virtual communication | Kruk (2021); Jehma (2020); J. C. C. Chen (2020) | Using VR tools (e.g. VR headsets, Google Cardboard), students are asked to conduct communicative and presentative tasks in the virtual environment (e.g. SL, EduVenture) | To enhance language performance; to develop communication skills (listening, speaking, writing) | Limited space and time; expensive equipment; technology availability and compatibility; technical issues; motion sickness |
| Role play | Hara et al. (2021); Taguchi (2022) | The users have a specific role (e.g. nurse) and communicate (mostly orally) with other users (real or avatars) in a virtual environment | To improve oral proficiency and willingness to communicate; to improve oral fluency for specific purposes (e.g. clinical context); to recognize and handle conflicts/decisions | Difficulties while talking with avatars; greater cognitive load; less fluent speech; class time constraints; limited feedback from instructor |
| Peer tutoring | C. Y. Chen et al. (2021) | Students have roles of tutors and tutees in a VR-enhanced interactive learning system (e.g. in a museum learning guidance model). Tutors are provided with guiding questions to guide tutees’ exploration of the museum | To enhance students’ English-speaking practices | Challenging to promote students’ learning engagement and peer interactions |
| Instruction-based tasks | Wu et al. (2021); Barrett et al. (2021); Pack et al. (2020); Lin et al. (2022); Khodabandeh (2022) | Students are asked to perform a specific task (e.g. move objects, match objects with words/characters, perform a nursing task, color words, form paragraphs, repeat words) in the VR environment, following the instructions given depending on the learning objective | To experience a simulated context (e.g. medical context); to improve language learning (vocabulary, paragraph/writing structure, prepositions); to increase awareness of semantic radicals; to improve long-term memory, motivation, and self-regulated cognition | Requires virtual literacy of the learner; loss of direction, sore eyes, and neck pains; technical issues; need for more teacher support; increased cognitive load; time consuming |
| VR environment exploration | Shih (2020); Wang et al. (2021); Yu et al. (2020); Sydorenko et al. (2021) | Students learn through their avatars after being exposed to content intentionally integrated within a virtual environment (such as SL) (e.g. navigating to another country and being exposed to the target language and culture) | To improve target language learning, vocabulary and culture awareness; to reduce the learners' foreign language anxiety | Students are likely to be demotivated by the feeling of too much freedom; technical difficulties |
| Content creation in VR/AR | Yeh et al. (2020); Lin & Wang (2021); C. H. Chen et al. (2021); Yeh & Tseng (2020) | The students view, design, create and share VR content (e.g. video and/or voice recordings, narrations, images) aiming to demonstrate a specific culture or points of interest | To increase language learning motivation and cultural awareness; to increase multimodal literacy | Technical difficulties; high cognitive loads in the VR environments |
-
2.13 Classroom orchestration practices and intended outcomes
When it comes to classroom orchestration, most of the studies supported solo use of the MR technologies. In more than half of the studies, students were reported to work individually within the immersive environment, while in Yeh et al. (2020) students worked individually and later engaged in a peer-review activity. In Barrett et al. (2020), students worked alone in the virtual environment; the teacher was co-present but did not interfere. A combination of individual and group activity was found in only two studies (Wu et al., 2021; J. C. C. Chen, 2020), while group activity was implemented in one third of the studies. Specifically, in Lin et al. (2021), students performed joint verbalizations in dyads to deepen learning and activate knowledge co-construction. In the experimental study of C. Y. Chen et al. (2021), students were assigned the roles of tutors and tutees, with one tutor present in each group of tutees.

A progressive question prompt-based peer-tutoring strategy in VR situations (PQP-PTVR) was suggested by C. Y. Chen et al. (2021). The teacher grouped the students and assigned them the roles of tutors and tutees, with one tutor supervising two tutees in each group. The tutors typically begin by asking the tutees to give a quick overview of the museum and then, based on the response, follow up with additional questions to elicit more thorough descriptions. When the tutees find a question difficult to answer, they may ask the tutors for advice. Generally, the questions are shallower and more generic in the early rounds and become deeper and more specific later on. Unlike the tutees, who are in the immersive VR environment, the tutors are in a non-immersive VR environment so that they can help and mentor the tutees. When the tutors are unsure of what to ask, they can select "Help" on their tablets to open a question bank containing the guiding questions. Progressive question prompts are given to the tutors in the experimental group. The system initially delivers guiding questions with one word removed and a blank in its place, and the tutors must choose an acceptable word to fill in the blank based on the context provided. Later on, the system raises the difficulty and presents questions in which several words have been replaced with blanks. In the last phase no prompt is given, so the tutors must formulate context-appropriate questions themselves. Six settings were incorporated in the VR system, each with five to eight pre-set rounds of questions and answers. A question bank was created and integrated into the system for the tutors to use as needed. Before starting the assignment, all students viewed a 10-minute English-language video introducing the Beitou Hot Spring Museum (C. Y. Chen et al., 2021).
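The progressive prompting described above (one blank, then several blanks, then no prompt at all) can be illustrated with a minimal Python sketch. This is not the system used by C. Y. Chen et al. (2021): the function names, the rule of blanking the longest words first, the choice of three blanks for the second phase, and the example question are all assumptions made purely for illustration.

```python
import string

BLANK = "____"                    # placeholder shown instead of a removed word
PUNCT = string.punctuation        # stripped when measuring word length


def blank_words(question, n_blanks):
    """Replace the n_blanks longest words of a question with blanks.

    Assumption: the source does not say which words are removed;
    'longest word first' is used here only to keep the sketch deterministic.
    """
    words = question.split()
    # Rank word positions by length (punctuation ignored), earliest first on ties.
    ranked = sorted(range(len(words)),
                    key=lambda i: (-len(words[i].strip(PUNCT)), i))
    hidden = set(ranked[:n_blanks])
    result = []
    for i, word in enumerate(words):
        core = word.strip(PUNCT)
        if i in hidden and core:
            # Keep surrounding punctuation such as a trailing question mark.
            result.append(word.replace(core, BLANK, 1))
        else:
            result.append(word)
    return " ".join(result)


def prompt_for_phase(question, phase):
    """Return the prompt shown in a given phase (1, 2 or 3).

    Phase 1: one blank. Phase 2: several blanks (assumed to be three).
    Phase 3: no prompt, so the question must be formulated unaided.
    """
    if phase == 1:
        return blank_words(question, 1)
    if phase == 2:
        return blank_words(question, 3)
    return None


if __name__ == "__main__":
    q = "What can you see in the museum hall?"   # hypothetical guiding question
    for phase in (1, 2, 3):
        print(phase, prompt_for_phase(q, phase))
```

The sketch only captures the gradual withdrawal of support; in the study itself, the guiding questions came from a pre-set question bank tied to each setting in the VR system.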
Similarly, in the study of Sydorenko et al. (2021), students worked in groups made up of L1 speakers of English (ESs) and English language learners (ELLs). The ESs' goal was to observe interactions involving language learners in real-world contexts and then relate their observations to the SLA theories covered in their pre-service teacher training program, whereas the ELLs' goal was to communicate with a variety of speakers in relatively unstructured tasks outside of class. All students, both ELLs and ESs, wrote up their observations on the experience after the game.
-
2.14 Intended outcomes
Through careful study of the manuscripts, we identified a variety of intended outcomes, that is, the knowledge or skills students were expected to acquire from their participation in the MR studies. Of the 21 manuscripts, the majority focused on communicative and oral development outcomes. In the study by Hara et al. (2021), nursing students engaged in verbal and non-verbal communication with a virtual patient who had undergone a failed venipuncture in a 3D serious game called “Comunica-Enf”. A nurse avatar informed students of the scenario, and the students then had to ask the virtual patient for permission to attempt the venipuncture again. J. C. C. Chen (2020) also studied students’ communication and oral proficiency development in a three-dimensional multi-user virtual environment (MUVE), testing the effects of pre-task planning on EFL learners’ oral performance in Second Life.
Intended outcomes also included the development of skills such as reading, listening, and writing. Pack et al. (2020) used a prototype virtual reality learning environment (VRLE) designed for teaching and learning writing structure. Ma (2021) aimed at improving students’ general English learning ability in a Chinese tertiary education context through a VR-based situational teaching method, with an experimental class and a control class. Other intended outcomes revolved around the creation of VR content (Yeh et al., 2020), performing nursing tasks and procedures as a way of enhancing students’ medical vocabulary (Wu et al., 2021) and student narration of their points of interest in Google Tour Creator (Lin & Wang, 2021).
Additionally, other intended outcomes include the improvement of long-term memory, motivation, and self-regulated cognition in participants' writing development (Lin et al., 2022), the improvement of multimodal literacy (Yeh & Tseng, 2020), and a better understanding of ELF interactions in various circumstances, such as an AR location-based mobile activity (Sydorenko et al., 2021).
-
2.15 Barriers and potential pitfalls related to the implementation and application of MR in FLE
The implementation of MR in FLE comes with certain barriers which might prevent students from fully engaging in the immersive environment. Instances of motion sickness were reported in the corpus. Specifically, in Jehma’s (2021) study, learners reported dizziness when using VR goggles in SL due to nearsightedness, and on some occasions resorted to their smartphone app to watch the VR video. Health-related concerns raised by the Chinese nursing students who used the Modern Operation Room included loss of direction, sore eyes, and neck pains (Wu et al., 2021). Students' complaints about headaches caused by “too many sounds” and “too much time spent in Second Life” were also among the recorded barriers (Kruk, 2021). Finally, Barrett et al. (2021) identified a major challenge when designing the Virtual Reality Learning Environment (VRLE) for learning writing structure, namely how to comfortably render a large amount of text for the participants. When the text was too close to the users, they had to move their head and neck frequently, which caused discomfort and physical strain on their eyes. Instructors and practitioners who wish to design a VRLE for writing need to take this into consideration, since users may not want to use the VRLE for a long time due to frustration and strain. An additional barrier found by Barrett et al. (2021) was that the VR headset could overheat and apply pressure to the participants’ glasses or face.

Other barriers in the corpus revolved around lower levels of confidence and motivation. Apart from headaches, Kruk (2021) reported both decreasing and increasing patterns of self-reported motivation. One of the two participants in this study displayed lower motivation levels when visiting SL because she could not find anybody to chat with. The other participant found chatting with some of her interlocutors boring. Another drawback concerning demotivation, in the custom-built Chinese Island area in SL, was found by Wang et al. (2021). The authors reported a loss of motivation when students were given too much freedom within the immersive environment, as well as difficulty with the self-service delivery of instruction.
In the study of J. C. C. Chen (2020), asking EFL participants with no prior experience of Google Expeditions to present as soon as they set foot in the classroom, without any chance to prepare, was found to be anxiety-inducing. In addition, Google Expeditions was unable to accommodate all participants in a single session, so they were divided into two groups to give them the chance to participate in more active learning in the classroom (J. C. C. Chen, 2020).
Finally, due to the time-consuming nature of AR-based learning at an authentic learning site, more time is spent on instruction (Lin et al., 2022).
-
2.16 Discussion
The first research question of this study aimed to identify the existing MR/VR/AR applications for FLE for specific purposes. Increased attention was directed towards VR applications for language learning for specific purposes (e.g., Barrett et al., 2021; C. H. Chen et al., 2021; J. C. C. Chen, 2020), rather than on AR or MR applications. Only one study was found to focus on MR applications (Shih, 2020), while no study was found to focus on AR. In line with previous studies (Pellas et al., 2020), it was found that modern cardboard and mobile technologies, along with integrated simulations produced by XR, enable a highly engaging learning experience.
The second research question sought to identify the benefits, facilitating factors, barriers and potential pitfalls related to the implementation and application of VR/AR/MR in FLE. Facilitating factors were identified both prior to the activity and during the activity. Before the activity, the tasks should be clarified by the instructor and developed based on the curricula (Ma, 2021). It is important to choose appropriate software and content based on the lesson objectives, as well as to provide instructions and/or training sessions on how to use the specific technology (J. C. C. Chen, 2020; Kruk, 2021; Ma, 2021; Hara et al., 2021; Yeh et al., 2020). During the activity, the instructor plays an important role in providing guidance, support and feedback (Yu et al., 2020; C. H. Chen et al., 2021; Wu et al., 2021). When it comes to the benefits and acceptance of MR in FLE, the use of VR applications in the language learning process has positive results and highlights the importance of authentic context and realistic environments (Ma, 2021; Hara et al., 2021; Taguchi, 2022). This is in line with previous studies which report that extended reality technologies provide great opportunities for interactive learning that promotes learning experiences and outcomes (Chen et al., 2022). Teachers and students are generally positive about using such technologies, especially VR, in their teaching and learning process (J. C. C. Chen, 2020).
To answer the third question, a variety of task and instructional designs were applied within the manuscripts. The most important categories of task design that were extracted include communication tasks, role play, interactive and collaborative tasks, creative tasks, individual tasks and instruction-based tasks. When it comes to instructional design, various learning design experiences were identified, such as experimental and control groups, implementation of real-life tasks, making VR content, activities for participants’ scaffolding, development of communication skills, communication purposes and improvement of vocabulary. With regard to the classroom orchestration practices applied and the intended outcomes, most of the studies supported solo use of the MR technologies, with students working mainly individually within the environment. The intended outcomes focus on communicative and oral development, the development of skills such as reading, listening, and writing, the creation of VR content, medical vocabulary enhancement and narration. The final research question explored the barriers and potential pitfalls of the MR/VR/AR applications. The most common barriers when using the technologies were found to be motion sickness, low levels of confidence and motivation, and anxiety.
-