Heliyon. v.6(11); 2020 Nov

Multimedia tools in the teaching and learning processes: A systematic review

M.D. Abdulrahaman

a Department of Information and Communication Science, University of Ilorin, Ilorin, Nigeria

b Department of Telecommunication Science, University of Ilorin, Ilorin, Nigeria

A.A. Oloyede

N.T. Surajudeen-Bakinde

c Department of Electrical and Electronics Engineering, University of Ilorin, Ilorin, Nigeria

L.A. Olawoyin

O.V. Mejabi, Y.O. Imam-Fulani

d Department of Religions, Faculty of Arts, University of Ilorin, Ilorin, Nigeria

e Department of Mass Communication, University of Ilorin, Ilorin, Nigeria

Access to quality education is still a major bottleneck in developing countries. Efforts at opening access to a large majority of citizens in developing nations have explored different strategies, including the use of multimedia technology. This paper provides a systematic review of different multimedia tools in the teaching and learning processes with a view to examining how multimedia technologies have proven to be a veritable strategy for bridging the gap in the provision of unrestricted access to quality education and improved learners' performance. The review process includes an extensive search of relevant scientific literature, selection of relevant studies using pre-determined inclusion criteria, literature analysis, and synthesis of the findings of the various studies that have investigated how multimedia has been used for learning and teaching processes. The review examines various case study reports of multimedia tools, their success and limiting factors, application areas, evaluation methodologies, technology components, and age groups targeted by the tools. Future research directions are also provided. Apart from text and images, existing tools were found to have multimedia components such as audio, video, animation and 3-D. The study concluded that most of the multimedia solutions deployed for teaching and learning are tailored to the pedagogical content of the subject of interest and to the intended user audience, while the success of the different multimedia tools used on the various target groups and subjects can be attributed to the technologies and components embedded in their development.

Education, Media in education, Teaching/learning strategies, Pedagogical issues, Systematic review

1. Introduction

Multimedia is a combination of more than one media type such as text (alphabetic or numeric), symbols, images, pictures, audio, video, and animations, usually with the aid of technology, for the purpose of enhancing understanding or memorization (Guan et al., 2018). It supports verbal instruction with the use of static and dynamic images in the form of visualization technology for better expression and comprehension (Alemdag and Cagiltay, 2018; Chen and Liu, 2008). The hardware and software used for creating and running multimedia applications are known as multimedia technology (Kapi et al., 2017). Multimedia technology has characteristics such as integration, diversity, and interaction that enable people to communicate information or ideas with digital and print elements. The digital and print elements in this context refer to multimedia-based applications or tools used to deliver information to people for better understanding of concepts.

Indeed, various aspects of human endeavour, especially the educational sector, are being transformed by the advent of Information and Communication Technology (ICT). ICT involves the use of hardware and software for collecting, processing, storing, presenting, and sharing information, mostly in digital form. Multimedia technology is an important aspect of ICT that deals with how information can be represented and presented digitally, using different media such as text, audio, and video, among others (Guan et al., 2018). It involves the combination of several technologies to provide information in the best possible formats, packages, and sizes.

However, when used in the classroom or for educational purposes, the design quality and sophistication of a multimedia application must be high enough to combine the different elements of the cognitive processes so as to mimic the teacher as closely as possible. There are different types of multimedia applications available in the market today. These applications have been deployed for different educational purposes, including Mathematics classes, Social Sciences, Sciences, Physiology, Physics and Physical Education Studies (Al-Hariri and Al-Hattami, 2017; Anderson, 1993; Chen and Liu, 2008; Chen and Xia, 2012; Ilhan and Oruc, 2016; Jian-hua & Hong, 2012; Milovanovi et al., 2013; Shah and Khan, 2015).

The central problem, however, remains the same: how to use the applications to provide students with a stimulating experience by delivering information for better understanding of concepts. While it is important to develop various applications for effective teaching delivery, each of these applications has its own focus area, peculiarities, target age, merits and demerits. Thus, the taxonomy and component synthesis for the development of multimedia applications need to be extensively investigated, as these affect teaching delivery, learning and wider applicability. Some of the multimedia solutions have been deployed, tested and recorded significant success, while others did not record even marginal success.

The success stories also vary with location, target age and deployment purpose. Therefore, the aim of this paper is to provide a systematic review of published scientific studies that examined different multimedia tools in the teaching and learning process. In other words, the study, through a systematic review of the literature, aims at identifying the existing multimedia-based tools for teaching and learning and at understanding their usage and limiting factors, application areas, evaluation methodologies, technology components synthesis and impacts on the education system.

To this end, the study is guided by the following research questions:

  • (1) What are the existing multimedia tools in teaching and learning?
  • (2) What type of multimedia component fits an audience?
  • (3) What types of multimedia components are adopted in the existing tools?
  • (4) What evaluation methodologies are useful for successful outcome?
  • (5) What factors aid success or failure in the use of multimedia tools for teaching and learning?

The outcome of this study is intended to serve as a guide for teachers and education administrators when selecting multimedia tools and applications for teaching in schools. To that end, the taxonomy and component synthesis of some widely cited multimedia applications are provided, and various case studies and results are examined. Furthermore, barriers limiting the usage of ICT and multimedia in teaching and learning are identified, and some unresolved cases and future research directions are outlined.

The subsequent parts of this paper include Section 2 , which is the literature review that examines multimedia technology and its place in teaching and learning; Section 3 , the research methodology; Section 4 , presentation of results; Section 5 , discussion of the findings; and Section 6 , the conclusion, recommendations and suggestions for future work.

2. Literature review

2.1. Multimedia learning and teaching: concepts and resources

Multimedia or digital learning resources help learners build mental representations through the use of different media elements, which support information processing. Information, which is made up of content and sometimes learning activities, is presented by digital learning resources using a combination of text, image, video and audio. Research on using multimedia for learning has demonstrated that more positive results are observed in learners who combine pictures and words than in those who use words only (Chen and Liu, 2008; Mayer, 2008). As stated in Eady and Lockyer (2013), different pedagogical methods have been implemented through the use of digital resources. Their paper presented how the authors were able to introduce topics to students, demonstrate to them, stimulate a group, make different text types available and engage students in an interactive manner.

Generally speaking, multimedia technology for educational purposes can be categorised according to whether they are used for teaching or for learning. Some of the different multimedia or digital learning resources are listed in Eady and Lockyer (2013) . Furthermore, according to Guan et al. (2018) , several studies have established the importance of multimedia technologies to education and the widespread adoption of multimedia tools. Multimedia generally involves the use of technology and the widespread adoption of multimedia applications in education is as a result of its many benefits ( Almara'beh et al., 2015 ). Some of the benefits of the multimedia application tools for teaching and learning are summarized as follows:

  • (1) Ability to turn abstract concepts into concrete content
  • (2) Ability to present large volumes of information within a limited time and with less effort
  • (3) Ability to stimulate students' interest in learning
  • (4) Ability to give the teacher insight into students' position in learning.

Multimedia designed for learning supports the process of building mental representations from words and pictures in different contexts. Such materials are designed to assist learning with tools that can be used in presentations, classroom or laboratory learning, simulations, e-learning, computer games, and virtual reality, thereby allowing learners to process information in both verbal and pictorial forms (Alemdag and Cagiltay, 2018). Designing multimedia for learning requires an understanding of theories such as the cognitive theory of multimedia learning, which postulates three assumptions that describe how people learn from instructional multimedia materials: dual channel, limited capacity, and active processing (Alemdag and Cagiltay, 2018). The dual-channel assumption holds that learners have separate channels for processing visual and auditory information. The limited-capacity assumption holds that there is a limit to the amount of information that can be processed in each channel. Understanding these assumptions helps teachers avoid overwhelming learners with too much information and makes learners aware of their own information-processing limits and capabilities. The active-processing assumption proposes that, when it comes to selecting, organizing, and integrating information, human beings are active agents who are capable of managing the forms of information they interact with.

The appropriate use of ICT in teaching transforms the learning environment from teacher-centred to learner-centred (Coleman et al., 2016), just as it is transforming all aspects of human life (Guan et al., 2018). Coleman et al. (2016) emphasised that the shift from teaching to learning creates student-centred learning in which teachers act as facilitators rather than sages on the stage, thus changing the role of the teacher from knowledge transmitter to facilitator, knowledge navigator and co-learner. Keengwe et al. (2008a) concluded that the application of multimedia technologies ensures a very productive, interesting, motivating, interactive and quality delivery of classroom instruction while addressing diverse learners' needs.

2.2. Role of multimedia technology in teaching and learning

Technology is evolving and scholars in the areas of Information Technology (IT) and education technology are continuing to study how multimedia technologies can be harnessed for the enhancement of teaching and learning. A software tool can be used to expand teaching and learning in various fields. It is important to provide students with practical experience in most fields of learning.

The importance of multimedia technologies and applications in education as a teaching or learning tool cannot be overemphasized. This has been confirmed in several studies that have investigated the impact of multimedia technology on the education system. Milovanovi et al. (2013) demonstrated the importance of using multimedia tools in Mathematics classes and found that the multimedia tool greatly enhances students' learning. Several other works show that multimedia enhances students' learning (Aloraini, 2012; Al-Hariri and Al-Hattami, 2017; Barzegar et al., 2012; Chen and Xia, 2012; Dalacosta et al., 2009; Jian-hua & Hong, 2012; Janda, 1992; Keengwe et al., 2008b; Kingsley and Boone, 2008; Shah and Khan, 2015; Taradi et al., 2005; Zin et al., 2013).

Multimedia communication closely resembles face-to-face communication. It is less restricted than text and ensures better understanding (Pea, 1991). Multimedia technology helps simplify abstract content, accommodates individual differences and allows diverse representations to be coordinated from different perspectives. Using a computer-based technique as an interface between students and what they are learning, with suitable fonts and design, can be very valuable.

Multimedia technology certainly brings about improvement in teaching and learning; however, the technology has a number of limitations for educational purposes. These include unfriendly programming or user interfaces, limited resources, lack of required knowledge and skill, limited time and high cost of maintenance, among others (Al-Ajmi and Aljazzaf, 2020; Putra, 2018).

2.3. Multimedia evaluation techniques

Evaluation entails assessing whether a multimedia programme fulfils the purposes set including being useful for its target audience. Kennedy and Judd (2007) make the point that developers of multimedia tools have expectations about the way they will be used which could be functional (focused on the interface) or educational (involving the learning designs, processes and outcomes). It is important to note that there are different methods used in the evaluation of multimedia and most evaluations entail experiments, comparisons and surveys. The primary goal is to balance assessment validity with efficiency of the evaluation process ( Mayer, 2005 ).

Survey research has two common key features – questionnaires (or interviews) and sampling, and is ideally suited for collecting data from a population that is too large to observe directly and is economical in terms of researcher time, cost and effort when compared to experimental research. However, survey research is subject to biases from the questionnaire design and sampling including non-response, social desirability and recall and may not allow researchers to have an in-depth understanding of the underlying reasons for respondent behaviour ( West, 2019 ; Kelley et al., 2003 ).

Generally, comparison studies follow the format of comparing outcome from an experimental group using the multimedia being evaluated against a control group. This method has been criticised for having inadequate treatment definition, not specifying all treatment dimensions and failure to measure treatment implementation, among others ( Yildiz and Atkins, 1992 ).

Given the subjective nature of surveys and the limitations of comparison studies, eye tracking and other measures of student behaviour, such as emotional response, provide information that is not consciously controlled by the student or researcher and are used as objective data-gathering techniques. Eye tracking research is a multi-disciplinary field that tracks eye movements in response to visual stimuli (Horsley et al., 2014). Data from eye tracking allow researchers to validate, empirically and objectively, how learners comprehend the multimedia content, where the learner's attention is directed while analysing the content, and the cognitive demand of the content (Molina et al., 2018). Eye tracking is particularly useful as a source of information in the case of children, because gathering information with traditional techniques is more difficult, especially when it involves children's interests and preferences (Molina et al., 2018).

Earlier attempts at analysing student behaviour while engaging with online material included analysing student access computer logs, and the frequency of participation and duration of participation ( Morris et al., 2005 ). Nie and Zhe (2020) demonstrated that the conventional method of manually analysing student behaviour is gradually becoming less effective compared to online classroom visual tracking. They found that the online classroom visual tracking behaviour can be divided into several components: selection, presentation, mapping, analysis and collection, as well as the analysis from students' facial expression.

Several works use student behaviour tracking to examine how students interact with multimedia learning tools. For instance, Agulla et al. (2009) incorporated student behaviour tracking into a learning management system (LMS) to provide information on how much time the student spent in front of the computer examining the contents. They did so through the use of face tracking, fingerprint and speaker verification. Alemdag and Cagiltay (2018) conducted a systematic review of eye-tracking research on multimedia learning and found that while this research method was on the rise, it was mainly used to understand the effects of multimedia use among higher education students. They also identified that although eye movements were linked to how students select, organise and integrate information presented through multimedia technologies, metacognition and emotions were rarely investigated with eye movements.

Molina et al. (2018) used eye-tracking in evaluating multimedia use by primary school children. Some studies have used a combination of eye tracking data and verbal data in order to gain insight into the learners' cognitions during learning and how they perceived the learning material ( Stark et al., 2018 ).

As much as eye-tracking and other behavioural research present opportunity for objective evaluation, difficulty of interpretation is one of the limitations of eye-movement data ( Miller, 2015 ), and it is not surprising that the traditional methods of evaluation through questionnaire administration and surveys are still commonly used.

3. Research methodology

This study adopted a research design that involves a search method for identifying the articles to be reviewed in order to address a specific research problem, followed by a systematic review of the article contents for analysis and synthesis. The systematic review follows the procedure outlined in the Preferred Reporting Items for Systematic Review and Meta-Analysis Protocols (PRISMA-P) 2015 guideline as provided in the work of Moher et al. (2015), an extension of Liberati et al. (2009). The guideline is intended to facilitate a carefully planned and documented systematic review in a manner that promotes consistency, transparency, accountability and integrity of review articles. Although it was originally developed for the analysis of health-related studies, it is now widely adopted in other fields of study. Furthermore, the study follows a protocol that includes identifying the data sources for the search, the keywords for the search and the inclusion criteria. To aid synthesis of the identified articles, key points from the articles are summarised in tables and quantifiable components are analysed.

3.1. Data sources

The quality of a systematic review starts with the data sources used for identifying the articles to be selected for the review. This requires a thorough search and scrutiny of existing literature from a variety of academic databases and journals. The academic databases and journals considered for this review include Science Direct, IEEE Xplore, ACM Digital Library, Google Scholar, Springer, Wiley Online Library, Taylor & Francis, EBSCOHOST, Web of Science, and Scopus. These databases are reputable bibliographic sources, and journals or conference papers indexed in them are deemed reputable and of good quality.

3.2. Search keywords

In order to ensure appropriate primary search terms are used and relevant papers are carefully selected for the review purpose, the literature search method of Kitchenham et al. (2009) was adopted. While it is expected that searching on a main string should be sufficient for the query output to collect all related papers, this is not the case always; hence the inclusion of substrings. Some problems associated with the databases used for the study are:

  • Some do not have automatic root recognition
  • Some limit how many words can be used in a query (e.g. IEEE, 15 words)
  • Some databases offer advanced or expert search
  • ACM, IEEE and others do not have anything, not even a precedence rule.

The search terms for relevant literatures in the academic databases and journals specified in section 3.1 , are: “multimedia”, “multimedia technology”, “multimedia technology + Education”, “ICT impact + Education”, “multimedia tools + Education”, “multimedia + Teaching”, “multimedia + Learning”, “Application Software + Education”, and “Digital + Education”.

3.3. Inclusion and exclusion criteria

For the purpose given, each of the articles from the consulted academic databases and libraries had an equal chance of being selected. In order to avoid bias in the selection, a clear principle was set and adopted to form the criteria for inclusion of papers. These criteria are presented in Table 1 .

Table 1

Inclusion criteria of articles.

Thus, the queries using the stated search strings led to a pool of 10,972 articles in the subjects of interest that were online and written in English. All publications found as at the time of the search, which was in May 2019, were included. Publication date constraint for including a paper in the study was not applied. The process of screening this pool of 10,972 articles to meet the purpose of the study is outlined in the next section.

3.4. Exclusion from pooled articles

The number of articles from the database keyword search was reduced in line with the elimination procedure outlined as follows:

  • i. elimination of paper based on unrelated title and elimination of duplications from various sources, leading to a reduction from 10,972 to 1,403;
  • ii. examination of the abstracts of the 1,403 articles and reduction from 1,403 to 505;
  • iii. elimination based on the direction of the article after reading through, leading to reduction from 505 to 78.

The elimination procedure is represented in Figure 1 which shows the flow of the procedure for screening the articles for the study.

Figure 1

Literature elimination process.
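
The arithmetic of the elimination procedure can be checked with a short script. The following is a minimal sketch, not part of the original study: the stage counts are taken from the text above, while the stage labels and the function name `screening_summary` are illustrative assumptions.

```python
# Article counts at each screening stage, as reported in Section 3.4.
# The labels are paraphrases chosen for this sketch (assumption).
stages = [
    ("keyword search (pooled articles)", 10972),
    ("after title screening and de-duplication", 1403),
    ("after abstract screening", 505),
    ("after full-text reading", 78),
]


def screening_summary(stages):
    """Return (label, kept, excluded) for every step after the first stage."""
    rows = []
    for (_, before), (label, after) in zip(stages, stages[1:]):
        rows.append((label, after, before - after))
    return rows


for label, kept, excluded in screening_summary(stages):
    print(f"{label}: kept {kept}, excluded {excluded}")

# Total number of articles removed between the pool and the final set.
total_excluded = sum(excluded for _, _, excluded in screening_summary(stages))
print(f"total excluded: {total_excluded}")
```

Tabulating the number excluded at each step in this way makes it easy to confirm that the per-step exclusions add up to the difference between the pooled and the final number of articles.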

Table 2 provides a summary of the databases visited and the respective number of articles (from the final 78) that were obtained from that source.

Table 2

Search databases and number of articles.

Table 2 shows the number and percentage of articles sourced from each academic database. Science Direct accounts for the highest number of related articles with 25 (32%) papers, followed by Google Scholar with 20 (26%) and IEEE Xplore with 12 (15%). Springer accounts for 8 articles, which represents 10% of the reviewed papers, while ACM Digital Library, Taylor & Francis, Web of Science and EBSCOHOST contribute 4 (5%), 2 (3%), 4 (5%) and 2 (3%) respectively. Wiley Online Library contributes the fewest, with one paper, which represents 1% of the papers reviewed for this study.
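
The percentages quoted for Table 2 can be reproduced from the per-database counts. The snippet below is a small illustrative sketch; the counts come from the paragraph above, and rounding to the nearest whole percent is an assumption about how the reported figures were derived. Scopus is listed among the data sources in Section 3.1 but is not given a count in the text, so it is omitted here.

```python
# Number of reviewed articles per database, as stated in the text.
counts = {
    "Science Direct": 25,
    "Google Scholar": 20,
    "IEEE Xplore": 12,
    "Springer": 8,
    "ACM Digital Library": 4,
    "Web of Science": 4,
    "Taylor & Francis": 2,
    "EBSCOHOST": 2,
    "Wiley Online Library": 1,
}

total = sum(counts.values())

# Share of each database, rounded to the nearest whole percent (assumption).
shares = {name: round(100 * n / total) for name, n in counts.items()}

print(f"total articles: {total}")
for name, n in counts.items():
    print(f"{name}: {n} ({shares[name]}%)")
```

Under that rounding assumption the counts sum to the 78 shortlisted articles and the computed shares agree with the percentages reported in the text.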

3.5. Data collection and synthesis of results

Based on the selection mechanism, 78 articles were shortlisted for analysis. Each article was reviewed and information extracted from it for tabulation. The information sought included the following: the type of multimedia tool used, the focus area of the tool, the technology that was deployed, the multimedia components used within the tool, how the tool was applied – whether for teaching or learning or both, the location where the tool was tested, and the target age on which the tool was tested. The researchers also tabulated impressions gleaned from the review in a “comments” column. If the tool was evaluated, then the evaluation methodology, target group, sample size, outcome, limitations of the methodology and whether or not the outcome could be generalized, were also presented.

In the next section, the insights from the articles reviewed are presented and some of the findings presented in tables for ease of analyses and synthesis.

4. Results

After careful application of the selection procedures outlined in section 3, each of the 78 shortlisted articles was subjected to a systematic review, which involved extracting the information itemised in section 3.5. This information was tabulated for further analysis. Not all the articles were empirically based or contained the desired data items. Nineteen articles based on experimental work reported the details of the multimedia tool developed or deployed. Furthermore, 13 articles with details of the evaluation of the use of multimedia tools in teaching and learning were identified. Barriers to the use of multimedia were also revealed. The findings from the systematic review are presented in this section.

The set of articles reviewed clearly emphasized the importance of multimedia technology to the improvement of the teaching and learning environment. Several studies that investigated the impact of ICT on education stated that multimedia technology has a positive impact on the way teachers impart knowledge and the manner in which learners comprehend subject matter. The review also revealed that several multimedia-based tools exist, most of which are based on subject, field, age or level at various institutions of learning. In addition, some of the reviewed papers investigated the impact of teaching and/or learning with multimedia-based instructional materials using descriptive, qualitative and quantitative research methods with different focus groups for both the pre-test and post-test conditions.

Nevertheless, despite the impact of multimedia tools on the improvement of teaching and learning activities, they can be counterproductive if the computer-based tools are not properly designed or the instructional materials are not well composed. The review showed that multimedia adoption in education requires adequate understanding of the technology and of the multimedia types or components required to properly represent concepts or ideas. This implies that a teacher must understand the learners and know what technology or tool needs to be adopted at a given time for a set of targets. According to the reviewed articles, the target groups determine the type of multimedia components employed while preparing instructional materials and the ways they are to be delivered. To provide context, a review of some of the analysed case studies is presented next.

Huang et al. (2017) explored the use of multimedia-based teaching materials that include three-view diagrams (3D) and tangible 3D materials to teach a 3D modelling course. The aim was to determine the influence of multimedia technology on the meta-cognitive behaviour of students. The authors employed lag sequential analysis as well as interview methods to examine the pattern transformation of students' meta-cognitive behaviour while solving problematic tasks. The evaluation results show that different teaching methods and materials produce different meta-cognitive behaviours in students. The results further revealed that, compared to traditional instructional instruments, using 3D tangible objects in cognitive apprenticeship instruction stimulates more meta-cognitive behaviour. To teach an introductory course on control theory and programming in MATLAB, a video-based multimedia guide was created by Karel and Tomas (2015) for distance learning students using the Camtasia Studio 7 program. The software can record the screen, edit video and create DVD menus. Based on student feedback, the impact of the multimedia aid was evaluated as positive.

Zhang (2012) created an online teaching and learning resource platform with interactive and integrated features. The platform was created with Macromedia Flash version 8.0, a form of Computer-Aided Drawing (CAD) software that is very easy to use. To test students' professional cognition and operational skill cognition as well as learning satisfaction during the learning phase, an experimental technique that utilizes a non-equivalent pre-test and post-test control group was adopted. The evaluation revealed no significant difference between the groups in terms of professional cognition and operational skill cognition. However, a significant difference was noted in learning satisfaction, with greater satisfaction in the coursework delivered with multimedia Flash compared to the traditional learning method.

Web-based multimedia software is another popular type of educational tool designed to enhance teaching and learning. A major constraint of web-based learning lies in its ability to provide personalised learning materials. Hwang et al. (2007) presented a web-based tool for creating and sharing annotations. They then investigated the effect of the tool on learning, using college students as a case study after four months of using the tool. The study concluded that there is value in understanding the use of collaborative learning through shared annotation. The authors also administered a GEFT test to the students and concluded that there was no significant divergence between field-dependent and cognitive style students in the quantity of annotation. The paper also concluded that the tool strongly motivated students to study for their final examination.

Similarly, Bánsági and Rodgers (2018) developed a graphical web-based educational application for liquid-liquid extraction with the help of ternary phase diagrams. The application allows chemical engineering students of the University of Manchester to draw liquid-liquid two-phase equilibrium curves and calculate mixture phase separation, among others. The application was put into use for testing purposes, during which student usage figures as well as student opinions were sampled for both full-time taught and distance learning courses. The application, based on HTML 5, JavaScript, and Cascading Style Sheets (CSS), is interactive and easy to use. In order to further analyse the web application, an iTeach questionnaire for assessing the efficiency of individual pedagogical approaches was administered to students. The study revealed that students find the application useful, as it has increased their level of understanding of the course.

In order to teach students how to compose and deliver text-based information in various media forms for current and emerging technologies, Blevins (2018) had students search for and analyse various multimedia technologies used in new media and reflect on their current and future work through scaffolded project-based activities. The students were taught Augmented Reality (AR) software in a specific way, on the assumption that this method would change how students approach their next AR project. After the students were evaluated, the assumption was found to hold even more strongly than expected.

Ertugrul (2000) provided an overview of some LabVIEW application software for teaching. The focus was on the user-friendliness and compatibility issues faced by users. The paper provided recommendations on selection criteria. Even though the software applications have been found very useful and could complement conventional practical teaching, particularly where there is a shortage of laboratory facilities, they are not suitable for engineering courses that require hands-on, intensive practical work. Davies and Cormican (2013) identified the fundamental principles needed when designing a multimedia training tool or material for effective teaching and learning. The principles considered both the students' and the instructor's perspectives. Experiments were conducted in Ireland using a computer aided design (CAD) training environment. Mixed methods (i.e. interviews, surveys and a group discussion) were employed for data collection, and the findings showed that computer-based material is the most effective and popular way to learn. However, costs, perceived lack of skill and insufficient support could be hindering factors.

The Department of Computer Science at UiTM Negeri Sembilan developed three applications, namely Greenfoot, Visualization makes Array Easy (VAE) and e-Tajweed Yaasin. Greenfoot as a Teaching Tool in Object Oriented Programming is a tool that creates scenarios to ease visualization of 2D object interaction in teaching object-oriented programming. The term “scenario” is used in Greenfoot to mean a project, and it has been used as a teaching aid for an introductory object-oriented programming (OOP) language course. To ensure that a standard, quality application was built, the teaching aid was developed using the System Development Life Cycle (SDLC). The Greenfoot-based scenario showed a great improvement in visualization and object element interaction and an impressive engagement of students during the learning process. The application also provided a clear illustration of object-oriented concepts to students and enabled them to develop a game-like application from the scenario provided.

Visualization makes Array Easy (VAE), on the other hand, was created using the ADDIE model for instructional design, which is made up of Analysis, Design, Develop, Implement and Evaluate stages. The analysis stage recognised visualization as a key factor in enhancing students' understanding of programming concepts. In the design stage, it took about a week to create a storyboard, and MS PowerPoint with i-Spring and Video Scribe formed the principal software for developing the application, using the storyboard as a guide. VAE was instrumental in teaching students some hard programming concepts such as arrays. The results of a simple test with 60 students showed the simulation technique of VAE to be effective in helping students learn the concepts. To determine the effectiveness of the VAE prototype, its learnability, efficiency, memorability and accuracy, as well as the satisfaction of students, were examined.

The e-Tajweed Yaasin software was also developed using the ADDIE model (Analysis, Design, Develop, Implement and Evaluate) as an e-learning application. The tool was intended to help students master tajweed and avoid common mistakes made by previous students who had taken the course. During the analysis stage, visualization and interactive techniques were recognised as helpful in ensuring that students understand tajweed properly and are able to study with ease. The design stage involved designing the application layout with a focus on easy accessibility to users. In addition, its user interface imitates the traditional teaching method called syafawiah. The development stage involved the use of MS PowerPoint with i-Spring features. The combination of audio, video and animation was more effective than text alone in promoting learning. A sample of 51 students was selected to use the system, and they were later evaluated on their ability to read the surah of Yaasin. A great improvement was observed, as the number of mistakes fell across all the rules and students were better able to recognise and practise the tajweed for the surah of Yaasin (Kapi et al., 2017).

Kapi et al. (2017) compared the effectiveness of three multimedia applications for teaching and learning. The applications considered were the Greenfoot tool for programming, Visualization makes Array Easy (VAE) and e-Tajweed Yaasin. The comparison looked into the design models used in meeting the desired instructional needs. Findings from the paper showed improved student performance and learning, and better understanding of the subjects taught.

The advantages of using multimedia tools to teach Physics, which most students think is difficult, are enumerated in Jian-hua & Hong's (2012) work. They established that effective application of multimedia technology in university physics teaching can change the form of information by integrating graphs, text, sound and images on a PC, improving the expressive force of the teaching content so that students can actively participate in multimedia activities through multiple senses. High-quality university physics multimedia courseware is the best means of providing a variety of audio-visual images, which can vividly show many physical processes and phenomena that are difficult to show by common means. In particular, the tool combines the advantages of multimedia courseware for university physics with those of traditional physics teaching, and it greatly helped in improving physics teaching results (Jian-hua & Hong, 2012).

Two researchers developed a culturally responsive Visual Art Education module at the secondary level to help teachers integrate and implement multicultural education in teaching and learning practices at schools. The aim was to enhance students' knowledge and awareness of the elements of art and culture inherited by each race that makes up the multiracial society in Malaysia. The Microsoft PowerPoint authoring tool was used to combine visual art materials, including images and text, into a multimedia interactive teaching material for teaching 60 secondary school students. This resulted in accelerated teaching and learning processes, with the IT skills of the teachers greatly improved (Maaruf and Siraj, 2013).

A pre-test, post-test design with two groups was used for the implementation of a developed multimedia tool over 20 weeks. The tool, multimedia aided teaching (MAT) with text, audio, video and animation, was applied to 60 science students aged less than 15 years. Valid and reliable questionnaires were used as data collection tools. The Attitude Towards Science Scale (ATSS) was used to measure the attitude of both groups before and after treatment. The independent sample t-test was used to analyse the data. The results indicated that MAT is more effective than the traditional method of teaching, and that students' attitude towards science improved with the use of MAT (Shah and Khan, 2015).
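The independent sample t-test mentioned above can be sketched in a few lines. The following is a minimal illustration of the pooled-variance form of the test, not the analysis carried out by Shah and Khan (2015); the function name and the scores are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from math import sqrt
from statistics import mean, variance


def independent_t_test(group_a, group_b):
    """Return the pooled-variance t statistic and degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    dof = n_a + n_b - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / dof
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = (mean(group_a) - mean(group_b)) / std_err
    return t_stat, dof


# Hypothetical attitude scores, NOT data from the reviewed study.
mat_group = [4, 5, 6, 7, 8]
traditional_group = [3, 4, 5, 6, 7]

t_stat, dof = independent_t_test(mat_group, traditional_group)
print(f"t = {t_stat:.2f}, degrees of freedom = {dof}")
```

The resulting t statistic would then be compared with the critical value of the t distribution for the given degrees of freedom to judge whether the difference between the two groups is statistically significant.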

The effect of multimedia tools on the performance of 67 grade 4 social studies students in Kayseri, Turkey was presented. A teaching tool based on computer representation, with text, audio, video and animation as its components, was evaluated using a control group and an experimental group. The study concluded that the academic performance of students in social studies was greatly improved when the multimedia technique was applied, compared with the traditional classroom (Ilhan and Oruc, 2016).

Two samples of 60 senior secondary school II students in two different schools in Lagos State, Nigeria, were selected for the pre-test, post-test control group quasi-experimental design in the research by Akinoso. A Mathematics Achievement Test (MAT) with twenty-five questions from four topics, namely logarithm; percentage error; range, variance and standard deviation; and circle theorems, was the tool used. It was concluded that the students in the experimental group, where the multimedia tool was used, performed better than those in the control group. It was also inferred from the observations of the researcher and the experimental group's teacher that students' interest, motivation and participation increased (Akinoso, 2018).

Specifically, in the field of engineering, laboratory software applications can be used to provide practical alternatives to students depending on their requirements. Ertugrul (2000) provided a review of LabVIEW software applications. The paper provided some knowledge about laboratory software tools used in the field of engineering and concluded that computer-based technology has advanced to the stage where it can aid engineering education at a significantly low price. The paper also highlighted some challenges faced by institutions in selecting and using such software, such as the need to upgrade software as the curriculum changes, and outlined some future trends.

Zulkifli et al. (2008) examined a self-calibrating automated system for depressed cladding applications, which they demonstrated using the Laboratory Virtual Instrument Engineering Workbench (LabVIEW) software and a General-Purpose Interface Bus (GPIB) interface. The presented model confirmed that the overall experiment time was reduced by 80% and that the data obtained were more accurate than when carrying out the experiment physically. Similarly, Teng et al. (2000) presented LabVIEW as a teaching aid for use as a power analyzer. The paper showed that the tool allows development to be accelerated, as it provides a connection between different workbench instruments.

The structured information extracted from the relevant reviewed articles is presented in the next sections. The systematic review enabled us to extract information from the reviewed articles on the type of multimedia tool each article described, the type of technology the tool deployed, the multimedia components utilized, and whether the tool applied to a teaching or learning scenario or both. Furthermore, results from articles reviewed for their evaluation studies are also presented, including barriers to multimedia use.

4.1. Multimedia tools, technology, components and applications

The type of multimedia tool described in each reviewed article, the technology deployed, the multimedia components utilized, and whether the tool applied to teaching, learning or both are summarised in Table 3.

Table 3

Summary of multimedia tools, technology, components and applications for education.

Various multimedia tools were identified in the research papers reviewed. Perhaps owing to the advancement in multimedia technology, several applications have been developed and deployed to enhance teaching skills and the learning environment in many fields of study. These include subject-specific tools such as those for teaching and learning Mathematics (Akinoso, 2018), the Chinese language (Wu and Chen, 2018), Physics (Jian-hua & Hong, 2012) and Social Studies (Ilhan and Oruc, 2016). All the multimedia tools were developed for teaching except the CENTRA tool (Eady and Lockyer, 2013) and the e-Tajweed Yaasin tool (Kapi et al., 2017). Likewise, all the tools handled learning except the web-based application reported by Bánsági and Rodgers (2018) and the multimedia interactive teaching material (Maaruf and Siraj, 2013).

The tools fell into two categories: standalone or web-based. About one-third (36%) were web-based while 64% were standalone.

Technologies identified varied widely. Multimedia tools used included advanced technologies such as computer representation (Akinoso, 2018; Aloraini, 2012; Ilhan and Oruc, 2016; Milovanovic et al., 2013) and augmented reality (Blevins, 2018). High-level web design and programming software were also utilized. For instance, Bánsági and Rodgers (2018) and Hwang et al. (2007) utilized HTML5, JavaScript and Cascading Style Sheets (CSS), which are technologies commonly used for web site programming. Camtasia Studio 7 software was used in the development of a video-based multimedia guide for teaching and learning (Karel and Tomas, 2015).

A commonly used web design and animation software, Macromedia Flash, was also identified (Zhang, 2012). Object-oriented programming software was reported by Kapi et al. (2017) for the Greenfoot multimedia tool. Some low-end technologies such as word-processing (Eady and Lockyer, 2013) and presentation software (Kapi et al., 2017) were also utilised. Other technologies reported include e-books (Wu and Chen, 2018), computer aided design (CAD) (Davies and Cormican, 2013) and YouTube (Shoufan, 2019).

As shown in Table 3, several multimedia components were identified. These included text, audio, video, image, animation, annotation and 3D, with several of the multimedia tools combining two or more components. However, the incorporation of 3D was reported only by Huang et al. (2017). All the analysed papers incorporated text in the multimedia tool reported, except for the CENTRA tool (Eady and Lockyer, 2013). Animation was also embedded in the multimedia tools developed for visualisation (Kapi et al., 2017), for teaching Social Studies (Ilhan and Oruc, 2016), for engineering virtual learning (Ertugrul, 2000), for CAD (Davies and Cormican, 2013), for augmented reality (Blevins, 2018) and for teaching Mathematics (Akinoso, 2018). Figure 2 shows the trend in educational technology based on the year of publication of the reviewed articles. The figure reveals that while incorporation of audio and video became common from 2012, 3D makes its first appearance in 2017. This suggests that as new ICTs emerge, educators are likely to try them in the quest for the best learning experience possible.

Figure 2

Educational technology trend based on year of publication.

4.2. Multimedia tools test location and target age

In this section, information on the location where the multimedia tool was tested and the target age of the study group are presented as summarised in Table 4 . The table also includes comments about the articles that could not be captured under any of the tabulation headings.

Table 4

Summary of multimedia tools for education study locations.

The multimedia tools tested were reported in studies from various countries, including Nigeria (Akinoso, 2018), Saudi Arabia (Aloraini, 2012), England (Bánsági and Rodgers, 2018), Ireland (Davies and Cormican, 2013), Australia and Canada (Eady and Lockyer, 2013), Taiwan (Huang et al., 2017), Turkey (Ilhan and Oruc, 2016), the Czech Republic (Karel and Tomas, 2015), Malaysia (Maaruf and Siraj, 2013), Serbia (Milovanovic et al., 2013), Pakistan (Shah and Khan, 2015) and China (Wu and Chen, 2018).

Various age groups were targeted by the multimedia tool tests. A considerable proportion involved university students with ages starting from 16 or 18 years as specified in the articles (Bánsági and Rodgers, 2018; Huang et al., 2017; Hwang et al., 2007; Jian-hua & Hong, 2012; Kapi et al., 2017; Karel and Tomas, 2015). Another group targeted were secondary school students (Akinoso, 2018; Maaruf and Siraj, 2013), including vocational school students (Wu and Chen, 2018). Shah and Khan (2015) reported testing their multimedia tool on children below the age of 15 years.

4.3. Evaluation methods of multimedia technology tools in education

The articles involving evaluation were examined to identify the methodologies used for the evaluation, the target groups and sample of the evaluation and the evaluation outcome. The limitations of the evaluation were also identified and whether or not the study outcome could be generalized. Thirteen articles were found and the results are presented in Table 5 .

Table 5

Summary of evaluation methods of multimedia technology tools in education.

Evaluation of multimedia technology used for teaching and learning is important in establishing the efficacy of the tool. For determining the impact of a developed tool, an experimental evaluation is more meaningful than a survey. However, the results from the analysis showed that the survey method was used for evaluation nearly as often as the experimental design.

Experiment-based evaluation was conducted by Akinoso (2018), Aloraini (2012), Ilhan and Oruc (2016), and Shah and Khan (2015) in order to determine the effectiveness of the multimedia tool they developed. Another group of experimental evaluations involved teaching with or without multimedia aids that were not necessarily developed by the research team. These exposed 10–11 year olds (Dalacosta et al., 2009) and elementary school students (Kaptan and İzgi, 2014) to animated cartoons. Another such evaluation was done by Milovanovic et al. (2013), who used an experimental and a control group to evaluate the impact of teaching a group of university students with multimedia.

In contrast, the survey method was used to elicit the opinions of respondents on the impact of the use of multimedia in teaching and learning. The target groups were university students (Al-Hariri and Al-Hattami, 2017; Barzegar et al., 2012), secondary school students (Akinoso, 2018; Maaruf and Siraj, 2013), professors, who were interviewed (Chen and Xia, 2012), 4–10 year olds (Manca and Ranieri, 2016) and a sample of 272 students whose ages were not specified (Ocepek et al., 2013).

The focus areas in which the evaluations were conducted ranged from the sciences, including mathematics (Akinoso, 2018; Al-Hariri and Al-Hattami, 2017; Dalacosta et al., 2009; Kaptan and İzgi, 2014; Milovanovic et al., 2013), to the social sciences (Ilhan and Oruc, 2016) and the arts (Maaruf and Siraj, 2013). There were evaluations focused on education as a subject as well (Aloraini, 2012; Chen and Xia, 2012; Maaruf and Siraj, 2013; Manca and Ranieri, 2016). While positive outcomes were generally reported, Ocepek et al. (2013) specified that students in their evaluation study preferred structured texts with colour discrimination.

Sample sizes used in the studies varied widely, from Maaruf and Siraj (2013), who based their conclusions on an in-depth interview of teachers, to Manca and Ranieri (2016), who carried out a survey with a sample of 6,139 academic staff. However, the latter study reported a low response rate of 10.5%. One notable weakness identified was that the findings from all but one of the studies could not be generalized. Reasons for this included inadequate sample size, exposure limited to a single lesson, and a sampling method or experiment duration that was not explicitly stated.

4.4. Identified barriers to multimedia use in teaching and learning

The review revealed some challenges that could be barriers to the use of multimedia tools in teaching and learning. Some of these barriers, as found in the reviewed articles, are highlighted as follows:

  • Attitudes and beliefs towards the use of technology in education. Findings from the literature and surveys have shown high resistance to change and negative attitudes towards the adoption and use of ICT in education (Cuban et al., 2001; Said et al., 2009; Snoeyink and Ertmer, 2001). In some findings, respondents perceived no benefits (Mumtaz, 2000; Snoeyink and Ertmer, 2001; Yuen and Ma, 2002).
  • Lack of teachers' confidence in the use of technology and resistance to change (Bosley and Moon, 2003; Fabry & Higgs, 1997; Said et al., 2009).
  • Lack of basic knowledge and ICT skills for the adoption and use of multimedia tools (Akbaba-Altun, 2006; Bingimlas, 2009; Cagiltay et al., 2001).
  • Lack of access to computing resources such as hardware and software (Akbaba-Altun, 2006; Bosley and Moon, 2003; Cinar, 2002; Mumtaz, 2000; Taylor and Todd, 1995).
  • Lack of technical, administrative and financial support (Akbaba-Altun, 2006; Cinar, 2002; Said et al., 2009; Goktas et al., 2013).
  • Others include lack of instructional content, basic knowledge and skills, physical environment and lack of time to learn new technologies (Akbaba-Altun, 2006; Cinar, 2002; Said et al., 2009).

5. Discussion

The findings from the systematic review are discussed in this section with a view to answering the research questions posed. The questions centred on identifying the existing multimedia tools for teaching and learning and the multimedia components adopted in the tools, the type of audience best suited to a certain multimedia component, the methods used to evaluate multimedia in teaching and learning, and the success or failure factors to consider.

5.1. Multimedia tools in teaching and learning

The review revealed that multimedia tools have been developed to enhance teaching and learning for various fields of study. The review also shows that multimedia tools are delivered using different technologies and multimedia components, and can be broadly categorized as web-based or standalone.

From the review, it was found that standalone multimedia tools (64%) were nearly twice as numerous as web-based tools (36%). Standalone tools are a category of teaching and learning aids which are not delivered or used over the internet, but authored to be installed, copied, loaded and used on teachers' or students' personal computers (PCs) or workstations. Standalone tools are especially useful for teaching and practising new concepts such as 3D technology for modelling and printing (Huang et al., 2017) or understanding augmented reality (AR) software (Blevins, 2018). Microsoft PowerPoint is a presentation tool used in some of the reviewed articles and is usually run on standalone systems.

Standalone tools were probably favoured over web-based tools because they do not require the internet, which makes them possible to deploy in all settings. This means that teachers and students in suburban and rural areas who are digitally excluded can benefit from such a multimedia tool. This kind of system is considered most useful because a majority of the populace in most developing countries are socially and educationally excluded due to a lack of the necessary resources for teaching and learning. Sustainably running an online learning environment may be difficult, and therefore the standalone tool provides a better fit for such settings. However, the problem with a standalone application or system is platform dependency. For instance, a Windows-based application can only run on a Windows platform. Also, changes to the curricula or modules will propagate slowly, since each system runs offline and has to be updated manually or completely replaced at each location where the tool is deployed.
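As a quick arithmetic check on the comparison above, the ratio implied by the two reported shares can be computed directly. The percentages are those stated in the review; the variable names are illustrative.

```python
# Shares of reviewed tools by delivery category, as reported in the text.
standalone_share = 64  # percent
web_based_share = 36   # percent

# Ratio of standalone tools to web-based tools.
ratio = standalone_share / web_based_share
print(f"standalone : web-based = {ratio:.2f} : 1")  # prints 1.78 : 1
```

On these figures there are roughly 1.8 standalone tools for every web-based tool.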

The other category, web-based multimedia tools, are authored using web authoring tools and delivered online for teaching and learning purposes. About one-third of the tools identified from the review were web-based although they were used largely in university teaching and learning. Examples of these tools are: online teaching and learning resource platform ( Zhang, 2012 ), graphic web-based application ( Bánsági and Rodgers, 2018 ), multimedia tool for teaching optimization ( Jian-hua & Hong, 2012 ), and educational videos on YouTube ( Shoufan, 2019 ).

One of the benefits of a web-based multimedia solution is that it is online and centralized over the internet. In contrast to a standalone multimedia system, it is easy to update and deploy. The major requirements on the teachers' and learners' side are an installed web browser and an internet connection. Also, a multimedia web application is platform independent; it does not require any particular operating system. The same multimedia application can be accessed through a web browser regardless of the learner's operating system. However, when many people access the resource at the same time, this could lead to congestion, packet loss and retransmission. This scenario often occurs when large classes take online examinations at the same time. Also, the data requirements for graphics or applications that combine video, audio and text may differ from those of a system developed with only pictures and text. Hence, a web-based system can only be run sustainably with stable, high-speed internet access.

A major weakness of web-based multimedia tools is the challenge posed for communities with low internet penetration and the cost of bandwidth for low-income groups. As internet access becomes more widespread, it is expected that the advantages of deploying a web-based multimedia solution will far outweigh the disadvantages and that more of such tools will be web-based.

5.2. Components, technology and applications of multimedia tools in education

The results from the review revealed that most of the existing multimedia tools in education consist of various multimedia components such as text, symbol, image, audio, video and animation, which are brought together in technologies such as 3D (Huang et al., 2017), Camtasia Studio 7 software (Karel and Tomas, 2015), Macromedia Flash (Zhang, 2012), and HTML5, JavaScript and CSS (Bánsági and Rodgers, 2018; Eady and Lockyer, 2013; Chen and Liu, 2008; Shah and Khan, 2015; Shoufan, 2019). As shown in Figure 3, the analysis confirms that text (26.8%) is the predominant multimedia component in most of the educational materials, while other components such as video (19.5%), audio (18.3%), images (18.3%) and animation (11.0%) are used moderately in teaching and learning multimedia materials. However, annotation and 3D technologies are the least incorporated.

Figure 3

Proportion of multimedia components in reviewed articles.
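The share left over for the least-used components is not stated explicitly, but it can be derived from the proportions reported above. The snippet below is a simple arithmetic sketch; the dictionary name is illustrative, and it assumes that the reported percentages and the remaining components (annotation and 3D) together account for 100%.

```python
# Component shares reported in the text (percent).
reported_shares = {
    "text": 26.8,
    "video": 19.5,
    "audio": 18.3,
    "image": 18.3,
    "animation": 11.0,
}

# Assumption: whatever is not listed above belongs to annotation and 3D.
remaining_share = round(100 - sum(reported_shares.values()), 1)
print(f"annotation and 3D combined: {remaining_share}%")  # prints 6.1%
```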

How these components are combined is shown in Figure 4. Perhaps the combination of the four major components (text, video, audio, image) provides the best outcome for the learner, and it points to the place of text as a most desired multimedia component. The components used also reflect the type of subject matter being addressed. For instance, the audio component is important for language classes, while video and image components are stimulating in Biology classes because learners need visual perception. It is, therefore, important to note that the choice of combination of these components could yield variable impacts on learners. Hence, it can be deduced from the studies that most of the tools are applied as teaching and/or learning aids depending on the nature of the audience and teacher.

Figure 4

Use of various multimedia combinations.

In Figure 4, we provide an analysis of the component combinations in the data set reviewed. The number of multimedia components combined ranges from two to six. The tools were grouped by the combination of multimedia components each one employed. Group 1 (G1) represents the number of multimedia applications combining text, image, audio, video and 3D. G2 consists of video and audio, while G13 combines all the multimedia components except 3D.

Furthermore, a majority of the multimedia applications used combinations of four or two components in their design, as shown in Figure 5. Tools with five or six components were very few and, as the figure reveals, all the tools used at least two components.

Figure 5

Multimedia tools and the number of components utilized.
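The grouping behind Figures 4 and 5 can be illustrated with a short sketch. The three tools and their components below are taken from the descriptions earlier in this review and are only a small subset of the data set; the labels and variable names are illustrative, and the code is not the procedure used to produce the figures.

```python
from collections import Counter

# Components of three reviewed tools, as described in the text.
tool_components = {
    "MAT (Shah and Khan, 2015)": {"text", "audio", "video", "animation"},
    "Social studies tool (Ilhan and Oruc, 2016)": {"text", "audio", "video", "animation"},
    "Visual art module (Maaruf and Siraj, 2013)": {"text", "image"},
}

# Group tools that share exactly the same combination of components.
combination_counts = Counter(
    frozenset(components) for components in tool_components.values()
)

# Count tools by the number of components they combine.
size_counts = Counter(len(components) for components in tool_components.values())

for combination, count in combination_counts.items():
    print(sorted(combination), "->", count, "tool(s)")
print("tools per number of components:", dict(size_counts))
```

Using a `frozenset` as the grouping key means that the order in which components are listed does not matter, so two tools with the same components always fall into the same group.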

These findings stress that the application of multimedia tools in education, and the multimedia components incorporated, are audience-, subject-, curriculum- and teacher-specific, and that a tool needs to be well articulated and structured to achieve its goals.

5.3. Targeted multimedia solutions

Our systematic review also revealed that most multimedia solutions deployed for teaching and learning target the solution to the pedagogical content of the subject of interest (see Table 4) and the user audience of the solution (Table 5). Several studies highlighted in Tables 4 and 5 showcase multimedia tools used for mathematics classes (Akinoso, 2018; Milovanovic et al., 2013), Social science (Ilhan and Oruc, 2016), Physiology (Al-Hariri and Al-Hattami, 2017), Physics (Jian-hua and Hong, 2012), Chemical engineering (Bánsági and Rodgers, 2018) and teaching of the Chinese language (Wu and Chen, 2018). In addition, multimedia tools were utilized for teaching specific principles such as control theory (Karel and Tomas, 2015) and arrays (Kapi et al., 2017). That multimedia solutions are subject-based is not surprising given that multimedia involves relaying information using different forms of communication. It follows that multimedia solution developers need to incorporate text, video, audio, still photographs, sound, animation, image and interactive content in a manner that best conveys the desired content for teaching or to aid learning.

As stated earlier, the review revealed a variety of user types for the multimedia solutions reported. It is noteworthy that in a large proportion of the studies where the target audience was university students, a mixture of graphics, text, audio, video and sometimes animation was utilized (Aloraini, 2012; Blevins, 2018; Huang et al., 2017; Shah and Khan, 2015). While a sizeable number of solutions were targeted at secondary school students (such as Maaruf and Siraj, 2013; Kapi et al., 2017; and Ilhan and Oruc, 2016), very few studies were identified that targeted students less than 15 years in age. Shah and Khan (2015) targeted this group with a multimedia teaching aid that incorporated text, audio, video and animation. The absence of multimedia tools targeted at very young children may be a result of the inclusion criteria used for identifying articles for the review.

5.4. Success factors

The success of the different multimedia tools that have been used on the various target groups and subjects can be attributed to the technologies and components embedded, as shown in Tables 4 and 5. In most cases where text, audio, video, graphics and animation were the components of choice, significant improvements in teaching and learning were reported in the studies reviewed (Blevins, 2018; Huang et al., 2017; Zhang, 2012).

These studies also implemented technologies such as 3D modelling and printing, Macromedia Flash version 8.0 and augmented reality (AR) software respectively. It is worth noting that all the above-mentioned multimedia tools were applicable in both the teaching and learning processes. Another set of tools, with text, audio, video and animation as components but excluding graphics, and equally applied in both the teaching and learning processes, adopted computer representation as their technology (Aloraini, 2012; Ilhan and Oruc, 2016; Milovanovic et al., 2013). Teaching and learning were likewise greatly improved in these cases.

5.5. Evaluation methodologies

Our systematic review included a synthesis of the methodologies described by the reviewed articles for evaluating the multimedia tools that they present as shown in the summary in Table 5 . The evaluation methodologies appeared to be different depending on the type of multimedia tool, technology components, deployment strategies, and application area and target groups. However, two main evaluation methods were identified - experimental investigations and the survey methodology.

The experimental approach involved the use of an experimental group and a control group, where the assessment of the impact of the multimedia tool on the students' performance on the experimental group was compared with the performance of the control group who were taught the same content without the use of the multimedia tool. This experimental approach is a widely practiced evaluation method and has proven to be effective. It was deployed by Aloraini (2012) , Milovanovi et al. (2013) , Kaptan and İzgi (2014) , Shah and Khan (2015) , Ilhan and Oruc (2016) and Akinoso (2018) in their studies in the subject area of education, social sciences, general science, science, education and mathematics classes respectively.

The survey approach, which was used in 46% of the studies reviewed, involved the use of questionnaires administered to a targeted group of respondents to gather opinions on the perceived impact of the multimedia tool. The systematic review found that the mode of questionnaire administration also varied. The data could be collected through face-to-face interviews (Al-Hariri and Al-Hattami, 2017; Barzegar et al., 2012; Chen and Xia, 2012) or online surveys (Armenteros et al., 2013; Wang et al., 2020).

The difficulty of determining impact from a survey is related to the weaknesses associated with instrument design and sampling biases. It is our opinion that the impact of the technology components used in the development of the multimedia tools may not be ascertained as accurately with a survey as with the actual deployment of, and experimentation with, the multimedia tool that takes place in the experimental approach. Besides, in the survey approach, judgment is based merely on perceptions. Nevertheless, the simplicity and ease of the survey method make it a good option for evaluating larger target groups, and its findings can be generalised when the statistical condition on sample size is satisfied (Krejcie and Morgan, 1970).
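The statistical condition referred to above is usually read as a minimum sample size for a given population. The following is a minimal sketch, assuming the commonly quoted form of the Krejcie and Morgan (1970) formula with a chi-square value of 3.841 (one degree of freedom, 95% confidence), a population proportion of 0.5 and a margin of error of 0.05. The function name and the rounding-up choice are assumptions made for illustration.

```python
import math


def krejcie_morgan_sample_size(population, chi_sq=3.841, p=0.5, d=0.05):
    """Estimate the required sample size for a finite population.

    Assumed formula: s = X^2 * N * P * (1 - P) / (d^2 * (N - 1) + X^2 * P * (1 - P))
    """
    if population <= 0:
        raise ValueError("population must be a positive integer")
    numerator = chi_sq * population * p * (1 - p)
    denominator = d ** 2 * (population - 1) + chi_sq * p * (1 - p)
    # Round up so the sample is never smaller than the formula requires.
    return math.ceil(numerator / denominator)


for n in (10, 100, 1000):
    print(n, krejcie_morgan_sample_size(n))
```

Under these assumptions, a target population of 1000 respondents calls for a sample of 278, and a population of 100 calls for 80.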

Although the evaluation studies analysed had publication dates as recently as 2015 to 2018, none reported any objective data collection such as from eye-tracking or other behavioural data. Perhaps, this may be due to our search keyword terms not being wide enough to identify multimedia evaluation studies that used objective data gathering. It could also be that the cost, time and effort needed to collect objective data means that many studies incorporating evaluation are avoiding this route.

5.6. Barriers to multimedia use in teaching and learning

Several barriers to multimedia use in teaching and learning were revealed as a result of the review. Such barriers include resistance to the adoption of ICT, lack of teachers' confidence in the use of technology, resistance to change on the part of teachers, a lack of ICT skills and lack of access to ICT resources. Other barriers identified were the lack of support, lack of time to learn new technologies, lack of instructional content, and the physical environment in which multimedia delivery took place. Some studies reported respondents that perceived no benefits from the use of multimedia. These barriers certainly affect both the integration of multimedia in teaching and learning and the uptake of the multimedia tool.

Most of the barriers identified could be classified into three groups, a major one being fear of, or resistance to, change. This means that change management must be an integral part of the development and deployment of multimedia tools in order to achieve the desired goal. Also, barriers such as lack of time and lack of resources should not be underestimated. Some of the studies reported providing the hardware for the multimedia application, and such an approach should be considered. Most multimedia tools are ICT driven, and as such the identified barrier of lack of ICT skills is an important aspect that must be addressed. This can be done as part of the change process and would also boost the confidence of teachers to incorporate multimedia for teaching.

It is important that the multimedia tool is designed and developed with the end-goal in mind. As indicated, some recipients of multimedia applications did not see any benefit in their use. This means that the multimedia tool should be designed to provide an experience that is worth the teachers' and students' time, attention and effort.

6. Conclusions and future research direction

A lot of work has been done to highlight the effectiveness of multimedia as a teaching and learning aid. This paper provides a systematic review of studies on the use of multimedia in education in order to identify the multimedia tools commonly used to aid teaching and learning. The review covered extant literature reporting studies that determined the extent to which multimedia has been successful in improving both teaching and learning, and the challenges of using multimedia for learning and teaching.

We note, however, that our review, especially of the studies on evaluation of multimedia, leaned more towards the outcomes of the studies than the process. Information that was not captured includes how the classroom teacher's mastery of the technology influences the attractiveness of the tool to the learner, both visually and through the content, and whether the multimedia tool allowed for learners' participation. Also, while studies on multimedia evaluation were of interest to us, this search phrase was not part of the search phrases used. A future review could incorporate these for a richer perspective.

It is obvious from the review that researchers have explored several multimedia components in order to develop teaching and learning tools, either web-based or standalone, using different technologies. Several multimedia tools exist in education, and their proliferation is attributed to the evolution of technologies over the years and to teachers' continuous efforts to improve knowledge delivery with respect to the subject areas and target audience. The review also reveals that most multimedia solutions deployed for teaching and learning target the solution to the pedagogical content of the subject of interest and the user audience of the solution. The success of the different multimedia tools that have been used on the various target groups and subjects is also attributed to the technologies and components embedded.

Furthermore, the evaluation methodologies and learning outcomes of the deployment of multimedia tools appeared to be different depending on the type of multimedia tool, technology components, deployment strategies, and application area and target groups. The two main evaluation methodologies identified from the various studies reported in the articles we reviewed were the experimental investigations and the survey methodology.

Attitudes and beliefs towards the use of technology in education, lack of teachers' confidence and resistance to change, lack of basic knowledge and ICT skills, lack of technical, administrative and financial support, and an unsuitable physical environment are some of the barriers identified in the various articles reviewed. These barriers affect the integration of multimedia in education.

For future work, efforts should be made to explore mobile technology with several multimedia components in order to enhance teaching and learning processes across a diverse group of learners in the primary, secondary, vocational, and higher institutions of learning. Such research efforts would be significant in increasing inclusiveness and narrowing the educational divide. Also, research into the change management process for overcoming the barriers to multimedia adoption would be of interest.


Author contribution statement

All authors listed have significantly contributed to the development and the writing of this article.

Funding statement

This work was supported by Tertiary Education Trust Fund (TetFund), Ministry of Education, Federal Government of Nigeria 2016–2017 Institutional Based Research Grant.

Competing interest statement

The authors declare no conflict of interest.

Additional information

No additional information is available for this paper.

  • Agulla E.G., Rúa E.A., Castro J.L.A., Jiménez D.G., Rifón L.A. 2009 11th IEEE International Symposium on Multimedia. 2009. Multimodal biometrics-based student attendance measurement in learning management systems; pp. 699–704. [ Google Scholar ]
  • Akbaba-Altun S. Complexity of integrating computer technologies into education in Turkey. Educ. Technol. Soc. 2006; 9 (1):176–187. [ Google Scholar ]
  • Akinoso O. Effect of the use of multimedia on students' performance in secondary school mathematics. Global Media J. 2018; 16 (30):1–8. [ Google Scholar ]
  • Al-Ajmi N.A.H., Aljazzaf Z.M. Factors influencing the use of multimedia technologies in teaching English language in Kuwait. Int. J. Emerg. Technol. Learn. 2020; 15 (5):212–234. [ Google Scholar ]
  • Alemdag E., Cagiltay K. A systematic review of eye tracking research on multimedia learning. Comput. Educ. 2018; 125 :413–428. 2018. [ Google Scholar ]
  • Al-Hariri M.T., Al-Hattami A.A. Impact of students' use of technology on their learning achievements in physiology courses at the University of Dammam. J. Taibah Univ. Med. Sci. 2017; 12 (1):82–85. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Almara'beh H., Amer E.F., Sulieman A. The effectiveness of multimedia learning tools in education. Int. J. Adv. Res. Comput. Sci. Software Eng. 2015; 5 (12):761–764. [ Google Scholar ]
  • Aloraini S. The impact of using multimedia on students’ academic achievement in the College of Education at King Saud University. J. King Saud Univ. Lang. Transl. 2012; 24 :75–82. [ Google Scholar ]
  • Anderson R.E. IEA Computers in Education Study, Department of Sociology, University of Minnesota. 1993. Computers in American schools 1992: an overview: a national report from the international IEA computers in education study. [ Google Scholar ]
  • Armenteros M., Liaw S.S., Fernández M., Díaz R.F., Sánchez R.A. Surveying FIFA instructors' behavioral intention toward the multimedia teaching materials. Comput. Educ. 2013; 61 :91–104. [ Google Scholar ]
  • Bánsági T., Jr., Rodgers T.L. Graphic web–apps for teaching ternary diagrams and liquid–liquid extraction. Educ. Chem. Eng. 2018; 22 :27–34. [ Google Scholar ]
  • Barzegar N., Farjad S., Hosseini N. The effect of teaching model based on multimedia and network on the student learning (case study: guidance schools in Iran) Procedia Soc. Behav. Sci. 2012; 47 :1263–1267. 2012. [ Google Scholar ]
  • Bingimlas K. Barriers to the successful integration of ICT in teaching and learning environments: a review of the literature. Eurasia J. Math. Sci. Technol. Educ. 2009; 5 (3):235–245. [ Google Scholar ]
  • Blevins B. Teaching digital literacy composing concepts: focusing on the layers of augmented reality in an era of changing technology. Comput. Compos. 2018; 50 :21–38. [ Google Scholar ]
  • Bosley C., Moon S. Centre for Guidance Studies, University of Derby; 2003. Review of Existing Literature on the Use of Information and Communication Technology within an Educational Context. [ Google Scholar ]
  • Cagiltay K., Cakiroglu J., Cagiltay N., Cakiroglu E. Teachers’ perspectives about the use of computer in education. H. U. J. Educ. 2001; 21 (1):19–28. [ Google Scholar ]
  • Chen H.Y., Liu K.Y. Web-based synchronized multimedia lecture system design for teaching/learning Chinese as second language. Comput. Educ. 2008; 50 (3):693–702. [ Google Scholar ]
  • Chen S., Xia Y. Research on application of multimedia technology in college physical education. Procedia Eng. 2012; 29 (2012):4213–4217. [ Google Scholar ]
  • Cinar A. METU; Ankara, Turkey: 2002. Teachers’ Computer Use at Basic Education Schools: Identifying Contributing Factors. Unpublished master’s thesis. [ Google Scholar ]
  • Coleman L.O., Gibson P., Cotten S.R., Howell-Moroney M., Stringer K. Integrating computing across the curriculum: the impact of internal barriers and training intensity on computer integration in the elementary school classroom. J. Educ. Comput. Res. 2016; 54 (2):275–294. [ Google Scholar ]
  • Cuban L., Kirkpatrick H., Peck C. High access and low use of technology in high school classrooms: explaining an apparent paradox. Am. Educ. Res. J. 2001; 38 (4):813–834. [ Google Scholar ]
  • Dalacosta K., Kamariotaki-Paparrigopoulou M., Palyvos J.A., Spyrellis N. Multimedia application with animated cartoons for teaching science in elementary education. Comput. Educ. 2009; 52 (4):741–748. [ Google Scholar ]
  • Davies W., Cormican K. An analysis of the use of multimedia technology in computer aided design training: towards effective design goals. Procedia Technol. 2013; 9 :200–208. 2013. [ Google Scholar ]
  • Eady M.J., Lockyer L. Queensland University of Technology; Australia: 2013. “Tools for Learning: Technology and Teaching Strategies,” Learning to Teach in the Primary School; p. 71. [ Google Scholar ]
  • Ertugrul N. Towards virtual laboratories: a survey of LabVIEW-based teaching/learning tools and future trends. Int. J. Eng. Educ. 2000; 16 (3):171–180. [ Google Scholar ]
  • Fabry D., Higgs J. Barriers to the effective use of technology in education. J. Educ. Comput. 1997; 17 (4):385–395. [ Google Scholar ]
  • Goktas Y., Gedik N., Baydas O. Enablers and barriers to the use of ICT in primary schools in Turkey: a comparative study of 2005–2011. Comput. Educ. 2013; 68 :211–222. [ Google Scholar ]
  • Guan N., Song J., Li D. On the advantages of computer multimedia-aided English teaching. Procedia Comput. Sci. 2018; 131 :727–732. 2018. [ Google Scholar ]
  • Horsley M., Eliot M., Knight B.A., Reilly R. Springer; Cham, Switzerland: 2014. Current Trends in Eye Tracking Research. [ Google Scholar ]
  • Huang T.C., Chen M.Y., Lin C.Y. Exploring the behavioral patterns transformation of learners in different 3D modeling teaching strategies. Comput. Hum. Behav. 2017; 92 :670–678. 2017. [ Google Scholar ]
  • Hwang W.Y., Wang C.Y., Sharples M. A study of multimedia annotation of Web-based materials. Comput. Educ. 2007; 48 (4):680–699. [ Google Scholar ]
  • Ilhan G.O., Oruc S. Effect of the use of multimedia on students' performance: a case study of social studies class. Educ. Res. Rev. 2016; 11 (8):877–882. [ Google Scholar ]
  • Janda K. Multimedia in political science: sobering lessons from a teaching experiment. J. Educ. Multimedia Hypermedia. 1992; 1 (3):341–354. [ Google Scholar ]
  • Jian-hua S., Hong L. Explore the effective use of multimedia technology in college physics teaching. 2012 International Conference on Future Electr. Power Energy Syst. Explore. 2012; 17 :1897–1900. [ Google Scholar ]
  • Kapi A.Y., Osman N., Ramli R.Z., Taib J.M. Multimedia education tools for effective teaching and learning. J. Telecommun. Electron. Comput. Eng. 2017; 9 (2-8):143–146. [ Google Scholar ]
  • Kaptan F., İzgi Ü. The effect of use concept cartoons attitudes of first grade elementary students towards science and technology course. Procedia Soc. Behav. Sci. 2014; 116 :2307–2311. 2014. [ Google Scholar ]
  • Karel P., Tomas Z. Multimedia teaching aid for students of basics of control theory in Matlab and Simulink. Procedia Eng. 2015; 100 :150–158. 2015. [ Google Scholar ]
  • Keengwe S., Onchwari G., Wachira P. The use of computer tools to support meaningful learning. Educ. Technol. Rev. 2008; 16 (1):77–92. [ Google Scholar ]
  • Keengwe J., Onchwari G., Wachira P. Computer technology integration and student learning: barriers and promise. J. Sci. Educ. Technol. 2008; 17 :560–565. 2008. [ Google Scholar ]
  • Kelley K., Clark B., Brown V., Sitzia J. Good practice in the conduct and reporting of survey research. Int. J. Qual. Health Care. 2003; 15 (3):261–266. [ PubMed ] [ Google Scholar ]
  • Kennedy G.E., Judd T.S. Expectations and reality: evaluating patterns of learning behaviour using audit trails. Comput. Educ. 2007; 49 (3):840–855. [ Google Scholar ]
  • Kingsley K.V., Boone R. Effects of multimedia software on achievement of middle school students in an American history class. J. Res. Technol. Educ. 2008; 41 (2):203–221. [ Google Scholar ]
  • Kitchenham B., Brereton O.P., Budgen D., Turner M., Bailey J., Linkman S. Systematic literature reviews in software engineering–a systematic literature review. Inf. Software Technol. 2009; 51 (1):7–15. [ Google Scholar ]
  • Krejcie R.V., Morgan D.W. Determining sample size for research activities. Educ. Psychol. Meas. 1970; 30 (3):607–610. [ Google Scholar ]
  • Liberati A., Altman D.G., Tetzlaff J., Mulrow C., Gøtzsche P.C., Ioannidis J.P., Clarke M., Devereaux P.J., Kleijnen J., Moher D. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. Ann. Intern. Med. 2009; 151 (4):65. [ PubMed ] [ Google Scholar ]
  • Maaruf S.Z., Siraj S. The state of technology and the arts-interactive multimedia in enhancing culturally responsive pedagogy. Procedia Soc. Behav. Sci. 2013; 103 :1171–1180. [ Google Scholar ]
  • Manca S., Ranieri M. Facebook and the others. Potentials and obstacles of social media for teaching in higher education. Comput. Educ. 2016; 95 :216–230. [ Google Scholar ]
  • Mayer R.E. Cognitive theory of multimedia learning. Camb. handb. Multimed Learn. 2005; 41 :31–48. [ Google Scholar ]
  • Mayer R.E. Applying the science of learning: evidence-based principles for the design of multimedia instruction. Am. Psychol. 2008; 63 (8):760–769. [ PubMed ] [ Google Scholar ]
  • Miller B.W. Using reading times and eye-movements to measure cognitive engagement. Educ. Psychol. 2015; 50 (1):31–42. [ Google Scholar ]
  • Milovanovic M., Obradovic J., Milajic A. Application of interactive multimedia tools in teaching mathematics--examples of lessons from geometry. Turk. Online J. of Educ. Technol.-TOJET. 2013; 12 (1):19–31. [ Google Scholar ]
  • Moher D., Shamseer L., Clarke M., Ghersi D., Liberati A., Petticrew M., Shekelle P., Stewart L.A. Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015 statement. Syst. Rev. 2015; 4 (1):1. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Molina A.I., Navarro O., Ortega M., Lacruz M. Evaluating multimedia learning materials in primary education using eye tracking. Comput. Stand. Interfac. 2018; 59 :45–60. [ Google Scholar ]
  • Morris L.V., Finnegan C., Wu S.S. Tracking student behavior, persistence, and achievement in online courses. Internet High Educ. 2005; 8 (3):221–231. [ Google Scholar ]
  • Mumtaz S. Factors affecting teachers’ use of information and communications technology: a review of the literature. J. Inf. Technol. Teach. Educ. 2000; 9 (3):319–341. [ Google Scholar ]
  • Nie Y., Zhe Y. On-line classroom visual tracking and quality evaluation by an advanced feature mining technique. Signal Process. Image Commun. 2020; 84 (May):115817. [ Google Scholar ]
  • Ocepek U., Bosnić Z., Šerbec I.N., Rugelj J. Exploring the relation between learning style models and preferred multimedia types. Comput. Educ. 2013; 69 :343–355. 2013. [ Google Scholar ]
  • Pea R.D. Learning through multimedia. IEEE Comput. Grap. Appl. 1991; 11 (4):58–66. [ Google Scholar ]
  • Putra C.A. Utilization of multimedia technology for instructional media. J. ICT Educ. 2018; 5 :1–8. 2018. [ Google Scholar ]
  • Said A., Lin L., Jim P. Barriers to adopting technology for teaching and learning in Oman. Comput. Educ. 2009; 53 :575–590. [ Google Scholar ]
  • Shah I., Khan M. Impact of multimedia-aided teaching on students’ academic achievement and attitude at elementary level. US China Educ. Rev. 2015; 5 (5):349–360. [ Google Scholar ]
  • Shoufan A. Estimating the cognitive value of YouTube's educational videos: a learning analytics approach. Comput. Hum. Behav. 2019; 92 :450–458. [ Google Scholar ]
  • Snoeyink R., Ertmer P. Thrust into technology: how veteran teachers respond. J. Educ. Technol. Syst. 2001; 30 (1):85–111. [ Google Scholar ]
  • Stark L., Brünken R., Park B. Emotional text design in multimedia learning: a mixed-methods study using eye tracking. Comput. Educ. 2018; 120 :185–196. [ Google Scholar ]
  • Taradi S.K., Taradi M., Radic K., Pokrajac N. Blending problem-based learning with Web technology positively impacts student learning outcomes in acid-base physiology. Adv. Physiol. Educ. 2005; 29 (1):35–39. [ PubMed ] [ Google Scholar ]
  • Taylor S., Todd P.A. Understanding information technology usage: a test of competing models. Inf. Syst. Res. 1995; 6 (2):144–176. [ Google Scholar ]
  • Teng J.H., Chan S.Y., Lee J.C., Lee R. Vol. 1. 2000. A LabVIEW based virtual instrument for power analyzers; pp. 179–184. (2000 International Conference on Power System Technology. Proceedings (Cat. No. 00EX409)). [ Google Scholar ]
  • Wang C., Fang T., Gu Y. Learning performance and behavioral patterns of online collaborative learning: impact of cognitive load and affordances of different multimedia. Comput. Educ. 2020; 143 :103683. [ Google Scholar ]
  • West J. 2019. Data Collection. Retrieved on 3 Sept 2020 from: https://www.researchconnections.org/childcare/datamethods/survey.jsp [ Google Scholar ]
  • Wu T.T., Chen A.C. Combining e-books with mind mapping in a reciprocal teaching strategy for a classical Chinese course. Comput. Educ. 2018; 116 (2020):64–80. [ Google Scholar ]
  • Yildiz R., Atkins M.J. ERIC No. ED 350 978; 1992. How to Evaluate Multimedia Simulations: Learning from the Past. [ Google Scholar ]
  • Yuen A.H., Ma W.W. Gender differences in teacher computer acceptance. J. Technol. Teach Educ. 2002; 10 (3):365–382. [ Google Scholar ]
  • Zhang F. Significances of multimedia technologies training. Phys. Procedia. 2012; 33 :2005–2010. 2012. [ Google Scholar ]
  • Zin M.Z.M., Sakat A.A., Ahmad N.A., Bhari A. Relationship between the multimedia technology and education in improving learning quality. Procedia Soc. Behav. Sci. 2013; 90 :351–355. 2013. [ Google Scholar ]
  • Zulkifli M.Z., Harun S.W., Thambiratnam K., Ahmad H. Self-calibrating automated characterization system for depressed cladding EDFA applications using LabVIEW software with GPIB. IEEE Trans. Instrument. Meas. 2008; 57 (11):2677–2681. [ Google Scholar ]
  • Open access
  • Published: 26 April 2020

Eye-tracking and artificial intelligence to enhance motivation and learning

  • Kshitij Sharma 1 ,
  • Michail Giannakos 1 &
  • Pierre Dillenbourg 2  

Smart Learning Environments volume 7, Article number: 13 (2020)


The interaction with the various learners in a Massive Open Online Course (MOOC) is often complex. Contemporary MOOC learning analytics relate to click-streams, keystrokes and other user-input variables. Such variables, however, do not always capture users’ learning and behavior (e.g., passive video watching). In this paper, we present a study with 40 students who watched a MOOC lecture while their eye-movements were being recorded. We then propose a method to define stimuli-based gaze variables that can be used for any kind of stimulus. The proposed stimuli-based gaze variables indicate students’ content-coverage (in space and time), reading processes (area of interest based variables) and attention (i.e., with-me-ness) at the perceptual (following teacher’s deictic acts) and conceptual (following teacher discourse) levels. In our experiment, we identified a significant mediation effect of the content coverage, reading patterns and the two levels of with-me-ness on the relation between students’ motivation and their learning performance. Such variables enable common measurements for the different kinds of stimuli present in distinct MOOCs. Our long-term goal is to create student profiles based on their performance and learning strategy using stimuli-based gaze variables and to provide students with gaze-aware feedback to improve the overall learning process. One key ingredient in achieving a high level of adaptation in providing gaze-aware feedback to the students is to use Artificial Intelligence (AI) algorithms to predict student performance from their behaviour. In this contribution, we also present a method combining state-of-the-art AI techniques with the eye-tracking data to predict student performance. The results show that the student performance can be predicted with an error of less than 5%.


We present a study to investigate how well stimuli-based gaze analytics can be utilized to enhance motivation and learning in Massive Open Online Courses (MOOCs). Our work seeks to provide insights on how gaze variables can provide students with gaze-aware feedback and help us improve the design, interfaces and analytics used as well as provide a first step towards gaze-aware design of MOOCs to amplify learning.

The evidence for understanding and supporting users’ learning is still very limited, considering the wide range of data produced when the learner interacts with a system (e.g., gaze; Prieto, Sharma, Dillenbourg, & Jesús, 2016). Devices like eye-trackers have become readily available and have the capacity to provide researchers with unprecedented access to users’ attention (Sharma, Jermann, & Dillenbourg, 2014). Thus, besides commonly used variables coming from users’ click-streams, keywords and preferences, we can also use eye-tracking variables to accurately measure students’ attention during their interaction with learning materials (e.g., MOOC lectures).

A multitude of factors affect the academic performance of students: previous grades (Astin, 1971), students’ efforts and motivation (Grabe & Latta, 1981), socioeconomic differences (Kaplan, 1982), quality of schooling (Wiley, 1976), attention (Good & Beckerman, 1978) and participation (Finn, 1989). In this contribution, we address the general question of how gaze variables (related to students’ reading and attention) can help students to watch MOOC videos more efficiently. We tackle this question from a teacher’s perspective (how much the student follows the teacher) and call this gaze-based measure “with-me-ness”. With-me-ness is defined at two levels: (1) perceptual (following teacher’s deictic acts) and (2) conceptual (following teacher discourse). Specifically, in this contribution, we address the following two questions:

How does eye-tracking behaviour mediate the relationship between students’ motivation and learning within a MOOC?

How well can we predict the learning gain and motivation from the eye-tracking data in its most basic form?

In order to answer these questions, we define variables using the stimulus (video lecture) presented to the students. These variables are defined using information from the stimulus at different levels of detail. Once we have the variables, we perform mediation analysis to answer the first question. To answer the second question, we utilize one of the most basic eye-tracking visualisations, “heat-maps” (Špakov & Miniotas, 2007), to extract features and use state-of-the-art machine learning algorithms to predict the students’ learning gains.
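As a rough illustration of how a heat-map can be turned into features, the sketch below bins fixations into a coarse grid weighted by fixation duration and flattens the grid into a feature vector. This is a simplified assumption and not the feature-extraction pipeline used in the study, which is not described in this section; the function names, the grid size, the screen resolution and the fixation values are all hypothetical.

```python
def gaze_heatmap(fixations, width, height, rows=4, cols=4):
    """Accumulate fixation durations into a rows x cols grid that sums to 1.

    Each fixation is an (x, y, duration) tuple in screen pixels and milliseconds.
    """
    grid = [[0.0] * cols for _ in range(rows)]
    for x, y, duration in fixations:
        if not (0 <= x < width and 0 <= y < height):
            continue  # ignore fixations that fall outside the screen
        r = min(rows - 1, int(y * rows / height))
        c = min(cols - 1, int(x * cols / width))
        grid[r][c] += duration
    total = sum(sum(row) for row in grid)
    if total > 0:
        grid = [[value / total for value in row] for row in grid]
    return grid


def heatmap_features(grid):
    """Flatten the grid row by row into a single feature vector."""
    return [value for row in grid for value in row]


# Hypothetical fixations on a 1920 x 1080 screen, invented for illustration.
fixations = [(100, 100, 200), (1900, 1000, 300), (960, 540, 500), (-5, 20, 100)]
grid = gaze_heatmap(fixations, width=1920, height=1080)
features = heatmap_features(grid)
print(len(features))
```

A vector of this kind could then be passed to any regression model to predict learning gain. That prediction step is omitted here because it depends on the learning algorithm chosen.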

Related work

Video based learning

Educational videos have been widely employed in the past years. They are a vital element in several online learning forms (in a MOOC, or how-to video tutorial), and students spend an enormous amount of time watching various forms of educational videos (Seaton, Bergner, Chuang, Mitros, & Pritchard, 2014). Educational videos have been studied extensively during the last decades, through the lenses of empirical studies and theories (Giannakos, 2013). One of the most commonly accepted theoretical angles is that of the Cognitive Theory of Multimedia Learning (CTML, Mayer & Moreno, 2003). CTML provides several insights on how video-based learning (and multimedia in general) can be used effectively.

Paivio (2013) argued that information provided by both auditory and visual channels should increase recall and retention. Studies by Mayer and Moreno (2003) have shown that visual information helps to process and remember verbal information and vice versa. This argument was strengthened by cue-summation theory, which showed that learning performance with combined audio and pictures was better than with combined audio and text if the number of available cues or stimuli is increased (Severin, 1967). The major benefits of video as a facilitator of educational content include presentation of detailed information (with text and image), efficient engagement of students’ attention, simulating discussions and providing concrete real life examples with visualizations (Schwartz & Hartman, 2007).

During the last years, video-based learning practices have been applied in a variety of ways, such as the flipped classroom, small private online courses (SPOCs), and xMOOCs. Today, advanced video repository systems (e.g. Khan Academy, PBS Teachers, Moma’s Modern Teachers, Lynda) have seen enormous growth through social software tools and the possibilities to enhance videos on them (Giannakos, 2013).

Existing research on video-based learning involves many features of today’s MOOC lectures. Volery and Lord (2000) identified three success factors in online education: usable and interactive technology design, instructors’ enthusiasm and interest in the tool, and students’ exposure to the web. Tobagi (1995) developed an online distance learning system to capture lectures in real time, compress them, store them on an on-demand system and transmit the videos to an internal server. The on-demand video system server eliminated distance limitations and provided time-independent access to study material.

Tobagi (1995) compared different modalities of video lectures (interactive video, instructional television and television) and preconceptions of difficulty for the different modalities, and found no significant difference in the learning outcome but a significant difference in the level of preconceived difficulty between television and interactive videos. Cennamo, Savenye, and Smith (1991) studied the effect of video based instruction on students’ problem solving skills and attitude towards mathematics and instruction, and concluded that there was a significant improvement after the treatment in students’ problem solving and mathematical skills as well as in their instructional attitude.

Choi and Johnson (2005) compared learning outcome and learners’ motivation (attention, relevance, confidence, satisfaction) in video based learning with traditional textual-instruction based learning and found no difference in learning outcome between the two conditions. However, the students were more attentive in the video based learning condition than in the textual-instruction condition.

Video lectures have several affordances besides those related to traditional fast-forward and rewind interactions. Innovative features, such as slide-video separation, social categorization and navigation, and advanced search, have also been used recently in video learning platforms (Giannakos, Chorianopoulos, & Chrisochoides, 2015). All of these interactions can be converted via analytics into useful information that can be used to support learning (Kim et al., 2014). As the number of learners and the diversity of collected data grow, our ability to capture richer and more authentic learning patterns grows as well, allowing us to create new affordances that amplify our learning capacity.

Eye-tracking and education

Utilizing representative and accurate data allows us to better understand students and design meaningful experiences for them. Eye tracking has been employed to understand the learning processes and different levels of outcome in a multitude of learning scenarios. Prieto et al. (2016) used eye-tracking data to explain the cognitive load that teachers experience during different classes and various scenarios. These scenarios include factors such as the experience of the teacher, the size of the class, the presence of a new technology and the presence of a teaching assistant. The results show that eye-tracking data is an important source of information for explaining different factors in teachers’ orchestration load and experience.

Eye-tracking has also been used in online learning in both individual (Kizilcec, Papadopoulos, & Sritanyaratana, 2014) and collaborative (Sharma, Caballero, Verma, Jermann, & Dillenbourg, 2015a, b) settings. Sharma et al. (2014) focus on capturing the attention of individual learners in a video-based instructional setting to find the underlying mechanisms for positive learning outcome; Sharma et al. (2015a, b) also focus on joint attention in remote collaborative learning scenarios to predict the learning outcome.

In general, eye-tracking allows us to generate rich information, but it can be challenging to identify what information is processed and retained based on a person’s gaze. The eye-mind hypothesis (Just & Carpenter, 1980 ) proposes that there is a connection between people’s gaze and attention, that is, people process the information that they visually attend to. In this contribution, we utilize eye-tracking to measure students’ attention and then address how students’ attention (i.e., “with-me-ness”) has the capacity to mediate the relationship between students’ motivation and learning within a MOOC video.


Participants and procedure.

A total of 40 university students (12 females) from a major European university participated in the experiment. The only criterion for selecting the participant was that each participant took the introductory Java course in the previous semester. This is also a prerequisite for taking the Functional Programming in Scala course at the university campus. The participants watched two MOOC videos from the course “Functional Programming Principles in Scala” and answered programming questions after each video.

Upon their arrival at the experiment site, the participants signed a consent form and answered the study processes questionnaire (SPQ, Biggs, Kember, & Leung, 2001 ). They then watched the two MOOC videos and answered the quiz based on what they were taught in the videos. During their interaction with the MOOC videos their gaze was recorded using SMI RED 250 eye-trackers.

Some of the reasons why 40 students are sufficient in our study are: (i) the data collected are “big” in terms of the 4Vs (volume, variety, veracity, velocity). For example, collecting eye-tracking data at a high frequency (e.g., 250 Hz in the present study) means that we have a continuous and unobtrusive measurement of the behaviour of the users. Collecting this kind of data results in continuously and massively gathering a few gigabytes of data per person (Volume and Velocity). Furthermore, collecting data in the form of multiple datatypes at once (i.e., fixations, saccades, heatmaps, scanpaths, clickstream) satisfies Variety, whereas previous research has acknowledged those data as indicators of cognitive load, attention, anticipation, fatigue and information processing (Veracity); (ii) the current cost of the equipment necessary to collect those data does not allow for simultaneous use of multiple eye-trackers, but the granularity of information we can have access to justifies their usage. Based on these reasons it is safe to say that 40 participants are enough to support the conclusions that we derive in the present study.

Moreover, recent eye-tracking research uses similar sample sizes. For example, in two recent systematic reviews (Alemdag & Cagiltay, 2018 ; Ashraf et al., 2018 ) with a combined 85 different eye-tracking studies, the majority (84.71%) of the studies had between 8 and 60 participants. The eye-tracking papers cited in this contribution also have between 10 and 40 participants (except the collaborative studies, where the researchers had 28 to 40 pairs).

The measures used in our study were: content coverage, scanpath based variables (a scanpath is the combination of fixations and saccades in their order of appearance), students/teacher co-attention (i.e., with-me-ness) coming from eye-tracking, students’ motivation coming from the SPQ, and students’ learning (coming from the final test).

The eye-tracking variables are defined using the semantics of the stimulus, that is the video lecture in our case. We define eye-tracking variables at four levels (see Table 1 ). First, the content coverage has no semantics from the video. Second, the scanpath based variables required us to define areas of interest on the video. Third, the perceptual with-me-ness was computed using the areas of interest and the pointing gestures of the teacher. Finally, the conceptual with-me-ness was defined using the areas of interest definitions and the dialogue of the teacher.

Content coverage

Content coverage is computed using the heat-maps (for details on heat-maps see Holmqvist et al. 2011 ) of the participants. We divided the MOOC lecture in slices of 10 s each and computed the heat-maps for each participant. Following are the steps to compute attention points from the heat-maps:

Subtract the image without heat-map (Fig.  1 b) from the image that has the heat-map (Fig. 1 a).

Apply connected components on the resulting image (Fig. 1 c)

The resulting image with connected components identified (Fig. 1 d) gives the attention points.

The combined area of attention points in a given time window represents the content coverage of that time window.
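The four steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes grayscale images stored as nested lists, a hypothetical `threshold` for deciding that a pixel belongs to the heat-map, and 4-connectivity for the connected components.

```python
from collections import deque

def attention_points(slide, overlaid, threshold=0):
    """Return the connected regions (lists of (row, col) pixels) where the
    heat-map overlaid image differs from the plain slide image."""
    rows, cols = len(slide), len(slide[0])
    # Step 1: subtract the plain slide from the heat-map overlaid image.
    diff = [[abs(overlaid[r][c] - slide[r][c]) > threshold for c in range(cols)]
            for r in range(rows)]
    # Steps 2-3: label 4-connected components of the difference mask (BFS).
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if diff[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue, region = deque([(r, c)]), []
                while queue:
                    y, x = queue.popleft()
                    region.append((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and diff[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                regions.append(region)
    return regions

def content_coverage(slide, overlaid, threshold=0):
    """Step 4: combined area (in pixels) of all attention points."""
    return sum(len(region) for region in attention_points(slide, overlaid, threshold))

# Toy 4x5 "slide" (all zeros) and the same slide with two heat-map blobs.
slide = [[0] * 5 for _ in range(4)]
overlaid = [
    [9, 9, 0, 0, 0],
    [9, 0, 0, 0, 0],
    [0, 0, 0, 7, 7],
    [0, 0, 0, 0, 7],
]
print(len(attention_points(slide, overlaid)))  # number of attention points
print(content_coverage(slide, overlaid))       # combined area in pixels
```

In this toy example each blob becomes one attention point, and the coverage is the total number of heat-map pixels in the 10 s window.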

figure 1

Top-left: An example slide. Top-right: the same slide overlaid with the heatmap. Bottom-left: Resulting image after subtracting image without the heat-map (top-left) from heat-map overlaid image (top-right). Bottom-right: applying connected components to the bottom-left image

Attention points typically represented the different areas where the students focused their attention. The number of the attention points would depict the number of attention zones and the area of the attention points (Content Coverage) would depict the total time spent on a particular zone. We used the area covered by attention points per 10 s to check for the mediation effect on the relationship across the levels of performance and learning motivation. The area covered by the attention points typically indicated the content coverage for students. The content coverage indicates the content read by the students and the time spent on the content.

Scanpath based variables

An area of interest (AOI) was said to be missed by a participant if the participant did not look at that AOI at all during the period the AOI was present on the screen. In terms of learning behaviour, AOI misses would translate to completely ignoring some parts of the slides. We counted the number of such AOIs per slide in the MOOC video as a scan-path variable and compared the number of misses per slide across the levels of performance and learning strategy (for details on areas of interest see Holmqvist et al. 2011 ).
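As a small illustration of this definition, the number of AOI misses on one slide is the number of AOIs shown on that slide that never appear among the AOIs the participant fixated. The function name and the AOI labels below are hypothetical.

```python
def count_aoi_misses(aois_on_slide, fixated_aois):
    """Number of AOIs present on the slide that were never looked at.

    aois_on_slide: labels of all AOIs defined for the slide.
    fixated_aois:  labels of the AOIs hit by the participant's fixations,
                   in any order and possibly with repetitions.
    """
    return len(set(aois_on_slide) - set(fixated_aois))

aois = ["AOI1", "AOI2", "AOI3", "AOI4"]
gaze = ["AOI1", "AOI3", "AOI1"]          # AOI2 and AOI4 were never fixated
print(count_aoi_misses(aois, gaze))
```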

AOI backtracks

A back-track was defined as a saccade to an AOI that is not in the usual forward reading direction and that had already been visited by the student. For example, in Fig. 2 , if a saccade goes from AOI3 to AOI2 it would be counted as a back-track. AOI back-tracks would represent rereading behaviour while learning from the MOOC video. The notion of rereading in the present study is slightly different from the one used in existing research (for example, Millis and King ( 2001 ), Dowhower ( 1987 ) and Paris and Jacobs ( 1984 )). The difference comes from the fact that in the present study the students did not reread the slides completely, but they could refer back to previously seen content on the slide for as long as the slide was visible. We counted the number of back-tracks per slide in the MOOC video as a scan-path variable and compared the number of back-tracks per slide across the levels of performance and motivation (Fig. 3 shows the typical AOIs on a slide).
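The back-track count can be sketched as follows. The sketch assumes that AOIs are numbered in the usual forward reading order (so a lower number means "earlier on the slide") and that the scanpath has already been reduced to the sequence of AOIs visited. Both are assumptions made for illustration.

```python
def count_backtracks(aoi_sequence):
    """Count saccades that go against the reading order to an AOI
    that the student has already visited.

    aoi_sequence: AOI indices in the order they were fixated,
                  e.g. [1, 2, 3, 2] for AOI1 -> AOI2 -> AOI3 -> AOI2.
    """
    visited = set()
    previous = None
    backtracks = 0
    for current in aoi_sequence:
        against_reading_order = previous is not None and current < previous
        if against_reading_order and current in visited:
            backtracks += 1
        visited.add(current)
        previous = current
    return backtracks

print(count_backtracks([1, 2, 3, 2, 4, 1]))  # AOI3 -> AOI2 and AOI4 -> AOI1
print(count_backtracks([3, 1, 2]))           # AOI1 was not visited before
```

Note that a backward jump to an AOI that has not been seen yet (the second call) is not counted, since nothing is being reread.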

figure 2

A typical example of a scanpath (left); and the computation of different variables (right)

figure 3

Example of a scan-path and Areas of Interest (AOI) definition. The rectangles show the AOIs defined for the displayed slide in the MOOC video and the red curve shows the visual path for 2.5 s


With-me-ness measures how much the student is paying attention to what the teacher is saying or pointing at (Sharma et al., 2014 ; Sharma et al., 2015a , b ). With-me-ness is defined at two levels of teacher-student interaction: perceptual and conceptual.

Perceptual with-me-ness

Perceptual with-me-ness measures whether the student looks at the items referred to by the teacher through deictic acts (sometimes accompanied by words like “here” or “this variable”, or only by verbal references like “the counter” or “the sum”). Deictic references are implemented by using two cameras during MOOC recording, one that captures the teacher’s face and one, above the writing surface, that captures the hand movements. In some MOOCs, the hand is not visible but the teacher uses a digital pen whose traces on the display (underlining a word, circling an object, adding an arrow) act as deictic gestures. Perceptual with-me-ness has three main components: entry time, first fixation duration and the number of revisits (Fig. 4 ). Entry time (Fig. 4 top-right) is the temporal lag between the time a referring pointer appears on the screen and stops at the referred site (x,y) and the first time the student’s gaze stops at (x,y). First fixation duration (Fig. 4 bottom-left) is how long the student’s gaze stops at the referred site for the first time. Revisits (Fig. 4 bottom-right) are the number of times the student’s gaze comes back to the referred site. The measure of perceptual with-me-ness is an arithmetic combination of these components ( FFD = First Fixation Duration; ET = Entry Time; NumRV = Number of revisits; RV = Re Visit duration):

figure 4

A typical example of following the teacher’s deictic gestures in the video lecture

The with-me-ness measurement has also been used by Sharma et al. ( 2014 and 2015a , b ) to measure how much time the students spent in following the teacher’s deictics and dialogues. Sharma et al. ( 2014 and 2015a , b ) found this measure to be correlated to the learning gains of the students. We have extended the analyses to include the student motivation as an independent variable, learning as the dependent variable and gaze behavior as the mediating variable.
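The components of perceptual with-me-ness (entry time, first fixation duration, number of revisits and revisit duration) could be extracted from a fixation list roughly as sketched below. The fixation format, the `radius` tolerance around the referred site, and the convention that a "visit" is a run of consecutive on-site fixations are all assumptions made for illustration. The sketch only extracts the components and does not attempt to reproduce how they are combined into the final measure.

```python
from math import hypot

def perceptual_components(fixations, t_ref, site, radius=50.0):
    """Extract ET, FFD, NumRV and RV for one deictic reference.

    fixations: list of (start_ms, end_ms, x, y), sorted by start time.
    t_ref:     time (ms) at which the pointer stops at the referred site.
    site:      (x, y) position of the referred site.
    radius:    hypothetical tolerance (pixels) for "looking at the site".
    """
    entry_time = first_fixation_duration = None
    visits = 0
    revisit_duration = 0
    previous_on_site = False
    for start, end, x, y in fixations:
        if start < t_ref:
            continue  # ignore gaze that precedes the reference
        on_site = hypot(x - site[0], y - site[1]) <= radius
        if on_site:
            if not previous_on_site:
                visits += 1          # gaze (re-)enters the referred site
            if entry_time is None:
                entry_time = start - t_ref
                first_fixation_duration = end - start
            elif visits > 1:
                revisit_duration += end - start
        previous_on_site = on_site
    return {
        "ET": entry_time,
        "FFD": first_fixation_duration,
        "NumRV": max(visits - 1, 0),
        "RV": revisit_duration,
    }

fixations = [
    (900, 1100, 400, 300),   # before the reference: ignored
    (1200, 1500, 110, 95),   # first fixation on the site
    (1500, 1700, 400, 300),  # gaze leaves the site
    (1700, 1900, 90, 105),   # revisit starts
    (1900, 2000, 95, 100),   # same revisit continues
    (2000, 2200, 500, 500),
]
print(perceptual_components(fixations, t_ref=1000, site=(100, 100)))
```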

Conceptual with-me-ness

Conceptual with-me-ness is defined by the discourse of the teacher (i.e., to what extent students look at the object that the teacher is verbally referring to); Fig. 5 provides an example. Thus, conceptual with-me-ness measures how often a student looks at the objects verbally referred to by the teacher during the whole course of time (the complete video). In order to have a consistent measure of conceptual with-me-ness, we normalize the time a student looks at the overlapping content by the slide duration.
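A minimal sketch of this normalization, under stated assumptions: the teacher's speech has been annotated as time intervals, each tagged with the object being referred to; the student's fixations are tagged with the object they land on; and reference intervals for the same object do not overlap each other. The data layout and names are hypothetical.

```python
def conceptual_with_me_ness(fixations, references, slide_duration):
    """Fraction of the slide duration during which the student looks at the
    object the teacher is verbally referring to at that moment.

    fixations:  list of (start_s, end_s, object_label) for the student.
    references: list of (start_s, end_s, object_label) for the teacher's speech.
    """
    looked = 0.0
    for f_start, f_end, f_obj in fixations:
        for r_start, r_end, r_obj in references:
            if f_obj == r_obj:
                # Temporal overlap between the fixation and the reference.
                looked += max(0.0, min(f_end, r_end) - max(f_start, r_start))
    return looked / slide_duration

references = [(0, 4, "sum"), (5, 9, "counter")]
fixations = [(1, 3, "sum"), (3, 6, "title"), (6, 8, "counter"), (8, 10, "sum")]
print(conceptual_with_me_ness(fixations, references, slide_duration=10.0))
```

The last fixation lands on "sum" after the teacher has stopped talking about it, so it does not contribute to the measure.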

figure 5

A typical example of following the teacher’s speech in the video lecture

We used the motivation scales from the SPQ (Biggs et al., 2001 ). This is a 5-point Likert scale questionnaire containing 10 questions (5 for deep and 5 for surface motivation). Deep motivation is defined as having intrinsic motivation towards learning, while surface motivation is defined as fear of failing in the tests (Biggs et al., 2001 ). In this study we used the mean motivation (mean of deep and surface), which has an average value of 2.10 (Std. Dev. = 1.20).

At the end of the videos the students took a test about the content they were taught in the two videos. The score from this test was considered to be the learning performance in this paper; from this point on, we refer to it as learning. The mean learning value was 6.9 out of 10 (Std. Dev. = 1.6). The instructor of the MOOC helped the authors create the multiple-choice quiz for the two videos. This quiz was similar to the one used in the MOOC running on the Coursera platform.

Data analysis

Mediation analysis.

To identify how “with-me-ness” (measured by eye-tracking) mediates the relationship between students’ motivation (measured by the questionnaire) and learning (measured by the post quiz) within a MOOC we employ mediation analysis proposed by Baron and Kenny ( 1986 ). In our analysis, we used motivation as the predictor, learning as the outcome and gaze behaviour as the mediating variables. Figure  6 shows the schematic representation of the model.

figure 6

Schematic representation of mediation effect and the example from the present contribution

To examine the capacity of with-me-ness to mediate the relationship between motivation and learning, we followed Baron and Kenny’s ( 1986 ) three-step process: a) the predictor (i.e., motivation) must significantly influence the mediator (i.e., with-me-ness); b) the predictor (i.e., motivation) must significantly influence the outcome (i.e., learning); c) both predictor and mediator are employed to predict the outcome: if both of them significantly affect the outcome, then the mediator partially mediates the impact of the predictor on the outcome; if the influence of the mediator is significant but the influence of the predictor is not, then the mediator fully mediates the impact of the predictor on the outcome.
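The decision logic of the three steps can be written down compactly. The sketch below takes the p-values of the relevant regression coefficients as inputs (how they are obtained is left out), and the significance level `alpha = 0.05` and the example p-values are assumptions for illustration, not values from Table 2.

```python
def classify_mediation(p_pred_to_med, p_pred_to_out,
                       p_pred_joint, p_med_joint, alpha=0.05):
    """Classify mediation following the three Baron and Kenny steps.

    p_pred_to_med: p-value of predictor -> mediator            (step a)
    p_pred_to_out: p-value of predictor -> outcome             (step b)
    p_pred_joint:  p-value of the predictor when predictor and
                   mediator jointly predict the outcome        (step c)
    p_med_joint:   p-value of the mediator in that joint model (step c)
    """
    if p_pred_to_med >= alpha or p_pred_to_out >= alpha:
        return "none"      # step a or step b is not satisfied
    if p_med_joint >= alpha:
        return "none"      # the mediator does not predict the outcome
    if p_pred_joint < alpha:
        return "partial"   # predictor stays significant next to the mediator
    return "full"          # predictor is no longer significant

# Hypothetical p-values, for illustration only.
print(classify_mediation(0.01, 0.02, 0.30, 0.01))  # predictor drops out
print(classify_mediation(0.01, 0.02, 0.03, 0.01))  # predictor stays significant
```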

Learning outcome and motivation prediction: feature extraction

For predicting the learning outcome from the behaviour data, we used the heat-maps and a pretrained deep neural network to generate the features. Figure  7 shows the basic working pipeline to extract the features from the heatmap image to the basic feature vector. Following are the steps to extract features from the eye-tracking data and the video lecture.

Overlay the eye-tracking data on the video to create the heatmap.

Create the heatmap image from every presentation slide in the video lecture (this step resulted in 15 heatmap images per participant).

Use the pretrained VGG-19 (Simonyan & Zisserman, 2014 ) deep neural network architecture to extract the 1000 features per image.

Use a non-overlapping and sliding window of size 10 to reduce the number of features to 100.
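The last step can be illustrated with a short sketch. It assumes that the reduction is the mean of each consecutive, non-overlapping block of 10 values, which is one reading of "non-overlapping and sliding window"; the function name is hypothetical and the 1000-element input stands in for the network output for one heatmap image.

```python
def block_average(features, window=10):
    """Reduce a feature vector by averaging consecutive, non-overlapping
    blocks of `window` values (1000 features -> 100 with window=10)."""
    if len(features) % window != 0:
        raise ValueError("feature length must be a multiple of the window size")
    return [sum(features[i:i + window]) / window
            for i in range(0, len(features), window)]

# Stand-in for the 1000 features extracted from one heatmap image.
features = [float(i) for i in range(1000)]
reduced = block_average(features)
print(len(reduced), reduced[0], reduced[-1])
```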

figure 7

Pipeline for extracting features from the heatmap of every minute of the eye-tracking data. Each slice of heatmap provides us with 1000 features, which are then reduced to 100 features using a moving average non-overlapping window

Feature selection for learning outcome and motivation prediction: least absolute shrinkage selection operator

To select the most important features we employ the Least Absolute Shrinkage and Selection Operator (LASSO, Tibshirani, 1996 ). LASSO is an extension of Ordinary Least Squares (OLS) regression suited to cases where the number of examples is less than the length of the feature vector (Tibshirani, 1996 ). To find the best fitting curve for a set of data points, OLS minimizes the Residual Sum of Squares (RSS), which is the sum of the squared differences between the actual values of the dependent variable y and the fitted values ŷ, that is, \( \sum {\left(\hat{\mathrm{y}}-y\right)}^2 \) .

Plain OLS places no constraint on the coefficients \( {\beta}_i \) . Shrinkage methods minimize the same quantity \( \sum {\left(\hat{\mathrm{y}}-y\right)}^2 \) subject to a constraint on the coefficients; with the constraint \( \sum {\beta}_i^2\le s \) this is known as ridge regression, where s is called the shrinkage factor. LASSO performs a similar optimization with a slight difference in the constraint, which is now \( \sum \mid {\beta}_i\mid \le s \) . With LASSO, some of the \( {\beta}_i \) will be exactly zero, so choosing s is like choosing the number of predictors in a regression model. Cross-validation can be used to estimate the best suited value for s . Here, we used 10-fold cross-validation to select the appropriate value of s .
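To illustrate why the absolute-value constraint sets some coefficients exactly to zero, here is a minimal sketch that solves the equivalent penalized form, minimizing (1/2n) Σ(y − ŷ)² + λ Σ|β_i|, by cyclic coordinate descent. The penalty weight `lam` plays the role of the shrinkage factor s (a larger `lam` corresponds to a smaller s). The solver, the names and the toy data are assumptions for illustration and not the implementation used in the study; the sketch has no intercept and assumes no feature column is all zeros.

```python
def soft_threshold(value, lam):
    """Shrink `value` towards zero by `lam`; return exactly 0 inside [-lam, lam]."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1/2n) * sum((y - X beta)^2) + lam * sum(|beta_j|)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0
            for i in range(n):
                # Fit of sample i using every feature except feature j.
                fit_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - fit_without_j)
            rho /= n
            z = sum(X[i][j] ** 2 for i in range(n)) / n  # assumed non-zero
            beta[j] = soft_threshold(rho, lam) / z
    return beta

# Toy data: y depends only on the first feature (y = 3 * x1).
X = [[-2.0, 1.0], [-1.0, -1.0], [1.0, -1.0], [2.0, 1.0]]
y = [-6.0, -3.0, 3.0, 6.0]
print(lasso_coordinate_descent(X, y, lam=0.5))  # second coefficient is zeroed
print(lasso_coordinate_descent(X, y, lam=8.0))  # strong penalty zeroes both
```

With a moderate penalty the irrelevant second feature is dropped and the first coefficient is shrunk slightly below its least-squares value of 3; with a strong penalty every coefficient is set to zero. In practice the penalty would be chosen by cross-validation, as described above.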

Learning outcome and motivation prediction and prediction evaluation

In order to predict the learning outcome of the students, we used several prediction algorithms. These algorithms include Gaussian process models (Rasmussen, 2003 ) with linear and polynomial kernels, Support Vector Machines (SVM, Scholkopf & Smola, 2001 ) also with linear and polynomial kernels, Random Forests (Liaw & Wiener, 2002 ), and Generalised Additive Models (GAM, Hastie, 1993 ; Hastie & Tibshirani, 1990 ). The main reason for using these particular algorithms is that they are designed to handle datasets that have high frequency for fewer examples.

We divided the whole dataset into training (80%, 32 students) and testing (20%, 8 students). For removing the selection bias from the training set, we performed a 5-fold cross-validation. The results reported are the average error rate for all the cross-validation iterations.
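The split described above can be sketched as follows. The shuffling seed, the way folds are formed (round-robin over the shuffled training ids) and the function names are assumptions made for illustration.

```python
import random

def split_train_test(n_students=40, test_fraction=0.2, seed=0):
    """Shuffle student ids and hold out a test set (8 of 40 by default)."""
    ids = list(range(n_students))
    random.Random(seed).shuffle(ids)
    n_test = round(n_students * test_fraction)
    return ids[n_test:], ids[:n_test]          # (train ids, test ids)

def k_folds(train_ids, k=5):
    """Yield (fit_ids, validation_ids) pairs for k-fold cross-validation."""
    folds = [train_ids[i::k] for i in range(k)]
    for i in range(k):
        fit_ids = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield fit_ids, folds[i]

train_ids, test_ids = split_train_test()
print(len(train_ids), len(test_ids))
for fit_ids, val_ids in k_folds(train_ids):
    print(len(fit_ids), len(val_ids))
```

Because 32 is not divisible by 5, the validation folds have 6 or 7 students each; the reported error would then be the average over the five iterations.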

For evaluating our prediction results, we are using the Normalized Root Mean Squared Error (NRMSE). NRMSE is the proposed metric for student models (Pelánek, 2015 ), and is used in most of the articles in learning technology (Moreno-Marcos, Alario-Hoyos, Muñoz-Merino, & Kloos, 2018 ) for measuring the accuracy of learning prediction.
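For reference, a minimal NRMSE sketch is given below. Several normalizations are in use (by range, mean or standard deviation); this sketch assumes normalization by the range of the target, either the observed range or a known scale range passed as the hypothetical `value_range` argument (e.g., 10 for a quiz scored out of 10).

```python
from math import sqrt

def nrmse(actual, predicted, value_range=None):
    """Root mean squared error divided by the range of the target variable."""
    n = len(actual)
    rmse = sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)
    if value_range is None:
        value_range = max(actual) - min(actual)   # observed range (assumption)
    return rmse / value_range

# Toy example: every prediction is off by one point on a 0-10 scale.
print(nrmse([0, 10], [1, 9]))
print(nrmse([6, 7, 8], [7, 8, 9], value_range=10))
```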

To answer the first research question about the mediation effect of the gaze behaviour on the relation between learning and motivation, we will present the mediation analyses with content coverage, scanpath variables and with-me-ness. Further, to answer the second research question about the predicting ability of simplistic gaze variables, we will present the prediction results for both the students’ motivation and their learning.

To examine the mediation effect of content coverage we tested the model shown in Fig. 8 . As shown in Table 2 , the direct link between motivation and content coverage was significant and hence satisfied the first condition. The link between motivation and learning was also significant and hence satisfied the second condition as well. Moreover, the direct relationship between motivation and learning was not significant when the content coverage variable was added. In Table 2 we present the results of the mediation analyses (row one for content coverage).

figure 8

Left: learning predicted by motivation. Middle: content coverage predicted by motivation. Right: learning predicted by motivation ( red  = high content coverage, blue  = low content coverage)

We observe that learning can be significantly predicted by motivation and that content coverage can be predicted by motivation. Finally, there is a significant prediction of learning by motivation and content coverage, however the coefficient of motivation is not significant anymore. Thus we can conclude that the content coverage fully mediates the relation between motivation and learning. The positive correlation between the motivation and learning is higher for the students with the higher content coverage than the positive correlation between motivation and learning for the students with the lower content coverage. It is clear from Fig. 8 that the students with high motivation have higher chances of getting a high score if they have high content coverage than the students with lower motivation.

Scanpath variables

To examine the mediation effect of scanpath variables we tested the model shown in Fig. 9 with both the AOI misses and the AOI backtracks. As shown in Table 2 , the direct link between motivation and both scanpath variables was significant and hence satisfied the first condition. The link between motivation and learning was also significant and hence satisfied the second condition as well. However, the direct relationship between motivation and learning was still significant when the scanpath variables (misses and backtracks) were added. In Table 2 we present the results of the two mediation analyses (row two for AOI misses and row three for AOI backtracks).

figure 9

Top left: learning predicted by motivation. Top-middle: AOI backtracks predicted by motivation. Top-right: AOI misses predicted by motivation. Bottom left: learning predicted by motivation ( red  = high AOI backtracks, blue  = low AOI backtracks). Bottom right: learning predicted by motivation ( red  = high AOI misses, blue  = low AOI misses)

We observe that learning can be significantly predicted by motivation and that AOI backtracks can be predicted by motivation. Finally, there is a significant prediction of learning by motivation and AOI backtracks; however, the coefficient of motivation is still significant. Thus we can conclude that AOI backtracks only partially mediate the relation between motivation and learning. We can see that the correlation between motivation and learning is more positive for the students with a high number of AOI backtracks than for the students with a low number of AOI backtracks. It is clear from Fig. 9 that the students with high motivation have higher chances of getting a high score if they perform more AOI backtracks than the students with lower motivation.

Next, we observe that learning can be significantly predicted by motivation, and that AOI misses can be predicted by motivation. Also, there is a significant prediction of learning by motivation and AOI misses; however, the coefficient of motivation is still significant. Thus we can conclude that AOI misses only partially mediate the relation between motivation and learning. We can see that the correlation between motivation and learning is more negative for the students with a high number of AOI misses than for the students with a low number of AOI misses. It is clear from Fig. 9 that the students with low motivation have higher chances of getting a low score if they miss more AOIs than the students with higher motivation.

To examine the mediation effect of with-me-ness we tested the model shown in Fig. 10 with both the perceptual and the conceptual variables of with-me-ness. As shown in Table 2 , the direct link between motivation and both variables of with-me-ness was significant and hence satisfied the first condition. The link between motivation and learning was also significant and hence satisfied the second condition as well. Moreover, the direct relationship between motivation and learning was not significant when the with-me-ness variables (perceptual and conceptual) were added. In Table 2 we present the results of the two mediation analyses (row four for perceptual and row five for conceptual with-me-ness).

figure 10

Top left: learning predicted by motivation. Top-middle: perceptual with-me-ness predicted by motivation. Top-right: conceptual with-me-ness predicted by motivation. Bottom left: learning predicted by motivation ( red = high perceptual with-me-ness, blue = low perceptual with-me-ness). Bottom right: learning predicted by motivation ( red = high conceptual with-me-ness, blue = low conceptual with-me-ness)

We observe that learning can be significantly predicted by motivation and that perceptual with-me-ness can be predicted by motivation. Finally, there is a significant prediction of learning by motivation and perceptual with-me-ness; however, the coefficient of motivation is not significant anymore. Thus, we can conclude that perceptual with-me-ness fully mediates the relation between motivation and learning. We can see that the correlation between motivation and learning is more positive for the students with high perceptual with-me-ness than for the students with low perceptual with-me-ness. It is clear from Fig. 10 that the students with high motivation have higher chances of getting a high score if they have high perceptual with-me-ness than the students with lower motivation.

Next, we observe that learning can be significantly predicted by motivation, and that conceptual with-me-ness can be predicted by motivation. Also, there is a significant prediction of learning by motivation and conceptual with-me-ness; however, the coefficient of motivation is not significant anymore. Thus we can conclude that conceptual with-me-ness fully mediates the relation between motivation and learning. We can see that the correlation between motivation and learning is more positive for the students with high conceptual with-me-ness than for the students with low conceptual with-me-ness. It is clear from Fig. 10 that the students with high motivation have higher chances of getting a high score if they have high conceptual with-me-ness than the students with lower motivation.

Prediction results

Figure 11 shows the prediction results for the students’ learning and motivation. For learning prediction, we observed a minimum error of 5.04% (SD = 0.52%) using the Gaussian Process Model with a polynomial kernel. The second lowest error of 8.07% (SD = 0.54%) was obtained using a Support Vector Machine, also with a polynomial kernel. The worst error rate was found to be 11.18% (SD = 0.63%) using the Generalised Additive Models. For motivation prediction, we observed similar performances with the prediction algorithms. We observed a minimum error of 9.04% (SD = 0.56%) using the Gaussian Process Model with a polynomial kernel. The second lowest error of 10.98% (SD = 0.57%) was obtained using a Support Vector Machine, also with a polynomial kernel. The worst error rate was found to be 16.11% (SD = 0.63%) using the Generalised Additive Models.

figure 11

Different prediction algorithms to predict the students’ learning (from the tests) and their motivation (from the study process questionnaire). In both cases the top two prediction algorithms are the Gaussian process model with a polynomial kernel and the Support Vector Machine, also with a polynomial kernel

Discussions and conclusions

The reported study developed and empirically explored two models, in which teacher/student co-attention (i.e., with-me-ness) was found to mediate the relationship between motivation and learning in MOOC videos. These two models demonstrated how co-attention not only influences learning but also affects the effect of motivation on learning, thereby quantifying an often-overlooked element (i.e., the instructor’s capacity to draw students’ attention) in online courses.

The attention points, derived from the heat-maps, were indicative of the students’ attention in terms of both screen space and time. The area of the attention points depended on the time spent on a specific area of the screen. A higher average area of the attention points could be interpreted as more reading time during a particular period. The well-performing students with higher motivation had the highest content coverage (larger areas of attention) among all the participants, despite having spent a similar amount of time on the slides.

However, more reading time did not always guarantee higher performance. Byrne, Freebody, and Gates ( 1992 ) showed the inverse in a longitudinal reading study by proving that the best performing students were the fastest readers. On the other hand, Reinking ( 1988 ) showed that there was no relation between comprehension and reading time. As Just and Carpenter ( 1980 ) put it, “There is no single mode of reading. Reading varies as a function of who is reading, what they are reading, and why they are reading it.” The uncertainty of results about the relation between performance and reading time led us to examine the relation between reading time, performance and learning motivation. We found that the good performers had more reading time than the poor performers and the highly motivated learners had more reading time than the less motivated learners. We could interpret this reading behaviour, based upon the reading time differences, in terms of more attention being paid by the well-performing students with high learning motivation than by other student profiles. We could use content coverage to give feedback to the students about their attention span. Moreover, one could also use content coverage for student profiling based on performance and learning motivation.

The area of interest (AOI) misses and back-tracks were the temporal features computed from the temporal order of AOIs looked at. We found that good-performers with high motivation had significantly fewer AOI misses than the poor-performers with low motivation. AOI misses could be useful in providing students with the feedback about their viewing behaviour just by looking at what AOIs they missed.

The AOI back-tracks were indicative of the rereading behaviour of the students. We found that the good performers and highly motivated learners had significantly more back-tracks than the poor performers. Moreover, some of the good performers back-tracked to all the previously seen content, which explains the special distribution of AOI back-tracks for good performers. Millis and King ( 2001 ) and Dowhower ( 1987 ) showed in their studies that rereading improved comprehension. In the present study, the scenario is somewhat different from that of Millis and King ( 2001 ) and Dowhower ( 1987 ): the students did not read the study material again. Instead, the students referred back to the previously seen content during the time the slide was visible to them. Thus, the relation between rereading of the same content and performance should be interpreted cautiously; further experimentation is clearly needed to reach a causal conclusion.

One interesting finding in the present study was that content coverage fully mediated the relation between performance and learning motivation, whereas AOI misses and AOI back-tracks had only partial mediation effects. This could be interpreted in terms of the type of information we considered to compute the respective variables. For example, the content coverage computation took into account both the screen space and the time information, whereas the computation of AOI back-tracks (and misses) required only the temporal information. However, in the context of the present study, we could not separate spatial from temporal information or conclude how each affected the relation between the gaze variables, performance and learning strategy.

In addition, we found that high-performers (those who scored high in the test) had more perceptual with-me-ness on the referred sites than the low-performers. This is in accordance with the literature, where Jermann and Nüssli ( 2012 ) showed that better performing pairs had more recurrent gaze patterns during the moments of deictic references. We also found that the students who scored better in the test were following the teacher, both in deictics and in discourse, in a more efficient manner than those who did not score well in the test. These results were not surprising, but they could be utilised to inform the students about their attention levels during MOOC lectures in an automatic and objective manner. The results also contribute towards our long-term goal of defining student profiles based on their performance and motivation using the gaze data. The attention points can serve as delayed feedback to the students based on their attention span.

The conceptual with-me-ness can be explained as a gaze-measure for the efforts of the student to sustain common ground within the teacher-student dyad. Dillenbourg and Traum ( 2006 ) and Richardson, Dale, and Kirkham ( 2007 ) emphasised upon the importance of grounding gestures to sustain shared understanding in collaborative problem solving scenarios. A video is not a dialogue; the learner has to build common grounds, asymmetrically, with the teacher. The correlation we observed between conceptual with-me-ness and the test score (r = 0.36, p  < 0.05) seemed to support this hypothesis.

Another interesting finding of our study is that conceptual with-me-ness has a higher percentage mediation than perceptual with-me-ness (39% for conceptual as compared to 33% for perceptual with-me-ness). This shows that eye-tracking can provide access not only to students’ attention but also to students’ information processing mechanisms. Thus, students’ gaze is an important source of information that can be used to inform online learning.

Finally, from the prediction results, we were able to show that the heat-maps can be used not only as a popular visualization tool, but also as a source of features to predict performance and other traits, such as motivation. The best prediction result for performance had a 5.04% normalized error. In terms of a quiz-based evaluation of learning, which in our case consists of 10 questions, this error translates to less than one question. For example, if a student answers 9 questions correctly, our method will predict the score within the range of [8.5–9.5]. Similarly, on the motivation scale, which is a 5-point Likert scale making it in the range of [0–50], the error of 9.04% would translate to one incorrect prediction out of ten on the scale proposed by Biggs et al. ( 2001 ).

Additionally, the eye-tracking variables we defined in this contribution have different pre-processing requirements, which affect how readily they can be used within an adaptive, real-time system. The computation of content coverage is real-time and requires no pre-processing of the data or the stimulus. The scan-path variables can also be computed in real time, and only a small amount of pre-processing is required to define the areas of interest (AOIs) needed to compute them. The pre-processing for computing the perceptual with-me-ness could be automated, since there are computer-vision methods to detect pointing and other deictic gestures of the teacher. Once this detection is done, the real-time computation of perceptual with-me-ness is fairly straightforward. Finally, the conceptual with-me-ness requires a few manual interventions in transcribing the teacher’s dialogue and mapping it to the content. This hinders its real-time computation; it is therefore the only gaze-based measure used in this study that requires further work before it can be used within a personalised, adaptive, gaze-based feedback system.
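As an illustration of why AOI-based variables lend themselves to real-time use once the AOIs are defined, the sketch below updates two common per-AOI measures (first fixation duration and number of revisits) one fixation at a time. The class names, the rectangular AOI layout, and the exact definitions of the measures are assumptions made for this example and are not taken from the study.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AOI:
    """Rectangular area of interest in screen pixels (hypothetical layout)."""
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


@dataclass
class AOIStats:
    first_fixation_ms: Optional[float] = None  # duration of the first fixation
    visits: int = 0                            # separate entries into the AOI

    @property
    def revisits(self) -> int:
        # Every entry after the first one counts as a revisit.
        return max(self.visits - 1, 0)


class StreamingAOITracker:
    """Updates per-AOI statistics incrementally, one fixation at a time."""

    def __init__(self, aois):
        self.aois = list(aois)
        self.stats = {aoi.name: AOIStats() for aoi in self.aois}
        self._current = None  # AOI that held the previous fixation, if any

    def update(self, x, y, duration_ms):
        """Process one fixation and return the name of the AOI it hit, or None."""
        hit = next((a.name for a in self.aois if a.contains(x, y)), None)
        if hit is not None:
            stats = self.stats[hit]
            if stats.first_fixation_ms is None:
                stats.first_fixation_ms = duration_ms
            if hit != self._current:
                stats.visits += 1  # gaze entered this AOI from elsewhere
        self._current = hit
        return hit


# The only pre-processing step: define the AOIs for the stimulus.
tracker = StreamingAOITracker([
    AOI("slide_text", 0, 0, 100, 100),
    AOI("teacher", 101, 0, 200, 100),
])
for fixation in [(10, 10, 200), (15, 12, 150), (110, 10, 300), (500, 500, 100), (20, 20, 180)]:
    tracker.update(*fixation)
print(tracker.stats["slide_text"].revisits, tracker.stats["teacher"].revisits)
```

Each call to `update` does a constant amount of work per AOI, so the statistics are available as soon as a fixation is detected, without a second pass over the recording.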

To gain further insight into the design of MOOC videos and the affordances of the respective systems, we need to consider the eye-gaze measurements (which we can call gaze analytics) that we found not only to be strongly associated with learning, but also to mediate the influence of other variables (i.e., motivation). Discussing these features from a technical standpoint can give rise to practical implications for the design of MOOC videos (e.g., designing them in a way that draws students’ attention; Kizilcec et al., 2014) and of the respective video-based learning systems (e.g., offering an indication of students’ attention based on the web camera).

For future work, we are now beginning to collect eye-tracking data from different types of instruction (e.g., pair problem solving) utilizing different stimuli (e.g., ones not controlled by the student, unlike the video). In addition, we intend to investigate whether a comparable association exists for different groups of students (e.g., novices). After identifying the role of with-me-ness and other gaze analytics in different contexts, we will be able to propose how gaze analytics can be integrated into various contemporary learning systems. For example, this would allow us to build student profiles based on their performance and learning strategy using gaze analytics, and ultimately to provide gaze-aware feedback to improve the overall learning process.

Availability of data and materials

As it is possible to identify participants from the data, ethical requirements do not permit us to share participant data from this study.


Massive open online course

Artificial intelligence

Cognitive theory of multimedia learning

Special purpose online courses

eXtended massive open online course

Study process questionnaire

Volume, variety, velocity, and veracity

Area of interest

First fixation duration

Number of ReVisits

Standard Deviation

Visual Geometry Group

Least absolute shrinkage and selection operator

Ordinary least squares

Residual sum of squares

Generalised additive models

Normalised root mean squared error

Alemdag, E., & Cagiltay, K. (2018). A systematic review of eye tracking research on multimedia learning. Computers & Education, 125, 413–428.

Ashraf, H., Sodergren, M. H., Merali, N., Mylonas, G., Singh, H., & Darzi, A. (2018). Eye-tracking technology in medical education: A systematic review. Medical Teacher, 40 (1), 62–69.

Astin, A. W. (1971). Predicting academic performance in college: Selectivity data for 2300 American colleges .

Baron, R. M., & Kenny, D. A. (1986). The moderator–mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51 (6), 1173.

Biggs, J., Kember, D., & Leung, D. Y. (2001). The revised two-factor study process questionnaire: R-SPQ-2F. The British Journal of Educational Psychology, 71 (1), 133–149.

Byrne, B., Freebody, P., & Gates, A. (1992). Longitudinal data on the relations of word-reading strategies to comprehension, reading time, and phonemic awareness. Reading Research Quarterly, 27 , 141–151.

Cennamo, K. S., Savenye, W. C., & Smith, P. L. (1991). Mental effort and video-based learning: The relationship of preconceptions and the effects of interactive and covert practice. Educational Technology Research and Development, 39 (1), 5–16.

Choi, H. J., & Johnson, S. D. (2005). The effect of context-based video instruction on learning and motivation in online courses. American Journal of Distance Education, 19 (4), 215–227.

Dillenbourg, P., & Traum, D. (2006). Sharing solutions: Persistence and grounding in multimodal collaborative problem solving. The Journal of the Learning Sciences, 15 (1), 121–151.

Dowhower, S. L. (1987). Effects of repeated reading on second-grade transitional readers' fluency and comprehension. Reading Research Quarterly, 22, 389–406.

Finn, J. D. (1989). Withdrawing from school. Review of Educational Research, 59 (2), 117–142.

Giannakos, M. N. (2013). Exploring the video-based learning research: A review of the literature. British Journal of Educational Technology, 44 (6), E191–E195.

Giannakos, M. N., Chorianopoulos, K., & Chrisochoides, N. (2015). Making sense of video analytics: Lessons learned from clickstream interactions, attitudes, and learning outcome in a video-assisted course. The International Review of Research in Open and Distance Learning, 16 (1), 260–283.

Good, T. L., & Beckerman, T. M. (1978). Time on task: A naturalistic study in sixth-grade classrooms. The Elementary School Journal, 78 (3), 193–201.

Grabe, M., & Latta, R. M. (1981). Cumulative achievement in a mastery instructional system: The impact of differences in resultant achievement motivation and persistence. American Educational Research Journal, 18 (1), 7–13.

Hastie, T. (1993). In Chambers and Hastie (1993), Statistical models in S. Chapman and Hall.

Hastie, T., & Tibshirani, R. (1990). Generalized additive models. Chapman and Hall.

Holmqvist, K., Nyström, M., Andersson, R., Dewhurst, R., Jarodzka, H., & Van de Weijer, J. (2011). Eye tracking: A comprehensive guide to methods and measures. OUP Oxford.

Jermann, P., & Nüssli, M. A. (2012). Effects of sharing text selections on gaze cross-recurrence and interaction quality in a pair programming task. In Proceedings of the ACM 2012 conference on computer supported cooperative work (pp. 1125–1134). ACM.

Just, M. A., & Carpenter, P. A. (1980). A theory of reading: From eye fixations to comprehension. Psychological Review, 87 (4), 329.

Kaplan, R. M. (1982). Nader's raid on the testing industry: Is it in the best interest of the consumer? The American Psychologist, 37 (1), 15.

Kim, J., Nguyen, P. T., Weir, S., Guo, P. J., Miller, R. C., & Gajos, K. Z. (2014). Crowdsourcing step-by-step information extraction to enhance existing how-to videos. In Proceedings of the SIGCHI conference on human factors in computing systems (pp. 4017–4026). ACM.

Kizilcec, R. F., Papadopoulos, K., & Sritanyaratana, L. (2014). Showing face in video instruction: Effects on information retention, visual attention, and affect. In Proceedings of the SIGCHI conference on human factors in computing systems (pp. 2095–2102). ACM.

Liaw, A., & Wiener, M. (2002). Classification and regression by randomForest. R News, 2 (3), 18–22.

Mayer, R. E., & Moreno, R. (2003). Nine ways to reduce cognitive load in multimedia learning. Educational Psychologist, 38 (1), 43–52.

Millis, K. K., & King, A. (2001). Rereading strategically: The influences of comprehension ability and a prior reading on the memory for expository text. Reading Psychology, 22 (1), 41–65.

Moreno-Marcos, P. M., Alario-Hoyos, C., Muñoz-Merino, P. J., & Kloos, C. D. (2018). Prediction in MOOCs: A review and future research directions. IEEE Transactions on Learning Technologies, 12 (3), 384–401.

Paivio, A. (2013). Imagery and verbal processes . Psychology Press.

Paris, S. G., & Jacobs, J. E. (1984). The benefits of informed instruction for children's reading awareness and comprehension skills. Child Development, 2083–2093.

Pelánek, R. (2015). Metrics for evaluation of student models. Journal of Educational Data Mining, 7 (2), 1–19.

Prieto, L. P., Sharma, K., Dillenbourg, P., & Jesús, M. (2016). Teaching analytics: Towards automatic extraction of orchestration graphs using wearable sensors. In Proceedings of the sixth international conference on Learning Analytics & Knowledge (pp. 148–157). ACM.

Rasmussen, C. E. (2003). Gaussian processes in machine learning. In Summer School on machine learning (pp. 63–71). Berlin, Heidelberg: Springer.

Reinking, D. (1988). Computer-mediated text and comprehension differences: The role of reading time, reader preference, and estimation of learning. Reading Research Quarterly, 23, 484–498.

Richardson, D. C., Dale, R., & Kirkham, N. Z. (2007). The art of conversation is coordination. Psychological Science, 18 (5), 407–413.

Scholkopf, B., & Smola, A. J. (2001). Learning with kernels: Support vector machines, regularization, optimization, and beyond. MIT Press.

Schwartz, D. L., & Hartman, K. (2007). It is not television anymore: Designing digital video for learning and assessment. In R. Goldman, R. Pea, B. Barron, & S. J. Derry (Eds.), Video research in the learning sciences (pp. 335–348).

Seaton, D. T., Bergner, Y., Chuang, I., Mitros, P., & Pritchard, D. E. (2014). Who does what in a massive open online course?

Severin, W. (1967). Another look at cue summation. AV Communication Review, 15 (3), 233–245.

Sharma, K., Caballero, D., Verma, H., Jermann, P., & Dillenbourg, P. (2015a). Looking AT versus looking THROUGH: A dual eye-tracking study in MOOC context. In Proceedings of Computer Supported Collaborative Learning 2015 (pp. 260–267). International Society of the Learning Sciences (ISLS).

Sharma, K., Caballero, D., Verma, H., Jermann, P., & Dillenbourg, P. (2015b). Shaping learners’ attention in massive open online courses. Revue internationale des technologies en pédagogie universitaire/International Journal of Technologies in Higher Education, 12 (1–2), 52–61.

Sharma, K., Jermann, P., & Dillenbourg, P. (2014). “With-me-ness”: A gaze-measure for students’ attention in MOOCs. In Proceedings of international conference of the learning sciences 2014 (No. CONF, pp. 1017–1022). ISLS.

Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.

Špakov, O., & Miniotas, D. (2007). Visualization of eye gaze data using heat maps. Elektronika ir elektrotechnika, 74 , 55–58.

Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B: Methodological, 58 (1), 267–288.

Tobagi, F. A. (1995). Distance learning with digital video. IEEE Multimedia, 2 (1), 90–93.

Volery, T., & Lord, D. (2000). Critical success factors in online education. International Journal of Educational Management, 14 (5), 216–223.

Wiley, D. E. (1976). Another hour, another day: Quantity of schooling, a potent path for policy. In Schooling and achievement in American society (pp. 225–265).


This work is supported by the Norwegian Research Council under the projects FUTURE LEARNING (number: 255129/H20) and Xdesign (290994/F20).

No funding was received for this study.

Author information

Authors and affiliations.

Department of Computer Science, Norwegian University of Science and Technology, Trondheim, Norway

Kshitij Sharma & Michail Giannakos

Department of Computer Science, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland

Pierre Dillenbourg


KS designed and conducted the study, analysed the data and drafted the manuscript. MG participated in the analysis of the data and framing of the contribution. PD participated in the conceptualisation and design of the study. The authors read and approved the final manuscript.

Corresponding author

Correspondence to Kshitij Sharma.

Ethics declarations

Competing interests.

Participation was voluntary, and all data were collected anonymously. Appropriate permissions and ethical approval for the participation were requested and granted.

There is no potential conflict of interest in this study.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

Sharma, K., Giannakos, M. & Dillenbourg, P. Eye-tracking and artificial intelligence to enhance motivation and learning. Smart Learn. Environ. 7 , 13 (2020). https://doi.org/10.1186/s40561-020-00122-x

Received : 04 September 2019

Accepted : 03 April 2020

Published : 26 April 2020

DOI : https://doi.org/10.1186/s40561-020-00122-x

  • Eye-tracking
  • Multimodal analytics
  • Massive open online courses
  • Deep learning

Using eye-tracking in education: review of empirical research and technology

  • Research Article
  • Published: 24 January 2024

  • Fengfeng Ke   ORCID: orcid.org/0000-0003-4203-1203 1 ,
  • Ruohan Liu 2 ,
  • Zlatko Sokolikj 1 ,
  • Ibrahim Dahlstrom-Hakki 3 &
  • Maya Israel 4  

This study aims to provide a systematic review of recent eye-tracking studies conducted with children and adolescents in learning settings, as well as a scoping review of the technologies and machine learning approaches used for eye-tracking. To this end, 68 empirical studies containing 78 experiments were analyzed. Eye-tracking devices as well as the ever-evolving mechanisms of gaze prediction endorsed in the prior and current research were identified. The review results indicated a set of salient patterns governing the employment of eye-tracking measures and the inferred cognitive constructs in learning, along with the common practices in analyzing and presenting the eye-tracking data. Eye-tracking has been used to track engagement, learning interactions, and learning-relevant cognitive activities mainly in a research lab or a highly-controlled learning setting. The mechanisms of gaze capturing and prediction with learners in a dynamic and authentic learning environment are evolving.

Articles marked with an asterisk were included in the systematic review

Alemdag, E., & Cagiltay, K. (2018). A systematic review of eye tracking research on multimedia learning. Computers & Education, 125 , 413–428. https://doi.org/10.1016/j.compedu.2018.06.023

Article   Google Scholar  

Anderson, J. R. (2005). Cognitive psychology and its implications . Macmillan.

Google Scholar  

Anderson, J. R., Bothell, D., & Douglass, S. (2004). Eye movements do not reflect retrieval processes: Limits of the eye-mind hypothesis. Psychological Science, 15 (4), 225–231. https://doi.org/10.1111/j.0956-7976.2004.00656.x

Andrzejewska, M., & Stolińska, A. (2016). Comparing the difficulty of tasks using eye tracking combined with subjective and behavioural criteria. Journal of Eye Movement Research . https://doi.org/10.16910/jemr.9.3.3

Bagot, K. L., Kuo, F. E., & Allen, F. C. (2007). Amendments to the perceived restorative components scale for children (PRCS-C II). Children Youth and Environments, 17 (4), 124–127. https://doi.org/10.7721/chilyoutenvi.17.4.0124

*Barnes, A. E., & Kim, Y. S. (2016). Low-skilled adult readers look like typically developing child readers: A comparison of reading skills and eye movement behavior. Reading and Writing, 29 (9), 1889–1914. https://doi.org/10.1007/s11145-016-9657-5

Bauer, P. J., & Dugan, J. A. (2020). Memory development. Neural circuit and cognitive development (pp. 395–412). Academic Press.

Chapter   Google Scholar  

Bendall, R. C., Lambert, S., Galpin, A., Marrow, L. P., & Cassidy, S. (2019). Psychophysiological indices of cognitive style: A triangulated study incorporating neuroimaging, eye-tracking, psychometric and behavioral measures. Personality and Individual Differences , 144 , 68–78. https://doi.org/10.1016/j.paid.2019.02.034

Blascheck, T., Kurzhals, K., Raschke, M., Burch, M., Weiskopf, D., & Ertl, T. (2017). Visualization of eye tracking data: A taxonomy and survey. Computer Graphics Forum, 36 (8), 260–284. https://doi.org/10.1111/cgf.13079

Bradbury-Jones, C., Breckenridge, J. P., Clark, M. T., Herber, O. R., Jones, C., & Taylor, J. (2019). Advancing the science of literature reviewing in social research: The focused mapping review and synthesis. International Journal of Social Research Methodology, 22 (5), 451–462. https://doi.org/10.1080/13645579.2019.1576328

Braun, V., & Clarke, V. (2012). Thematic analysis. American Psychological Association . https://doi.org/10.1037/13620-004

*Bolden, D., Barmby, P., Raine, S., & Gardner, M. (2015). How young children view mathematical representations: A study using eye-tracking technology. Educational Research, 57 (1), 59–79. https://doi.org/10.1080/00131881.2014.983718

*Bosma, E., & Nota, N. (2020). Cognate facilitation in Frisian-Dutch bilingual children’s sentence reading: An eye-tracking study. Journal of Experimental Child Psychology, 189 , 104699. https://doi.org/10.1016/j.jecp.2019.104699

Burris, J. L., Barry-Anwar, R. A., & Rivera, S. M. (2017). An eye tracking investigation of attentional biases towards affect in young children. Developmental Psychology, 53 (8), 1418. https://doi.org/10.1037/dev0000345

Carter, B. T., & Luke, S. G. (2020). Best practices in eye tracking research. International Journal of Psychophysiology, 155 , 49–62. https://doi.org/10.1016/j.ijpsycho.2020.05.010

*Childers, J. B., Porter, B., Dolan, M., Whitehead, C. B., & McIntyre, K. P. (2020). Does children’s visual attention to specific objects affect their verb learning? First Language, 40 (1), 21–40.

*Chita-Tegmark, M., Arunachalam, S., Nelson, C. A., & Tager-Flusberg, H. (2015). Eye-tracking measurements of language processing: Developmental differences in children at high risk for ASD. Journal of Autism and Developmental Disorders, 45 (10), 3327–3338.

*Clinton, V., Cooper, J. L., Michaelis, J. E., Alibali, M. W., & Nathan, M. J. (2017). How revisions to mathematical visuals affect cognition: Evidence from eye tracking. Eye-tracking technology applications in educational research (pp. 195–218). IGI Global.

Cognolato, M., Atzori, M., & Müller, H. (2018). Head-mounted eye gaze tracking devices: An overview of modern devices and recent advances. Journal of Rehabilitation and Assistive Technologies Engineering . https://doi.org/10.1177/2055668318773991

Cowen, L., Ball, L. J., & Delin, J. (2002). An eye movement analysis of web page usability. People and computers XVI-memorable yet invisible (pp. 317–335). Springer.

*Dahlstrom-Hakki, I., Asbell-Clarke, J., & Rowe, E. (2019). Showing is knowing: The potential and challenges of using neurocognitive measures of implicit learning in the classroom. Mind, Brain, and Education, 13 (1), 30–40. https://doi.org/10.1111/mbe.12177

*Desmeules-Trudel, F., Moore, C., & Zamuner, T. S. (2020). Monolingual and bilingual children’s processing of coarticulation cues during spoken word recognition. Journal of Child Language, 47 (6), 1189–1206. https://doi.org/10.1017/s0305000920000100

Duchowski, A. T. (2018). Gaze-based interaction: A 30 year retrospective. Computers & Graphics , 73 , 59–69. https://doi.org/10.1016/j.cag.2018.04.002

Dye, M. W., & Hauser, P. C. (2014). Sustained attention, selective attention and cognitive control in deaf and hearing children. Hearing Research, 309 , 94–102. https://doi.org/10.1016/j.heares.2013.12.001

Ellis, N. C., Hafeez, K., Martin, K. I., Chen, L., Boland, J., & Sagarra, N. (2014). An eye-tracking study of learned attention in second language acquisition. Applied Psycholinguistics, 35 (3), 547–579. https://doi.org/10.1017/S0142716412000501

*Eilers, S., Tiffin-Richards, S. P., & Schroeder, S. (2018). Individual differences in children’s pronoun processing during reading: Detection of incongruence is associated with higher reading fluency and more regressions. Journal of Experimental Child Psychology, 173 , 250–267. https://doi.org/10.1016/j.jecp.2018.04.005

*Erickson, L. C., Thiessen, E. D., Godwin, K. E., Dickerson, J. P., & Fisher, A. V. (2015). Endogenously and exogenously driven selective sustained attention: Contributions to learning in kindergarten children. Journal of Experimental Child Psychology, 138 , 126–134. https://doi.org/10.1016/j.jecp.2015.04.011

Faber, M., Krasich, K., Bixler, R. E., Brockmole, J. R., & D’Mello, S. K. (2020). The eye–mind wandering link: Identifying gaze indices of mind wandering across tasks. Journal of Experimental Psychology: Human Perception and Performance, 46 (10), 1201–1221. https://doi.org/10.1037/xhp0000743

*Falck-Ytter, T. (2015). Gaze performance during face-to-face communication: A live eye tracking study of typical children and children with autism. Research in Autism Spectrum Disorders, 17 , 78–85. https://doi.org/10.1016/j.rasd.2015.06.007

*Falck-Ytter, T., Carlström, C., & Johansson, M. (2015). Eye contact modulates cognitive processing differently in children with autism. Child Development, 86 (1), 37–47. https://doi.org/10.1111/cdev.12273

Frazier, T. W., Klingemier, E. W., Parikh, S., Speer, L., Strauss, M. S., Eng, C., Hardan, A. Y., & Youngstrom, E. A. (2018). Development and validation of objective and quantitative eye tracking—based measures of autism risk and symptom levels. Journal of the American Academy of Child & Adolescent Psychiatry, 57 (11), 858–866. https://doi.org/10.1016/j.jaac.2018.06.023

Fisher, A., Thiessen, E., Godwin, K., Kloos, H., & Dickerson, J. (2013). Assessing selective sustained attention in 3-to 5-year-old children: Evidence from a new paradigm. Journal of Experimental Child Psychology, 114 (2), 275–294. https://doi.org/10.1016/j.jecp.2012.07.006

Findlay, J. M., Findlay, J. M., & Gilchrist, I. D. (2003). Active vision: The psychology of looking and seeing . Oxford University Press.

Book   Google Scholar  

Forssman, L., Ashorn, P., Ashorn, U., Maleta, K., Matchado, A., Kortekangas, E., & Leppänen, J. M. (2017). Eye-tracking-based assessment of cognitive function in low-resource settings. Archives of Disease in Childhood, 102 (4), 301–302. https://doi.org/10.1136/archdischild-2016-310525

*Garcia-Zapirain, B., de la Torre Díez, I., & López-Coronado, M. (2017). Dual system for enhancing cognitive abilities of children with ADHD using leap motion and eye-tracking technologies. Journal of Medical Systems, 41 (7), 1–8. https://doi.org/10.1007/s10916-017-0757-9

Gaskell, M. G., & Dumay, N. (2003). Lexical competition and the acquisition of novel words. Cognition, 89 (2), 105–132. https://doi.org/10.1016/S0010-0277(03)00070-2

Geangu, E., Hauf, P., Bhardwaj, R., & Bentz, W. (2011). Infant pupil diameter changes in response to others’ positive and negative emotions. PLoS ONE, 6 (11), e27132. https://doi.org/10.1371/journal.pone.0027132

Geisen, E., & Bergstrom, J. R. (2017). Usability testing for survey research. Morgan Kaufmann . https://doi.org/10.1016/B978-0-12-803656-3.00001-4

George, A., & Routray, A. (2016). Real-time eye gaze direction classification using convolutional neural network. 2016 international conference on signal processing and communications (SPCOM) (pp. 1–5). IEEE.

Giannakos, M. N., Papavlasopoulou, S., & Sharma, K. (2020). Monitoring children’s learning through wearable eye-tracking: The case of a making-based coding activity. IEEE Pervasive Computing, 19 (1), 10–21. https://doi.org/10.1109/MPRV.2019.2941929

Godfroid, A. (2013). Eye tracking. In P. J. Robinson (Ed.), The Routledge encyclopedia of second language acquisition (pp. 234–236). Routledge. https://doi.org/10.4324/9781315775616

Goswami, U. (2019). Cognitive development and cognitive neuroscience: The learning brain . Routledge.

Grant, M. J., & Booth, A. (2009). A typology of reviews: An analysis of 14 review types and associated methodologies. Health Information & Libraries Journal, 26 (2), 91–108.

Groves, P. M., & Thompson, R. F. (1970). Habituation: A dual-process theory. Psychological Review, 77 (5), 419.

*Gulz, A., Londos, L., & Haake, M. (2020). Preschoolers’ understanding of a teachable agent-based game in early mathematics as reflected in their gaze behaviors—an experimental study. International Journal of Artificial Intelligence in Education, 30 (1), 38–73. https://doi.org/10.1007/s40593-020-00193-4

*Hahn, N., Snedeker, J., & Rabagliati, H. (2015). Rapid linguistic ambiguity resolution in young children with autism spectrum disorder: Eye tracking evidence for the limits of weak central coherence. Autism Research, 8 (6), 717–726. https://doi.org/10.1002/aur.1487

Harezlak, K., & Kasprowski, P. (2018). Application of eye tracking in medicine: A survey, research issues and challenges. Computerized Medical Imaging and Graphics, 65 , 176–190. https://doi.org/10.1016/j.compmedimag.2017.04.006

Hannula, D. E., Ryan, J. D., Tranel, D., & Cohen, N. J. (2007). Rapid onset relational memory effects are evident in eye movement behavior, but not in hippocampal amnesia. Journal of Cognitive Neuroscience, 19 (10), 1690–1705. https://doi.org/10.1162/jocn.2007.19.10.1690

*Hautala, J., Kiili, C., Kammerer, Y., Loberg, O., Hokkanen, S., & Leppänen, P. H. (2018). Sixth graders’ evaluation strategies when reading Internet search results: An eye-tracking study. Behaviour & Information Technology, 37 (8), 761–773. https://doi.org/10.1080/0144929x.2018.1477992

*Heathcote, L. C., Lau, J. Y. F., Mueller, S. C., Eccleston, C., Fox, E., Bosmans, M., & Vervoort, T. (2017). Child attention to pain and pain tolerance are dependent upon anxiety and attention control: An eye-tracking study. European Journal of Pain, 21 (2), 250–263. https://doi.org/10.1002/ejp.920

*Hessel, A. K., Nation, K., & Murphy, V. A. (2021). Comprehension monitoring during reading: An eye-tracking study with children learning English as an additional language. Scientific Studies of Reading, 25 (2), 159–178.

Huang, M. X., Kwok, T. C., Ngai, G., Chan, S. C., & Leong, H. V. (2016). Building a person-alized, auto-calibrating eye tracker from user interactions. Proceedings of the 2016 CHI conference on human factors in computing systems (pp. 5169–5179). CHI.

Holmqvist, K., Nyström, M., Andersson, R., Dewhurst, R., Jarodzka, H., & Van de Weijer, J. (2011). Eye tracking: A comprehensive guide to methods and measures . OUP Oxford.

Hosp, B., Eivazi, S., Maurer, M., Fuhl, W., Geisler, D., & Kasneci, E. (2020). RemoteEye: An open-source high-speed remote eye tracker: Implementation insights of a pupil-and glint-detection algorithm for high-speed remote eye tracking. Behavior Research Methods, 52 (3), 1387–1401. https://doi.org/10.3758/s13428-019-01305-2

*Howard, L. H., Riggins, T., & Woodward, A. L. (2020). Learning from others: The effects of agency on event memory in young children. Child Development, 91 (4), 1317–1335. https://doi.org/10.1111/cdev.13303

Huettig, F., Rommers, J., & Meyer, A. S. (2011). Using the visual world paradigm to study language processing: A review and critical evaluation. Acta Psychologica, 137 (2), 151–171. https://doi.org/10.1016/j.actpsy.2010.11.003

Irwin, D. E. (2004). Fixation location and fixation duration as indices of cognitive processing. The Interface of Language, Vision, and Action: Eye Movements and the Visual World, 217 , 105–133. https://doi.org/10.4324/9780203488430

Jacob, R. J., & Karn, K. S. (2003). Eye tracking in human-computer interaction and usability research: Ready to deliver the promises. The mind’s eye (pp. 573–605). North-Holland.

*Jiang, S., Jiang, X., & Siyanova-Chanturia, A. (2020). The processing of multiword expressions in children and adults: An eye-tracking study of Chinese. Applied Psycholinguistics, 41 (4), 901–931. https://doi.org/10.1017/S0142716420000296

*Jian, Y. C., & Ko, H. W. (2017). Influences of text difficulty and reading ability on learning illustrated science texts for children: An eye movement study. Computers & Education, 113 , 263–279. https://doi.org/10.1016/j.compedu.2017.06.002

Just, M. A., & Carpenter, P. A. (1980). A theory of reading: From eye fixations to comprehension. Psychological Review, 87 (4), 329. https://doi.org/10.1037/0033-295X.87.4.329

*Jung, Y. J., Zimmerman, H. T., & Pérez-Edgar, K. (2018). A methodological case study with mobile eye-tracking of child interaction in a science museum. TechTrends, 62 (5), 509–517. https://doi.org/10.1007/s11528-018-0310-9

Kaplan, R., & Kaplan, S. (1989). The experience of nature: A psychological perspective. Cambridge University Press . https://doi.org/10.1037/030621

Kaplan, S. (1995). The restorative benefits of nature: Toward an integrative framework. Journal of Environmental Psychology, 15 (3), 169–182. https://doi.org/10.1016/0272-4944(95)90001-2

Kaakinen, J. K., Ballenghein, U., Tissier, G., & Baccino, T. (2018). Fluctuation in cognitive engagement during reading: Evidence from concurrent recordings of postural and eye movements. Journal of Experimental Psychology: Learning, Memory, and Cognition, 44 (10), 1671. https://doi.org/10.1037/xlm0000539

*Khu, M., Chambers, C. G., & Graham, S. A. (2020). Preschoolers flexibly shift between speakers’ perspectives during real-time language comprehension. Child Development, 91 (3), e619–e634. https://doi.org/10.1111/cdev.13270

Kiefer, P., Giannopoulos, I., Raubal, M., & Duchowski, A. (2017). Eye tracking for spatial research: Cognition, computation, challenges. Spatial Cognition & Computation, 17 (1–2), 1–19. https://doi.org/10.1080/13875868.2016.1254634

King, J., & Markant, J. (2020). Individual differences in selective attention and scanning dynamics influence children’s learning from relevant non-targets in a visual search task. Journal of Experimental Child Psychology, 193 , 104797. https://doi.org/10.1016/j.jecp.2019.104797

Kitchenham, B. (2004). Procedures for performing systematic reviews. Technical report TR/SE0401, Keele University, and Technical Report 0400011T.1, National ICT Australia. https://www.inf.ufsc.br/~aldo.vw/kitchenham.pdf

Klingner, J. (2010). Measuring cognitive load during visual tasks by combining pupillometry and eye tracking. Stanford University.

*Koch, F. S., Sundqvist, A., Thornberg, U. B., Nyberg, S., Lum, J. A., Ullman, M. T., Barr, R., Rudner, M., & Heimann, M. (2020). Procedural memory in infancy: Evidence from implicit sequence learning in an eye-tracking paradigm. Journal of Experimental Child Psychology, 191 , 104733. https://doi.org/10.1016/j.jecp.2019.104733

*Köder, F., & Falkum, I. L. (2020). Children’s metonymy comprehension: Evidence from eye-tracking and picture selection. Journal of Pragmatics, 156 , 191–205. https://doi.org/10.1016/j.pragma.2019.07.007

Kooiker, M. J., Pel, J. J., van der Steen-Kant, S. P., & van der Steen, J. (2016). A method to quantify visual information processing in children using eye tracking. JoVE (Journal of Visualized Experiments), 113 , e54031. https://doi.org/10.3791/54031

Korbach, A., Brünken, R., & Park, B. (2018). Differentiating different types of cognitive load: A comparison of different measures. Educational Psychology Review, 30 (2), 503–529. https://doi.org/10.1007/s10648-017-9404-8

Krafka, K., Khosla, A., Kellnhofer, P., Kannan, H., Bhandarkar, S., Matusik, W., & Torralba, A. (2016). Eye tracking for everyone. Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2176–2184). IEEE. https://doi.org/10.1109/CVPR.2016.239

Kruger, J. L., & Doherty, S. (2016). Measuring cognitive load in the presence of educational video: Towards a multimodal methodology. Australasian Journal of Educational Technology. https://doi.org/10.14742/ajet.3084

Kulke, L. V., Atkinson, J., & Braddick, O. (2016). Neural differences between covert and overt attention studied using EEG with simultaneous remote eye tracking. Frontiers in Human Neuroscience, 10. https://doi.org/10.3389/fnhum.2016.00592

Lai, H. Y., Saavedra-Pena, G., Sodini, C. G., Sze, V., & Heldt, T. (2019). Measuring saccade latency using smartphone cameras. IEEE Journal of Biomedical and Health Informatics, 24 (3), 885–897. https://doi.org/10.1109/jbhi.2019.2913846

Lai, M. L., Tsai, M. J., Yang, F. Y., Hsu, C. Y., Liu, T. C., Lee, S. W. Y., Lee, M. H., Chiou, G. L., Liang, J. C., & Tsai, C. C. (2013). A review of using eye-tracking technology in exploring learning from 2000 to 2012. Educational Research Review, 10 , 90–115. https://doi.org/10.1016/j.edurev.2013.10.001

*Laing, C. E. (2017). A perceptual advantage for onomatopoeia in early word learning: Evidence from eye-tracking. Journal of Experimental Child Psychology, 161 , 32–45. https://doi.org/10.1016/j.jecp.2017.03.017

Lavie, N. (1995). Perceptual load as a necessary condition for selective attention. Journal of Experimental Psychology: Human Perception and Performance, 21 (3), 451. https://doi.org/10.1037/0096-1523.21.3.451

*Law, F., 2nd., Mahr, T., Schneeberg, A., & Edwards, J. (2017). Vocabulary size and auditory word recognition in preschool children. Applied Psycholinguistics, 38 (1), 89–125. https://doi.org/10.1017/S0142716416000126

*Li, M., Chen, Y., Wang, J., & Liu, T. (2020). Children’s attention toward cartoon executed photos. Annals of Tourism Research, 80 , 102799. https://doi.org/10.1016/j.annals.2019.102799

Liu, H. C., Lai, M. L., & Chuang, H. H. (2011). Using eye-tracking technology to investigate the redundant effect of multimedia web pages on viewers’ cognitive processes. Computers in Human Behavior, 27 (6), 2410–2417. https://doi.org/10.1016/j.chb.2011.06.012

Loberg, O., Hautala, J., Hämäläinen, J. A., & Leppänen, P. H. (2019). Influence of reading skill and word length on fixation-related brain activity in school-aged children during natural reading. Vision Research, 165 , 109–122. https://doi.org/10.1016/j.visres.2019.07.008

Lockhofen, D. E. L., & Mulert, C. (2021). Neurochemistry of visual attention. Frontiers in Neuroscience, 15 , 643597. https://doi.org/10.3389/fnins.2021.643597

Majaranta, P., & Bulling, A. (2014). Eye tracking and eye-based human–computer interaction. Advances in physiological computing (pp. 39–65). Springer.

Marcus, D. J., Karatekin, C., & Markiewicz, S. (2006). Oculomotor evidence of sequence learning on the serial reaction time task. Memory & Cognition, 34 (2), 420–432. https://doi.org/10.3758/BF03193419

*McEwen, R. N., & Dube, A. (2015). Engaging or distracting: Children’s tablet computer use in education. https://psycnet.apa.org/record/2015-47277-001

Mestres, E. T., & Pellicer-Sánchez, A. (2019). Young EFL learners’ processing of multimodal input: Examining learners’ eye movements. System, 80 , 212–223. https://doi.org/10.1016/j.system.2018.12.002

Moher, D., Liberati, A., Tetzlaff, J., Altman, D. G., & PRISMA Group. (2009). Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. Annals of Internal Medicine, 151 (4), 264–269. https://doi.org/10.7326/0003-4819-151-4-200908180-00135

*Murray, L., Wegener, S., Wang, H. C., Parrila, R., & Castles, A. (2022). Children processing novel irregular and regular words during reading: An eye tracking study. Scientific Studies of Reading, 26 (5), 417–431.

Miller, B. W. (2015). Using reading times and eye-movements to measure cognitive engagement. Educational Psychologist, 50 (1), 31–42. https://doi.org/10.1080/00461520.2015.1004068

*Miller, H. E., Kirkorian, H. L., & Simmering, V. R. (2020). Using eye-tracking to understand relations between visual attention and language in children’s spatial skills. Cognitive Psychology, 117 , 101264. https://doi.org/10.1016/j.cogpsych.2019.101264

*Molina, A. I., Navarro, Ó., Ortega, M., & Lacruz, M. (2018). Evaluating multimedia learning materials in primary education using eye tracking. Computer Standards & Interfaces, 59 , 45–60. https://doi.org/10.1016/j.csi.2018.02.004

*Nazaruk, S. (2020). Diagnosis of the mathematical skills of children from Polish kindergartens and its importance for geometric shape recognition. Early Childhood Education Journal, 48 (4), 463–472. https://doi.org/10.1007/s10643-019-01005-8

Obaidellah, U., Al Haek, M., & Cheng, P. C. H. (2018). A survey on the usage of eye-tracking in computer programming. ACM Computing Surveys (CSUR), 51 (1), 1–58. https://doi.org/10.1145/3145904

O’Brien, H. L., Cairns, P., & Hall, M. (2018). A practical approach to measuring user engagement with the refined user engagement scale (UES) and new UES short form. International Journal of Human-Computer Studies, 112 , 28–39. https://doi.org/10.1016/j.ijhcs.2018.01.004

O’Brien, H. L., & Toms, E. G. (2008). What is user engagement? A conceptual framework for defining user engagement with technology. Journal of the American Society for Information Science and Technology, 59 (6), 938–955. https://doi.org/10.1002/asi.20801

*Olsen, J. K., Ozgur, A. G., Sharma, K., & Johal, W. (2022). Leveraging eye tracking to understand children’s attention during game-based, tangible robotics activities. International Journal of Child-Computer Interaction, 31 , 100447.

*Pan, J., Liu, M., Li, H., & Yan, M. (2021). Chinese children benefit from alternating-color words in sentence reading. Reading and Writing, 34 (2), 355–369. https://doi.org/10.1007/s11145-020-10067-9

*Papavlasopoulou, S., Sharma, K., Giannakos, M., & Jaccheri, L. (2017). Using eye-tracking to unveil differences between kids and teens in coding activities. In Proceedings of the 2017 Conference on Interaction Design and Children (pp. 171–181). https://doi.org/10.1145/3078072.3079740

*Papavlasopoulou, S., Sharma, K., & Giannakos, M. N. (2018). How do you feel about learning to code? Investigating the effect of children’s attitudes towards coding using eye-tracking. International Journal of Child-Computer Interaction, 17 , 50–60. https://doi.org/10.1016/j.ijcci.2018.01.004

*Papavlasopoulou, S., Sharma, K., & Giannakos, M. N. (2020). Coding activities for children: Coupling eye-tracking with qualitative data to investigate gender differences. Computers in Human Behavior, 105 , 105939. https://doi.org/10.1016/j.chb.2019.03.003

Park, S., Aksan, E., Zhang, X., & Hilliges, O. (2020). Towards end-to-end video-based eye-tracking. In A. Vedaldi, H. Bischof, T. Brox, & J. M. Frahm (Eds.), European conference on computer vision (pp. 747–763). Cham: Springer. https://doi.org/10.1007/978-3-030-58610-2_44

Papoutsaki, A., Sangkloy, P., Laskey, J., Daskalova, N., Huang, J., & Hays, J. (2016). WebGazer: Scalable webcam eye tracking using user interactions. In Proceedings of the twenty-fifth international joint conference on artificial intelligence (pp. 3839–3845).

*Pellicer-Sánchez, A., Conklin, K., & Vilkaitė-Lozdienė, L. (2021). The effect of pre-reading instruction on vocabulary learning: An investigation of L1 and L2 readers’ eye movements. Language Learning, 71 (1), 162–203.

*Pellicer-Sánchez, A., Tragant, E., Conklin, K., Rodgers, M., Serrano, R., & Llanes, Á. (2020). Young learners’ processing of multimodal input and its impact on reading comprehension: An eye-tracking study. Studies in Second Language Acquisition, 42 (3), 577–598. https://doi.org/10.1017/S0272263120000091

*Pellicer-Sánchez, A., Tragant, E., Conklin, K., Rodgers, M., Llanes, A., & Serrano, R. (2018). L2 reading and reading-while-listening in multimodal learning conditions: An eye-tracking study. ELT Research Papers, 18 (1), 1–28.

Peterson, M. S., Kramer, A. F., & Irwin, D. E. (2004). Covert shifts of attention precede involuntary eye movements. Perception & Psychophysics, 66 (3), 398–405. https://doi.org/10.3758/bf03194888

Poole, A., & Ball, L. J. (2006). Eye tracking in HCI and usability research. Encyclopedia of human computer interaction (pp. 211–219). IGI Global. https://doi.org/10.4018/978-1-59140-562-7.ch034

Rayner, K. (2009). Eye movements in reading: Models and data. Journal of Eye Movement Research, 2 (5), 1–10. https://doi.org/10.16910/jemr.2.5.2

Rayner, K., Chace, K. H., Slattery, T. J., & Ashby, J. (2006). Eye movements as reflections of comprehension processes in reading. Scientific Studies of Reading, 10 (3), 241–255. https://doi.org/10.1207/s1532799xssr1003_3

Reichle, E. D., Reineberg, A. E., & Schooler, J. W. (2010). Eye movements during mindless reading. Psychological Science, 21 (9), 1300–1310.

Rueda, M. R., Fan, J., McCandliss, B. D., Halparin, J. D., Gruber, D. B., Lercari, L. P., & Posner, M. I. (2004). Development of attentional networks in childhood. Neuropsychologia, 42 (8), 1029–1040. https://doi.org/10.1016/j.neuropsychologia.2003.12.012

*Reuter, T., Borovsky, A., & Lew-Williams, C. (2019). Predict and redirect: Prediction errors support children’s word learning. Developmental Psychology, 55 (8), 1656. https://doi.org/10.1037/dev0000754

Schindler, M., & Lilienthal, A. J. (2019). Domain-specific interpretation of eye tracking data: Towards a refined use of the eye-mind hypothesis for the field of geometry. Educational Studies in Mathematics, 101 (1), 123–139. https://doi.org/10.1007/s10649-019-9878-z

*Shaked, K. B. Z., Shamir, A., & Vakil, E. (2020). An eye tracking study of digital text reading: A comparison between poor and typical readers. Reading and Writing. https://doi.org/10.1007/s11145-020-10021-9

Sharafi, Z., Soh, Z., & Guéhéneuc, Y. G. (2015). A systematic literature review on the usage of eye-tracking in software engineering. Information and Software Technology, 67 , 79–107. https://doi.org/10.1016/j.infsof.2015.06.008

*Skrabankova, J., Popelka, S., & Beitlova, M. (2020). Students’ ability to work with graphs in physics studies related to three typical student groups. Journal of Baltic Science Education, 19 (2), 298–316. https://doi.org/10.33225/jbse/20.19.298

Soluch, P., & Tarnowski, A. (2013). Eye-tracking methods and measures. In S. Grucza, M. Płużyczka, & J. Zając (Eds.), Translation studies and eye-tracking analysis (pp. 85–104). Peter Lang.

Sorden, S. D. (2012). The cognitive theory of multimedia learning. In B. Irby, G. H. Brown, R. Lara-Alecio, & S. A. Jackson (Eds.), Handbook of educational theories (pp. 155–167). Information Age Publishing.

*Sprenger, P., & Benz, C. (2020). Children’s perception of structures when determining cardinality of sets—results of an eye-tracking study with 5-year-old children. ZDM, 52 (4), 753–765. https://doi.org/10.1007/s11858-020-01137-x

*Stevenson, M. P., Dewhurst, R., Schilhab, T., & Bentsen, P. (2019). Cognitive restoration in children following exposure to nature: Evidence from the attention network task and mobile eye tracking. Frontiers in Psychology, 10 , 42. https://doi.org/10.3389/fpsyg.2019.00042

Strauss, A., & Corbin, J. (1998). Basics of qualitative research techniques. Thousand Oaks: Sage Publications.

*Sun, H., Loh, J., & Charles Roberts, A. (2019). Motion and sound in animated storybooks for preschoolers’ visual attention and Mandarin language learning: An eye-tracking study with bilingual children. AERA Open, 5 (2), 2332858419848431.

Sweller, J. (2010). Cognitive load theory: Recent theoretical advances. In J. L. Plass, R. Moreno, & R. Brünken (Eds.), Cognitive load theory (pp. 29–47). Cambridge University Press. https://doi.org/10.1017/CBO9780511844744.004

*Tamási, K., McKean, C., Gafos, A., & Höhle, B. (2019). Children’s gradient sensitivity to phonological mismatch: considering the dynamics of looking behavior and pupil dilation. Journal of Child Language, 46 (1), 1–23. https://doi.org/10.1017/S0305000918000259

*Takacs, Z. K., & Bus, A. G. (2016). Benefits of motion in animated storybooks for children’s visual attention and story comprehension: An eye-tracking study. Frontiers in Psychology, 7 , 1591. https://doi.org/10.3389/fpsyg.2016.01591

*Takacs, Z. K., & Bus, A. G. (2018). How pictures in picture storybooks support young children’s story comprehension: An eye-tracking experiment. Journal of Experimental Child Psychology, 174 , 1–12. https://doi.org/10.1016/j.jecp.2018.04.013

*Tiffin-Richards, S. P., & Schroeder, S. (2015). Word length and frequency effects on children’s eye movements during silent reading. Vision Research, 113 , 33–43. https://doi.org/10.1016/j.visres.2015.05.008

*Tribushinina, E., & Mak, W. M. (2016). Three-year-olds can predict a noun based on an attributive adjective: evidence from eye-tracking. Journal of Child Language, 43 (2), 425–441. https://doi.org/10.1017/S0305000915000173

*Trecca, F., Bleses, D., Madsen, T. O., & Christiansen, M. H. (2018). Does sound structure affect word learning? An eye-tracking study of Danish learning toddlers. Journal of Experimental Child Psychology, 167 , 180–203. https://doi.org/10.1016/j.jecp.2017.10.011

Valenti, R., Staiano, J., Sebe, N., & Gevers, T. (2009). Webcam-based visual gaze estimation. International conference on image analysis and processing (pp. 662–671). Springer. https://doi.org/10.1007/978-3-642-04146-4_71

Valliappan, N., Dai, N., Steinberg, E., et al. (2020). Accelerating eye movement research via accurate and affordable smartphone eye tracking. Nature Communications, 11 , 4553. https://doi.org/10.1038/s41467-020-18360-5

Vakil, E., Bloch, A., & Cohen, H. (2017). Anticipation measures of sequence learning: manual versus oculomotor versions of the serial reaction time task. The Quarterly Journal of Experimental Psychology, 70 (3), 579–589.

Van der Stigchel, S., Meeter, M., & Theeuwes, J. (2006). Eye movement trajectories and what they tell us. Neuroscience & Biobehavioral Reviews, 30 (5), 666–679. https://doi.org/10.1016/j.neubiorev.2005.12.001

Van’t Noordende, J. E., van Hoogmoed, A. H., Schot, W. D., & Kroesbergen, E. H. (2016). Number line estimation strategies in children with mathematical learning difficulties measured by eye tracking. Psychological Research Psychologische Forschung, 80 (3), 368–378. https://doi.org/10.1007/s00426-015-0736-z

Van Gog, T., Kester, L., Nievelstein, F., Giesbers, B., & Paas, F. (2009). Uncovering cognitive processes: Different techniques that can contribute to cognitive load research and instruction. Computers in Human Behavior, 25 (2), 325–331. https://doi.org/10.1016/j.chb.2008.12.021

van Gog, T., & Jarodzka, H. (2013). Eye tracking as a tool to study and enhance cognitive and metacognitive processes in computer-based learning environments. International handbook of metacognition and learning technologies (pp. 143–156). Springer. https://doi.org/10.1007/978-1-4419-5546-3_10

van Viersen, S., Protopapas, A., Georgiou, G. K., Parrila, R., Ziaka, L., & de Jong, P. F. (2022). Lexicality effects on orthographic learning in beginning and advanced readers of Dutch: An eye-tracking study. Quarterly Journal of Experimental Psychology, 75 (6), 1135–1154.

*Valleau, M. J., Konishi, H., Golinkoff, R. M., Hirsh-Pasek, K., & Arunachalam, S. (2018). An eye-tracking study of receptive verb knowledge in toddlers. Journal of Speech, Language, and Hearing Research, 61 (12), 2917–2933.

*Verdine, B. N., Bunger, A., Athanasopoulou, A., Golinkoff, R. M., & Hirsh-Pasek, K. (2017). Shape up: An eye-tracking study of preschoolers’ shape name processing and spatial development. Developmental Psychology, 53 (10), 1869. https://doi.org/10.1037/dev0000384

Wedel, M. (2015). Attention research in marketing: A review of eye-tracking studies. In J. M. Fawcett, E. F. Risko, & A. Kingstone (Eds.), The handbook of attention (pp. 569–588). Boston Review.

*Weighall, A. R., Henderson, L. M., Barr, D. J., Cairney, S. A., & Gaskell, M. G. (2017). Eye-tracking the time-course of novel word learning and lexical competition in adults and children. Brain and Language, 167 , 13–27. https://doi.org/10.1016/j.bandl.2016.07.010

Whittemore, R., & Knafl, K. (2005). The integrative review: Updated methodology. Journal of Advanced Nursing, 52 (5), 546–553. https://doi.org/10.1111/j.1365-2648.2005.03621.x

Wu, C. J., & Liu, C. Y. (2022). Refined use of the eye-mind hypothesis for scientific argumentation using multiple representations. Instructional Science, 50 (4), 551–569. https://doi.org/10.1007/s11251-022-09581-w

*Wu, C. J., Liu, C. Y., Yang, C. H., & Jian, Y. C. (2020). Eye-movements reveal children’s deliberative thinking and predict performance on arithmetic word problems. European Journal of Psychology of Education. https://doi.org/10.1007/s10212-020-00461-w

Xu, P., Ehinger, K. A., Zhang, Y., Finkelstein, A., Kulkarni, S. R., & Xiao, J. (2015). TurkerGaze: Crowdsourcing saliency with webcam based eye tracking. Preprint retrieved from https://arxiv.org/abs/1504.06755

*Yan, Z., Pei, M., & Su, Y. (2017). Children’s empathy and their perception and evaluation of facial pain expression: An eye tracking study. Frontiers in Psychology, 8 , 2284. https://doi.org/10.3389/fpsyg.2017.02284

Yang, S. N., & McConkie, G. W. (2001). Eye movements during reading: A theory of saccade initiation times. Vision Research, 41 (25–26), 3567–3585. https://doi.org/10.1016/S0042-6989(01)00025-6

*Yu, C., Suanda, S. H., & Smith, L. B. (2019). Infant sustained attention but not joint attention to objects at 9 months predicts vocabulary at 12 and 15 months. Developmental Science, 22 (1), e12735. https://doi.org/10.1111/desc.12735

Zagermann, J., Pfeil, U., & Reiterer, H. (2016). Measuring cognitive load using eye tracking technology in visual computing. In Proceedings of the sixth workshop on beyond time and errors on novel evaluation methods for visualization (pp. 78–85). https://doi.org/10.1145/2993901.2993908

*Zargar, E., Adams, A. M., & Connor, C. M. (2020). The relations between children’s comprehension monitoring and their reading comprehension and vocabulary knowledge: An eye-movement study. Reading and Writing, 33 (3), 511–545. https://doi.org/10.1007/s11145-019-09966-3

*Zawoyski, A. M., & Ardoin, S. P. (2019). Using eye-tracking technology to examine the impact of question format on reading behavior in elementary students. School Psychology Review, 48 (4), 320–332. https://doi.org/10.17105/SPR-2018-0014.V48-4

Zekveld, A. A., Heslenfeld, D. J., Johnsrude, I. S., Versfeld, N. J., & Kramer, S. E. (2014). The eye as a window to the listening brain: Neural correlates of pupil size as a measure of cognitive listening load. NeuroImage, 101 , 76–86. https://doi.org/10.1016/j.neuroimage.2014.06.069

Zhang, C., Yao, R., & Cai, J. (2018). Efficient eye typing with 9-direction gaze estimation. Multimedia Tools and Applications, 77 (15), 19679–19696. https://doi.org/10.1007/s11042-017-5426-y

*Zhou, P., Zhan, L., & Ma, H. (2019). Predictive language processing in preschool children with autism spectrum disorder: An eye-tracking study. Journal of Psycholinguistic Research, 48 (2), 431–452. https://doi.org/10.1007/s10936-018-9612-5



This work was partially supported by the US Department of Education’s Education Innovation and Research program [U411C190179].


Author information

Authors and affiliations.

Florida State University, Tallahassee, FL, 32306-4453, USA

Fengfeng Ke & Zlatko Sokolikj

University of Virginia, Charlottesville, VA, 22904, USA

TERC, Cambridge, MA, 02140, USA

Ibrahim Dahlstrom-Hakki

University of Florida, Gainesville, FL, 32611, USA

Maya Israel


Corresponding author

Correspondence to Fengfeng Ke.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Ke, F., Liu, R., Sokolikj, Z. et al. Using eye-tracking in education: review of empirical research and technology. Education Tech Research Dev (2024). https://doi.org/10.1007/s11423-024-10342-4

Accepted: 02 January 2024

Published: 24 January 2024

DOI: https://doi.org/10.1007/s11423-024-10342-4


  • Eye-tracking
  • Visual attention
  • Eye movement
  • Gaze prediction


A systematic review of eye tracking research on multimedia learning.





  1. A systematic review of eye tracking research on multimedia learning

    This study provides a current systematic review of eye tracking research in the domain of multimedia learning. The particular aim of the review is to explore how cognitive processes in multimedia learning are studied with relevant variables through eye tracking technology. To this end, 52 articles, including 58 studies, were analyzed.

  2. A systematic review of eye tracking research on multimedia learning

    In their systematic review of eye-tracking research in multimedia learning, [14] found that most studies have been conducted with university students, providing little empirical evidence for ...

  3. A systematic review of eye tracking research on multimedia learning

    A review of eye tracking research on video-based learning. This review sought to uncover how the utilisation of eye tracking technology has advanced understandings of the mechanisms underlying effective video-based learning and what type of caution should be exercised when interpreting the findings of these studies.

  4. A systematic review of eye‐tracking‐based research on animated

    The most challenging task in eye-tracking-based multimedia research is to establish a relationship between eye-tracking metrics (or cognitive processes) and learners' performance scores. Additionally, there are current debates about the effectiveness of animations (or simulations) in promoting learning in multimedia settings.

  5. A review of eye tracking research on video-based learning

    Eye tracking technology is increasingly used to understand individuals' non-conscious, moment-to-moment processes during video-based learning. This review evaluated 44 eye tracking studies on video-based learning conducted between 2010 and 2021. Specifically, the review sought to uncover how the utilisation of eye tracking technology has advanced understandings of the mechanisms underlying ...

  6. PDF A systematic review of eye tracking research on multimedia learning

    Please cite this article as: Alemdag E. & Cagiltay K., A systematic review of eye tracking research on multimedia learning, Computers & Education (2018), doi: 10.1016/j.compedu.2018.06.023. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to

  7. A systematic review of eye-tracking-based research on animated

    A systematic review of eye-tracking-based research on animated multimedia learning @article{Cokun2021ASR, title={A systematic review of eye-tracking-based research on animated multimedia learning}, author={Atakan Coşkun and Kursat Cagiltay}, journal={J. Comput. Assist.

  8. A systematic review of eye‐tracking‐based research on animated

    Eye-tracking technology can provide a multitude of metrics, enabling us to establish a reliable link between cognitive processes and learning. In a recent systematic review, Coskun and Cagiltay ...

  9. ERIC

    Background: The most challenging task in eye-tracking-based multimedia research is to establish a relationship between eye-tracking metrics (or cognitive processes) and learners' performance scores. Additionally, there are current debates about the effectiveness of animations (or simulations) in promoting learning in multimedia settings.

  10. Multimedia tools in the teaching and learning processes: A systematic

    Alemdag and Cagiltay (2018) conducted a systematic review of eye-tracking research on multimedia learning and found that while this research method was on the rise it was mainly used to understand the effects of multimedia use among higher education students. They also identified that although eye movements were linked to how students select ...

  11. A review study on eye-tracking technology usage in immersive virtual

    This systematic review study synthesizes research findings pertaining to the use of eye-tracking technology in immersive virtual reality (IVR) learning environments created by using head mounted displays. ... The study of multimedia learning through eye-tracking technology can provide feasible suggestions for the improvement of virtual teaching ...

  12. Looking through the model's eye: A systematic review of eye movement

    Eye-tracking methodology has begun to be used in educational studies, especially in multimedia learning. Literature review studies (Alemdag & Cagiltay, 2018; Lai et al., 2013) confirmed that there is an increasing interest in and use of the eye-tracking methodology in educational studies, particularly in multimedia learning. One potential reason for this increase may be the emerging use of eye ...

  13. PDF Using eye-tracking in education: review of empirical research and

    2013; Sharafi et al., 2015). Alemdag and Cagiltay (2018) conducted a systematic review of eye-tracking for the research of multimedia learning. This work reported the deployment of temporal and count scales of eye-tracking in studying the selection, organization, and integration of multimedia information. The authors advocated more research on using

  14. A systematic review of eye tracking research on multimedia learning

    This study provides a current systematic review of eye tracking research in the domain of multimedia learning. The particular aim of the review is to explore how cognitive processes in multimedia learning are studied with relevant variables through eye tracking technology. To this end, 52 articles, including 58 studies, were analyzed. Remarkable results are that (1) there is a burgeoning ...

  15. An Eye Tracking Based Investigation of Multimedia Learning ...

    2010), color coding (e.g., Ozcelik et al., 2009), and spatial contiguity (e.g., Johnson & Mayer, 2012). Review results from eye tracking studies in science learning suggest designing digital learning materials for science education based on multimedia learning design principles (Yang et al., 2018). Eye tracking methodology may

  16. Eye-tracking and artificial intelligence to enhance motivation and learning

    Moreover, in recent eye-tracking research we see similar sizes of the population used. For example, in two recent systematic reviews (Alemdag & Cagiltay, 2018; Ashraf et al., 2018) with a combined 85 different eye-tracking studies the majority (84.71%) of the studies had between 8 and 60 participants. The papers cited in this contribution with ...

  17. Using eye-tracking in education: review of empirical research and

    The review findings are synthesized as six themes. Themes 1-4 describe emergent patterns of using eye-tracking in studying cognition and learning (or RQ1), by recapitulating (a) theories and hypotheses tested or referenced in applied eye-tracking research for education, (b) cognitive constructs and mechanisms of learning frequently measured or inferred by eye-tracking; (c) the corresponding ...

  18. The Use of Eye Tracking as a Research and Instructional Tool in

    The present chapter summarizes the state of the art of using eye tracking in research on multimedia learning. It first provides an overview of the various eye tracking parameters that have been used in this field before describing its various functions as a research tool. As a research tool, eye tracking serves to test and refine assumptions regarding the process of learning with multimedia ...

  19. When Eye-Tracking Meets Machine Learning: A Systematic Review on

    Eye-gaze tracking research offers significant promise in enhancing various healthcare-related tasks, above all in medical image analysis and interpretation. Eye tracking, a technology that monitors and records the movement of the eyes, provides valuable insights into human visual attention patterns. This technology can transform how healthcare professionals and medical specialists engage with ...

  20. PDF When Eye-Tracking Meets Machine Learning: A Systematic Review on

    This systematic review investigates eye-gaze tracking applications and methodologies for enhancing ML/DL algorithms for medical image analysis in depth. Keywords: Eye-gaze Tracking; Medical Image Analysis; Deep Learning; Machine Learning 1 Introduction Human eye-gaze tracking research extends beyond the restriction of controlled laboratory environ-
