Download Venue: https://ieee-dataport.org/open-access/ireye4task
Publish Date: April 2023
Linked paper: S. Chen, J. Epps, "A High-Quality Landmarked Infrared Eye Video Dataset (IREye4Task): Eye Behaviors, Insights and Benchmarks for Wearable Mental State Analysis", IEEE Transactions on Affective Computing, 2023.
ABSTRACT
IREye4Task is a dataset for wearable eye landmark detection and mental state analysis. Sensing the mental state induced by different task contexts, where cognition is the focus, is as important as sensing the affective state, where emotion is induced in the foreground of consciousness, because completing tasks is part of every waking moment of life. However, few datasets are publicly available to advance mental state analysis, especially ones using the eye as the sensing modality with detailed ground truth for eye behaviors. Here, we share a high-quality, publicly accessible eye video dataset, IREye4Task, in which the eyelid, pupil and iris boundaries, as well as six eye states, four mental states at two load levels, and before- and after-experiment periods, are annotated for each frame across more than a million frames, capturing eye behaviors as responses to different task contexts. This dataset provides an opportunity to recognize eye behaviors from close-up infrared eye images and to examine the relationships between eye behaviors, different mental states and task performance.
Instructions:
This dataset is for research use only.
Participants:
Twenty participants (10 males, 10 females; age: M = 25.8, SD = 7.17), all above 18 years old, volunteered. All participants had normal or corrected-to-normal vision (corrected with contact lenses) and had no eye diseases causing obvious excessive blinking. They signed informed consent before the experiment and were unaware of the precise experimental hypotheses. All procedures performed in this study were approved by the University of New South Wales research ethics committee in Australia and were in accordance with its ethical standards.
Environment:
The experiment took place in a research lab where ceiling lights were uniformly distributed and the ripple-free lighting condition was constant. Three walls were covered with dark drapes; the fourth was a white wall with a window to another room (no light came through the window). Two chairs and one table were set up on one side of the room, where the participant sat at the table using a laptop while the experimenter sat opposite, using another laptop to conduct conversation as needed. Another table and chair were placed on the other side of the room; the participant was asked to walk to it from their table and back. The detailed experimental setting can be found in Figure 1 in [1].
Apparatus:
A wearable headset from Pupil Labs [2] was used to record left and right eye videos (640 × 480 pixels at 60 Hz) and a scene video (1280 × 720 pixels at 30 Hz). The headset was connected to a lightweight laptop via USB, and the three videos were recorded and stored on the laptop. The laptop was placed in a backpack that participants carried during the experiment, so their movement was not restricted.
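For reference, below is a minimal sketch of how one eye video might be read with OpenCV and checked against the recording specifications above; the file name is a hypothetical example and this is not part of the dataset's official tooling.

# Minimal sketch: read one eye video and verify its properties.
# "eye0.mp4" is a hypothetical example path, not a dataset file name.
import cv2

cap = cv2.VideoCapture("eye0.mp4")
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))    # expected: 640
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))  # expected: 480
fps = cap.get(cv2.CAP_PROP_FPS)                   # expected: ~60 Hz
print(f"{width}x{height} @ {fps:.1f} Hz")

frame_count = 0
while True:
    ok, frame = cap.read()  # frame is a close-up infrared eye image
    if not ok:
        break
    frame_count += 1
cap.release()
print(f"{frame_count} frames read")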
Experimental design:
Task instructions were displayed on a 14-inch laptop placed around 20-30 cm away from the participants, who were seated at a desk. Participants used a mouse or touchpad to click buttons shown on the laptop screen to choose an answer (for tasks requiring a response via the laptop) or to proceed to the next task. Meanwhile, to reduce the effect of the pupillary light reflex on pupil size change, they were instructed to always fixate on the screen and not to look around during the experiment. During the physical load task, however, their gaze naturally fell on the surroundings, but they followed the same walking path for the low and high load levels.
Four types of tasks were used to induce the four load types, each designed so that one load type was dominant and its level could be manipulated while the other load types were kept constant. They were modified from the experiment in [4]. The cognitive load task required summing two numbers displayed sequentially on the screen (rather than simultaneously as in [4]) and giving the answer verbally when ready, after clicking a button on the screen. The perceptual load task was to search a full-name list for a given first name (rather than an object as in [4]) that had previously been shown on the screen, and to click that name. The physical load task was to stand up, walk from the desk to another desk (around 5 metres), walk back and sit down (rather than lifting as in [4]). The communicative load task was to hold conversations with the experimenter to complete a simple conversation or an object-guessing game (with different questions and objects to guess than in [4]).
Before the experiment, participants had a training session in which they completed an example of each load level of each task type to become familiar with the tasks. They then put on the wearable devices and data collection started. There were four blocks corresponding to the four task types: five addition tasks in the cognitive load block, two search tasks in the perceptual load block, one physical task in the physical load block, and 10 questions to answer or ask in the communicative load block. The procedure aimed to have participants spend a similar amount of time completing each block. The order of the 8 blocks (4 task types × 2 levels) was generated randomly beforehand and was the same for every participant. After the 8 blocks, participants completed another set of 8 blocks in a different order, with different addends, target names, and questions. At the end of each block, participants gave a subjective rating (on a 9-point scale) of their effort on the completed task, followed by a pause option that allowed them to take a break if needed. The subjective rating consisted of nine choices representing degrees of effort, from 'Nearly max, Very much, Much' through 'Medium to much, Medium, Medium to little' to 'A little, Little, Very little', with the intervals intended to be as subtle and equal as possible. The session lasted around 13 to 20 min in total.
Detailed experimental design can be found in Figure 2 in [1].
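As a rough illustration of the block structure described above, the following sketch enumerates the 4 task types × 2 load levels and a fixed pseudo-random order repeated as a second set; the orders shown are illustrative only, not the orders actually used in the experiment.

# Illustrative sketch of the block schedule: 8 blocks (4 task types x 2 load
# levels) in one pre-generated order shared by all participants, then a second
# set of 8 blocks in a different order. Orders here are examples only.
import random

TASK_TYPES = ["cognitive", "perceptual", "physical", "communicative"]
LOAD_LEVELS = ["low", "high"]

blocks = [(task, level) for task in TASK_TYPES for level in LOAD_LEVELS]  # 8 blocks

rng = random.Random(0)                        # fixed seed: same order for every participant
first_set = rng.sample(blocks, len(blocks))
second_set = rng.sample(blocks, len(blocks))  # repeated set, different (illustrative) order

for i, (task, level) in enumerate(first_set + second_set, start=1):
    print(f"Block {i:2d}: {task} load, {level} level")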
Ground truth:
The task labels were obtained automatically from the task stimuli presentation interface, which recorded when task stimuli were presented and when participants clicked a button to begin and end each task. Synchronization between the eye videos and the presentation interface was achieved through a long blink at the beginning of the experiment.
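The sketch below shows one possible way to locate such a long blink from per-frame eye-closure labels and use its onset as an alignment anchor; it is an assumption for illustration, not the authors' synchronization code.

# Illustrative sketch: find the first sufficiently long eye closure in a
# sequence of per-frame closure labels and return its onset frame, which can
# then be matched to the corresponding event in the presentation-interface log.
from itertools import groupby

FPS = 60  # eye video frame rate

def long_blink_onset(closed, min_duration_s=0.5):
    """closed: per-frame booleans (True = eye closed). Returns the frame index
    where the first closure lasting at least min_duration_s begins, or None."""
    frame = 0
    for is_closed, run in groupby(closed):
        run_len = len(list(run))
        if is_closed and run_len >= min_duration_s * FPS:
            return frame
        frame += run_len
    return None

# Hypothetical example: 30 open frames, a 40-frame (~0.67 s) blink, then open.
closed = [False] * 30 + [True] * 40 + [False] * 100
print(long_blink_onset(closed))  # -> 30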
Eye landmarks and eye states were obtained using machine learning first and then manually checked and corrected frame by frame to ensure high quality. Details can be found in Section 3.2 in [3].
For the ground truth of mental state, the four load types and the associated load levels were labelled based on the task design and verified by subjective ratings, available performance scores and task durations. Details can be found in Figure 3 in [1].
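For orientation only, the following sketch shows one way the per-frame ground truth described above could be represented; the field names and types are assumptions for illustration, not the dataset's actual file layout (see the download site and [3] for that).

# Illustrative per-frame annotation record combining the kinds of ground truth
# described above. All field names are assumed for illustration only.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class FrameAnnotation:
    frame_index: int
    eyelid_landmarks: List[Tuple[float, float]]  # (x, y) points on the eyelid boundary
    pupil_landmarks: List[Tuple[float, float]]   # (x, y) points on the pupil boundary
    iris_landmarks: List[Tuple[float, float]]    # (x, y) points on the iris boundary
    eye_state: str                               # one of the six annotated eye states
    load_type: str                               # cognitive / perceptual / physical / communicative
    load_level: str                              # low / high (or before/after experiment)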
Data Records, Usage Notes and download site:
Go to https://ieee-dataport.org/open-access/ireye4task
Acknowledgements:
This work was supported in part by US Army Cooperative Agreement W911NF1920330. Opinions expressed are the authors’, and may not reflect those of the US Army.
References
[1] S. Chen, J. Epps, F. Paas, “Pupillometric and blink measures of diverse task loads: Implications for working memory models”, British Journal of Educational Psychology, 2022.
[2] https://pupil-labs.com/
[3] S. Chen, J. Epps, "A High-Quality Landmarked Infrared Eye Video Dataset (IREye4Task): Eye Behaviors, Insights and Benchmarks for Wearable Mental State Analysis", IEEE Transactions on Affective Computing, 2023.
[4] S. Chen and J. Epps, "Task Load Estimation from Multimodal Head-Worn Sensors Using Event Sequence Features," in IEEE Transactions on Affective Computing, vol. 12, no. 3, pp. 622-635, 1 July-Sept. 2021.
If you use this dataset, please cite the following paper:
S. Chen, J. Epps, "A High-Quality Landmarked Infrared Eye Video Dataset (IREye4Task): Eye Behaviors, Insights and Benchmarks for Wearable Mental State Analysis", IEEE Transactions on Affective Computing, 2023.