IREye4Task


Mental states, such as cognition, emotion and action (contexts), can be analyzed and predicted from images of the eye acquired by a close-up infrared-sensitive camera.



Tutorial: Introduction to eye and speech behaviour computing for affect analysis in wearable contexts

Duration: Half-day (3h)
Venue: Auditorium (main conference), Sorbonne University, Campus Pierre & Marie Curie
Date: Morning, 13 Oct, 2023

Demo_eye downloads (parts 1–10) and Demo_audio download

Eye and speech are two of the most ubiquitous sensing modalities in daily human-human and human-computer interaction. As wearables become lightweight, comfortable to wear and computationally powerful, they are likely to become the next generation of computing devices (e.g., Apple Vision Pro). This creates novel opportunities to explore new types of eye behaviour and new methods of body sound sensing for affect analysis and modelling. Multimodal affective computing systems built on machine learning have seen success in certain scenarios, but their application to wearable contexts remains less explored. Understanding the theoretical (e.g., psychophysiological) basis of, and approaches to, eye and speech/audio behaviour computing and multimodal computing is therefore essential for affect analysis and for building innovative wearable affect systems across a range of human-human and human-computer interaction contexts.
This tutorial focuses on wearable sensing for affective computing systems and specifically targets both fundamental and state-of-the-art eye and speech/audio behaviour processing, as well as multimodal computing. It is the first tutorial to discuss multimodal perspectives on wearable sensing and computing for affect analysis, and it is particularly suited to the ICMI 2023 theme of the science of multimodal interactions. The tutorial consists of four parts: (i) eye behaviour computing, which introduces wearable devices for acquiring eye images, novel eye behaviour types and their correlation with affect, and computing methods for extracting eye behaviours; (ii) speech and audio analysis, which covers wearable devices for audio collection, different forms of audio and their relevance to affect, as well as the processing pipeline and potential innovative applications; (iii) multimodal computing, which focuses on motivation, approaches and applications; and (iv) a practical session, which includes demonstrations of eye behaviour computing and audio sensing and processing in practice, together with hands-on exercises using shared code.
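
To give a flavour of part (i), the sketch below shows one common way to estimate pupil size from a single close-up IR eye frame: threshold the dark pupil region and fit an ellipse to the largest dark blob. It assumes OpenCV (cv2) and NumPy are available; it is illustrative only and is not the shared demo code used in the practical session.

# Minimal pupil-size sketch for a close-up, dark-pupil IR eye frame (illustrative only).
import cv2
import numpy as np

def estimate_pupil(gray_eye_frame: np.ndarray):
    """Return ((cx, cy), diameter_px) for the largest dark blob, or None."""
    # Smooth to suppress eyelash and glint noise, then threshold the dark pupil.
    blurred = cv2.GaussianBlur(gray_eye_frame, (7, 7), 0)
    _, mask = cv2.threshold(blurred, 50, 255, cv2.THRESH_BINARY_INV)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    pupil = max(contours, key=cv2.contourArea)   # the pupil is usually the largest dark region
    if len(pupil) < 5:                           # fitEllipse needs at least 5 contour points
        return None
    (cx, cy), (minor_axis, major_axis), _ = cv2.fitEllipse(pupil)
    return (cx, cy), (minor_axis + major_axis) / 2.0

# Example: one grayscale frame from an IR eye video (hypothetical file name).
# frame = cv2.imread("eye_frame.png", cv2.IMREAD_GRAYSCALE)
# pupil = estimate_pupil(frame)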

Learning Objectives:
• Participants will understand the benefits and limits of wearable sensing using eye, speech/audio and multimodal signals, as well as the relationship between each modality and affect.
• Participants will be able to explain the computing methods for eye image, speech/audio and multimodal signal processing, and the statistical modelling and machine learning pipelines for affect analysis (a minimal sketch of such a pipeline follows this list).
• Participants will know where to find and how to use datasets and tools to begin investigating eye and speech/audio behaviour in their own projects, and will be aware of some open challenges.
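
As a minimal sketch of such a pipeline (assuming librosa and scikit-learn are available; the file names and affect labels below are placeholders rather than tutorial materials), each utterance is summarised by MFCC statistics and fed to a standard classifier:

# Minimal speech-affect pipeline sketch: per-utterance MFCC statistics + SVM (illustrative only).
import librosa
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def utterance_features(wav_path: str) -> np.ndarray:
    """Summarise one utterance as the mean and std of 13 MFCCs (26-dim vector)."""
    y, sr = librosa.load(wav_path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# Placeholder file lists and binary affect labels (e.g., low vs. high arousal).
train_files, train_labels = ["utt_01.wav", "utt_02.wav"], [0, 1]
X_train = np.stack([utterance_features(f) for f in train_files])

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X_train, train_labels)
# clf.predict(np.stack([utterance_features("utt_test.wav")]))

In practice one would use validated affect labels, richer feature sets and cross-validation, but the structure (feature extraction, normalisation, classification) carries over.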

References:
D. W. Hansen and Q. Ji. “In the Eye of the Beholder: A Survey of Models for Eyes and Gaze”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, pp. 478-500, 2010.
K. Holmqvist, et al., “Eye tracking: empirical foundations for a minimal reporting guideline”, Behavior Research Methods, vol. 55, no. 1, pp. 364-416, 2023.
R. A. Khalil, et al., “Speech emotion recognition using deep learning techniques: A review”, IEEE Access, vol. 7, pp. 117327-117345, 2019.
Y. Wang, et al., “A systematic review on affective computing: Emotion models, databases, and recent advances”, Information Fusion, vol. 83, pp.19-52, 2022.
M. Kassner et al., "Pupil: an open source platform for pervasive eye tracking and mobile gaze-based interaction" In Proceedings of the 2014 ACM international joint conference on pervasive and ubiquitous computing: Adjunct publication, pp. 1151-1160. 2014.
V. Skaramagkas et al., “Review of eye tracking metrics involved in emotional and cognitive processes”, IEEE Reviews in Biomedical Engineering, 2021.
L. Itti, “New Eye Tracking Techniques May Revolutionize Mental Health Screening”, Neuron, 88(3), pp. 442-43, 2015.
B. Laeng et al., “Pupillometry: A window to the preconscious?”, Perspectives on Psychological Science, vol. 7, no. 1, pp. 18-27, 2012.

Tutorial presenters:
Dr. Siyuan Chen, University of New South Wales (Siyuan.chen(at)unsw.edu.au)
Siyuan Chen is a Lecturer at the University of New South Wales (UNSW). Her work focuses on using “big data” from close-up eye videos, speech and head movement to understand human internal states such as emotion, cognition and action. She received her PhD in Electrical Engineering from UNSW. Before joining UNSW, she worked as a Research Intern at NII, Tokyo, Japan, as a Research Fellow in the Department of Computer Science and Information Systems at the University of Melbourne, and as a visiting researcher with the STARS team, INRIA, Sophia Antipolis, France. Dr. Chen is a recipient of the NICTA Postgraduate Scholarship and top-up Project Scholarship, the Commercialization Training Scheme Scholarship, and the Australian Endeavour Fellowship 2015. She has published over 30 papers in high-quality peer-reviewed venues and filed two patents. She led a special session at SMC 2021 and a special issue in Frontiers in Computer Science in 2021. She has also served as a session chair at WCCI 2020 and SMC 2021, and as a Programme Committee member for several conferences and workshops, such as ACII, IEEE CBMS and the Social AI for Healthcare 2021 workshop. She is a member of the Women in Signal Processing Committee. Her work has been supported by US-based funding sources multiple times, and she received UNSW Faculty of Engineering Early Career Academics funding in 2021.

Dr. Ting Dang, Nokia Bell Labs/ University of Cambridge (ting.dang(at)nokia-bell-labs.com)
Ting Dang is currently a Senior Research Scientist at Nokia Bell Labs and a visiting researcher in the Department of Computer Science and Technology, University of Cambridge. Prior to this, she worked as a Senior Research Associate at the University of Cambridge. She received her Ph.D. from the University of New South Wales, Australia. Her primary research interests are in human-centric sensing and machine learning for mobile health monitoring and delivery, specifically exploring the potential of audio signals (e.g., speech, cough) captured via mobile and wearable sensing for automatic mental state (e.g., emotion, depression) prediction and disease (e.g., COVID-19) detection and monitoring. Her work further aims to develop generalized, interpretable, and robust machine learning models to improve healthcare delivery. She has served as a (senior) program committee member and reviewer for more than 30 conferences and top-tier journals, such as NeurIPS, AAAI, IJCAI, IEEE TAC, IEEE TASLP, JMIR, ICASSP and INTERSPEECH. She was shortlisted and invited to attend the Asian Dean’s Forum Rising Star 2022, and won the IEEE Early Career Writing Retreat Grant 2019 and the ISCA Travel Grant 2017. She contributed to the successful bid for INTERSPEECH 2026 (social media co-chair) and is organizing scientific meetings such as UbiComp WellComp 2023 (co-organizer).

Prof. Julien Epps, University of New South Wales (j.epps(at)unsw.edu.au)
Julien Epps received the BE and PhD degrees from the University of New South Wales, Sydney, Australia, in 1997 and 2001, respectively. From 2002 to 2004, he was a Senior Research Engineer with Motorola Labs, where he was engaged in speech recognition. From 2004 to 2006, he was a Senior Researcher and Project Leader with National ICT Australia, Sydney, where he worked on multimodal interface design. He then joined the UNSW School of Electrical Engineering and Telecommunications, Australia, in 2007 as a Senior Lecturer, and is currently a Professor and Head of School. He is also a Co-Director of the NSW Smart Sensing Network, a Contributed Researcher with Data61, CSIRO, and a Scientific Advisor for Sonde Health (Boston, MA). He has authored or co-authored more than 270 publications and serves as an Associate Editor for the IEEE Transactions on Affective Computing. His current research interests include characterisation, modelling, and classification of mental states from behavioural signals, such as speech, eye activity, and head movement.