Automatic analysis of human social behavior in-the-wild using multimodal streams


Abstract

The automated analysis of human non-verbal behavior during crowded mingle scenarios is part of the emerging domain of Social Signal Processing (SSP). This line of research aims to develop computational methods that automatically understand social interactions in-the-wild, while facing the many challenges inherent in the noisy nature of mingle scenarios.
While most work on the analysis of social interactions focuses on structured, task-driven setups such as small group meetings, mingle scenarios consist of free-standing conversational groups that dynamically form, merge, and split according to the participants' intentions and desires.
Data collected in structured scenarios is relatively clean, whereas data from mingle scenarios suffers from frequent and heavy subject cross-contamination as well as missing samples, owing to the inherently crowded and dynamic nature of events where people mingle freely. The goal of this thesis is to leverage multiple modalities for the analysis of social interactions during crowded mingle scenarios, and thereby overcome these challenges.
The approach taken in this thesis is to record mingling events with overhead cameras and with wearable sensors that capture body acceleration and proximity; this setup is minimally intrusive and scales easily to larger numbers of people. We focus on several tasks relevant to the understanding of social interactions: automatic association of multiple modalities, detection of social hand gestures, personality estimation, and estimation of group enjoyment.
We show that using multiple modalities improves the performance of our classification tasks and the understanding of social interactions compared to unimodal approaches.
This proves particularly important when the data in one of the modalities is noisy or missing entirely.

Files

Dissertation.pdf
(pdf | 74.8 MB)
Unknown license