Hengshun Zhou

2 records found

The First Multimodal Information Based Speech Processing (MISP) Challenge

Data, Tasks, Baselines And Results

Conference paper (2022) - Hang Chen (author), Hengshun Zhou (author), Jun Du (author), Chin-Hui Lee (author), Jingdong Chen (author), Shinji Watanabe (author), Sabato Marco Siniscalchi (author), Odette Scharenborg (author), Di-Yuan Liu (author), More authors...

In this paper we discuss the rationale of the Multimodal Information Based Speech Processing (MISP) Challenge, and provide a detailed description of the data recorded, the two evaluation tasks and the corresponding baselines, followed by a summary of submitted systems and evaluat ...

Audio-Visual Wake Word Spotting in MISP2021 Challenge

Dataset Release and Deep Analysis

Journal article (2022) - Hengshun Zhou (author), Jun Du (author), Gongzhen Zou (author), Zhaoxu Nian (author), Chin-Hui Lee (author), Sabato Marco Siniscalchi (author), Shinji Watanabe (author), Odette Scharenborg (author), Jingdong Chen (author), More authors...

In this paper, we describe and release publicly the audio-visual wake word spotting (WWS) database in the MISP2021 Challenge, which covers a range of scenarios of audio and video data collected by near-, mid-, and far-field microphone arrays, and cameras, to create a shared and p ...