State-of-the-art Automatic Speech Recognition Systems on Dutch Regional Dialects

Exploring Bias in Dutch-trained Automatic Speech Recognition Systems

Abstract

Automatic Speech Recognition (ASR) has developed rapidly in recent years. To keep these systems objective and reliable, it is crucial that they remain unbiased and treat all speakers equally. This paper explores the bias of two state-of-the-art ASR systems towards regional dialects in Dutch and Flemish speech: Microsoft's Azure AI Speech Services and Google Chirp. It analyses the performance of both systems on the JASMIN-CGN language corpus. The results show that speech from West-Dutch regions is recognized correctly significantly more often than speech from other Dutch regions, and that speech from Brabant is recognized correctly significantly more often than speech from other Flemish regions.
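
The abstract does not state which metric was used to compare regions. As a rough illustration only, the sketch below assumes word error rate (WER), a common ASR metric, and shows how it could be aggregated per dialect region. The function names, the region labels `region_a` and `region_b`, and the toy sentences are all hypothetical; none of them come from the paper or from JASMIN-CGN.

```python
from collections import defaultdict


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer_by_region(samples):
    """Aggregate WER per region.

    samples: iterable of (region, reference, hypothesis) strings.
    Returns {region: total word errors / total reference words}.
    """
    errors = defaultdict(int)
    words = defaultdict(int)
    for region, ref, hyp in samples:
        ref_tokens = ref.lower().split()
        hyp_tokens = hyp.lower().split()
        errors[region] += edit_distance(ref_tokens, hyp_tokens)
        words[region] += len(ref_tokens)
    return {r: errors[r] / words[r] for r in words if words[r] > 0}


# Made-up toy data (hypothetical regions and transcripts).
samples = [
    ("region_a", "de kat zit op de mat", "de kat zit op de mat"),
    ("region_b", "de kat zit op de mat", "de kat zit de mat"),
]

for region, wer in sorted(wer_by_region(samples).items()):
    print(f"{region}: WER = {wer:.3f}")
```

Pooling errors and reference words per region before dividing weights long utterances more heavily than averaging per-utterance WER would. Whether a difference between regions is significant would still need a separate statistical test, which this sketch does not cover.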

Files

RP_paper_final.pdf
(pdf | 0.269 Mb)
Unknown license