Comparing performance of ASR systems on native Dutch children and teenagers: Google vs. Microsoft

van Dijk, G.

Evaluating Speech Recognition Accuracy of state-of-the-art ASR models on Dutch child and teenager speech

Bachelor thesis (2024)

Authors

G. van Dijk (Electrical Engineering, Mathematics and Computer Science)

Contributors

O.E. Scharenborg (mentor)

Y. Zhang (mentor)

Catharine Oertel, École Polytechnique Fédérale de Lausanne (graduation committee member)

Faculty

Electrical Engineering, Mathematics and Computer Science

Google, Dutch, Automatic speech recognition, Microsoft, Child speech, Teenager speech

To reference this document use:

http://resolver.tudelft.nl/uuid:0f2f8e50-7901-49c2-9122-e6d77ced9653

Published Date

25-06-2024

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Automatic Speech Recognition (ASR) technology is becoming increasingly useful in everyday life and therefore needs to be accurate across all user demographics. ASR for children presents unique challenges because of their shorter vocal tracts and irregular speech patterns. This study compares the performance of Google's and Microsoft's ASR systems on native Dutch child and teenager speech using the JASMIN-CGN dataset. Each system is evaluated on Word Error Rate (WER) and Character Error Rate (CER), with differences broken down by gender, age, and dialect region. The results indicate that Microsoft's ASR consistently outperforms Google's in terms of WER, while Google shows slightly higher precision in terms of CER. Microsoft is therefore considered the better-performing system overall, although Google may be the more favorable choice when precision is the priority.
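For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how WER and CER are commonly defined: the Levenshtein (edit) distance between the reference transcript and the ASR output, divided by the reference length, counted in words for WER and in characters for CER. It is not taken from the thesis. The function names and the Dutch example sentences are invented for illustration, and any text normalization the thesis may apply (casing, punctuation, number formatting) is not reproduced here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings)."""
    # prev[j] holds the distance between the first i-1 items of ref
    # and the first j items of hyp; start with the empty-ref row.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from i reference items to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference item missing in hyp)
                curr[j - 1] + 1,     # insertion (extra item in hyp)
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate: character-level edit distance over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical reference transcript and ASR output (not from JASMIN-CGN).
reference = "de kat zit op de mat"
hypothesis = "de kat zat op mat"

print(f"WER = {wer(reference, hypothesis):.3f}")
print(f"CER = {cer(reference, hypothesis):.3f}")
```

In this invented example one word is substituted and one is deleted out of six reference words, so the WER is 2/6, while only 4 of the 20 reference characters are affected, giving a CER of 0.2. The two metrics can therefore rank systems differently, which is consistent with the abstract's observation that one system leads on WER and the other on CER.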

Files

Research_project_paper.pdf

(pdf | 0.297 Mb)

Unknown license