Andy W.H. Khong
5 records found
LGM3A 2024
The 2nd Workshop on Large Generative Models Meet Multimodal Applications
This workshop aims to explore the potential of large generative models to revolutionize how we interact with multimodal information. A Large Language Model (LLM) represents a sophisticated form of artificial intelligence engineered to comprehend and produce natural language text,
...
MERLIon CCS Challenge
A English-Mandarin code-switching child-directed speech corpus for language identification and diarization
To enhance the reliability and robustness of language identification (LID) and language diarization (LD) systems for heterogeneous populations and scenarios, there is a need for speech processing models to be trained on datasets that feature diverse language registers and speech
...
Investigating model performance in language identification
Beyond simple error statistics
Language development experts need tools that can automatically identify languages from fluent, conversational speech and provide reliable estimates of usage rates at the level of an individual recording. However, LID systems are typically evaluated on metrics such as equal error
...
We propose two end-to-end neural configurations for language diarization on bilingual code-switching speech. The first, a BLSTM-E2E architecture, includes a set of stacked bidirectional LSTMs to compute embeddings and incorporates the deep clustering loss to enforce grouping of l
...