SS

Sebastiaan Scholten

3 records found

Many computational models of speech recognition assume that the set of target words is already given. This implies that these models learn to recognise speech in a biologically unrealistic manner, i.e. with prior lexical knowledge and explicit supervision. In contrast, visually g ...
We investigated word recognition in a Visually Grounded Speech model. The model has been trained on pairs of images and spoken captions to create visually grounded embeddings which can be used for speech to image retrieval and vice versa. We investigate whether such a model can b ...
Recommender systems can be useful in group settings, e.g. when choosing a movie to watch with a group. However, while considerable research in group recommendation has been performed, we still lack truly ecological datasets on group recommendations in real life consumption scenar ...