Simon Holk

SEQUEL: Semi-Supervised Preference-based RL with Query Synthesis via Latent Interpolation

Conference paper (2024) - Daniel Marta (author), Simon Holk (author), Christian Pek (author), Iolanda Leite (author)

Preference-based reinforcement learning (RL) is a recent research direction in robot learning that allows humans to teach robots through preferences on pairs of desired behaviours. Nonetheless, to obtain realistic robot policies, an arbitrarily large number of queries is r ...

Aligning Human Preferences with Baseline Objectives in Reinforcement Learning

Conference paper (2023) - Daniel Marta (author), Simon Holk (author), Christian Pek (author), Jana Tumova (author), Iolanda Leite (author)

Practical implementations of deep reinforcement learning (deep RL) have been challenging due to a multitude of factors, such as designing reward functions that cover every possible interaction. To address the heavy burden of robot reward engineering, we aim to leverage subjectiv ...