SH
Simon Holk
2 records found
1
SEQUEL
Semi-Supervised Preference-based RL with Query Synthesis via Latent Interpolation
Preference-based reinforcement learning (RL) poses as a recent research direction in robot learning, by allowing humans to teach robots through preferences on pairs of desired behaviours. Nonetheless, to obtain realistic robot policies, an arbitrarily large number of queries is r
...
Practical implementations of deep reinforcement learning (deep RL) have been challenging due to an amplitude of factors, such as designing reward functions that cover every possible interaction. To address the heavy burden of robot reward engineering, we aim to leverage subjectiv
...