J. He | TU Delft Repository

What model does MuZero learn?

Conference paper (2024) - J. He (author), Thomas M. Moerland (author), J.A. de Vries (author), F.A. Oliehoek (author)

Model-based reinforcement learning (MBRL) has drawn considerable interest in recent years, given its promise to improve sample efficiency. Moreover, when using deep-learned models, it is possible to learn compact and generalizable models from data. In this work, we study MuZero, ...

Bayesian Meta-Reinforcement Learning with Laplace Variational Recurrent Networks

Conference paper (2024) - J.A. de Vries (author), J. He (author), M.M. de Weerdt (author), M.T.J. Spaan (author)

Benchmarking Robustness and Generalization in Multi-Agent Systems: A Case Study on Neural MMO

Journal article (2023) - Yangkun Chen (author), Chenghui Yu (author), Hengman Zhu (author), Shuai Liu (author), Yibing Zhang (author), Joseph Suarez (author), Liang Zhao (author), J. He (author), Jiaxin Chen (author), More Authors...

We present the results of the second Neural MMO challenge, hosted at IJCAI 2022, which received 1600+ submissions. This competition targets robustness and generalization in multi-agent systems: participants train teams of agents to complete a multi-task objective against opponent ...

Distributed Influence-Augmented Local Simulators for Parallel MARL in Large Networked Systems

Conference paper (2022) - M. Suau (author), J. He (author), Mustafa Mert Çelikok (author), M.T.J. Spaan (author), F.A. Oliehoek (author)

Due to its high sample complexity, simulation is, as of today, critical for the successful application of reinforcement learning. Many real-world problems, however, exhibit overly complex dynamics, which makes their full-scale simulation computationally slow. In this paper, we sh ...

Influence-Augmented Local Simulators: A Scalable Solution for Fast Deep RL in Large Networked Systems

Conference paper (2022) - M. Suau (author), J. He (author), M.T.J. Spaan (author), F.A. Oliehoek (author)

Learning effective policies for real-world problems is still an open challenge for the field of reinforcement learning (RL). The main limitation is the amount of data needed and the pace at which that data can be obtained. In this paper, we study how to build lightweight simul ...

Influence-aware memory architectures for deep reinforcement learning in POMDPs

Journal article (2022) - M. Suau (author), J. He (author), E. Congeduti (author), R.A.N. Starre (author), A.T. Czechowski (author), F.A. Oliehoek (author)

Due to its perceptual limitations, an agent may have too little information about the environment to act optimally. In such cases, it is important to keep track of the action-observation history to uncover hidden state information. Recent deep reinforcement learning methods use r ...

Online Planning in POMDPs with Self-Improving Simulators

Conference paper (2022) - J. He (author), M. Suau (author), Hendrik Baier (author), Michael Kaisers (author), F.A. Oliehoek (author)

How can we plan efficiently in a large and complex environment when the time budget is limited? Given the original simulator of the environment, which may be computationally very demanding, we propose to learn online an approximate but much faster simulator that improves over tim ...

Speeding up Deep Reinforcement Learning through Influence-Augmented Local Simulators

Conference paper (2022) - M. Suau (author), J. He (author), M.T.J. Spaan (author), F.A. Oliehoek (author)

Learning effective policies for real-world problems is still an open challenge for the field of reinforcement learning (RL). The main limitation is the amount of data needed and the pace at which that data can be obtained. In this paper, we study how to build lightweight simul ...

Influence-Augmented Online Planning for Complex Environments

Journal article (2020) - J. He (author), M. Suau (author), F.A. Oliehoek (author)

How can we plan efficiently in real time to control an agent in a complex environment that may involve many other agents? While existing sample-based planners have enjoyed empirical success in large POMDPs, their performance heavily relies on a fast simulator. However, real-world ...

Multitask Soft Option Learning

Conference paper (2020) - Maximilian Igl (author), Andrew Gambardella (author), J. He (author), Nantas Nardelli (author), N Siddharth (author), J.W. Böhmer (author), Shimon Whiteson (author)

We present Multitask Soft Option Learning (MSOL), a hierarchical multitask framework based on Planning as Inference. MSOL extends the concept of options, using separate variational posteriors for each task, regularized by a shared prior. This “soft” version of options avoids seve ...