J. He | TU Delft Repository

What model does MuZero learn?

Conference paper (2024) - Jinke He (author), Thomas M. Moerland (author), J.A. de Vries (author), F.A. Oliehoek (author)

Model-based reinforcement learning (MBRL) has drawn considerable interest in recent years, given its promise to improve sample efficiency. Moreover, when using deep-learned models, it is possible to learn compact and generalizable models from data. In this work, we study MuZero, ...

Benchmarking Robustness and Generalization in Multi-Agent Systems: A Case Study on Neural MMO

Journal article (2023) - Yangkun Chen (author), Chenghui Yu (author), Hengman Zhu (author), Shuai Liu (author), Yibing Zhang (author), Joseph Suarez (author), Liang Zhao (author), J. He (author), Jiaxin Chen (author), et al.

We present the results of the second Neural MMO challenge, hosted at IJCAI 2022, which received 1600+ submissions. This competition targets robustness and generalization in multi-agent systems: participants train teams of agents to complete a multi-task objective against opponent ...

Speeding up Deep Reinforcement Learning through Influence-Augmented Local Simulators

Conference paper (2022) - Miguel Suau (author), Jinke He (author), Matthijs T. J. Spaan (author), Frans Oliehoek (author)

Learning effective policies for real-world problems is still an open challenge for the field of reinforcement learning (RL). The main limitations are the amount of data needed and the pace at which that data can be obtained. In this paper, we study how to build lightweight simul ...

Online Planning in POMDPs with Self-Improving Simulators

Conference paper (2022) - J. He (author), M. Suau (author), Hendrik Baier (author), Michael Kaisers (author), Frans A. Oliehoek (author)

How can we plan efficiently in a large and complex environment when the time budget is limited? Given the original simulator of the environment, which may be computationally very demanding, we propose to learn online an approximate but much faster simulator that improves over tim ...

Influence-Augmented Local Simulators: A Scalable Solution for Fast Deep RL in Large Networked Systems

Conference paper (2022) - Miguel Suau (author), Jinke He (author), Matthijs T. J. Spaan (author), Frans Oliehoek (author)

Learning effective policies for real-world problems is still an open challenge for the field of reinforcement learning (RL). The main limitations are the amount of data needed and the pace at which that data can be obtained. In this paper, we study how to build lightweight simul ...

Distributed Influence-Augmented Local Simulators for Parallel MARL in Large Networked Systems

Conference paper (2022) - Miguel Suau (author), Jinke He (author), Mustafa Mert Çelikok (author), Matthijs T.J. Spaan (author), Frans A Oliehoek (author)

Due to its high sample complexity, simulation is, as of today, critical for the successful application of reinforcement learning. Many real-world problems, however, exhibit overly complex dynamics, which makes their full-scale simulation computationally slow. In this paper, we sh ...

Influence-aware memory architectures for deep reinforcement learning in POMDPs

Journal article (2022) - Miguel Suau (author), Jinke He (author), E. Congeduti (author), Rolf Starre (author), A.T. Czechowski (author), Frans Oliehoek (author)

Due to its perceptual limitations, an agent may have too little information about the environment to act optimally. In such cases, it is important to keep track of the action-observation history to uncover hidden state information. Recent deep reinforcement learning methods use r ...

Multitask Soft Option Learning

Conference paper (2020) - Maximilian Igl (author), Andrew Gambardella (author), Jinke He (author), Nantas Nardelli (author), N Siddharth (author), Wendelin Böhmer (author), Shimon Whiteson (author)

We present Multitask Soft Option Learning (MSOL), a hierarchical multitask framework based on Planning as Inference. MSOL extends the concept of options, using separate variational posteriors for each task, regularized by a shared prior. This “soft” version of options avoids seve ...

Influence-Augmented Online Planning for Complex Environments

Journal article (2020) - Jinke He (author), M. Suau (author), F.A. Oliehoek (author)

How can we plan efficiently in real time to control an agent in a complex environment that may involve many other agents? While existing sample-based planners have enjoyed empirical success in large POMDPs, their performance heavily relies on a fast simulator. However, real-world ...