Thiago D. Simão

2 records found

Scalable Safe Policy Improvement for Factored Multi-Agent MDPs

Conference paper (2024) - Federico Bianchi (author), Edoardo Zorzi (author), Alberto Castellini (author), Thiago D. Simão (author), M.T.J. Spaan (author), Alessandro Farinelli (author)

In this work, we focus on safe policy improvement in multi-agent domains where current state-of-the-art methods cannot be effectively applied because of large state and action spaces. We consider recent results using Monte Carlo Tree Search for Safe Policy Improvement with Baseli ...

Scalable Safe Policy Improvement via Monte Carlo Tree Search

Journal article (2023) - Alberto Castellini (author), Federico Bianchi (author), Edoardo Zorzi (author), Thiago D. Simão (author), Alessandro Farinelli (author), M.T.J. Spaan (author)

Algorithms for safely improving policies are important to deploy reinforcement learning approaches in real-world scenarios. In this work, we propose an algorithm, called MCTS-SPIBB, that computes safe policy improvement online using a Monte Carlo Tree Search based strategy. We th ...