A cross-entropy approach to solving Dec-POMDPs
Abstract
In this paper we focus on distributed multiagent planning under uncertainty. For single-agent planning under uncertainty, the partially observable Markov decision process (POMDP) is the dominant model (see [Spaan and Vlassis, 2005] and references therein). Recently, several generalizations of the POMDP to multiagent settings have been proposed. Here we focus on the decentralized POMDP (Dec-POMDP) model for multiagent planning under uncertainty [Bernstein et al., 2002, Goldman and Zilberstein, 2004]. Solving a Dec-POMDP amounts to finding a set of optimal policies for the agents that maximizes the expected shared reward. However, solving a Dec-POMDP optimally has proven to be hard (NEXP-complete): the number of possible deterministic policies for a single agent grows doubly exponentially with the planning horizon, and exponentially with the number of available actions and observations. As a result, the focus has shifted to approximate solution techniques [Nair et al., 2003, Emery-Montemerlo et al., 2005, Oliehoek and Vlassis, 2007].
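To make the growth claim concrete, here is the standard counting argument (an illustration, not part of the abstract itself): a deterministic policy for one agent maps every observation history shorter than the horizon $h$ to an action, so with $|A|$ actions and $|O|$ observations per agent the number of such policies is

$$
|A|^{\frac{|O|^{h}-1}{|O|-1}},
$$

which is doubly exponential in the horizon $h$.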