Performance of Decision Transformer in multi-task offline reinforcement learning

How does the introduction of sub-optimal data affect the performance of the model?

Abstract

In the field of Artificial Intelligence (AI), techniques like Reinforcement Learning (RL) and the Decision Transformer (DT) allow machines to learn from experience and solve problems. The distinction between online and offline learning determines whether the machine learns by interacting with a live environment or only by observing pre-recorded actions. The distinction between single-task and multi-task settings indicates whether the machine must handle several similar but not identical tasks. Multi-task offline learning, the focus of this paper, allows machines to address a variety of related tasks based on a pre-recorded set of experiences. This approach is particularly valuable in situations where traditional training methods are costly or challenging. For instance, in robotics, multi-task offline learning enables robots to use experience from various tasks, such as picking up objects, to solve new problems like placing them down. This paper explores the effectiveness of Decision Transformers in multi-task environments through theoretical discussion and practical experiments. It also investigates how introducing sub-optimal training data affects the performance and generalisation ability of the model.
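To make the offline setup concrete, the Decision Transformer frames RL as sequence modelling: each pre-recorded trajectory is tokenized into (return-to-go, state, action) triples, where the return-to-go at a step is the sum of all remaining rewards. The sketch below (illustrative only; the field names and toy data are assumptions, not the paper's code) shows that preprocessing step:

```python
# Minimal sketch of Decision Transformer data preprocessing
# (illustrative; field names and toy trajectory are assumptions).

def returns_to_go(rewards):
    """Return-to-go at step t is the sum of rewards from t to the end."""
    rtg, total = [], 0.0
    for r in reversed(rewards):
        total += r
        rtg.append(total)
    return list(reversed(rtg))

def to_dt_tokens(trajectory):
    """Convert one offline trajectory into the (return-to-go, state, action)
    sequence a Decision Transformer is trained on."""
    rewards = [step["reward"] for step in trajectory]
    rtg = returns_to_go(rewards)
    return [(g, step["state"], step["action"])
            for g, step in zip(rtg, trajectory)]

# A toy pre-recorded trajectory (hypothetical data for illustration):
traj = [
    {"state": 0, "action": 1, "reward": 1.0},
    {"state": 1, "action": 0, "reward": 0.0},
    {"state": 2, "action": 1, "reward": 2.0},
]
print(to_dt_tokens(traj))
# → [(3.0, 0, 1), (2.0, 1, 0), (2.0, 2, 1)]
```

Because the model conditions on return-to-go, sub-optimal trajectories simply appear as sequences with lower target returns, which is why their effect on generalisation is a natural question to study.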
