
46 records found

This study explores the application of risk-sensitive Reinforcement Learning (RL) in portfolio optimization, aiming to integrate asset pricing and portfolio construction into a unified, end-to-end RL framework. While RL has shown promise in various domains, its traditional risk-n ...

Influence Based Multi Agent Reinforcement Learning for Active Wake Control

Using influence to increase energy production using multi agent reinforcement learning


The increasing demand for electricity has led to demand for more efficient energy production. One promising option is wind power, which currently provides an estimated 7.8% of the world’s energy production. One of the problems with wind energy is that a small percentage of ...
This paper addresses the issue of double-dipping in off-policy evaluation (OPE) in behaviour-agnostic reinforcement learning, where the same dataset is used for both training and estimation, leading to overfitting and inflated performance metrics, especially for variance. We intro ...
In offline reinforcement learning, deriving a policy from a pre-collected set of experiences is challenging due to the limited sample size and the mismatched state-action distribution between the target policy and the behavioral policy that generated the data. Learning a dynamic ...
Off-policy evaluation has several key problems, one of which is the “curse of horizon”. With recent breakthroughs [1] [2], new estimators have emerged that utilise importance sampling of the individual state-action pairs and reward rather than over the whole trajectory. With t ...
Behavior-agnostic reinforcement learning is a rapidly expanding research area focusing on developing algorithms capable of learning effective policies without explicit knowledge of the environment's dynamics or specific behavior policies. It proposes robust techniques to perform ...
In the field of reinforcement learning (RL), effectively leveraging behavior-agnostic data to train and evaluate policies without explicit knowledge of the behavior policies that generated the data is a significant challenge. This research investigates the impact of state visitat ...
Traditionally, Recurrent Neural Networks (RNNs) are used to predict the sequential dynamics of the environment. With the advancement and breakthroughs of Transformer models, there has been a demonstrated improvement in the performance and sample efficiency of Transformers as worl ...

Understanding the Effects of Discrete Representations in Model-Based Reinforcement Learning

An analysis on the effects of categorical latent space world models on the MinAtar Environment

While model-free reinforcement learning (MFRL) approaches have been shown effective at solving a diverse range of environments, recent developments in model-based reinforcement learning (MBRL) have shown that it is possible to leverage its increased sample efficiency and generali ...
Real-world environments require robots to continuously acquire new skills while retaining previously learned abilities, all without the need for clearly defined task boundaries. Storing all past data to prevent forgetting is impractical due to storage and privacy concerns. To a ...

Cooperative AI for Overcooked

Multi-Agent RL with Population-Based Training

In ad-hoc cooperative environments, the usage of artificial intelligence to take supportive roles and work in collaboration with humans has proven to be of great benefit. The objective of this research is to evaluate the use of population-based training for reinforcement learning ...
Arguably the main goal of artificial intelligence is to create agents that can collaborate with humans to achieve a shared goal. It has been shown that agents that assume their partner to be optimal can converge to protocols that humans do not understand. Taking human suboptimali ...
Cooperative AI is AI designed to cooperate with humans. One example of such an AI, made using planning algorithms, was studied in a paper from 2019 which used a simplified version of the video game Overcooked for evaluation. However, only limited evaluations were possible due to ...

Scripted AI for Overcooked

Designing and Evaluating a Scripted AI Controller for Simplified Overcooked

Overcooked, an immersive multiplayer video game centered around cooperative cooking challenges, provides the roots for this research project. The study focuses on designing and evaluating a hand-authored controller in comparison to controllers implemented using various machine le ...
The popular video game "Overcooked" is a great example of a task requiring complex planning and cooperation with other players. This game is used as the inspiration for an environment for evaluating AI, called "Overcooked-AI". This paper implements a centralized critic into the O ...
Operation and maintenance of the built environment have a major effect on socioeconomic stability and sustainability. A significant part of our built world approaches or has well exceeded its designated structural life. As engineers, we need to find efficient ways to extend this ...
Agriculture plays a vital role in the global economy, providing the necessary food and resources for human survival. With the world’s population projected to surge, the demand for food is set to escalate in the coming decades. This increasing demand, coupled with the challenges p ...
The ability to model other agents can be of great value in multi-agent sequential decision making problems and has become more accessible due to the introduction of deep learning into reinforcement learning. In this study, the aim is to investigate the usefulness of modelling oth ...
By increasing the step frequency of the runners, it is possible to reduce the risk of injuries due to overload. Techniques like auditory pacing help the athletes to have better control over their step frequency. Nevertheless, synchronizing to a continuous external rhythm costs en ...
Agents trained through single-agent reinforcement learning methods such as self-play can provide a good level of performance in multi-agent settings and even in fully cooperative environments. However, most of the time, training multiple agents together using single-agent self-pl ...