In the field of Systems and Control, solving optimal control problems for complex systems is a core task. Developing accurate mathematical models of these systems' dynamics is often difficult, owing to uncertainties, complex non-linearities, or unknown factors affecting the system. These challenges motivate methods that can learn the dynamics from available data, together with control strategies that can work with such learned models without relying heavily on expert knowledge or task-specific insight. Such methods are essential for creating efficient and reliable solutions in a wide variety of applications within the discipline.

The need for models that do not require expert knowledge has spurred interest in applying machine learning methods to control problems. Probabilistic Inference for Learning COntrol (PILCO) is a model-based Reinforcement Learning (RL) algorithm known for its probabilistic treatment of the dynamics model. By employing Gaussian Process (GP) dynamics models, PILCO incorporates model uncertainty into its learning process, allowing it to derive control policies from limited data. However, PILCO's use of the Squared Exponential (SE) kernel in its GP can restrict learning capacity, especially in higher-dimensional spaces, because the SE kernel's inherent smoothness assumption may fail to capture complex or non-smooth dynamics. The algorithm's reliance on moment matching to approximate predictive distributions introduces a further weakness, as it can be inaccurate when the true distributions are non-Gaussian or multi-modal. These shortcomings may limit PILCO's efficiency and scalability in more complex, higher-dimensional tasks, or in settings where the underlying dynamics are not well captured by the chosen kernel and approximation method.
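To make the smoothness assumption concrete, the following is a minimal sketch of the SE kernel in NumPy. The function name and hyperparameter defaults are illustrative, not taken from PILCO's implementation; the point is that the kernel is an infinitely differentiable function of the squared distance, so GP samples under it are correspondingly smooth.

```python
import numpy as np

def se_kernel(x1, x2, lengthscale=1.0, variance=1.0):
    """Squared Exponential kernel: k(x, x') = s^2 exp(-||x - x'||^2 / (2 l^2)).

    Because this is infinitely differentiable in x - x', samples from a GP
    with this covariance are very smooth, which is the restrictive
    assumption discussed above. Hyperparameter defaults are illustrative.
    """
    sq_dist = np.sum((np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)) ** 2)
    return variance * np.exp(-0.5 * sq_dist / lengthscale ** 2)
```

The kernel value decays with distance, so observations far apart in input space are treated as nearly uncorrelated; a single global lengthscale is one reason the SE kernel struggles with dynamics whose smoothness varies across the state space.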
This thesis introduces Deep Kernel PILCO (DKL PILCO), a novel framework that uses Deep Kernel Learning (DKL) to learn the dynamics and the Unscented Transform (UT) to propagate uncertainty. The effectiveness of this approach is demonstrated across various tasks, highlighting the potential of DKL and UT to enhance the scalability and efficiency of model-based RL methods such as PILCO and making DKL PILCO a promising candidate for real-world control applications.
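To illustrate the uncertainty-propagation step, the following is a minimal sketch of the Unscented Transform, not the thesis's implementation: deterministic sigma points of a Gaussian are pushed through a nonlinear function, and the weighted mean and covariance of the outputs approximate the transformed distribution. The scaling parameters (alpha, beta, kappa) and their defaults are illustrative.

```python
import numpy as np

def unscented_transform(mean, cov, f, alpha=1.0, beta=2.0, kappa=0.0):
    """Approximate the mean and covariance of f(x) for x ~ N(mean, cov).

    Uses 2n + 1 sigma points with standard scaled-UT weights; the
    parameter defaults here are illustrative choices, not PILCO's.
    """
    n = len(mean)
    lam = alpha ** 2 * (n + kappa) - n
    # Matrix square root of (n + lam) * cov; columns give sigma-point offsets.
    L = np.linalg.cholesky((n + lam) * cov)
    sigma = np.vstack([mean, mean + L.T, mean - L.T])  # shape (2n + 1, n)
    # Weights for the mean (wm) and covariance (wc) estimates.
    wm = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))
    wc = wm.copy()
    wm[0] = lam / (n + lam)
    wc[0] = lam / (n + lam) + (1.0 - alpha ** 2 + beta)
    # Propagate each sigma point through the nonlinearity.
    Y = np.array([f(s) for s in sigma])
    mean_y = wm @ Y
    diff = Y - mean_y
    cov_y = (wc[:, None] * diff).T @ diff
    return mean_y, cov_y
```

For a linear map the transform is exact: with f(x) = 2x + 1 it recovers mean 2m + 1 and covariance 4C. Unlike moment matching, this sigma-point scheme requires only function evaluations, not closed-form kernel expectations, which is what lets it pair with a deep kernel.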