Facing the critical challenge of reducing greenhouse gas (GHG) emissions in the maritime industry, this thesis explores the potential of smart control systems using Reinforcement Learning (RL) for autonomous sailing. Traditional sailing controls fall short in navigating the complex, dynamic conditions of maritime environments. RL has been shown to be effective for continuous control in such conditions, although primarily in simulated environments. This study therefore aims to demonstrate the potential of RL for autonomous sailing control (ASC) by means of a small-scale project. A fast-time simulation of an Optimist dinghy is used to train the sailing controls required to reach an upwind target. The controls are then transferred to a robotized Optimist in a real-world environment to test the transferability of the simulation-trained controls. First, the reality gap, or modelling error, between the simulation and the real-world environment is quantified so that the performance of the techniques used to bridge this gap can be assessed. The sim-to-real techniques of Domain Randomization (DR) and the addition of observation noise (ON) are applied during training. To test the effectiveness of the trained RL controls, the best-performing ones in simulation are selected and tested in the real-world environment. The performance of the RL-controlled Optimist is compared to state-of-the-art controls in robotic sailing. Their performances are measured and compared by means of success rate and a physics-based metric, called the energy ratio, that quantifies how efficiently the sailboat uses the power of the wind to propel itself. The results show that the RL controls are highly successful in the sailing simulation; however, the transfer to the real world remains a major challenge. DR does improve the sim-to-real transfer, yielding an agent that reaches a 100% success rate across 12 runs in the real-world environment.
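For readers unfamiliar with the sim-to-real techniques named above, the following is a minimal sketch of how Domain Randomization and observation noise are typically injected into training. The environment wrapper, parameter names, and value ranges below are illustrative assumptions, not the thesis's actual simulator interface or configuration.

    import numpy as np

    class RandomizedSailingEnv:
        """Hypothetical wrapper around the fast-time Optimist simulation.
        Each episode, DR resamples the simulator's physics parameters;
        ON perturbs every observation with additive Gaussian noise."""

        def __init__(self, base_env, noise_std=0.05, rng=None):
            self.env = base_env          # assumed to expose set_params/reset/step
            self.noise_std = noise_std   # observation-noise scale (assumption)
            self.rng = rng or np.random.default_rng()

        def reset(self):
            # Domain Randomization: draw a new physics configuration per episode
            # (parameter names and ranges are examples, not the thesis's values)
            self.env.set_params(
                wind_speed=self.rng.uniform(2.0, 6.0),        # m/s
                wind_direction=self.rng.uniform(-15.0, 15.0), # deg offset from nominal
                hull_drag=self.rng.uniform(0.8, 1.2),         # drag multiplier
            )
            return self._noisy(self.env.reset())

        def step(self, action):
            obs, reward, done, info = self.env.step(action)
            return self._noisy(obs), reward, done, info

        def _noisy(self, obs):
            # Observation noise: corrupt sensor readings so the policy cannot
            # overfit to the simulator's noise-free state estimates
            return obs + self.rng.normal(0.0, self.noise_std, size=np.shape(obs))

Likewise, the energy ratio is defined precisely in the thesis body; one plausible formulation consistent with the description above (not necessarily the thesis's exact definition) compares the propulsive work done by the boat to the wind energy available to the sail over a run:

\[
\text{energy ratio} = \frac{\int F_{\text{prop}}\, v_{\text{boat}}\, \mathrm{d}t}{\int \tfrac{1}{2}\, \rho\, A\, v_{\text{wind}}^{3}\, \mathrm{d}t},
\]

where \(F_{\text{prop}}\) is the propulsive force, \(v_{\text{boat}}\) the boat speed, \(\rho\) the air density, \(A\) a reference sail area, and \(v_{\text{wind}}\) the wind speed.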