Improving DRL Of Vision-Based Navigation By Stereo Image Prediction

den Ridder, L.S.

Master thesis (2023)

Authors

L.S. den Ridder (Aerospace Engineering)

Contributors

G.C.H.E. De Croon (mentor)

Y. Wu (mentor)

Faculty

Aerospace Engineering

Simulation, Deep Reinforcement Learning, UAV, Feature Extraction, Depth Estimation, Autonomous Navigation, Self-supervised learning, Auxiliary tasks, Monocular Vision

To reference this document use:

http://resolver.tudelft.nl/uuid:ef354713-924e-4907-a44f-95b67efa638e

Published Date

20-06-2023

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Although deep reinforcement learning (DRL) is a highly promising approach to learning robotic vision-based control, it is plagued by long training times. This report introduces a DRL setup that relies on self-supervised learning to extract depth information valuable for navigation. Specifically, a literature study investigates the effects of learning to synthesize one view from the other in a stereo-vision setup without any prior knowledge of the camera extrinsics, and how this can be integrated into a downstream obstacle avoidance task. The literature study concludes that competitive geometry-free monocular-to-stereo view synthesis is feasible thanks to recent developments in computer vision. The scientific paper further develops the concepts proposed in the literature study and evaluates the proposed architectures on the KITTI depth estimation benchmarks. Competitive results are achieved for view synthesis. Although depth estimation performance is sub-optimal compared to state-of-the-art monocular depth estimation, the architectures show an ability to encode depth and detect shapes, which is satisfactory for the application to DRL. Additionally, the research examines the benefits of using the latent space of a view synthesis architecture as an input to the PPO agent, compared to other feature extractor methods, with the feature extractors implemented as auxiliary tasks. This method achieves quicker convergence and better performance on an obstacle avoidance task in a simulated indoor environment than the autoencoding feature extractor and end-to-end DRL methods. It is only outperformed by the monocular depth estimation feature extractor method. Overall, this research provides valuable insights for developing more efficient and effective DRL methods for monocular camera-based drones. Finally, the complementary code for this research can be found at https://github.com/ldenridder/drl-obstacle-avoidance-view-synthesis.
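To make the auxiliary-task idea in the abstract more concrete, the following is a minimal sketch of how a view synthesis objective might be combined with a policy objective. It is illustrative only and is not taken from the thesis or its repository: the L1 photometric loss, the weighting factor `aux_weight`, and the function names `l1_photometric_loss` and `combined_loss` are all assumptions, and the actual work may use a different loss or training scheme.

```python
from typing import List

# Hypothetical representation: a grayscale image as rows of intensities in [0, 1].
Image = List[List[float]]


def l1_photometric_loss(predicted: Image, target: Image) -> float:
    """Mean absolute pixel difference between a synthesized view and the real view."""
    if len(predicted) != len(target):
        raise ValueError("images must have the same number of rows")
    total = 0.0
    count = 0
    for pred_row, target_row in zip(predicted, target):
        if len(pred_row) != len(target_row):
            raise ValueError("image rows must have the same length")
        for p, t in zip(pred_row, target_row):
            total += abs(p - t)
            count += 1
    if count == 0:
        raise ValueError("images must not be empty")
    return total / count


def combined_loss(policy_loss: float, aux_loss: float, aux_weight: float = 0.5) -> float:
    """Policy objective plus a weighted auxiliary view-synthesis term (assumed form)."""
    return policy_loss + aux_weight * aux_loss


# Toy example: a 2x2 synthesized right view compared with the real right view.
predicted_right = [[0.2, 0.4], [0.6, 0.8]]
actual_right = [[0.0, 0.5], [0.5, 1.0]]

aux = l1_photometric_loss(predicted_right, actual_right)
total = combined_loss(policy_loss=1.0, aux_loss=aux, aux_weight=0.5)
print(f"auxiliary loss: {aux:.3f}, combined loss: {total:.3f}")
```

In this assumed formulation, a larger `aux_weight` pushes the shared encoder toward features useful for predicting the second stereo view, while a smaller value lets the policy objective dominate.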

Files

MSc_Thesis_Final_Report_Luc_de... (pdf)

(pdf | 28.9 MB)

Unknown license