Motion Planning in Dynamic Environments with Learned Scalable Policies

Abstract

The application of multi-robot systems has gained popularity in recent years. Multi-robot systems show great potential for scaling up robotic applications in surveillance, monitoring, and exploration. Although single robots can already automate search and rescue, and surveillance tasks, their ability to complete a given task still depends on the size of the region of interest. In expansive environments, for example, a single robot's capacity to cover or surveil the entire area effectively diminishes, resulting in information gaps. To this end, this thesis aims to improve multi-robot coordination and navigation in active perception tasks, e.g. exploration and surveillance. The multi-robot coordination problem aims to determine when and with whom multiple robots should exchange information to avoid collisions while navigating to their goals. The active perception problem involves guiding a sensing robot to positions rich in task-relevant information. The primary objective of this dissertation is to provide algorithmic contributions, specifically focusing on obtaining robust policies that can adapt and scale to fleets and tasks of varying sizes. To this end, this thesis combines high-level policies, learned through Reinforcement Learning, with robust low-level optimal control policies.

Firstly, the communication-based coordination problem for decentralized multi-robot systems is formally defined and modeled as a Markov Decision Process. This thesis then proposes a novel communication policy for decentralized multi-robot systems, improving coordination and collision avoidance. The proposed policy, learned via Reinforcement Learning, allows robots to communicate selectively, requesting trajectory plans only from robots that pose a potential collision risk while assuming constant velocities for the others. The policy uses an attention-based neural network architecture for scalability and is integrated with Non-Linear Model Predictive Control for safe and robust motion planning. In tests with 12 robots, it reduced communication compared to alternative approaches while maintaining safety. The policy is scalable and robust, performing well across different team sizes and in the presence of observation noise. Real-world tests on quadrotors confirmed its practical applicability.

The active perception problem represents the second major challenge addressed in this dissertation. Specifically, this thesis addresses the challenge of using a drone to collect semantic information for classifying multiple moving targets, focusing on computing control inputs that yield optimal viewpoints. This task is complicated by the variable number of targets to classify in the region of interest and by the use of a "black-box" classifier, such as a deep neural network, which lacks a clear analytical relationship between viewpoints and outputs. This thesis proposes an attention-based architecture, trained with Reinforcement Learning (RL), which determines the best viewpoints from which the drone can gather evidence on multiple unclassified targets, considering their movement, orientation, and potential occlusions. A low-level MPC controller then guides the drone to these viewpoints. The approach outperforms various baselines and shows adaptability to new scenarios and scalability to numerous targets with varying movement dynamics. To conclude the thesis, this approach is extended to more realistic and larger environments where targets must be localized, tracked, and then classified.
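
As a rough illustration of the selective-communication idea in the first contribution (all names, feature layouts, dimensions, and the 0.5 request threshold below are illustrative assumptions, not the thesis implementation): each robot scores its neighbors with an attention mechanism, requests trajectory plans only from those flagged as potential risks, and falls back to a constant-velocity prediction for the rest.

```python
import numpy as np

# Illustrative sketch only: the [x, y, vx, vy] neighbor state layout, feature
# dimensions, and the request threshold are assumptions, not the thesis design.

def attention_request_probs(ego_state, neighbor_states, W_q, W_k):
    """Score neighbors with scaled dot-product attention; a high score means
    the ego robot should request that neighbor's planned trajectory."""
    q = ego_state @ W_q                      # (h,)   query from ego features
    K = neighbor_states @ W_k                # (n, h) keys from neighbor features
    logits = K @ q / np.sqrt(K.shape[1])     # (n,)   attention logits
    return 1.0 / (1.0 + np.exp(-logits))     # per-neighbor request probability

def neighbor_trajectories(neighbor_states, request, communicated_plans,
                          horizon=20, dt=0.1):
    """Use a communicated plan where one was requested and received; otherwise
    roll the neighbor forward under a constant-velocity assumption."""
    trajs = []
    for i, s in enumerate(neighbor_states):
        if request[i] and i in communicated_plans:
            trajs.append(communicated_plans[i])
        else:
            pos, vel = s[:2], s[2:]
            trajs.append(np.stack([pos + vel * dt * k for k in range(horizon)]))
    return trajs

# Toy example with random weights: one ego robot, three neighbors, 4-D states.
rng = np.random.default_rng(0)
ego, neighbors = rng.normal(size=4), rng.normal(size=(3, 4))
W_q, W_k = rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
request = attention_request_probs(ego, neighbors, W_q, W_k) > 0.5
plans = neighbor_trajectories(neighbors, request, communicated_plans={})
```

The predicted or communicated neighbor trajectories would then enter the low-level NMPC problem as collision-avoidance constraints, which is where the safety guarantees of the hybrid scheme come from.
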
This thesis introduces a novel decentralized hybrid multi-camera system designed for surveillance and monitoring applications. Traditional fixed camera networks suffer from blind spots and backlighting issues. A decentralized hybrid framework is proposed that integrates both static and mobile cameras to actively and dynamically enhance the gathering of critical information. All networked cameras collaborate to monitor and localize people in the environment by comparing their local information. The mobile camera is guided by a viewpoint control policy to maximize the semantic information obtained from observed targets. The framework was implemented in a photorealistic environment using Unreal Engine, with distributed communications through the Robot Operating System (ROS), bridging the gap between simulation and real-world applications. Results in large environments demonstrate the advantages of collaborative mobile cameras over static and individual setups in both target identification and tracking accuracy. In crowded scenarios, mobile cameras excel at avoiding occlusions and capturing the desired viewpoints, improving the percentage of classified tracked targets compared to static setups. Qualitatively, mobile cameras provide a level of target observation quality unmatched by the static framework.

In summary, this thesis makes significant contributions that are validated through extensive evaluations in simulated photorealistic environments and with commercial drones, demonstrating the potential for practical applications. Despite this progress, the thesis acknowledges the remaining challenges in deploying multi-robot systems in real-world perception tasks, especially when the policy is learned, and suggests directions for future research.
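
As a rough sketch of the decentralized camera framework described above (topic names, message types, and node layout are assumptions made for illustration, not the thesis implementation), each camera can run as a ROS node that broadcasts its local person detections and subscribes to those of its peers; the mobile camera would additionally run the learned viewpoint control policy on the fused information.

```python
#!/usr/bin/env python3
# Illustrative sketch only: topic names and the use of PoseArray for shared
# detections are assumptions, not the interfaces used in the thesis.
import rospy
from geometry_msgs.msg import PoseArray

class CameraNode:
    """Each camera runs one of these: it publishes its local person detections
    and keeps the latest detections broadcast by the other cameras."""
    def __init__(self, name, peers):
        rospy.init_node(name)
        self.pub = rospy.Publisher(f"/{name}/detections", PoseArray, queue_size=1)
        self.peer_detections = {}
        for peer in peers:
            rospy.Subscriber(f"/{peer}/detections", PoseArray,
                             self.on_peer, callback_args=peer)

    def on_peer(self, msg, peer):
        # Store the latest detections reported by a peer camera.
        self.peer_detections[peer] = msg

    def publish_local(self, detections):
        # Broadcast this camera's current detections (poses of tracked people).
        self.pub.publish(detections)

if __name__ == "__main__":
    node = CameraNode("static_cam_1", peers=["static_cam_2", "mobile_cam"])
    rospy.spin()
```

Note there is no central fusion node in this layout: every camera compares peer detections against its own local estimates, matching the decentralized design described above.
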

Files

PhD_Thesis.pdf
(pdf | 28.7 MB)
Unknown license

File under embargo until 07-02-2025