Uncertainty-aware Interactive Imitation Learning for Robot Manipulation

Abstract

While Artificial Intelligence (AI) is geared towards automating tasks like writing and designing, it remains difficult to find enough human workers for tasks such as loading and unloading luggage from airplanes or harvesting produce in greenhouses. At the same time, the need to tailor robotic abilities to diverse scenarios, from agriculture to household chores, calls for a general-purpose robot morphology, such as a dexterous arm, combined with sufficient sensing and intelligence to adapt quickly to new situations.
Despite the clickbait videos circulating online, current robot technology has yet to meet this requirement. The primary obstacle preventing robot manipulators from performing daily chores, assisting in supermarkets, and harvesting fruit in fields is the lack of data available to build a robust model of the world, and letting robots autonomously explore their surroundings to discover optimal strategies is typically considered unsafe and impractical.
A more effective way to impart knowledge to robots is human supervision: ideally, interactive supervision, where robots can ask for clarification when uncertain about a situation and humans can intervene when the robot's actions are incorrect or fall short of the required performance. Moreover, when receiving or requesting instructions, the robot should quantify its confidence in how it interprets the corrections. This thesis contributes to the field of interactive robot learning by introducing several uncertainty-aware methods that improve data efficiency during learning and safety during execution.
Before delving into the main contributions, Chapter 2 introduces the reader to Interactive Imitation Learning (IIL) and the different modalities for giving feedback, from evaluative to corrective, underlining the importance of quantifying uncertainty in the robot's belief. Chapter 3 then introduces the foundations of the main function approximator used in this thesis, the Gaussian Process (GP), for learning behaviors while quantifying uncertainty. The chapter describes how a GP is trained given the evidence of the demonstrations and corrections, and how predictions of the mean and variance of the actions are obtained. Particular attention is given to how GP models support efficient updating and aggregation of online data and how the rate of change of the uncertainty can be estimated analytically.
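The GP machinery described above can be illustrated with a minimal sketch. The class and variable names here are illustrative, and a squared-exponential kernel is assumed (not necessarily the one used in the thesis): the model fits state–action pairs, predicts a mean action with its variance, and aggregates an online correction by appending it to the evidence and refitting.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=0.5, variance=1.0):
    """Squared-exponential kernel between row-vector inputs A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

class GPPolicy:
    """Minimal GP regression from states X to demonstrated actions y."""

    def __init__(self, X, y, noise=1e-3):
        self.X, self.y, self.noise = X, y, noise
        self._refit()

    def _refit(self):
        K = rbf_kernel(self.X, self.X) + self.noise * np.eye(len(self.X))
        self.L = np.linalg.cholesky(K)
        self.alpha = np.linalg.solve(self.L.T, np.linalg.solve(self.L, self.y))

    def predict(self, Xq):
        """Posterior mean action and per-query variance (epistemic uncertainty)."""
        Kq = rbf_kernel(Xq, self.X)
        mean = Kq @ self.alpha
        v = np.linalg.solve(self.L, Kq.T)
        var = rbf_kernel(Xq, Xq).diagonal() - (v**2).sum(0)
        return mean, var

    def update(self, x_new, y_new):
        """Aggregate an online correction by appending it to the evidence."""
        self.X = np.vstack([self.X, x_new])
        self.y = np.concatenate([self.y, y_new])
        self._refit()
```

Near the demonstrations the predicted variance collapses toward the noise level; far from them it reverts to the prior variance, which is what makes the variance usable as a confidence signal.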
The proposed function approximator is first applied in Chapter 4. The presented machine learning framework allows the robot to learn complex manipulation tasks from interactive demonstrations: the user gives a kinesthetic demonstration, i.e., drags the fully compliant robot around to transfer a desired skill, e.g., cleaning a table or inserting a plug into a socket. The experiments show how quantifying and reducing uncertainty keeps the robot close to high-confidence regions. Moreover, the GP online model update aggregates the corrections received from the user to reshape the learned attractor and the stiffness field, ensuring that the proper force is applied in the correct direction, for instance when cleaning a table.
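How predicted uncertainty could gate execution can be sketched as follows. This is an illustrative control rule, not the thesis implementation: when the policy's variance at the current state exceeds a threshold, the command pulls the robot back toward the nearest demonstrated state (a high-confidence region) instead of following the learned attractor. The function names and the variance proxy are assumptions.

```python
import numpy as np

def uncertainty_gated_command(x, attractor, stiffness, demo_states, variance_fn,
                              var_threshold=0.5, gain=1.0):
    """Return a velocity command: follow the learned attractor when the
    policy is confident, otherwise retreat toward the nearest
    demonstrated state."""
    if variance_fn(x) > var_threshold:
        nearest = demo_states[np.argmin(np.linalg.norm(demo_states - x, axis=1))]
        return gain * (nearest - x)        # pull back to where demonstrations exist
    return stiffness * (attractor - x)     # nominal impedance-like attractor command
```

In the confident branch, the stiffness scales the attractor error, which is the mechanism that lets corrections reshape both where the robot goes and how much force it applies along the way.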
To extend skill learning to the full robot pose and the gripper, Chapter 5 studies how to address this with GPs using the fewest possible demonstrations and corrections. The experiments focus on teaching human-like skills by exploiting the possibility of giving incremental corrections. In particular, novice users are asked to teach the complete pose and gripper behavior for picking objects in one fluid motion. Without supervision, the executed skill is usually too slow or knocks the object over before the gripper closes. After providing feedback, however, novice users were able to incrementally shape the robot's velocity so that it picks at non-zero velocity without knocking the object over, compensating for delays in the gripper dynamics.
However, learning skills that rely only on the robot's current Cartesian position is limiting, since such policies cannot encode overlapping motions, e.g., approaching a goal and then retreating along the same trajectory. This motivates Chapter 6, which formulates a new trajectory encoding for teaching single-arm or bimanual manipulation skills while remaining safe around humans through constrained velocity and force actuation. The user study also investigates the effectiveness of kinesthetic corrections, i.e., simply touching the robot, and validates this for teaching bimanual skills. Teaching or correcting two manipulators at once via teleoperation devices can become overwhelming; hence, the method adjusts movements interactively through kinesthetic perturbations rather than re-teaching imprecise skills from scratch.
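One common way to make overlapping trajectories learnable, sketched here as a general idea rather than the specific encoding proposed in Chapter 6, is to augment each waypoint with a monotonically increasing progress variable (e.g., normalized arc length), so that states which coincide in space are distinct in the encoded space.

```python
import numpy as np

def phase_augment(trajectory):
    """Append cumulative normalized arc length to each waypoint so that
    an out-and-back trajectory has no repeated states."""
    steps = np.linalg.norm(np.diff(trajectory, axis=0), axis=1)
    phase = np.concatenate([[0.0], np.cumsum(steps)])
    phase /= max(phase[-1], 1e-12)          # normalize progress to [0, 1]
    return np.hstack([trajectory, phase[:, None]])
```

A policy conditioned on the augmented state can then output different actions on the approach and the retreat, even though the Cartesian positions are identical.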
Despite the successful application of the proposed methods to single-arm and bimanual motion skills, during task learning the robot must not only master the motor aspect but also attend to the context, such as the object's location or shape. This motivates Chapter 7, which emphasizes the generalization of acquired motor skills across contexts. The proposed approach builds on GP theory to learn a non-linear transformation map from the demonstrated task space to the execution space while preserving and propagating uncertainties. Through experiments on tasks such as pick-and-place, dressing a human arm, and cleaning surfaces, it is shown how the robot can generalize execution by transforming the attractor, orientation, and stiffness policy to many new scenario configurations, even from a single demonstration of the skill.
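The core of transforming a policy to a new context while propagating uncertainty can be illustrated for the affine special case. The thesis learns a non-linear GP map; this sketch only shows how a Gaussian attractor transforms under a known frame change x → A x + b, where the mean maps as A μ + b and the covariance as A Σ Aᵀ.

```python
import numpy as np

def transform_gaussian_policy(mu, Sigma, A, b):
    """Push a Gaussian attractor (mu, Sigma) through the affine frame
    change x -> A x + b; the covariance transforms as A Sigma A^T."""
    return A @ mu + b, A @ Sigma @ A.T
```

The key point is that the uncertainty is carried along with the mean: a direction in which the demonstration was imprecise stays imprecise after the skill is mapped to the new scenario.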
In Chapter 8, the concepts of task parametrization and uncertainty awareness are extended to an over-parameterized context, such as tracking more objects than required. The proposed algorithm prompts for user attention when it encounters ambiguity, for example when multiple detected objects could be the goal of the skill. The ambiguity can then be resolved through various feedback modalities, such as pushing the robot, moving it, or providing a reward or punishment. A user study also highlighted that novice users prefer not to give conventional kinesthetic demonstrations, intervening only when necessary.
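A simple way such a query-when-ambiguous behavior could be triggered, shown here as an illustrative sketch rather than the thesis algorithm, is a margin test over the candidate goal objects: if the best and second-best candidates score too closely, the robot asks the user instead of committing.

```python
def should_query_user(candidate_scores, margin=0.2):
    """Request human attention when the best and second-best candidate
    goal objects are scored too closely to decide autonomously."""
    top = sorted(candidate_scores, reverse=True)
    return len(top) > 1 and (top[0] - top[1]) < margin
```

When the test fires, any of the feedback modalities above (pushing the robot, moving it, or rewarding it) can break the tie, and the chosen object can be fed back as evidence for future decisions.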
