Comparison of Optimal Control Techniques for Learning-based RRT

Master thesis (2018)

Authors

D. Paramkusam (Mechanical Engineering)

Contributors

M. Wisse (mentor)

M. Bharatheesha (mentor)

S. Grammatico (graduation committee member)

W.J. Wolfslag (graduation committee member)

Faculty

Mechanical Engineering

To reference this document use:

http://resolver.tudelft.nl/uuid:742ed24e-0525-4ae2-b6d4-2dc6f69e60e1

Published Date

27-02-2018

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Kinodynamic motion planning for a robot involves generating a trajectory from a given robot state to a goal state while satisfying kinematic and dynamic constraints. Rapidly-exploring Random Trees (RRT) is a sampling-based algorithm that has been widely adopted for this purpose. However, RRT is not fast enough for use in industrial applications. Recently, supervised learning has been used to pre-learn the time-consuming steps of RRT, which improved planning times. The supervised learning models require the cost and control input of the system as training data, which are generated using optimal control.

The training data can be obtained by either indirect or direct optimal control techniques. In this thesis, each technique is used to generate costs and control inputs for a two-link manipulator from random initial-final state pairs. Each dataset is then used to train a model, and the datasets are compared based on certain training metrics. K-nearest neighbours regression and a multi-layer perceptron neural network are the supervised learning models used in this thesis. Both datasets result in similar convergence of the models, but the indirect optimal control approach allows up to 24-fold faster data generation and up to a 3-fold reduction in the dimensionality of the training data compared to the direct optimal control approach.
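As an illustration of how a k-nearest neighbours regressor could map an initial-final state pair to a predicted cost, a minimal sketch follows. It is not the thesis implementation: the function name `knn_predict`, the flat tuple layout of the inputs, and the toy numbers are all assumptions made for this example.

```python
import math


def knn_predict(train_x, train_y, query, k=3):
    """Predict a scalar target as the mean over the k nearest training inputs.

    train_x: list of equal-length tuples (hypothetical feature layout, e.g. an
             initial state concatenated with a final state).
    train_y: list of scalar targets (e.g. trajectory costs).
    """
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training samples")
    # Euclidean distance from the query to every training input.
    dists = [(math.dist(x, query), y) for x, y in zip(train_x, train_y)]
    # Sort by distance only, so ties never compare the targets.
    dists.sort(key=lambda pair: pair[0])
    nearest = dists[:k]
    return sum(y for _, y in nearest) / k


# Toy data, invented for illustration only.
toy_x = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
toy_y = [1.0, 2.0, 3.0, 10.0]
prediction = knn_predict(toy_x, toy_y, (0.1, 0.1), k=3)
```

With these toy values the three nearest neighbours are the first three points, so the prediction is the mean of their targets. A regressor of this kind stores the whole dataset, which is one reason the dimensionality of the training data matters.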

Real-world robots have torque limits that depend on the actuator configuration. These torque limits are modeled as control constraints in both optimal control techniques, and the effect of this restriction on data generation and supervised learning is studied in this thesis. Direct optimal control is found to be better for data generation in this case because control bounds are easy to apply as inequality constraints on the function approximations. Indirect optimal control is very tedious, as the active constraints must be known a priori to determine the switching points. An alternative method is explored instead, in which samples are generated as in the unconstrained case but samples violating the constraints are removed. Poor control input learning is observed in both approaches, and the models struggle to extrapolate. It is hypothesised that this is due to the inability of the constrained data to fully capture the system dynamics. However, good cost prediction is achieved using neural networks.
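The alternative method described above, generating samples as in the unconstrained case and then discarding those that violate the torque limits, could be sketched as follows. The sample layout (a dict with `"controls"` and `"cost"` keys), the symmetric limit, and the helper names are assumptions for illustration and are not taken from the thesis.

```python
def within_torque_limits(controls, limit):
    """Return True if every torque in the trajectory has magnitude <= limit.

    controls: sequence of time steps, each a sequence of joint torques
              (two values per step for a two-link manipulator).
    limit:    symmetric torque bound, assumed identical for all joints.
    """
    return all(abs(u) <= limit for step in controls for u in step)


def filter_samples(samples, limit):
    """Keep only the samples whose control trajectory respects the bound."""
    return [s for s in samples if within_torque_limits(s["controls"], limit)]


# Hypothetical unconstrained samples, invented for illustration only.
samples = [
    {"controls": [(0.5, -0.2), (0.8, 0.1)], "cost": 1.3},
    {"controls": [(2.5, 0.0), (0.3, 0.4)], "cost": 0.9},   # exceeds the limit
    {"controls": [(-1.0, 1.0), (0.0, 0.0)], "cost": 2.1},  # exactly at the limit
]
kept = filter_samples(samples, limit=1.0)
```

Filtering of this kind leaves the retained samples optimal only for the unconstrained problem, and it thins out the regions of state space where large torques are needed, which is consistent with the hypothesis that the constrained data does not fully capture the system dynamics.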

Files

MscThesis_deepak.pdf

(pdf | 75.1 Mb)