R. Babuska

Doctoral thesis (8)

Master thesis (27)

35 records found

SSM-FSR: Symbolic Regression for Dynamical Systems Using Foundation Models

Master thesis (2025) - C.V. Terpstra (author) , Robert Babuska (mentor) , Holger Caesar (graduation committee member) , Christian Pek (graduation committee member)

Identification of Finite Dimensional Spatiotemporal Systems

Doctoral thesis (2025) - Z. Hidayat (author) , R. Babuška (promotor) , BHK de Schutter (promotor) , Alfredo Nunez Vicencio (copromotor)

Spatiotemporal systems are systems whose dynamics depend on time and space and are commonly found in real life. These systems are mathematically modeled using partial differential equations and are also known as distributed-parameter systems. Due to their structure and the high number of variables involved, control and estimation for this class of systems are very challenging. This thesis addresses two problems related to spatiotemporal systems: state estimation and system identification.

Monitoring the states of a control system is important to ensure the behavior of the system is achieving the control objectives. This can be achieved, among others, by using state observers that estimate the states of the systems regularly. First, we present a literature review of observer design methods for distributed parameter systems. In general, the design requires a dimension-reduction approach to implement the observer. From the dimension reduction, the design approaches can be classified into late and early lumping. In the late lumping perspective, model reduction is performed at the end of the observer design. In the early lumping perspective, dimension reduction is applied to the model of the system. We incorporate both approaches in our literature review.

State observer design requires the model of the systems. This thesis also presents a system identification method for distributed-parameter systems. The identification of such systems typically requires spatially dense and regular measurements, followed by selecting sensors that provide significant measurements to the model to reduce the model complexity. However, these requirements may be challenging to fulfill. In case the sensor locations are irregular and sparse in space, we propose the use of lumped-parameter system identification.

For models with a large number of regressors, we propose a method for reducing the number of regressors using a tree representation. The tree is a way to list models with different numbers of regressors. From all possible regressors for the model, the proposed method builds the tree from the simplest models, i.e., models with one regressor. The number of regressors in the models is incrementally increased to one or more models with the best performance. The addition is repeated until the tree contains models with the desired maximum number of regressors.
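The tree construction described above can be illustrated with a minimal sketch. It assumes a linear-in-parameters model fitted by ordinary least squares and ranks models by their sum of squared errors. The names `fit_sse`, `build_regressor_tree`, and `beam_width`, as well as the toy data, are hypothetical and not taken from the thesis.

```python
def fit_sse(columns, y):
    """Least-squares fit of y on the given regressor columns; returns the sum of squared errors."""
    k, n = len(columns), len(y)
    # Augmented normal equations [X^T X | X^T y].
    M = [[sum(columns[i][t] * columns[j][t] for t in range(n)) for j in range(k)]
         + [sum(columns[i][t] * y[t] for t in range(n))] for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        if abs(M[p][c]) < 1e-12:
            return float("inf")  # (near-)collinear regressors: reject this model
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, k):
            f = M[r][c] / M[c][c]
            for j in range(c, k + 1):
                M[r][j] -= f * M[c][j]
    # Back substitution for the parameter vector w.
    w = [0.0] * k
    for i in reversed(range(k)):
        w[i] = (M[i][k] - sum(M[i][j] * w[j] for j in range(i + 1, k))) / M[i][i]
    return sum((y[t] - sum(w[i] * columns[i][t] for i in range(k))) ** 2 for t in range(n))


def build_regressor_tree(regressors, y, max_size, beam_width=2):
    """Grow models level by level; only the `beam_width` best models of a level are expanded.

    regressors: dict mapping a regressor name to its list of values.
    Returns one list per level, each holding (sse, regressor_names) sorted by sse.
    """
    levels = []
    frontier = [()]  # root of the tree: the empty model
    for _ in range(max_size):
        candidates = {}
        for model in frontier:
            for name in regressors:
                if name in model:
                    continue
                child = tuple(sorted(model + (name,)))
                if child not in candidates:
                    candidates[child] = fit_sse([regressors[r] for r in child], y)
        ranked = sorted((sse, model) for model, sse in candidates.items())
        levels.append(ranked)
        frontier = [model for _, model in ranked[:beam_width]]
    return levels


# Toy data (assumed for illustration): y depends on x1 and x3 only.
regs = {
    "x1": [1, 2, 3, 4, 5, 6],
    "x2": [1, 0, 1, 0, 1, 0],
    "x3": [0, 1, 0, 2, 1, 3],
}
y = [2 * a - 3 * c for a, c in zip(regs["x1"], regs["x3"])]
levels = build_regressor_tree(regs, y, max_size=2, beam_width=2)
```

Here `levels[0]` lists the one-regressor models and `levels[1]` the two-regressor models grown from the two best of them. Expanding only the best models keeps the tree small compared with enumerating every subset of regressors.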

System identification is typically performed using a complete data set, i.e., for each input sample, there is an associated output sample available. However, there are cases in which some output samples are not recorded in the data set, making the identification data incomplete. This thesis also considers the problem of incomplete data for Takagi-Sugeno (TS) fuzzy system identification using the product space clustering method. This method comprises two steps: fuzzy clustering and rule construction. The first proposed method applies a fuzzy c-means clustering algorithm, originally developed for classification with incomplete data, to incomplete system identification data. This algorithm yields different estimates for a missing sample, which are then fused into a single value. The second proposed method treats missing samples as optimization variables during the identification process. The optimization is repeated until the change in all optimization variables is small.

Designing Simulators for Robot Learning

Doctoral thesis (2025) - Bas van der Heijden (author) , Robert Babuška (promotor) , Jens Kober (promotor) , Laura Ferranti (copromotor)

Reinforcement learning has emerged as a promising approach for enabling robots to learn from interactions with their environments, without relying on predefined behaviors. However, robots face significant challenges when learning directly from real-world interactions. Real-world learning is time-consuming and resource-intensive, often requiring extensive data collection over long periods. Additionally, the risks involved in trial-and-error learning in physical settings are high, as faulty policies can lead to safety issues or system damage. Simulations offer a safer and more efficient alternative, allowing robots to learn in simulated environments at faster-than-real-time speeds. Despite these benefits, simulations often serve as imperfect approximations of reality. As a result, robots may learn behaviors that exploit simulation-specific quirks, which may not perform well in real-world settings, creating difficulties in transferring learned behaviors from simulation to real environments—a challenge known as the sim-to-real gap. Several factors contribute to the sim-to-real gap, such as unmodeled physical phenomena like friction and deformation, and the asynchronous nature of real-world systems that simulations often fail to capture accurately. Additionally, using separate software stacks for simulation and deployment can unintentionally lead to discrepancies. Finally, simulating at faster-than-real-time speeds with asynchronous frameworks that distribute computation across multiple cores may also introduce inaccuracies without proper synchronization.

This thesis focuses on improving simulation tools and methodologies to enhance the efficiency and effectiveness of learning-based approaches in robotics. The work addresses key trade-offs between flexibility, speed, and accuracy in robotic simulations, which are critical for successfully transferring learned policies from simulation to real-world environments. Additionally, it introduces a strategy to improve resilience, ensuring that learned behaviors are robust to irrelevant and unknown dynamics.
By tackling these challenges, this thesis provides insights into the design of effective robotic simulators and presents contributions that help bridge the gap between simulated and real-world robotic learning.

Model Predictive Path Integral Control for Interaction-Rich Local Motion Planning in Dynamic Environments

Doctoral thesis (2025) - Elia Trevisan (author) , Javier Alonso-Mora (promotor) , R. Babuska (promotor)

In urban environments like Amsterdam, where road congestion and stress on infrastructure are critical issues, the city’s waterways remain underutilized for transportation. Autonomous Surface Vessels (ASVs), making waterway transport cheaper and less labor intensive, present a potential solution by reducing transportation costs and alleviating road traffic. Although companies are developing ASVs, current solutions are mainly limited to larger, less congested waterways, with further innovation needed for denser urban areas.

The challenges in autonomous navigation for ASVs in urban canals stem from the complex nonlinear dynamics of vessels, which require long-term planning and rapid responses to environmental changes. Urban waterways are narrow and unstructured, with loosely defined navigation rules that can lead to discontinuities in the motion planner’s cost function. Typically, motion planners rely on predictions of other agents’ movements and plan around them, resulting in no interaction awareness. This approach can lead to the freezing robot problem in dense environments, where the ASV halts, deeming all the space unsafe. A local motion planner must address these challenges by ensuring long-horizon interaction-aware motions, rule adherence, and real-time planning.

An emerging approach in autonomous navigation is a sampling-based Model Predictive Control (MPC) strategy known as Model Predictive Path Integral control (MPPI). This algorithm approximates the optimal control sequence by sampling from a continuous input distribution. Thanks to its gradient-free nature, MPPI only requires collision checking to avoid collision and allows discontinuous cost functions. Additionally, its computation speed remains largely unaffected by the complexity of the robots’ nonlinear dynamics, enabling longer planning horizons. While this approach has grown in popularity in the motion planning community, its application in dynamic environments with multiple interacting agents remains relatively unexplored.
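As a rough illustration of the sampling-based update described above, the following sketch performs one MPPI step for a one-dimensional point robot. The dynamics, the cost terms, and all names and parameter values (`mppi_step`, `rollout_cost`, `sigma`, `lam`, the 1.5 boundary) are assumptions made for this example and are not taken from the thesis.

```python
import math
import random


def rollout_cost(x0, controls, goal=1.0, dt=0.1):
    """Simulate x_{t+1} = x_t + u_t * dt and accumulate the cost along the trajectory."""
    x, cost = x0, 0.0
    for u in controls:
        x += u * dt
        cost += (x - goal) ** 2
        if x > 1.5:
            cost += 100.0  # discontinuous penalty; no gradient of the cost is ever needed
    return cost


def mppi_step(x0, u_nominal, cost_fn, num_samples=256, sigma=0.5, lam=1.0, rng=random):
    """One MPPI update: perturb the nominal plan, weight the samples by cost, average."""
    horizon = len(u_nominal)
    noises = [[rng.gauss(0.0, sigma) for _ in range(horizon)] for _ in range(num_samples)]
    costs = [cost_fn(x0, [u + e for u, e in zip(u_nominal, eps)]) for eps in noises]
    best = min(costs)  # subtracting the minimum keeps the exponentials numerically stable
    weights = [math.exp(-(c - best) / lam) for c in costs]
    total = sum(weights)
    # New plan: nominal plan plus the weighted average of the sampled perturbations.
    return [u + sum(w * eps[t] for w, eps in zip(weights, noises)) / total
            for t, u in enumerate(u_nominal)]


rng = random.Random(0)
plan = [0.0] * 20
new_plan = mppi_step(0.0, plan, rollout_cost, rng=rng)
```

In a receding-horizon loop, the first control of `new_plan` would be applied and the remainder shifted forward to warm-start the next step. Because the samples are drawn around the previous plan, an abrupt change in the environment can leave every sample in collision.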

Building on the state-of-the-art in MPPI, this thesis first introduces an Interaction-Aware Model Predictive Path Integral (IA-MPPI) controller tailored for motion planning in crowded urban canals. While conventional planners passively react to other agents, IA-MPPI actively predicts and plans cooperatively in real-time, addressing the freezing robot problem and handling discontinuous navigation rules. The decentralized, communication-free architecture assumes agent cooperation and employs a two-stage sample evaluation to enhance efficiency. This approach manages nonlinear dynamics, exact obstacle shapes, and longer planning horizons, demonstrating robustness and efficiency compared to state-of-the-art MPC in multi-agent environments.

While IA-MPPI uses a constant velocity model for predicting other agents’ hidden goals, Chapter 4 of this thesis introduces a learning-based trajectory prediction model trained on realistic artificial data to improve accuracy in crowded environments. By extracting the agents’ goals from predicted trajectories and integrating this information into the IA-MPPI controller, the planner’s performance increases significantly in environments with tight interactions. Moreover, this approach outperforms treating the predicted trajectory as occupied space, leading to safer navigation in dense urban canals.

One of MPPI’s main drawbacks is that it struggles in highly dynamic environments because it usually only samples around a previous plan. When rapid changes occur, such as strong disturbances or an obstacle unexpectedly cutting off the path, the previous plan becomes invalid, and all the generated samples may lead to collisions. To address this, Chapter 5 introduces Biased-MPPI, incorporating multiple classic and learning-based ancillary controllers into the sampling distribution. While the mathematical derivations show that this introduces a bias, this significantly improves reactivity, safety, and robustness to local minima. Additionally, this approach reduces the required samples, making the controller more efficient regarding computational demand.

Lastly, Chapter 6 leverages the gradient-free nature of MPPI and applies it to a different domain involving contact-rich manipulation tasks, which are notoriously difficult for model-based planning. With a parallelizable GPU-based physics engine like IsaacGym, MPPI can efficiently roll out hundreds of input sequences in parallel, faster than in real-time, to approximate optimal control. This approach eliminates the need for explicit modeling of contact dynamics by exploiting the models embedded in the simulator, making it particularly suited for manipulation tasks like pushing and picking with both prehensile and non-prehensile manipulators. Experiments with real robots demonstrate the effectiveness of this method in handling discontinuous, contact-heavy environments.

Overall, this thesis explores key aspects of motion planning, particularly for ASVs, with extensions to ground robots and mobile manipulators. It advances the MPPI framework for autonomous navigation by adapting it for multi-agent systems and introducing biases through classic and learning-based ancillary controllers. Additionally, the thesis compares MPPI to MPC in navigation experiments and to optimization fabrics in manipulation tasks. The work presented promotes the use of MPPI, especially in domains where complex nonlinear dynamics and discontinuous costs complicate the use of real-time optimization-based MPC.

Inclined quadrotor landing using on-board sensors and computing

Master thesis (2024) - E.L. Cancrinus (author) , R. Babuska (mentor) , D.S. van der Heijden (mentor) , L. Ferranti (graduation committee member) , G. Li (graduation committee member) , Sihao Sun (graduation committee member)

Achieving autonomous inclined landing would be an important step towards quadrotors which are able to land anywhere and under any conditions. If a quadrotor is able to safely land anywhere, even when a landing platform is not available, this would open up many useful applications. Firstly, the quadrotor could safely land whenever its battery levels get low or when contact with an operator is lost. Secondly, the quadrotor could be used for different applications, such as reconnaissance, search and rescue, infrastructure or delivery services.

There are many challenges to the field of autonomous inclined quadrotor landing. Firstly, landing safely on an inclined surface without the use of a perching mechanism requires that the quadrotor has a low approach velocity. The quadrotor should also land at the same angle as the inclination of the landing surface. To meet these constraints, the quadrotor will be required to perform an agile landing maneuver. Furthermore, the quadrotor will also have to estimate its attitude, position and velocity towards the platform during the landing maneuver. Due to the agile landing maneuver, landmark-based localization of the quadrotor will be more difficult, since these methods require the quadrotor's on-board camera to be directed at a certain landmark which can be used for guidance. Current methods for autonomous inclined landing either use external sensors or a landing mechanism to deal with the presented challenges.

During this project, an algorithm is developed which can estimate the quadrotor's attitude, position and velocity during the landing maneuver, while using only on-board sensors. State estimates are generated using two sources: a landmark-based localization algorithm and a Visual-Inertial Odometry (VIO) algorithm. The landmark-based localization algorithm uses markers placed near the landing surface to determine the quadrotor's attitude and position relative to the landing platform. Estimates from these two systems are fused by an Extended Kalman Filter (EKF). Furthermore, we train a policy network in a deep reinforcement learning approach for control of the quadrotor during the landing maneuver. We use a field-of-view constraint during the training of this policy network to keep markers used by the localization algorithm in sight of the quadrotor's on-board camera sensor during the landing maneuver.
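The idea of fusing two estimate sources can be sketched with a scalar Kalman measurement update. This is a deliberately simplified, linear stand-in for the EKF mentioned above, not the filter used in the thesis. The function name `kalman_update` and all numbers are assumed for illustration.

```python
def kalman_update(mean, var, measurement, meas_var):
    """Scalar Kalman measurement update for a directly observed state."""
    gain = var / (var + meas_var)  # trust the measurement more when the prior is uncertain
    new_mean = mean + gain * (measurement - mean)
    new_var = (1.0 - gain) * var
    return new_mean, new_var


# Assumed example: height above the platform in metres.
mean, var = 0.0, 4.0                                # vague prior
mean, var = kalman_update(mean, var, 1.2, 1.0)      # marker-based localization reading
mean, var = kalman_update(mean, var, 1.0, 1.0)      # visual-inertial odometry reading
```

After both updates the variance is lower than that of either measurement alone, which is the basic reason a fused estimate can outperform each individual source.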

During a series of experiments in the Gazebo simulator, we validate performance of the state estimation system during the inclined landing maneuver. We show that the marker localization algorithm's performance is improved by implementing a field-of-view constraint during the training of the policy network. We also show that state estimation by the EKF outperforms the two individual state estimation algorithms. In the Gazebo simulator, the quadrotor is able to use the state estimation system to land without the use of external sensors.

ReAIC: Reactive Active Inference Control for Enhanced Performance of Robotic Manipulators

Master thesis (2024) - A.R. Dawe (author) , R. Babuska (mentor) , Martijn Wisse (coach) , G. Franzese (coach) , C.A. van Hoof (coach)

Critically Pre-Trained Neural Cellular Automata as Robot Controllers

Master thesis (2024) - E. Guichard (author) , R. Babuska (mentor) , Tomas Mikolov (graduation committee member)

Neural Cellular Automata (NCA) have recently been proposed as neuromorphic robot controllers. Despite their promising behavioural characteristics and small parameter counts, training NCA for control tasks has proven difficult and unstable. It is so tricky that curriculum-like mul ...

Specialised Hook Gripper For Robotic Grasping of Vine Tomato Trusses

Master thesis (2024) - W.R. Kok (author) , J. L. Herder (mentor) , Robert Babuška (mentor)

In this master thesis, the development of an innovative gripper specifically designed for grasping the delicate peduncles of vine tomato trusses is presented. This research addresses a critical challenge in agricultural automation: efficiently and safely manipulating high-value crops without causing damage. The thesis begins with a comprehensive review of the current state of agricultural automation, particularly focusing on the post-harvest handling of vine tomatoes.
Building on identified limitations in existing robotic grippers and automated systems, a novel gripper design is proposed. This design uniquely approaches grasping the peduncle of vine tomato trusses. Traditionally, trusses have been grasped by the peduncle with standard pinching grippers for pick and place operations. The gripper proposed in this thesis grasps the truss with a hook around the peduncle, often optimally at the centre of mass, increasing the success rate of grasping and the stability while avoiding damage to the truss. Furthermore, it has a higher tolerance for detection errors, allowing for inaccurate positioning of the robotic system. A hook-gripper can successfully grasp a wider range of tomato varieties than a pinch gripper.
The hook, consisting of two fingers, allows for more stable lifting of the truss and can handle peduncles in hard-to-grasp positions, such as in a crate filled with tomatoes. In addition to increased reach capabilities, the hook-gripper also enables manipulations like dragging and pushing.
The study involved an iterative design process, prototyping, and rigorous testing of the gripper. Key features include a hook mechanism for secure grasping, enhanced mobility for reaching into cramped spaces like packed crates, and a delicate touch to prevent bruising or damaging the fruit. The research also integrates the gripper with advanced detection systems for precise and effective operation within automated setups.
Results from extensive testing demonstrate that the newly designed gripper not only improves the success rate of grasping and manipulating vine tomato trusses but also significantly reduces the risk of damage compared to conventional pinch grippers. Individual trusses can be grasped with great success.
Testing for the different positions showed an increased range of grasping positions that resulted in a successful grip. In practical experiments, the gripper performed well, lifting trusses with ease. Other test results show that the position of the peduncle is of great importance for the success rate. Test results indicated an 80% success rate in grasping trusses positioned near the edges of the crate. The tests also show that replaying waypoints with "learning from demonstration" improves the grasping of the trusses compared to existing detection possibilities, and that emptying a crate is a challenging task.
The specialized hook-gripper offers insights into a practical solution for picking and placing vine tomatoes using the peduncle. This thesis contributes to the field of agricultural robotics and facilitates a step towards future innovations in the automation of high-value crop handling. The study delves into hook-gripper design and actuation, crucial for optimal performance in robotic manipulation systems.

Learning Human Preferences for Physical Human-Robot Cooperation

Doctoral thesis (2024) - L.F. van der Spaa (author) , Jens Kober (promotor) , R Babuška (promotor)

Physical human-robot cooperation (pHRC) has the potential to combine human and robot strengths in a team that can achieve more than a human and a robot working on the task separately. However, how much of the potential can be realized depends on the quality of cooperation, in whi ...

Generalizable Robotic Imitation Learning

Interactive Learning and Inductive Bias

Doctoral thesis (2024) - Rodrigo Perez-Dattari (author) , Jens Kober (promotor) , Robert Babuška (promotor)

Robots have the potential to assume tasks across various real-world scenarios. To achieve this, we require adaptable and reactive robots that can robustly deal with products and environments that present variability. For example, in the agro-food sector, each tomato plant inside ...

Robotic Grasping of Vine Tomatoes: A Learning Approach

Master thesis (2023) - L. van den Bent (author) , Robert Babuška (mentor) , Tomas Coleman (graduation committee member) , Christian Pek (coach) , Cosimo Lieu (coach)

Legible Grasping with Teleoperated Learning from Demonstration

Master thesis (2022) - M.H. van Beem (author) , Robert Babuška (mentor) , Dimitris Karageorgos (mentor) , Jens Kober (coach) , Cock Heemskerk (graduation committee member)

State-of-the-art object grasping with 7-DOF robotic manipulators requires joint configuration planning methods in order to provide position control of the end-effector. These motion planners are able to calculate a motion plan to execute a safe grasp, while t ...

Comparison of Pose Estimation Algorithms for a Deleafing Greenhouse Robot

Master thesis (2022) - V. Easwaramurthy (author) , R. Babuška (mentor)

For autonomous navigation of mobile robots in an unknown environment, mapping the robot's environment, and localizing its relative position in the environment is of utmost importance. However, in a known and controlled environment, being able to localize its position at a given t ...

Robust leaf detection and tracking in greenhouse environments

Master thesis (2021) - M.G.M. de Haas (author) , R Babuška (mentor) , Ton Ten Kate (mentor) , M. Kok (graduation committee member)

Over the last three decades, labor shortages and increased labor costs in greenhouses have driven investments in the development of agricultural robots. Priva has been developing a robot to automate the repetitive task of deleafing tomato plants. The main challenges for commercia ...

Modeling and Control of a TCPM

Master thesis (2021) - T.R. Robeerts (author) , Robert Babuška (mentor) , H Vallery (mentor) , A. Dabiri (graduation committee member)

Twisted and coiled polymer muscles (TCPMs) show promise to function as artificial muscles, because of their light weight, low cost, large contraction, and relatively low hysteresis. A TCPM contracts when it is heated and extends when it is cooled. Different modeling and controll ...

Active Perception in Autonomous Fruit Harvesting

Viewpoint Optimization with Deep Reinforcement Learning

Master thesis (2021) - D. Klaoudatos (author) , Robert Babuška (mentor) , Raf Van de Plas (graduation committee member) , Peyman Mohajerin Mohajerinesfahani (graduation committee member) , Gert Kootstra (graduation committee member)

This MSc thesis presents the development of a viewpoint optimization framework to face the problem of detecting occluded fruits in autonomous harvesting. A Deep Reinforcement Learning (DRL) algorithm is developed in order to train a robotic manipulator to navigate to occlusion-fr ...

Generalised Motions in Active Inference by finite differences

Active Inference in Robotics

Master thesis (2020) - I.L. Hijne (author) , M Wisse (mentor) , Robert Babuška (coach) , C. Pezzato (coach)

This thesis is a contribution to the research on Active Inference for Robotics. Active Inference is an intricate, intriguing theory from neuroscience, a field in which it has already gained a greater following and popularity. This theory, based on the underlying Free Energy Principle, provides a unified account of perception, action and learning in the biological brain. It has great explanatory power of the function of the biological brain and furthermore it is mathematically well-defined. This property makes the theory suitable for a translation to robotics, in which it can also provide a unified account of action and perception. This is not only elegant, but potentially very powerful too. The research for Active Inference in robotics is young, but the current research already shows that Active Inference indeed has great potential for robotics control. Literature on Active Inference is narrow and complex, and provides a lot of concepts to work with in a translation to robotics control. One such concept is the generalised coordinates of motion, which are the instantaneous derivatives of a dynamic variable. The incorporation of generalised coordinates, especially in combination with the assumption that the noise encountered in a dynamic environment is coloured, has great potential to be beneficial for both action and perception when it comes to robot control in real environments. Generalised coordinates provide a reference frame for the gradient descent that is applied to provide the action and perception laws, which in a dynamic setting has to 'hit a moving target'. Furthermore, in combination with coloured noise the generalised coordinates are advantageous for dealing with such noise. In this thesis, detailed research is provided with regard to the application of generalised coordinates in Active Inference for robotics. Current research for robotics in which Active Inference has been applied doesn't exploit the full potential of generalised coordinates.
Therefore, this research aims to explore the constructs necessary to apply generalised coordinates of motion in an on-line Active Inference control loop of an LTI State Space system. A detailed derivation of the generalised precision, which relates generalised coordinates and coloured noise, is provided. A method for obtaining generalised output by means of finite differences is proposed, that constructs generalised coordinates from the on-line data in scenarios in which the environment does not provide the required generalised coordinates naturally. The method is implemented in the simulation of a one degree of freedom SISO LTI State Space scenario which highlights the potential but also the difficulties still faced when applying Active Inference for on-line robotics control. Besides the detailed derivations of some aspects of Active Inference for robotics, open problems are identified and suggested for future research that can potentially yield methods to apply Active Inference in robotics at full capacity, providing a true biologically plausible robot control method.
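A minimal sketch of building a generalised output from on-line samples by finite differences is given below. It assumes backward differences on uniformly sampled data, which is one possible scheme. The abstract does not state which scheme the thesis uses, and the function name `generalised_output` is hypothetical.

```python
from math import comb


def generalised_output(samples, dt, order):
    """Estimate [y, y', ..., y^(order)] at the latest sample using backward differences.

    samples: past outputs with the most recent value last, uniformly spaced by dt.
    """
    if len(samples) < order + 1:
        raise ValueError("need at least order + 1 samples")
    coords = []
    for p in range(order + 1):
        # p-th backward difference: sum_i (-1)^i * C(p, i) * y[k - i], scaled by dt^p.
        diff = sum((-1) ** i * comb(p, i) * samples[-1 - i] for i in range(p + 1))
        coords.append(diff / dt ** p)
    return coords


# Assumed test signal y(t) = t^2, sampled at t = 0.8, 0.9, 1.0.
dt = 0.1
samples = [t * t for t in (0.8, 0.9, 1.0)]
coords = generalised_output(samples, dt, order=2)
```

Backward differences only use past data, so they suit an on-line loop, but each additional order divides by `dt` again and therefore amplifies measurement noise. This is one of the practical difficulties of higher-order generalised coordinates.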

Exploration and Coverage

A Deep Reinforcement Learning Approach

Master thesis (2020) - Nguyen Hai Anh (author) , Robert Babuška (mentor) , Manon Kok (graduation committee member) , Rodrigo Perez-Dattari (graduation committee member)

This work addresses the problem of exploration and coverage using visual inputs. Exploration and coverage is a fundamental problem in mobile robotics, the goal of which is to explore an unknown environment in order to gain vital information. Some of the diverse scenarios and appl ...

The Landing of a Quadcopter on Inclined Surfaces using Reinforcement Learning

Master thesis (2020) - R. Fris (author) , R. Babuška (mentor) , Wei Pan (graduation committee member) , Bruno de Brito (graduation committee member)

Deep Reinforcement Learning (DRL) enables us to design controllers for complex tasks with a deep learning approach. It allows us to design controllers that are otherwise cumbersome to design with conventional control methodologies. Often, an objective for RL is binary in nature. However, exploring in environments with sparse rewards is a problem in RL, and finding positive reward becomes exponentially more difficult with increased environment complexity. For this project, our objective is to design an RL-based controller for the landing of a quadcopter on inclined surfaces. Landing is defined as reaching these inclined surfaces with reasonable speed, such that no damage is done to either the quadcopter or the landing surface upon impact. We aim to use a binary reward for this task. We use methods to aid exploration in sparse reward environments, namely Hindsight Experience Replay (HER) and non-optimized demonstrations. HER can resample goals from the demonstrator data and the policy rollouts. The resampling of goals is done by considering a portion of the visited states during policy rollouts as the intended goals. The demonstrations are non-optimized in the sense that they do not follow the same objective as ours. We consider demonstrations valid if they are obtained from arbitrary stable policies. Our results show that the RL system does generalize to other goals when using HER and demonstrations. The demonstrations are not imitated as would happen in pure imitation learning. HER, on the other hand, enabled us to receive reward in our complex environment, while also allowing us to experience multiple goals in one policy rollout. We found that, without HER and demonstrations, the problems of exploration in sparse reward environments could not be overcome. We found that landing a quadcopter on inclined surfaces using an RL controller is feasible.
Our trajectories clearly showed a swinging motion, which in theory should be a valid control strategy for this problem. This swinging motion results in dead spots in which the quadcopter is in a state with minimal translational and rotational velocities at a relatively large angle. Further research is needed to increase the accuracy and robustness of our RL-based controller.
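The goal resampling described in this abstract can be sketched as follows. The sketch assumes one-dimensional states, a tolerance-based binary reward, and goals drawn from states visited later in the same rollout. The abstract does not specify the sampling strategy, and the names `binary_reward`, `her_relabel`, `k`, and `tol` are hypothetical.

```python
import random


def binary_reward(achieved, goal, tol=0.05):
    """Binary (sparse) reward: 1 if the achieved state is within tol of the goal, else 0."""
    return 1.0 if abs(achieved - goal) <= tol else 0.0


def her_relabel(episode, k=4, rng=random):
    """Relabel each transition with k goals taken from states visited later in the rollout.

    episode: list of (state, action, next_state, goal) tuples.
    Returns a list of (state, action, next_state, new_goal, reward) tuples.
    """
    relabelled = []
    for t, (state, action, next_state, _goal) in enumerate(episode):
        for _ in range(k):
            future = rng.randrange(t, len(episode))
            new_goal = episode[future][2]  # a state that was actually reached
            reward = binary_reward(next_state, new_goal)
            relabelled.append((state, action, next_state, new_goal, reward))
    return relabelled


# Assumed rollout that never reaches its original goal of 1.0.
episode = [(0.0, 0.1, 0.1, 1.0), (0.1, 0.1, 0.2, 1.0), (0.2, 0.1, 0.3, 1.0)]
original_rewards = [binary_reward(ns, g) for _, _, ns, g in episode]
relabelled = her_relabel(episode, k=4, rng=random.Random(0))
```

Every original reward in this rollout is zero, yet the relabelled transitions contain positive rewards because the substituted goals were actually reached. This is how relabelling provides a learning signal in a sparse reward setting.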

Robotic Grasping of Deformable Food Objects

A Human-Inspired Reinforcement Learning Approach

Master thesis (2020) - J.D. Luijkx (author) , Padmaja Kulkarni (mentor) , R. Babuška (mentor) , M. Kok (graduation committee member) , M. Wisse (graduation committee member)

There are many stages that involve humans handling food objects in the processing chains from farms to stores. For some of these tasks it is desirable to look for a robotic solution to either assist the human or even take over that task, e.g. if it is physically demanding, impose ...