Program Synthesis from Rewards using Probe and FrAngel

Filat, N.

Impact of Exploration-Exploitation Configurations on Probe and FrAngel in Minecraft

Bachelor thesis (2024)

Authors

N. Filat (Electrical Engineering, Mathematics and Computer Science)

Contributors

Sebastijan Dumančić, Algorithmics (mentor)

T.R. Hinnerichs, Algorithmics (mentor)

J.W. Böhmer, Sequential Decision Making (graduation committee member)

Faculty

Electrical Engineering, Mathematics and Computer Science

Program Synthesis, Probe, Minecraft, Rewards, MineRL, FrAngel, Exploration-Exploitation

To reference this document use:

http://resolver.tudelft.nl/uuid:20952d2e-436d-4ff3-ad1e-eaedeaa9db76

Published Date

21-06-2024

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Program synthesis is the task of finding a program that meets the user's intent, typically specified through input/output examples or formal mathematical specifications. This paper explores a novel form of specification in program synthesis: learning from rewards.
We apply two existing synthesizers, Probe and FrAngel, to navigation tasks in the popular game Minecraft. The problem formulation is inspired by reinforcement learning but adapted to program synthesis. As in reinforcement learning, balancing exploration and exploitation is essential for solving the task efficiently. Excessive exploration can prevent the synthesizer from finding the correct program because the feedback from the environment goes unused. Excessive exploitation is also undesirable, as seemingly promising programs might not lead to the actual solution. This work compares different exploration-exploitation trade-offs for Probe and FrAngel when applied to Minecraft environments.
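
The exploration-exploitation trade-off described above can be illustrated with a small sketch. This is not the Probe or FrAngel algorithm and does not use MineRL; it is a toy epsilon-greedy search over move sequences on a 2D grid, assuming a reward equal to the negative Manhattan distance to a goal. All names (MOVES, run, reward, mutate, synthesize, epsilon) are hypothetical and chosen only for illustration.

```python
import random

# Hypothetical toy grammar: a program is a list of move tokens.
MOVES = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}
TOKENS = list(MOVES)


def run(program, start=(0, 0)):
    """Execute the move tokens in order and return the final position."""
    x, y = start
    for token in program:
        dx, dy = MOVES[token]
        x, y = x + dx, y + dy
    return (x, y)


def reward(program, goal):
    """Assumed reward: negative Manhattan distance to the goal (0 = solved)."""
    x, y = run(program)
    return -(abs(goal[0] - x) + abs(goal[1] - y))


def mutate(program, rng):
    """Exploitation step: replace one token or append a new one."""
    candidate = list(program)
    if candidate and rng.random() < 0.5:
        candidate[rng.randrange(len(candidate))] = rng.choice(TOKENS)
    else:
        candidate.append(rng.choice(TOKENS))
    return candidate


def synthesize(goal, epsilon, max_len=8, budget=5000, seed=0):
    """Return (best program found, number of candidates evaluated)."""
    rng = random.Random(seed)
    best, best_reward = [], reward([], goal)
    for evaluations in range(budget):
        if best_reward == 0:
            return best, evaluations
        if rng.random() < epsilon:
            # Explore: sample a fresh program, ignoring the feedback so far.
            length = rng.randint(1, max_len)
            candidate = [rng.choice(TOKENS) for _ in range(length)]
        else:
            # Exploit: make a small change to the best program so far.
            candidate = mutate(best, rng)
        candidate_reward = reward(candidate, goal)
        if candidate_reward > best_reward:
            best, best_reward = candidate, candidate_reward
    return best, budget


if __name__ == "__main__":
    for eps in (0.0, 0.2, 1.0):
        program, used = synthesize(goal=(3, 2), epsilon=eps)
        print(eps, run(program), used)
```

In this sketch, epsilon = 1.0 corresponds to pure exploration (the reward is only used to recognise a solution), while epsilon = 0.0 corresponds to pure exploitation of the best candidate seen so far. The toy reward has no misleading local optima, so it does not reproduce the failure mode of excessive exploitation mentioned above; a richer environment would be needed for that.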

Files

Final_Paper.pdf

(pdf | 0.746 MB)

Unknown license