Mixed-Integer Non-linear Formulation for Optimisation over Trained Transformer Models


Abstract

In the past few years, rapid strides have been made in the modelling of complex systems due to the advent of machine learning (ML) technologies. In particular, the transformer neural network (TNN) has attracted considerable attention for its powerful "sequence-to-sequence" modelling across several tasks in science and engineering. Predictive ML models are trained to learn the relationship between a set of input and output data so that new input data can be mapped to their expected output values. This gives us the ability to understand the key factors influencing a system and to make predictions about its future behaviour. This research investigates how to use TNNs for decision making by optimising the output of a trained TNN. To this end, a non-convex mixed-integer non-linear programming (MINLP) formulation for a trained TNN is proposed. The proposed formulation facilitates solving problems with embedded TNNs to global optimality. The effectiveness of the formulation is tested on three case studies: an optimal trajectory problem, a verification problem, and a reactor optimisation problem. Results show that optimisation over small TNNs can be achieved in under 3 minutes. However, the tractability of the formulation quickly vanishes for larger models, highlighting the need for further research to refine the proposed formulation.
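The thesis's full transformer formulation is not reproduced here. As a simplified, hedged illustration of the general idea of embedding a trained network as mixed-integer constraints, the sketch below shows the standard big-M encoding of a single ReLU unit y = max(0, x) — a canonical building block in optimisation over trained networks, though the transformer formulation must additionally handle non-linear components such as softmax and layer normalisation. The function names and the bound M are illustrative choices, not from the thesis.

```python
# Illustrative sketch (not the thesis's formulation): the standard big-M
# mixed-integer encoding of one ReLU unit y = max(0, x), with a binary
# variable z selecting the active (z = 1) or inactive (z = 0) branch and
# a constant M bounding |x|:
#     y >= x,   y >= 0,   y <= x + M*(1 - z),   y <= M*z

def relu_bigm_feasible_y(x, z, M):
    """Feasible interval for y under the big-M ReLU constraints,
    for a fixed input x and a fixed binary z.
    Returns (lo, hi), or None if the constraints are infeasible."""
    lo = max(x, 0.0)                    # y >= x and y >= 0
    hi = min(x + M * (1 - z), M * z)    # upper bounds from the big-M pair
    return (lo, hi) if lo <= hi else None

def relu_via_bigm(x, M=10.0):
    """Enumerate the binary variable; for |x| <= M the constraints
    force the unique value y = max(0, x)."""
    values = set()
    for z in (0, 1):
        interval = relu_bigm_feasible_y(x, z, M)
        if interval is not None:
            lo, hi = interval
            assert lo == hi  # y is pinned to a single value
            values.add(lo)
    assert len(values) == 1  # both feasible branches agree (only at x = 0)
    return values.pop()
```

In a real MINLP with the network embedded, one such binary variable and constraint set is introduced per ReLU node, and a global solver branches on the binaries; the enumeration above merely verifies that the encoding reproduces the activation exactly.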

Files

Thesis_SHallsworth.pdf
(PDF | 4.12 MB)
Unknown license