R.G. Hanea


Sequential decision making using Bayesian networks and reinforcement learning

To determine the hydrocarbon prospect exploration strategy

Master thesis (2022) - L.P. Veerhoek (author), G.F. Nane (mentor), Reidar Bratvold (graduation committee member), R.G. Hanea (graduation committee member)

In the process of drilling wells to produce hydrocarbons, an exploration strategy is used to determine which wells should be drilled and in which order. This strategy is vital, as a suboptimal drilling sequence will lead to more expenses and fewer gains. Furthermore, the wells considered in most exploration strategies are geologically dependent. Thus, a realistic model of these dependencies will be beneficial and contribute to a more reliable optimal drilling strategy.
Previous research has shown that modelling similarities between the geological properties of prospect wells in the same region, and updating the drilling strategy dynamically as more information becomes available, can add considerable value. However, the existing models are not realistic enough to use in practice. They separate the process into two parts: first modelling geological success with deterministic rewards, and then applying dynamic programming to obtain the sequential drilling strategy.
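The two-step approach described above can be illustrated with a minimal sketch, assuming a toy problem with two dependent prospects. The joint success probabilities, the revenue and the well cost below are hypothetical numbers chosen for illustration and are not taken from the thesis. The function `value` applies backward induction (dynamic programming) over the observed outcomes.

```python
from functools import lru_cache

# Hypothetical joint probabilities of (prospect 0 success, prospect 1 success).
JOINT = {(1, 1): 0.3, (1, 0): 0.1, (0, 1): 0.1, (0, 0): 0.5}
REVENUE, COST = 7.0, 3.0  # hypothetical deterministic payoff and well cost


def conditional_success(state, i):
    """P(prospect i succeeds | outcomes observed so far in `state`)."""
    consistent = {
        outcome: p
        for outcome, p in JOINT.items()
        if all(s is None or s == outcome[j] for j, s in enumerate(state))
    }
    total = sum(consistent.values())
    return sum(p for outcome, p in consistent.items() if outcome[i] == 1) / total


@lru_cache(maxsize=None)
def value(state):
    """Expected value of acting optimally from `state`.

    Each entry of `state` is None (undrilled), 1 (success) or 0 (dry).
    """
    best = 0.0  # stopping is always allowed and is worth nothing further
    for i, s in enumerate(state):
        if s is not None:
            continue  # prospect i has already been drilled
        p = conditional_success(state, i)
        hit = state[:i] + (1,) + state[i + 1:]
        miss = state[:i] + (0,) + state[i + 1:]
        drill = p * (REVENUE - COST + value(hit)) + (1 - p) * (-COST + value(miss))
        best = max(best, drill)
    return best


if __name__ == "__main__":
    print(value((None, None)))
```

With these made-up numbers, each prospect on its own has a negative expected reward (0.4 × 4 − 0.6 × 3 = −0.2), yet drilling one well and continuing only after a success has a positive expected value, which is the kind of gain from sequential updating that the paragraph refers to.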
This thesis proposes a new model that adds flexible uncertainty and dependency models for the geological characteristics used in the hydrocarbon volume calculation of a reservoir. For this purpose, non-parametric Bayesian networks in combination with copulae are used. This model and the input format of the required data align with how data is gathered in the oil and gas industry. The dependencies are modelled based on simple expert assessments.
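As a rough illustration of how a copula can couple two geological inputs, the sketch below samples a dependent pair through a normal (Gaussian) copula and combines them into a simplified pore volume. Everything specific here is an assumption made for the example: the two variables, their uniform marginals, the rank correlation of 0.7 standing in for an expert assessment, the area, and the function names. The thesis's actual network, marginals and volume equation are not reproduced.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for the copula transform


def rank_to_pearson(rank_corr):
    """Pearson correlation of a normal copula with the given Spearman rank correlation."""
    return 2.0 * math.sin(math.pi * rank_corr / 6.0)


def sample_porosity_thickness(rank_corr, rng):
    """Draw one dependent (porosity, thickness) pair via a normal copula."""
    rho = rank_to_pearson(rank_corr)
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(max(0.0, 1.0 - rho * rho)) * rng.gauss(0.0, 1.0)
    # Map the correlated normals to dependent uniforms on [0, 1].
    u1, u2 = STD_NORMAL.cdf(z1), STD_NORMAL.cdf(z2)
    # Hypothetical marginals, applied through their inverse CDFs.
    porosity = 0.10 + 0.20 * u1    # uniform on [0.10, 0.30]
    thickness = 10.0 + 40.0 * u2   # uniform on [10, 50] metres
    return porosity, thickness


def sample_volumes(n, rank_corr, area_m2=2.0e6, seed=0):
    """Sample n simplified pore volumes (area x thickness x porosity) in cubic metres."""
    rng = random.Random(seed)
    volumes = []
    for _ in range(n):
        porosity, thickness = sample_porosity_thickness(rank_corr, rng)
        volumes.append(area_m2 * thickness * porosity)
    return volumes


if __name__ == "__main__":
    print(sample_volumes(5, rank_corr=0.7))
```

The copula separates the dependence (a single rank correlation) from the marginal distributions, so either can be changed without touching the other. A full network would chain such dependencies across more variables and wells.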
After the new model for the geological properties of prospect wells and their dependencies is constructed, a decision policy that depends on the observations from previously drilled wells is derived. The dynamic programming used previously becomes infeasible with continuous variables and with a large number of wells. Therefore, the reinforcement learning algorithm Q-learning is used, which is better equipped to handle these more complex models.
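To make the idea concrete, here is a minimal sketch of tabular Q-learning on a toy drilling problem with two dependent prospects. The joint success probabilities, revenue and well cost are hypothetical, and the setup is deliberately discrete and tiny. It also makes two simplifying assumptions of its own: exploration uses uniformly random actions (allowed because Q-learning is off-policy), and the step size is 1/visits. It is not the learner or the case used in the thesis.

```python
import random
from collections import defaultdict

# Hypothetical joint probabilities of (prospect 0 success, prospect 1 success).
JOINT = {(1, 1): 0.3, (1, 0): 0.1, (0, 1): 0.1, (0, 0): 0.5}
REVENUE, COST = 7.0, 3.0  # hypothetical payoff of a success and cost of a well
STOP = -1                 # action index meaning "stop drilling"
UNDRILLED, SUCCESS, DRY = "U", "S", "D"


def actions(state):
    """Undrilled prospects can be drilled; stopping is always allowed."""
    return [i for i, s in enumerate(state) if s == UNDRILLED] + [STOP]


def q_learning(episodes=50_000, seed=0):
    """Learn Q[(state, action)] by simulation, undiscounted and episodic."""
    rng = random.Random(seed)
    q = defaultdict(float)     # all Q-values start at 0
    visits = defaultdict(int)  # visit counts for the 1/n step size
    outcomes, weights = zip(*JOINT.items())
    for _ in range(episodes):
        truth = rng.choices(outcomes, weights)[0]  # hidden geology this episode
        state = (UNDRILLED, UNDRILLED)
        while True:
            action = rng.choice(actions(state))    # random behaviour policy
            if action == STOP:
                reward, next_state, done = 0.0, state, True
            else:
                hit = truth[action]
                reward = REVENUE * hit - COST
                next_state = tuple(
                    (SUCCESS if hit else DRY) if i == action else s
                    for i, s in enumerate(state)
                )
                done = False
            # Q-learning target: reward plus the best value of the next state.
            future = 0.0 if done else max(q[(next_state, a)] for a in actions(next_state))
            visits[(state, action)] += 1
            alpha = 1.0 / visits[(state, action)]
            q[(state, action)] += alpha * (reward + future - q[(state, action)])
            if done:
                break
            state = next_state
    return q


if __name__ == "__main__":
    q = q_learning()
    start = (UNDRILLED, UNDRILLED)
    for a in actions(start):
        print(a, round(q[(start, a)], 2))
```

The learner only needs to simulate outcomes, not enumerate them, which is why this style of method remains usable when the state contains continuous quantities or many wells. In such settings the table would have to be replaced by some form of discretisation or function approximation.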
The analysis shows that Q-learning achieves similar results to dynamic programming. Afterwards, the new model is applied to a case based on real oil and gas industry data. The results show that, for this case, sequentially updating the drilling strategy brings no improvement, because the decisions do not change relative to the original drilling strategy. This occurs because the prospects all have high positive expected rewards and should be drilled regardless of the other wells’ outcomes and dependencies.
Another case with synthetic data and more favourable characteristics is explored, for which the policy from the new model and Q-learning should, in theory, outperform the current industry policy. However, this could not yet be shown within the time constraints of this thesis, as the Q-learner would need to be run for a longer period and tuned to the case.
While the model is created for dynamic sequential decision making in the application of drilling wells, this method can easily be adapted to fit any other application where decisions should be updated sequentially and the dependencies can be modelled through a Bayesian network.
In conclusion, the key contributions of this thesis are the creation of a new model for the dependencies between geological properties of a hydrocarbon reservoir and the use of the Q-learning reinforcement learning algorithm to compute the sequentially updated drilling strategy as an alternative to dynamic programming.