Utility operators must rely on predictive analyses of the availability of their subsurface assets, which depends strongly on damage caused by an increasing amount of excavation work. However, straightforward use of standard statistical techniques, such as logistic regression or Bayesian logistic regression, does not allow for accurate predictions of these rare events. Therefore, in this paper, alternative approaches are investigated. These approaches involve weighting the likelihood as well as over- and undersampling the data. It was found that these methods could substantially improve the accuracy of predicting rare failure events. More specifically, an application based on real data from a Dutch water utility operator showed that undersampling and weighting improved the balanced accuracy, which varied between 0.61 and 0.66, while the proposed methods produced failure predictions for between 38% and 58% of the validation data set. Hence, the proposed methods will enable utility operators to arrive at more accurate forecasts, enhancing their asset operation decision-making.
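As an illustration only, the three ingredients named above (balanced accuracy, undersampling, and likelihood weighting) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the procedure used in the paper: the function names, the 1:1 undersampling ratio, and the inverse-frequency weighting scheme are all hypothetical choices, and failures are assumed to be coded as 1 and to be the minority class.

```python
import random

def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (recall on failures) and specificity (recall on non-failures)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    return 0.5 * (tp / pos + tn / neg)

def undersample(rows, labels, seed=0):
    """Randomly drop non-failure rows (label 0) until both classes have equal size.

    Assumes failures (label 1) are the minority class; the 1:1 ratio is an
    illustrative choice, not one taken from the paper.
    """
    rng = random.Random(seed)
    pos_idx = [i for i, y in enumerate(labels) if y == 1]
    neg_idx = [i for i, y in enumerate(labels) if y == 0]
    keep = sorted(pos_idx + rng.sample(neg_idx, len(pos_idx)))
    return [rows[i] for i in keep], [labels[i] for i in keep]

def class_weights(labels):
    """Inverse-frequency weights that could scale each observation's log-likelihood term.

    Hypothetical weighting scheme: each class receives the same total weight.
    """
    n = len(labels)
    pos = sum(labels)
    return {1: n / (2 * pos), 0: n / (2 * (n - pos))}

# Toy data: 5 failures among 100 assets (made-up numbers, not the utility's data).
labels = [1] * 5 + [0] * 95
rows = list(range(100))

# A model that never predicts a failure is 95% accurate, yet its balanced
# accuracy shows it has no skill on the rare class.
always_no_failure = [0] * 100
print(balanced_accuracy(labels, always_no_failure))

small_rows, small_labels = undersample(rows, labels)
print(len(small_rows), sum(small_labels))
print(class_weights(labels))
```

The toy case shows why plain accuracy is a poor yardstick for rare failures: the trivial predictor scores a balanced accuracy of 0.5, the same as guessing. Either resampled data or class weights of this kind would then be passed to the (Bayesian) logistic regression fit.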