Pipe failure modelling for water distribution networks using boosted decision trees
Abstract
Pipe failure modelling is an important tool for strategic rehabilitation planning of urban water distribution infrastructure. Rehabilitation predictions are mostly based on existing network data and historical failure records, both of varying quality. This paper presents a framework for extracting and processing such data for training decision tree-based machine learning methods. The performance of the trained models in predicting pipe failures is evaluated for simple decision trees as well as for more advanced, ensemble-based decision tree methods. Bootstrap aggregation and boosting techniques are used to improve the accuracy of the models. The models are trained on 50% of the available data, and their performance is evaluated using confusion matrices and receiver operating characteristic curves. While all models show very good performance, the boosted decision tree approach using random undersampling performs best and is therefore applied to a real-world case study. The applicability of decision tree methods to practical rehabilitation planning is demonstrated for the pipe network of a medium-sized city.
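Two of the building blocks named in the abstract, random undersampling of the majority class and the confusion matrix, can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the toy pipe records, and the `age` attribute are hypothetical and chosen only for illustration, and the sketch assumes a binary label (1 = pipe failed, 0 = no failure).

```python
import random


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows until every class is as small as the rarest class."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    n_min = min(len(group) for group in by_class.values())
    balanced = []
    for label, group in by_class.items():
        # Sample without replacement so no record is duplicated.
        for row in rng.sample(group, n_min):
            balanced.append((row, label))
    rng.shuffle(balanced)
    return [row for row, _ in balanced], [label for _, label in balanced]


def confusion_matrix(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


# Hypothetical toy data: 20 pipes, of which only the 4 oldest have failed.
pipes = [{"age": age} for age in range(20)]
failed = [1 if age >= 16 else 0 for age in range(20)]

X_bal, y_bal = random_undersample(pipes, failed, seed=1)
```

In a full workflow, the balanced set would be passed to a boosted decision tree learner, and the confusion matrix would be computed on data held out from training.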