Data-Driven Optimization of Slow Sand Filters

Machine Learning for New Design Paradigms

Abstract

This thesis investigates how to optimize Slow Sand Filter (SSF) performance through data-driven modeling, focusing on removal efficiencies for bacteria (E. coli, Coliform) and viruses (Enterovirus, Adenovirus, Bacteriophage MS2). It combines an extensive literature review, exploratory data analysis (EDA), and machine learning modeling with XGBoost to provide new insights into the complex interactions between design parameters and SSF performance, contributing to the development of more efficient and sustainable SSF configurations.

The study began with a systematic literature review of 135 experimental studies, identifying key SSF design parameters and operational conditions, including Filtration Rate (FR), Hydraulic Retention Time (HRT), Effective Grain Size (D10), Total Bed Height (T.B.H.), and Schmutzdecke characteristics. The review was supplemented with EDA techniques, such as scatterplots and correlation matrices, to uncover trends and potential relationships between design parameters and removal efficiencies for bacteria and viruses. The scatterplots and correlation matrices revealed only limited linear relationships, but they highlighted the complexity of the interactions and indicated the need for modeling techniques capable of capturing nonlinear patterns.

To address this complexity, models based on XGBoost, a gradient-boosted decision tree algorithm well suited to nonlinear relationships and parameter interactions, were trained on the dataset. Their predictive capabilities were evaluated using R², Mean Squared Error (MSE), and cross-validation. Feature importance analysis of the XGBoost models then showed which design parameters and operational conditions had the most significant influence on removal efficiencies.
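The evaluation step described above (R², MSE, and cross-validation) can be sketched in a few lines of standard-library Python. This is only an illustrative sketch, not the thesis code: the function names, the contiguous fold scheme, and the toy numbers are all assumptions, and a trivial mean predictor stands in for the XGBoost model so that the example runs without extra packages.

```python
from statistics import mean


def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds over n rows."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


def cross_validate(X, y, fit, predict, k=5):
    """Return the held-out MSE of each fold for a model given as callables."""
    scores = []
    for train, test in k_fold_indices(len(y), k):
        model = fit([X[i] for i in train], [y[i] for i in train])
        preds = predict(model, [X[i] for i in test])
        scores.append(mse([y[i] for i in test], preds))
    return scores


# Placeholder model (assumption): always predicts the training mean.
# A real analysis would plug a gradient-boosted regressor in here instead.
def fit_mean(X_train, y_train):
    return mean(y_train)


def predict_mean(model, X_test):
    return [model] * len(X_test)


if __name__ == "__main__":
    # Toy values invented for illustration only; they are NOT thesis data.
    filtration_rate = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6]]
    log_removal = [3.0, 2.8, 2.5, 2.1, 1.9, 1.6]
    fold_mse = cross_validate(filtration_rate, log_removal,
                              fit_mean, predict_mean, k=3)
    print("per-fold MSE:", fold_mse)
```

Passing the model as `fit` and `predict` callables keeps the evaluation loop independent of any particular library, so the same loop could wrap a boosted-tree regressor without changes.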
A key contribution of this research is a set of interactive tools, such as feature contribution sliders and interactive heatmaps, which enable dynamic exploration of how changes in design parameters affect removal efficiencies. These tools support the simulation of design scenarios, give an intuitive understanding of parameter impacts, and guide the formulation of new design paradigms. The model outputs were used to propose optimized parameter ranges that improve bacterial and viral removal while maintaining operational efficiency. The thesis also examines the robustness and reliability of the XGBoost models, discussing the results for controllable and uncontrollable design parameters and operational conditions and comparing predictive performance for bacterial and viral removal.

In conclusion, this research highlights the effectiveness of XGBoost modeling for both predicting and explaining SSF performance. The insights gained support the development of more efficient SSF design paradigms by enabling targeted adjustments to parameters based on model predictions, and the interactive tools give engineers a practical way to simulate different design configurations and optimize SSF performance. Overall, the thesis emphasizes the potential of machine learning to improve the design and operation of SSFs and to inform future advances in sustainable and effective water treatment technologies.

Files

MSc_Thesis_Oguzhan_Kobya_Defin... (pdf)
Unknown license
File under embargo until 31-03-2026