P. Chen | TU Delft Repository

Automated detection of failures based on service records

Master thesis (2023) - A. Boroni Grazioli (author), P. Chen (mentor), G. Jongbloed (graduation committee member), F. Yu (coach)

With the ever-increasing need to reduce the use of fossil fuels, Tesla is accelerating the world's transition to sustainable energy. This means replacing all internal combustion vehicles with electric ones over time. The growing number of Tesla vehicles on the road poses interest ...

Temporal Fusion Transformer for time series forecasting

Bachelor thesis (2023) - M. Kielhöfer (author), P. Chen (mentor), T.W.C. Vroegrijk (graduation committee member)

The ability to accurately forecast sales volumes holds substantial significance for businesses. Current classical models struggle to capture the impact of different variables on the sales volume. These machine learning models are also not applicable to more than one specific ...

Transfer Learning Framework for Battery Lifetime Prediction Using Early Cycle Data: Addressing the Challenge of Limited Training Data Diversity

Master thesis (2023) - F.J.C.M. Schürmann (author), P. Chen (mentor), Wenda Kang (graduation committee member)

The integration of large-scale battery storage systems can aid the transition to renewable energy and stabilize energy systems. However, batteries can be cost-prohibitive and unprofitable, highlighting the need for a more comprehensive understanding and modelling of battery degradation. Battery degradation prediction models play a crucial role in battery manufacturing, especially when they can be built from early cycle data. A challenge in battery lifetime prediction, however, is the lack of diversity in training data, which leads to models that are not robust and generalize poorly. Transfer learning can address this issue, as it does not require the test and training data to come from similar distributions.

This thesis introduces a framework that uses early cycle data to predict battery lifetimes. The framework employs a regularized regression method, the elastic net, for battery lifetime prediction and Transfer Component Analysis (TCA) for transfer. The major contribution of this thesis is this transfer learning framework, which involves the analysis of when, what, and how to transfer. To demonstrate the framework's performance, a dataset based on early cycle data from a real-world case study is used. The results show that the proposed methods outperform existing methods in both the simulation and the case study. Different methods are used for selecting and weighting features before the transfer, yielding improvements in 39 out of 42 cases in the case study. In particular, weighting the features by their elastic net coefficients before the transfer is the best approach in 15 out of 21 cases and improves the RMSE and MAPE relative to no transfer in 38 out of 42 cases.

Additionally, as one of the first studies in this field, this thesis provides innovative approaches to quantitatively examine negative transfer. It conducts a comprehensive analysis, mainly for univariate distributions, using a robust two-sample goodness-of-fit test to gain a deeper understanding of the relationship between transfer performance and distributional differences.
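
To illustrate how a two-sample goodness-of-fit statistic can quantify the distributional difference of a single feature between a source and a target dataset, here is a minimal sketch. The abstract does not name the test used in the thesis; the two-sample Kolmogorov-Smirnov statistic is assumed here purely for illustration, and the function name and feature values are hypothetical.

```python
from bisect import bisect_right

def ks_two_sample_statistic(source, target):
    """Largest gap between the two empirical CDFs (two-sample KS statistic)."""
    if not source or not target:
        raise ValueError("both samples must be non-empty")
    xs, ys = sorted(source), sorted(target)
    d = 0.0
    # Both empirical CDFs are step functions, so the largest gap
    # is attained at one of the observed values.
    for v in xs + ys:
        f_x = bisect_right(xs, v) / len(xs)
        f_y = bisect_right(ys, v) / len(ys)
        d = max(d, abs(f_x - f_y))
    return d

# Hypothetical values of one feature in a source and a target battery dataset.
source_feature = [0.10, 0.20, 0.30, 0.40, 0.50]
target_feature = [0.35, 0.45, 0.55, 0.65, 0.75]
print(ks_two_sample_statistic(source_feature, target_feature))
```

A statistic near 0 indicates similar univariate distributions, while a value near 1 indicates almost no overlap; relating such a number to the change in prediction error after transfer is one way to study negative transfer quantitatively.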

Evaluating Constant Failure Rates in Storm Surge Barriers

A Statistical Framework Applied to Censored Component Lifetimes of the Oosterscheldekering

Master thesis (2023) - J.H. Epema (author), M. Kok (graduation committee member), Geurt Jongbloed (graduation committee member), P. Chen (mentor), R.R.P. van Nooyen (graduation committee member), Alexander Maria Rogier Hoffmans (graduation committee member), L.F. Mooyaart (mentor)

This study examines the validity of constant failure rates in the reliability assessment of storm surge barriers, with a focus on the Stormvloedkering Oosterschelde (SVKO). Analysing a dataset of 1,501 malfunctions, including 87 critical incidents over six years, we employ Expone ...
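
As background on what a constant failure rate means for censored lifetimes, the sketch below computes the standard maximum-likelihood estimate under an exponential lifetime model with right-censoring: the number of observed failures divided by the total time at risk. This is a generic textbook estimator, not necessarily the procedure used in the thesis, and the function name and component histories are hypothetical.

```python
def constant_failure_rate(durations, failed):
    """MLE of an exponential failure rate: observed failures / total time at risk."""
    if len(durations) != len(failed):
        raise ValueError("durations and failed must have the same length")
    total_time = sum(durations)
    if total_time <= 0:
        raise ValueError("total observation time must be positive")
    return sum(1 for f in failed if f) / total_time

# Hypothetical component histories in years; False marks a component
# that was still working when observation stopped (right-censored).
durations = [2.0, 6.0, 4.5, 6.0, 1.5]
failed = [True, False, True, False, True]
rate = constant_failure_rate(durations, failed)
print(f"estimated failure rate: {rate:.3f} per year")
```

Censored components contribute observation time but no failure, which is why ignoring them would overestimate the rate.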

Energy Study of Drying

Using Machine Learning to Predict the Energy Consumption of an Industrial Powder Drying Process

Master thesis (2022) - M. El Ouasgiri (author), P. Chen (mentor), A. Papapantoleon (graduation committee member), Robin Verhoek (coach)

In this thesis, we use data science / statistical techniques to better understand the energy consumption behind a powder drying facility located in Zwolle, as part of Abbott's initiative to better manage its energy consumption. As powder drying is by far the facility's most energ ...

Comparison Studies of Estimators for the Generalized Gamma Distribution and New Findings

Bachelor thesis (2022) - J. Chang (author), P. Chen (mentor)

The Generalized Gamma Distribution (GGD) is a three-parameter distribution with desirable properties. For certain values of the parameters, the GGD can reduce to the gamma, exponential and lognormal distribution, among others. This makes it a flexible distribution that can be use ...
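
The reduction to special cases can be checked numerically. The sketch below assumes the Stacy parameterization of the GGD, with scale `a` and shape parameters `d` and `p`; the abstract does not state which parameterization the thesis uses, so the function names and parameter names are illustrative only. The lognormal case arises only as a limit and is not covered here.

```python
import math

def ggd_pdf(x, a, d, p):
    """Generalized gamma density, Stacy form: scale a, shape parameters d and p."""
    if x <= 0:
        return 0.0
    return (p / a**d) * x**(d - 1) * math.exp(-((x / a) ** p)) / math.gamma(d / p)

def gamma_pdf(x, shape, scale):
    """Gamma density with the given shape and scale."""
    return x**(shape - 1) * math.exp(-x / scale) / (math.gamma(shape) * scale**shape)

def exponential_pdf(x, scale):
    """Exponential density with the given scale (mean)."""
    return math.exp(-x / scale) / scale

x = 1.7
# p = 1 gives the gamma distribution with shape d and scale a.
print(math.isclose(ggd_pdf(x, a=2.0, d=3.0, p=1.0), gamma_pdf(x, 3.0, 2.0)))
# d = p = 1 gives the exponential distribution with scale a.
print(math.isclose(ggd_pdf(x, a=2.0, d=1.0, p=1.0), exponential_pdf(x, 2.0)))
```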

Restoration of Missing Data using a Human Adaptive Framework

The Cleansing Algorithm

Master thesis (2022) - S.S. Dijkstra (author), P. van Buuren (mentor), Piao Chen (mentor), G. Jongbloed (graduation committee member), Antonis Papapantoleon (graduation committee member)

Improving data quality is of the utmost importance for any data-driven company, as data quality is unmistakably tied to business analytics and processes. One method to improve upon data quality is to restore missing and wrong data entries.

The goal of this research is to construct an algorithm that restores missing and wrong data entries while making use of a human adaptive framework. The algorithm is constructed in a modular fashion and consists of three main modules: Data Transformation, Data Structure Analysis and Model Selection. Data Transformation converts raw data into data types and forms that the other modules can use.

Data Structure Analysis has been designed to deal with correctly missing data and dichotomy in the target feature by making use of three clustering algorithms: DBSCAN, K-Means and Diffusion Maps. DBSCAN is used to determine the necessity of clustering as well as the initialisation of the K-Means algorithm. K-Means and Diffusion Maps have been used as clustering methods in the one-dimensional target feature and the two-dimensional input-target feature pairs, respectively. Data Structure Analysis has further been designed to perform feature selection through three filter methods: CorrCoef, FCBF and Treelet.

Model Selection proposes a novel approach to selecting the best model from a candidate set through the optimisation of a conditional model ranking strategy based on the prior construction of theoretical testing. Our candidate set consisted of Expectation Maximisation, K-Means, Multi-Layer Perceptron, Nearest Neighbor, Random Forest, Linear Regression, Polynomial Regression, and ElasticNet Regression.
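
The general idea of picking a restoration model from a candidate set can be sketched as follows. This is not the conditional model ranking strategy of the thesis, whose details the abstract does not give; it substitutes plain hold-out validation, in which known entries are hidden, each candidate restores them, and the candidate with the lowest error is kept. The three toy candidates, all function names, and the data are hypothetical.

```python
from statistics import mean

def fit_mean(xs, ys):
    """Baseline: always restore with the mean of the known targets."""
    m = mean(ys)
    return lambda x: m

def fit_nearest_neighbor(xs, ys):
    """Restore with the target of the closest known input."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def fit_linear(xs, ys):
    """Restore with a least-squares line through the known rows."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)

CANDIDATES = {
    "mean": fit_mean,
    "nearest_neighbor": fit_nearest_neighbor,
    "linear": fit_linear,
}

def select_model(xs, ys, holdout_every=3):
    """Hide every k-th known entry, restore it with each candidate, keep the best."""
    rows = list(zip(xs, ys))
    train = [r for i, r in enumerate(rows) if i % holdout_every != 0]
    held_out = [r for i, r in enumerate(rows) if i % holdout_every == 0]
    train_x, train_y = zip(*train)
    errors = {}
    for name, fit in CANDIDATES.items():
        predict = fit(train_x, train_y)
        # Mean absolute error on the hidden entries.
        errors[name] = mean(abs(predict(x) - y) for x, y in held_out)
    return min(errors, key=errors.get), errors

# Hypothetical complete rows: the target is roughly 2 * input + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
ys = [3.1, 4.9, 7.2, 9.0, 10.8, 13.1, 15.0, 17.1, 18.9]
best, errors = select_model(xs, ys)
print(best, errors)
```

Keeping the candidates in a dictionary mirrors the modular design described in the abstract: adding or removing a model does not change the selection logic.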

In terms of restorability, the optimal configuration of the Cleansing Algorithm for the restoration of missing data was obtained by not using clustering, using a custom alteration of the Treelet algorithm for feature selection, and making use of the model selection strategy. This not only led to the greatest restorability of 56.90% on Aegon data sets, an improvement of 44.83% compared to not using the Cleansing Algorithm, but also reduced computation time by over 400%. A more realistic restorability, given the presence of correctly missing data, was obtained with the same configuration combined with one-dimensional output clustering. This resulted in a restorability of 43.10% on Aegon data sets. As such, it was deemed possible to restore missing data on Aegon data sets.

With respect to the human adaptive framework, it was determined that the construction of the algorithm be modular in the sense that any alternate feature selection or clustering approach can be implemented with ease. Furthermore, the model selection module allows us to customize the theoretical testing and choice of regression or classification models for the restoration of missing data. In doing so, the algorithm has laid the foundations for human adaptivity of the Cleansing Algorithm.

Tail characteristics of CRPS-based distributions

Master thesis (2022) - J. Roseboom (author), P. Chen (mentor)

In my thesis I researched the potential paths and pitfalls of the newly created "Taillardat index".
This index uses the tail characteristics of several CRPS-based distributions to rank forecasters on how well they forecast, with a slight emphasis on extreme events.
From ...

Process Capability Analysis Considering Asymmetric Tolerance

Bachelor thesis (2021) - L. Krudde (author), Piao Chen (mentor), Alessandra Cipriani (graduation committee member)

The classical process capability indices are still the ones most widely used by practitioners for asymmetrical tolerances, even though they do not accurately reflect process capability. It appears that an adequate measure of capability for asymmetrical tolerances is yet to be discove ...

Developing an R-package for the Gamma distribution

Bachelor thesis (2020) - K. Buis (author), P. Chen (mentor)

Since the gamma distribution is one of the most important models, and no convenient statistical tools for this distribution are available, the aim of this project is to construct an R package for the gamma distribution. In this package five functions are created, that can be used ...