Free competition in insurance markets increases competitiveness and drives premiums down. Insurers that lower their premiums without a model that accurately quantifies the expected claim size risk serious losses. This research aims to model premiums accurately and to quantify the uncertainty involved, using historic claims data from an insurer. The current approach, Generalized Linear Models (GLMs), is compared with two Machine Learning techniques: Random Forests (RFs) and Gradient Boosting Machines (GBMs). Insights gained from these models and from Multivariate Adaptive Regression Splines (MARS) are then used to improve the GLMs. Bayesian Additive Regression Trees (BART) and Hierarchical Models (HMs) are then used to quantify the uncertainty. HMs give the insurer the means to construct proper credible intervals for the total expected claim size of the active portfolio. They also allow the use of risk premium principles that include measures of uncertainty when pricing premiums. All relevant models are applied to the holdout-year dataset. When profit is tracked, it becomes apparent that the GLMs and HMs underestimate the premiums. It is therefore prudent either to use the HM with a risk premium principle that adds a percentage of the standard deviation to the premium estimate, or to apply the RF model. The model lift shows that the Machine Learning techniques are better at distinguishing risky policies from non-risky ones. We recommend that the insurer use RFs to price premiums and HMs to measure the uncertainty of the active portfolio. For further study, we recommend applying other techniques to improve predictive performance further, improving the structure of the Hierarchical Model, or including left truncation and right censoring in the model.
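
The risk premium principle mentioned above, which adds a percentage of the standard deviation to the expected claim size, can be illustrated with a minimal sketch. It assumes that draws of the claim size are already available, for instance from the posterior predictive distribution of a hierarchical model. The function names, the `loading` value, and the example draws are hypothetical and not taken from the study.

```python
import statistics


def std_dev_premium(claim_draws, loading=0.1):
    """Standard deviation premium principle: mean + loading * std. deviation.

    claim_draws: simulated claim sizes (assumed to come from a fitted model).
    loading: hypothetical share of the standard deviation added to the mean.
    """
    mean = statistics.fmean(claim_draws)
    sd = statistics.pstdev(claim_draws)
    return mean + loading * sd


def credible_interval(draws, level=0.95):
    """Equal-tailed interval read off the sorted draws (nearest-rank style)."""
    ordered = sorted(draws)
    n = len(ordered)
    lower_idx = round((1 - level) / 2 * (n - 1))
    upper_idx = round((1 + level) / 2 * (n - 1))
    return ordered[lower_idx], ordered[upper_idx]


if __name__ == "__main__":
    # Made-up draws, purely for illustration.
    draws = [100.0, 200.0, 300.0, 400.0, 500.0]
    print(std_dev_premium(draws, loading=0.1))
    print(credible_interval(draws, level=0.95))
```

With `loading=0` the premium reduces to the plain expected claim size. A positive loading raises the premium for policies whose claim size is more uncertain, which is the kind of adjustment the abstract suggests for the HM.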