When observations are obtained from the same cross-sectional units at multiple time points, heterogeneity usually arises across the cross-sectional units. If one ignores this heterogeneity and treats the data as pooled, the parameter estimates risk being inconsistent. This thesis studies the difference between panel data and pooled data models with regard to their construction procedure and their predictive performance. An application to credit risk modelling for a mortgage portfolio is discussed. For this application, different models were constructed, covering pooled and panel linear models and pooled and panel logistic models. Comparing the models' performance and test results, we found that adding the heterogeneity effect to the regression model improves the discriminatory power. At the same time, however, doing so yields predicted losses that are lower than the observed ones. We also noted that, in most cases, the pooled model fails to produce accurate predictions. This thesis has been carried out jointly with TU Delft / Department of Applied Mathematics and the Central Risk Management / Model Validation department of ABN AMRO Bank.
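The inconsistency of pooled estimates under unit heterogeneity can be illustrated with a small simulation. The sketch below is not the thesis's model or data: it assumes a synthetic linear panel in which each unit has its own intercept that is correlated with the regressor, and it compares a pooled least-squares slope with a within (fixed-effects) slope. All function names and parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate_panel(n_units=2000, n_periods=5, beta=1.0, seed=0):
    """Simulate y_it = alpha_i + beta * x_it + noise (illustrative assumption).

    The regressor x_it is built to be correlated with the unit effect alpha_i,
    which is the situation in which ignoring heterogeneity biases the slope.
    """
    rng = np.random.default_rng(seed)
    alpha = rng.normal(size=n_units)  # unobserved unit-specific effects
    x = alpha[:, None] + rng.normal(size=(n_units, n_periods))
    y = alpha[:, None] + beta * x + rng.normal(size=(n_units, n_periods))
    return x, y


def pooled_slope(x, y):
    """Least-squares slope with one common intercept (heterogeneity ignored)."""
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc * yc).sum() / (xc ** 2).sum())


def within_slope(x, y):
    """Within (fixed-effects) slope: demean each unit over time first."""
    xd = x - x.mean(axis=1, keepdims=True)
    yd = y - y.mean(axis=1, keepdims=True)
    return float((xd * yd).sum() / (xd ** 2).sum())


if __name__ == "__main__":
    x, y = simulate_panel()
    # Under these assumptions the pooled slope converges to 1.5 rather than
    # the true value 1.0, while the within slope converges to 1.0.
    print("pooled slope:", pooled_slope(x, y))
    print("within slope:", within_slope(x, y))
```

In this setup the pooled slope absorbs part of the unit effect because the regressor and the unit effect move together, whereas demeaning within each unit removes the effect before the slope is estimated. If the unit effects were uncorrelated with the regressor, the pooled slope would not be biased in this way.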