Unsupervised classification of ships and their operations

Abstract

Data is generated constantly and stored for analysis everywhere. In the maritime industry, interest in the analysis of ship data has grown over the years. In this thesis, we look at AIS data coupled with sea state data. AIS data is generated by the ship and describes, among other things, its location, speed, and heading. When coupled with data such as the wave height and wave direction at these locations, it lets us analyse ship operations in different sea conditions. We analysed 46 Damen ships of the same type that operate in different regions of the world. The aim was to make interpretable groups of ships that have similar operation profiles, and to investigate the effect of different sea states on ship operations. We first enrich the data with port labels, from which we can define trips as sequences of points away from port. We also estimate path lengths between points using Bézier curves. From this we get a relevant set of variables that we can use for an unsupervised learning task. We clustered the ships using three methods: principal component analysis, K-means, and hierarchical clustering. Principal component analysis showed variation among the ships, but neither an interpretation nor definitive clusters were clear. We then used the K-means method to make 12 clusters of ships, of which six proved to be stable. Hierarchical clustering showed similar results. Interpretation of these clusters was possible, mainly by looking at separate trips. We therefore also clustered the trips, to get classes of trips. We used the K-means method and obtained six clusters of trips, of which five were stable. We also look at ship availability in different regions during different sea conditions. We use an isotonic regression method to test whether ships stay in port more often during heavy weather. We found regions where availability decreases during high waves and regions where availability seemed independent of wave height. This most likely has to do with the function of the ship. Finally, we look at sailing speeds during different sea states and find that sea state data alone is not sufficient to adequately estimate the sailing speed of a ship. The conclusion is that, using the variables that we created, stable clusters can be obtained. These clusters are interpretable and can lead to a better understanding of customer needs. We nevertheless recommend coupling the AIS data with more data sources, since that can lead to more informative clusters and might give more insight into sailing speeds.
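
The trip definition used in the abstract (a trip is a sequence of points away from port) can be illustrated with a minimal sketch. This is not the thesis implementation: the function name `split_into_trips`, the `(timestamp, port_label)` record layout, and the convention that a label of `None` means "away from port" are all assumptions made for illustration.

```python
from itertools import groupby


def split_into_trips(points):
    """Group consecutive at-sea points into trips.

    `points` is a time-ordered list of (timestamp, port_label) tuples.
    Assumption for this sketch: port_label is None when the ship is
    away from port, and a port name otherwise.
    """
    trips = []
    # groupby collects runs of consecutive points with the same in-port status.
    for in_port, run in groupby(points, key=lambda p: p[1] is not None):
        if not in_port:
            trips.append([timestamp for timestamp, _ in run])
    return trips


# Hypothetical track: two stays in port separated by at-sea points.
track = [
    (0, "Rotterdam"), (1, None), (2, None), (3, "Rotterdam"),
    (4, "Rotterdam"), (5, None), (6, "Antwerp"),
]
trips = split_into_trips(track)
print(trips)  # [[1, 2], [5]]
```

Each resulting trip is a list of timestamps, from which per-trip variables such as duration could then be derived.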

Files

Thesis_Tim_C.pdf
(pdf | 24 Mb)
Unknown license