Clustering satellite data to define eutrophication monitoring zones based on chlorophyll-a concentration

Abstract

The OSPAR Commission has been combating eutrophication since the problem was first identified in the 1950s. Monitoring is an important part of that effort. Under the Common Procedure, five indicators are used together to assess the status of eutrophication: the chlorophyll-a concentration, the turbidity, the nitrate and phosphorus concentrations, the oxygen levels and the biological water quality. All five indicators need to be known to obtain the final eutrophication status. However, the chlorophyll-a concentration on its own is also a good measure, and this thesis focuses solely on it as an indicator of eutrophication.

To monitor the North Sea, the OSPAR Commission has established eutrophication monitoring zones. The aim of this study is to determine eutrophication monitoring zones based on available satellite data of the chlorophyll-a concentration in the Dutch part of the North Sea. The zones are defined using four clustering algorithms: K-means clustering, hierarchical clustering, random forest clustering and HDBSCAN. The results of these clustering algorithms are compared both with each other and with the previously defined eutrophication zones.

First, the case study region is split into two areas: the coastal area, which lies closer to the shore, and the offshore area, which lies farther away from the shore. The best result for this separation was generated by K-means clustering with two clusters.
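The two-cluster split described above can be illustrated with a minimal sketch of K-means (Lloyd's algorithm) for k = 2. This is not the thesis's implementation: the input is assumed here to be one mean chlorophyll-a value per pixel, the values are synthetic, and the function name `kmeans_two_clusters` is hypothetical.

```python
def kmeans_two_clusters(values, max_iter=100):
    """Lloyd's algorithm for k = 2 on 1-D values.

    Returns (labels, centroids); label 0 is the low-concentration cluster.
    Assumption: each value is a per-pixel mean chlorophyll-a concentration.
    """
    # Deterministic initialisation: start from the two extreme values.
    centroids = [min(values), max(values)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each value to the nearest centroid.
        new_labels = [
            0 if abs(v - centroids[0]) <= abs(v - centroids[1]) else 1
            for v in values
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:
                centroids[k] = sum(members) / len(members)
    return labels, centroids


# Synthetic, illustrative data (not from the thesis): low values stand in
# for offshore pixels, high values for coastal pixels.
offshore = [0.5 + 0.02 * i for i in range(50)]
coastal = [5.0 + 0.1 * i for i in range(50)]
values = offshore + coastal

labels, centroids = kmeans_two_clusters(values)
print("pixels in low-concentration cluster:", labels.count(0))
print("pixels in high-concentration cluster:", labels.count(1))
```

In practice the input would more likely be a multi-dimensional feature vector per pixel (for example a time series), in which case the absolute difference would be replaced by a Euclidean distance.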

Afterwards, the eutrophication zones are determined separately in the offshore area and the coastal area. The clustering results are ranked based on four criteria. The first criterion is correspondence to OSPAR's previously defined eutrophication monitoring zones. The second criterion is the similarity of the clusters to the zones that are visible in the data. The third criterion is the performance determined by validation metrics. This criterion was considered less important because such metrics do not capture the goals of the research well. The last criterion is confirmation through HDBSCAN clustering. This criterion was added later in the study, when it was found that HDBSCAN yielded very accurate results. Because of how HDBSCAN works, these results could not be used directly, as the number of clusters it yields is too high, but they could be used for verification. The best results were found through random forest clustering, with nine clusters for the offshore area and five for the coastal area.
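The abstract does not name the validation metrics that were used. As one common example of such a metric, the sketch below computes the mean silhouette coefficient, which rewards clusters that are compact and well separated. The function name, the 1-D distance and the data are all illustrative assumptions, not taken from the thesis.

```python
def silhouette_score_1d(values, labels):
    """Mean silhouette coefficient for 1-D values.

    Uses the absolute difference as distance. Assumes at least two clusters
    and that the values are not all identical.
    """
    cluster_ids = sorted(set(labels))
    scores = []
    for i, (v, lab) in enumerate(zip(values, labels)):
        # a: mean distance to the other members of the point's own cluster.
        own = [abs(v - w) for j, (w, m) in enumerate(zip(values, labels))
               if m == lab and j != i]
        if not own:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(abs(v - w) for w, m in zip(values, labels) if m == c)
            / labels.count(c)
            for c in cluster_ids if c != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Illustrative values only: two well-separated groups.
values = [1.0, 1.1, 1.2, 8.0, 8.1, 8.2]
good = [0, 0, 0, 1, 1, 1]   # labelling that follows the groups
bad = [0, 1, 0, 1, 0, 1]    # labelling that mixes the groups

good_score = silhouette_score_1d(values, good)
bad_score = silhouette_score_1d(values, bad)
print("silhouette, matching labels:", round(good_score, 3))
print("silhouette, mixed labels:", round(bad_score, 3))
```

A metric of this kind only measures geometric compactness and separation, which is consistent with the point above that validation metrics were given less weight than agreement with existing zones and with patterns visible in the data.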

Subsequently, the zones derived from clustering were compared with other data to see whether the determined monitoring zones also hold over time. This appeared to be the case. Moreover, the distribution of the chlorophyll-a concentration is determined for each zone. Additionally, the trend of the chlorophyll-a concentration in one of the determined monitoring zones is analysed over time. Lastly, the defined eutrophication monitoring zones are compared with other zonings of the Dutch part of the North Sea, namely those based on fishery policies, marine protected areas, spatial planning, and bathymetry. The comparison validated the defined monitoring zones.