Like squinting your eyes: The impact of different fusion modules on change detection with deep learning
Abstract
Change detection with remote sensing data highlights semantic differences in an area between two or more points in time. It involves comparing aerial photographs of the same location taken some time apart. This facilitates large-scale analysis of urban and rural data over time, including population trends, city expansion and the detection of illegal buildings. State-of-the-art methods for the task are predominantly deep learning networks that follow an encoder-decoder architecture. These architectures all share a "fusion" point: the location in the network where the inputs stop being processed independently and become correlated. Fusions can be classified into three categories, early, middle and late, depending on how deep within the network they occur. This study aims to show how changing the fusion affects the size, spread and number of changes detected. It is motivated by the fact that the receptive field of feature maps in convolutional neural networks expands in deeper layers, so that features of different complexity are extracted at different depths. To this end, four fusion architectures are compared on three datasets: LEVIR-CD, HiUCD and a new, fully controlled dataset, CSCD. In terms of test accuracy and the size and spread of the changes, results are inconclusive: which fusion achieves the highest performance varies per dataset. Possible reasons include the complexity of remote sensing data and general differences between areas, but this is a subject for further study. The only conclusive category is the number of changes detected. On average, all architectures overestimate the number of changes in a scene. When the accuracy of the architectures is comparable, however, early fusion overestimates the number of changed objects the most, while middle and late fusion give more realistic estimates.
The case study leaves room for refinement through better problem isolation, more data and extension to more architectures, but it is a promising step towards understanding fusion.
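The distinction between early, middle and late fusion can be illustrated with a minimal NumPy sketch. It is not one of the architectures evaluated in the study: the two-layer "encoder", the function names, the weight shapes and the absolute-difference fusion used in the late variant are all assumptions made for illustration. The toy layers are per-pixel linear maps (1x1 convolutions), so unlike real convolutional layers they do not enlarge the receptive field; the sketch only shows where in the pipeline the two inputs first interact.

```python
import numpy as np

def conv1x1_relu(x, w):
    """Toy encoder layer (assumption): per-pixel linear map followed by ReLU.
    x has shape (H, W, C_in), w has shape (C_in, C_out)."""
    return np.maximum(x @ w, 0.0)

def early_fusion(img_a, img_b, w1, w2):
    # The two images are concatenated along the channel axis before any layer,
    # so every layer sees both time steps. w1: (2C, F), w2: (F, F).
    x = np.concatenate([img_a, img_b], axis=-1)
    return conv1x1_relu(conv1x1_relu(x, w1), w2)

def middle_fusion(img_a, img_b, w1, w2):
    # The first layer runs on each image independently with shared weights;
    # the features are concatenated halfway. w1: (C, F), w2: (2F, F).
    fa = conv1x1_relu(img_a, w1)
    fb = conv1x1_relu(img_b, w1)
    return conv1x1_relu(np.concatenate([fa, fb], axis=-1), w2)

def late_fusion(img_a, img_b, w1, w2):
    # Both layers run independently on each image (a Siamese encoder);
    # the features are only combined at the end, here by absolute difference.
    # w1: (C, F), w2: (F, F).
    fa = conv1x1_relu(conv1x1_relu(img_a, w1), w2)
    fb = conv1x1_relu(conv1x1_relu(img_b, w1), w2)
    return np.abs(fa - fb)

# Hypothetical sizes: 8x8 images, 3 channels, 4 feature channels.
rng = np.random.default_rng(0)
H, W, C, F = 8, 8, 3, 4
img_a = rng.random((H, W, C))
img_b = rng.random((H, W, C))

out_early = early_fusion(img_a, img_b, rng.normal(size=(2 * C, F)), rng.normal(size=(F, F)))
out_middle = middle_fusion(img_a, img_b, rng.normal(size=(C, F)), rng.normal(size=(2 * F, F)))
out_late = late_fusion(img_a, img_b, rng.normal(size=(C, F)), rng.normal(size=(F, F)))
```

All three variants return a feature map of shape `(H, W, F)` that a decoder could turn into a change mask. One property of the difference-based late variant in this sketch is that it returns exactly zero when both inputs are identical, whereas the concatenation-based variants have to learn that behaviour.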