We investigate how an Unmanned Air Vehicle (UAV) can detect manned aircraft with a single microphone. In particular, we create an audio data set in which UAV ego-sound and recorded aircraft sound can be mixed together, and apply convolutional neural networks (CNNs) to the task of air traffic detection. Because of restrictions on flying UAVs close to aircraft, the data set has to be produced artificially, so the UAV sound is captured separately from the aircraft sound. The aircraft data set is collected at Lelystad airport by capturing flyovers with a microphone array. These recordings are mixed with UAV recordings, and each mixed recording is labeled according to whether it contains aircraft audio. The mixed recordings are the input for a model that determines whether an aircraft is present. The model is a CNN that takes one of three features as input: MFCC, spectrogram or Mel spectrogram. For each feature, we explore the effect of the UAV/aircraft amplitude ratio, the type of labeling, the window length and the addition of aircraft recordings from a third-party sound database. The results show that the best performance is achieved with the Mel spectrogram feature. Performance increases when the UAV/aircraft amplitude ratio is decreased, when the time window is lengthened or when the data set is extended with aircraft audio recordings from a third-party sound database. Training the model on distant approaches and testing it on nearby approaches is not desirable, as performance then drops. The results also show that performance increases the closer the aircraft is. Although the presented approach still yields too many false positives and false negatives for real-world application, this study indicates multiple paths forward that can lead to better performance. In addition, the data set is provided as open access, allowing the community to contribute to improving the detection task.
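As an illustration of the mixing step described above, the following minimal sketch scales a UAV clip so that the UAV/aircraft amplitude ratio takes a chosen value, sums the two signals, and attaches a binary label. It is not the authors' pipeline: the function names `rms` and `mix_at_ratio`, the use of RMS as the amplitude measure, the 8 kHz sample rate and the synthetic stand-in signals are all assumptions made for the example.

```python
import numpy as np

def rms(x):
    """Root-mean-square amplitude of a 1-D signal."""
    return float(np.sqrt(np.mean(np.square(x))))

def mix_at_ratio(uav, aircraft, ratio):
    """Scale `uav` so that rms(uav) / rms(aircraft) == ratio, then add the signals.

    Returns the mixed clip and the scaled UAV-only clip.
    The RMS-based definition of the ratio is an assumption of this sketch.
    """
    n = min(len(uav), len(aircraft))          # trim both clips to a common length
    uav, aircraft = uav[:n], aircraft[:n]
    gain = ratio * rms(aircraft) / rms(uav)   # gain that yields the requested ratio
    scaled_uav = gain * uav
    return scaled_uav + aircraft, scaled_uav

# Synthetic stand-ins for real recordings (hypothetical data, 1 s at 8 kHz).
rng = np.random.default_rng(0)
sr = 8000
t = np.arange(sr) / sr
uav = rng.standard_normal(sr)                    # broadband noise as UAV ego-sound
aircraft = 0.1 * np.sin(2 * np.pi * 110 * t)     # low-frequency tone as aircraft sound

mixed, scaled_uav = mix_at_ratio(uav, aircraft, ratio=2.0)

# Label 1: aircraft audio present in the clip; label 0: UAV sound only.
positive_example = (mixed, 1)
negative_example = (scaled_uav, 0)
```

Lowering `ratio` makes the aircraft relatively louder in the mix, which corresponds to the condition under which the abstract reports better detection performance.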