Acoustic Perception in Intelligent Vehicles using a single microphone system
Abstract
Passive acoustic sensing exploits the ability of sound to travel beyond the line of sight to perceive the surroundings. This gives it an advantage over the sensors currently used in Intelligent Vehicles, which can detect obstacles only within their line of sight. Recently, a localization-based approach was implemented to take advantage of this sensing modality and predict vehicles approaching from behind a blind corner in an urban scenario. While this approach shows considerable promise, the multi-microphone system it requires is difficult to integrate. Additionally, such a system cannot distinguish between different types of sound sources. This motivates the exploration of a classification-based approach that uses audio data from only a single microphone to identify the sound sources present in it. This thesis investigates the feasibility of such a system on an Intelligent Vehicle for predicting vehicles approaching from behind blind corners. A review of the literature revealed that techniques categorized under Sound Event Detection (SED) are suitable for implementing a classification-based approach. The prediction of the vehicle is treated as a binary classification problem, and a Convolutional Recurrent Neural Network (CRNN) is used as the acoustic model to detect the presence of an approaching car in an audio sample represented by log-Mel spectrogram features. Additionally, domain adaptation techniques were implemented to explore the possibility of improving system performance with the limited data collected while the ego-vehicle is driving. Experiments indicate that when the ego-vehicle is static, the system performs well: the approaching vehicle is predicted 1.4 s before it enters the line of sight, and a balanced accuracy of 86.9% is achieved on the classification task. However, the system achieved an accuracy of only 68% on samples recorded while the ego-vehicle was driving. Further experiments indicate that the acoustic model does not generalize well to unseen situations in most cases, and the experiments with domain adaptation did not show any improvement in performance.
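The abstract reports a balanced accuracy for the binary classification task but does not define the metric. As a minimal sketch, assuming the standard definition (the mean of the true-positive rate and the true-negative rate), it can be computed as below. The function name, the label encoding (1 for an approaching car, 0 for none), and the example labels are illustrative assumptions and are not taken from the thesis.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (0 or 1).

    Assumed encoding: 1 = approaching car present, 0 = no approaching car.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    sensitivity = tp / (tp + fn)  # fraction of car samples detected
    specificity = tn / (tn + fp)  # fraction of no-car samples rejected
    return (sensitivity + specificity) / 2


# Made-up, imbalanced example: 2 car samples, 8 no-car samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy)                     # 9 of 10 samples correct
print(balanced_accuracy(y_true, y_pred))  # (0.5 + 1.0) / 2
```

On imbalanced data such as this, plain accuracy rewards a classifier that mostly predicts the majority class, whereas balanced accuracy weights both classes equally, which is a plausible reason for preferring it when approaching-car samples are rarer than background samples.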