Spiking Neural Networks (SNNs) offer great potential in combination with new types of event-based sensors, because they exploit the embedded temporal information. When combined with dedicated neuromorphic hardware, they enable ultra-low-power solutions and local on-chip learning. This work implements and presents a viable architecture and training methodology to detect and classify audio data using SNNs. The architecture consists of two core components. The first is an auditory front-end that performs low-level feature extraction. The second is the SNN classifier, supported by a spike encoder and decoder. The results show that the encoder has a major impact on the overall performance of the network. The temporal-based network is trained with common training methods, both supervised and unsupervised. The performance of the network is validated under clean conditions and under different levels of noise. The classification performance is analyzed and compared with traditional non-spiking Artificial Neural Networks in terms of classification accuracy, estimated energy consumption, and inference latency. The proposed architectures achieve a maximum accuracy of 97.0% under ideal conditions. This is comparable to non-spiking Artificial Neural Networks, which require significantly more energy for inference. The implementation demonstrates that the architecture is a viable solution for detecting and classifying audio data.
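
To make the encoder, classifier, and decoder chain concrete, the following is a minimal illustrative sketch and not the implementation evaluated in this work. It assumes a Bernoulli rate encoder, a single leaky integrate-and-fire (LIF) layer, and a spike-count decoder; the abstract does not specify which encoder, neuron model, or decoder is used. All names (`rate_encode`, `lif_layer`, `decode`), the parameters `beta` and `threshold`, and the feature and weight values are hypothetical.

```python
import numpy as np


def rate_encode(features, n_steps, rng):
    """Bernoulli rate coding (assumed encoder): each feature in [0, 1]
    is used as the per-step spike probability of one input neuron."""
    features = np.clip(features, 0.0, 1.0)
    return (rng.random((n_steps, features.size)) < features).astype(np.float64)


def lif_layer(spikes_in, weights, beta=0.9, threshold=1.0):
    """Leaky integrate-and-fire layer (assumed neuron model).
    Returns output spikes of shape (n_steps, n_out)."""
    n_steps = spikes_in.shape[0]
    n_out = weights.shape[1]
    v = np.zeros(n_out)
    spikes_out = np.zeros((n_steps, n_out))
    for t in range(n_steps):
        v = beta * v + spikes_in[t] @ weights  # leak, then integrate input
        fired = v >= threshold
        spikes_out[t] = fired
        v = np.where(fired, 0.0, v)            # reset neurons that spiked
    return spikes_out


def decode(spikes_out):
    """Spike-count decoding (assumed decoder): pick the output neuron
    that fired most often over the whole time window."""
    return int(np.argmax(spikes_out.sum(axis=0)))


rng = np.random.default_rng(0)
# Stand-in for normalized front-end features; values are made up.
features = np.array([0.9, 0.1, 0.8, 0.05])
# Hand-picked weights: inputs 0 and 2 drive class 0, inputs 1 and 3 drive class 1.
weights = np.array([[0.6, 0.0],
                    [0.0, 0.6],
                    [0.6, 0.0],
                    [0.0, 0.6]])

spikes = rate_encode(features, n_steps=100, rng=rng)
predicted = decode(lif_layer(spikes, weights))
print("predicted class:", predicted)
```

In this sketch the encoder alone determines how much of the feature information reaches the classifier as spike timing and spike count, which is one way to see why the choice of encoder can dominate overall performance. A trained network would replace the hand-picked weights with learned ones.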