Keyword spotting (KWS) is an important task on low-power edge audio devices. A typical edge KWS system consists of a front-end feature extractor, which outputs mel-frequency cepstral coefficient (MFCC) features, followed by a back-end neural network classifier. Edge KWS designs aim for the best power, performance, and area metrics. This work proposes an area-efficient, ultra-low-power, time-domain feature extractor for a KWS system based on infinite impulse response (IIR) filters. It uses a serial architecture, which is further optimized with a low-cost computing structure and mixed-precision bit-width selection for the IIR coefficients while maintaining good KWS accuracy. In simulation using a 65 nm process technology, the feature extractor occupies 0.02 mm² and consumes 3.3 μW at 1.2 V; paired with a back-end neural network classifier, it achieves 92.5% accuracy on a 10-keyword, 12-class KWS task using the Google Speech Commands Dataset (GSCD).
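
To make the idea of a serial, time-domain IIR feature extractor with mixed-precision coefficients concrete, the following is a minimal software sketch and not the proposed hardware design. It assumes a bank of second-order band-pass sections, one log-energy feature per channel and frame, and per-channel fractional bit widths; the function names (`bandpass_biquad`, `quantize`, `extract_features`), the center frequencies, the quality factor, and the bit widths are all hypothetical choices made only for illustration.

```python
import math

def bandpass_biquad(fc, fs, q):
    """Band-pass biquad (0 dB peak gain), normalized so a0 = 1. Assumed filter shape."""
    w0 = 2.0 * math.pi * fc / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)                   # b0, b1, b2
    a = (-2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)   # a1, a2
    return b, a

def quantize(x, frac_bits):
    """Round x to a fixed-point grid with frac_bits fractional bits."""
    scale = 1 << frac_bits
    return round(x * scale) / scale

def extract_features(samples, fs, centers, bits_per_channel, frame_len, q=4.0):
    """Return features[frame][channel] as log frame energies.

    Channels are processed one after another through the same biquad
    update, loosely mimicking a serial datapath that reuses one
    arithmetic unit. Each channel may use a different coefficient width.
    """
    n_frames = len(samples) // frame_len
    features = [[0.0] * len(centers) for _ in range(n_frames)]
    for ch, (fc, bits) in enumerate(zip(centers, bits_per_channel)):
        b, a = bandpass_biquad(fc, fs, q)
        b = tuple(quantize(c, bits) for c in b)
        a = tuple(quantize(c, bits) for c in a)
        x1 = x2 = y1 = y2 = 0.0
        energy = 0.0
        for n, x in enumerate(samples[: n_frames * frame_len]):
            # Direct-form I difference equation of the IIR section.
            y = b[0] * x + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
            x2, x1 = x1, x
            y2, y1 = y1, y
            energy += y * y
            if (n + 1) % frame_len == 0:
                features[(n + 1) // frame_len - 1][ch] = math.log(energy + 1e-12)
                energy = 0.0
    return features

if __name__ == "__main__":
    fs = 16000
    tone = [math.sin(2.0 * math.pi * 1000.0 * n / fs) for n in range(1600)]
    feats = extract_features(tone, fs, [500.0, 1000.0, 2000.0], [10, 8, 6], 400)
    print(feats[-1])
```

In this sketch, coarser coefficient widths shift each filter's center frequency and bandwidth slightly, which is the kind of accuracy-versus-cost trade-off that a mixed-precision selection would have to balance against classifier accuracy.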