Super-resolution to enhance low-resolution thermal facial expression images for thermal facial emotion recognition
Abstract
Facial emotion recognition from thermal images has gained attention in recent years. Thermal cameras capture the heat emitted by objects, so thermal images are not sensitive to illumination changes. Furthermore, changes in temperature can indicate emotions, and it is harder for humans to fake emotions in front of a thermal camera. A limitation, however, is that thermal cameras that capture high-resolution images are expensive, while cheaper thermal cameras often capture images that have a low resolution and/or are contaminated with noise and blur. Low-resolution thermal images can also arise when images are captured from a large distance or of moving persons. Using such low-resolution thermal images for facial emotion recognition can reduce the emotion classification accuracy. To tackle the problem of low-resolution thermal facial expression images, super-resolution can be used. In this exploratory work, we propose the Thermal Face Super-Resolution Network (TFSRNet) and the Thermal Face Super-Resolution Generative Adversarial Network (TFSRGAN) to recover high-resolution thermal facial expression images from low-resolution thermal facial expression images, with the goal of using the super-resolved images for thermal facial emotion recognition. TFSRNet is optimized to minimize the mean squared error (MSE), which results in images with a high peak signal-to-noise ratio (PSNR). However, these images often have unsatisfying perceptual quality. To generate high-resolution images with high perceptual quality, we propose TFSRGAN. Both architectures use facial prior knowledge, such as facial landmark heatmaps and parsing maps, to enhance low-resolution thermal facial expression images. To emphasize the most important parts of each facial expression and to suppress irrelevant facial parts, we integrate the Convolutional Block Attention Module (CBAM) into both super-resolution architectures.
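The link between minimizing MSE and obtaining a high PSNR follows from the standard definition PSNR = 10 log10(MAX^2 / MSE): for a fixed peak value, a lower MSE always gives a higher PSNR. The sketch below illustrates this relation only. The function names and the assumption that pixel values are scaled to [0, 1] are illustrative and are not taken from this work.

```python
import numpy as np


def mse(reference, estimate):
    """Mean squared error between two images of equal shape."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    return float(np.mean((reference - estimate) ** 2))


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB.

    max_value is the largest possible pixel value; 1.0 assumes images
    scaled to [0, 1] (an assumption made for this sketch).
    """
    err = mse(reference, estimate)
    if err == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / err))
```

Because PSNR depends only on the pixel-wise error, an image can score well on it while still looking over-smoothed, which is the gap a perceptually driven adversarial model is meant to address.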
The proposed super-resolution architectures are used to enhance low-resolution thermal facial expression images obtained with three different degradation models: bi-cubic down-sampling (BI) at scales x2, x3 and x4, blurring followed by bi-cubic down-sampling (BD) at scale x3, and bi-cubic down-sampling followed by adding noise (DN) at scale x3. An ablation study shows the effectiveness of using facial prior knowledge and the attention mechanism CBAM for thermal super-resolution: when both are used, the image quality of the super-resolved images improves. Furthermore, experiments show that images enhanced by TFSRNet outperform bi-cubic interpolated images for degradation models BI x4, BD x3 and DN x3. Using these super-resolved images for thermal facial emotion recognition also increases the emotion classification accuracy. In addition, images enhanced by TFSRGAN outperform bi-cubic interpolated images for degradation model DN x3. Although this is an exploratory work with limitations, the experiments show the effectiveness of using facial prior knowledge and the attention mechanism CBAM for thermal facial expression super-resolution. In addition, thermal face super-resolution shows promising results for thermal facial emotion recognition, on which future work can build.
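The three degradation models can be sketched as follows for a single-channel image stored as a float array. This is a minimal illustration, not the pipeline used in this work: the Gaussian form of the blur and the noise, the values of `sigma` and `noise_std`, the simplified border handling, and all function names are assumptions, since the abstract does not specify them.

```python
import numpy as np


def cubic_kernel(x, a=-0.5):
    """Keys cubic convolution kernel (a = -0.5 is a common choice)."""
    x = np.abs(x)
    inner = (a + 2) * x**3 - (a + 3) * x**2 + 1
    outer = a * x**3 - 5 * a * x**2 + 8 * a * x - 4 * a
    return np.where(x <= 1, inner, np.where(x < 2, outer, 0.0))


def resize_matrix(n_in, scale):
    """(n_out, n_in) matrix that down-samples one axis by an integer scale."""
    n_out = n_in // scale
    # Centres of the output pixels in input-pixel coordinates.
    centres = (np.arange(n_out) + 0.5) * scale - 0.5
    # Dividing by scale widens the kernel, which acts as anti-aliasing.
    dist = (np.arange(n_in)[None, :] - centres[:, None]) / scale
    w = cubic_kernel(dist)
    # Renormalise rows so weights sum to 1 (also handles image borders).
    return w / w.sum(axis=1, keepdims=True)


def blur_matrix(n, sigma):
    """(n, n) matrix applying a 1-D Gaussian blur along one axis."""
    idx = np.arange(n)
    w = np.exp(-0.5 * ((idx[None, :] - idx[:, None]) / sigma) ** 2)
    return w / w.sum(axis=1, keepdims=True)


def degrade_bi(img, scale):
    """BI: bi-cubic down-sampling."""
    h, w = img.shape
    return resize_matrix(h, scale) @ img @ resize_matrix(w, scale).T


def degrade_bd(img, scale=3, sigma=1.6):
    """BD: blur, then bi-cubic down-sampling (sigma is an assumed value)."""
    h, w = img.shape
    blurred = blur_matrix(h, sigma) @ img @ blur_matrix(w, sigma).T
    return degrade_bi(blurred, scale)


def degrade_dn(img, scale=3, noise_std=0.1, rng=None):
    """DN: bi-cubic down-sampling, then additive Gaussian noise
    (noise_std is an assumed value)."""
    rng = np.random.default_rng() if rng is None else rng
    low = degrade_bi(img, scale)
    return low + rng.normal(0.0, noise_std, size=low.shape)
```

For example, `degrade_bi(img, 4)` turns a 12 x 12 array into a 3 x 3 array, while `degrade_bd(img)` and `degrade_dn(img)` both produce 4 x 4 arrays at scale x3. The order of operations is what distinguishes BD from DN: blur is applied before down-sampling, whereas noise is added afterwards at the low resolution.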