Human Pose Estimation using Millimeter Wave radars has emerged as a promising alternative to traditional camera-based systems, addressing privacy and deployment constraints. While state-of-the-art Deep Learning models predominantly focus on spatial feature extraction to determine the positions of key points in the human body, this research investigates the effect of incorporating temporal dynamics into such models. It modifies an existing state-of-the-art spatial model to account for temporal dynamics and compares the performance of the two models. Long Short-Term Memory networks are used to capture temporal dependencies between frames of point clouds, which significantly improves the precision of key point detection. The proposed temporal model achieves a 53% reduction in Mean Absolute Error and a 45% reduction in Root Mean Squared Error compared to the state-of-the-art spatial model. Moreover, these improvements were achieved with a less complex model architecture and similar training times. The robustness of the model was further validated on a different dataset, showing its potential for broad application in fields such as healthcare, sports analysis, traffic monitoring, and robotics. This study underscores the value of temporal dynamics in pose estimation and shows the advantages of accounting for temporal dependencies when evaluating more complex movements.
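For readers unfamiliar with the reported metrics, the following minimal sketch shows one common way to compute Mean Absolute Error, Root Mean Squared Error, and the relative reduction between a baseline and a proposed model. It is an illustration only: the function names and the toy coordinate values are hypothetical and are not taken from this study's code or data.

```python
import math

def mae(pred, true):
    """Mean Absolute Error over flat lists of key point coordinates."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)

def rmse(pred, true):
    """Root Mean Squared Error over flat lists of key point coordinates."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))

def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` error is lower than `baseline` error."""
    return 100.0 * (baseline - proposed) / baseline

# Hypothetical toy values (NOT results from the study): ground-truth
# coordinates and predictions from a spatial and a temporal model.
true_coords = [0.0, 1.0, 2.0, 3.0]
spatial_pred = [0.4, 1.4, 1.6, 3.4]   # each coordinate is off by about 0.4
temporal_pred = [0.2, 1.2, 1.8, 3.2]  # each coordinate is off by about 0.2

mae_reduction = relative_reduction(mae(spatial_pred, true_coords),
                                   mae(temporal_pred, true_coords))
rmse_reduction = relative_reduction(rmse(spatial_pred, true_coords),
                                    rmse(temporal_pred, true_coords))
print(f"MAE reduction:  {mae_reduction:.1f}%")
print(f"RMSE reduction: {rmse_reduction:.1f}%")
```

In this toy case the temporal predictions halve every error, so both reductions come out at roughly 50%. With real predictions the two percentages generally differ, because RMSE weights large errors more heavily than MAE.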