Non-image-based sensing methods for People Counting (PC), such as RF, ultrasonic, radar, Wi-Fi, and Bluetooth sensors, have recently gained momentum as alternatives to camera-based systems because they better preserve privacy. Among them, mm-wave radars are a strong choice for data capture: they consume relatively little power and are robust to extremes of weather and environment. They also record data as a collection of points, which makes the data computationally lightweight. However, this type of data is challenging to use outdoors, where environments are complex and it becomes extremely difficult to distinguish individuals within the collection of points. As a result, PC outcomes based on radar data are inaccurate.
In this project, we therefore aimed to address these limitations of mm-wave radar data. We treated each group of people as a single unit and distinguished groups by the number of people they contain, so that people could be counted by detecting group sizes. To achieve this, we processed the raw radar output made available by the AMS Institute into a dataset suited to our chosen deep learning model. Our literature survey showed that computer vision algorithms based on Deep Neural Networks (DNNs) detect objects while balancing speed and accuracy, and that DNNs extract features more effectively than traditional Machine Learning models. Based on this survey, we selected the YOLOv8 model by Ultralytics as the most suitable model for our research problem. However, YOLOv8 accepts three-channel images and videos as input. We therefore customized the YOLOv8 model and formatted our radar data as a four-channel tensor that the customized model accepts. With this approach, we detected groups of up to four people outdoors with a detection accuracy of 79.21%.
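To illustrate what formatting radar points into a four-channel tensor might look like, the following is a minimal sketch and not the pipeline used in this work. The point layout (x, y, Doppler velocity, SNR), the choice of channels (point count, mean Doppler, mean SNR, occupancy mask), the spatial ranges, and the grid size are all assumptions made for illustration, and the function name `points_to_tensor` is hypothetical.

```python
import numpy as np


def points_to_tensor(points, x_range=(-10.0, 10.0), y_range=(0.0, 20.0),
                     grid=(64, 64)):
    """Rasterize one radar frame into a (4, H, W) tensor.

    ASSUMPTION: `points` has shape (N, 4) with columns
    x [m], y [m], Doppler velocity [m/s], SNR. The real radar output
    and the real channel definitions may differ.
    """
    h, w = grid
    tensor = np.zeros((4, h, w), dtype=np.float64)
    pts = np.asarray(points, dtype=np.float64).reshape(-1, 4)
    x, y, doppler, snr = pts.T

    # Discard points that fall outside the (assumed) field of view.
    keep = ((x >= x_range[0]) & (x < x_range[1]) &
            (y >= y_range[0]) & (y < y_range[1]))
    x, y, doppler, snr = x[keep], y[keep], doppler[keep], snr[keep]

    # Map metric coordinates to integer grid cells.
    col = ((x - x_range[0]) / (x_range[1] - x_range[0]) * w).astype(int)
    row = ((y - y_range[0]) / (y_range[1] - y_range[0]) * h).astype(int)
    col = np.minimum(col, w - 1)  # guard against rounding at the upper edge
    row = np.minimum(row, h - 1)

    # Accumulate per-cell sums; np.add.at handles repeated indices correctly.
    np.add.at(tensor[0], (row, col), 1.0)      # channel 0: point count
    np.add.at(tensor[1], (row, col), doppler)  # channel 1: Doppler sum
    np.add.at(tensor[2], (row, col), snr)      # channel 2: SNR sum

    # Turn the sums into per-cell means and build the occupancy mask.
    occupied = tensor[0] > 0
    tensor[1][occupied] /= tensor[0][occupied]
    tensor[2][occupied] /= tensor[0][occupied]
    tensor[3] = occupied                       # channel 3: occupancy (0 or 1)

    return tensor.astype(np.float32)


# Two points in the same cell and one point outside the field of view.
frame = [[0.0, 5.0, 1.0, 10.0],
         [0.1, 5.1, 3.0, 20.0],
         [100.0, 100.0, 0.0, 0.0]]
tensor = points_to_tensor(frame)
print(tensor.shape)
```

A tensor of this shape has one more channel than a standard RGB image, which is why a detector that expects three-channel input needs its input layer adapted before it can consume such data.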