Applications of Statistical Theory to Sensor Data Analysis

Ciszewski, M.G.

Abstract

Technological progress irreversibly changes the nature of sports. In tennis, football and many other elite sports, the relevance of technology is readily apparent to most spectators. Some technologies, however, have changed sport in ways that many spectators might not be aware of. Behind any professional sport, there are countless hours of training and preparation. Athletes push their own limits in pursuit of perfection. Coaches try to ensure that training improves their athletes' performance without straining them so much that it leads to injury. Today's technology supports this training process, and coaches need to be able to use it to provide good feedback to their athletes.
This thesis is written in the context of the Citius Altius Sanius (CAS) project, which aims at injury prevention and performance improvement in sports. The CAS project combines the expertise of data scientists, industrial designers and biomechanical engineers with the resources of sports associations and sports equipment designers, among others. The goal of the CAS project is to initiate collaboration between various universities and departments to develop sensor technology, provide analysis based on the sensor data and provide clear guidelines for feedback to the athlete.
The primary goal of this thesis is to extract meaningful insights from sensor data through statistical modeling. Two sources of sensor data are used within the thesis: data from prototype sensor trousers worn by football players during training and data from a sensor sleeve worn by tennis players during serve practice. The research employs supervised learning algorithms from machine learning, including deep learning models that capture intricate patterns in the data, as well as functional data analysis techniques, such as functional principal component analysis and functional regression models, applied for imputation and dimension reduction.
We use a neural network architecture that combines convolutional and recurrent layers consistently throughout this thesis. Its main application is recognizing football-related activities from sensor data. The neural network achieves good accuracy and is easily adaptable to other human activity recognition problems. We also considered various other models for this task, but none could match the computational speed and accuracy of the neural network. Nonetheless, given the large number of methods tested and our dissatisfaction with the accuracy measures used to assess them, we introduce a novel quality measure for activity recognition problems that leverages domain knowledge when determining the accuracy of an activity recognition method. In our application, one such constraint is the length of the predicted activities: the measure accounts for the fact that activities such as jumping or passing a ball realistically have a minimum duration. Instances where a prediction model outputs an activity shorter than physically plausible incur harsh penalties.
We also propose a novel post-processing procedure tailored specifically to human activity recognition problems, ensuring that predictive models adhere to physical constraints, such as the minimum duration of activities. This post-processing method aims to increase the accuracy of prediction models that violate these constraints and, as a result, to narrow the gap in accuracy between different prediction methods.
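To make the minimum-duration constraint concrete, the following is a minimal Python sketch of one possible post-processing rule. It is not the procedure developed in the thesis: the function names, the per-activity `min_len` dictionary and the rule of merging a too-short run into its longer neighbour are all assumptions chosen for illustration.

```python
from itertools import groupby


def runs(labels):
    """Collapse a label sequence into (label, run_length) pairs."""
    return [(lab, len(list(grp))) for lab, grp in groupby(labels)]


def enforce_min_duration(labels, min_len):
    """Relabel runs shorter than min_len[label] (hypothetical rule).

    A too-short run takes the label of its longer adjacent run
    (ties go to the left neighbour). The pass repeats until no run
    violates its minimum length or only one run remains.
    """
    labels = list(labels)
    while True:
        segs = runs(labels)
        # Index of the first run that is shorter than allowed, if any.
        idx = next(
            (i for i, (lab, n) in enumerate(segs) if n < min_len.get(lab, 1)),
            None,
        )
        if idx is None or len(segs) == 1:
            return labels
        left = segs[idx - 1] if idx > 0 else None
        right = segs[idx + 1] if idx + 1 < len(segs) else None
        if right is None or (left is not None and left[1] >= right[1]):
            new_label = left[0]
        else:
            new_label = right[0]
        start = sum(n for _, n in segs[:idx])
        length = segs[idx][1]
        labels[start:start + length] = [new_label] * length


# Illustrative input: one isolated "jump" sample inside a running segment.
predicted = ["run"] * 5 + ["jump"] + ["run"] * 4 + ["pass"] * 3
cleaned = enforce_min_duration(predicted, {"run": 2, "jump": 3, "pass": 2})
print(runs(cleaned))
```

Each pass merges one violating run into a neighbour with a different label, so the number of runs strictly decreases and the loop terminates.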
In the context of tennis, we encountered difficulties in predicting serve performance metrics from sensor data. While ball speed can be predicted easily, accurately predicting the velocity-accuracy index (VA index), which combines ball speed with serve accuracy, proved more complex. To assess whether our model captures genuine signal rather than noise, we applied a permutation test. Notably, the main contribution of this research lies in the rigorous formulation of the null hypothesis for this test, which links it to established permutation test theory.
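As a rough illustration of the general idea behind such a test, the sketch below implements a generic permutation test for predictive performance in Python. It is not the formulation derived in the thesis: the mean squared error statistic, the function names and the simple null hypothesis that predictions are unrelated to the observed targets are assumptions made for this example.

```python
import random


def mse(pred, obs):
    """Mean squared error between predictions and observations."""
    return sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)


def permutation_p_value(pred, obs, n_perm=999, seed=0):
    """Generic permutation p-value for predictive error (assumed setup).

    Under the assumed null hypothesis the pairing between predictions
    and observations is arbitrary, so shuffling the observations gives
    draws from the null distribution of the error. The p-value is the
    share of shuffles with an error at least as small as the one
    actually observed, with the usual +1 correction.
    """
    rng = random.Random(seed)
    observed = mse(pred, obs)
    shuffled = list(obs)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if mse(pred, shuffled) <= observed:
            hits += 1
    return (1 + hits) / (1 + n_perm)


# Illustrative, made-up numbers: predictions that track the targets closely.
targets = [float(v) for v in range(1, 11)]
predictions = list(targets)
print(permutation_p_value(predictions, targets))
```

A small p-value suggests the model's error is lower than would be expected if its predictions carried no information about the target. A constant predictor, by contrast, has the same error under every shuffle and therefore yields a p-value of 1.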
This research contributes to the fields of sports science and data analysis by offering insights into activity recognition and performance prediction using sensor data. The methodologies developed here have potential applications in various other sports as well as in activities unrelated to sports. While the data used in this research come from wearable sensors, these models and procedures can also be applied to other types of sensor data, or even beyond.
