Eye Tracking-Based Desktop Activity Recognition with Conventional Machine Learning

Abstract

Cognitive processes have been used in recent years for context sensing, with promising results. Multiple feature sets have shown good performance, but no single set has been established as the best for classifying gaze data. This paper examines different feature sets and the heterogeneity of gaze signals across subjects and hardware to determine what affects classifier performance and which configuration returns the best results. These results are compared with deep learning classifiers trained on the same data set to determine which approach performs better.
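To make the notion of fixation and saccade feature sets concrete, the sketch below computes a few common statistics of each event type from a raw gaze trace. The simple velocity-threshold (I-VT) event detection, the 30 deg/s threshold, and the chosen statistics are illustrative assumptions, not the exact pipeline used in this work.

```python
import numpy as np

def gaze_features(x, y, t, vel_thresh=30.0):
    """Split a gaze trace into fixations and saccades with a toy
    velocity-threshold (I-VT) detector and summarise each event type.
    x, y are gaze coordinates in degrees, t is time in seconds; the
    30 deg/s threshold is an illustrative assumption."""
    vel = np.hypot(np.diff(x), np.diff(y)) / np.diff(t)  # sample-to-sample velocity
    is_sacc = vel > vel_thresh

    # Group consecutive same-state samples into event segments.
    edges = np.flatnonzero(np.diff(is_sacc.astype(int))) + 1
    segments = np.split(np.arange(len(vel)), edges)

    fix_durs, sacc_amps, sacc_peaks = [], [], []
    for seg in segments:
        if is_sacc[seg[0]]:  # saccade segment
            sacc_amps.append(np.hypot(x[seg[-1] + 1] - x[seg[0]],
                                      y[seg[-1] + 1] - y[seg[0]]))
            sacc_peaks.append(vel[seg].max())
        else:                # fixation segment
            fix_durs.append(t[seg[-1] + 1] - t[seg[0]])

    return {
        "fixation_count": len(fix_durs),
        "mean_fixation_duration": float(np.mean(fix_durs)) if fix_durs else 0.0,
        "mean_saccade_amplitude": float(np.mean(sacc_amps)) if sacc_amps else 0.0,
        "peak_saccade_velocity": float(max(sacc_peaks)) if sacc_peaks else 0.0,
    }

# Minimal synthetic trace: one abrupt 5-degree gaze shift.
t = np.linspace(0, 1, 101)
x = np.where(t < 0.5, 0.0, 5.0)
y = np.zeros_like(t)
print(gaze_features(x, y, t))
```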

Among the feature sets, saccade features have a strong positive influence on accuracy (88%), whereas fixation features show a significantly lower ability to classify correctly (63%); a combination of selected fixation and saccade features yields the best results (95% accuracy). The way the data is split has a large impact on performance: splitting the data on every activity gives an accuracy of 95%, while splitting on subjects reaches a maximum of only 60%. Deep learning algorithms perform only slightly better, at 97% accuracy, but drop massively (to 38%) when the data is split over subjects.
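The gap between the two evaluation protocols can be illustrated with standard scikit-learn tooling: the sketch below contrasts a random per-window split with a subject-wise split via GroupShuffleSplit. The synthetic data and the RandomForestClassifier are placeholder assumptions; the classifiers and features used in this research may differ.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupShuffleSplit, train_test_split

rng = np.random.default_rng(0)
# Placeholder data: 600 feature windows, 10 gaze features,
# 6 activity labels, recorded from 12 subjects.
X = rng.normal(size=(600, 10))
y = rng.integers(0, 6, size=600)
subjects = np.repeat(np.arange(12), 50)

clf = RandomForestClassifier(random_state=0)

# Protocol 1: random split -- windows from the same subject can
# appear in both train and test.
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)
print("random split:", clf.fit(Xtr, ytr).score(Xte, yte))

# Protocol 2: subject-wise split -- all windows from a held-out
# subject stay together in the test set.
gss = GroupShuffleSplit(n_splits=1, test_size=0.3, random_state=0)
tr, te = next(gss.split(X, y, groups=subjects))
print("subject split:", clf.fit(X[tr], y[tr]).score(X[te], y[te]))
```

Because the subject-wise protocol never lets windows from the same person appear on both sides of the split, it exposes the subject bias that inflates randomly split results.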

The main conclusions of this research concern feature selection and subject bias. Saccade features have the greatest impact on activity recognition from eye tracking data. Each subject performs each task in a significantly different way, which drastically decreases performance when data from entirely new subjects is tested on a trained classifier. Deep learning classifiers show similar results and confirm the importance of the heterogeneity of the data. The evaluation of different types of hardware could not be completed in this research due to time constraints.