A human driver can gauge the intentions and signals of other road users that are indicative of their future behaviour. These intentions and signals are identified from cues originating from vulnerable road users or their surroundings (hand signals, head orientation, posture, traffic signals, distance to the curb, etc.). Taking all these cues into account by building a separate detector for each is extremely difficult. Instead, this MSc thesis explores the possibility of using the optical flow originating from a pedestrian as a generic contextual cue, combined with deep learning methods, to improve path prediction in a naturalistic driving scenario. The contribution of this work is to examine multiple ways of extracting relevant information from the optical flow, and to explore using the entire high-dimensional optical flow field with convolutions and soft-attention to identify the pixels relevant to the prediction task. This work elaborates on the extraction and processing of optical flow features. It proposes two models based on recurrent neural networks (RNNs): one operating on histogram-of-optical-flow features and the other taking dense optical flow directly. In addition, the soft-attention weights are visualized as a step towards interpretability of the RNN model that incorporates dense optical flow. In the experimental results, optical flow features yielded significant improvements in the predicted probabilistic confidence for tracks that change their motion mode. The convolution-attention RNN model, taking dense optical flow features and pedestrian positions as input, obtained the best results among all combinations of features and models compared in this work.
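To illustrate the histogram-of-optical-flow idea mentioned above, the following is a minimal sketch (not the thesis' exact feature pipeline): dense per-pixel flow vectors are binned by direction, weighted by magnitude, and normalized into a compact descriptor. The function name, bin count, and synthetic flow field are illustrative assumptions.

```python
import numpy as np

def flow_histogram(flow, n_bins=8):
    """Bin dense optical flow vectors by direction, weighted by magnitude.

    flow: (H, W, 2) array of per-pixel (dx, dy) displacements.
    Returns a normalized length-n_bins histogram, one bin per
    direction sector of width 2*pi/n_bins.
    """
    dx, dy = flow[..., 0].ravel(), flow[..., 1].ravel()
    mag = np.hypot(dx, dy)                           # per-pixel flow magnitude
    ang = np.mod(np.arctan2(dy, dx), 2 * np.pi)      # direction in [0, 2*pi)
    bins = (ang / (2 * np.pi) * n_bins).astype(int) % n_bins
    hist = np.bincount(bins, weights=mag, minlength=n_bins)
    return hist / (hist.sum() + 1e-8)                # normalize to unit sum

# Toy flow field: every pixel moves one unit to the right,
# so all mass falls into the first (rightward) direction bin.
flow = np.zeros((4, 4, 2))
flow[..., 0] = 1.0
h = flow_histogram(flow)
```

A descriptor like `h` is low-dimensional and fixed-size, which is what makes it convenient as a per-frame input to an RNN, in contrast to the dense-flow model that consumes the full (H, W, 2) field through convolutions and soft-attention.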