Forwarding Frames Fast
Minimizing video latency for bilateral teleoperation
Abstract
The pandemic showed the power of video calls, which were used to accomplish a multitude of tasks that normally required physical attendance. Expanding on the video call’s transmission of sight and sound, haptic bilateral teleoperation aims to add the sense of touch and the ability to manipulate a remote environment as if the user were physically present. Haptic bilateral teleoperation is a technique in which a robot mimics the actions of a human operator, who in turn receives feedback from the robot’s sensors. The technique requires low-latency haptic feedback, in the range of single-digit milliseconds, which is impractical when force measurements are transmitted over a long-distance network. Instead, the low-latency requirement can be fulfilled with Model Mediated Teleoperation (MMT) to predict the user’s perceived force feedback. With the addition of force feedback, the latency of the video feedback becomes an area of greater concern. Force and video feedback in conjunction are necessary for true immersion, but using MMT to generate real-time video feedback is impractical. This work explores a configuration where MMT is used to generate haptic feedback and video feedback is transmitted from the remote environment to the user. Transmitting the video feedback is feasible because its latency requirements are more relaxed than those of force feedback. The sources of video latency can be divided into two categories: latency caused by the network transmission and latency from the non-network components. While minimizing both is valuable, any time saved on the non-network components can be spent on expanding the network and maximizing the technology’s range.
In this work, we aim to answer the question “How can we minimize the latency caused by the non-network components of a live video feedback system?” Our approach is to design a field-programmable gate array (FPGA) configuration that minimizes all latencies in a live video system, excluding those of the camera, network, and monitor components. We demonstrate a no-network live video system with 34 ms latency, of which around 5 ms originates from the FPGA. This is a marked improvement over the 63 ms latency our department achieved using a non-FPGA solution. Using cameras and monitors designed for low latency is recommended for better results, with a hypothesized latency of 11.5 ms for a high-end non-network approach.
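The latency figures above can be put side by side with a short calculation. The following is a minimal sketch, not part of the thesis: the millisecond values are those quoted in the abstract, while the function names and the fibre propagation speed of roughly 200 km per millisecond are assumptions introduced here for illustration. The distance estimate covers propagation delay only and ignores routing, queuing, and encoding.

```python
"""Latency budget arithmetic using the figures quoted in the abstract."""

# Figures taken from the abstract (milliseconds).
FPGA_SYSTEM_TOTAL_MS = 34.0  # no-network live video system with the FPGA design
FPGA_SHARE_MS = 5.0          # "around 5 ms" originating from the FPGA
PREVIOUS_TOTAL_MS = 63.0     # earlier non-FPGA solution
HIGH_END_TOTAL_MS = 11.5     # hypothesized with low-latency camera and monitor

# ASSUMPTION (not from the abstract): light in optical fibre covers
# roughly 200 km per millisecond. Real networks add further delays.
FIBRE_KM_PER_MS = 200.0


def non_fpga_latency_ms() -> float:
    """Latency left once the FPGA share is subtracted from the measured total."""
    return FPGA_SYSTEM_TOTAL_MS - FPGA_SHARE_MS


def saved_latency_ms() -> float:
    """Latency saved relative to the earlier non-FPGA solution."""
    return PREVIOUS_TOTAL_MS - FPGA_SYSTEM_TOTAL_MS


def propagation_headroom_km(saved_ms: float,
                            km_per_ms: float = FIBRE_KM_PER_MS) -> float:
    """One-way fibre distance whose propagation delay equals saved_ms.

    Hypothetical helper: an upper bound under the assumed fibre speed.
    """
    return saved_ms * km_per_ms


if __name__ == "__main__":
    saved = saved_latency_ms()
    print(f"Non-FPGA share of the 34 ms system: {non_fpga_latency_ms():.1f} ms")
    print(f"Saved versus the 63 ms system: {saved:.1f} ms "
          f"({saved / PREVIOUS_TOTAL_MS:.0%})")
    print(f"Propagation-only headroom: {propagation_headroom_km(saved):.0f} km")
    print(f"Hypothesized high-end total: {HIGH_END_TOTAL_MS:.1f} ms")
```

Under these assumptions, the time saved relative to the earlier solution corresponds to several thousand kilometres of one-way fibre propagation, which illustrates why savings on the non-network components translate into network range.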
Files
File under embargo until 29-08-2026