Indoor self-localization has become a highly desirable system function for smartphones. Existing systems based on imaging, radio frequency, and geomagnetic sensing can perform sub-optimally when their respective limiting factors prevail. In this paper, we present a new indoor simultaneous localization and mapping (SLAM) system based on the smartphone's built-in audio hardware and inertial measurement unit (IMU). Our system uses the smartphone's loudspeaker to emit near-inaudible chirps and its microphone to record the acoustic echoes from the indoor environment. The echoes carry the smartphone's location information with sub-meter granularity. To enable SLAM, we apply contrastive learning to train an echoic location feature (ELF) extractor, such that loop closures on the smartphone's trajectory can be accurately detected from the associated ELF trace. The detected loop closures are then used to regulate the IMU-based trajectory reconstruction. The reconstructed trajectories are used for trajectory map superimposition and room geometry reconstruction. Extensive experiments show that our SLAM achieves median localization errors of 0.1 m, 0.53 m, and 0.4 m in a living room, an office, and a shopping mall, respectively, and outperforms both Wi-Fi and geomagnetic SLAM systems. The room geometry reconstruction achieves up to 4× lower errors than the latest echo-based approaches.
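
The chirp-and-echo sensing step can be illustrated with a minimal sketch. This is not the paper's pipeline, which feeds the echoes to a learned ELF extractor; it is a plain matched filter on a simulated recording, included only to show how an echo's delay relates to a reflector's distance. The sample rate, the 18 to 20 kHz band, the chirp length, and all function names are assumptions chosen for illustration.

```python
import numpy as np

FS = 48_000                    # sample rate in Hz (assumed)
F0, F1 = 18_000.0, 20_000.0    # near-inaudible band in Hz (assumed)
DURATION = 0.002               # chirp length in seconds (assumed)
SPEED_OF_SOUND = 343.0         # m/s in air at about 20 degrees Celsius


def make_chirp(fs=FS, f0=F0, f1=F1, duration=DURATION):
    """Linear chirp from f0 to f1, tapered with a Hann window."""
    n = int(round(fs * duration))
    t = np.arange(n) / fs
    sweep_rate = (f1 - f0) / duration
    phase = 2.0 * np.pi * (f0 * t + 0.5 * sweep_rate * t**2)
    return np.sin(phase) * np.hanning(n)


def simulate_recording(chirp, delays_s, gains, n_samples, fs=FS):
    """Hypothetical microphone signal: a sum of delayed, scaled chirp copies."""
    recording = np.zeros(n_samples)
    for delay, gain in zip(delays_s, gains):
        start = int(round(delay * fs))
        recording[start:start + chirp.size] += gain * chirp
    return recording


def matched_filter(recording, chirp):
    """Magnitude of the cross-correlation between recording and chirp."""
    return np.abs(np.correlate(recording, chirp, mode="valid"))


chirp = make_chirp()
# Direct loudspeaker-to-microphone path at 0 s plus one wall echo at 10 ms.
recording = simulate_recording(chirp, [0.0, 0.010], [1.0, 0.3], n_samples=2048)
profile = matched_filter(recording, chirp)

direct = int(np.argmax(profile))                  # strongest peak: direct path
guard = direct + chirp.size                       # skip the direct-path lobe
echo = guard + int(np.argmax(profile[guard:]))    # strongest remaining peak
echo_delay = (echo - direct) / FS
distance = echo_delay * SPEED_OF_SOUND / 2.0      # halve the round trip
print(f"echo delay {echo_delay * 1e3:.1f} ms, reflector at about {distance:.2f} m")
```

In a real room the recording contains many overlapping echoes plus noise, which is one reason a learned feature extractor may be preferable to picking individual peaks.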