Paper ID: 2407.08219

Generating Contextually-Relevant Navigation Instructions for Blind and Low Vision People

Zain Merchant, Abrar Anwar, Emily Wang, Souti Chattopadhyay, Jesse Thomason

Navigating unfamiliar environments presents significant challenges for blind and low-vision (BLV) individuals. In this work, we construct a dataset of images and goals across different scenarios, such as searching through kitchens or navigating outdoors. We then investigate how grounded instruction generation methods can provide contextually-relevant navigational guidance to users in these instances. Through a sighted user study, we demonstrate that large pretrained language models can produce correct and useful instructions perceived as beneficial for BLV users. We also conduct a survey and interviews with four BLV users, gathering useful insights into how their preferences among instructions vary by scenario.

Submitted: Jul 11, 2024