Searching for objects in their surroundings is a daily challenge for blind and visually impaired (BVI) individuals. Current assistive technologies powered by large language models (LLMs) and vision language models (VLMs) can offer BVI users scene descriptions through conversation. However, such communication is often inefficient in helping BVI users reach everyday objects or destinations, because these general-purpose LLMs/VLMs are not optimized for interpreting or conveying spatial information. We developed a smart glasses solution that uses open vocabulary object detection models to help BVI users search for and reach a variety of specific objects that are not limited to the fixed categories used in model training. In our implementations, video streams from the glasses can be processed by open vocabulary object detection models either locally or on other connected devices, such as a smartphone or computer. Users can enter custom search prompts verbally. This hands-free solution allows people to scan their surroundings naturally by moving their heads, while stereo audio tones provide directional cues in the horizontal and vertical directions to help them zero in on targets and reach them accurately. We conducted a human-subject pilot study in which 5 blindfolded individuals reached for specific objects (e.g., grabbing the red bottle; reaching the empty chair) among distractors. The smart glasses solution was compared with Ray-Ban Meta glasses running the built-in Meta AI for scene recognition. The average task time with our solution (53 seconds) was significantly lower than with the Meta glasses (126 seconds, p<0.001). The device was also shown to successfully aid a blind user in a grocery shopping scenario.
This work shows that active orientation guidance, which is typically lacking in VLMs but is provided by our smart glasses solution, can aid interaction with the surrounding environment, such as reaching for objects and destinations.
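The abstract says that stereo audio tones give horizontal and vertical directional cues, but it does not say how a detection is turned into a tone. The following is only a minimal sketch of one plausible mapping, not the authors' method. It assumes that the detector returns a pixel bounding box, that horizontal offset is conveyed by left/right loudness (constant-power panning), and that vertical offset is conveyed by pitch. The function name `direction_cue` and the 300 to 1200 Hz range are hypothetical choices made for illustration.

```python
import math


def direction_cue(bbox, frame_w, frame_h, f_low=300.0, f_high=1200.0):
    """Map a detection box to a stereo cue (illustrative assumption only).

    bbox is (x_min, y_min, x_max, y_max) in pixels, with y growing downward.
    Returns (left_gain, right_gain, tone_hz).
    """
    x_min, y_min, x_max, y_max = bbox
    cx = (x_min + x_max) / 2.0
    cy = (y_min + y_max) / 2.0

    # Normalized offsets from the frame center, clamped to [-1, 1].
    # pan = +1 means far right; tilt = +1 means top of the frame.
    pan = max(-1.0, min(1.0, (cx - frame_w / 2.0) / (frame_w / 2.0)))
    tilt = max(-1.0, min(1.0, (frame_h / 2.0 - cy) / (frame_h / 2.0)))

    # Constant-power panning: equal gains when the target is centered.
    theta = (pan + 1.0) * math.pi / 4.0
    left_gain = math.cos(theta)
    right_gain = math.sin(theta)

    # Assumed pitch mapping: log-interpolate so higher targets sound higher.
    t = (tilt + 1.0) / 2.0
    tone_hz = f_low * (f_high / f_low) ** t
    return left_gain, right_gain, tone_hz
```

Under these assumptions, a target centered in the frame yields equal gains in both ears and a mid-range tone. As the wearer turns their head toward the target, the gains converge and the pitch settles, which is the kind of "zeroing in" behavior the abstract describes.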
Publication type: Journal Article
Language: English
PMID: 42266717
Singh, Aditya, et al. "Assisting the Blind to Reach Daily Objects Using Smart Glasses." Displays, vol. 93, 2026.
Singh A, Bhanushali MA, Luo J, et al. Assisting the blind to reach daily objects using smart glasses. Displays. 2026;93.
Singh, A., Bhanushali, M. A., Luo, J., Luo, G., & Pundlik, S. (2026). Assisting the blind to reach daily objects using smart glasses. Displays, 93. https://doi.org/10.1016/j.displa.2026.103390
Singh A, et al. Assisting the Blind to Reach Daily Objects Using Smart Glasses. Displays. 2026;93 PubMed PMID: 42266717.
TY - JOUR
T1 - Assisting the blind to reach daily objects using smart glasses.
AU - Singh, Aditya
AU - Bhanushali, Meet Anil
AU - Luo, Jingwu
AU - Luo, Gang
AU - Pundlik, Shrinivas
Y1 - 2026/02/12/
PY - 2026/07/01/pmc-release
PY - 2026/6/10/medline
PY - 2026/6/10/pubmed
PY - 2026/6/10/entrez
KW - activities of daily living
KW - object detection
KW - open vocabulary object detection
KW - smart glasses
KW - vision language model
KW - wearable assistive technology
JF - Displays
JO - Displays
VL - 93
SN - 0141-9382
UR - https://www.unboundmedicine.com/prime/citation/42266717/Assisting_the_blind_to_reach_daily_objects_using_smart_glasses.
DB - PRIME
DP - Unbound Medicine
ER -


