Modern Sensory Substitution for Vision in Dynamic Environments
Societal infrastructure is built with vision at the forefront of daily life. For those with
severe visual impairments, this creates countless barriers to participating in and
enjoying life’s opportunities. Technological progress has been both a blessing and
a curse in this regard. Digital text, together with screen readers and refreshable Braille
displays, has made whole libraries readily accessible, and ridesharing technology has made
independent mobility more attainable. At the same time, screen-based interactions and
experiences have only grown in pervasiveness and importance, excluding many
people with visual impairments.
Sensory Substitution, the process of substituting an unavailable sensory modality with
another one, has shown promise as an alternative to accommodation, but in recent
years meaningful strides in Sensory Substitution for vision have become less frequent.
Given recent advances in Computer Vision, this stagnation is especially disconcerting.
Designing Sensory Substitution Devices (SSDs) for vision that leverage modern Computer
Vision techniques and work in interactive settings presents a variety of challenges,
including perceptual bandwidth, human-computer interaction, and person-centered
machine learning considerations. To surmount these barriers, an approach called
Personal Foveated Haptic Gaze (PFHG) is introduced. PFHG consists of two primary
components. The first, Foveated Haptic Gaze (FHG), is an interaction paradigm inspired
by the human visual system that is intuitive and flexible enough to generalize to a
variety of applications. The second is a person-centered learning component that
addresses the expressivity limitations of most SSDs. This component, One-Shot Object
Detection by Data Augmentation (1SODDA), is a one-shot object detection approach that
lets a user specify the objects they are interested in locating visually and, with
minimal effort, obtain an object detection model that locates them effectively.
The Personal Foveated Haptic Gaze framework was realized in one virtual and one
real-world application: playing a 3D, interactive, first-person video game (DOOM) and
finding user-specified real-world objects. User study results found Foveated Haptic
Gaze to be an effective and intuitive interface for interacting with a dynamic visual
world using solely haptics. Additionally, 1SODDA achieves performance competitive with
both few-shot object detection methods and high-framerate many-shot object
detectors. Together, these results pave the way for modern Sensory Substitution
Devices for vision.