LanSAR – Language-commanded Scene-aware Action Response
Description
Robot motion and control remains a complex problem both in general and in the field of machine learning (ML). Without ML approaches, robot controllers are
typically designed manually, which can take considerable time, generally requires
accounting for a range of edge cases, and often produces models highly constrained
to specific tasks. ML can decrease the time it takes to create a model while simultaneously allowing it to operate on a broader range of tasks. The utilization of neural
networks to learn from demonstration is, in particular, an approach with growing
popularity due to its potential to quickly fit the parameters of a model to mimic
training data.
Many such neural networks, especially in the realm of transformer-based architectures, act more as planners, taking in an initial context and then generating a
sequence from that context one step at a time. Others hybridize the approach, predicting a latent plan and conditioning immediate actions on that plan. Such approaches may limit a model’s ability to interact with a dynamic environment, because the model must replan to fully update its understanding of the environmental context. In this
thesis, Language-commanded Scene-aware Action Response (LanSAR) is proposed as
a reactive transformer-based neural network that makes immediate decisions based
on previous actions and environmental changes. Its actions are further conditioned
on a language command, serving as a control mechanism while also narrowing the
distribution of possible actions around this command. It is shown that LanSAR successfully learns a strong representation of multimodal visual and spatial input, and
learns reasonable motions in relation to most language commands. It is also shown
that LanSAR can struggle with both the accuracy of motions and understanding the
specific semantics of language commands.
Date Created
The date the item was originally created (prior to any relationship with the ASU Digital Repositories).
2024
Agent
- Author (aut): Hardy, Adam
- Thesis advisor (ths): Ben Amor, Heni
- Committee member: Srivastava, Siddharth
- Committee member: Pavlic, Theodore
- Publisher (pbl): Arizona State University