Full metadata
Title
Differentiable Harvard Machine Architecture with Neural Network Controller
Description
There have been multiple attempts to couple neural networks with external memory components for sequence-learning problems. Such architectures have demonstrated success in algorithmic, sequence-transduction, question-answering, and reinforcement-learning tasks. The most notable of these is the Neural Turing Machine (NTM), an implementation of the Turing machine with a neural network controller that interacts with a continuous memory. Although the architecture is Turing-complete and hence universally computational, it has seen limited success on complex real-world tasks.
In this thesis, I introduce an extension of the Neural Turing Machine, the Neural Harvard Machine, which implements a fully differentiable Harvard Machine framework with a feed-forward neural network controller. Unlike the NTM, it has two separate memories: a read-only program memory and a read-write data memory. A sufficiently complex task is divided into smaller, simpler sub-tasks, and the program memory stores the parameters of networks pre-trained on these sub-tasks. The controller reads inputs from an input tape, uses the data memory to store useful signals, and writes the correct symbols to an output tape. The output symbols are a function of the outputs of the sub-networks and the state of the data memory; hence, the controller learns to load the weights of the appropriate program network to generate the output symbols.
A wide range of experiments demonstrates that the Harvard Machine framework learns faster and performs better than the NTM and recurrent networks such as the LSTM as task complexity increases.
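A minimal sketch of one forward step of such an architecture, assuming soft (softmax-weighted) attention over both the program-memory slots and the data-memory addresses; the class, parameter names, and dimensions below are hypothetical illustrations and are not taken from the thesis.

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class NeuralHarvardMachineSketch:
    def __init__(self, n_programs, in_dim, out_dim, n_data_slots, slot_dim, seed=0):
        rng = np.random.default_rng(seed)
        # Read-only program memory: one weight matrix per pre-trained sub-network.
        # In the thesis these would be frozen parameters of the sub-task networks;
        # random values stand in for them here.
        self.program_memory = rng.normal(size=(n_programs, out_dim, in_dim))
        # Read-write data memory, addressed by soft attention.
        self.data_memory = np.zeros((n_data_slots, slot_dim))
        ctrl_in = in_dim + slot_dim
        # Feed-forward controller: one linear head per decision (shapes hypothetical).
        self.W_prog = 0.1 * rng.normal(size=(n_programs, ctrl_in))
        self.W_addr = 0.1 * rng.normal(size=(n_data_slots, ctrl_in))
        self.W_write = 0.1 * rng.normal(size=(slot_dim, ctrl_in))
        self.W_out = 0.1 * rng.normal(size=(out_dim, out_dim + slot_dim))
        self.prev_read = np.zeros(slot_dim)

    def step(self, x):
        """Consume one input-tape symbol x, emit one output-tape symbol."""
        ctrl = np.concatenate([x, self.prev_read])
        # Differentiable program selection: a softmax over program-memory slots
        # yields a convex combination of the pre-trained weight matrices.
        prog_attn = softmax(self.W_prog @ ctrl)
        loaded_weights = np.tensordot(prog_attn, self.program_memory, axes=1)
        sub_out = np.tanh(loaded_weights @ x)
        # Soft addressing of the data memory: read a slot, then add a write vector.
        addr = softmax(self.W_addr @ ctrl)
        read = addr @ self.data_memory
        self.data_memory = self.data_memory + np.outer(addr, np.tanh(self.W_write @ ctrl))
        self.prev_read = read
        # Output symbol is a function of the sub-network output and the memory read.
        return self.W_out @ np.concatenate([sub_out, read])

# Example: feed one one-hot input symbol through a single step.
nhm = NeuralHarvardMachineSketch(n_programs=3, in_dim=8, out_dim=8,
                                 n_data_slots=16, slot_dim=8)
y = nhm.step(np.eye(8)[2])   # y has shape (out_dim,)

Because program selection and memory addressing are both softmax-weighted, every operation in the step is differentiable, so a controller of this kind can be trained end-to-end by backpropagation while the pre-trained program-memory weights stay frozen.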
Date Created
2020
Contributors
- Bhatt, Manthan Bharat (Author)
- Ben Amor, Hani (Thesis advisor)
- Zhang, Yu (Committee member)
- Yang, Yezhou (Committee member)
- Arizona State University (Publisher)
Resource Type
Extent
51 pages
Language
eng
Copyright Statement
In Copyright
Primary Member of
Peer-reviewed
No
Open Access
No
Handle
https://hdl.handle.net/2286/R.I.57204
Level of coding
minimal
Note
Master's Thesis, Computer Science, 2020
System Created
- 2020-06-01 08:19:58
System Modified
- 2021-08-26 09:47:01