Study of Knowledge Transfer Techniques For Deep Learning on Edge Devices
The purpose of this work is to provide an extensive study of the performance of knowledge transfer, in terms of both accuracy and convergence speed, across different student-teacher architectures, datasets, and techniques for transferring knowledge from teacher to student.
A substantial performance improvement is obtained by transferring knowledge from both the intermediate layers and the last layer of the teacher to a shallower student. Other architectures and transfer techniques, however, do not fare as well, and some even degrade performance. For example, a smaller and shorter network trained with knowledge transfer on Caltech 101 achieved a significant accuracy improvement of 7.36% and converged 16 times faster than the same network trained without knowledge transfer. On the other hand, a smaller network that is thinner than the teacher network performed worse, with an accuracy drop of 9.48% on Caltech 101, even with knowledge transfer.
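The best-performing setup described above combines last-layer and intermediate-layer transfer. Below is a minimal sketch of how such a combined loss is commonly formed, pairing temperature-softened soft-target distillation (Hinton et al., 2015) with a FitNets-style hint loss on an intermediate feature (Romero et al., 2015). This is an illustration under stated assumptions, not the thesis's exact method; all names and hyperparameters here (`regressor`, `T`, `alpha`, `beta`) are hypothetical.

```python
import torch.nn.functional as F

def transfer_loss(student_logits, teacher_logits,
                  student_feat, teacher_feat, regressor,
                  labels, T=4.0, alpha=0.9, beta=100.0):
    """Combined loss: cross-entropy + soft-target KD + intermediate hint.

    T, alpha, beta are illustrative hyperparameters, not values from the thesis.
    """
    # Standard supervised loss on the ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)

    # Last-layer transfer: KL divergence between temperature-softened
    # teacher and student output distributions, scaled by T^2 so gradient
    # magnitudes stay comparable across temperatures.
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits / T, dim=1),
                  reduction="batchmean") * (T * T)

    # Intermediate-layer transfer: regress the student's hidden feature,
    # projected by a small learned regressor to match the teacher's feature
    # dimensions, onto the teacher's "hint" feature.
    hint = F.mse_loss(regressor(student_feat), teacher_feat)

    return (1 - alpha) * ce + alpha * kd + beta * hint
```

In this sketch the teacher runs in inference mode to produce `teacher_logits` and `teacher_feat`, and only the student and the regressor receive gradients; the regressor is needed whenever the student's intermediate feature shape differs from the teacher's.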
- Author (aut): Sistla, Ragini
- Thesis advisor (ths): Zhao, Ming
- Committee member: Zhao, Ming
- Committee member: Li, Baoxin
- Committee member: Tong, Hanghang
- Publisher (pbl): Arizona State University