Redundancy in Deep Neural Networks and Its Impacts to Hardware Accelerator Design

If you have a question about this talk, please contact Robert Mullins.

Hardware systems for neural networks are both highly compute- and memory-intensive, which limits their applicability in power-constrained environments. As a result, model-level redundancy techniques such as dropout, pruning and parameter compression have been proposed to increase classification accuracy and/or reduce hardware complexity. In addition, exploiting the significant data-level redundancy of the weight parameters has consistently been shown to yield classification accuracy comparable to that of the equivalent floating-point models. There has therefore been growing recent interest in networks with low-precision weight representations, especially those using only 1 or 2 bits. Such computational structures significantly reduce compute requirements, spatial complexity and memory footprint, which ultimately improves their applicability to power-constrained application scenarios.
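To make the mention of 1- or 2-bit weight representations concrete, the sketch below is purely illustrative and not material from the talk: it binarizes a hypothetical float32 weight matrix to {-1, +1} with a single per-tensor scaling factor, in the spirit of BinaryConnect/XNOR-Net-style schemes, and compares the storage footprint. All names and array shapes are assumptions.

    import numpy as np

    # Hypothetical float32 weight matrix for a small fully connected layer
    # (shape chosen arbitrarily for illustration).
    rng = np.random.default_rng(0)
    weights_fp32 = rng.standard_normal((256, 512)).astype(np.float32)

    # 1-bit binarization: keep only the sign of each weight plus one
    # per-tensor scale (the mean absolute value).
    scale = np.abs(weights_fp32).mean()
    weights_binary = np.sign(weights_fp32)            # values in {-1, +1}
    weights_packed = np.packbits(weights_binary > 0)  # 1 bit per weight in storage

    fp32_bytes = weights_fp32.nbytes
    packed_bytes = weights_packed.nbytes + 4          # +4 bytes for the float32 scale
    print(f"float32 storage: {fp32_bytes} bytes")
    print(f"1-bit storage:   {packed_bytes} bytes (~{fp32_bytes / packed_bytes:.0f}x smaller)")

    # At inference time the dense product is approximated as
    #   y ~= scale * (x @ weights_binary),
    # which on dedicated hardware reduces multiplications to additions/subtractions
    # (or XNOR/popcount once activations are also binarized).
    x = rng.standard_normal((1, 256)).astype(np.float32)
    y_approx = scale * (x @ weights_binary)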

In this talk, these two levels of redundancy are introduced, together with their impact on hardware system design. Finally, some personal opinions on the design of efficient deep neural network acceleration systems are offered for open discussion.

This talk is part of the Computer Laboratory Computer Architecture Group Meeting series.
