Type
Text | Thesis
Advisor
Ferdman, Michael | Honarmand, Nima | Berg, Alex | Samaras, Dimitris
Date
2015-12-01
Keywords
Convolutional Neural Network, Deep Learning, FPGA, High Level Synthesis | Computer science
Department
Department of Computer Science.
Language
en_US
Source
This work is sponsored by the Stony Brook University Graduate School in compliance with the requirements for completion of the degree.
Identifier
http://hdl.handle.net/11401/77267
Publisher
The Graduate School, Stony Brook University: Stony Brook, NY.
Format
application/pdf
Abstract
Deep convolutional neural networks (CNNs) are rapidly becoming the dominant approach to computer vision and a major component of many other pervasive machine learning tasks, such as speech recognition, natural language processing, and fraud detection. As research and development of CNNs progresses, the size of the networks grows, leading to large increases in the computation and bandwidth required to evaluate these networks. Typical CNNs in use today already exceed the capabilities of general-purpose CPUs, resulting in rapid adoption and active research of CNN hardware accelerators such as GPUs, FPGAs, and ASICs. In this work, we develop a novel CNN accelerator architecture and design methodology that breaks away from the commonly accepted practice of processing the networks layer by layer. By modifying the order in which the original input data are brought on chip, changing it to a pyramid-shaped multi-layer sliding window, our architecture enables effective on-chip caching during CNN evaluation. The caching in turn reduces the off-chip memory bandwidth requirements, which are a primary bottleneck in many CNN environments. | 54 pages
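The fused-layer idea the abstract describes can be illustrated with a small sketch. The following C++ snippet is not taken from the thesis; the names (conv3x3, fused_output_pixel) and the two-layer 3x3 setup are illustrative assumptions. It computes one output pixel of the second of two 3x3 convolution layers directly from the 5x5 "pyramid" input window that pixel depends on, keeping the intermediate layer-1 tile in a small local buffer (the on-chip cache analog) rather than round-tripping it through off-chip memory.

// Minimal sketch of fused-layer evaluation, assuming two 3x3 convolution
// layers. One layer-2 output pixel depends on a 5x5 window of the original
// input; the 3x3 layer-1 tile it needs never leaves the local buffer.
#include <array>
#include <cstdio>

using Win5 = std::array<std::array<float, 5>, 5>;  // 5x5 input window
using Ker3 = std::array<std::array<float, 3>, 3>;  // 3x3 kernel

// One 3x3 convolution output anchored at (r, c) of a 5x5 buffer.
static float conv3x3(const Win5& in, const Ker3& k, int r, int c) {
    float acc = 0.0f;
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            acc += in[r + i][c + j] * k[i][j];
    return acc;
}

// Fused evaluation: layer 1 produces only the 3x3 intermediate tile that
// layer 2 needs; that tile lives in `mid` (on-chip analog), not in DRAM.
static float fused_output_pixel(const Win5& in, const Ker3& k1, const Ker3& k2) {
    Win5 mid{};  // only mid[0..2][0..2] is used: the layer-1 tile
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            mid[r][c] = conv3x3(in, k1, r, c);
    return conv3x3(mid, k2, 0, 0);
}

int main() {
    Win5 input{};
    for (int r = 0; r < 5; ++r)
        for (int c = 0; c < 5; ++c)
            input[r][c] = static_cast<float>(r * 5 + c);
    Ker3 blur{};  // uniform averaging kernel, purely for demonstration
    for (auto& row : blur) row.fill(1.0f / 9.0f);
    std::printf("fused layer-2 pixel: %f\n", fused_output_pixel(input, blur, blur));
    return 0;
}

Sliding this 5x5 window across the input evaluates both layers in one pass; the bandwidth saving comes from never writing the intermediate feature map off chip.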
Recommended Citation
Alwani, Manoj, "Fused Convolutional Neural Network Accelerators" (2015). Stony Brook Theses and Dissertations Collection, 2006-2020 (closed to submissions). 3088.
https://commons.library.stonybrook.edu/stony-brook-theses-and-dissertations-collection/3088