NVIDIA’s Deep Learning Accelerator at GitHub

NVIDIA has open-sourced their “Deep Learning Accelerator” (NVDLA), available on GitHub. It comes with the whole package:

  • Synthesizable RTL
  • Synthesis scripts
  • Verification testbench
  • C-model (to be released)
  • Documentation
  • Linux drivers

There appear to be no strings attached in terms of licensing or patent grants: anyone can integrate it into a commercial product, sell the product, and owe nothing to NVIDIA.

NVIDIA wants to continue NVDLA development in public, via GitHub community contribution.

Architecture-wise, NVDLA appears to be a convolution accelerator:

  • Input data streams from memory, via the “Memory interface block” and the “Convolution buffer” (4Kb..32Kb), into the “Convolution core”
  • The “Convolution core” is a “wide MAC pipeline”
  • Followed by “Activation engine”
  • Followed by “Pooling engine”
  • Followed by “Local response normalization” block
  • Followed by “Reshape” block
  • Finally, the output streams back to the “Memory interface block”
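
To make the stage order above concrete, here is a minimal software sketch of the same dataflow in plain Python. It is only a toy model under stated assumptions: the function names, the ReLU activation, the 2x2 max pooling, and the LRN constants are illustrative choices, not NVDLA register names, defaults, or APIs.

```python
# Toy model of the stage order: convolution -> activation -> pooling
# -> local response normalization -> reshape.
# Every name and constant below is an illustrative assumption.

def convolve2d(image, kernel):
    """Valid (no padding) sliding-window multiply-accumulate, as used in CNNs."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for y in range(len(image) - kh + 1):
        row = []
        for x in range(len(image[0]) - kw + 1):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += image[y + i][x + j] * kernel[i][j]  # one MAC
            row.append(acc)
        out.append(row)
    return out

def conv_core(image, kernels):
    """One input channel, several kernels: one output channel per kernel."""
    return [convolve2d(image, k) for k in kernels]

def activation(channels):
    """ReLU, chosen here only as a common example of an activation."""
    return [[[max(0.0, v) for v in row] for row in ch] for ch in channels]

def pooling(channels, size=2):
    """Non-overlapping max pooling over size x size windows."""
    out = []
    for ch in channels:
        pooled = []
        for y in range(0, len(ch) - size + 1, size):
            pooled.append([
                max(ch[y + i][x + j] for i in range(size) for j in range(size))
                for x in range(0, len(ch[0]) - size + 1, size)
            ])
        out.append(pooled)
    return out

def lrn(channels, n=3, k=1.0, alpha=1e-4, beta=0.75):
    """Cross-channel local response normalization over a window of n channels."""
    c = len(channels)
    out = []
    for ci in range(c):
        lo, hi = max(0, ci - n // 2), min(c, ci + n // 2 + 1)
        out.append([
            [channels[ci][y][x]
             / (k + alpha * sum(channels[cj][y][x] ** 2 for cj in range(lo, hi))) ** beta
             for x in range(len(channels[ci][0]))]
            for y in range(len(channels[ci]))
        ])
    return out

def reshape(channels):
    """Flatten channel-major into one output stream."""
    return [v for ch in channels for row in ch for v in row]

def run_pipeline(image, kernels):
    return reshape(lrn(pooling(activation(conv_core(image, kernels)))))

image = [[1.0] * 5 for _ in range(5)]
kernels = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
result = run_pipeline(image, kernels)
print(len(result))
```

In hardware these stages would run as a streaming pipeline rather than as sequential function calls; the sketch only shows the order in which data is transformed.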

The architecture is configurable via RTL synthesis parameters and supports:

  • Data type choice of Binary, INT4, INT8, INT16, INT32, FP16, FP32, FP64
  • Winograd convolution
  • Sparse compression for both weights and feature data to reduce memory storage and bandwidth, especially useful for fully-connected layers
  • Second memory interface for on-chip buffering, to increase bandwidth and reduce latency compared with DRAM access
  • Batching, ranging 1..32 samples
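
For readers unfamiliar with the Winograd convolution listed above, here is a minimal sketch of the smallest textbook case, F(2,3): two outputs of a 3-tap filter computed with 4 multiplications instead of 6. The source does not say which Winograd tile sizes NVDLA implements, so this is only an illustration of the general technique; the function names are made up for the example.

```python
# Winograd minimal filtering F(2,3): 2 outputs of a 3-tap filter
# from 4 inputs, using 4 multiplies instead of the direct method's 6.
# Illustrative only; not taken from the NVDLA RTL.

def direct_f2_3(d, g):
    """Reference: direct sliding-window computation (6 multiplies)."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]

def winograd_f2_3(d, g):
    """Same result using 4 multiplies.

    The filter-side terms (g0 + g1 + g2) / 2 and (g0 - g1 + g2) / 2 depend
    only on the weights, so they can be precomputed once per filter.
    """
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    gs = (g0 + g1 + g2) / 2  # precomputable filter transform
    gd = (g0 - g1 + g2) / 2  # precomputable filter transform
    m1 = (d0 - d2) * g0
    m2 = (d1 + d2) * gs
    m3 = (d2 - d1) * gd
    m4 = (d1 - d3) * g2
    return [m1 + m2 + m3, m2 - m3 - m4]

d = [1.0, 2.0, 3.0, 4.0]
g = [0.5, 1.0, 0.25]
print(direct_f2_3(d, g), winograd_f2_3(d, g))
```

The saving comes at the cost of extra additions and a transformed weight set, which is why it is attractive in a MAC-bound accelerator.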
