NA³Os
Neural Approximate Accelerator Architecture Optimization for DNN Inference on Lightweight FPGAs
Embedded Machine Learning (ML) is a fast-growing field that comprises
ML algorithms, hardware, and software capable of analyzing sensor data
on-device at extremely low power, thus enabling many always-on and
battery-powered applications and services. Running ML-based applications
on embedded edge devices is attracting considerable research and
business interest for many reasons, including accessibility, privacy,
latency, cost, and security. Embedded ML is primarily represented by
artificial intelligence (AI) at the edge (EdgeAI) and on tiny,
ultra-resource-constrained devices, also known as TinyML. TinyML demands
energy efficiency and low latency while keeping accuracy at acceptable
levels, which mandates optimization of both the software and the
hardware stack.
GPUs are the
default platform for DNN training workloads because of the high
parallelism offered by their massive number of processing cores.
However, GPUs are often not an optimal solution for accelerating DNN
inference because of their high energy cost and lack of
reconfigurability, especially for highly sparse models or customized
architectures. Field Programmable Gate Arrays (FPGAs), on the other
hand, can offer potentially lower latency and higher efficiency than
GPUs while providing high customization as well as faster
time-to-market and a potentially longer useful life than ASIC
solutions.
In the context of TinyML, NA³Os focuses on neural approximate
accelerator-architecture co-search specifically targeting lightweight
FPGA devices. The project investigates design techniques to optimally
and automatically map DNNs to resource-constrained FPGAs while
exploiting principles of approximate computing. Our particular topics of
investigation include:
- Efficient mapping of DNN operations onto approximate hardware components (e.g., multipliers, adders, DSP Blocks, BRAMs).
- Techniques for fast and automated design space exploration of mappings of DNNs defined by a set of approximate operators and a set of FPGA platform constraints.
- Investigation of a hardware-aware neural architecture co-search methodology targeting FPGA-based DNN accelerators.
- Evaluation of robustness vs. energy efficiency tradeoffs.
- Experimental evaluation of all developed methods by providing a proper synthesis path and comparing the quality of the generated solutions with state-of-the-art solutions.
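To make the first two topics more concrete, the following minimal Python sketch shows what a design space exploration over approximate operators might look like in its simplest form. It is not the project's method: the truncation-based multiplier `approx_mul`, the 8-bit unsigned operand width, the quadratic `toy_cost` model, and the exhaustive `explore` search are all assumptions chosen for illustration. A real flow would take resource and accuracy figures from synthesis results and DNN evaluation, and would need a faster search than brute force.

```python
import itertools

BITS = 8  # assumed unsigned operand width for this sketch


def approx_mul(a, b, k):
    """Hypothetical approximate multiplier: zero the k least-significant
    bits of each operand, then multiply exactly."""
    mask = ~((1 << k) - 1)
    return (a & mask) * (b & mask)


def mean_abs_error(k):
    """Mean absolute error of approx_mul over all 8-bit operand pairs."""
    n = 1 << BITS
    total = 0
    for a in range(n):
        for b in range(n):
            total += abs(a * b - approx_mul(a, b, k))
    return total / (n * n)


def toy_cost(k):
    """Toy resource model (an assumption, not measured data): cost grows
    with the square of the remaining operand width."""
    return (BITS - k) ** 2


def explore(num_layers, choices, budget):
    """Exhaustively pick one truncation level per layer so that the summed
    toy cost stays within the budget and the summed error is minimal.
    Returns (assignment, error, cost), or None if nothing fits."""
    errors = {k: mean_abs_error(k) for k in choices}
    best = None
    for assignment in itertools.product(choices, repeat=num_layers):
        cost = sum(toy_cost(k) for k in assignment)
        if cost > budget:
            continue  # violates the platform constraint
        err = sum(errors[k] for k in assignment)
        if best is None or err < best[1]:
            best = (assignment, err, cost)
    return best


if __name__ == "__main__":
    print(explore(num_layers=3, choices=(0, 2, 4), budget=150))
```

Summing per-operator errors is only a crude proxy for DNN accuracy loss, and the search space grows exponentially with the number of layers, which is why fast and automated exploration techniques are a research topic in their own right.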
Publications
- Sabih M., Hannig F., Teich J.:
Fault-Tolerant Low-Precision DNNs using Explainable AI
Workshop on Dependable and Secure Machine Learning (DSML) (Virtual Workshop, June 21, 2021 - June 24, 2021)
In: 2021 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W) 2021
DOI: 10.1109/DSN-W52860.2021.00036
URL: https://ieeexplore.ieee.org/document/9502445/
- Sabih M., Yayla M., Hannig F., Teich J., Chen JJ.:
Robust and Tiny Binary Neural Networks using Gradient-based Explainability Methods
EuroMLSys '23: Proceedings of the 3rd Workshop on Machine Learning and Systems (Rome, Italy, May 8, 2023)
In: Eiko Yoneki, Luigi Nardi (Eds.): EuroMLSys '23: Proceedings of the 3rd Workshop on Machine Learning and Systems, New York, NY, United States: 2023
DOI: 10.1145/3578356.3592595
URL: https://dl.acm.org/doi/10.1145/3578356.3592595
- Sabih M., Sesli B., Hannig F., Teich J.:
Accelerating DNNs using Weight Clustering on RISC-V Custom Functional Units
Conference on Design, Automation and Test in Europe (DATE) (Valencia, March 25, 2024 - March 27, 2024)
In: Proceedings of the Conference on Design, Automation and Test in Europe (DATE) 2024