Journal article
Hardware-Friendly 3D CNN Acceleration With Balanced Kernel Group Sparsity
IEEE transactions on computer-aided design of integrated circuits and systems, v 43(10), pp 1-1
15 Apr 2024
Featured in Collection: UN Sustainable Development Goals @ Drexel
Abstract
Because they can extract more information than 2D Convolutional Neural Networks (CNNs), 3D CNNs play a vital role in video analysis tasks such as human action recognition, but their massive operation counts hinder real-time execution on edge devices with constrained computation and memory resources. Although various model compression techniques have been applied to accelerate 2D CNNs, few efforts have investigated hardware-friendly pruning of 3D CNNs and their acceleration on customizable edge platforms such as FPGAs. This work first proposes a kernel group row-column (KGRC) weight sparsity pattern, which is fine-grained to achieve high pruning ratios with negligible accuracy loss, and balanced across kernel groups to achieve high computation parallelism on hardware. A reweighted pruning algorithm for this sparsity is then presented and applied to 3D CNNs, followed by quantization under different precisions. Along with model compression, FPGA-based accelerators with four modes are designed to support the kernel group sparsity in multiple dimensions. The co-design framework of the pruning algorithm and the accelerator is tested on two representative 3D CNNs, namely C3D and R(2+1)D, on the Xilinx ZCU102 FPGA platform for action recognition. The experimental results indicate that the accelerator implementation with the KGRC sparsity and 8-bit quantization achieves a good balance between speedup and model accuracy, with acceleration ratios of 4.12× for C3D and 3.85× for R(2+1)D compared with the 16-bit baseline designs that support only dense models.
Details
- Title
- Hardware-Friendly 3D CNN Acceleration With Balanced Kernel Group Sparsity
- Creators
- Mengshu Sun - Beijing University of Technology; Kaidi Xu - Drexel University; Xue Lin - Northeastern University; Yongli Hu - Beijing University of Technology; Baocai Yin - Beijing University of Technology
- Publication Details
- IEEE transactions on computer-aided design of integrated circuits and systems, v 43(10), pp 1-1
- Publisher
- IEEE; PISCATAWAY
- Number of pages
- 14
- Grant note
- 2021ZD0111902 / National Key Research and Development Program of China (10.13039/501100012166). 62376014; U19B2039; U21B2038 / National Natural Science Foundation of China (10.13039/501100001809). KZ202210005008 / R&D Program of Beijing Municipal Education Commission
- Resource Type
- Journal article
- Language
- English
- Academic Unit
- Computer Science (Computing)
- Web of Science ID
- WOS:001319522900010
- Scopus ID
- 2-s2.0-85190723049
- Other Identifier
- 991021873896304721
InCites Highlights
Data related to this publication, from InCites Benchmarking & Analytics tool:
- Collaboration types
- Domestic collaboration
- International collaboration
- Web of Science research areas
- Computer Science, Hardware & Architecture
- Computer Science, Interdisciplinary Applications
- Engineering, Electrical & Electronic