Dynamic inter-node ILP based multi-scratchpad management for deep learning accelerators

Tajung Jang

doi:10.17918/00001796

Thesis

Open access

Master of Science (M.S.), Drexel University

02 Aug 2023

DOI:

https://doi.org/10.17918/00001796

Abstract

Deep learning (Machine learning)

Integer Linear Programming

Scratchpad

Scratchpad management

Integer Programming

The landscape of deep learning compiler frameworks has evolved rapidly with the development of various tools, such as TVM, deeptools, TensorFlow, DLVM, nGraph, and Glow. These frameworks offer unique optimizations to address computation and data movement challenges in deep learning accelerators (DLAs), including graph- or IR-level optimizations for intra-node memory access, operator fusion, and various tiling techniques. Despite their different approaches, these frameworks concentrate primarily on node-level optimizations, which improve the performance of executing a scheduled kernel operation in the graph, and overlook the potential for inter-node data reuse within on-chip memory resources. OnSRAM, a scratchpad management framework built to work with deep learning compilers, addresses this gap by focusing on inter-node scratchpad management in DLAs. OnSRAM exploits the static graph representations of deep learning models by identifying data structures that can be pinned to on-chip memory based on their reuse rate and their cost of transfer from main memory. OnSRAM has been implemented and evaluated on a single DLA that contains a monolithic scratchpad and is integrated as part of a custom deep learning compiler framework. In this work, we extend the capabilities of OnSRAM by introducing an optimal dynamic scratchpad allocation for static graph execution models that supports any number of scratchpads, using Integer Linear Programming (ILP) to optimize an accurate cost model of data transfers. This enhancement allows more holistic control over on-chip memory resources than OnSRAM's heuristic approach, providing increased flexibility and adaptability to better accommodate diverse deep learning accelerators and memory access patterns. By optimizing inter-node data movement and storage across multiple scratchpads, our approach further reduces the energy consumption and latency associated with inter-node communication.
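
To make the idea of ILP-based allocation across several scratchpads concrete, the following is a minimal sketch of a 0-1 assignment problem of this general kind. It is not the thesis's formulation. The tensor names, sizes, live ranges, reuse counts, capacities, and the cost model (bytes saved times reuse count) are all hypothetical, and the tiny instance is solved by exhaustive enumeration rather than by an ILP solver. Each tensor is either left in main memory (`None`) or pinned to exactly one scratchpad, and the summed size of the tensors live on a scratchpad may not exceed its capacity at any schedule step.

```python
from itertools import product

# Hypothetical toy instance (not taken from the thesis):
# (name, size_bytes, first_step, last_step, reuse_count)
TENSORS = [
    ("act0", 4, 0, 2, 2),
    ("w1",   6, 1, 3, 1),
    ("act1", 3, 2, 4, 3),
    ("w2",   5, 3, 4, 1),
]
CAPACITY = [8, 6]  # assumed capacity of each scratchpad, in bytes
DRAM_COST = 1      # assumed cost per byte of one avoided main-memory fetch


def feasible(assign):
    """Capacity constraint: at every schedule step, the tensors live on a
    scratchpad must fit within that scratchpad's capacity."""
    steps = range(max(t[3] for t in TENSORS) + 1)
    for pad, cap in enumerate(CAPACITY):
        for step in steps:
            used = sum(
                size
                for (_, size, lo, hi, _), a in zip(TENSORS, assign)
                if a == pad and lo <= step <= hi
            )
            if used > cap:
                return False
    return True


def saved_cost(assign):
    """Objective: transfer cost avoided by the pinned tensors."""
    return sum(
        size * reuse * DRAM_COST
        for (_, size, _, _, reuse), a in zip(TENSORS, assign)
        if a is not None
    )


def solve():
    """Enumerate every assignment of tensors to {None, pad 0, pad 1, ...}
    and return the feasible one with the largest saved cost."""
    choices = [None] + list(range(len(CAPACITY)))
    candidates = product(choices, repeat=len(TENSORS))
    best = max((a for a in candidates if feasible(a)), key=saved_cost)
    return best, saved_cost(best)


if __name__ == "__main__":
    assignment, cost = solve()
    for (name, *_), pad in zip(TENSORS, assignment):
        print(name, "-> main memory" if pad is None else f"-> scratchpad {pad}")
    print("saved cost:", cost)
```

On this toy instance all four tensors can be pinned only if `w1` occupies the smaller scratchpad by itself while the other three share the larger one, a placement that a greedy per-tensor heuristic could miss. Enumeration grows exponentially with the number of tensors, so a realistic model would hand the same variables and constraints to an ILP solver. The sketch also ignores address assignment and fragmentation within a scratchpad.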

Details

Title: Dynamic inter-node ILP based multi-scratchpad management for deep learning accelerators
Creators: Tajung Jang
Contributors: Anup Das (Advisor)
Awarding Institution: Drexel University
Degree Awarded: Master of Science (M.S.)
Publisher: Drexel University; Philadelphia, Pennsylvania
Number of pages: vii, 43 pages
Resource Type: Thesis
Language: English
Academic Unit: College of Engineering (1970-2026); Electrical (and Computer) Engineering (1970-2026); Drexel University
Other Identifier: 991021227814204721
