All on board: fully on-chip neuromorphic Q-learning with embedded CartPole simulation

Steven Christian Nesbit

doi:10.17918/00011148

Dissertation

Open access

Doctor of Philosophy (Ph.D.), Drexel University

Aug 2025

DOI: https://doi.org/10.17918/00011148

Files and links (21)

pdf: Nesbit_Steven_2025 (4.04 MB), PDF, Open Access (License Unspecified)

mp4: Nesbit_Steven_2025_Suppl01 (86.78 kB), Video (supplemental), Figure 2.2a, Open Access (License Unspecified)

mp4: Nesbit_Steven_2025_Suppl02 (86.13 kB), Video (supplemental), Figure 2.2b, Open Access (License Unspecified)

mp4: Nesbit_Steven_2025_Suppl03 (108.77 kB), Video (supplemental), Figure 2.2c, Open Access (License Unspecified)

mp4: Nesbit_Steven_2025_Suppl04 (112.95 kB), Video (supplemental), Figure 2.2d, Open Access (License Unspecified)

Abstract

Keywords: Energy-efficient AI; Neuromorphic computing; On-chip learning; Q-learning; Reinforcement learning; Spiking neural networks

Machine learning models offer transformative benefits across disciplines such as medicine, chemistry, and physics. However, as these models grow in size and usage, their energy demands increase dramatically, raising sustainability concerns. Neuromorphic hardware, inspired by the energy efficiency of the human brain, seeks to address this challenge by offering low-power, fast-processing alternatives to conventional computing. A key difference between neuromorphic systems and traditional architectures is the absence of shared memory between neurons, which poses a challenge to implementing learning algorithms. Yet the brain learns effectively under this constraint, indicating the potential for machine learning methods to be adapted to such hardware. This research presents the design and implementation of the first-ever fully on-chip reinforcement learning agent for Loihi 2. This circuit consists of a fully embedded Q-learning algorithm and an on-chip simulation of the CartPole-v0 environment on Intel's Loihi 2 neuromorphic processor. The system successfully trained agents that solved the CartPole-v0 task 36% of the time. Among all agents trained on Loihi 2, the top 50% achieved an average episode reward of 193.01, close to the benchmark score of 195 required to solve the task. In comparison, a similar Q-learning algorithm implemented on a conventional Intel Core i7-10870H CPU solved the task 62% of the time, with the top 50% of agents achieving a perfect score of 200. Despite the lower success rate, Loihi 2 exhibited major advantages in efficiency. During training, its dynamic power draw was only 0.02 watts, compared to 12 watts on the CPU. Execution time per weight update was also shorter: 58.42 microseconds on Loihi 2 versus 162.54 microseconds on the CPU. Consequently, the neuromorphic system can train the same number of successful agents in only 44% of the time required by the CPU and with 600 times less power. This translates to a 1,365-fold increase in energy efficiency for an equivalent level of training success, and a 4,149-fold increase during inference. These findings demonstrate the viability of reinforcement learning on neuromorphic hardware, highlight its promise for building energy-efficient, real-time, embedded AI systems, and bring neuromorphic computing closer to realizing its potential as the backbone of a new sustainable, brain-like generation of AI.
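The training-efficiency figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is not taken from the dissertation: the function and constant names are illustrative, and it assumes that energy is simply dynamic power multiplied by time. Under that assumption, the 600-fold power ratio divided by the 44% time fraction gives roughly 1,364, which agrees with the reported 1,365-fold figure once the rounding of the 44% value is taken into account. The 4,149-fold inference figure depends on measurements not given in the abstract and is not reproduced here.

```python
# Figures quoted in the abstract (names are illustrative, not from the dissertation).
LOIHI_POWER_W = 0.02        # dynamic power draw during training on Loihi 2
CPU_POWER_W = 12.0          # dynamic power draw during training on the CPU
LOIHI_UPDATE_US = 58.42     # execution time per weight update on Loihi 2
CPU_UPDATE_US = 162.54      # execution time per weight update on the CPU
LOIHI_TIME_FRACTION = 0.44  # Loihi 2 time for equal training success, relative to the CPU


def power_ratio() -> float:
    """How many times more dynamic power the CPU draws than Loihi 2."""
    return CPU_POWER_W / LOIHI_POWER_W


def per_update_speedup() -> float:
    """How many times faster a single weight update is on Loihi 2."""
    return CPU_UPDATE_US / LOIHI_UPDATE_US


def training_energy_ratio() -> float:
    """Energy ratio for equal training success, assuming energy = power x time."""
    return power_ratio() / LOIHI_TIME_FRACTION


if __name__ == "__main__":
    print(f"power ratio:           {power_ratio():.0f}x")
    print(f"per-update speedup:    {per_update_speedup():.2f}x")
    print(f"training energy ratio: {training_energy_ratio():.0f}x")
```

The per-update speedup alone does not determine the 44% figure, since the abstract does not state how many updates each platform needs to produce a successful agent; the sketch therefore takes the 44% value as given.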

Details

Title: All on board
Creators: Steven Christian Nesbit
Contributors: Edward Kim (Advisor)
Awarding Institution: Drexel University
Degree Awarded: Doctor of Philosophy (Ph.D.)
Publisher: Drexel University; Philadelphia, Pennsylvania
Number of pages: xi, 95 pages
Resource Type: Dissertation
Language: English
Academic Unit: Computer Science (Computing) [Historical]; College of Computing and Informatics (2013-2026); Drexel University
Other Identifier: 991022084155004721
