Machine learning models offer transformative benefits across disciplines such as medicine, chemistry, and physics. However, as these models grow in size and usage, their energy demands increase dramatically, raising sustainability concerns. Neuromorphic hardware, inspired by the energy efficiency of the human brain, seeks to address this challenge by offering low-power, fast-processing alternatives to conventional computing. A key difference between neuromorphic systems and traditional architectures is the absence of shared memory between neurons, which makes learning algorithms difficult to implement. Yet the brain learns effectively under this constraint, indicating that machine learning methods can potentially be adapted to such hardware. This research presents the design and implementation of the first fully on-chip reinforcement learning agent on Intel's Loihi 2 neuromorphic processor. The circuit consists of a fully embedded Q-learning algorithm and an on-chip simulation of the CartPole-v0 environment. The system successfully trained agents that solved the CartPole-v0 task 36% of the time. Among all agents trained on Loihi 2, the top 50% achieved an average episode reward of 193.01, close to the benchmark score of 195 required to solve the task. In comparison, a similar Q-learning algorithm implemented on a conventional Intel Core i7-10870H CPU solved the task 62% of the time, with the top 50% of agents achieving a perfect score of 200. Despite the lower success rate, Loihi 2 exhibited major advantages in efficiency. During training, its dynamic power draw was only 0.02 watts, compared to 12 watts on the CPU. Execution time per weight update was also shorter: 58.42 microseconds on Loihi 2 versus 162.54 microseconds on the CPU. Consequently, the neuromorphic system can train the same number of successful agents in only 44% of the time required by the CPU while drawing 600 times less power. This translates to a 1,365-fold increase in energy efficiency for an equivalent level of training success, and a 4,149-fold increase during inference. These findings demonstrate the viability of reinforcement learning on neuromorphic hardware, highlight its promise for building energy-efficient, real-time, embedded AI systems, and bring neuromorphic computing closer to realizing its potential as the backbone of a new, sustainable, brain-like generation of AI.
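The training-efficiency figure can be roughly cross-checked from the rounded numbers quoted in the abstract. The sketch below assumes that energy is simply dynamic power multiplied by time, and that the 44% figure is the ratio of Loihi 2 time to CPU time for an equal number of successful agents. The variable and function names are illustrative and are not taken from the dissertation.

```python
# Figures as quoted (rounded) in the abstract.
loihi_power_w = 0.02   # dynamic power during training on Loihi 2, in watts
cpu_power_w = 12.0     # dynamic power during training on the CPU, in watts
time_fraction = 0.44   # Loihi 2 time / CPU time for equal training success


def energy_efficiency_gain(p_ref, p_new, t_fraction):
    """Ratio of reference energy to new energy, assuming energy = power * time.

    t_fraction is the new system's time expressed as a fraction of the
    reference system's time.
    """
    return (p_ref / p_new) / t_fraction


power_ratio = cpu_power_w / loihi_power_w
gain = energy_efficiency_gain(cpu_power_w, loihi_power_w, time_fraction)

print(f"power ratio: {power_ratio:.0f}x")
print(f"approximate energy-efficiency gain: {gain:.0f}x")
```

With these rounded inputs the power ratio is 600 and the gain comes out at roughly 1,364, in line with the reported 1,365-fold figure. The small difference is presumably due to rounding of the quoted values. The 4,149-fold inference figure cannot be reproduced this way, since the abstract does not give the inference power or timing.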
Details
Title
All on board
Creators
Steven Christian Nesbit
Contributors
Edward Kim (Advisor)
Awarding Institution
Drexel University
Degree Awarded
Doctor of Philosophy (Ph.D.)
Publisher
Drexel University; Philadelphia, Pennsylvania
Number of pages
xi, 95 pages
Resource Type
Dissertation
Language
English
Academic Unit
Computer Science (Computing) [Historical]; College of Computing and Informatics (2013-2026); Drexel University