Diffusion; Feedback; Hypernetworks; Large language models; Low-rank adaptation
Recent advancements in Large Language Models (LLMs) have focused on scaling architectures and developing efficient fine-tuning strategies like Low-Rank Adaptation (LoRA). While traditionally applied to autoregressive models, these methods are increasingly being adapted for text-based diffusion models. Standard fine-tuning, however, applies a static set of weight modifications across all samples and generation steps. In this work, we propose a dynamic, per-step adaptation method. We introduce a hypernetwork, termed the feedbackward (FB) model, which generates unique LoRA weights for a base feedforward (FF) model at each step of the diffusion process. Inspired by feedback connections in the brain, the FB model processes a masked input to generate LoRA weights for the query, key, value, and output projections within each of the FF model's attention blocks. Notably, these weights are generated in reverse sequence, from the final block to the first, allowing high-level contextual adjustments to inform lower-level feature representations. This dynamic weight generation is hypothesized to better guide the sampling trajectory for each input, offering more nuanced control than a static set of weights. The FF model, augmented with these bespoke LoRA weights, then processes the same input to produce the output for that generation step.
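The per-step mechanism described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not specify the FB model's architecture, so everything concrete here is an assumption made for illustration: the names `init_fb_params`, `generate_lora`, and `adapted_projection`, the mean-pooled context vector, the linear heads that emit the LoRA factors, the `tanh` context update, and the toy dimensions. The LoRA scaling factor is also omitted. The sketch shows only the two ideas stated in the abstract: LoRA factors for the query, key, value, and output projections are regenerated from the masked input at every step, and they are produced from the final block back to the first.

```python
import numpy as np

D_MODEL, RANK, N_BLOCKS = 16, 2, 3   # toy sizes (assumed)
PROJECTIONS = ("q", "k", "v", "o")   # query, key, value, output


def init_fb_params(rng):
    """Hypothetical FB hypernetwork: one linear head per (block, projection)."""
    params = {}
    for block in range(N_BLOCKS):
        for name in PROJECTIONS:
            # Head maps a context vector to flattened A (RANK x D_MODEL)
            # and B (D_MODEL x RANK) factors.
            params[(block, name)] = rng.normal(0.0, 0.02, size=(D_MODEL, 2 * RANK * D_MODEL))
        # Context update applied after each block's factors are emitted (assumed).
        params[(block, "ctx")] = rng.normal(0.0, 0.1, size=(D_MODEL, D_MODEL))
    return params


def generate_lora(fb_params, masked_emb):
    """Generate LoRA factors for every block, from the last block to the first."""
    context = masked_emb.mean(axis=0)  # (seq, d) -> (d,) pooled summary of the masked input
    lora, order = {}, []
    for block in reversed(range(N_BLOCKS)):
        order.append(block)
        for name in PROJECTIONS:
            flat = context @ fb_params[(block, name)]
            A = flat[: RANK * D_MODEL].reshape(RANK, D_MODEL)
            B = flat[RANK * D_MODEL:].reshape(D_MODEL, RANK)
            lora[(block, name)] = (A, B)
        # Lower blocks are conditioned on the context that shaped higher blocks.
        context = np.tanh(context @ fb_params[(block, "ctx")])
    return lora, order


def adapted_projection(W, A, B, x):
    """Compute x @ (W + B @ A).T without materializing the full weight update."""
    return x @ W.T + (x @ A.T) @ B.T


rng = np.random.default_rng(0)
fb_params = init_fb_params(rng)
W_q = rng.normal(size=(D_MODEL, D_MODEL))  # frozen base weight of one FF projection

for step in range(2):  # two illustrative generation steps
    masked_emb = rng.normal(size=(5, D_MODEL))          # stand-in for the embedded masked input
    lora, order = generate_lora(fb_params, masked_emb)  # fresh LoRA factors for this step
    A, B = lora[(0, "q")]
    q = adapted_projection(W_q, A, B, masked_emb)       # FF projection with per-step LoRA
    print(step, order, q.shape)
```

Because the factors depend on the masked input, two different inputs (or the same sample at two different steps) yield different weight updates, in contrast to a single static LoRA that is shared across all samples and steps.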
Details
Title
On feedback connections in text-diffusion LLMs
Creators
Akramjit Singh Sandhu
Contributors
Edward Kim (Advisor)
Awarding Institution
Drexel University
Degree Awarded
Master of Science (M.S.)
Publisher
Drexel University
Number of pages
x, 57 pages
Resource Type
Thesis
Language
English
Academic Unit
College of Computing and Informatics (2013-2026); Drexel University