Efficient Context Retention in LLMs: Enhancing In-Context Memorization as an Alternative

Bansari Patel; Edward Kim

doi:10.1609/aaaiss.v7i1.36933

Journal article

Open access

Proceedings of the AAAI Symposium Series, v 7(1), p 566

23 Nov 2025

DOI: https://doi.org/10.1609/aaaiss.v7i1.36933

Files and links (1)

https://doi.org/10.1609/aaaiss.v7i1.36933

Published, Version of Record (VoR) Open

Abstract

Large Language Models (LLMs) are widely used for tasks requiring contextual understanding; however, their reliance on large context windows introduces significant computational overhead, because the cost of the transformer's self-attention grows quadratically with sequence length. This inefficiency is a critical barrier to deployment in resource-constrained settings such as rural healthcare, where processing longitudinal patient data from Electronic Health Records (EHRs) is essential. To address this barrier, our research investigates an alternative paradigm: training lightweight, specialized models for complete knowledge internalization, enabling them to function as persistent and efficient knowledge bases on local hardware. Our methodology involves training a 12-layer, 124-million-parameter nanoGPT model de novo on specialized subsets of the MMLU benchmark, including domains relevant to healthcare. The training objective was explicitly data internalization, not generalization. The entire domain-specific corpus, consisting of over 250,000 tokens formatted for a question-and-answer recall task, was used for training until the model reached near-zero training loss. Performance was then evaluated on the model's ability to reproduce answers exactly from a "seen" validation set, with recall certainty quantified via softmax probabilities. The resulting models successfully internalized their respective knowledge domains, achieving near-100% accuracy on recall tasks with high confidence scores. This outcome shows that targeted training for memorization can produce reliable and computationally efficient expert agents. For rural health, this approach offers a practical alternative to large context windows, enabling the deployment of a fleet of specialized models on local hardware for tasks such as patient history recall or clinical guideline retrieval. This drastically reduces computational cost and latency, providing a scalable solution that does not require continuous, high-bandwidth cloud access.
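The abstract says recall certainty was quantified via softmax probabilities but does not say how those probabilities were aggregated. The sketch below is one plausible reading, not the authors' procedure: it assumes per-step logits are already available for each answer position, counts an answer as recalled only if the highest-probability token matches the reference at every position, and reports the mean probability assigned to the reference tokens as a confidence score. The function names `softmax` and `recall_certainty`, the toy four-token vocabulary, and the mean-probability aggregation are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Convert a list of raw logits into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def recall_certainty(step_logits, target_ids):
    """Score one answer against its reference token ids.

    step_logits: one list of vocabulary logits per answer position.
    target_ids:  the reference token id expected at each position.

    Returns (exact_match, per_token_probs, mean_prob), where exact_match is
    True only if the most probable token equals the reference at every
    position. Averaging the probabilities is an assumed aggregation; the
    abstract does not specify one.
    """
    if not target_ids or len(step_logits) != len(target_ids):
        raise ValueError("need one logit vector per target token")

    per_token_probs = []
    exact_match = True
    for logits, target in zip(step_logits, target_ids):
        probs = softmax(logits)
        per_token_probs.append(probs[target])
        predicted = max(range(len(probs)), key=probs.__getitem__)
        if predicted != target:
            exact_match = False

    mean_prob = sum(per_token_probs) / len(per_token_probs)
    return exact_match, per_token_probs, mean_prob


# Toy example with a hypothetical 4-token vocabulary and a 3-token answer.
toy_logits = [
    [0.0, 0.0, 10.0, 0.0],
    [8.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 9.0],
]
toy_targets = [2, 0, 3]
matched, token_probs, confidence = recall_certainty(toy_logits, toy_targets)
print(matched, round(confidence, 4))
```

In a memorization setting like the one described, a model that has internalized its corpus would be expected to place nearly all probability mass on the reference token at each step, so both the exact-match flag and the mean probability should be high on "seen" items.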

Details

Title: Efficient Context Retention in LLMs: Enhancing In-Context Memorization as an Alternative
Creators: Bansari Patel - Drexel University; Edward Kim - Drexel University
Publication Details: Proceedings of the AAAI Symposium Series, v 7(1), p 566
Publisher: Association for the Advancement of Artificial Intelligence
Number of pages: 1
Resource Type: Journal article
Language: English
Academic Unit: Computer Science (Computing)
Other Identifier: 991022138183104721
