Beyond Cross-Entropy: Discounted Least Information Theory of Entropy (DLITE) Loss and the Impact of Loss Functions on AI-Driven Named Entity Recognition
Journal article | Open access | Peer reviewed

Sonia Pascua, Michael Pan and Weimao Ke
Information (Basel), v 16(9), 760
02 Sep 2025
Featured in Collection: Research Supported by Drexel Libraries' OA Programs
URL: https://doi.org/10.3390/info16090760
Published, Version of Record (VoR) | Open Access Discount via Drexel Libraries Read and Publish Program 2025 | CC BY V4.0 | Open

Abstract

Keywords: DLITE Loss; named entity recognition; loss functions; information theory; transformer models; information-theoretic optimization; entropy-aware learning; recall optimization; model uncertainty; noisy datasets
Loss functions play a significant role in shaping model behavior in machine learning, yet their design implications remain underexplored in natural language processing tasks such as Named Entity Recognition (NER). This study investigates the performance and optimization behavior of five loss functions—L1, L2, Cross-Entropy (CE), KL Divergence (KL), and the proposed DLITE (Discounted Least Information Theory of Entropy) Loss—within transformer-based NER models. DLITE introduces a bounded, entropy-discounting approach to penalization, prioritizing recall and training stability, especially under noisy or imbalanced data conditions. We conducted empirical evaluations across three benchmark NER datasets: Basic NER, CoNLL-2003, and the Broad Twitter Corpus. While CE and KL achieved the highest weighted F1-scores in clean datasets, DLITE Loss demonstrated distinct advantages in macro recall, precision–recall balance, and convergence stability—particularly in noisy environments. Our findings suggest that the choice of loss function should align with application-specific priorities, such as minimizing false negatives or managing uncertainty. DLITE adds a new dimension to model design by enabling more measured predictions, making it a valuable alternative in high-stakes or real-world NLP deployments.
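
To make the comparison concrete, the four standard losses named in the abstract (L1, L2, CE, and KL) can be sketched for a single token's class distribution. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the summation over classes, the `eps` smoothing term, and the toy three-class example are all hypothetical. DLITE Loss is left out because the abstract does not state its formula.

```python
import math

# Illustrative sketch only: each loss compares a target distribution p
# with a predicted distribution q over the entity classes of one token.
# Summing over classes and the eps smoothing are assumptions.

def l1_loss(p, q):
    """Sum of absolute differences between target and prediction."""
    return sum(abs(pi - qi) for pi, qi in zip(p, q))

def l2_loss(p, q):
    """Sum of squared differences between target and prediction."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

def cross_entropy(p, q, eps=1e-12):
    """Cross-entropy H(p, q) = -sum_i p_i * log(q_i)."""
    return -sum(pi * math.log(qi + eps) for pi, qi in zip(p, q))

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) = sum_i p_i * log(p_i / q_i), skipping terms with p_i = 0."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )

# Hypothetical example: the true class is the second of three labels.
target = [0.0, 1.0, 0.0]
predicted = [0.1, 0.7, 0.2]

print("L1:", l1_loss(target, predicted))
print("L2:", l2_loss(target, predicted))
print("CE:", cross_entropy(target, predicted))
print("KL:", kl_divergence(target, predicted))
```

With a one-hot target the entropy of the target is zero, so CE and KL coincide in this toy case. Note also that L1 and L2 stay bounded as the predicted probability of the true class approaches zero, whereas CE and KL grow without limit.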

Details

UN Sustainable Development Goals (SDGs)

This publication has contributed to the advancement of the following goals:

#3 Good Health and Well-Being
#16 Peace, Justice and Strong Institutions

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Collaboration types: Domestic collaboration
Web of Science research areas: Computer Science, Information Systems