Know when to trust: towards reliable decision-making in large foundation models

Jinhao Duan

doi:10.17918/00011095

Dissertation

Open access

Doctor of Philosophy (Ph.D.), Drexel University

Jun 2025

Files and links (1)

Duan_Jinhao_2025 (PDF, 8.53 MB), Open Access (License Unspecified)

Abstract

Although Foundation Models (FMs), such as Large Language Models (LLMs) and Large Vision-Language Models (LVLMs), have proven powerful in various real-world tasks, it is essential to know when users should trust their decisions in safety-critical scenarios. In this dissertation, I study the reliable decision-making of FMs from three perspectives: 1. Know when FM decisions are correct (Chapter 2), i.e., how can the correctness of model responses be identified? To achieve this, I develop advanced Uncertainty Quantification (UQ) methods for LLMs. Specifically, I study how uncertainty should be expressed in a single-turn question-answering format and how it should be propagated through multi-step agentic decision-making processes. 2. Know how to correct FM decisions (Chapter 3), i.e., how can model decisions be corrected to be truthful? To address this, I first build uncertainty-driven hallucination detection methods for FMs. I then develop intervention mechanisms that use hallucination detection as a reward to encourage truthful decoding. These interventions are conducted in either the latent subspace or the output space of FMs, on both visual captioning and strategic reasoning benchmarks. 3. Know when to trust FM decisions (Chapter 4), i.e., the trustworthiness of FMs. I study general trustworthiness properties of LLMs, such as toxicity, fairness, and adversarial robustness, and compare how real-world deployment practices, e.g., model pruning and quantization, affect trustworthy behaviors. I also study membership inference privacy issues in image generation models such as Diffusion Models (DMs) and design new membership inference attacks for these models.

Details

Title: Know when to trust
Creators: Jinhao Duan
Contributors: Kaidi Xu (Advisor)
Awarding Institution: Drexel University
Degree Awarded: Doctor of Philosophy (Ph.D.)
Publisher: Drexel University; Philadelphia, Pennsylvania
Number of pages: xiv, 138 pages
Resource Type: Dissertation
Language: English
Academic Unit: Computer Science (Computing) [Historical]; College of Computing and Informatics (2013-2026); Drexel University
Other Identifier: 991022061154504721
