Though Foundation Models (FMs), such as Large Language Models (LLMs) and Large Vision-Language Models (LVLMs), have proven powerful in various real-world tasks, it is essential to know when users should trust their decisions in safety-critical scenarios. In this dissertation, I study the reliable decision-making of FMs from three perspectives: 1. Know when FM decisions are correct (Chapter 2), i.e., how can the correctness of model responses be identified? To achieve this, I develop advanced Uncertainty Quantification (UQ) methods for LLMs. Specifically, I study how uncertainty should be expressed in a single-turn question-answering format and how it should be propagated through multi-step agentic decision-making processes. 2. Know how to correct FM decisions (Chapter 3), i.e., how can model decisions be corrected to be truthful? To address this, I first build uncertainty-driven hallucination detection methods for FMs. I then develop intervention mechanisms that use hallucination detection as a reward signal to encourage truthful decoding. These interventions operate in either the latent subspace or the output space of FMs, and are evaluated on both visual captioning and strategic reasoning benchmarks. 3. Know when to trust FM decisions (Chapter 4), i.e., the trustworthiness of FMs. I study general trustworthiness properties of LLMs, such as toxicity, fairness, and adversarial robustness, and compare how real-world deployment techniques, e.g., model pruning and quantization, affect trustworthy behaviors. I also study membership inference privacy issues in image generation models such as Diffusion Models (DMs) and design new membership inference attacks for these models.
Details
Title: Know when to trust
Creators: Jinhao Duan
Contributors: Kaidi Xu (Advisor)
Awarding Institution: Drexel University
Degree Awarded: Doctor of Philosophy (Ph.D.)
Publisher: Drexel University; Philadelphia, Pennsylvania
Number of pages: xiv, 138 pages
Resource Type: Dissertation
Language: English
Academic Unit: Computer Science (Computing) [Historical]; College of Computing and Informatics (2013-2026); Drexel University