Trustworthy generative AI through security, safety, grounding, and verification
Dissertation (open access)


Manil Shrestha
Doctor of Philosophy (Ph.D.), Drexel University
Mar 2026
DOI: https://doi.org/10.17918/00011305

Abstract

Keywords: generative artificial intelligence, large language models, trustworthy artificial intelligence
The rapid growth of generative artificial intelligence has created opportunities across critical domains, yet fundamental trust barriers still limit deployment in high-stakes applications. The field currently faces three interconnected challenges: safety risks from agents that can be exploited for harmful purposes, privacy vulnerabilities arising from centralized inference, and unreliable hallucinated outputs. This dissertation addresses these challenges through four independent research contributions aligned with the themes of Safety, Security, Grounding, and Verification. The first contribution examines application-level safety by analyzing the behavior of LLM-powered penetration testing agents, characterizing their capabilities, limitations, and the risks associated with autonomous offensive-security tools. The second contribution investigates Secure Multi-Party Computation (SMPC) for privacy-preserving inference, demonstrating how generative models can operate over decentralized servers while protecting both sensitive data and proprietary model parameters. The third contribution focuses on grounding by coupling language models with knowledge graphs and embedding-guided graph traversal so that generated outputs remain connected to structured, verifiable information rather than relying on unconstrained text generation. The fourth contribution develops conformal prediction-based methods that provide finite-sample statistical guarantees on model outputs. By decomposing predictions into atomic statements and attaching calibrated confidence measures, this work offers a principled mechanism to quantify uncertainty, validate outputs, and enforce safety constraints during generative inference. Together, these contributions unify statistical guarantees, structured grounding, and privacy-preserving computation to support interpretable and secure deployment of generative AI in high-stakes environments.
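The abstract's fourth contribution, attaching finite-sample statistical guarantees to generated statements, is an application of conformal prediction. As a minimal illustration of the general idea (not the dissertation's actual method), the sketch below shows split conformal calibration: given nonconformity scores on a held-out calibration set, a finite-sample-corrected quantile yields a threshold for accepting atomic statements at a target error rate. The score distribution and variable names here are hypothetical.

```python
# Minimal sketch of split conformal prediction for filtering generated
# statements. Assumption: each atomic statement has a nonconformity score
# (e.g. 1 - model confidence), and scores for new statements are
# exchangeable with the calibration scores.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical calibration scores for statements with known correctness.
cal_scores = rng.uniform(0.0, 1.0, size=1000)

alpha = 0.1  # target error rate: 90% coverage guarantee
n = len(cal_scores)

# Finite-sample quantile correction: ceil((n + 1) * (1 - alpha)) / n.
q_level = np.ceil((n + 1) * (1 - alpha)) / n
threshold = np.quantile(cal_scores, q_level, method="higher")

# At inference time, accept only statements whose score is at or below
# the calibrated threshold; rejected statements are flagged as uncertain.
new_scores = rng.uniform(0.0, 1.0, size=5)
accepted = new_scores <= threshold
```

The `(n + 1)(1 - alpha) / n` correction is what makes the coverage guarantee hold for finite calibration sets rather than only asymptotically, which matches the abstract's emphasis on finite-sample guarantees.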


