Generative artificial intelligence; Large language models; Trustworthy artificial intelligence
The rapid growth of generative artificial intelligence has created opportunities across critical domains, yet fundamental trust barriers still limit deployment in high-stakes applications. The field currently faces three interconnected challenges: safety risks from agents that can be exploited for harmful purposes, privacy vulnerabilities arising from centralized inference, and unreliable hallucinated outputs. This dissertation addresses these challenges through four independent research contributions aligned with the themes of Safety, Security, Grounding, and Verification. The first contribution examines application-level safety by analyzing the behavior of LLM-powered penetration testing agents, characterizing their capabilities, limitations, and the risks associated with autonomous offensive-security tools. The second contribution investigates Secure Multi-Party Computation (SMPC) for privacy-preserving inference, demonstrating how generative models can operate over decentralized servers while protecting both sensitive data and proprietary model parameters. The third contribution focuses on grounding by coupling language models with knowledge graphs and embedding-guided graph traversal so that generated outputs remain connected to structured, verifiable information rather than relying on unconstrained text generation. The fourth contribution develops conformal prediction-based methods that provide finite-sample statistical guarantees on model outputs. By decomposing predictions into atomic statements and attaching calibrated confidence measures, this work offers a principled mechanism to quantify uncertainty, validate outputs, and enforce safety constraints during generative inference. Together, these contributions unify statistical guarantees, structured grounding, and privacy-preserving computation to support interpretable and secure deployment of generative AI in high-stakes environments.
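To illustrate the kind of finite-sample guarantee the fourth contribution refers to, the following is a minimal sketch of generic split conformal calibration applied to atomic statements. It is not the dissertation's method. The function names, the nonconformity scores, and the example statements are all hypothetical, and the sketch assumes each atomic statement already comes with a score in which lower means more trustworthy. If calibration and test scores are exchangeable, a new score falls at or below the returned threshold with probability at least 1 - alpha.

```python
import math


def conformal_threshold(calibration_scores, alpha):
    """Return the split-conformal threshold for miscoverage level alpha.

    Hypothetical helper: calibration_scores are nonconformity scores computed
    on held-out examples (lower = more trustworthy).
    """
    n = len(calibration_scores)
    if n == 0:
        raise ValueError("need at least one calibration score")
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Finite-sample correction: use the ceil((n + 1)(1 - alpha))-th smallest score.
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points to certify this level; accept everything.
        return math.inf
    return sorted(calibration_scores)[rank - 1]


def filter_statements(statements, threshold):
    """Keep the text of atomic statements whose score is within the threshold."""
    return [text for text, score in statements if score <= threshold]


# Illustrative, made-up numbers only.
calibration = [0.05, 0.10, 0.20, 0.30, 0.40, 0.60, 0.90]
threshold = conformal_threshold(calibration, alpha=0.25)
statements = [("statement A", 0.10), ("statement B", 0.70)]
kept = filter_statements(statements, threshold)
```

With seven calibration scores and alpha = 0.25, the rank is ceil(8 × 0.75) = 6, so the threshold is the sixth smallest score. When the calibration set is too small for the requested level, the sketch returns an infinite threshold rather than claiming a guarantee it cannot support.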
Details
Title: Trustworthy generative AI through security, safety, grounding, and verification
Creators: Manil Shrestha
Contributors: Edward Kim (Advisor)
Awarding Institution: Drexel University
Degree Awarded: Doctor of Philosophy (Ph.D.)
Publisher: Drexel University
Number of pages: xvi, 106 pages
Resource Type: Dissertation
Language: English
Academic Unit: Computer Science (Computing) (2013-2026); College of Computing and Informatics (2013-2026); Drexel University