Conference proceeding
Rate-Distortion Guided Knowledge Graph Construction from Lecture Notes Using Gromov-Wasserstein Optimal Transport
IEEE International Conference on Big Data, pp 5573-5582
08 Dec 2025
Abstract
Task-oriented knowledge graphs (KGs) enable AI-powered learning assistant systems to automatically generate high-quality multiple-choice questions (MCQs). Yet converting unstructured educational materials, such as lecture notes and slides, into KGs that capture key pedagogical content remains difficult. We propose a framework for knowledge graph construction and refinement grounded in rate-distortion (RD) theory and optimal transport geometry. In the framework, lecture content is modeled as a metric-measure space, capturing semantic and relational structure, while candidate KGs are aligned using Fused Gromov-Wasserstein (FGW) couplings to quantify semantic distortion. The rate term, expressed via the size of the KG, reflects complexity and compactness. Refinement operators (add, merge, split, remove, rewire) minimize the rate-distortion Lagrangian, yielding compact, information-preserving KGs. Our prototype applied to data science lectures yields interpretable RD curves and shows that MCQs generated from refined KGs consistently surpass those from raw notes on fifteen quality criteria. This study establishes a principled foundation for information-theoretic KG optimization in personalized and AI-assisted education.
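The objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the weighting parameter `alpha`, the multiplier `beta`, and the form rate + beta * distortion are assumptions made for illustration, and the code only evaluates the standard squared-loss FGW objective for a fixed coupling rather than optimizing over couplings.

```python
from itertools import product


def fgw_cost(M, C1, C2, T, alpha=0.5):
    """Evaluate the squared-loss FGW objective for a FIXED coupling T.

    M[i][j] : feature cost between lecture item i and KG node j
    C1, C2  : pairwise distance matrices inside each space
    T[i][j] : coupling mass moved from item i to node j
    alpha   : trade-off between structure and feature terms (assumed name)
    No minimization over T is performed here.
    """
    n, m = len(C1), len(C2)
    # Feature (Wasserstein-like) term: sum_ij M_ij * T_ij
    feature = sum(M[i][j] * T[i][j] for i in range(n) for j in range(m))
    # Structure (Gromov-Wasserstein-like) term:
    # sum_ijkl (C1_ik - C2_jl)^2 * T_ij * T_kl
    structure = sum(
        (C1[i][k] - C2[j][l]) ** 2 * T[i][j] * T[k][l]
        for i, k in product(range(n), repeat=2)
        for j, l in product(range(m), repeat=2)
    )
    return (1 - alpha) * feature + alpha * structure


def product_coupling(p, q):
    """Independent coupling p q^T: always feasible, so its cost is an upper bound."""
    return [[pi * qj for qj in q] for pi in p]


def rd_lagrangian(rate, distortion, beta):
    """Hypothetical Lagrangian: rate (e.g. KG size) plus beta times distortion."""
    return rate + beta * distortion


# Toy example: two lecture items and a two-node candidate KG.
C1 = [[0.0, 1.0], [1.0, 0.0]]
C2 = [[0.0, 1.0], [1.0, 0.0]]
M = [[0.0, 1.0], [1.0, 0.0]]
matched = [[0.5, 0.0], [0.0, 0.5]]              # item i sent entirely to node i
spread = product_coupling([0.5, 0.5], [0.5, 0.5])

d_matched = fgw_cost(M, C1, C2, matched)
d_spread = fgw_cost(M, C1, C2, spread)
# Rate taken here as node count plus edge count (2 nodes, 1 edge), an assumption.
score = rd_lagrangian(rate=3, distortion=d_matched, beta=2.0)
```

In this toy case the matched coupling preserves both features and pairwise distances, so its distortion is lower than that of the spread coupling. A refinement step in the spirit of the abstract would compare such scores before and after applying an operator such as merge or remove, keeping the change only if the score decreases.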
Details
- Title
- Rate-Distortion Guided Knowledge Graph Construction from Lecture Notes Using Gromov-Wasserstein Optimal Transport
- Creators
- Yuan An - Drexel University; Ruhma Hashmi - Drexel University; Michelle Rogers - Drexel University; Jane Greenberg - Drexel University; Brian K Smith - Boston College
- Publication Details
- IEEE International Conference on Big Data, pp 5573-5582
- Conference
- 2025 IEEE International Conference on Big Data (BigData) (Macau, China, 08 Dec 2025–10 Dec 2025)
- Publisher
- IEEE
- Resource Type
- Conference proceeding
- Language
- English
- Academic Unit
- Information Science; College of Computing and Informatics
- Other Identifier
- 991022167011204721