Dissertation
Multilevel probabilistic canonical correlation analysis for integrative analysis of multi-omics data with repeated measurements
Doctor of Philosophy (Ph.D.), Drexel University
Jun 2023
DOI: https://doi.org/10.17918/00001710
Abstract
Multi-omics data have been used to characterize covariation across multiple biological profiles, allowing for a more comprehensive understanding of complex biological processes. Moreover, the falling cost of high-throughput technologies has broadened the scope of multi-omics studies, enabling the collection of repeated measurements or longitudinal data. While mixed effects models are widely used in repeated-measures single-omics applications, their use in integrative analyses of multilevel structured multi-omics data is less developed. Probabilistic canonical correlation analysis (PCCA) uses probability models to jointly study the relations between two sets of data collected on the same set of samples. In this dissertation, we propose (1) multilevel probabilistic CCA, which extends PCCA to repeated measurements data to help learn the underlying shared structures between two omics data sources simultaneously at both the within- and between-subject levels, (2) sparse multilevel PCCA for variable selection and better interpretability, which imposes sparsity on the feature loadings using the adaptive lasso, and (3) sparse multilevel multiple PCCA to facilitate the integration of more than two sets of variables. We examine the operating characteristics and variable selection performance of our proposed methods and compare our approach with standard integration methods through simulation studies. Finally, we illustrate our methods with an application to real data from a study that investigated the associations between advanced colorectal adenoma, pattern recognition receptor genes (PRRs), and gut microbiota, integrating gene expression and microbiome data.
Details
- Title: Multilevel probabilistic canonical correlation analysis for integrative analysis of multi-omics data with repeated measurements
- Creators: Yuna Kim
- Contributors: Scarlett L. Bellamy (Advisor)
- Awarding Institution: Drexel University
- Degree Awarded: Doctor of Philosophy (Ph.D.)
- Publisher: Drexel University; Philadelphia, Pennsylvania
- Number of pages: ix, 110 pages
- Resource Type: Dissertation
- Language: English
- Academic Unit: Dana and David Dornsife School of Public Health; Epidemiology and Biostatistics; Drexel University
- Other Identifier: 991021047215204721