Multilevel probabilistic canonical correlation analysis for integrative analysis of multi-omics data with repeated measurements
Dissertation   Open access


Yuna Kim
Doctor of Philosophy (Ph.D.), Drexel University
Jun 2023
DOI: https://doi.org/10.17918/00001710

Abstract

Multi-omics data; Repeated measurements
Multi-omics data have been used to characterize covariation in multiple biological profiles, allowing for a more comprehensive understanding of complex biological processes. Moreover, the falling cost of high-throughput technologies has broadened the scope of multi-omics studies, enabling the collection of repeated measurements or longitudinal data. While mixed effects models are widely used in single-omics applications with repeated measures, their use in integrative analyses of multilevel-structured multi-omics data is less developed. Probabilistic canonical correlation analysis (PCCA) uses a probability model to jointly study the relations between two sets of data collected on the same set of samples. In this dissertation, we propose (1) multilevel probabilistic CCA, which extends PCCA to repeated measurements data to help learn the underlying shared structures between two omics data sources simultaneously at both the within- and between-subject levels; (2) sparse multilevel PCCA, which imposes sparsity on the feature loadings using the adaptive lasso for variable selection and better interpretability; and (3) sparse multilevel multiple PCCA, which facilitates the integration of more than two sets of variables. We examine the operating characteristics and variable selection performance of our proposed methods and compare our approach with standard integration methods through simulation studies. Finally, we illustrate our methods by integrating gene expression and microbiome data from a study that investigated the associations between advanced colorectal adenoma, pattern recognition receptor genes (PRRs), and gut microbiota.
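
For readers unfamiliar with PCCA, the standard two-view latent variable model that the abstract builds on can be written as below. The notation (z, W, mu, Psi, d, and the superscripts B and W) is assumed here for illustration and is not taken from the dissertation. The last equation is only one plausible way a between- and within-subject split could look, included as a sketch of the idea rather than as the author's actual model.

```latex
% Standard PCCA for sample i and data sets m = 1, 2 (notation assumed):
% a shared d-dimensional latent variable drives both sets of features.
\begin{align}
  z_i &\sim \mathcal{N}(0, I_d), \\
  x_i^{(m)} \mid z_i &\sim \mathcal{N}\!\left(W_m z_i + \mu_m,\; \Psi_m\right),
    \qquad m = 1, 2, \\
  \operatorname{Cov}\!\left(x_i^{(1)}, x_i^{(2)}\right) &= W_1 W_2^{\top}.
\end{align}

% Illustrative assumption only (not the dissertation's specification):
% for subject i and repeated measurement j, separate latent variables
% could capture between-subject (B) and within-subject (W) shared structure.
\begin{equation}
  x_{ij}^{(m)} = \mu_m + W_m^{B} z_i^{B} + W_m^{W} z_{ij}^{W} + \varepsilon_{ij}^{(m)},
  \qquad \varepsilon_{ij}^{(m)} \sim \mathcal{N}(0, \Psi_m).
\end{equation}
```

In the standard model, the two data sets are conditionally independent given the shared latent variable, so all cross-covariance between them flows through the loading matrices. Sparsity penalties of the kind the abstract mentions would act on the entries of those loadings.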

