Bayesian Inference in Multiway Graphical Factor Models with Correlated Residuals
Complex scientific problems routinely produce high-dimensional, multiway data. Statistical analysis and inference for such data have recently seen increased interest, as technological and statistical advances help speed up computation or recast the problem into known statistical frameworks. One particular area of interest has been the identification and interpretation of structured and unstructured multivariate associations among observed variables. Such problems are relevant in a multitude of fields, including biology, finance, and psychology, and have been tackled in various statistical frameworks, including graphical models and factor models. The overarching goal of this thesis is to bridge the gap between these two mainstream classes of statistical models by developing hybrid models that borrow key features from both. Additionally, we show that these hybrid models can be used to analyze multiway data by borrowing information across possibly related groups.
In the first project, we propose an efficient Bayesian framework for inference and model determination for single factor models with correlated residuals (SFMCR), in which a Gaussian graphical model (GGM) represents the distribution of the residuals. These models were introduced by Stanghellini (1997) and Vicard (2000), but have not received much attention in the literature since. We motivate the development of SFMCRs through the analysis of multiway datasets. Many scientific fields collect related but distinct datasets to examine associations between random variables across experimental conditions, time, and space. GGMs have been used to study the dependence structure in multiway data, but most methods lack the flexibility to account for unobserved confounders. We show that SFMCRs can efficiently capture structured dependence across datasets through the latent factors, in addition to associations expressed through graphical models that are specific to each dataset. We show that the dataset-specific graphs can be further modeled to encourage the presence of joint edges. We exemplify the use of our methodology in simulated and real-world examples.
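To make the model structure concrete, the following is a minimal simulation sketch of a single factor model whose residuals follow a Gaussian graphical model. It is an illustration under stated assumptions, not the inference procedure developed in the thesis: the factor is assumed to have unit variance, the residual graph is encoded by zeros in a residual precision matrix, and the function names, loadings, and matrix values are all hypothetical.

```python
import numpy as np


def sfmcr_covariance(loadings, residual_precision):
    """Implied marginal covariance: loadings loadings^T + inverse residual precision.

    Assumes a single latent factor with unit variance (an illustrative choice).
    """
    loadings = np.asarray(loadings, dtype=float)
    residual_cov = np.linalg.inv(residual_precision)
    return np.outer(loadings, loadings) + residual_cov


def simulate_sfmcr(n, loadings, residual_precision, rng):
    """Draw n observations y = loadings * factor + residual."""
    loadings = np.asarray(loadings, dtype=float)
    p = loadings.shape[0]
    residual_cov = np.linalg.inv(residual_precision)
    factor = rng.standard_normal(n)  # one latent factor score per observation
    residuals = rng.multivariate_normal(np.zeros(p), residual_cov, size=n)
    return factor[:, None] * loadings[None, :] + residuals


# Hypothetical values: four variables, residual graph is the chain 1-2-3-4.
# A zero in the precision matrix means no edge between those two residuals.
loadings = np.array([0.9, 0.7, 0.5, 0.3])
residual_precision = np.array([
    [2.0, 0.6, 0.0, 0.0],
    [0.6, 2.0, 0.6, 0.0],
    [0.0, 0.6, 2.0, 0.6],
    [0.0, 0.0, 0.6, 2.0],
])

rng = np.random.default_rng(0)
y = simulate_sfmcr(5000, loadings, residual_precision, rng)
implied = sfmcr_covariance(loadings, residual_precision)
sample = np.cov(y, rowvar=False)
```

In this sketch the latent factor induces dependence among all variables, while the sparse residual precision matrix adds associations only along the edges of the residual graph. Comparing `sample` with `implied` gives a quick check that the simulated data match the assumed covariance decomposition.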
In the second project, we extend SFMCRs to capture the dynamics of graphs that characterize multivariate dependence in multiway data collected at different time points. In the third project, we explore how SFMCRs can be extended to the problem of changepoint detection.