Leveraging Scalable Bayesian Inference
Massive datasets have created exciting opportunities for the scientific community while simultaneously imposing new challenges. “These new data-intensive problems are especially challenging for Bayesian methods, which typically involve intractable models that rely on computationally intensive simulations for their implementation,” says Professor Babak Shahbaba. “While simple algorithms — for example, the random walk Metropolis algorithm — might be effective at exploring low-dimensional distributions, they can be very inefficient for complex, high-dimensional distributions.” To address this issue, Professor Shahbaba and his team have been developing methods that exploit the geometric properties of the parameter space to improve the efficiency of sampling algorithms.
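As a rough illustration of the inefficiency described in the quote, the sketch below runs a plain random walk Metropolis sampler on a toy target. The target (a standard normal distribution), the fixed step size, and the function names are all assumptions chosen for the example and are not taken from Professor Shahbaba's work; the geometry-based methods his team develops are not shown here.

```python
import numpy as np


def log_target(x):
    # Log-density of a standard normal, up to a constant.
    # This toy target is an assumption made for illustration only.
    return -0.5 * np.dot(x, x)


def random_walk_metropolis(dim, n_steps=5000, step_size=1.0, seed=0):
    """Run random walk Metropolis; return the samples and the acceptance rate."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(dim)  # start from a draw of the target itself
    log_p = log_target(x)
    samples = np.empty((n_steps, dim))
    accepted = 0
    for i in range(n_steps):
        # Propose an isotropic Gaussian step around the current state.
        proposal = x + step_size * rng.standard_normal(dim)
        log_p_new = log_target(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        if np.log(rng.uniform()) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
            accepted += 1
        samples[i] = x  # a rejected proposal repeats the current state
    return samples, accepted / n_steps


if __name__ == "__main__":
    for dim in (1, 10, 100):
        _, rate = random_walk_metropolis(dim)
        print(f"dim={dim:3d}  acceptance rate={rate:.3f}")
```

With the step size held fixed, the acceptance rate should fall sharply as the dimension grows, so the chain barely moves. Shrinking the step size restores acceptance but makes each move tiny, which is the trade-off that motivates samplers that use more information about the target.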
Applying Nonparametric Bayesian Models
Another research focus for Professor Shahbaba is nonparametric Bayesian models. “While parametric models are convenient and easy to interpret, they are constrained by assumptions that rarely hold true in practice,” he explains. “Modern Bayesian nonparametric methods, such as Dirichlet process mixtures (DPM) and Gaussian process (GP) models, liberate quantitative scientists from the shortcomings of assuming simple distributional forms (such as normality) and linear relationships among variables.” In recent years, Professor Shahbaba and his students have been applying these methods to a variety of statistical problems.
Addressing Large-Scale Biological Problems
Professor Shahbaba’s methodological research is motivated mainly by applied problems, and his applied work centers on large-scale biological studies. He is currently pursuing two such projects. The first aims to develop a new class of statistical methods for answering fundamental, unresolved questions about hippocampal function. “This research can provide unprecedented insight into the neural mechanisms underlying memory impairments,” he says. The second involves developing a new data-driven framework for investigating complex biological systems in order to find novel biological phenomena and identify the rules that govern them. “More specifically,” he says, “we are using this approach to investigate hematopoiesis.”