Statistics and Data Science Seminar: Filtering (Data Assimilation) in High Dimensions: Sequential MCMC with Applications to Lagrangian Data Assimilation with Unknown Data Location

Speaker: Hamza Ruzayqat, King Abdullah University of Science and Technology

Abstract: Data assimilation, also known as filtering, is the process of integrating a mathematical model or numerical implementation of a time-dependent physical system with a sequence of observations. The objective is to combine these two sources of information to obtain a more accurate estimate of the system's true state, leading to improved real-time predictions of its future state. However, filtering high-dimensional state-space models (SSMs), especially those with a nonlinear mathematical model, presents significant challenges: analytical solutions of the filter are usually not available, and many numerical approximation methods can have a cost that scales exponentially with the dimension of the hidden state. In this talk, we present a method that utilizes sequential Markov chain Monte Carlo (SMCMC) to obtain samples from an approximation of the filtering distribution. For certain SSMs, this method is proven to converge to the true filter as the number of samples, N, tends to infinity. We benchmark our algorithms on linear Gaussian SSMs against competing ensemble methods and demonstrate a significant improvement in both execution speed and accuracy (the algorithm cost can range from O(Nd) to O(Nd[d+1]/2) depending on the structure of the model noise covariance matrix, where d is the dimension of the hidden state). We then consider an SSM with Lagrangian observations such that the spatial locations of these observations are unknown and driven by the partially observed hidden signal. This problem is exceptionally challenging as it is not only high-dimensional, but the model for the signal yields longer-range time dependencies through the observation locations. This is demonstrated through a rotating shallow water model with real data obtained from drifters in the ocean.

Biography: Dr. Hamza Ruzayqat is currently a research scientist in the Uncertainty Quantification research group at King Abdullah University of Science and Technology (KAUST), under Prof. Omar Knio. Prior to this, he held positions as a postdoctoral researcher and research scientist in the Applied and Computational Probability research group, led by Prof. Ajay Jasra, from late 2019 to December 2023. Dr. Ruzayqat obtained his PhD in Mathematics from the University of Tennessee-Knoxville, USA, in May 2019 under the supervision of Prof. Tim Schulze. His doctoral research focused on off-lattice kinetic Monte Carlo for atomic simulations. Dr. Ruzayqat's primary areas of interest are applied and computational statistics and data science. His research covers a wide range of topics, including Monte Carlo algorithms (such as sequential Monte Carlo, and sequential and regular MCMC), data assimilation (such as particle filters, ensemble methods like EnKBF, and SMCMC), Bayesian statistics, uncertainty quantification, multilevel estimation, and unbiased estimation. He has authored over 15 papers, 11 of which are published (or to appear) in esteemed journals such as QJRMS, JCP, JCTC and SISC.

Host: Xuming He

This talk will be virtual over Zoom. Please use the following link: https://wustl.zoom.us/j/99698911461