Course:CPSC522/SMC for PGMs
Sequential Monte Carlo for Probabilistic Graphical Models
Twisted sequential Monte Carlo (tSMC) is a general framework for adaptive sampling-based inference. In the context of probabilistic graphical models (PGMs), it enables the combination of fast (but biased) deterministic approximations with asymptotically consistent (but expensive) stochastic inference.
Principal Author: Boyan Beronov
Collaborators:
Abstract
Probabilistic graphical models (PGMs) are a broad class of statistical models employed in all branches of science. Their static and explicit independence structure has allowed for the development of several powerful inference algorithms. Since in all but a limited subclass of models, exact inference is computationally intractable, approximate inference algorithms have been crucial for their adoption in applied domains. This article presents a line of work which combines two classes of such algorithms in a mutually beneficial way:
 deterministic, biased but efficient approximations to exact inference algorithms which make use of the explicit PGM structure, and
 adaptive versions of the stochastic, asymptotically consistent but slow methods from the sequential Monte Carlo (SMC) family.
Builds on
All concepts in this article are built on the fundamental basis of probability theory. The probabilistic model representations employed here are graphical models, with notable subclasses being Markov networks and Bayesian networks. The stochastic approximate inference algorithms rest on the theory of Markov chain Monte Carlo and sequential Monte Carlo; see also the references therein.
Related Pages
Being a general probabilistic inference method, SMC relates to many topics in the category of Probability and Graphical Models. See the sections below on deterministic and stochastic approximate inference for references to related algorithms.
Background
This section introduces the necessary technical context for the line of research represented by Naesseth et al. (2014) ^{[1]} and Lindsten et al. (2018) ^{[2]}.
Probabilistic Graphical Models
(Un)Directed Graphical Models
Main articles: Graphical models, Markov networks, Bayesian networks
For example, the joint density of a Bayesian network over variables $x_1, \dots, x_V$ is defined by construction as a product of local conditional densities:

$p(x_{1:V}) = \prod_{v=1}^{V} p(x_v \mid x_{\mathrm{pa}(v)}),$

where $\mathrm{pa}(v)$ denotes the parents of node $v$ in the directed acyclic graph.
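As a minimal, self-contained illustration of this factorization, consider a hypothetical rain/sprinkler/wet-grass network (all variable names and probabilities below are invented for this sketch, not taken from the article):

```python
# Conditional probability tables of a tiny Bayesian network:
# rain and sprinkler are root nodes; wet_grass has both as parents.
p_rain = {True: 0.2, False: 0.8}
p_sprinkler = {True: 0.1, False: 0.9}
p_wet = {  # keyed by (rain, sprinkler)
    (True, True): {True: 0.99, False: 0.01},
    (True, False): {True: 0.90, False: 0.10},
    (False, True): {True: 0.80, False: 0.20},
    (False, False): {True: 0.05, False: 0.95},
}

def joint(rain, sprinkler, wet):
    """Joint density as the product of p(x_v | parents(v)) over all nodes."""
    return p_rain[rain] * p_sprinkler[sprinkler] * p_wet[(rain, sprinkler)][wet]

# By construction, the joint sums to 1 over all assignments.
total = sum(joint(r, s, w) for r in (True, False)
            for s in (True, False) for w in (True, False))
```

This makes the "by construction" point concrete: normalization is inherited from the local conditional tables, with no global normalization constant required.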
Factor Graphs
Main article: Factor graph
The joint density of any PGM can be equivalently represented as a bipartite factor graph with edges only connecting factors with variables:

$p(x_{1:V}) \propto \prod_{j=1}^{J} f_j(x_{I_j}),$

where $I_j$ is the index set of variables adjacent to factor $f_j$.
Deterministic Approximate Inference
Main article: Inference in graphical models
There exist exact inference algorithms for many restricted classes of PGMs, including variable elimination, weighted model counting, belief propagation and junction trees. They usually require restricted graph connectivity, discrete random variables with finite support, or conjugate priors. In the general case, however, exact inference is intractable, and hence numerous deterministic approximations have been devised and successfully employed in various applications. The following subsections list several important methods.
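As a minimal illustration of exact inference by variable elimination mentioned above, the following sketch computes the normalization constant of a binary chain with two pairwise factors by summing out one variable at a time instead of enumerating all joint states (the factor values are invented for this sketch):

```python
import itertools

# Pairwise factors on the binary chain x1 - x2 - x3.
f1 = [[2.0, 0.5], [0.5, 2.0]]   # f1[x1][x2]
f2 = [[1.5, 1.0], [1.0, 1.5]]   # f2[x2][x3]

# Eliminate x1: message m12[x2] = sum_{x1} f1[x1][x2]
m12 = [sum(f1[a][b] for a in (0, 1)) for b in (0, 1)]
# Eliminate x3: message m32[x2] = sum_{x3} f2[x2][x3]
m32 = [sum(f2[b][c] for c in (0, 1)) for b in (0, 1)]
# Combine the messages at x2: Z = sum_{x2} m12[x2] * m32[x2]
Z = sum(m12[b] * m32[b] for b in (0, 1))

# Brute-force check over all 2^3 assignments gives the same value.
Z_brute = sum(f1[a][b] * f2[b][c]
              for a, b, c in itertools.product((0, 1), repeat=3))
```

On a chain, elimination costs grow linearly in the number of variables, whereas enumeration grows exponentially; this is the tractability gap that motivates the approximate methods below for general graphs.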
Laplace Approximation
Main article: Laplace's method
(Loopy) Belief Propagation
Main article: Belief propagation
Expectation Propagation
Main article: Expectation propagation
Variational Inference
Main article: Variational inference
Stochastic Approximate Inference: Feynman-Kac Models
Importance Sampling
Main article: Importance sampling
When the expectation of a quantity of interest $\varphi$ is to be computed under a distribution $\pi$ from which exact sampling is intractable, importance sampling allows the derivation of an unbiased estimator via a proposal distribution $q$ on the same space. If the latter admits efficient sampling and $\pi \ll q$ (absolute continuity), the likelihood ratio (Radon-Nikodym derivative) $w = \mathrm{d}\pi / \mathrm{d}q$ exists and can be used to reweight the samples obtained from $q$:

$\mathbb{E}_{\pi}[\varphi(X)] = \mathbb{E}_{q}[\varphi(X)\, w(X)] \approx \frac{1}{N} \sum_{i=1}^{N} \varphi(x^{i})\, w(x^{i}), \qquad x^{i} \sim q.$

The identity simply observes $\mathrm{d}\pi = w \,\mathrm{d}q$.
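A minimal sketch of this reweighting in Python, using illustrative choices not taken from the article (a standard normal target with a wider normal proposal):

```python
import math
import random

def log_normal_pdf(x, mu, sigma):
    """Log density of N(mu, sigma^2)."""
    return -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu)**2 / (2 * sigma**2)

random.seed(0)
N = 100_000
xs = [random.gauss(0.0, 2.0) for _ in range(N)]   # draws from proposal q = N(0, 4)
# Importance weights w = pi(x) / q(x), the Radon-Nikodym derivative,
# for the target pi = N(0, 1).
ws = [math.exp(log_normal_pdf(x, 0, 1) - log_normal_pdf(x, 0, 2)) for x in xs]
Z = sum(ws)                                        # ~ N, since E_q[w] = 1
estimate = sum(w * x * x for w, x in zip(ws, xs)) / Z   # ~ E_pi[X^2] = 1
```

Choosing a proposal wider than the target keeps the weights bounded; the self-normalized form used here also applies when the target is only known up to a constant, in which case the mean weight estimates that constant.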
Sequential Monte Carlo
Main article: SMC samplers
^{[3]}^{[4]}^{[5]}
Twisted Sequential Monte Carlo
^{[6]}^{[7]}^{[8]}
Twisting functions $\psi_t$ reweight the intermediate SMC targets with a lookahead factor,

$\gamma^{\psi}_t(x_{1:t}) = \gamma_t(x_{1:t})\, \psi_t(x_{1:t}), \qquad \psi_T \equiv 1,$

so that the final target density and its normalization constant remain unchanged, while the intermediate distributions can be brought closer to the corresponding marginals of the final target.
Feynman-Kac Models
Main articles: Particle filter, Mean field particle methods, Feynman-Kac formula
This section is meant to highlight the commonality between the stochastic inference algorithms presented above – and their numerous variants in the literature – in a systematic way. All such methods are special cases of the general class of Feynman-Kac path models. ^{[9]} They provide a means for constructing a measure $\mathbb{Q}$ over paths by reweighting a base measure $\mathbb{M}$ with a path potential $G$, where $\mathbb{M}$ is directly obtained from an initial measure $M_1$ and Markov transition kernels $M_t$, and similarly $G$ is defined via a sequence of state potentials $G_t$:

$\mathbb{Q}(\mathrm{d}x_{1:T}) = \frac{1}{Z_T}\, G(x_{1:T})\, \mathbb{M}(\mathrm{d}x_{1:T}), \qquad \mathbb{M}(\mathrm{d}x_{1:T}) = M_1(\mathrm{d}x_1) \prod_{t=2}^{T} M_t(x_{t-1}, \mathrm{d}x_t), \qquad G(x_{1:T}) = \prod_{t=1}^{T} G_t(x_{1:t}).$
From this abstract perspective, the explanation of twisted SMC is very simple:
The target densities are manipulated by the factors $\psi_t$ in such a way that the effective sample size at each SMC step is improved, while all factors cancel out in the full expression for $\gamma_T$, i.e., they maintain an unbiased estimator for the final distribution of interest $\pi_T$ and its normalization constant $Z_T$. The power of this approach stems from the flexibility of choosing the twisting potentials adaptively, i.e., depending on the data and the current particle forest.
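The Feynman-Kac structure above can be sketched concretely with a minimal SMC sampler on a two-state Markov chain, where the exact normalization constant is available for comparison via the forward (sum-product) recursion. All numerical values below are invented for this sketch:

```python
import random

# Base measure: initial distribution m1 and Markov transition matrix M.
# State potentials g reweight the path measure at every step.
m1 = [0.5, 0.5]
M = [[0.9, 0.1], [0.2, 0.8]]
g = [0.7, 0.4]          # potential G_t(x), taken constant in t here
T, N = 4, 20_000

# Exact normalization constant Z_T via the forward recursion.
alpha = [m1[x] * g[x] for x in (0, 1)]
for _ in range(T - 1):
    alpha = [sum(alpha[x] * M[x][y] for x in (0, 1)) * g[y] for y in (0, 1)]
Z_exact = sum(alpha)

# SMC estimate: propagate through the kernel, weight by the potential, resample.
random.seed(1)
particles = random.choices([0, 1], weights=m1, k=N)
Z_hat = 1.0
for t in range(T):
    if t > 0:  # propagate each particle through the Markov kernel
        particles = [random.choices([0, 1], weights=M[x])[0] for x in particles]
    weights = [g[x] for x in particles]
    Z_hat *= sum(weights) / N                                 # unbiased Z update
    particles = random.choices(particles, weights=weights, k=N)  # resample
```

The product of mean weights is an unbiased estimator of $Z_T$; twisting would modify the potentials (and correct the weights accordingly) to reduce its variance without changing its expectation.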
SMC for PGMs
Using the formalism, data structures and algorithms introduced above, this section summarizes the core contributions of ^{[1]} and ^{[2]}, highlighting the respective incremental improvements.
Basic SMC
Sequential Graph Decomposition
^{[1]} formulates generic PGM inference as an SMC problem, i.e., as the task of sampling from a sequence of increasingly complex probability distributions – where the first distribution is efficiently accessible and the last distribution corresponds to the full target. This is a very common theme in statistical inference, the most widely used techniques of this sort being tempering, annealing and bridging. In the setting of PGMs, this sequence is constructed by ordering the factors according to some criterion and gradually including them into the sequence of unnormalized densities:

$\gamma_t(x_{L_t}) = \prod_{k=1}^{t} f_k(x_{I_k}), \qquad t = 1, \dots, T,$

where $I_k$ is the index set of variables attached to factor $f_k$ and $L_t = \bigcup_{k \le t} I_k$ is the set of variables covered after the first $t$ factors.
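The factor-by-factor construction can be sketched on a small example: a binary chain with two pairwise factors, targeted by SMC with uniform proposals, where step 1 targets the first factor alone and step 2 adds the second. The factor values are invented for this sketch, and the exact normalization constant is checked by enumeration:

```python
import itertools
import random

# Pairwise factors on the binary chain x1 - x2 - x3.
f1 = {(0, 0): 2.0, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 2.0}   # f1(x1, x2)
f2 = {(0, 0): 1.5, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 1.5}   # f2(x2, x3)

# Exact normalization constant: Z = sum_x f1(x1,x2) * f2(x2,x3).
Z_exact = sum(f1[(a, b)] * f2[(b, c)]
              for a, b, c in itertools.product((0, 1), repeat=3))

random.seed(2)
N = 20_000
# Step 1: propose (x1, x2) uniformly (density 1/4), weight by gamma_1 = f1.
parts = [(random.randint(0, 1), random.randint(0, 1)) for _ in range(N)]
w = [f1[p] for p in parts]
Z_hat = (sum(w) / N) * 4                         # correct for the 1/4 proposal
parts = random.choices(parts, weights=w, k=N)    # resample
# Step 2: extend with x3 uniformly (density 1/2), weight by the new factor f2.
parts = [(a, b, random.randint(0, 1)) for a, b in parts]
w = [f2[(b, c)] for _, b, c in parts]
Z_hat *= (sum(w) / N) * 2
```

Each incremental weight involves only the newly included factor, which is what makes the decomposition attractive for graphs where the full joint is expensive to evaluate.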
Twisted SMC
Twisting functions
^{[2]} extends the above framework by introducing a general twisting mechanism in the setting of PGMs. In particular, the authors derive the optimal twisting function for the intermediate target densities to be:

$\psi_t^{*}(x_{L_t}) = \int \prod_{k=t+1}^{T} f_k(x_{I_k}) \,\mathrm{d}x_{L_T \setminus L_t},$

i.e., the exact marginalization of all factors not yet included. Since this quantity is in general intractable, it is approximated cheaply by deterministic inference methods such as those discussed above.
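A one-line derivation (a sketch in the factor notation of the two papers, with $I_k$ the variables of factor $f_k$, $L_t$ the variables covered after $t$ factors, and $\gamma_t = \prod_{k \le t} f_k$ the $t$-th intermediate target) shows why this choice is optimal: each twisted intermediate target becomes an exact marginal of the full target, so the incremental importance weights are constant and the variance of the normalization-constant estimator vanishes.

```latex
\gamma_t(x_{L_t})\,\psi_t^{*}(x_{L_t})
  = \prod_{k=1}^{t} f_k(x_{I_k})
    \int \prod_{k=t+1}^{T} f_k(x_{I_k}) \,\mathrm{d}x_{L_T \setminus L_t}
  = \int \gamma_T(x_{L_T}) \,\mathrm{d}x_{L_T \setminus L_t}
```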
Experiments
Both methods were tested on a variety of applications, including the classical XY and Ising models from statistical mechanics, topic models (latent Dirichlet allocation), and a Gaussian Markov random field. In general, the basic SMC method performed comparably to state-of-the-art inference algorithms at the time of publication, and the more recent twisted SMC variant consistently outperformed basic SMC, requiring 1-2 orders of magnitude fewer particles to achieve similar estimation accuracy.
Conclusions
Recent years have seen important developments in adaptive sampling-based inference methods within the SMC framework. While these were mostly introduced in the context of state-space models, the publications discussed here have demonstrated the potential of such methods in the much more general class of PGMs. Although adaptation introduces a computational overhead, in many cases this cost can be amortized over subsequent inference steps, opening up the possibility of significant gains in accuracy per unit of computation. Perhaps the most valuable contribution of ^{[2]} is the insight that, in combination with traditional deterministic inference approximations, twisted SMC provides an alternative path towards inference amortization, a topic whose discourse in statistics has been dominated by variational inference methods in recent years.
Annotated Bibliography
 ↑ ^{1.0} ^{1.1} ^{1.2} Naesseth, Christian Andersson, Fredrik Lindsten, and Thomas B. Schön. "Sequential Monte Carlo for graphical models." Advances in Neural Information Processing Systems. 2014. http://papers.nips.cc/paper/5570-sequential-monte-carlo-for-graphical-models
 ↑ ^{2.0} ^{2.1} ^{2.2} ^{2.3} Lindsten, Fredrik, Jouni Helske, and Matti Vihola. "Graphical model inference: Sequential Monte Carlo meets deterministic approximations." Advances in Neural Information Processing Systems. 2018. http://papers.nips.cc/paper/8041-graphical-model-inference-sequential-monte-carlo-meets-deterministic-approximations
 ↑ Liu, Jun S., and Rong Chen. "Sequential Monte Carlo methods for dynamic systems." Journal of the American Statistical Association 93.443 (1998): 1032-1044. https://www.tandfonline.com/doi/abs/10.1080/01621459.1998.10473765
 ↑ Del Moral, Pierre, Arnaud Doucet, and Ajay Jasra. "Sequential Monte Carlo samplers." Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68.3 (2006): 411-436. https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/j.1467-9868.2006.00553.x
 ↑ Doucet, Arnaud, and Adam M. Johansen. "A tutorial on particle filtering and smoothing: Fifteen years later." Handbook of Nonlinear Filtering 12.656-704 (2009): 3. http://www.warwick.ac.uk/fac/sci/statistics/staff/academic-research/johansen/publications/dj11.pdf
 ↑ Kappen, Hilbert Johan, and Hans Christian Ruiz. "Adaptive importance sampling for control and inference." Journal of Statistical Physics 162.5 (2016): 1244-1266. https://link.springer.com/article/10.1007/s10955-016-1446-7
 ↑ Guarniero, Pieralberto, Adam M. Johansen, and Anthony Lee. "The iterated auxiliary particle filter." Journal of the American Statistical Association 112.520 (2017): 1636-1647. https://amstat.tandfonline.com/doi/abs/10.1080/01621459.2016.1222291
 ↑ Heng, Jeremy, Adrian N. Bishop, George Deligiannidis, and Arnaud Doucet. "Controlled sequential Monte Carlo." arXiv preprint arXiv:1708.08396 (2017). https://arxiv.org/abs/1708.08396
 ↑ Del Moral, Pierre. "Feynman-Kac Formulae: Genealogical and Interacting Particle Systems with Applications." Probability and Its Applications. Springer, 2004. https://www.infona.pl/resource/bwmeta1.element.springer-3c891fde-b218-3132-870a-10bb8290626c