Course:CPSC522/SMC for PGMs
Sequential Monte Carlo for Probabilistic Graphical Models
Twisted sequential Monte Carlo (tSMC) is a general framework for adaptive sampling-based inference. In the context of probabilistic graphical models (PGMs), it enables the combination of fast (but biased) deterministic approximations with asymptotically consistent (but expensive) stochastic inference.
Principal Author: Boyan Beronov
Probabilistic graphical models (PGMs) are a broad class of statistical models employed in all branches of science. Their static and explicit independence structure has allowed for the development of several powerful inference algorithms. Since in all but a limited subclass of models, exact inference is computationally intractable, approximate inference algorithms have been crucial for their adoption in applied domains. This article presents a line of work which combines two classes of such algorithms in a mutually beneficial way:
- deterministic, biased but efficient approximations to exact inference algorithms which make use of the explicit PGM structure, and
- adaptive versions of the stochastic, asymptotically consistent but slow methods from the sequential Monte Carlo (SMC) family.
All concepts in this article are built on the fundamental basis of probability theory. The probabilistic model representations employed here are graphical models, with notable subclasses being Markov networks and Bayesian networks. The stochastic approximate inference algorithms rest on the theory of Markov Chain Monte Carlo and Sequential Monte Carlo, see also references therein.
Being a general probabilistic inference method, SMC relates to many topics in the category of Probability and Graphical Models. See the sections below on deterministic and stochastic approximate inference for references to related algorithms.
Probabilistic Graphical Models
(Un-)Directed Graphical Models
For example, the joint density of a Bayesian network is defined by construction as:
Main article: Factor graph
The joint density of any PGM can be equivalently represented as a bipartite factor graph with edges only connecting factors with variables:
Deterministic Approximate Inference
Main article: Inference in graphical models
There exist exact inference algorithms for many restricted classes of PGMs, including variable elimination, weighted model counting, belief propagation and junction trees. They usually require restricted graph connectivities, discrete random variables with finite support, or conjugate priors. In the general case, though, exact inference is intractable, hence numerous deterministic approximations have been devised and successfully employed in various applications. The following lists several important methods.
Main article: Laplace's method
(Loopy) Belief Propagation
Main article: Belief propagation
Main article: Expectation propagation
Main article: Variational inference
Stochastic Approximate Inference: Feynman-Kac Models
Main article: Importance sampling
When the expectation of a quantity of interest is to be computed under a distribution from which exact sampling is intractable, importance sampling allows the derivation of an unbiased estimator via a proposal distribution on the same space . If the latter admits efficient sampling and (absolute continuity), the likelihood ratio (Radon-Nikodym derivative) exists and can be used to re-weight the samples obtained from :
The last identity simply observes .
Sequential Monte Carlo
Main article: SMC samplers
Twisted Sequential Monte Carlo
Twisting function :
This section is meant to highlight the commonality between the stochastic inference algorithms presented above – and their numerous variants in literature – in a systematic way. All such methods are special cases of the general class of Feynman-Kac path models.  They provide a means for constructing a measure over paths by re-weighting a base measure with a path potential , where is directly obtained from an initial measure and Markov transition kernels , and similarly is defined via a sequence of state potentials :
From this abstract perspective, the explanation of twisted SMC is very simple:
The target densities are manipulated by the factors in such a way that the effective sample size at each SMC step is improved, while all factors cancel out in the full expression for , i.e., they maintain an unbiased estimator for the final distribution of interest and its normalization constant . The power of this approach stems from the flexibility of choosing the twisting potentials adaptively, i.e., depending on the data and current particle forest.
SMC for PGMs
Sequential Graph Decomposition
 formulates generic PGM inference as an SMC problem, i.e., as the task of sampling from a sequence of increasingly complex probability distributions – where the first distribution is efficiently accessible and the last distribution corresponds to the full target. This is a very common theme in statistical inference, the most widely used techniques of this sort being tempering, annealing and bridging. In the setting of PGMs, this sequence is constructed by ordering the factors according to some criterion and gradually including them into the sequence of unnormalized densities:
 extends the above framework by introducing a general twisting mechanism in the setting of PGMs. In particular, the authors derive the optimal twisting function for intermediate target densities to be:
Both methods were tested on a variety of applications, including the Classical XY and Ising models from statistical mechanics, topic models (Latent Dirichlet Allocation), and a Gaussian Markov random field. In general, the basic SMC method performed comparably to state of the art inference algorithms at the time of publication, and the more recent twisted SMC variant consistently outperformed basic SMC, requiring 1-2 orders of magnitude less particles to achieve similar accuracy in terms of .
Recent years have seen important developments in adaptive sampling-based inference methods within the SMC framework. While these were mostly introduced in the context of state-space models, the publications discussed here have demonstrated the potential of such methods in the much more general class of PGMs. Despite adaptation methods presenting a computational overhead, in many cases this cost can be amortized over subsequent inference steps, opening up the possibility of significant gains in accuracy per computation. Perhaps the most valuable contribution in  is the insight that, in combination with traditional deterministic inference approximations, twisted SMC provides an alternative path towards inference amortization, whereas the statistical discourse has been dominated by variational inference methods in recent years.
- Naesseth, Christian Andersson, Fredrik Lindsten, and Thomas B. Schön. "Sequential Monte Carlo for graphical models." Advances in Neural Information Processing Systems. 2014. http://papers.nips.cc/paper/5570-sequential-monte-carlo-for-graphical-models
- Lindsten, Fredrik, Jouni Helske, and Matti Vihola. "Graphical model inference: Sequential Monte Carlo meets deterministic approximations." Advances in Neural Information Processing Systems. 2018. http://papers.nips.cc/paper/8041-graphical-model-inference-sequential-monte-carlo-meets-deterministic-approximations
- Liu, Jun S., and Rong Chen. "Sequential Monte Carlo methods for dynamic systems." Journal of the American statistical association 93.443 (1998): 1032-1044. https://www.tandfonline.com/doi/abs/10.1080/01621459.1998.10473765
- Del Moral, Pierre, Arnaud Doucet, and Ajay Jasra. "Sequential monte carlo samplers." Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68.3 (2006): 411-436. https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/j.1467-9868.2006.00553.x
- Doucet, Arnaud, and Adam M. Johansen. "A tutorial on particle filtering and smoothing: Fifteen years later." Handbook of nonlinear filtering 12.656-704 (2009): 3. http://www.warwick.ac.uk/fac/sci/statistics/staff/academic-research/johansen/publications/dj11.pdf
- Kappen, Hilbert Johan, and Hans Christian Ruiz. "Adaptive importance sampling for control and inference." Journal of Statistical Physics 162.5 (2016): 1244-1266. https://link.springer.com/article/10.1007/s10955-016-1446-7
- Guarniero, Pieralberto, Adam M. Johansen, and Anthony Lee. "The iterated auxiliary particle filter." Journal of the American Statistical Association 112.520 (2017): 1636-1647. https://amstat.tandfonline.com/doi/abs/10.1080/01621459.2016.1222291
- Heng, J., Bishop, A. N., Deligiannidis, G., & Doucet, A. (2017). Controlled sequential monte carlo. arXiv preprint arXiv:1708.08396. https://arxiv.org/abs/1708.08396
- Moral, P. D. "Feynman-kac formulae: Genealogical and interacting particle systems with applications, Probability and its applications." (2004). https://www.infona.pl/resource/bwmeta1.element.springer-3c891fde-b218-3132-870a-10bb8290626c