Web Site for Perfectly Random Sampling with Markov Chains

Introduction

Annotated Bibliographic Entries
Articles
Theses
Research Talks
Related Literature

Picture Gallery

Definitions

Code

Credits

## Perfectly Random Sampling with Markov Chains

• Bibliographic data in BiBTeX format or .bbl format
• GNU Scientific Library pseudo-random number generators and random number distributions
• NCSA Pseudo-Random Number Generators
• Probability part of the XXX e-print server
• If you would like to submit an article or join a perfect-sampling mailing list, send me mail at If you would like to learn CFTP and/or Fill's algorithm, there are several expositions to choose from. Many articles which develop new coupling techniques for use in CFTP or Fill's algorithm also contain brief explanations of these techniques.

• Propp-Wilson '96 describe CFTP.
• Fill '98 describes his new algorithm, and also gives an explanation of CFTP that some probabilists prefer. Sections 1 and 7 are good places to start reading.
• Propp '97 gives a survey of CFTP with a focus on combinatorial applications.
• Propp-Wilson '98 give a user's guide to CFTP.
• Dimakos '99 gives a guide to CFTP and Fill's algorithm.
• Thönnes '99 gives a primer on perfect simulation focussed on CFTP.
• Fill-Machida-Murdoch-Rosenthal '00 give a readable extension of Fill's algorithm.
• Wilson '00 section 1 gives a primer on CFTP and extensions.
• Häggström '00 wrote course notes (now a book) on ``Finite Markov chains and algorithmic applications'', including a chapter on CFTP.
• Casella-Lavine-Robert '00 give an explanation of CFTP and Fill's algorithm.
• Møller '00 gives an exposition of CFTP and Fill's algorithm with a focus on stochastic geometry.
• David J. C. MacKay '03 wrote a book (``Information Theory, Inference, and Learning Algorithms'') which contains a 9-page introduction to perfect sampling.
• There is a chapter on ``Coupling from the past,'' by James G. Propp and David B. Wilson, Chapter 22 of the textbook ''Markov Chains and Mixing Times,'' by David A. Levin, Yuval Peres, and Elizabeth L. Wilmer, to be published by the American Mathematical Society, 2008.

### Introduction and Scope

Random sampling has found numerous applications in physics, statistics, and computer science. Perhaps the most versatile method of generating random samples from a probability space is to run a Markov chain. But for how many steps?

In most cases one simply does not know how many Markov chain steps are needed to get a sufficiently random state. There is a large literature of heuristic algorithms for inferring when enough steps have been taken, but they are non-rigorous, and one never knows for sure that an adequate number of steps have been taken. These heuristic algorithms are beyond the scope of this bibliography, but the interested reader is referred to some lecture notes by Sokal and the MCMC Preprint Service.

In the past decade there is been much research on obtaining rigorous bounds of how many Markov chain steps are needed to generate a random sample. Sometimes these bounds are tight, sometimes they are unduly pessimistic. The interested reader is referred to a survey by Diaconis and Saloff-Coste and a survey by Jerrum and Sinclair; the size of this literature makes it beyond the scope of this bibliography.

In recent years there have been a large number of algorithms developed for sampling from the steady state distribution of suitably well-structured Markov chains, which require no a priori knowledge of how long the Markov chains take to get mixed. The algorithms determine on their own, during run time, how many steps to run the Markov chain. It is these algorithms that are the focus of this bibliography. Most of these algorithms return samples that are distributed exactly according to the stationary distribution of the Markov chain, but a few return samples that have some bias ε that the user can make as small as desired. Since the focus of this bibliography is on working computer algorithms, the symbol is placed next to those articles that contain simulation results or give sample outputs. Each annotated entry contains links relevant to the paper, giving the article's abstract (click on the title) and authors' homepages when available, as well as links to online preprints. Several of the annotations were contributed by people other than the maintainer.

It has been pointed out that ``stationary stopping times'' may also be considered to be algorithms for sampling from a Markov chain's stationary distribution. However, these Markov chains typically have state spaces such as the symmetric group or the hypercube, for which one already knows how to effectively generate a random sample on a computer. Rather, the point of studying these stopping times is to understand interesting mathematical processes, such as shuffling a deck of cards. (A notable exception is Fill's algorithm.) Since the literature on these stopping times is sizable, only those articles with an algorithmic theme are included. The reader interested in stopping times is referred to the articles by Aldous and Diaconis and Diaconis and Fill, which contain many references.

Also relevant to the present bibliography is a literature on backwards compositions of random maps. The backwards composition of random maps can be used to define a ``stochastically recursive sequence'' of points from the state space. The articles from this literature study when this sequence of points converge almost surely, since convergence implies the existence of a stationary distribution for the Markov chain. (Existence can be nontrivial for infinite state spaces.) If one has the additional conditions that 1) the convergence occurs after a finite number steps, and 2) one can determine when this convergence has occurred, then one may use the technique of ``coupling from the past'', which is used in several the articles listed below, to generate random samples from the state space. (There are numerous examples in which conditions 1 or 2 do not hold.) Diaconis and Freedman give a survey of the literature on stochastically recursive sequences; a few representative articles are listed below.     (click for figure captions)

### Annotated Bibliographic Entries

Andrei Broder. Generating random spanning trees. In 30th Annual Symposium on Foundations of Computer Science, pages 442--447, 1989.

David J. Aldous. A random walk construction of uniform spanning trees and uniform labelled trees. SIAM Journal on Discrete Mathematics, 3(4):450--465, 1990.

These two articles give the same independently discovered random-walk based algorithm for generating random spanning trees of a graph. The algorithm uses a Markov chain on the set of spanning trees of an undirected graph to return a (perfectly) random spanning tree. Broder uses the algorithm to analyze the random walk on a ring. Aldous uses the algorithm to determine the properties of random trees and to compute some non-trivial probabilities pertaining to the random walk in the plane. (There is another random tree algorithm based on computing determinants.)

Søren Asmussen, Peter W. Glynn, and Hermann Thorisson. Stationary detection in the initial transient problem. ACM Transactions on Modeling and Computer Simulation, 2(2):130--157, 1992.

This paper explores what is possible and what is not, and was the first paper to show that it is possible to obtain unbiased samples from the steady state distribution of a finite Markov chain by observing it, provided it is irreducible and one knows how many states it has. Equivalently, there is a universal randomized stationary stopping time that works for all (irreducible) Markov chains with a given (finite) number of states.

David J. Aldous. On simulating a Markov chain stationary distribution when transition probabilities are unknown. In D. J. Aldous, P. Diaconis, J. Spencer, and J. M. Steele, editors, Discrete Probability and Algorithms, volume 72 of IMA Volumes in Mathematics and its Applications, pages 1--9. Springer-Verlag, 1995.

Abstract: We present an algorithm which, given a n-state Markov chain whose steps can be simulated, outputs a random state whose distribution is within ε of the stationary distribution, using O(n) space and O-2 τ) time, where τ is a certain ``average hitting time" parameter of the chain.

Remarks: While not exact, this algorithm was much more efficient than the previous one, and directly stimulated the development of two subsequent exact sampling algorithms. This paper gives the only nontrivial lower bound on the running time of an algorithm for exact sampling from generic Markov chains.

Dana Randall and Alistair Sinclair. Testable algorithms for self-avoiding walks. In Proceedings of the Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 593-602, 1994. (postscript)

Abstract: We present a polynomial time Monte Carlo algorithm for almost uniformly generating and approximately counting self-avoiding walks in rectangular lattices Zd. These are classical problems that arise, for example, in the study of long polymer chains. While there are a number of Monte Carlo algorithms used to solve these problems in practice, these are heuristic and their correctness relies on unproven conjectures. In contrast, our algorithm depends on a single, widely-believed conjecture that is weaker than preceding assumptions, and, more importantly, is one which the algorithm itself can test. Thus our algorithm is reliable, in the sense that it either outputs answers that are guaranteed, with high probability, to be correct, or finds a counter-example to the conjecture.

László Lovász and Peter Winkler. Exact mixing in an unknown Markov chain. Electronic Journal of Combinatorics, 2, 1995. Paper #R15.

Abstract: We give a simple stopping rule which will stop an unknown, irreducible \$n\$-state Markov chain at a state whose probability distribution is exactly the stationary distribution of the chain. The expected stopping time of the rule is bounded by a polynomial in the maximum mean hitting time of the chain. Our stopping rule can be made deterministic unless the chain itself has no random transitions.

Remarks: This paper gives the first (universal) exact sampling algorithm that runs in time that is polynomial in certain parameters associated with the Markov chain. Gives a deterministic stationary stopping time that works when the Markov chain itself is not deterministic. The paper also contains a pretty lemma on random trees that is of independent interest. James G. Propp and David B. Wilson. Exact sampling with coupled Markov chains and applications to statistical mechanics. Random Structures and Algorithms, 9(1&2):223--252, 1996.

Abstract: For many applications it is useful to sample from a finite set of objects in accordance with some particular distribution. One approach is to run an ergodic (i.e., irreducible aperiodic) Markov chain whose stationary distribution is the desired distribution on this set; after the Markov chain has run for M steps, with M sufficiently large, the distribution governing the state of the chain approximates the desired distribution. Unfortunately it can be difficult to determine how large M needs to be. We describe a simple variant of this method that determines on its own when to stop, and that outputs samples in exact accordance with the desired distribution. The method uses couplings, which have also played a role in other sampling schemes; however, rather than running the coupled chains from the present into the future, one runs from a distant point in the past up until the present, where the distance into the past that one needs to go is determined during the running of the algorithm itself. If the state space has a partial order that is preserved under the moves of the Markov chain, then the coupling is often particularly efficient. Using our approach one can sample from the Gibbs distributions associated with various statistical mechanics models (including Ising, random-cluster, ice, and dimer) or choose uniformly at random from the elements of a finite distributive lattice.

Remarks: This paper gives an algorithm, monotone coupling from the past, for exact sampling with Markov chains on huge state spaces. Since monotone-CFTP relies on the Markov chain having special structure, it is not ``universal'' as several of the above algorithms are, but it is practical for a surprising number of previously studied Markov chains. Includes simulation results for the random cluster and dimer models. Valen E. Johnson. Studying convergence of Markov chain Monte Carlo algorithms using coupled sample paths. Journal of the American Statistical Association, 91(433):154--166, 1996. (Postscript.)

Abstract: I describe a simple procedure for investigating the convergence properties of Markov Chain Monte Carlo sampling schemes. The procedure employs multiple runs from a sampler, using the same random deviates for each run. When the sample paths from all sequences converge, it is argued that approximate equilibrium conditions hold. The procedure also provides a simple diagnostic for detecting modes in multimodal posteriors. Several examples of the procedure are provided. In Ising models, the relation between the correlation parameter and the convergence rate of rudimentary Gibbs samplers is investigated. In another example, the effects of multiple modes on the convergence of coupled paths are explored using mixtures of bivariate normal distributions. The technique is also used to evaluate the convergence properties of a Gibbs sampling scheme applied to a model for rat growth rates (Gelfand et al 1990).

Remarks: While technically not a paper on exact sampling, this paper investigates how the mixing time of a Markov chain may be inferred by running a large number of coupled simulations until they coalesce. The initial states of the Markov chains are chosen at random, and if the probability of rejection in rejection sampling is known, then rigorous estimates of the mixing time are given. Includes the (independently made) observation that for monotone Markov chains, only two coupled states need to be simulated. Includes simulation results. Additional articles that take a similar approach are available from Johnson's homepage.

Michael Luby, Dana Randall, and Alistair Sinclair. Markov chain algorithms for planar lattice structures (extended abstract). In 36th Annual Symposium on Foundations of Computer Science, pages 150--159, 1995.

Abstract: Consider the following Markov chain, whose states are all domino tilings of a 2n X 2n chessboard: starting from some arbitrary tiling, pick a 2 X 2 window uniformly at random. If the four squares appearing in this window are covered by two parallel dominoes, rotate the dominoes in place. Repeat many times. This process is used in practice to generate a random tiling, and is a key tool in the study of the combinatorics of tilings and the behavior of dimer systems in statistical physics. Analogous Markov chains are used to randomly generate other structures on various two-dimensional lattices. This paper presents techniques which prove for the first time that, in many interesting cases, a small number of random moves suffice to obtain a uniform distribution.

Remarks: This paper gives three new Markov chains for sampling certain dimer and ice systems. The focus of this paper is provable running time bounds. Monotone-CFTP may be applied to each of their Markov chains to get an exact algorithm; when this is done, their proofs may be interpreted as a priori bounds on the running time of CFTP, though in practice the exact algorithm runs much more quickly than the bounds suggest.

Remarks: For the particular case of the 2n X 2n chessboard, it is much faster to generate a random spanning tree, and then use a Temperley-like bijection to convert it to a (perfectly) random domino tiling. Their methods apply to more general regions for which such a Temperleyan bijection does not exist. James G. Propp and David B. Wilson. How to get a perfectly random sample from a generic Markov chain and generate a random spanning tree of a directed graph. Journal of Algorithms, 27:170--217, 1998. This article combines two conference papers, appearing in Proceedings of the Seventh Annual ACM-SIAM Symposium on Discrete Algorithms and Proceedings of the Twenty-Eighth Annual ACM Symposium on the Theory of Computing.

Abstract: A general problem in computational probability theory is that of generating a random sample from the state space of a Markov chain in accordance with the steady-state probability law of the chain. Another problem is that of generating a random spanning tree of a graph or spanning arborescence of a directed graph in accordance with the uniform distribution, or more generally in accordance with a distribution given by weights on the edges of the graph or digraph. This paper gives algorithms for both of these problems, improving on earlier results and exploiting the duality between the two problems. Each of the new algorithms hinges on the recently-introduced technique of coupling from the past or on the linked notions of loop-erased random walk and ``cycle-popping''.

Remarks: This article gives what are so far the best algorithms for exact sampling on generic Markov chains, as well as the first application of CFTP to a huge non-monotone state space. It gives a universal randomized stationary stopping time that is within a constant factor of optimal, and another algorithm that is faster when the Markov chain can be simulated starting from any state rather than just observed in action. Surprisingly, both algorithms are intimately related to the generation of random spanning trees of a weighted directed graph, and this paper gives tree algorithms that are both faster and more general than the Broder/Aldous algorithm. The algorithms use various versions of CFTP (voter-CFTP, coalescing-CFTP, cover-CFTP, and tree-CFTP), and another technique called cycle popping. (An upcoming book by Lyons and Peres makes use of the cycle-popping algorithm when analyzing essential spanning forests of infinite graphs.) Includes a sample output from the tree algorithm.

David Eppstein. Representing all minimum spanning trees with applications to counting and generation, Technical Report 95-50, U.C. Irvine, 1995.

Shows how to use the algorithms for random spanning tree generation to solve other random sampling problems. Per Eppstein's description: Shows how to find for any edge weighted graph G an equivalent graph EG such that the minimum spanning trees of G correspond one-for-one with the spanning trees of EG. The equivalent graph can be constructed in time O(m+n log n) given a single minimum spanning tree of G. As a consequence one can find fast algorithms for counting, listing, and randomly generating MSTs. Also discusses similar equivalent graph constructions for shortest paths, minimum cost flows, and bipartite matching. Stefan Felsner and Lorenz Wernisch. Markov chains for linear extensions, the two-dimensional case. In Proceedings of the Eighth Annual ACM-SIAM Symposium on Discrete Algorithms , pages 239--247, 1997. (full version)

Abstract: We study the generation of uniformly distributed linear extensions using Markov chains. In particular we show that monotone coupling from the past can be applied in the case of linear extensions of two-dimensional orders. For width two orders a mixing rate of O(n^3 log n) is proved. We conjecture that this is the mixing rate in the general case and support the conjecture by empirical data. On the course we obtain several nice structural results concerning Bruhat order and weak Bruhat order of permutations.

Remarks: Subsequent work confirmed their conjecture. Wilfrid S. Kendall. Perfect simulation for the area-interaction point process. In L. Accardi and C. C. Heyde, editors, Probability Towards 2000, pages 218--234. Springer, 1998.
Proceedings of the International Symposium, ``Probability Towards 2000'', held in 1995. (Postscript.)

Abstract: The ideas of Propp and Wilson ("Exact sampling with coupled Markov chains and applications to statistical mechanics", Random Structures and Algorithms, 1996, 9:223-252) for exact simulation of Markov chains are extended to deal with perfect simulation of attractive area-interaction point processes in bounded windows. A simple modification of the basic algorithm is described which provides perfect simulation of the non-monotonic case of the repulsive area-interaction point process. Results from simulations using a C computer program are reported; these confirm the practicality of this approach in both attractive and repulsive cases. The paper concludes by mentioning other point processes which can be simulated perfectly in this way, and by speculating on useful future directions of research.

Remarks: Shows how to do perfect simulations of the area-interaction point process, though the techniques extend to more general point processes. A unique feature of this application is that the Markov chain used is not uniformly ergodic, i.e. its mixing time is infinite. (The state space is an infinite partially ordered set with no top state, and one can find states from which the Markov chain takes arbirtrarily long to equilibrate.) To deal with this problem, Kendall modified monotone-CFTP, replacing references to the (nonexistent) top state with references to a stochastically dominating process. Kendall also shows how to apply CFTP to repulsive point processes, which are anti-monotone. Includes simulation results.

Wilfrid S. Kendall. On some weighted Boolean models. In D. Jeulin, editor, Advances in Theory and Applications of Random Sets, pages 105--120. World Scientific Publishing Company, 1997. (postscript)

Following up on the previous article, shows how to apply CFTP to attractive birth-death processes and exclusion processes.

James A. Fill. An interruptible algorithm for perfect sampling via Markov chains. The Annals of Applied Probability, 8(1):131--162, 1998. (Postscript.) An extended abstract appeared in the Proceedings of the Twenty-Ninth Annual ACM Symposium on Theory of Computing.

Abstract: For a large class of examples arising in statistical physics known as attractive spin systems (e.g., the Ising model), one seeks to sample from a probability distribution π on an enormously large state space, but elementary sampling is ruled out by the infeasibility of calculating an appropriate normalizing constant. The same difficulty arises in computer science problems where one seeks to sample randomly from a large finite distributive lattice whose precise size cannot be ascertained in any reasonable amount of time. The Markov chain Monte Carlo (MCMC) approximate sampling approach to such a problem is to construct and run ``for a long time'' a Markov chain with long-run distribution π. But determining how long is long enough to get a good approximation can be both analytically and empirically difficult. Recently, Jim Propp and David Wilson have devised an ingenious and efficient algorithm to use the same Markov chains to produce perfect (i.e., exact) samples from π. However, the running time of their algorithm is an unbounded random variable whose order of magnitude is typically unknown a priori and which is not independent of the state sampled, so a naive user with limited patience who aborts a long run of the algorithm will introduce bias. We present a new algorithm which (1) again uses the same Markov chains to produce perfect samples from π, but is based on a different idea (namely, acceptance/rejection sampling); and (2) eliminates user-impatience bias. Like the Propp-Wilson algorithm, the new algorithm applies to a general class of suitably monotone chains, and also (with modification) to ``anti-monotone'' chains. When the chain is reversible, naive implementation of the algorithm uses fewer transitions but more space than Propp-Wilson. When fine-tuned and applied with the aid of a typical pseudorandom number generator to an attractive spin system on n sites using a random site updating Gibbs sampler whose mixing time tau is polynomial in n, the algorithm runs in time of the same order (bound) as Propp-Wilson [expectation O(tau log n)] and uses only logarithmically more space [expectation O(n log n), vs. O(n) for Propp-Wilson].

Peter W. Glynn and Philip Heidelberger. Bias properties of budget constrained simulations. Operations Research, 38(5):801--814, 1990.

Among other things, this article shows that so long as the user insists on waiting long enough to get at least one random sample, a deadline will not introduce bias. This holds for any sampling procedure, whether the deadline is in terms of Markov chain steps, real time, or any other measure.

James A. Fill. The move-to-front rule: a case study for two exact sampling algorithms. Probability in the Engineering and Informational Sciences, 12:283--302, 1998.

Abstract: The elementary problem of exhaustively sampling a finite population without replacement is used as a nonreversible test case for comparing two recently proposed MCMC algorithms for perfect sampling, one based on backward coupling and the other on strong stationary duality. The backward coupling algorithm runs faster in this case, but the duality-based algorithm is unbiased for user impatience. An interesting by-product of the analysis is a new and simple stochastic interpretation of a mixing-time result for the move-to-front rule. O. Häggström, M. N. M. van Lieshout, and J. Møller. Characterisation results and Markov chain Monte Carlo algorithms including exact simulation for some spatial point processes. Bernoulli, 5(4):641--658, 1999.
(Technical Report R-96-2040)

Abstract: The area-interaction process and the continuum random-cluster model are characterised in terms of certain functional forms of their respective conditional intensities. In certain cases, these two point process models can be derived from a bivariate point process model which in many respects is simpler to analyse and simulate. Using this correspondence we devise a two-component Gibbs sampler, which can be used for fast and exact simulation by extending the recent ideas of Propp and Wilson. We further introduce a Swendsen-Wang type algorithm. The relevance of the results within spatial statistics as well as statistical physics is discussed.

Remarks: Shows how to apply monotone-CFTP to sample from the Widom-Rowlinson point process. Here there is no top or bottom state, but (in contrast to Kendall's chain) such states can be adjoined to the state space. (The authors instead use two other states, already in the state space, that are just as effective at testing coalescence.) The Widom-Rowlinson model can be marginalized to obtain the attractive area-interaction point process and the continuum random cluster model (with positive integral q). Sampling from the area-interaction process in this way turns out to be faster than (though not as generalizable as) Kendall's approach. The paper also describes a Swendsen-Wang type algorithm, and includes simulation results. Jesper Møller. Markov chain Monte Carlo and spatial point processes. In W. S. Kendall, O. E. Barndorff-Nielsen, and M. N. M. van Lieshout, editors, Stochastic Geometry: Likelihood and Computation, Monographs on Statistics and Applied Probability #80, pages 141--172. Chapman and Hall / CRC Press, 1998.

Surveys the spatial point processes, including recent work on exact sampling, and includes simulation results.

James Propp Generating random elements of a finite distributive lattice. Electronic Journal of Combinatorics, 4(2), 1997. Paper #R15. (Postscript.) arXiv:math.CO/9801066

Abstract: This survey article describes a method for choosing uniformly at random from any finite set whose objects can be viewed as constituting a distributive lattice. The method is based on ideas of the author and David Wilson for using ``coupling from the past'' to remove initialization bias from Monte Carlo randomization. The article describes several applications to specific kinds of combinatorial objects such as tilings, constrained lattice paths, an alternating sign matrices. Olle Häggström and Karin Nelander. Exact sampling from anti-monotone systems. Statistica Neerlandica, 52:360--380, 1998. Postscript.

Abstract: A new approach to Markov chain Monte Carlo simulation was recently proposed by Propp and Wilson. This approach, unlike traditional ones, yield samples which have exactly the desired distribution. The Propp-Wilson algorithm requires this distribution to have a certain structure called monotonicity. In this paper an idea of Kendall is applied to show how the algorithm can be extended to the case where monotonicity is replaced by anti-monotonicity. As illustrating examples, simulations of the hard-core model and the random-cluster model are presented.

Michael Luby and Eric Vigoda. Approximately counting up to four (extended abstract). In Proceedings of the Twenty-Ninth Annual ACM Symposium on Theory of Computing, pages 682--687, 1997. (Postscript.)

Abstract: We present a fully-polynomial scheme to approximate the number of independent sets in graphs with maximum degree four. In general, for graphs with maximum degree Δ≥4, the scheme approximates a weighted sum of independent sets. The weight of each independent set is expressed in terms of a positive parameter λ≤ 1/(Δ-3), where the weight of independent set S is λ|S|. We also prove complementary hardness of approximation results, which show that it is hard to approximate the weighted sum for values of λ > c/Δ for some constant c > 0.

Remarks: Gives a new Markov chain for generating random independent sets with activity λ. When the maximum degree Δ≥4 and λ ≤ 1/(Δ-3), they show that this Markov chain is rapidly mixing. CFTP may be efficiently applied to their Markov chain; when this is done, their proofs (but not the theorem itself) may be interpreted as a priori bounds on the running time of CFTP.

An article by Dyer and Greenhill extends and improves on this one, though itself is not connected to CFTP. D. J. Murdoch and P. J. Green. Exact sampling from a continuous state space. Scandinavian Journal of Statistics 25(3):483--502, 1998. (Postscript)

Abstract: Propp and Wilson (1995) described a protocol, called coupling from the past, for exact sampling from a finite distribution using a coupled Markov chain Monte Carlo algorithm. In this paper we extend coupling from the past to various MCMC samplers on a continuous state space; rather than following the monotone sampling device of Propp and Wilson, our approach uses methods related to gamma-coupling and rejection sampling to simulate the chain, and direct accounting of sample paths to check whether they have coupled.

Remarks: Describes a variety of scenarios for which CFTP may be applied to a (non-monotone) continuous state space, giving a sequence of algorithms that start out simple, but become increasingly sophisticated. The methods used are related to gamma-coupling and rejection sampling, and appear to be applicable to Bayesian parameter estimation. Includes a case study comparing the performance of each technique when used to sample from the posterior distribution of a set of pump reliability parameters. Wilfrid S. Kendall and Jesper Møller. Perfect simulation using dominating processes on ordered spaces, with application to locally stable point processes. Advances in Applied Probability 32(3):844--865, 2000. (Postscript)

Abstract: In this paper we investigate the application of perfect simulation, in particular Coupling from The Past (CFTP), to the simulation of random point processes. We give a general formulation of the method of dominated CFTP and apply it to the problem of perfect simulation of general locally stable point processes as equilibrium distributions of spatial birth-and-death processes. We then investigate discrete-time Metropolis-Hastings samplers for point processes, and show how a variant which samples systematically from cells can be converted into a perfect version. An application is given to the Strauss point process.

Haiyan Cai. A note on an exact sampling algorithm and Metropolis Markov chains. Technical report, University of Missouri, St. Louis, 1997.

Shows how to take a generic probability distribution on a space, together with a reference distribution from which it is already known how to sample, and construct a certain particular Metropolis-type monotone Markov chain. Coalescence occurs precisely when the rejection sampler with the same reference measure would accept.

S. G. Foss and R. L. Tweedie. Perfect simulation and backward coupling. Stochastic Models, 14(1-2):187--203, 1998. (Postscript.)

Abstract: Algorithms for perfect or exact simulation of random samples from the invariant measure of a Markov chain have received considerable recent attention following the introduction of the "coupling-from-the-past" (CFTP) technique of Propp and Wilson. Here, we place such algorithms in the context of backward coupling of stochastically recursive sequences. We show that although general backward couplings can be constructed for chains with finite mean forward coupling times, and can even be thought of as extending the classical "Loynes schemes" from queuing theory, successful "vertical" CFTP algorithms such as those of Propp and Wilson can be constructed if and only if the chain is uniformly geometric ergodic. We also relate the convergence moments for backward coupling methods to those of forward coupling times: the former typically lose at most one moment compared to the latter.

Remarks: Relates CFTP to more general stochastically recursive sequences, for which one has a sequence of random variables that with probability one converge to a particular value with the right distribution, but for which one isn't necessarily able to determine when this convergence has happened. Studies the moments of the convergence time of stochastically recursive sequences, some of which may be finite with others being infinite.

Robert B. Lund and David B. Wilson. Exact sampling algorithms for storage systems. In preparation.

Show how to apply monotone-CFTP to sample from the steady-state distribution of certain storage systems. Rather than the linear data-flow used in traditional monotone-CFTP, the flow of data through the algorithm is two-dimensional.

Jesper Møller. Perfect simulation of conditionally specified models. Journal of the Royal Statistical Society, Ser. B, 61(1):251--264, 1999.

Abstract: We discuss how the ideas of producing perfect simulations based on coupling from the past for finite state space models naturally extend to multivariate distributions with infinite or uncountable state spaces such as auto-gamma, auto-Poisson and autonegative binomial models, using Gibbs sampling in combination with sandwiching methods originally introduced for perfect simulation of point processes.

Remarks: Shows how to apply CFTP to sample from a class of Markov random fields, in which the individual variables (from a finite, countable, or continuous space) depend upon one another in a monotone or anti-monotone way. For the continuous distributions, the user specifies an ε, which controls not the bias (which is 0), but instead the numerical precision of the output. The application to the auto-gamma distribution, which includes the pump-reliability application above, appears to be faster than the approach of Murdoch and Green.

For the continuous distributions there is an even faster method, which also takes the numerical error ε to zero.

The aforementioned sandwiching methods are the ones introduced for antimonotone CFTP by Kendall and Häggström-Nelander.

W. S. Kendall. Perfect Simulation for Spatial Point Processes. In Bulletin of the International Statistical Institute 51st Session, Istanbul (August 1997), volume 3, pages 163--166, 1997.

Abstract: A survey is given of the ideas of perfect simulation involving Coupling From The Past (CFTP) as introduced by Propp and Wilson (1996), and its application to the perfect simulation of spatial point processes. A sketch is given of an extension to the perfect simulation of point processes defined over all R^2.

Sommaire: Voici un expos� des id�es de la simulation parfaite au moyen de Coupling From The Past (CFTP) selon Propp et Wilson (1996), et son application � la simulation des processus ponctuels en espace. Une indication est fournie du moyen de conduire la simulation parfaite des processus ponctuels �tendus � travers tout R^2. S. G. Foss, R. L. Tweedie, and J. N. Corcoran. Simulating the invariant measures of Markov chains using backward coupling at regeneration times. Probability in the Engineering and Informational Sciences, 12:303--320, 1998. (postscript).

Abstract: We develop an algorithm for simulating approximate random samples from the invariant measure of a Markov chain using backward coupling of embedded regeneration times. Related methods have been used effectively for finite chains and for stochastically monotone chains: here we propose a method of implementation which avoids these restrictions by using a ``cycle-length'' truncation. We show that the coupling times have good theoretical properties and describe benefits and difficulties of implementing the methods in practice.

Remarks: Gives a method for approximately sampling from the steady-state distribution of a Markov chain when the state of the chain takes on a certain particular value on a semi-regular basis. The Markov chain of interest is lifted to one that keeps track of 1) the original chain's state, and 2) the number of steps since the original chain's state took on that special value. The lifted chain is then truncated to force a return to the special state whenever the second coordinate gets too big. Using an independent set of coins for each value of time and each value of the second coordinate of the modified lifted chain, the authors show how to effectively test for coalescence. Includes discussion for how to choose the truncation parameter, though this not mechanized to the extent where the user could specify a desired maximum bias. Includes a case study on a two-server re-entrant queueing network. Elke Thönnes. Perfect simulation of some point processes for the impatient user. Advances in Applied Probability, Stochastic Geometry and Statistical Applications, 31(1):69--87, 1999. (Postscript.)

Abstract: Recently Propp and Wilson have proposed an algorithm, called Coupling from the Past (CFTP), which allows not only an approximate but perfect (i.e. exact simulation of the stationary distribution of certain finite state space Markov chains. Perfect Sampling using CFTP has been successfully extended to the context of point processes, amongst other authors, by Häggström et al.. In Häggström et al. Gibbs Sampling is applied to a bivariate point process, the pentrable spheres mixture model. Fill also introduced an exact sampling algorithm, which, in contrast to CFTP, is unbiased for user impatience. Fill's algorithm is a form of rejection sampling and similar to CFTP requires sufficient monotonicity properties of the transition kernel used. We show how Fill's version of rejection sampling can be extended to produce perfect samples of the penetrable spheres mixture process and related models. Following Häggström et al. we use Gibbs sampling and make use of the partial order of the mixture model state space. Thus we construct an algorithm which protects agains bias caused by user impatience and which delivers samples not only of the mixture model but also of the attractive area-interaction and the continuum random-cluster process.

James Propp and David Wilson. Coupling from the past: a user's guide. In D. Aldous and J. Propp, editors, Microsurveys in Discrete Probability, volume 41 of DIMACS Series in Discrete Mathematics and Theoretical Computer Science, pages 181--192. American Mathematical Society, 1998. (Postscript)

An expository paper describing a variety of very different situations for which coupling-from-the-past may be applied, and gives advice on recognizing additional applications. Includes some caveats for the practitioner. Olle Häggström and Karin Nelander. On exact simulation of Markov random fields using coupling from the past. Scandinavian Journal of Statistics, 26(3):395--411, 1999.

Abstract: A general framework for exact simulation of Markov random fields using the Propp-Wilson coupling from the past approach is proposed. Our emphasis is on situations lacking the monotonicity properties that have been exploited in previous studies. A critical aspect is the convergence time of the algorithm; this we study both theoretically and experimentally. Our main theoretical result in this direction says, roughly, that if interactions are sufficiently weak, then the expected running time of a carefully designed implementation is O(N log N), where N is the number of interacting components of the system. Computer experiments are carried out for random q-colourings and for the Widom-Rowlinson lattice gas model.

Some of the techniques in this paper were independently developed by Huber.

J. van den Berg and J. E. Steif. On the existence and nonexistence of finitary codings for a class of random fields. The Annals of Probability 27(3):1501--1522, 1999. (Postscript.)

Abstract: We study the existence of finitary codings (also called finitary homomorphisms or finitary factor maps) from a finite-valued i.i.d. process to certain random fields. For Markov random fields we show, using ideas of Marton and Shields, that the presence of a phase transition is an obstruction for the existence of the above coding: this yields a large class of Bernoulli shifts for which no such coding exists. Conversely, we show that for the stationary distribution of a monotone exponentially ergodic probabilistic cellular automaton such a coding does exist. The construction of the coding is partially inspired by the Propp-Wilson algorithm for exact simulation. In particular, combining our results with a theorem of Martinelli and Olivieri, we obtain the fact that for the plus state for the ferromagnetic Ising model on Zd, d≥2, there is (not) such a coding when the interaction parameter is below (above) its critical value. J. N. Corcoran and R. L. Tweedie. Perfect sampling of ergodic Harris chains. Annals of Applied Probability 11(2):438--451, 2001. (Postscript.)

Abstract: We develop an algorithm for simulating ``perfect'' random samples from the invariant measure of a Harris recurrent Markov chain. The method uses backward coupling of embedded regeneration times, and works most effectively for finite chains, or on continuous spaces for stochastically monotone chains, where paths may be sandwiched between ``upper'' and ``lower'' processes. Examples show that more naive approaches to constructing such bounding processes may be considerably biased, but that the algorithm can be simplified in certain cases to make it easier to run. We give explicit analytic bounds on the backward coupling times in the stochastically monotone case. An application of the algorithm to bounded storage models is given, and we also develop a random ``upper'' process for analyzing unbounded storage models.

Remarks: For the given storage model application there are more effective algorithms.

David B. Wilson. Annotated bibliography of perfectly random sampling with Markov chains. In D. Aldous and J. Propp, editors, Microsurveys in Discrete Probability, volume 41 of DIMACS Series in Discrete Mathematics and Theoretical Computer Science, pages 209-220. American Mathematical Society, 1998. Updated versions to appear at http://dimacs.rutgers.edu/~dbwilson/exact/. (Postscript.) Morten Fismen. Exact simulation using Markov chains. Technical Report 6/98, Institutt for Matematiske Fag, 1998. Diploma-thesis. (Postscript.)

Abstract: This reports gives a review of the new exact simulation algorithms using Markov chains. The first part covers the discrete case. We consider two different algorithms, Propp and Wilsons ``coupling from the past'' (CFTP) technique and Fills rejection sampler. The algorithms are tested on the Ising model, with and without an external field. The second part covers continuous state spaces. We present several algorithms developed by Murdoch and Green, all based on ``coupling from the past''. We discuss the applicability of these methods on a Bayesian analysis problem of surgical failure rates. Peter J. Green and Duncan J. Murdoch. Exact sampling for Bayesian inference: towards general purpose algorithms (with discussion). In J. M. Bernardo, J. O. Berger, A. P. Dawid, and A. F. M. Smith, editors, Bayesian Statistics 6, pages 301--321. Oxford University Press, 1999. Presented as an invited paper at the 6th Valencia International Meeting on Bayesian Statistics, Alcossebre, Spain, June 1998. Preprint.

Abstract: Propp and Wilson (1996) described a protocol, called coupling from the past, for exact sampling from a target distribution using a coupled Markov chain Monte Carlo algorithm. In this paper we discuss the implementation of coupling from the past for samplers on a continuous state space; our ultimate objective is Bayesian MCMC with guaranteed convergence. We make some progress towards this objective, but our methods are still not automatic or generally applicable.

Per author's description: Describes several coupling techniques aimed at routine application in the context of Bayesian inference using random walk Metropolis: the ``bisection coupler'', calculation of automatic cell bounds and the adaptation of the Kendall (1996)/Haggstrom et al (1996) idea of bounded CFTP. Includes simulations of the posterior in two 2-parameter blood type datasets with a Dirichlet prior, a 3 dimensional Dirichlet distribution, and a N(0,1) distribution. W. S. Kendall and Elke Thönnes. Perfect Simulation in Stochastic Geometry. Pattern Recognition 32(9):1569--1586, 1999. Special issue on random sets. (Postscript.)

Abstract: Simulation plays an important r�le in stochastic geometry and related fields, because all but the simplest random set models tend to be intractable to analysis. Many simulation algorithms deliver (approximate) samples of such random set models, for example by simulating the equilibrium distribution of a Markov chain such as a spatial birth-and-death process. The samples usually fail to be exact because the algorithm simulates the Markov chain for a long but finite time, and thus convergence to equilibrium is only approximate. The seminal work by Propp and Wilson made an important contribution to simulation by proposing a coupling method, Coupling from the Past (CFTP), which delivers perfect, that is to say exact, simulations of Markov chains. In this paper we introduce this new idea of perfect simulation and illustrate it using two common models in stochastic geometry: the dead leaves model and a Boolean model conditioned to cover a finite set of points. J. N. Corcoran and R. L. Tweedie. Perfect sampling from independent Metropolis-Hastings chains. Journal of Statistical Planning and Inference 104(2):297--314, 2002. (Postscript)

Abstract: ``Perfect sampling'' enables exact draws from the invariant measure of a Markov chain. We show that the independent Metropolis-Hastings chain has certain stochastic monotonicity properties that enable a perfect sampling algorithm to be implemented, at least when the candidate is overdispersed with respect to the target distribution. We prove that the algorithm has an optimal geometric convergence rate, and applications show that it may converge extremely rapidly.

Persi Diaconis and David Freedman. Iterated random functions. SIAM Review, 41(1):45--76, 1999. (Postscript, PDF.)

Abstract: Iterated random functions are used to draw pictures or simulate large Ising models, among other applications. They offer a method for studying the steady state distribution of a Markov chain, and give useful bounds on rates of convergence in a variety of examples. The present paper surveys the field and presents some new examples. There is a simple unifying idea: the iterates of random Lipschitz functions converge if the functions are contracting on the average.

The Editors. Graphical illustration of some examples related to the article ``Iterated random functions'' by Diaconis and Freedman. SIAM Review, 41(1):77--82, 1999.

Abstract: Figures are presented to illustrate some examples of iterated random functions of the kind described in the article in this issue by Diaconis and Freedman. Jesper Møller and Katja Schladitz. Extensions of Fill's algorithm for perfect simulation. Journal of the Royal Statistical Society, Ser. B, 61(4):955--969, 1999. (Postscript.)

Abstract: Fill's algorithm for perfect simulation for attractive finite state space models, unbiased for user impatience, is presented in terms of stochastic recursive sequences and extended in two ways. Repulsive discrete Markov random fields with two coding sets like the auto-Poisson distribution on a lattice with 4-neighbourhood can be treated as monotone systems if a particular partial ordering and quasimaximal and quasi-minimal states are used. Fill's algorithm then applies directly. Combining Fill's rejection sampling with sandwiching leads to a version of the algorithm, which works for general discrete conditionally specified repulsive models. Extensions to other types of models are briefly discussed.

Mark Huber. Exact Sampling and Approximate Counting Techniques. In Proceedings of the 30th ACM Symposium on the Theory of Computing, pages 31--40, 1998. (Postscript) (PDF)

Per author's description: This paper introduces the idea of a bounding chain for determining when complete coupling has occurred so that Coupling From the Past may be run on the chain to obtain exact samples. A bounding chain for the antiferromagnetic Potts model is given which is shown to couple completely in polynomial time given either a large number of colors or a sufficiently high temperature. The same analytical techniques also show the single-site update model of Propp and Wilson (ferromagnetic Ising) and Häggström and Nelander (antiferromagnetic Ising) run in polynomial time given high enough temperature.

Some of the techniques in this paper were independently developed by Häggström and Nelander. Karin Nelander. A Markov chain Monte Carlo study of the beach model. Markov Processes and Related Fields 6(3):345--369, 2000.

Abstract: We study the beach model introduced by Burton and Steif. The model is an example of a strongly irreducible subshift of finite type which, when the parameter of the model exceeds a critical value, has more than one measure of maximal entropy. We are interested in the critical value and how it depends on dimension. By way of simulations, we find that the critical value seems to be decreasing as a function of the dimension. We also present a conjecture for the whereabouts of the critical value in 2 dimensions. Our simulations are made with Markov chain Monte Carlo methods. For some parameter values and sizes of systems we are able to use the Propp-Wilson algorithm for exact simulation. For other combinations of parameter value and size we use a variant of the Swendsen-Wang algorithm.

Krzysztof Burdzy and Wilfrid S. Kendall. Efficient Markovian couplings: examples and counterexamples. The Annals of Applied Probability 10(2):362--409, 2000. (Postscript)

Abstract: In this paper we study the notion of an efficient coupling of Markov processes. Informally, an efficient coupling is one which couples at the maximum possible exponential rate, as given by the spectral gap. This notion is of interest not only for its own sake, but also of growing importance arising from the very recent advent of methods of perfect simulation: it helps to establish the price of perfection' for such methods. In general one can always achieve efficient coupling if the coupling is allowed to cheat (if each component's behaviour is affected by future behaviour of the other component), but the situation is more interesting if the coupling is required to be co-adapted. We present an informal heuristic for the existence of an efficient coupling, and justify the heuristic by proving rigorous results and examples in the contexts of finite reversible Markov chains and of reflecting Brownian motion in planar domains.

Mark Huber. A bounding chain for Swendsen-Wang. Random Structure and Algorithms 22(1):43--59, 2003. (PDF)
A two-page version appeared in Tenth Annual ACM-SIAM Symposium on Discrete Algorithms (1999).

Abstract: The greatest drawback of Monte Carlo Markov chain methods is lack of knowledge of the mixing time of the chain. The use of bounding chains solves this difficulty for some chains by giving theoretical and experimental upper bounds on the mixing time. Moreover, when used with methodologies such as coupling from the past, bounding chains allow the user to take samples drawn exactly from the stationary distribution without knowledge of the mixing time. Here we present a bounding chain for the Swendsen-Wang process. The Swendsen-Wang bounding chain allow us to efficiently obtain exact samples from the ferromagnetic Q-state Potts model for certain classes of graphs. Also, by analyzing this bounding chain, we will show that Swendsen-Wang is rapidly mixing over a slightly larger range of parameters than was known previously. Duncan J. Murdoch and Jeffrey S. Rosenthal. Efficient use of exact samples. Statistics and Computing 10(3):237--243, 2000.
Presented at the 6th Valencia International Meeting on Bayesian Statistics, Alcossebre, Spain, June 1998. (Postscript) (Poster.) (Applet.)

Abstract: Propp and Wilson (1996,1998) described a protocol called coupling from the past (CFTP) for exact sampling from the steady-state distribution of a Markov chain Monte Carlo (MCMC) process. In it a past time is identified from which the paths of coupled Markov chains starting at every possible state would have coalesced into a single value by the present time; this value is then a sample from the steady-state distribution. Unfortunately, producing an exact sample typically requires a large computational effort. We consider the question of how to make efficient use of the sample values that are generated. In particular, we make use of regeneration events (cf. Mykland et al., 1995) to aid in the analysis of MCMC runs. In a regeneration event, the chain is in a fixed reference distribution; this allows the chain to be broken up into a series of tours which are independent, or nearly so (though they do not represent draws from the true stationary distribution). In this paper we consider using the CFTP and related algorithms to create tours. In some cases their elements are exactly in the stationary distribution; their length may be fixed or random. This allows us to combine the precision of exact sampling with the efficiency of using entire tours. Several algorithms and estimators are proposed and analysed.
James Allen Fill, Motoya Machida, Duncan J. Murdoch, and Jeffrey S. Rosenthal. Extension of Fill's perfect rejection sampling algorithm to general chains. Extended abstract to appear in Neil Madras, editor, Monte Carlo Methods, volume 26 of Fields Institute Communications, pages 37--52. American Mathematical Society, 2000. Full version appears in a special double issue of Random Structures and Algorithms 17(3&4):290--316, 2000.
(Extended abstract) (Full paper)

Abstract: By developing and applying a broad framework for rejection sampling using auxiliary randomness, we provide an extension of the perfect sampling algorithm of Fill (1998) to general chains on quite general state spaces, and describe how use of bounding processes can ease computational burden. Along the way, we unearth a simple connection between the Coupling From The Past (CFTP) algorithm originated by Propp and Wilson (1996) and our extension of Fill's algorithm.

Remarks: The two versions of the four-authored paper supersede the 1998 preprint
Duncan J. Murdoch and Jeffrey S. Rosenthal. ``An extension of Fill's exact sampling algorithm to non-monotone chains.''

Olle Häggström and Jeffrey E. Steif. Propp-Wilson algorithms and finitary codings for high noise Markov random fields. Combinatorics, Probability and Computing 9(5):425--439, 2000. (Postscript.)

Abstract: In this paper, we combine two previous works, the first being by the first author and K. Nelander, and the second by J. van den Berg and the second author, to show (1) that one can carry out a Propp--Wilson exact simulation for all Markov random fields on Zd satisfying a certain high noise assumption, and (2) that all such random fields are a finitary image of a finite state i.i.d. process. (2) is a strengthening of the previously known fact that such random fields are Bernoulli shifts. James P. Hobert, Christian P. Robert, and D. M. Titterington. On perfect simulation for some mixtures of distributions. Statistics and Computing 9(4):287--298, 1999. (Postscript.)

Abstract: This paper studies the implementation of the coupling from the past (CFTP) method of Propp and Wilson (1996) in the set-up of two and three component mixtures with known components. We show that monotonicity structures can be exhibited in both cases, but that CFTP can still be costly for three component mixtures. We conclude with a simulation experiment exhibiting an almost perfect sampling scheme where we only consider a subset of the exhaustive set of starting values.

C. C. Holmes and B. K. Mallick. Perfect simulation for orthogonal model mixing, 1998. Preprint.

Abstract: In this article we demonstrate how to generate independent and identically distributed samples from the model space of the Bayes linear model with orthogonal predictors. We use the method of coupled Markov chains from the past as introduced by Propp and Wilson (1996). This procedure alleviates any concerns over convergence and sample mixing. We present a number of examples including a perfect simulation of Bayesian wavelet selection in a 1024 dimensional model space, a knot selection problem for spline smoothers and, a standard linear regression variable selection problem.

Updated to
C. C. Holmes and B. K. Mallick. Perfect simulation for Bayesian curve and surface fitting, 1999. Preprint. A. Mira, J. Møller, and G. O. Roberts. Perfect slice samplers. Journal of the Royal Statistical Society, Ser. B, 63(3):593--606, 2001. (Postscript.)

Abstract: Perfect sampling allows exact simulation of random variables from the stationary measure of a Markov chain. By exploiting monotonicity properties of the slice sampler we show that a perfect version of the algorithm can be easily implemented, at least when the target distribution is bounded. Various extensions, including perfect product slice samplers, and examples of applications are discussed. Christian P. Robert, editor. Discretization and MCMC Convergence Assessment. Lecture Notes in Statistics # 135. Springer, 1998.

Chapter 1, ``Markov chain Monte Carlo methods'' by Christian P. Robert and Sylvia Richardson, has a section 4 on ``Perfect sampling.''

James A. Fill and Motoya Machida. Stochastic monotonicity and realizable monotonicity. Technical Report 573, Department of Mathematical Sciences, The Johns Hopkins University, 1998. (Postscript.)

Abstract: We explore and relate two notions of monotonicity, stochastic and realizable, for a system of probability measures on a common finite partially ordered set (poset) S when the measures are indexed by another poset A. We give counterexamples to show that the two notions are not always equivalent, but for various large classes of S we also present conditions on the poset A that are necessary and sufficient for equivalence. When A = S, the condition that the cover graph of S have no cycles is necessary and sufficient for equivalence. This case arises in comparing applicability of the perfect sampling algorithms of Propp and Wilson and the first author of the present paper.

Mark Huber. Exact random sampling from independent sets, 1998. Preprint.

Abstract: We consider the problem of randomly sampling from the set of indpendent sets of a graph, where the probability of selecting a particular independent set is proportional to λ raised to the size of the independent set. This distribution is the hard core gas model in statistical physics. If Δ is the maximum degree of the graph, Dyer and Greenhill gave a method for efficiently approximately sampling from this distribution when λ≤2/(Δ-2) by utilizing a rapidly mixing Markov chain. We show here how to use their chain to exactly sample from this distribution. This method is shown to run slightly faster than the method of Dyer and Greenhill, and as a corollary gives a better bound on the mixing time of their chain.

Mark Huber. Interruptible exact sampling and construction of strong stationary times for Markov chains, 1998. Preprint.

Abstract: We consider the problem of exactly sampling from the stationary distribution of a Markov chain, a problem with numerous Monte Carlo applications. The earliest method for accomplishing this was construction of a strong stationary times, a stopping time that when reached, left the chain in the stationary distribution. Unfortunately for many chains of interest, these stopping times were very difficult to compute. Here we show how an algorithm of Murdoch and Rosenthal leads to construction of strong stationary times for many chains of practical interest. In addition, we analyze the running time of their algorithm, and compare to the running time of another successful algorithm for exact sampling, coupling from the past. We show how to test whether these stopping times have been reached for several chains for distributions of interest, the hard core gas model, proper k colorings of a graph, and the continuous Widom-Rowlinson mixture model. Armin S. A. Rährl. Fast, portable, parallel and scalable exact simulation using Markov chains, 1999. Preprint.

Abstract: Markov Chain Monte Carlo samplers have become popular tools for Bayesian computation. However, these techniques are computationally expensive. In order to speed them up, this paper presents portable, parallel and linear scalable computer implementations in the Bulk Synchronous Parallel model and in Java. Optimality in scalability and asymptotic optimality in speed is proved. It is shown how this parallel approach can be used for exact sampling using Propp and Wilson's algorithm. In 40 seconds it performs more than one million iterations using a 8192 X 8192 transition matrix of double precision floating point numbers on a SGI with 32 R10000 processors. For time-critical computational applications, an efficient use of the sample values generated via a repeated coupling from the past algorithm is made. Therefore many more samples per time are obtained, though these samples are not drawn from the true stationary distribution. How to develop and analyze a parallel solution in two rather different models of computation is explained. Reference to the complete code with documentation is provided in .

Wilfrid S. Kendall and Jesper Møller. Perfect implementation of a Metropolis-Hastings simulation of Markov point processes, 1999. Preprint.

Abstract: This document describes a C implementation of a perfect algorithm for a Metropolis-Hastings simulation of a point process, based on

• a decomposition into types relating to different square sub-regions, and
• a dominating process which is composed of independent discrete-time M/M/1) queues, one for each square sub-region, each in equilibrium.
Mathematical details are to be found in Kendall & Møller (1999a), which is the scientific paper corresponding to this research report. The differences between the current document and Kendall & Møller (1999a) are that
• this document gives a full listing of the program,
• this document does not discuss the underlying theory, which is covered in depth in Kendall & Møller (1999a).

A full Nuweb source for this program (in gzipped tarfile form, together with figures) is available at my [Kendall's] [homepage of available perfect simulation programs].

Motoya Machida. Stochastic Monotonicity and Realizable Monotonicity. PhD thesis, The Johns Hopkins University, Department of Mathematical Sciences, 1999.

Abstract:
A system (Pa: a in A) of probability measures on a common finite set S indexed by another finite set A can be "realized" by a system (Xa: a in A) of S-valued random variables on some probability space so that Xa is distributed as Pa for each a in A. Assuming that A and S are each equipped with a partial order, we pose the following question: When can a given system (Pa: a in A) be realized by a system (Xa: a in A) with the monotonicity property that Xa ≤ Xb almost surely whenever a < b? When such a realization is possible, we call the system (Pa: a in A) realizably monotone. Such a system necessarily is stochastically monotone, that is, satisfies Pa ≤ Pb in stochastic ordering whenever a < b. We give counterexample(s) to show that stochastic monotonicity is not in general sufficient for realizable monotonicity.

Given particular partial orderings, however, these two notions of monotonicity can be equivalent. For various large classes of partially ordered sets (posets) S, we present various conditions on the poset A that are necessary and sufficient for equivalence. Furthermore, for a certain broad class of acyclic posets S, we develop inverse probability transforms from [0,1) into S and show how to utilize them to explicitly construct systems (Xa: a in A) with the monotonicity property from a single uniform random variable U.

If A = S and both are equipped with the same partial ordering, the condition that the poset is acyclic is necessary and sufficient for monotonicity equivalence. This case arises in comparing applicability of two perfect sampling algorithms, one via coupling from the past introduced by Propp and Wilson, and the other based on rejection sampling introduced by Fill. Jesper Møller and G. K. Nicholls. Perfect simulation for sample-based inference. Statistics and Computing, to appear. (PDF)

Abstract: Perfect simulation algorithms based on Propp and Wilson (1996) have so far been of limited use for sampling problems of interest in statistics. We specify a new family of perfect sampling algorithms obtained by combining MCMC tempering algorithms with dominated coupling from the past, and demonstrate that our algorithms will be useful for sample based inference. Perfect tempering algorithms are less efficient than the MCMC algorithms on which they typically depend. However, samples returned by perfect tempering are distributed according to the intended distribution, so that these new sampling algorithms do not suffer from the convergence problems of MCMC. Perfect tempering is related to rejection sampling. When rejection sampling has been tried, but has proved impractical, it may be possible to convert the rejection algorithm into a perfect tempering algorithm, with a significant gain in algorithm efficiency.

Mark Huber. The swap move: a tool for building faster Markov chains, 1999. Preprint.

Abstract: In Monte Carlo Markov chain algorithms, a random sample is generated by running a Markov chain ``for a long time''. This technique has applications in many areas, including statistical physics and approximation algorithms. Two questions about this procedure arise, first, how long does the chain need to be run before the sample obtained is close to being in the desired distribution, and secondly, how should chains be designed to make the running time as low as possible? Here we present a method for increasing the speed of Markov chains that we call the swap move. First applied by Broder to approximate the permanent, here we apply it to several models from statistical physics. In each case we are able to create improved Markov chains for the problem. More importantly, we show how the concept of bounding chains may be applied to the swap move, thus allowing the use of perfect sampling algorithms such as coupling from the past. These algorithms eliminate the need to know the mixing time of the Markov chain, making these new chains always faster to use.

Mark Huber. Perfect sampling without a lifetime commitment, 1999. Preprint.

Abstract: Generating perfect samples from distributions using Markov chains has a wide range of applications, from statistical physics to approximation algorithms. In perfect sampling algorithms, a sample is drawn exactly from the stationary distribution of a chain, as opposed to methods that run the chain ``for a long time'' and create samples drawn from a distribution that is close to the stationary distribution. One of the primary methods for creating perfect samples, the coupling from the past protocol of Propp and Wilson, suffers from the fact that the user must be willing to commit to running the algorithm in its entirety in order to obtain unbiased, perfect samples. By using another method, the acceptance rejection method of Fill, Murdoch, and Rosenthal (FMR), the user may abort the procedure in the middle of a run without introducing bias. In this paper, we give the first general analysis of the running time of FMR compared to the running time of of CFTP. Furthermore, we show that FMR can be used to generate a strong stationary stopping time of a Markov chain. We show how to use FMR to generate perfect sampling algorithms for two real world examples, the hard core gas model and the Widom-Rowlinson mixture model. These two examples illustrate the techniques needed to apply FMR to a wide range of discrete and infinite state space chains.

Remarks: Two references for the general form of Fill's algorithm are (1) ``Fill'' (his original article) and (2) ``Fill, Machida, Murdoch, and Rosenthal'' (which supersedes the preprint by Murdoch and Rosenthal). David B. Wilson. How to couple from the past using a read-once source of randomness. Random Structures and Algorithms 16(1):85--113, 2000. arXiv:math.PR/9910050

Abstract: We give a new method for generating perfectly random samples from the stationary distribution of a Markov chain. The method is related to coupling from the past (CFTP), but only runs the Markov chain forwards in time, and never restarts it at previous times in the past. The method is also related to an idea known as PASTA (Poisson arrivals see time averages) in the operations research literature. Because the new algorithm can be run using a read-once stream of randomness, we call it read-once CFTP. The memory and time requirements of read-once CFTP are on par with the requirements of the usual form of CFTP, and for a variety of applications the requirements may be noticeably less. Some perfect sampling algorithms for point processes are based on an extension of CFTP known as coupling into and from the past; for completeness, we give a read-once version of coupling into and from the past, but it remains unpractical. For these point process applications, we give an alternative coupling method with which read-once CFTP may be efficiently used. Xiao-Li Meng. Towards a more general Propp-Wilson algorithm: multistage backward coupling. In Neil Madras, editor, Monte Carlo Methods, volume 26 of Fields Institute Communications. American Mathematical Society, 2000. To appear.

Abstract: A multistage implementation of Propp and Wilson's backward coupling scheme is proposed via cluster sampling, which can help reduce the time to coalescence. A random walk example is used to illustrate the multistage generalization and to compare it with Propp and Wilson's single-stage scheme. The use of backward cluster sampling is also proposed for stratified forward Markov chain Monte Carlo, taking advantage of the greater flexibility of the multistage framework. These proposals were inspired by a general principle in sample surveys --- multistage sampling is typically more effective than single stage sampling. George Casella, Kerrie L. Mengersen, Christian P. Robert and D. M. Titterington). Perfect slice samplers for mixtures of distributions. Journal of the Royal Statistical Society, Ser. B 64(4):777--790, 2002. (Postscript.)

Abstract: We propose a perfect sampler for mixtures of distributions, in the spirit of Mira and Roberts (1999), building on Hobert, Robert and Titterington (1999). The method relies on a marginalisation akin to Rao-Blackwellisation which illustrates the Duality Principle of Diebolt and Robert (1994) and utilises an envelope argument which embeds the finite support distribution on the latent variables within a continuous support distribution, easier to simulate by slice sampling. We also provide a number of illustrations in the cases of normal and exponential mixtures which show that the technique does not suffer from severe slow-down when the number of observations or the number of components increases. We thus obtain a general iid sampling method for mixture posterior distributions and illustrate convincingly that perfect sampling can be achieved for realistic statistical models and not only for toy problems.

Remarks: The authors have written a corrigendum, in which they recharacterize their results as ``approximate exact sampling''. Hans-Otto Georgii. Phase transition and percolation in Gibbsian particle models, 1999. arXiv:math.PR/9910005

Abstract: We discuss the interrelation between phase transitions in interacting lattice or continuum models, and the existence of infinite clusters in suitable random-graph models. In particular, we describe a random-geometric approach to the phase transition in the continuum Ising model of two species of particles with soft or hard interspecies repulsion. We comment also on the related area-interaction process and on perfect simulation. D. J. Murdoch. Exact sampling for Bayesian inference: unbounded state spaces. In Neil Madras, editor, Monte Carlo Methods, volume 26 of Fields Institute Communications. American Mathematical Society, 2000. To appear. (Postscript)

Abstract: Propp and Wilson [10, 11] described a protocol called coupling from the past (CFTP) for exact sampling from the steady-state distribution of a Markov chain Monte Carlo (MCMC) process. In it a past time is identified from which the paths of coupled Markov chains starting at every possible state would have coalesced into a single value by the present time; this value is then a sample from the steady-state distribution.
Foss and Tweedie  pointed out that for CFTP to work, the underlying Markov chain must be uniformly ergodic. Unfortunately, most of the chains in common use in Bayesian inference are not, when the state space is unbounded. However, this does not mean that CFTP can't be used; in this paper we present three modifications.
The first is a simple change to the chain to induce uniform ergodicity. The second (more extensively discussed in ) is a modification to CFTP due to Kendall  that makes use of random bounds on particular realizations of the chain. Finally, the last method attempts to make use of Meng's  multistage coupler to address the problem. David B. Wilson. Layered multishift coupling for use in perfect sampling algorithms (with a primer on CFTP). In Neil Madras, editor, Monte Carlo Methods, volume 26 of Fields Institute Communications, pages 141--176. American Mathematical Society, 2000. arXiv:math.PR/9912225

Abstract: In this article we describe a new coupling technique which is useful in a variety of perfect sampling algorithms. A multishift coupler generates a random function f() so that for each real x, f(x)-x is governed by the same fixed probability distribution, such as a normal distribution. We develop the class of layered multishift couplers, which are simple and have several useful properties. For the standard normal distribution, for instance, the layered multishift coupler generates an f() which (surprisingly) maps an interval of length l to fewer than 2+l/2.35 points --- useful in applications which perform computations on each such image point. The layered multishift coupler improves and simplifies algorithms for generating perfectly random samples from several distributions, including and autogamma distribution, posterior distributions for Bayesian inference, and the steady state distribution for certain storage systems. We also use the layered multishift coupler to develop a Markov-chain based perfect sampling algorithm for the autonormal distribution.
At the request of the organizers, we begin by giving a primer on CFTP (coupling from the past); CFTP and Fill's algorithm are the two predominant techniques for generating perfectly random samples using coupled Markov chains.

Haiyan Cai. Exact sampling using auxiliary variables, 1999. To Appear in the Statistical Computation Section of the ASA Proceedings. (Postscript.)

Abstract: We discuss two simple algorithms for generating exact samples from the equilibrium distribution of a Markov chain. These algorithms are based on a single run of the Markov chain in an ordinary way. One can pick up certain samples from the run as samples from the equilibrium distribution. This is done by introducing auxiliary variables as the indicators during the run. Alessandra Guglielmi, Chris C. Holmes, and Stephen G. Walker. Perfect simulation involving functionals of a Dirichlet process. Journal of Computational and Graphical Statistics 11(2):306--310, 2002. (Postscript)

Abstract: In this paper, we show how to perform perfect simulation of a functional of a Dirichlet process, when the values of the functional are bounded. We also give a procedure for ``approximate'' simulation in the case of unbounded functional values. Xeni K. Dimakos. A guide to exact simulation. International Statistical Review 69(1):27--48, 2001. (Postscript)

Abstract: Markov Chain Monte Carlo (MCMC) methods are used to sample from complicated multivariate distributions with normalizing constants that may not be computable and from which direct sampling is not feasible. A fundamental problem is to determine convergence of the chains. Propp & Wilson (1996) devised a Markov chain algorithm called Coupling From The Past (CFTP) that solves this problem, as it produces exact samples from the target distribution and determines automatically how long it needs to run. Exact sampling by CFTP and other methods is currently a thriving research topic. This paper gives a review of some of these ideas, with emphasis on the CFTP algorithm. The concepts of coupling and monotone CFTP are introduced, and results on the running time of the algorithm presented. The interruptible method of Fill (1998) and the method of Murdoch & Green (1998) for exact sampling for continuous distributions are presented. Novel simulation experiments are reported for exact sampling from the Ising model in the setting of Bayesian image restoration, and the results are compared to standard MCMC. The results show that CFTP works at least as well as standard MCMC, with convergence monitored by the method of Raftery & Lewis (1992, 1996). M. A. Novotny. Introduction to the Propp-Wilson method of exact sampling for the Ising model. In D. P. Landau, S. P. Lewis, and H.-B. Sch�ttler, editors, Computer Simulation Studies in Condensed Matter Physics XII, volume 85 of Springer Proceedings in Physics, pages 179--184. Springer Verlag, 2000. arXiv:cond-mat/9905195.

Abstract: An introduction to the Propp-Wilson method of coupling-from-the-past for the Ising model is presented. It enables one to obtain exact samples from the equilibrium spin distribution for ferromagnetic interactions. Both uniform and random quenched magnetic fields are included. The time to couple from the past and its standard deviation are shown as functions of system size, temperature, and the field strength. The time should be an estimate of the ergodic time for the Ising system.

Elke Thönnes. A primer on perfect simulation. Statistical Physics and Spatial Statistic, ed. Klaus R. Mecke and D. Stoyan, Springer Lecture Notes in Physics #554, 349--378, 2000.

Abstract: Markov chain Monte Carlo has long become a very useful, established tool in statistical physics and spatial statistics. Recent years have seen the development of a new, exciting generation of Markov chain Monte Carlo methods: perfect simulation algorithms. In contrast to conventional Markov chain Monte Carlo perfect simulation produces samples which are guaranteed to have the exact equilibrium distribution. In the following we provide an example-based introduction into perfect simulation focussed on the method called Coupling from the Past. Michael Harvey. Monte Carlo inference for belief networks using coupling from the past. Master's thesis, department of computer science, University of Toronto, 1999. (Postscript, PDF.)

Abstract: A common method of inference for belief networks is Gibbs sampling, in which a Markov chain converging to the desired distribution is simulated. In practice, however, the distribution obtained with Gibbs sampling differs from the desired distribution by an unknown error, since the simulation time is finite. Coupling from the past selects states from exactly the desired distribution by starting chains in every state at a time far enough back in the past that they reach the same state at time t= 0. To track every chain is an intractable procedure for large state spaces. The method proposed in this thesis uses a summary chain to approximate the set of chains. Transitions of the summary chain are efficient for noisy-or belief networks, provided that sibling variables of the network are not directly connected, but often require more simulation time steps than would be needed if chains were tracked exactly. Testing shows that the method is a potential alternative to ordinary Gibbs sampling, especially for networks that have poor Gibbs sampling convergence, and when the user has a low error tolerance. Tine Møller Sørensen. A study of Propp & Wilson's method for exact Markov chain Monte Carlo sampling, 1997. Master's thesis in computational statistics, University of Bath.

Abstract: The aim of this project is to study the Propp & Wilson method (Propp & Wilson, 1996) for exact simulation via Markov chain Monte Carlo.
The first part of the project involves investigating the computation time required for Propp & Wilson's algorithm in different applications. The objective is to find the optimal division of effort between `pre-sampling' which ensures the Markov chain is sampling from its ergodic distribution and the subsequent `production runs' which yield internally correlated realisations from which to draw inferences about the target distribution. In our application, it is found that the most accurate estimates are obtained by using one long run of correlated samples instead of several shorter runs, although there are other advantages of using several runs.
In the second part of the project different sampling methods based on the Propp & Wilson algorithm are studied in situations where exact sampling is not feasible. The objective of this part of the project is to develop methods that can be used as diagnostics for exact sampling in these cases. We consider a number of distributions and according to the situation, different methods are tried out. Furthermore, a test is developed to assess the methods.

Roberto Fernández, Pablo A. Ferrari, and Nancy L. Garcia. Perfect simulation for interacting point processes, loss networks and Ising models, 1999. math.PR/9911162

Abstract: We present a perfect simulation algorithm for measures that are absolutely continuous with respect to some Poisson process and can be obtained as invariant measures of birth-and-death processes. Examples include area- and perimeter-interacting point processes (with stochastic grains), invariant measures of loss networks, and the Ising contour and random cluster models. The algorithm does not involve couplings of the process with different initial conditions and it is not tied up to monotonicity requirements. Furthermore, it directly provides perfect samples of finite windows of the infinite-volume measure, subjected to time and space ``user-impatience bias''. The algorithm is based on a two-step procedure: (i) a perfect-simulation scheme for a (finite and random) relevant portion of a (space-time) marked Poisson processes (free birth-and-death process, free loss networks), and (ii) a ``cleaning'' algorithm that trims out this process according to the interaction rules of the target process. The first step involves the perfect generation of ``ancestors'' of a given object, that is of predecessors that may have an influence on the birth-rate under the target process. The second step, and hence the whole procedure, is feasible if these ``ancestors'' form a finite set with probability one. We present a sufficiency criteria for this condition, based on the absence of infinite clusters for an associated (backwards) oriented percolation model. Bas Straatman. Exact sampling and applications to a mite dispersal model, 1998. Master's thesis, Utrecht.

This thesis does not itself come with an abstract, but the abstract for a talk given by Straatman on the topic follows.
Abstract: I have studied a Markov chain which describes the dispersal of mites on cassava fields in West Africa. Instead of obtaining a sample of this model with approximately the stationary distribution by Monte Carlo simulation, it is possible to obtain a result which is exactly distributed according to the stationary distribution. One way to do this is coupling from the past. After a short introduction of this procedure I will explain the changes I have made to the original model in order to be able to use coupling from the past and show some simulations which give exact samples. Jens Lund and Elke Thönnes. Perfect simulation and inference for point processes given noisy observations. Computational Statistics 19(2):317--336, 2004. (Postscript)

Abstract: The paper is concerned with the Bayesian analysis of point processes which are observed with noise. It is shown how to produce exact samples from the posterior distribution of the unobserved true point process given a noisy observation. The algorithm is a perfect simulation method which applies dominated coupling from the past to a spatial birth-and-death process. Dominated coupling from the past is made amenable by augmenting the state space of the target chain. We discuss how to optimize this perfect simulation algorithm and how to use it in a statistical inference problem of pratical importance. We further describe the merits and the limitations of the method.

Jens Lund. Statistical inference and perfect simulation for point processes observed with noise. PhD thesis, The Royal Veterinary and Agricultural University, Copenhagen, Department of Statistics, 1999.

Abstract: The main themes of this thesis are spatial statistics and simulation algorithms. The thesis is split into five papers that may be read independently. All five papers deal with spatial models. Lund & Rudemo (1999), Lund et al. (1999), and Lund & Thönnes (1999b) deal with the same new model for point processes observed with noise, and Lund et al. (1999), Lund & Thönnes (1999b), and Lund & Thönnes (1999a) has a simulation aspect.
Lund & Rudemo (1999), Lund et al. (1999), and Lund & Thönnes (1999b) develop and analyse a new model for point processes observed with noise. Usually the analysis of spatial point patterns assume that the observed points (the true points) are a realization from a specific model. In contrast our approach is to assume the observed pattern generated by thinning and displacement of the true points, and allow for contamination by points not belonging to the true pattern.
Lund & Rudemo (1999) develop the model for point processes observed with noise. The likelihood function for an observation of a noise corrupted point pattern given the true positions is derived. As data for our analysis is indeed a realization of the underlying true process and its associated noise corrupted point pattern we need not consider a model for the underlying process. The parameters in the model describe how many of the true points are lost, how large the displacements are, and the number of contaminating surplus points. For estimation of the parameters in the noise model a deterministic, iterative, and approximative maximum likelihood estimation algorithm is developed. The likelihood function is a sum of an excessive large number of terms, and the algorithm works by finding large dominating terms. Alternative estimation methods are discussed.
Lund et al. (1999) analyse the model developed in Lund & Rudemo (1999) with respect to the now unobserved true points. We assume a noisy observation of a true point pattern and knowledge of the parameters in the model. A Bayesian point of view is now adopted and we specify a prior distribution for the underlying true process. Given the model, the prior distribution, and the noisy observation, we get the posterior distribution of the true points. This posterior distribution is investigated by samples from the distribution. These samples are obtained from a Markov chain Monte Carlo (MCMC) algorithm extending the Metropolis-Hastings sampler for point processes. A thorough discussion is provided on the choice of prior distribution and how to present the samples from the MCMC runs. The MCMC samples are used to estimate for example the K-function for the unobserved true point pattern. These estimates are clearly better than estimates based on the observed points alone.
The use of the MCMC algorithm in Lund et al. (1999) relies on the fact that a Markov chain run for a long time approaches its stationary distribution. Lund & Thönnes (1999b) uses a recent technique called Coupling From The Past (CFTP) to deliver a sample drawn from the exact posterior distribution of the unobserved true points described in Lund et al. (1999), a so-called perfect simulation. This perfect simulation algorithm is based on spatial birth-and-death processes for simulation of point processes. In order to apply CFTP in our problem the simulation is carried out on an augmented state space. The algorithm turns out to be too slow in practice and thus demonstrates possible current limits of CFTP.
Lund & Thönnes (1999a) describes a new perfect simulation algorithm for general locally stable point processes. The algorithm is based on CFTP for Metropolis-Hastings simulation of point processes and it is simpler than the previous known perfect algorithm based on Metropolis-Hastings simulation for this class of models. The present state of the algorithm is that it is far too slow to be useful in practice, and it might have some theoretical flaws. These problems are discussed in the introductory part of the thesis.
Lund (1998) develops a model for survival times of trees that take the spatial positions of the trees into account. At a finite number of timepoints it is observed whether a tree is alive or not, and thus we have interval censoring of the even aged trees. The model is a discrete time version of Cox's proportional hazards model. Positions of trees are considered as fixed, and they are used to compute competition indices that enter the model as covariates. It is shown that small trees have a higher risk of dying than large tress and the area of the experiment is inhomogeneous. In addition, Hegyi's competition index based on basal area is a significant covariate. Alison Gibbs. Bounding the convergence time of the Gibbs sampler in Bayesian image restoration. Biometrika 87(4):749--766, 2000. (Postscript).

Abstract: This paper shows how coupling methodology can be used to give precise, a priori bounds on the convergence time of Markov chain Monte Carlo algorithms for which a partial order exists on the state space which is preserved by the Markov chain transitions. This methodology is applied to give a bound on the convergence time of the random scan Gibbs sampler used in the Bayesian restoration of an image of N pixels. For our algorithm in which only one pixel is updated at each iteration, the bound is a constant times N2. The proportionality constant is given and is easily calculated. These bounds also give an indication of the running time of coupling from the past algorithms. Alison Gibbs. Convergence of Markov Chain Monte Carlo Algorithms with Applications to Image Restoration. PhD thesis, University of Toronto, Department of Statistics, 1999. (Postscript, PDF.)

Abstract: Markov chain Monte Carlo algorithms, such as the Gibbs sampler and Metropolis-Hastings algorithm, are widely used in statistics, computer science, chemistry and physics for exploring complicated probability distributions. A critical issue for users of these algorithms is the determination of the number of iterations required so that the result will be approximately a sample from the distribution of interest.
In this thesis, we give precise bounds on the convergence time of the Gibbs sampler used in the Bayesian restoration of a degraded image. We consider convergence as measured by both the usual choice of metric, total variation distance, and the Wasserstein metric. In both cases we exploit the coupling characterisation of the metric to get our results. Our results can also be applied to the coupling-from-the-past algorithm of Propp and Wilson (1996) to get bounds on its running time.
The application of our theoretical results requires the computation of parameters of the algorithm. These computations may be prohibitively difflcult in many situations. We discuss how our results can be applied in these situations through the use of auxiliary simulation to estimate these parameters.
We also give a summary of probability metrics and the relationships between them, including several new relationships. Paul Fearnhead. Perfect simulation from population genetic models with selection. Theoretical Population Biology, 59:263--279, 2001. (Postscript)
Software available here.

Abstract: We consider using the ancestral selection graph to simulate samples from genetic models with selection. Currently the use of the ancestral selection graph to simulate such samples is limited. This is because the computational requirement for simulating such samples increases exponentially with the selection rate, and also due to the need to be able to simulate a sample of size 1 from the population at equilibrium. For the only case where the distribution of a sample of size one is known, that of parent independent mutations, more efficient simulation algorithms exist. We will show that by applying the idea of coupling from the past to the ancestral selection graph, samples can be simulated from a general K-allele model without knowledge of the distribution of a sample of size 1. Further, the computation involved in generating such samples appears to be less than that of simulating the ancestral selection graph until its ultimate ancestor. In particular, in the case of genic selection with parent independent mutations, the computational requirement increases only quadratically with the selection rate. The algorithm is then used to simulate samples at microsatellite loci, and to consider the effect of selection on linked neutral sites. Jesper Møller. Aspects of spatial statistics, stochastic geometry and Markov chain Monte Carlo methods. Thesis for doctoral degree in Natural Sciences, Aalborg University.

From the preface: This doctoral thesis is concerned with certain aspects of spatial statistics, stochastic geometry, and Markov chain Monte Carlo methods. It is based on a monograph [L] and twenty selected papers [A--K, M--V] produced over about the last decade; see the following two pages. The papers [A, G, H, I] and [95, 140] constituted my Ph.D. thesis from 1988 entitled Stochastic Geometry and Markov Models; [A, H, I] in revised form are included in the present thesis.
The thesis consists of three parts: (i) statistical models for lattice data, spatial (marked) point patterns, and spatial birth-and-death processes, (ii) Markov chain Monte Carlo methods and statistical inference for spatial models, and (iii) random tessellations. Parts (i) and (ii) are closely related and the focus is on recent theoretical and applied statistical aspects. Part (iii) is more within the traditional framework of stochastic geometry, though new results will be discussed as well.

Remark: Section 3.2 deals with perfect simulation. Song-Chun Zhu and David Mumford. Learning Generic Prior Models for Visual Computation, 1997. To appear in IEEE Transactions on Pattern Analysis and Machine Intelligence. (Postscript)

Abstract: Many generic prior models have been widely used in computer vision ranging from image and surface reconstruction to motion analysis, and these models presume that surfaces of objects be smooth, and adjacent pixels in images have similar intensity values. However, there is little rigorous theory to guide the construction and selection of prior models for a given application. Furthermore, images are often observed at arbitrary scales, but none of the existing prior models are scale-invariant. Motivated by these problems, this article chooses general natural images as a domain of application, and proposes a theory for learning prior models from a set of observed natural images. Our theory is based on a maximum entropy principle, and the learned prior models are of Gibbs distributions. A novel information criterion is proposed for model selection by minimizing a Kullback-Leibler information distance. We also investigate scale invariance in the statistics of natural images and study a prior model which has scale invariant property. In this paper, in contrast with all existing prior models, negative potentials in Gibbs distribution are first reported. The learned prior models are verified in two ways. Firstly images are sampled from the prior distributions to demonstrate what typical images they stand for. Secondly they are compared with existing prior models in experiments of image restoration.

Remarks: This paper is about image processing rather than perfect sampling per se, but the end of section 3 includes discussion of their use of the ``Propp-Wilson criterion'' for deciding how long to run their Markov chains. Andrew M. Childs, Ryan B. Patterson, and David J. C. MacKay. Exact sampling from non-attractive distributions using summary states. Physical Review E 63:036113, 2001. arXiv:cond-mat/0005132

Abstract: Propp and Wilson's method of coupling from the past allows one to efficiently generate exact samples from attractive statistical distributions (e.g., the ferromagnetic Ising model). This method may be generalized to non-attractive distributions by the use of summary states, as first described by Huber. Using this method, we present exact samples from a frustrated antiferromagnetic triangular Ising model and the antiferromagnetic q= 3 Potts model. We discuss the advantages and limitations of the method of summary states for practical sampling, paying particular attention to the slowing down of the algorithm at low temperature. In particular, we show that such a slowing down can occur in the absence of a physical phase transition.

Luc Devroye, James A. Fill, and Ralph Neininger. Perfect Simulation from the Quicksort Limit Distribution, 2000. Preprint.

Abstract: The weak limit of the normalized number of comparisons needed by the Quicksort algorithm to sort n randomly permuted items is known to be determined implicitly by a distributional fixed-point equation. We give an algorithm for perfect random variate generation from this distribution. George Casella, Michael Lavine, and Christian P. Robert. Explaining the perfect sampler. The American Statistician 55(4):299--305, 2001. (Postscript).

Abstract: In 1996, Propp and Wilson introduced Coupling from the Past (CFTP), an algorithm for generating a sample from the exact stationary distribution of a Markov chain. In 1998, Fill proposed another so-called perfect sampling algorithm. These algorithms have enormous potential in Markov Chain Monte Carlo (MCMC) problems because they eliminate the need to monitor convergence and mixing of the chain. This article provides a brief introduction to the algorithms, with an emphasis on understanding rather than technical detail.

Olle Häggström. Finite Markov Chains and Algorithmic Applications. Cambridge University Press, 2000.

Contents: These lecture notes give an elementary introduction to Markov chains, and applications to algorithms like MCMC and simulated annealing. Chapters are entitled:

0. Preface.
1. Basics of probability theory.
2. Markov chains.
3. Computer simulation of Markov chains.
4. Irreducible and aperiodic Markov chains.
5. Stationary distributions.
6. Reversible Markov chains.
7. Markov chain Monte Carlo.
8. The Propp-Wilson algorithm.
9. Simulated annealing.

Erik van Zwet. Perfect stochastic EM. In M. C. M. de Gunst, C. A. J. Klaassen, and A. W. van der Vaart, editors, State of the Art in Probability and Statistics, IMS Lecture Notes #36, pages 607--616, 2001. (Postscript.)

Abstract: In a missing data problem we observe the result of a (known) many-to-one mapping of an unobservable `complete' dataset. The aim is to estimate some parameter of the distribution of the complete data. In this situation, the stochastic version of the EM algorithm is sometimes a viable option. It is an iterative algorithm that produces an ergodic Markov chain on the parameter space. The stochastic EM (StEM) estimator is then a sample from the equilibrium distribution of this chain. Recently, a method called `coupling from the past' was invented to generate a Markov chain in equilibrium. We investigate when this method can be used for a StEM chain and give examples where this is indeed possible. W. L. Cooper and R. L. Tweedie. Perfect simulation of an inventory model for perishable products. Stochastic Models, 18(2):229--243, 2002. (Postscript.)

Abstract: We study an inventory model for perishable products with a critical-number ordering policy under the assumption that demand for the product forms an i.i.d. sequence, so that the state of the system forms a Markov chain. Explicit calculation of the stationary distribution has proved impractical in cases where items have reasonably long lifetimes and for systems with large ``under-up-to'' levels. Using the recently-developed coupling-from-the-past method, we introduce a technique to estimate the stationary distribution of the Markov chain via ``perfect simulation.'' The Markov chain that results from the use of a critical-number policy is particularly amenable to these simulation techniques, despite not being ordered in its initial state, since the recursive equations satisfied by the Markov chain enable us to identify specific demand patterns where the backward coupling occurs.

Jesper Møller. A review of perfect simulation in stochastic geometry. Selected Proceedings of the Symposium on Inference for Stochastic Processes. Eds. I.V. Basawa, C.C. Heyde and R.L. Taylor, IMS Lecture Notes & Monographs Series, 2001, Volume 37, pages 333--355. (Postscript.)

Abstract: We provide a review and a unified exposition of the Propp-Wilson algorithm and various other algorithms for making perfect simulations with a view to applications in stochastic geometry. Most examples of applications are for spatial point processes.

Francis Comets, Roberto Fernández, and Pablo A. Ferrari. Processes with Long Memory: Regenerative Construction and Perfect Simulation, 2000. arXiv:math.PR/0009204

Abstract: We present a perfect simulation algorithm for stationary processes indexed by Z, with summable memory decay. Depending on the decay, we construct the process on finite or semiinfinite intervals, explicitly from an i.i.d. uniform sequence. Even though the process has infinite memory, its value at time 0 depends only on a finite, but random, number of these uniform variables. The algorithm is based on a recent regenerative construction of these measures by Ferrari, Maass, Martinez and Ney. As applications, we discuss the perfect simulation of binary autoregressions and Markov chains on the unit interval.

James A. Fill and Mark Huber. The randomness recycler: a new technique for perfect sampling. 41st Annual Symposium on Foundations of Computer Science, pages 503--511, 2000. arXiv:math.PR/0009242

Abstract: For many probability distributions of interest, it is quite difficult to obtain samples efficiently. Often, Markov chains are employed to obtain approximately random samples from these distributions. The primary drawback to traditional Markov chain methods is that the mixing time of the chain is usually unknown, which makes it impossible to determine how close the output samples are to having the target distribution. Here we present a new protocol, the randomness recycler (RR), that overcomes this difficulty. Unlike classical Markov chain approaches, an RR-based algorithm creates samples drawn exactly from the desired distribution. Other perfect sampling methods such as coupling from the past use existing Markov chains, but RR does not use the traditionalMarkov chain at all. While by no means universally useful, RR does apply to a wide variety of problems. In restricted instances of certain problems, it gives the first expected linear time algorithms for generating samples. Here we apply RR to self-organizing lists, the Ising model, random independent sets, random colorings, and the random cluster model.

James A. Fill and Motoya Machida. Realizable Monotonicity and Inverse Probability Transform, 2000. arXiv:math.PR/0010026

Abstract: A system (Pa: a in A) of probability measures on a common state space S indexed by another index set A can be ``realized'' by a system (Xa: a in A) of S-valued random variables on some probability space in such a way that each Xa is distributed as Pa. Assuming that A and S are both partially ordered, we may ask when the system (Pa: a in A) can be realized by a system (Xa: a in A) with the monotonicity property that Xa <= Xb almost surely whenever a <= b. When such a realization is possible, we call the system (Pa: a in A) ``realizably monotone.'' Such a system necessarily is stochastically monotone, that is, satisfies Pa <= Pb in stochastic ordering whenever a <= b. In general, stochastic monotonicity is not sufficient for realizable monotonicity. However, for some particular choices of partial orderings in a finite state setting, these two notions of monotonicity are equivalent. We develop an inverse probability transform for a certain broad class of posets S, and use it to explicitly construct a system (Xa: a in A) realizing the monotonicity of a stochastically monotone system when the two notions of monotonicity are equivalent.

Marc A. Loizeaux and Ian W. McKeague. Bayesian inference for spatial point processes via perfect sampling, 2000. Preprint.

Abstract: We study a Bayesian cluster model for spatial point processes in which observations are assumed to cluster around a finite collection of underlying landmarks. Perfect sampling from the posterior distribution of landmarks is investigated. In the case of Neyman-Scott cluster models, perfect sampling from the posterior is shown to be computationally feasible via a coupling-from-the-past type algorithm of Kendall and Møller. An application to data on leukemia incidence in upstate New York is developed.

M. N. M. van Lieshout and Erik van Zwet. Maximum likelihood estimation for the bombing model, 2000. Preprint.

Abstract: Perhaps the best known example of a random set is the Boolean model. It is the union of `grains' such as discs, squares or triangles which are placed at the points of a Poisson point process. The Poisson points are called the `germs'. We are interested in estimating the intensity, say lambda, of the Poisson process from a sample of a Boolean model of discs (the bombing model). A natural estimate is the number of germs in the observation region divided by the area of that region. Unfortunately, we do not observe the presence of a given germ when its associated disc is completely covered by other discs. On the other hand, we observe the exact location of a germ when we observe any part of its associated disc's boundary. We demonstrate how to apply Coupling From The Past to sample from the conditional distribution, given the data, of the unobserved germs. Such samples allow us to approximate the maximum likelihood estimator of the intensity. We discuss and compare two methods to do so. The first method is based on a Monte Carlo approximation of the likelihood function. The second is a stochastic version of the EM algorithm.

James P. Hobert and Christian P. Robert. Moralizing perfect sampling, 2000. Preprint.

Abstract: Let Φ be an irreducible, aperiodic, Harris recurrent Markov chain with invariant probability measure π. We show that if a minorization condition can be established for Φ, then π can be represented as an infinite mixture from which it may be possible to sample. Making exact draws from this mixture (and hence from π requires the ability to simulate two different random variables, T* and Nt. When the small set in the minorization condition is the entire state space (and hence Φ is uniformly ergodic), simulating these random variables is straightforward and the resulting algorithm turns out to be equivalent to Murdoch and Green's (1998) multigamma coupler. Thus, we are able to achieve perfect sampling in the uniformly ergodic case without any appeal to coupling or backward simulation. When the small set is not the state space, simulating T* is more problematic. Fortunately, T* is univariate and has support N. We construct an estimate of the mass function of T* and an asymptotic bound on the error of this estimate. These results are used to address rigorously the issue of burn-in. Specifically, we form an approximation of π whose error can be controlled and which can be used as an initial distribution for Φ. We illustrate this with a realistic example. Finally, we provide a simple Markov chain Monte Carlo (MCMC) algorithm whose stationary distribution is that of T*. Finding a perfect sampler based on our MCMC algorithm for the univariate, discrete T* is tantamount to the ability to sample exactly from π.

Michael K. Schneider. Exact and Approximate Sampling from the Stationary Distribution of a Markov Chain, 1998. Area exam report. (postscript, compressed postscript, pdf)

Abstract: When simulating a physical system with discrete sates, one often would like to generate a sample from the stationary distribution of a Markov chain. This report focuses on three sampling methodologies which do not rely on explicitly computing the stationary distribution. Two of these lead to algorithms which can generate an exact sample in finite time. The third yields a sample whose distribution approximates, but is arbitrary close to, the stationary distribution from which one desires a sample. The approximate and one of the exact methodologies are illustrated with examples from statistical mechanics.

Krishna B. Athreya and Örjan Stenflo. Perfect Sampling for Doeblin Chains. Sankhya - The Indian Journal of Statistics 65(4):763--777, 2003. (PDF).

Abstract: For Markov chains that can be generated by iteration of i.i.d. random maps from the state space X into itself (this holds if X is Polish) it is shown that the Doeblin minorization condition is necessary and sufflcient for the method by Propp and Wilson for ``perfect'' sampling from the stationary distribution π to be successful.
Using only the transition probability P we produce in a geometrically distributed random number of steps N a ``perfect'' sample from π of size N!. Joakim Linde, Cristopher Moore, Mats G. Nordahl. An n-dimensional generalization of the rhombus tiling, 2001. Preprint.

Abstract: Several classic tilings, including rhombuses and dominoes, possess height functions which allow us to 1) prove ergodicity and polynomial mixing times for Markov chains based on local moves, 2) use coupling from the past to sample perfectly random tilings, 3) map the statistics of random tilings at large scales to physical models of random surfaces, and and 4) are related to the ``arctic circle'' phenomenon. However, few examples are known for which this approach works in three or more dimensions. Here we show that the rhombus tiling can be generalized to n-dimensional tiles for any n≥3. For each n, we show that a certain local move is ergodic, and conjecture that it has a mixing time of O(Ln+2 log L) on regions of size L. For n=3, the tiles are rhombohedra, and the local move consists of switching between two tilings of a rhombic dodecahedron. We use coupling from the past to sample random tilings of a large rhombic dodecahedron, and show that arctic regions exist in which the tiling is frozen into a fixed state. However, unlike the two-dimensional tiling in which the arctic region is an inscribed circle, here it seems to be octahedral. In addition, height fluctuations between the boundary of the region and the center appear to be constant rather than growing logarithmically. We conjecture that this is because the physics of the model is in a ``smooth'' phase where it is rigid at large scales, rather than a ``rough'' phase in which it is elastic. Kasper K. Berthelsen and Jesper Møller. Spatial jump processes and perfect simulation, 2001. Preprint.

Abstract: Spatial birth-and-death processes, spatial birth-and-catastrophe processes, and more general types of spatial jump processes are studied in detail. Particularly, varied kind of coupling constructions are considered, leading to some known and some new perfect simulation procedures for different types of spatial jump processes in equilibrium. The considered equilibrium distributions include many classical Gibbs point process models and a new class of models for spatial point processes introduced in the text. Jesper Møller and Rasmus P. Waagepetersen. Simulation-based inference for spatial point processes, 2001. Preprint.
Kasper K. Berthelsen and Jesper Møller. A primer on perfect simulation for spatial point processes, 2001. Preprint.

Abstract: This primer provides a self-contained exposition of the case where spatial birth-and-death processes are used for perfect simulation of locally stable point processes. Particularly, a simple dominating coupling from the past (CFTP) algorithm and the CFTP algorithms introduced in Kendall (1998), Kendall & Møller (2000), and Fernández, Ferrari & Garcia (2000) are studied. Some empirical results for the algorithms are discussed. Julian Besag. Likelihood analysis of binary data in space and time. Manuscript, 2001.

Keywords: Autologistic; Conditional probability; Coupling from the past; Lattice data; Markov chain Monte Carlo; Markov random fields; Maximum likelihood; Maximum pseudolikelihood; Perfect simulation; Space and time; Spread of disease David B. Wilson. Mixing times of lozenge tiling and card shuffling Markov chains, 2001. arXiv:math.PR/0102193

Abstract: We show how to combine Fourier analysis with coupling arguments to bound the mixing times of a variety of Markov chains. The mixing time is the number of steps a Markov chain takes to approach its equilibrium distribution. One application is to a class of Markov chains introduced by Luby, Randall, and Sinclair to generate random tilings of regions by lozenges. For an L X L region we bound the mixing time by O(L4 log L), which improves on the previous bound of O(L7), and we show the new bound to be essentially tight. In another application we resolve a few questions raised by Diaconis and Saloff-Coste, by lower bounding the mixing time of various card-shuffling Markov chains. Our lower bounds are within a constant factor of their upper bounds. When we use our methods to modify a path-coupling analysis of Bubley and Dyer, we obtain an O(n3 log n) upper bound on the mixing time of the Karzanov-Khachiyan Markov chain for linear extensions.

Remarks: Among other things this article analyses the running time of CFTP when applied to lozenge tilings, and contains experiments pertinent to Fill's algorithm. Kasper K. Berthelsen and Jesper Møller. Perfect simulation and inference for spatial point processes. Manuscript, 2002.

Abstract: The advantages and limitations of using perfect simulation for simulation-based inference for pairwise interaction point processes are discussed. Various aspects of likelihood inference for the Strauss process with unknown range of interaction are studied. A large part of the paper concerns non-parametric Bayesian inference for the interaction function. Markov chain Monte Carlo methods, particularly path sampling, play an important role. Several empirical results and various datasets are considered. Duncan J. Murdoch and Xiao-Li Meng. Towards perfect sampling for Bayesian mixture priors. Bayesian Methods with Applications to Science, Policy, and Official Statistics. Selected papers from ISBA 2000: the sixth world meeting of the International Society for Bayesian Analysis.

Abstract: Perfect sampling using the coupling from the past (CFTP) algorithm was introduced by Propp and Wilson in 1996. In much the way rejection sampling allows one to convert samplers from one distribution into samplers from another, CFTP allows one to convert Markov chain Monte Carlo algorithms from approximate samplers of the steady-state distribution into perfect ones. Since 1996 CFTP has been applied to many different Markov chains. However, its use in routine Bayesian computation is still in the early stages of development. This paper provides a couple of building blocks for its potentially routine application in Bayesian mixture priors, including a t mixture coupler, and demonstrates the types of difficulties that currently prevent CFTP from being applied routinely in Bayesian computation. Radu V. Craiu and Xiao-Li Meng. Antithetic coupling for perfect sampling. Bayesian Methods with Applications to Science, Policy, and Official Statistics. Selected papers from ISBA 2000: the sixth world meeting of the International Society for Bayesian Analysis.

Abstract: This paper reports some initial investigations of the use of antithetic variates in perfect sampling. A simple random walk example is presented to il-lustrate the key ingredients of antithetic coupling for perfect sampling as well as its potential benefit. A key step in implementing antithetic coupling is to generate k≥2 random variates that are negatively associated, a stronger condition than negative correlation as it requires that the variates remain non-positively corre-lated after any (component-wise) monotone transformations have been applied. For k=2, this step is typically trivial (e.g., by taking U and 1-U where U~U(0,1)) and it constitutes much of the common use of antithetic variates in Monte Carlo simulation. Our emphasis is on k>2 because we have observed some general gains in going beyond the commonly used pair of antithetic variates. We discuss several ways of generating negatively associated random variates for arbitrary k, and our comparison generally favors Iterative Latin Hypercube Sampling.

Nancy L. Garcia. Perfect simulation of spatial processes. Resenhas - IME-USP, 4(3):281--324, 2000.

Abstract: This work presents a review of some of the schemes used to perfect sample from spatial processes. As examples, the area-interaction point process, Strauss process, penetrable sphere model, Peierls contours of the Ising model and continuous loss networks are studied under Coupling from The Past Algorithm (CFTP), Acceptance and Rejection Algorithm (ARA) and Backward-Forward Algorithm(BFA). Sung-woo Cho and Ashish Goel. Exact sampling in machine scheduling problems. Lecture Notes in Computer Science 2129 (Approximation Randomization and Combinatorial Optimization: Algorithms and Techniques), Michel X. Goemans, Klaus Jansen, José D.P. Rolim, Luca Trevisan, Eds., pages 202--210.

Abstract: Analysis of machine scheduling problem can get very complicated even for simple scheduling policies and simple arrival processes. The problem becomes even harder if the scheduler and the arrival process are complicated, or worse still, given to us as a black box. In such cases it is useful to obtain a typical state of the system which can then be used to deduce information about the performance of the system or to tune the parameters for either the scheduling rule or the arrival process. We consider two general scheduling problems and present an algorithm for extracting an exact sample from the stationary distribution of the system when the system forms an ergodic Markov chain. We assume no knowledge of the internals of the arrival process or the scheduler. Our algorithm assumes that the scheduler has a natural monotonic property, and that the job service times are geometric/exponential. We use the Coupling From The Past paradigm due to Propp and Wilson to obtain our result. In order to apply their general framework to our problems, we perform a careful coupling of the different states of the Markov chain. Ashish Goel and Michael Mitzenmacher. Exact sampling of TCP window states. To appear in IEEE Infocom, 2002.

Abstract: We demonstrate how to apply Coupling from the Past, a simulation technique for exact sampling, to Markov chains based on TCP variants. This approach provides a new, statistically sound paradigm for network simulations: instead of simulating a protocol over long times, or explicitly finding the stationary distribution of a Markov chain, use Coupling from the Past to quickly obtain samples from the stationary distribution. Coupling from the Past is most efficient when the underlying state space satisfies a partial order and certain monotonicity conditions. To efficiently apply this general paradigm to TCP, we demonstrate that the states of a simple TCP model possess a monotonic partial order; this order appears interesting in its own right. Preliminary simulation results indicate that this approach is quite efficient, and produces results which are similar to those obtained by simulating a TCP-Tahoe connection.

Keith Crank and James A. Fill. Interruptible exact sampling in the passive case, 2002. arXiv:math.PR/0202136

Abstract: We establish, for various scenarios, whether or not interruptible exact stationary sampling is possible when a finite-state Markov chain can only be viewed passively. In particular, we prove that such sampling is not possible using a single copy of the chain. Such sampling is possible when enough copies of the chain are available, and we provide an algorithm that terminates with probability one.

Robert P. Dobrow and James A. Fill. Speeding up the FMMR perfect sampling algorithm: A case study revisited. Random Structures and Algorithms 23(4):434--452, 2003. arXiv:math.PR/0205120

Abstract: In a previous paper by the second author, two Markov chain Monte Carlo perfect sampling algorithms - one called coupling from the past (CFTP) and the other (FMMR) based on rejection sampling - are compared using as a case study the move-to-front (MTF) self-organizing list chain. Here we revisit that case study and, in particular, exploit the dependence of FMMR on the user-chosen initial state. We give a stochastic monotonicity result for the running time of FMMR applied to MTF and thus identify the initial state that gives the stochastically smallest running time; by contrast, the initial state used in the previous study gives the stochastically largest running time. By changing from worst choice to best choice of initial state we achieve remarkable speedup of FMMR for MTF; for example, we reduce the running time (as measured in Markov chain steps) from exponential in the length n of the list nearly down to n when the items in the list are requested according to a geometric distribution. For this same example, the running time for CFTP grows exponentially in n.

Anne Philippe and Christian P. Robert. Perfect simulation of positive Gaussian distributions. Statistics and Computing 13(2):179--186, 2003. (Postscript.)

Abstract: We provide an exact simulation algorithm that produces variables from truncated Gaussian distributions on R+p via a perfect sampling scheme, based on stochastic ordering and slice sampling, since accept-reject algorithms like the one of Geweke (1991) and Robert (1995) are difficult to extend to higher dimensions.

Gareth O. Roberts and Jeffrey S. Rosenthal. Combinatorial identities associated with CFTP, 2002.

Abstract: We explore a method of obtaining combinatorial identities by analysing partially-completed runs of the Coupling from the Past (CFTP) algorithm. In particular, using CFTP for simple symmetric random walk (ssrw) with holding boundaries, we derive an identity involving linear combinations of Cab(s) for different a and b, where Cab(s) is the probability that unconstrained ssrw run from 0 for time n has maximum value a, and minimum value b, and ends up at s at time n.

L. A. Breyer and G. O. Roberts. A new method for coupling random fields. LMS Journal of Computation and Mathematics 5:77--94, 2002.

Abstract: Given a Markov chain, a stochastic flow which simultaneously constructs sample paths started at each possible initial value can be constructed as a composition of random fields. In this paper, we describe a method for coupling flows by modifying an arbitrary field (consistent with the Markov chain of interest) by an independence Metropolis Hastings iteration. We show that the resulting stochastic flow has many desirable coalescence properties, regardless of the form of the original flow.

L. A. Breyer and G. O. Roberts. Catalytic perfect simulation. Methodology and Computing in Applied Probability 3(2):161--177, 2001. Abstract: We introduce a methodology for perfect simulation using so called catalysts to modify random fields. Our methodology builds on a number of ideas previously introduced by Breyer and Roberts (1999), by Murdoch (1999), and by Wilson (2000). We illustrate our techniques by simulating two examples of Bayesian posterior distributions.
David J. C. MacKay. Information Theory, Inference, and Learning Algorithms. Cambridge University Press (2003).

Textbook (viewable online), 640 pages. Chapters include
29 Monte Carlo Methods
30 Efficient Monte Carlo Methods
31 Ising Models
32 Exact Monte Carlo Sampling
Chapter 32 is a 9-page introduction to exact sampling.

Graeme K. Ambler. Dominated Coupling From The Past and Some Extensions of the Area-Interaction Process. PhD thesis, University of Bristol, Department of Mathematics, 2002.

Abstract: Dominated coupling from the past, an example of perfect simulation, is a method for sampling from the equilibrium distribution of a Markov chain. A particular model which may be sampled in this way is the area-interaction point process of Baddeley and van Lieshout (1995). This thesis reviews perfect simulation and some aspects of point process theory. It also reviews some methods for simulating spatial point processes. It then introduces an extension of the area-interaction process which incorporates both attractive and repulsive components at different scales. The properties of this model are investigated using some standard test functions, and the model is fitted to a standard data set. Some discrete analogues of these models are then discussed along with methods of sampling from these models using dominated coupling from the past. While discussing these sampling methods, we highlight some of the pitfalls that exist when one attempts to use dominated coupling from the past. We set out and investigate the use of the discrete analogue of the area-interaction process as a prior distribution on wavelet coefficients in Bayesian non-parametric regression. Finally, we discuss some of the issues that arose while implementing the algorithms, both for simulating from the attractive-repulsive model and for the posterior distribution of the Bayesian wavelet thresholding model.

Jordi Art�s. Simulaci�n exacta en procesos puntuales espaciales: m�todo CFTP / Exact simulation for spatial point processes: CFTP method. Master's thesis, Universitat Jaume I, Departamento de Matem�ticas, 2004.
Faheem Mitha. Perfect Sampling on Continuous State Spaces. PhD thesis, University of North Carolina at Chapel, Department of Statistics, 2003.

Abstract: We consider algorithms that sample exactly from probability distributions on continuous state spaces. For the most part we focus our attention on multivariate distributions. Examples and computer implementations are included for the algorithms considered. Algorithms discussed include Murdoch and Green's multigamma and rejection couplers and Fill and Huber's randomness recycler. We prove convergence in finite time for the multigamma and rejection couplers as described by Murdoch and Green, and adapt these to a multivariate context. Applications of the Fill/Huber randomness recycler to multivariate distributions on continuous state spaces are also considered. Other algorithms such as Wilson's multishift coupler are also discussed.

Can Mert. Bayesian inference in Schizophrenia research. Master's thesis, Royal Institute of Technology, Computer Science, 2002.

Abstract: In this Master's thesis, Bayesian data mining techniques are used to analyze a database containing information from schizophrenia affected and healthy persons. The aim is to find dependencies between the diagnosis and various attributes. The database is assembled in the HUBIN project at the Karolinska Institutet. Two different techniques are employed, classification and graphical modeling. We also propose an MCMC extension to the graphical model problem. The results in this thesis indicate that the data mining approach is very promising in this kind of research. Several known facts about schizophrenia were reproducable in this study.

Tina Dam, Jan Brønnum Sørensen, and Michael Hedegaard Thomsen. Master's thesis, Aalborg University, Department of Mathematical Sciences, 1999.

Abstract: This thesis presents the theory of spatial point processes. It introduces the general setting and characterization results for point processes, followed by definitions of the binomial and the Poisson point processes. The Poisson point process is important for the theory of point processes as it models complete spatial randomness. Various summary statistics are presented, and estimation and examples of the use of such are given. Emphasis is placed on Ripley's K-function and statistics related to it. Different classes of models are defined; Cox, cluster, hard-core, Markov, nearest neighbour Markov, and Gibbs processes and simulation of each of these models is explained. Markov chain Monte Carlo simulation is covered and here the Metropolis-Hastings algorithm is an important tool. We then present spatial birth{and{death processes that provide an alternative to the Metropolis-Hastings algorithm. Perfect simulation is treated in order to make it possible to simulate from an exact equilibrium distribution. Parametric inference is the final topic covered. Here pseudo-likelihood estimation, contrast-methods, Markov chain Monte Carlo maximum likelihood estimation, and stochastic gradient estimation is treated along with parametric bootstrapping, likelihood ratio and Fisher-Wald tests. Finally, six different point patterns are analysed mainly to exemplify the use of the presented theory.

Ulrike Schneider. Advances and Applications in Perfect Sampling. PhD thesis, University of Colorado, Department of Applied Mathematics, 2003.

Abstract: Perfect sampling algorithms are Markov Chain Monte Carlo (MCMC) methods without statistical error. The latter are used when one needs to get samples from certain (non-standard) distributions. This can be accomplished by creating a Markov chain that has the desired distribution as its stationary distribution, and by running sample paths "for a long time", i.e. until the chain is believed to be in equilibrium. The question "how long is long enough?" is generally hard to answer and the assessment of convergence is a major concern when applying MCMC schemes. This issue completely vanishes with the use of perfect sampling algorithms which -- if applicable -- enable exact simulation from the stationary distribution of a Markov chain. In this thesis, we give an introduction to the general idea of MCMC methods and perfect sampling. We develop advances in this area and highlight applications of these advances to two relevant problems. As advances, we discuss and devise several variants of the well-known Metropolis-Hastings algorithm which address accuracy, applicability, and computational cost of this method. We also describe and extend the idea of slice coupling, a technique which enables one to couple continuous sample paths of Markov chains for use in perfect samplingalgorithms. As a first application, we consider Bayesian variable selection. The problem of variable selection arises when one wants to model the relationship between a variable of interest and a subset of explanatory variables. In the Bayesian approach one needs to sample from the posterior distribution of the model and this simulation is usually carried out using regular MCMC methods. We significantly expand the use of perfect sampling algorithms within this problem using ideas developed in this thesis. As a second application, we depict the use of these methods for the interacting fermion problem. We employ perfect sampling for computing self energy through Feynman diagrams using Monte Carlo integration techniques. We conclude by stating two open questions for future research.

Mariana Rodrigues Motta. Perfect Simulation of the Bi-dimentional Silo Model. Master's thesis, 2001.
Murali Haran. Efficient Perfect and MCMC Sampling Methods for Bayesian Spatial and Components of Variance Models. PhD thesis, University of Minnesota, School of Statistics, 2003.

Author's description: I worked on Bayesian computation for spatial models and some hierarchical models. In particular, I investigated sampling techniques for hierarchical models that use Gaussian Markov Fields (GMRFs). A major portion of my thesis contains work on perfect sampling approaches for Bayesian disease mapping models.

Graeme K. Ambler and Bernard W. Silverman. Perfect simulation for Bayesian wavelet thresholding with correlated coefficients, 2004. (PDF)

Abstract: We introduce a new method of Bayesian wavelet shrinkage for reconstructing a signal when we observe a noisy version. Rather than making the usual assumption that the wavelet coefficients of the signal are independent, we allow for the possibility that they are locally correlated in both location (time) and scale (frequency). This leads us to a prior structure which is, unfortunately, analytically intractable. Nevertheless, it is possible to draw independent samples from a close approximation to the posterior distribution by an approach based on Coupling From The Past, making it possible to use a simulation-based approach to fit the model.

Graeme K. Ambler and Bernard W. Silverman. Perfect simulation of spatial point processes using dominated coupling from the past with application to a multiscale area-interaction point process, 2004. (PDF)

Abstract: We consider perfect simulation algorithms for locally stable point processes based on dominated coupling from the past. A new version of the algorithm is developed which is feasible for processes which are neither purely attractive nor purely repulsive. Such processes include multiscale area-interaction processes, which are capable of modelling point patterns whose clustering structure varies across scales. We prove correctness of the algorithm and existence of these processes. An application to the redwood seedlings data is discussed.

M. N. M. van Lieshout, and R. S. Stoica. Perfect Simulation for Marked Point Processes. PNA-R0306, ISSN 1386-3711, 2003. (PDF)

Abstract: This paper extends some recently proposed exact simulation algorithms for point processes to marked patterns and reports on a simulation study into the relative efficiency for a range of Markov marked point processes.

Elke Thönnes. Generic CFTP in Stochastic Geometry and beyond, Revista de Estatistica, Special Issue, Proceedings of the 23rd European Meeting of Statistician in Funchal, 81 - 84, 2001.
Mark Huber. Perfect sampling using bounding chains. Annals of Applied Probability 14(2):734--753, 2004.

Abstract: Bounding chains are a technique that offers three benefits to Markov chain practitioners: a theoretical bound on the mixing time of the chain under restricted conditions, experimental bounds on the mixing time of the chain that are provably accurate and construction of perfect sampling algorithms when used in conjunction with protocols such as coupling from the past. Perfect sampling algorithms generate variates exactly from the target distribution without the need to know the mixing time of a Markov chain at all. We present here the basic theory and use of bounding chains for several chains from the literature, analyzing the running time when possible. We present bounding chains for the transposition chain on permutations, the hard core gas model, proper colorings of a graph, the antiferromagnetic Potts model and sink free orientations of a graph.

Wilfrid S. Kendall. Notes on Perfect Simulation, 2004.

Abstract: Perfect simulation refers to the art of converting suitable Markov Chain Monte Carlo algorithms into algorithms which return exact draws from the target distribution, instead of long-time approximations. The theoretical concepts underlying perfect simulation have a long history, but they were first drawn together to form a practical simulation technique in the ground-breaking paper of Propp and Wilson , which showed (for example) how to obtain exact draws from (for example) the critical Ising model. These lecture notes are organized around four main themes of perfect simulation: the original or classic Coupling From The Past algorithm (CFTP); variations which exploit regeneration ideas such as small-set or split-chain constructions from Markov chain theory (small-set CFTP); generalizations of CFTP which deal with non-monotonic and non-uniformly ergodic examples (dominated CFTP); and finally some theoretical complements.

Mark Huber. A bounding chain for Swendsen-Wang. Random Structures and Algorithms 22(1):43--59, 2002.

Abstract: The greatest drawback of Monte Carlo Markov chain methods is lack of knowledge of the mixing time of the chain. The use of bounding chains solves this difficulty for some chains by giving theoretical and experimental upper bounds on the mixing time. Moreover, when used with methodologies such as coupling from the past, bounding chains allow the user to take samples drawn exactly from the stationary distribution without knowledge of the mixing time. Here we present a bounding chain for the Swendsen-Wang process. The Swendsen-Wang bounding chain allow us to efficiently obtain exact samples from the ferromagnetic Q-state Potts model for certain classes of graphs. Also, by analyzing this bounding chain, we will show that Swendsen-Wang is rapidly mixing over a slightly larger range of parameters than was known previously.

Nancy L. Garcia and Nevena Maric. Perfect simulation for a continuous one-dimensional loss network, 2002. (Postscript)

Abstract: Sufficient conditions for ergodicity of a one-dimensional loss networks on R with length distribution G and cable capacity C are found. These processes are spatial birth-and-death processes with an invariant measure which is absolutely continuous with respect to a Poisson process and we implement the perfect simulation scheme based on the clan of ancestors introduced by Fernández, Ferrari and Garcia (2002) to obtain perfect samples viewed in a finite window of the infinite-volume invariant measure. Moreover, by a better understanding of the simulation process it is possible to get a better condition for ergodicity.

Yuzhi Cai and Wilfrid S. Kendall. Perfect simulation for correlated Poisson random variables conditioned to be positive. Statistics and Computing 12(3):229--243, 2002.

Abstract: In this paper we present a perfect simulation method for obtaining perfect samples from collections of correlated Poisson random variables conditioned to be positive. We show how to use this method to produce a perfect sample from a Boolean model conditioned to cover a set of points: in W.S. Kendall and E. Th�nnes (Pattern Recognition 32(9):1569--1586, 1999), this special case was treated in a more complicated way. The method is applied to several simple examples where exact calculations can be made, so as to check correctness of the program using χ²-tests, and some small-scale experiments are carried out to explore the behaviour of the conditioned Boolean model.

Ian W. McKeague and Marc Loizeaux. Perfect Sampling for Point Process Cluster Modelling. Chapter 5 of Spatial Cluster Modelling (A. Lawson and D. Denison, eds.), Chapman and Hall (2002), pages 87-107. (Postscript)
Marc A. Loizeaux and Ian W. McKeague. Perfect Sampling for Posterior Landmark Distributions with an Application to the Detection of Disease Clusters. IMS Lecture Notes - Monograph Series, Vol. 37, 321-331 (2001). (Postscript)

Abstract: We study perfect sampling for the posterior distribution in a class of spatial point process models introduced by Baddeley and van Lieshout (1993). For Neyman--Scott cluster models, perfect sampling from the posterior is shown to be computationally feasible via a coupling-from-the-past type algorithm of Kendall and Møller. An application to data on leukemia incidence in upstate New York is presented.

M. N. M. van Lieshout and Erik van Zwet. Exact sampling from conditional Boolean models with applications to maximum likelihood inference. Advances in Applied Probabability 33(2):339--353, 2001.

Abstract: We are interested in estimating the intensity parameter of a Boolean model of discs (the bombing model) from a single realization. To do so, we derive the conditional distribution of the points (germs) of the underlying Poisson process. We demonstrate how to apply coupling from the past to generate samples from this distribution, and use the samples thus obtained to approximate the maximum likelihood estimator of the intensity. We discuss and compare two methods: one based on a Monte Carlo approximation of the likelihood function, the other a stochastic version of the EM algorithm.

George Casella, Christian P. Robert, and Martin T. Wells. Rao-Blackwellization of Generalized Accept-Reject Schemes, 2000. (Postscript)

Abstract: This paper extends the accept-reject algorithm to allow the proposal distribution to change at each iteration. We first establish a necessary and sufficient condition for this generalized accept-reject algorithm to be valid, and then show how the Rao-Blackwellization of Casella and Robert (1996) can be extended to this setting. An important application of these results is to the perfect sampling technique of Fill (1998), which is a generalized accept-reject algorithm in disguise.

Jesper Møller and Jakob G. Rasmussen. Perfect Simulation of Hawkes Processes. Research Report R-2004-18, Department of Mathematical Sciences, Aalborg University, 2004.

Abstract: This article concerns a perfect simulation algorithm for unmarked and marked Hawkes processes. The usual straightforward simulation algorithm suffers from edge effects, whereas our perfect simulation algorithm does not. By viewing Hawkes processes as Poisson cluster processes and using their branching and conditional independence structure, useful approximations of the distribution function for the length of a cluster are derived. This is used to construct upper and lower processes for the perfect simulation algorithm. Examples of applications and empirical results are presented.

Wilfrid S. Kendall. Geometric Ergodicity and Perfect Simulation, 2004. (PDF)

Abstract: This note extends the work of Foss and Tweedie (1998), who showed that availability of the classic Coupling from The Past algorithm of Propp and Wilson (1996) is essentially equivalent to uniform ergodicity for a Markov chain (see also Hobert and Robert 2004). In this note we show that all geometrically ergodic chains possess dominated Coupling from The Past algorithms (not necessarily practical!) which are rather closely connected to Foster-Lyapunov criteria.

Chris Holmes and David G. T. Denison. Pefect sampling for wavelet reconstruction of signals. IEEE Transactions on Signal Processing 50(2):337--344, 2002. (PDF)

Abstract: The coupling from the past (CFTP) procedure is a protocol for finite-state Markov chain Monte Carlo (MCMC) methods whereby the algorithm itself can determine the necessary runtime to convergence. In this paper, we demonstrate how this protocol can be applied to the problem of signal reconstruction using Bayesian wavelet analysis where the dimensionality of the wavelet basis set is unknown, and the observations are distorted by Gaussian white noise of unknown variance. MCMC simulation is used to account for model uncertainty by drawing samples of wavelet bases for approximating integrals (or summations) on the model space that are either too complex or too computationally demanding to perform analytically. We extend the CFTP protocol by making use of the central limit theorem to show how the algorithm can also monitor its own approximation error induced by MCMC. In this way, we can assess the number of MCMC samples needed to approximate the integral to within a user specified tolerance level. Hence, the method automatically ensures convergence and determines the necessary number of iterations needed to meet the error criteria.

Radu V. Craiu and Xiao-Li Meng. Multi-process parallel antithetic coupling for backward and forward Markov chain Monte Carlo. Preprint, 2003

Abstract: Antithetic coupling is a general stratification strategy for reducing Monte Carlo variance without increasing the simulation size. The use of the antithetic principle in the Monte Carlo literature typically employs two strata via antithetic quantile coupling. We demonstrate here that further stratification, obtained by using k > 2 (e.g., k = 3--10) antithetically coupled variates, can offer substantial additional gain in Monte Carlo efficiency, in terms of both variance and bias. The reason for reduced bias is that antithetically coupled chains can provide a more dispersed search of the state space than multiple independent chains. The emerging area of perfect simulation provides a perfect setting for implementing the k-process parallel antithetic coupling for MCMC because, without antithetic coupling, this class of methods delivers genuine independent draws. Furthermore, antithetic backward coupling provides a very convenient theoretical tool for investigating antithetic forward coupling. However, the generation of k > 2 antithetic variates that are negatively associated, i.e., they preserve negative correlation under monotone transformations, and extremely antithetic, i.e., they are as negatively correlated as possible, is more complicated compared to the case with k = 2. In this paper, we establish a theoretical framework for investigating such issues. Among the generating methods that we compare, Latin hypercube sampling and its iterative extension appear to be general-purpose choices, making another direct link between Monte Carlo and Quasi Monte Carlo.

J. N. Corcoran and U. Schneider. Pseudo-perfect and adaptive variants of the Metropolis-Hasting algorithm with an independent candidate density, 2004. To appear in Journal of Statistical Computation and Simulation. (PDF)

Abstract: We describe and examine an imperfect variant of a perfect sampling algorithm based on the Metropolis-Hastings algorithm that appears to perform better than a more traditional approach in terms of speed and accuracy. We then describe and examine an ``adaptive'' Metropolis-Hasting algorithm which generates and updates a self-target candidate density in such a way that there is no ``wrong choice'' for an initial candidate density. Simulation examples are provided.

M. S. Carvalho and J. N. Corcoran. Solving stochastic difference equations originated from continuous-time threshold AR(1) models through perfect simulation, 2003. Preprint.

Abstract: In this paper we look at a discretization of a continuous-time threshold AR(1) process (CTAR(1)) and show how to apply perfect simulation to find the stationary distribution of the resulting stochastic difference equation. Such method leads to perfect solutions of stochastic difference equations of this kind. Depending on how sensitive the original CTAR(1) process is to the discretization, the method may give an approximate solution to the original continuous-time process. Also, we introduce the phenomenon of decoupling in a perfect simulation algorithm.

Jesper Møller, A. N. Pettitt, K. K. Berthelsen, and R. W. Reeves. An efficient Markov chain Monte Carlo method for distributions with intractable normalising constants. (PDF)

Abstract: We present new methodology for drawing samples from a posterior distribution when (i) the likelihood function or (ii) a part of the prior distribution is only specified up to a normalising constant. In the case (i), the novelty lies in the introduction of an auxiliary variable in a Metropolis-Hastings algorithm and the choice of proposal distribution so that the algorithm does not depend upon the unknown normalising constant. In the case (ii), similar ideas apply and the situation is even simpler as no auxiliary variable is required. Our method is ``on-line'' as compared with alternative approaches to the problem which require ``off-line'' computations. Since it is needed to simulate from the ``unknown distribution'', e.g. the likelihood function in case (i), perfect simulation such as the Propp-Wilson algorithm becomes useful. We illustrate the method in case (i) by producing posterior samples when the likelihood is given by an Ising model and by a Strauss point process.

Shane G. Henderson and Richard L. Tweedie. Perfect simulation for Markov chains arising from discrete-event simulation. Proceedings of the 38th Annual Allerton Conference on Communication, Control and Computing, pp. 1125--1134, 2000. (Postscipt)

Abstract: Virtually any discrete-event simulation can be rigorously defined as a Markov chain evolving on a general state space, and under appropriate conditions, the chain has a unique stationary probability distribution. Many steady-state performance measures can be expressed in terms of the stationary probability distribution of the chain. We would like to apply ``coupling from the past'' algorithms to obtain samples from the stationary probability distribution of such chains. Unfortunately, the structural properties of the chains arising from discrete-event simulations preclude the immediate application of current coupling from the past algorithms. We describe why this is the case, and extend a class of coupling from the past algorithms so that they may be applied in this setting.

Michael Korn and Igor Pak. Tilings of rectangles with T-tetrominoes, 2003. Theoretical Computer Science 319(1-3):3-27, 2004. [PDF]

Abstract: We prove that any two tilings of a rectangular region by T-tetrominoes are connected by moves involving only 2 and 4 tiles. We also show that the number of such tilings is an evaluation of the Tutte polynomial. The results are extended to a more general class of regions.

From the introduction: Along the way we show that our new height functions enable us to define a lattice structure on all T- tetromino tilings of a rectangle. While this may seem a theoretical curiosity, in fact there are important applications of this result. There is a natural definition of a Markov chain on tilings (perform ``random local moves'') which translates into a nice Markov chain on height functions. Using the ``coupling from the past'' technique of Propp and Wilson and the fact that the height functions form a lattice (see [15, 14]) one can sample random T-tetromino tilings from an exactly uniform distribution (as opposed to a ``nearly uniform'' distribution usually obtained by the Markov chain approach). We refer to  for a full discussion of this approach and other examples of lattice structures on tilings.

James P. Hobert and Christian P. Robert. A mixture representation of π with applications in Markov chain Monte Carlo and perfect sampling. The Annals of Applied Probability 14(3):1295--1305, 2004. (PDF)

Abstract: Let X ={Xn : n =0 1 2 ...} be an irreducible, positive recurrent Markov chain with invariant probability measure π. We show that if X satisfies a one-step minorization condition, then π can be represented as an infinite mixture. The distributions in the mixture are associated with the hitting times on an accessible atom introduced via the splitting construction of Athreya and Ney [Trans. Amer. Math. Soc. 245 (1978) 493--501] and Nummelin [Z. Wahrsch. Verw. Gebiete 43 (1978) 309--318]. When the small set in the minorization condition is the entire state space, our mixture representation of π reduces to a simple formula, first derived by Breyer and Roberts [Methodol. Comput. Appl. Probab. 3 (2001) 161--177] from which samples can be easily drawn. Despite the fact that the derivation of this formula involves no coupling or backward simulation arguments, the formula can be used to reconstruct perfect sampling algorithms based on coupling from the past (CFTP) such as Murdoch and Green's [Scand. J. Statist. 25 (1998) 483--502] Multigamma Coupler and Wilson's [Random Structures Algorithms 16 (2000) 85--113] Read-Once CFTP algorithm. In the general case where the state space is not necessarily 1-small, under the assumption that X satisfies a geometric drift condition, our mixture representation can be used to construct an arbitrarily accurate approximation to π from which it is straightforward to sample. One potential application of this approximation is as a starting distribution for a Markov chain Monte Carlo algorithm based on X .

Stephen P. Brooksy, Yanan Fan, and Jeffrey S. Rosenthal. Perfect Forward Simulation via Simulated Tempering. Preprint.

Abstract: Several authors discuss how the simulated tempering scheme provides a very simple mechanism for introducing regenerations within a Markov chain. In this paper we explain how regenerative simulated tempering schemes provide a very natural mechanism for perfect simulation. We use this to provide a perfect simulation algorithm, which uses a single-sweep forward-simulation without the need for recursively searching through negative times. We demonstrate this algorithm in the context of several examples.

F. Amaya and J. M. Benedí. 2000. Using Perfect Sampling in Parameter Estimation of a Wole Sentence Maximum Entropy Language Model. Proc. Fourth Computational Natural Language Learning Workshop, CoNLL-2000. Pages 79-82.

Abstract: The Maximum Entropy principle (ME) is an appropriate framework for combining information of a diverse nature from several sources into the same language model. In order to incorporate long-distance information into the ME framework in a language model, a Whole Sentence Maximum Entropy Language Model (WSME) could be used. Until now MonteCarlo Markov Chains (MCMC) sampling techniques has been used to estimate the paramenters of the WSME model. In this paper, we propose the application of another sampling technique: the Perfect Sampling (PS). The experiment has shown a reduction of 30% in the perplexity of the WSME model over the trigram model and a reduction of 2% over the WSME model trained with MCMC.

Shuji Kijima and Tomomi Matsui. Polynomial time perfect sampling algorithm for two-rowed contingency tables, 2003. Tech report METR 2003-15.

Abstract: This paper proposes a polynomial time perfect (exact) sampling algorithm for 2×n contingency tables. Our algorithm is a Las Vegas type randomized algorithm and the expected running time is bounded by O(n3 ln N) where n is the number of columns and N is the total sum of whole entries in a table. The algorithm is based on monotone coupling from the past (monotone CFTP) algorithm and new Markov chain for sampling two-rowed contingency tables uniformly. We employed the path coupling method and showed that the mixing rate of our chain is bounded by n(n-1)²(1+ln(n N)). Our result indicates that uniform generation of two-rowed contingency tables is easier than the corresponding counting problem, since the counting problem is known to be #P-complete.

Tomomi Matsui and Shuji Kijima. Polynomial Time Perfect Sampler for Discretized Dirichlet Distribution, 2003. Tech report METR 2003-17.

Abstract: In this paper, we propose a perfect (exact) sampling algorithm according to a discretized Dirichlet distribution. Our algorithm is a monotone coupling from the past algorithm, which is a Las Vegas type randomized algorithm. We propose a new Markov chain whose limit distribution is a discretized Dirichlet distribution. Our algorithm simulates transitions of the chain O(n3 ln Δ) times where n is the dimension (the number of parameters) and 1/Δ is the grid size for discretization. Thus the obtained bound does not depend on the magnitudes of parameters. In each transition, we need to sample a random variable according to a discretized beta distribution (2-dimensional Dirichlet distribution). To show the polynomiality, we employ the path coupling method carefully and show that our chain is rapidly mixing.

Motoya Machida. Fill's algorithm for absolutely continuous stochastically monotone kernels. Electronic Communications in Probability, Vol. 7, paper 15, 2002.

Abstract: Fill, Machida, Murdoch, and Rosenthal (2000) presented their algorithm and its variants to extend the perfect sampling algorithm of Fill (1998) to chains on continuous state spaces. We consider their algorithm for absolutely continuous stochastically monotone kernels, and show the correctness of the algorithm under a set of certain regularity conditions. These conditions succeed in relaxing the previously known hypotheses sufficient for their algorithm to apply.

Nancy L. Garcia and Mariana R. Motta. Perfect simulation for a stationary silo with absorbing walls. Methodology and Computing in Applied Probability 5(1):109--121, 2003. (Postscript)

Abstract: The objective of this work is to generate random samples of the unique stationary distribution associated to the stochastic model for grain storage in a finite bidimensional silo. The support of this measure is an unbounded and continuous state space and therefore a truncation was necessary to apply the CFTP perfect simulation scheme. The performance of the algorithm was measure by comparing the sample moments to the theoretical ones.

Wolfgang Hörmann and Josef Leydold. Improved perfect slice sampling, 2003. Preprint.

Abstract: Perfect slice sampling is a method to turn Markov Chain Monte Carlo (MCMC) samplers into exact generators for independent random variates. The originally proposed method is rather slow and thus several improvements have been suggested. However, two of them are erroneous. In this article we give a short introduction to perfect slice sampling, point out incorrect methods, and give a new improved version of the original algorithm.

Damian Clancy and Philip D. O'Neill Perfect simulation for stochastic models of epidemics among a community of households, 2002. Preprint.

Abstract: Recent progress in the analysis of infectious disease data using stochastic epidemic models has been largely due to Markov chain Monte Carlo methodology. In the present paper it is illustrated that certain methods of perfect simulation can also be applied in the context of epidemic models. Specifically, a perfect simulation algorithm due to Corcoran and Tweedie (2002) is considered for epidemic models that feature a population divided into households. The models allow individuals to be potentially infected both from outside the household, and from other household members. The data are assumed to consist of the final numbers ultimately infected within each household in the community. The methods are applied to data taken from outbreaks of in and the results compared with those obtained by other approaches.

R. Fernández, P. A. Ferrari, and A. Galves. Coupling, renewal and perfect simulation of chains of infinite order. Notes for a course in the V Brazilian School of Probability, Ubatuba 2001.

This is a 95 pages booklet developing an explicit construction of chains with infinite memory under the Harris regime.

S. Foss and T. Konstantopoulos. Extended renovation theory and limit theorems for stochastic ordered graphs. Markov Processes Relat. Fields 9:413--468, 2003. (PDF)

Abstract: We extend Borovkov's renovation theory to obtain criteria for coup-ling- convergence of stochastic processes that do not necessarily obey stochastic recursions. The results are applied to an ``infinite bin model'', a particular system that is an abstraction of a stochastic ordered graph, i.e., a graph on the integers that has (i, j), i < j, as an edge, with probability p, independently from edge to edge. A question of interest is an estimate of the length Ln of a longest path between two vertices at distance n. We give sharp bounds on C = limn→∞ (Ln/n). This is done by first constructing the unique stationary version of the infinite bin model, using extended renovation theory. We also prove a functional law of large numbers and a functional central limit theorem for the infinite bin model. Finally, we discuss perfect simulation, in connection to extended renovation theory, and as a means for simulating the particular stochastic models considered in this paper.

Francesco Bartolucci and Julian Besag. A recursive algorithm for Markov random fields. Biometrika 89(3):724--730, 2002. (PDF)

Abstract: We propose a recursive algorithm as a more useful alternative to the Brook expansion for the joint distribution of a vector of random variables when the original formulation is in terms of the corresponding full conditional distributions,as occurs for Markov random fields. Usually,in practical applications, the computational load will still be excessive but then the algorithm can be used to obtain the componentwise full conditionals of a system after marginalizing over some variables or the joint distribution of subsets of the variables, conditioned on values of the remainder, which is required for block Gibbs sampling. As an illustrative example, we apply the algorithm in the simplest nontrivial setting of hidden Markov chains. More important, we demonstrate how it applies to Markov random fields on regular lattices and to perfect block Gibbs sampling for binary systems.

Olle Häggström, Johan Jonasson, and Russell Lyons. Coupling and Bernoullicity in random-cluster and Potts models. Bernoulli 8(3):275--294, 2002. (Postscript)

Abstract: An explicit coupling construction of random-cluster measures is presented. As one of the applications of the construction, the Potts model on amenable Cayley graphs is shown to exhibit at every temperature the mixing property known as Bernoullicity.

M. N. M. van Lieshout, and A. J. Baddeley. Extrapolating and interpolating spatial patterns. Chapter 4 of Spatial Cluster Modelling (A. Lawson and D. Denison, eds.), Chapman and Hall/CRC, 2002. Pages 61--86.

Abstract: We discuss issues arising when a spatial pattern is observed within some bounded region of space, and one wishes to predict the process outside of this region (extrapolation) as well as to perform inference on features of the pattern that cannot be observed (interpolation). We focus on spatial cluster analysis. Here the interpolation arises from the fact that the centres of clustering are not observed. We take a Bayesian approach with a repulsive Markov prior, derive the posterior distribution of the complete data, i.e. cluster centres with associated o#spring marks, and propose an adaptive coupling from the past algorithm to sample from this posterior. The approach is illustrated by means of the redwood data set (Ripley, 1977).

Jorge Mateu, Jordi Artés, and José A. López. Computational issues for perfect simulation in spatial point patterns. Communications in Nonlinear Science and Numerical Simulation 9(2):229--240, 2004.
Petar M. Djurić, Yufei Huang, and Tadesse Ghirmai. Perfect sampling: a review and applications to signal processing. IEEE Transactions on Signal Processing 50(2):345--356, 2002. (PDF)

Abstract: In recent years, Markov chain Monte Carlo (MCMC) sampling methods have gained much popularity among researchers in signal processing. The Gibbs and the Metropolis-Hastings algorithms, which are the two most popular MCMC methods, have already been employed in resolving a wide variety of signal processing problems. A drawback of these algorithms is that in general, they cannot guarantee that the samples are drawn exactly from a target distribution. More recently, new Markov chain-based methods have been proposed, and they produce samples that are guaranteed to come from the desired distribution. They are referred to as perfect samplers. In this paper, we review some of them, with the emphasis being given to the algorithm coupling from the past (CFTP). We also provide two signal processing examples where we apply perfect sampling. In the first, we use perfect sampling for restoration of binary images and, in the second, for multiuser detection of CDMA signals.

Tournois Florent. Perfect simulation of a stochastic model for CDMA coverage. INRIA research report #4348, 2002.

Abstract: The aim of this article is to present ways of getting realizations of the stochastic model for CDMA coverage introduced in . This is a basic stochastic geometry model that represents the location of antennas as realizations of Poisson point processes in the plane and allows one to capture regions (CDMA cells) of the plane with signal-to-noise ratio SNR to/from a given antenna at a sufficient level. We describe an algorithm of perfect simulation of arbitrary close approximation of the the CDMA cell mosaic given some coverage constraints. This is an adaptation of the so called backward simulation of conditional coverage process developed in . The approximation comes from the estimation of the SNR, that is modeled by Poisson shot-noise process, based on an influence window whose size is theoretically adjusted to meet the true value of SNR with an arbitrary precision. As a motivating example we estimate the distribution functions of the distance from a fixed location to regions under various given handoff conditions (so called contact distribution functions).

Philip D. O'Neill. Perfect simulation for Reed-Frost epidemic models. Statistics and Computing 13(1):37--44, 2003.

Abstract: The Reed-Frost epidemic model is a simple stochastic process with parameter q that describes the spread of an infectious disease among a closed population. Given data on the final outcome of an epidemic, it is possible to perform Bayesian inference for q using a simple Gibbs sampler algorithm. In this paper it is illustrated that by choosing latent variables appropriately, certain monotonicity properties hold which facilitate the use of a perfect simulation algorithm. The methods are applied to real data.

J. N. Corcoran and U. Schneider. Shift and scale coupling methods for perfect simulation. Probability in the Engineering and Informational Sciences 17(3):277--303, 2003. (PDF)

Abstract: We describe and develop a variation on a layered multishift coupler due to Wilson that uses a slice sampling procedure to allow one to obtain potentially common draws from two different distributions. Our main application is coupling sample paths of Markov chains for use in perfect sampling algorithms. The coupler is based on slicing density functions and we describe a ``folding'' mechanism as an attractive alternative to the accept/reject step commonly used in slice sampling algorithms. Applications of the coupler are given to storage models and to auto-gamma distribution sampling.

U. Schneider and J. N. Corcoran. Perfect sampling for Bayesian variable selection in a linear regression model. Journal of Statistical Planning and Inference 126(1):153--171, 2004.

Abstract: We describe the use of perfect sampling algorithms for Bayesian variable selection in a linear regression model. Starting with a basic case solved by Huang and Djuric (2002), where the model coefficients and noise variance are assumed to be known, we generalize the model step by step to allow for other sources of randomness, specifying perfect simulation algorithms that solve these cases by incorporating various techniques including Gibbs sampling, the perfect independent Metropolis-Hastings (IMH) algorithm, and recently developed ``slice coupling'' algorithms. Applications to simulated data sets suggest that our algorithms perform well in identifying relevant predictor variables.

J. N. Corcoran, U. Schneider, and H.-B. Sch�ttler. Perfect stochastic summation of high order Feynman graph expansions, 2003. Preprint.

Abstract: We describe a new application of an existing perfect sampling technique of Corcoran and Tweedie to estimate the self energy of an interacting Fermion model via Monte Carlo summation. Simulations suggest that the algorithm in this context converges extremely rapidly and results compare favorably to true values obtained by brute force computations for low dimensional toy problems. A variant of the perfect sampling scheme which improves the accuracy of the Monte Carlo sum for small samples is also given.

Iain Murray, Zoubin Ghahramani, and David J. C. MacKay. MCMC for doubly-intractable distributions. Proceedings of the 22nd Annual Conference on Uncertainty in Artificial Intelligence (UAI), 2006. (PDF)

Abstract: Markov Chain Monte Carlo (MCMC) algorithms are routinely used to draw samples from distributions with intractable normalization constants. However, standard MCMC algorithms do not apply to doubly-intractable distributions in which there are additional parameter-dependent normalization terms; for example, the posterior over parameters of an undirected graphical model. An ingenious auxiliary-variable scheme (Møller et al., 2004) offers a solution: exact sampling (Propp and Wilson, 1996) is used to sample from a Metropolis-Hastings proposal for which the acceptance probability is tractable. Unfortunately the acceptance probability of these expensive updates can be low. This paper provides a generalization of Møller et al. (2004) and a new MCMC algorithm, which obtains better acceptance probabilities for the same amount of exact sampling, and removes the need to estimate model parameters before sampling begins.

James A. Fill and Mark Huber. Perfect simulation of Vervaat perpetuities. arXiv:0908.1733

Abstract: We use coupling into and from the past to sample perfectly in a simple and provably fast fashion from the Vervaat family of perpetuities. The family includes the Dickman distribution, which arises both in number theory and in the analysis of the Quickselect algorithm, which was the motivation for our work.

### Theses Relevant to Perfect Sampling

A number of people have written graduate theses in which a significant component was on perfect sampling with Markov chains. In chronological order these are:

David Bruce Wilson. Exact Sampling with Markov Chains. Ph.D. thesis, MIT Mathematics Department, May 28, 1996.

Tine Møller Sørensen. A study of Propp & Wilson's method for exact Markov chain Monte Carlo sampling, 1997. Master's thesis in computational statistics, University of Bath.

Morten Fismen. Exact simulation using Markov chains. Technical Report 6/98, Institutt for Matematiske Fag, 1998. Diploma-thesis.

Karin Nelander. Contributions to Percolation Theory and Exact Sampling. Ph.D. thesis, Deperartment of Mathematics, Göteborg, March 29, 1998.

Jem Noelle Corcoran. Perfect Sampling in MCMC Algorithms. Ph.D. thesis, Department of Statistics at Colorado State University, May 1998.

Bas Straatman. Exact sampling and applications to a mite dispersal model. Master's thesis, Utrecht, August 10, 1998.

Elke Thönnes. Perfect and Imperfect Simulations in Stochastic Geometry. Ph.D. thesis, University of Warwick, October 1998.

Motoya Machida. Stochastic Monotonicity and Realizable Monotonicity. PhD thesis, The Johns Hopkins University, Department of Mathematical Sciences, 1999.

Mark Lawrence Huber. Perfect Sampling Using Bounding Chains. Ph.D. thesis, Cornell University, May 1999.

Michael Harvey. Monte Carlo Inference for Belief Networks Using Coupling From the Past. Master's thesis, department of computer science, University of Toronto, 1999.

Jens Lund. Statistical inference and perfect simulation for point processes observed with noise. PhD thesis, The Royal Veterinary and Agricultural University, Copenhagen, Department of Statistics, December 1999.

Alison Gibbs. Convergence of Markov Chain Monte Carlo Algorithms with Applications to Image Restoration. PhD thesis, University of Toronto, Department of Statistics, October 1999. (Postscript, PDF.)

Jesper Møller. Aspects of spatial statistics, stochastic geometry and Markov chain Monte Carlo methods. Thesis for doctoral degree in Natural Sciences, Aalborg University.

Tina Dam, Jan Brønnum Sørensen, and Michael Hedegaard Thomsen. Master's thesis, Aalborg University, Department of Mathematical Sciences, 1999.

Mariana Rodrigues Motta. Perfect Simulation of the Bi-dimentional Silo Model. Master's thesis, 2001.

Giovanni Montana. Small sets and domination in perfect simulation. Ph.D. thesis, University of Warwick, April 2002.

Graeme K. Ambler. Dominated Coupling From The Past and Some Extensions of the Area-Interaction Process. PhD thesis, University of Bristol, Department of Mathematics, 2002.

Can Mert. Bayesian inference in Schizophrenia research. Master's thesis, Royal Institute of Technology, Computer Science, 2002.

Ulrike Schneider. Advances and Applications in Perfect Sampling. PhD thesis, University of Colorado, Department of Applied Mathematics, 2003.

Murali Haran. Efficient Perfect and MCMC Sampling Methods for Bayesian Spatial and Components of Variance Models. PhD thesis, University of Minnesota, School of Statistics, 2003.

Faheem Mitha. Perfect Sampling on Continuous State Spaces. PhD thesis, University of North Carolina at Chapel, Department of Statistics, 2003.

Jordi Art�s. Simulaci�n exacta en procesos puntuales espaciales: m�todo CFTP / Exact simulation for spatial point processes: CFTP method. Master's thesis, Universitat Jaume I, Departamento de Matem�ticas, 2004.

Tuomas Rajala. Traditional MCMC and Exact simulation of the Ising model. Master's thesis, University of Jyväskylä, Department of Mathematics and Statistics, 2005.
Advisors: Stefan Geiss and Antti Penttinen.

### Research Talks

There are a number of research talks that do not appear to have associated written articles that are publicly available. These are listed below.

Chris Jennison. Approximate exact sampling: towards the general application of Propp and Wilson's algorithm. Seminar at University of Bath, 1998.

Abstract: Propp and Wilson's sampling-from-the-past rule allows the possibility of exact sampling by Markov chain Monte Carlo (MCMC) methods. We describe the Propp and Wilson method and consider details of its practical usage.
We then consider a scheme for applying the basic idea of Propp and Wilson in general problems. The method becomes approximate once more but with arguable advantages over other MCMC samplers.

Xiao-Li Meng. Improving Perfect Simulation. Seminar at the Department of Statistics, The University of Chicago, January 25, 1999.

Abstract:

The Monte Carlo community recently has had a pleasant surprise. Based on their ingenious idea of running coupled Markov chains with negative time, Propp and Wilson (1996, Random Structures and Algorithms ) proposed a Markov chain Monte Carlo (MCMC) algorithm that will, with probability one, terminate itself in a finite number of steps and yet deliver draws that are exactly from the limiting (i.e., stationary) distribution. By applying the algorithm to a variety of problems, including sampling exactly from the Ising model at the critical temperature with a 4200 by 4200 toroidal grid, they demonstrated the practicality and efficiency of the algorithm. Their findings are perhaps particularly exciting for statisticians/probabilists/applied mathematicians, since this appears to be the first time that a major Monte Carlo method is developed solely by non-physicists, yet the method is applicable to non-trivial computational problems in physics and other scientific studies. Built upon their method of coupling from the past or backward coupling, a new class of MCMC algorithms has emerged. This class of algorithms has been labeled perfect simulation or exact simulation, because for this class of stochastic iterative algorithms the challenging issue of accessing convergence as well as errors in approximation completely vanishes.

In this talk I will first give a brief tutorial of the Propp-Wilson algorithm and Murdoch and Green's (1998, Scandinavian Journal of Statistics) extension to continuous-state Markov chains. I will then propose two methods, multi-stage backward coupling and parallel antithetic coupling, for improving the computational or equivalently statistical efficiency of the current perfect simulation schemes. The first method borrows ideas from survey sampling where it is a common knowledge that multi-stage sampling is typically more cost effective than single-stage sampling. The second method explores the use of antithetic variates with backward coupling, and it relies critically on the theory of negative association, which studies the preservation of negative correlation under monotone transformations. Limited empirical and theoretical evidence suggests the possibility of significant improvement (e.g., 30%-50% reduction in time/variance) with either method over the Propp-Wilson algorithm in certain applications.

While the theory and implementation of the proposed methods is not completely trivial, the talk should be accessible to both the pessimists (nothing is perfect) and the optimists (things can always be better).

Motoya Machida. The Use of Perfect Sampling for Mixture Problems. Talk given at the Fourth North American New Researchers Conference, held at The Johns Hopkins University, August 4 - 7, 1999, also given in Singapore.

Abstract:
Since the seminal work of Propp and Wilson on perfect sampling, two algorithms for perfect sampling are known, one based on coupling from the past (CFTP) by Propp and Wilson, and the other via rejection sampling by Fill. The idea of perfect sampling is, in essence, that given an ergodic Markov chain we can generate a perfect stationary observation within a finite number of steps. The CFTP algorithm has been studied and used in various ways along with applications in Markov Chain Monte Carlo (MCMC). However, the reports on similar investigations for Fill's algorithm are limited compared with the CFTP algorithm.
Recently Hobert, Robert, and Titterington studied the implementation of the CFTP algorithm for two- and three-component mixtures with known components and reported the performances of the CFTP algorithm for these cases. They showed that the MCMC for the two-component mixture is monotone with a natural linear ordering over the weights. By extending such coupling to the case of three-component mixture, they observed that the coupled chains are contained in a lozenge-shaped region, which simplifies the algorithm to detect coalescence. But their coupling method for \$k\$-component mixtures with \$n\$ samples requires \${{n+k-1}{k-1}}\$ transitions at the initial step, no matter how much the size of transitions can be reduced later. Therefore, as they argued, it is difficult to extend their method further because of the combinatorial explosion.
The coalescence of all the coupled chains is the essential characteristic for the CFTP algorithm. On the other hand, Fill's algorithm for stochastically monotone case, in theory, does not require such grand coalescence. In fact, as shown in Fill's paper, the proof of his algorithm's validity relies only on the property of a bivariate coupling. In this talk, we exploit this property further and demonstrate how we can use it for mixture problems of more than three components.

Chris Holmes. Bayesian Model Averaging with Orthogonal Predictors. Seminar, April 29, 1999.

Abstract: An MCMC sampling procedure known as coupling from the past (CFTP) is described for variable selection in the Bayesian linear model with orthogonal predictors, which includes wavelets, Fourier series and Chebyshev polynomials. CFTP allows one to draw perfect i.i.d. samples from the posterior model space in realistic CPU time. Examples are used to demonstrate this approach on a number of curve fitting problems.

Olivier Roques. Coupler depuis le pass� (Coupling from the past). Seminar, May 19, 1999.

Abstract: Lorsque l'on souhaite engendrer des objets selon une distribution P, on peut construire une chaine de Markov ergodique dont la distribution stationnaire est P. Tout le probleme est de savoir combien de temps il faut faire tourner ce processus pour approcher � epsilon pres la distribution stationnaire, i.e borner le temps de m�lange de la chaine. La m�thode "Couplage depuis le pass�" [Kendall, Propp,Wilson, Jerrum,...] permet de s'abstenir de ces calculs souvent compliqu�s et d'obtenir un �chantillon de l'ensemble des �tats, distibu� exactement selon la distribution stationnaire.

Duncan Murdoch. Perfect Sampling: Not Just for Markov Chains? Talk 263-10 at the 1999 Joint Satistical Meeting, also given at McMaster.

Abstract: Propp and Wilson's (1996) coupling from the past (CFTP) algorithm generates a sample from the limiting distribution of a Markov chain. In this talk I argue that the underlying idea of CFTP is more widely applicable, and will demonstrate successful and unsuccessful attempts to apply it to problems ranging from approximation of integrals to simulation of stochastic differential equations.

Antonietta Mira. Ordering Slicing and Splitting Monte Carlo Markov Chains. Talk 263-28 at the 1999 Joint Satistical Meeting.

Abstract: The class of Markov chains having a specified stationary distribution is very large, so it is important to have criteria telling when one chain performs better than another. The Peskun ordering is a partial ordering on this class. We study the implications of the Peskun ordering both in finite and general state spaces. Unfortunately there are many Metropolis-Hastings samplers that are not comparable in the Peskun sense. We thus define a more useful ordering, the covariance ordering, that allows us to compare a wider class of sampling schemes. It is always possible to beat the Metropolis-Hastings algorithm in terms of the Peskun ordering. Two new algorithms will be introduced and studied for this purpose. The slice sampler outperforms the independence Metropolis-Hastings algorithm. The splitting rejection sampler outperforms the more general Metropolis-Hastings algorithm. Finally we propose a ``perfect'' version of the slice sampler. Perfect sampling allows exact simulation of random variables from the stationary measure of a Markov chain. By exploiting monotonicity properties of the slice sampler we show that a perfect version of the algorithm can be easily implemented.

Motoya Machida. Perfect Sampling via Cross-Monotonicity for Simple Mixture Problems. October 13, 1999.

Abstract: Since Propp and Wilson's seminal work published in 1996, the perfect sampling technique has extended its research into many ways, particularly in applications with Markov Chain Monte Carlo (MCMC). Given an ergodic Markov chain, the idea of perfect sampling is, in essence, to generate a sample from the exact stationary distribution in finites steps. Two methods for developing such a perfect samplingalgorithm are known, the one based on coupling from the past (CFTP) by Propp and Wilson (1996) and the other via rejection sampling by Fill (1998).

Fill's algorithm has been known for its advantage to be independent of the running time, but also for its intricacy in both theory and practice. The recent paper by Fill, Machida, Murdoch, and Rosenthal (1999) spelled out a general framework for his algorithm, and proposed various possible extension Hobert, Robert, and Tittering (1999) have studied CFTP algorithm for an MCMC evaluation as simple mixture problems. Their study was limited to two-and three-component mixtures, and suggested a difficulty in implementing CFTP algorithm when the mixture has more than three components. We demonstrate how a framework of Fill's algorithm can be applied for this particular mixture problem, and propose a perfect sampling algorithm using a cross-monotone chain. Our algorithm can be implemented with any number of components, but may not perform well as the number of components increases. The objective of this study is to further stimulate practical applications based on Fill's algorithm.

Laird Breyer. A new coupling construction for Markov chains. July 10, 1999.

Abstract: Many branches of Probability Theory have benefited from coupling constructions, not least the theory of Markov chains. The coupling inequality for example allows us to bound the total variation distance of a Markov chain to its stationary distribution, in an entirely probabilistic way. This has led recently to computable bounds. As another example, Perfect Simulation exploits couplings to produce exact samples from distributions that are hard to simulate from. A central problem is the actual construction of the coupling, which tends to be problem specific and something of an art form. The "splitting technique" is one method which is very general and has been very successful for years. Among its drawbacks are the need to perform analytic estimates before the method can be applied. In this talk, we shall propose a "more automatic" way of coupling Markov chains, which does not require such estimates, and is applicable to a variety of chains.

Kajsa Fröjd. Perfect simulation of some multitype spatial point processes.
Jem Corcoran. Perfect Simulation with Applications to Statistical Mechanics. February 24, 2000.

Abstract: "Perfect simulation"-- Markov chain Monte Carlo with no statistical error. It sounds great, it even looks kind of easy. So why aren't you doing it? Probably because, as they say, "the devil is in the details." Perfect simulation methods are based on the idea of coupling sample paths of a Markov chain. In this talk we will explore the details of three specific examples, each having specific implementation difficulties. We will consider an unbounded storage model where paths must be coupled on a continuous state space, the standard Ising model of magnetism (both the ferromagnet and antiferomagnet) where paths are running around in a more complicated state space of "spin configurations", and the hard-core lattice gas model where sample paths will naturally repel one another despite the fact that our goal is to get them to meet.

E. Olusegun George, Giri Narasimhan, and Anthony Quas. Fast and exact Gibbs sampler: an application of Propp and Wilson's CFTP algorithm.

Abstract: Gibbs Sampling is a popular technique for generating samples from posterior distributions with multidimensional parameters. Often the posterior distributions for such problems are analytically or numerically complex. As in the general MCMC procedure, the samples for posterior analysis are usually collected after a burn-out period, by which time, it is assumed that the sequence of generated values are close to those that would be generated by the true posterior distribution. The problem with this method is that, in order to be sure that sample for posterior analysis is a reasonable representation of the true posterior, a very large number of iterations is typically used for burnout. Furthermore, although the bias introduced by the choice of initial value may become vanishingly small with increasing number of iterations, it is never completely eliminated. In practice, the choice of number of iterations required for burnout is arbitrary and many researchers have proposed less than satisfactory techniques to approach the question. We introduce an alternative method to perform Gibbs Sampling that addresses the above-mentioned shortcomings. The practical advantages of all our methods are as follows: the samples generated are �perfect', i.e., the bias is completely eliminated; the method has a well-defined stopping rule and the expected number of iterations needed for burnout can be estimated in many cases. We illustrate our method by computing posterior distribution of in a teratology application. Our method is an extension of the techniques for perfect sampling suggested by Propp and Wilson (1996).

### Surveys and Representative Articles from Related Areas

Information on heuristic algorithms for assessing Markov chain convergence to the stationary distribution (among other things) can be found in these resources:

Alan D. Sokal. Monte Carlo methods in statistical mechanics: Foundations and new algorithms, 1989. Lecture notes from Cours de Troisième Cycle de la Physique en Suisse Romande. Updated in 1996 for the Cargèse Summer School on ``Functional Integration: Basics and Applications.'' (Click here to download.)

Surveys of rigorous bounds on the convergence rate of Markov chains:

Persi Diaconis and Laurent Saloff-Coste. What do we know about the Metropolis algorithm?, In Proceedings of the Twenty-Seventh Annual ACM Symposium on the Theory of Computing, pages 112-129, 1995.

Mark Jerrum and Alistair Sinclair. The Markov chain Monte Carlo method: an approach to approximate counting and integration, in ``Approximation Algorithms for NP-hard Problems,'' D.S.Hochbaum ed., PWS Publishing, Boston, 1996. (Click here to download.)

A good introduction to stopping times with many examples and references:

David Aldous and Persi Diaconis. Shuffling Cards and Stopping Times, American Mathematical Monthly, 93(5):333-348, 1986.

Persi Diaconis and James A. Fill. Strong stationary times via a new form of duality. The Annals of Probability, 18:1483--1522, 1990.

Representative articles on stochastically recursive sequences:

Herman Thorisson. Backward limits, The Annals of Probability 16(2):914--924, 1988.

A. A. Borovkov and S. G. Foss. Stochastically recursive sequences and their generalizations. Siberian Advances in Mathematics, 2(1):16--81, 1992.

Reversible Markov chains:

David J. Aldous and James A. Fill. Reversible Markov Chains and Random Walks on Graphs. Book in preparation, http://www.stat.berkeley.edu/~aldous/book.html.

When explaining CFTP to an audience, there are invariably many questions. Included below are some of these questions together with their answers. [Adapted from section 1.4 of Wilson (2000), first published by the AMS.]

Q: If we always end up in state 3 no matter where we start, then how is that a random sample?
A: For a given particular sequence of coin flips, every possible starting state ends up in state 3. But for a different (random) sequence of coin flips, every possible starting state may end up in a different final state at time 0.

Q: What if we just run the Markov chain forward until coalescence?
A: Consider the toy example of the previous section. Coalescence only occurs at the states 0 and n, so the result would be very far from being distributed according to π.

Q: What happens if we use fresh coins rather than re-using the same coins?
A: Consider the toy example of the previous section, and set n=2 (so the states are 0,1,2). Then one can check that the probability of outputting state 1 has a binary expansion of 0.0010010101001010101010101001..., which is neither rational nor very close to 1/3.

Q: Rather than doubling back in time, what if we double forward in time, and stop the Markov chain at the first power of 2 greater than or equal to the coalescence time?
A: As before, set n=2 in our toy example. One can check that the probability of outputting state 1 is 1/6 rather than 1/3.

Q: What if we instead ...
A: Enough already! There do exist other ways to generate perfectly random samples, but the only obvious change that can be made to the CFTP algorithm without breaking it is changing the sequence of starting times in the past from powers of 2 to some other sequences integers.

Q: If coupling forwards in time doesn't work, then why does going backwards in time work?
A: If you did not like the first explanation, then another way to look at it is that a virtual Markov chain has been running for all time, and so today the state is random. If we can (with probability 1) figure out today's state by looking at some of the recent randomizing operations, then we have a random state.

Q: If going backwards in time works, then why doesn't going forwards in time work?
A1: There is a way to obtain perfectly random samples by running forwards in time (Wilson 2000), but none of the obvious variations described above work.
A2: CFTP determines the state at a deterministic time, any one of which is distributed according to π. The variations suggested above determine the state at a random time, where that time is correlated with the moves of the Markov chain in a complicated way, and these correlations mess things up.

Q: Why did you step back by powers of 2, when any other sequence would have worked?
A: For efficiency reasons. Let \$T^*\$ be the best time in the past at which to start, i.e. the smallest integer for which starting at time \$-T^*\$ leads to coalescence. By stepping back by powers of 2, the total number of Markov chain steps simulated is never larger than \$4 T^*\$ (and closer to \$2.8 T^*\$ ``on average''); see Propp and Wilson (1996), section 5.2. If we had stepped back by one each time, then the number of simulated steps would have been \${T^*}{2} \approx (1/2) (T^*)^2\$. If we had stepped back quadratically, then the number of simulated steps would have been about \$(1/3) (T^*)^{3/2}\$. For this reason some people prefer quadratic backoff when \$T^*\$ is fairly small.

Q: So you need the state space to be monotone?
A: That would certainly be very useful, but there are many counterexamples to the proposition that is necessary. See {coupling}.

Q: Can you do CFTP on Banach spaces?
A: I don't know. Depends on whether or not you can figure out the state at time 0.

Q: Where's the proof of efficiency?
A: In the monotone setting, loosely speaking, CFTP is efficient whenever the Markov chain mixes rapidly; see {monotone}. There also proofs of efficiency for certain non-monotone settings.

Q(?): But you still need to analyze the mixing time, since otherwise you won't know how long it will take.
A: Wrong. The principal advantage of using CFTP (or Fill's algorithm) is that you don't need these a priori mixing time bounds in order to run the algorithm, collect perfectly random samples, and carry on with the rest of the research project. The fact that these are perfectly random samples is icing on the cake. (But I still think that mixing time bounds are interesting.)

Q: But sampling spin glass configurations is NP-hard.
A: The spin-glass Markov chain that you're using is probably very slowly mixing in the worst case. CFTP will not be faster than the mixing time of the underlying Markov chain.

Q: Is CFTP like a ZPP or Las Vegas algorithm for random sampling?
A: ``Yes'' for Las Vegas, ``sometimes'' for ZPP. [A Las Vegas algorithm uses randomness to compute a deterministic function. The running time is a random variable, but with probability 1 it returns an answer, and when it does so, the answer is correct. A Monte Carlo algorithm in contrast does not guarantee a correct answer. A problem is in ZPP if it can be solved by polynomial expected time Las Vegas algorithm.]

Q: Could you make a hybrid algorithm, which starts out by doing CFTP, but then does something different if CFTP starts to take a long time?
A: Yes. This would be like experiment \$C_T\$.

Q: What's the probability that CFTP takes a long time?
A: The tail distribution of the running time decays geometrically. More can be said if the Markov chain is monotone, see {monotone}.

Q: What if I don't want to wait for \$10^{70}\$ years. Does this make CFTP biased? Is it really better than forwards coupling?
A1: The answer to the second question is yes. How long are you willing to wait? With forward coupling, that's how long you wait. With CFTP your average waiting time is probably much smaller.
A2: The answer to the first question depends on what you mean by ``biased''. No-one disputes that the systematic bias is zero. The so-called ``user-impatience bias'' (the effect of an impatient user interrupting a simulation and progress) is a second order effect that pertains to most random sampling algorithms that most people would not call biased.
A3: If you're genuinely concerned about the quality of your random samples, you should first spend time picking a good pseudo-random number generator.
A4: Anyone concerned about ``user-impatience bias'' should be equally concerned about ``user-patience bias'': if an experimenter run simulations until some deadline, and then lets the last one finish before quitting, then the resulting collection of samples is biased to contain more samples that take a long time to generate. But if the experimenter instead aborts the last simulation (unless there are no samples so far, in which case (s)he lets it finish), then the resulting collection of samples is unbiased. See Glynn and Heidelberger.

Q: Can you quantify the user-impatience bias?
A: If you generate N samples before your deadline, and then average some function of these samples, the bias is at most Pr[N=0]. If you do something sensible in the event that N=0, the bias will be even less. See Glynn and Heidelberger.

### Gallery of Pictures and Java Applets Relevant to the Articles

Postscript illustration of the Murdoch-Green rejection coupler

Postscript illustration of the Murdoch-Green partitioned multigamma coupler

Postscript illustration of the Green-Murdoch random walk Metropolis bisection coupling

Postscript illustration of Green-Murdoch random walk Metropolis bounded CFTP 2-state Potts 3-state Potts 4-state Potts 5-state Potts
The 2-state, 3-state, 4-state, and 5-state Potts models at their respective critical temperatures on the 2D grid, generated by monotone-CFTP on the Fortuin-Kasteleyn random-cluster model.

The 4-state Potts model is particularly interesting, click here to learn why and to see much larger perfectly random 4-state Potts configurations.

Childs, Patterson, and MacKay give a number of perfect Potts configurations with antiferromagnetic and ferromagnetic interactions, as well as a number of associated animations, and plots of thermodynamic quantities. A random spanning tree of the grid, generated by loop-erased random walk. (A spanning tree of a graph is a connected, acyclic subgraph of the given graph.) A random lozenge tiling of a hexagon, generated by monotone-CFTP. A lozenge is a 60-120 degree rhombus. The three different orientations of lozenges are shown in different colors. (Representation of) random ``square ice'' with extremal boundary conditions, generated by monotone-CFTP. In the square ice model each vertex of the square grid points to two of its neighbors, and is pointed at by its two other neighbors. To get this representation, the outgoing arcs from the even-parity vertices were used to create contour lines. The spaces between these contours were alternately colored blue and white. Extremal boundary conditions, associated with alternating sign matrices, were used. Random independent set of the grid (with periodic boundary conditions, and ``activity'' or ``fugacity'' 2), generated by monotone-CFTP. An independent set of a graph is a set of vertices such that no two vertices in the set are adjacent. The probability of an independent set is proportional to the fugacity (2 here) raised to the number of points in the set. Points of this random independent set at even-parity grid positions are shown in one color, points at odd-parity grid positions are shown in the other color.

Perfect simulations of the area-interaction point process (from Kendall's web page of abstracts):
(a) attractive case and (b) repulsive case ### Definitions of Selected Terms

Included here are the definitions of some terms used but not defined above.

area-interaction point process: Defined relative to the Poisson point process, the probability density of a configuration is given an extra weight that is exponentially small or large in the area of the union of circles (or other fixed shapes) centered about the points. When the extra weight is exponentially small (large) in the area, in a configuration with many points, the points will tend to be nearby (far away), giving the attractive (repulsive) area-interaction point process.

continuum random cluster model: Defined relative to a Poisson point process, clusters are formed by taking the union of circles (or other shapes) centered about the points. The probability density of a configuration gets an extra factor of q for each cluster that it contains. The discrete version of the continuum random cluster model is not the random cluster model, but rather is related to it in the same way that site percolation is related to bond percolation.

 dimer model: A dimer configuration is a perfect matching of a graph, i.e. a pairing of the vertices of the graph so that each vertex is paired with exactly one other vertex, connected to it by an edge of the graph. Typically all configurations are equally likely, but sometimes each edge contributes a weight to those perfect matchings that contain it. (The term ``dimer'' comes from considering the vertices to be atoms. The pairing of vertices corresponds to the placement of bonds to make dimers.) Depicted here is a random dimer configuration of the hexagonal grid, generated by monotone-CFTP, and shown as a tiling. Each vertex/atom corresponds to a triangle, and each edge/dimer is shown as a rhombus. hard core model: A hard-core configuration is an indepen- dent set of a graph, i.e.\ a set of vertices of the graph such that no two vertices are connected by an edge. The probability of an independent set gets an extra factor of λ, known as the ``activity'' or ``fugacity'', for each vertex that it contains. Shown here is a random independent set of the grid (with periodic boundary conditions), with λ=2, generated by monotone-CFTP. ice model: An ice configuration is an Eulerian orientation of a graph, i.e. an assignment of directions to the edges of a graph so that at each vertex the number of edges directed towards it equals the number of edges directed away from it. Some edges may be fixed at the boundaries of the graph, but given those constraints, all configurations are equally likely. (The term ``ice'' comes from an analogy with ice, where the vertices correspond to water molecules, the edges to hydrogen bonds, and the edge orientations determine which water molecule's hydrogren atom is involved in the hydrogen bond.)

Potts model: Each vertex on a graph (typically a grid) is assigned one of q different colors. The probability of a configu- ration gets an extra weight for each edge whose two endpoints are the same color. (The weight is thought of as being related to an interaction energy and the temperature.) When the weights are at least one, the model is ``ferromagnetic'', if they are less than one, then the model is ``antiferromagnetic''. The case q=2 is the Ising model. When the weight factor is zero, one obtains proper q-colorings.

random-cluster model: Also known as the Fortuin--Kasteleyn random cluster model, and bond-correlated percolation, it can be defined relative to ordinary bond percolation (each edge occuring independently with probability p), where the probability of a configuration gets an extra weight of q for each connected component it contains. When q is a positive integer, this model is closely related to the q-state ferromagnetic Potts model. When q→0 and then p→0, the result is a random spanning tree. When q=1 one gets ordinary percolation.

spanning tree: A connected, acyclic subgraph of a given graph. If the graph is directed, then ``spanning tree'' refers to an ``arborescence'', in which a distinguished vertex is the ``root'' of the tree, and the edges of the tree are directed to- wards the root. A random spanning tree is such a tree (or arborescence) chosen uniformly at random, or if the graph is weighted, with probability proportional to the products of the weights of the edges contained in the tree.

Widom-Rowlinson model: A model in which there are typically two (but possibly more) types of particles, where particles of different types can't be nearby. In the discrete case, there is a graph, each vertex may contain zero or one particles, and no edge has two different types of particles on its endpoints. The probability of a legal configuration is proportional to some constant (the ``fugacity'' or ``activity'') raised to the power of the number of particles. In the continuous case, particles lie in the plane (or other manifold), and particles of different type must be some minimum distance apart. The probability density of a configuration is what one would get by randomly assigning particle types to the points of a Poisson process, and conditioning on the configuration being legal.

### Perfect Sampling Code on the Web

There seems to be enough perfect sampling code on the web to warrant making a section of links to it. Here is a partial listing, I will add more links as I track them down or they are called to my attention.

First published in volume 41 of DIMACS Series in Discrete Mathematics and Theoretical Computer Science, published by the American Mathematical Society, 1998.

Thanks go to Andrei Broder, James A. Fill, Wilfrid S. Kendall, Jesper Møller, and James G. Propp for their helpful comments, and David J. C. MacKay for help updating the HTML.