This algorithm, popular in mathematics and physics, is based on the rejection of unacceptable elements. It is named after Nicholas Metropolis, who published it in 1953 for the Boltzmann distribution. This algorithm generates samples according to any given probability distribution π(x). The only requirement is that the density of the distribution can be computed at any point x. The algorithm generates a Markov chain in which every state x(t+1) depends only on the previous state x(t). The algorithm uses a transition function q(x'|x(t)), which depends only on the current state x(t), in order to generate a new proposed element x'. The proposal is accepted as the new element (x(t+1) = x') if a number u drawn from the uniform distribution U(0, 1) satisfies

u < π(x') / π(x(t))
Otherwise the current value is retained (i.e. x(t+1) = x(t)).
For example, the transition function can generate points according to the Gaussian distribution centered on the current state x(t):

q(x'|x(t)) = N(x(t), Σ)
where q(x'|x(t)) can be interpreted as the probability density for x' in the case when the previous value is equal to x(t). Therefore, the transition function generates elements located around the current state x(t) with the covariance equal to Σ, and the proposed state x' is drawn with probability q(x'|x(t)). Next, the ratio

r = π(x') / π(x(t))
of the probabilities of the proposed element and the previous one is calculated. The new state is chosen according to the rule

x(t+1) = x' if u < min(1, r), and x(t+1) = x(t) otherwise,

where u is drawn from the uniform distribution U(0, 1).
The algorithm works best if the shape of the transition function q matches the shape of the target distribution π(x), i.e.

q(x'|x(t)) ≈ π(x')
However, in the majority of cases the shape of the target distribution is unknown. If the transition probability used in the transition function has a Gaussian distribution, then its variance should be adjusted during the burn-in period. This is usually done by calculating the acceptance ratio, which should oscillate around 60%. If the steps to the proposed elements are too small, then the chain will develop slowly (i.e. it will move slowly around one area and the convergence to π(x) will also be slow). If the steps are too large, then the acceptance ratio will be very small, because the proposed elements will frequently be located in areas with very small probability density. Hence the chain will rarely move and will again converge slowly.
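The steps described above can be sketched in a few lines of Python. This is an illustrative reimplementation of the random-walk Metropolis scheme, not the library's MetropolisSampler; the function and parameter names are invented for the example, and the density is passed in logarithmic form for numerical stability.

```python
import math
import random

def metropolis(log_density, start, step, length, seed=0):
    """Random-walk Metropolis sampler (illustrative sketch).

    log_density -- log pi(x), known up to an additive constant
    step        -- standard deviation of the Gaussian proposal
    Returns the chain and the acceptance ratio."""
    rng = random.Random(seed)
    x = start
    chain = [x]
    accepted = 0
    for _ in range(length - 1):
        x_new = rng.gauss(x, step)  # symmetric Gaussian proposal q(x'|x)
        # accept with probability min(1, pi(x')/pi(x)), computed in logs
        if math.log(rng.random()) < log_density(x_new) - log_density(x):
            x = x_new
            accepted += 1
        chain.append(x)  # on rejection the current value is retained
    return chain, accepted / (length - 1)

# Sample from a standard normal: log pi(x) = -x^2/2 (constant dropped)
chain, ratio = metropolis(lambda x: -0.5 * x * x,
                          start=0.0, step=1.0, length=5000)
```

Varying `step` and watching `ratio` reproduces the tuning behaviour described above: a tiny step gives a high acceptance ratio but a slowly moving chain, a huge step gives a very low one.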
The constructor for the Metropolis algorithm has the following syntax:
MetropolisSampler(distribution, trans[, log])
Table 9.4. MetropolisSampler constructor settings
Parameter name | Parameter type | Description |
---|---|---|
distribution | a SamplingDistribution object | the distribution to be sampled |
trans | a TransitionFunction object | the transition function used to find next proposed point for the Markov chain |
log | boolean | specifies whether distribution is in logarithmic form (true) or not. The default value is false. |
Additional description of the MetropolisSampler parameters
distribution: the distribution to be sampled. It can be known up to a constant, and it need not be strictly positive on its domain if it is not in logarithmic form.
trans: the transition function used to find the candidate for the next point of the Markov chain. Its dimension has to be exactly the same as the dimension of the distribution. This function must be non-negative and symmetric with respect to its arguments.
log: the log = true setting specifies that the sampling distribution is in logarithmic form (e.g. a log-likelihood). This is useful when calculations are complicated, since replacing multiplication with summation of logarithms makes the required value faster (and numerically safer) to obtain.
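The numerical benefit of the logarithmic form can be seen directly. The snippet below (a standalone Python illustration, unrelated to the library's API) multiplies many small density values, as a likelihood computation would, and compares it with summing their logarithms:

```python
import math

# With many independent observations the likelihood is a product of
# small terms and underflows in double precision, while the
# log-likelihood (a sum of logarithms) stays perfectly representable.
obs = [0.1] * 500            # 500 hypothetical density values of 0.1

product = 1.0
for p in obs:
    product *= p             # underflows to exactly 0.0

log_sum = sum(math.log(p) for p in obs)   # about -1151.3, usable
```

With the product underflowed to zero, the acceptance ratio of the sampler would be meaningless; the log form avoids this entirely, since the acceptance test only needs the difference of two log densities.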
Table 9.5. The MetropolisSampler methods
Method name and syntax | Description of parameters | Output | Output type |
---|---|---|---|
generate(length) | length - the length of the chain to be generated | the generated Markov chain | MarkovChain |
generate(length, nrOfChains, maxLength, wholeChains) | length - the length of the converging parts of the chains (integer); nrOfChains - the number of chains to generate, must be at least 2 (integer); maxLength - the maximal length of the generated chains (integer); wholeChains - if false only the converging parts of the chains will be returned, if true whole chains will be returned | an array of generated Markov chains with the specified length if wholeChains is set to false, or the whole generated chains until convergence is observed if wholeChains is set to true | MarkovChain[] |
getAcceptanceRatio() | | the acceptance ratio (i.e. the number of accepted points divided by the number of proposed points) obtained during the last sampling | double |
getTransitionFunction() | | the TransitionFunction object used in constructing the MetropolisSampler object | TransitionFunction |
getGeneratedLength() | | the length of the last generated chain | int |
getRTolerance() | | the tolerance for the R parameter | double |
setRTolerance(tolerance) | tolerance - the acceptable value of the R parameter, below which chain generation will be stopped (double) | | void |
setStartingPoint(point) | point - the starting point for the algorithm (double[]) | | void |
getSeed() | | the random number seed used for generating the chain | int |
setSeed(seed) | seed - the value to set as the random number seed for generating chains; if 0 is used, the seed itself will be random | | void |
terminate() | | | void |
Detailed description of selected MetropolisSampler methods
generate(length): The basic method for generating chains; it generates a sample from the specified distribution. All points are grouped in one MarkovChain object.
generate(length, nrOfChains, maxLength, wholeChains): An alternative chain generation method, with the capability to generate samples with automated convergence examination. The chains are generated until the last part of each chain of the specified length has a value of the R parameter below the R tolerance level, or until the maximal chain length is exceeded.
If the starting point is not set, a zero vector is used as the starting point.
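The manual does not define the R parameter; the multi-chain generation it describes matches the Gelman-Rubin potential scale reduction statistic, so the sketch below computes that quantity under this assumption. It is a plain-Python illustration for one-dimensional chains, independent of the library:

```python
def gelman_rubin_r(chains):
    """Gelman-Rubin statistic for several 1-D chains of equal length.

    An assumption: the manual's R parameter is taken to be this usual
    convergence diagnostic; values near 1 indicate convergence."""
    m = len(chains)                       # number of chains (at least 2)
    n = len(chains[0])                    # common chain length
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    # between-chain variance B and mean within-chain variance W
    b = n / (m - 1) * sum((mu - grand) ** 2 for mu in means)
    w = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
            for c, mu in zip(chains, means)) / m
    var_plus = (n - 1) / n * w + b / n    # pooled variance estimate
    return (var_plus / w) ** 0.5

# Two overlapping chains converge; two far-apart chains do not.
same = [[0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 3.0]]
apart = [[0.0, 1.0, 0.0, 1.0], [10.0, 11.0, 10.0, 11.0]]
```

Generation stopping "when R falls below the tolerance" then simply means iterating until `gelman_rubin_r` of the chain tails drops under the value set with setRTolerance.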
In 1970 W. K. Hastings generalized the Metropolis algorithm in the following way: the new element x' is accepted if

u < r · q(x(t)|x') / q(x'|x(t))
where q(x(t)|x') / q(x'|x(t)) is the ratio of the transition function probabilities in the two directions (from x(t) to x' and vice versa). Due to this, the transition function need not be symmetric. Moreover, q(x'|x(t)) may not even depend on x(t). In this case the algorithm is called "Independence Chain Metropolis-Hastings" (as compared to "Random Walk Metropolis-Hastings"). If the transition function is well fitted, the algorithm can achieve better accuracy than the random walk variant, but it also requires some a-priori knowledge about the distribution.
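The Hastings correction can be sketched as follows. This is an illustrative Python implementation (names invented for the example, not the library's API) of one Metropolis-Hastings step with an asymmetric proposal, followed by the independence-chain variant in which the proposal ignores the current state:

```python
import math
import random

def mh_step(rng, log_pi, log_q, propose, x):
    """One Metropolis-Hastings step (illustrative sketch).

    propose(rng, x)      -- draws x' from q(.|x)
    log_q(x_from, x_to)  -- log transition density of x_to given x_from"""
    x_new = propose(rng, x)
    # log of pi(x') q(x|x') / (pi(x) q(x'|x)) -- the Hastings correction
    log_r = (log_pi(x_new) - log_pi(x)
             + log_q(x_new, x) - log_q(x, x_new))
    if math.log(rng.random()) < log_r:
        return x_new
    return x            # rejection: keep the current state

# Independence-chain variant: the proposal does not depend on x.
# Here x' ~ N(0, 2^2) regardless of x, targeting a standard normal.
rng = random.Random(1)
log_pi = lambda x: -0.5 * x * x                       # up to a constant
log_q = lambda x_from, x_to: -0.5 * (x_to / 2.0) ** 2  # ignores x_from
propose = lambda rng, x: rng.gauss(0.0, 2.0)
x = 0.0
chain = []
for _ in range(2000):
    x = mh_step(rng, log_pi, log_q, propose, x)
    chain.append(x)
```

With a symmetric q the two log_q terms cancel and the step reduces exactly to the Metropolis rule above.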
The constructor for the Metropolis-Hastings algorithm has the following syntax:
MetropolisHastingsSampler(distribution, trans[, log])
Table 9.6. MetropolisHastingsSampler constructor settings
Parameter name | Parameter type | Description |
---|---|---|
distribution | a SamplingDistribution object | the distribution to be sampled |
trans | a TransitionFunction object | the transition function used to find next proposed point for the Markov chain |
log | boolean | specifies whether distribution is in logarithmic form (true) or not. The default value is false. |
Additional description of the MetropolisHastingsSampler parameters
distribution: the distribution to be sampled. It can be known up to a constant, and it need not be strictly positive on its domain if it is not in logarithmic form.
trans: the transition function used to find the candidate for the next point of the Markov chain. Its dimension has to be exactly the same as the dimension of the distribution. This function must be non-negative; unlike in the Metropolis algorithm, it need not be symmetric with respect to its arguments.
log: the log = true setting specifies that the sampling distribution is in logarithmic form (e.g. a log-likelihood). This is useful when calculations are complicated, since replacing multiplication with summation of logarithms makes the required value faster (and numerically safer) to obtain.
The methods for MetropolisHastingsSampler are exactly the same as for MetropolisSampler.
Bayesian inference is a statistical inference method in which events or observations are used to update the probability that a given hypothesis is true. The name comes from Bayes' theorem, derived from the works of Thomas Bayes, which is frequently used in the inference process.
There are two ingredients to Bayesian inference: the sampling distribution f(x|θ) and the distribution π(θ). Treating the first ingredient as a function of θ, we get the likelihood function L(θ|x). The second ingredient is called the prior density, as it contains the probability distribution of θ prior to observing the value of x. The inference is therefore based on the probability distribution of θ after observing the value of x, which becomes a part of the set of available information. This distribution is called the posterior distribution and can be obtained using Bayes' theorem

π(θ|x) = f(x|θ) π(θ) / f(x)
where

f(x) = ∫ f(x|θ) π(θ) dθ
If one denotes L(θ|x) = f(x|θ) and observes that f(x) is constant with respect to θ, then it follows that the equation for the density of the posterior distribution can be rewritten in a more compact form

π(θ|x) ∝ L(θ|x) π(θ)
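Since the normalising constant f(x) drops out of the Metropolis acceptance ratio, sampling the posterior only requires evaluating the log-likelihood plus the log-prior. The following standalone Python sketch (hypothetical data and names, not the library's PosteriorSampler) puts the pieces together for a Bernoulli likelihood with a normal prior on the log-odds θ:

```python
import math
import random

# Hypothetical observations: eight coin flips, six successes.
data = [1, 1, 0, 1, 1, 0, 1, 1]

def log_likelihood(theta):
    """log L(theta|x) for Bernoulli data with success probability
    p = sigmoid(theta)."""
    p = 1.0 / (1.0 + math.exp(-theta))
    return sum(math.log(p) if x else math.log(1.0 - p) for x in data)

def log_prior(theta):
    return -0.5 * theta * theta      # N(0, 1) prior, constant dropped

def log_posterior(theta):
    # log pi(theta|x) up to a constant: the f(x) term is not needed
    return log_likelihood(theta) + log_prior(theta)

# Random-walk Metropolis on the unnormalised log posterior.
rng = random.Random(0)
theta, chain = 0.0, []
for _ in range(4000):
    prop = rng.gauss(theta, 0.8)
    if math.log(rng.random()) < log_posterior(prop) - log_posterior(theta):
        theta = prop
    chain.append(theta)
```

The resulting chain concentrates between the prior mean (0) and the value favoured by the data, exactly the compromise the proportionality π(θ|x) ∝ L(θ|x) π(θ) expresses.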
The constructor for the posterior distribution sampler has the following syntax:
PosteriorSampler(likelihood, prior[, trans])
If only two arguments (likelihood and prior) are provided, the constructor uses the Metropolis algorithm, sampling the data from the prior distribution (i.e. the proposal points are drawn directly from the prior distribution without using a transition function, so the next proposal point is independent of the previous one). If the transition function trans is also provided, the constructor uses the Metropolis-Hastings algorithm.
Table 9.7. PosteriorSampler constructor settings
Parameter name | Parameter type | Description |
---|---|---|
likelihood | a LogLikelihood object | the likelihood function to use |
prior | a MultivariateDistribution object | the distribution to be sampled |
trans | a TransitionFunction object | the transition function to use for finding the next proposed points in the Markov chain |
Additional description of the PosteriorSampler parameters
likelihood: the likelihood function in logarithmic form.
prior: the a-priori distribution. It has to have the same dimension as the likelihood function.
trans: the transition function used to find the candidate for the next point in the Markov chain based on the current point. Its dimension has to be exactly the same as the dimension of the distribution. This function must be non-negative on its whole domain.
The PosteriorSampler object has the same methods as MetropolisSampler with the addition of the methods listed in the table below.
Table 9.8. PosteriorSampler exclusive methods
Method name and syntax | Description of parameters | Output | Output type |
---|---|---|---|
getPrior() | | the prior function used | MultivariateDistribution |
getInnerSampler() | | the sampling algorithm used internally by the PosteriorSampler object | MetropolisSampler or MetropolisHastingsSampler |