Metropolis
The Monte Carlo method suffices for conjugate prior distributions, where the explicit form of the posterior distribution can be derived, and the Gibbs sampler can also handle semiconjugate prior distributions. When a conjugate or semiconjugate prior is unavailable, however, we need the Metropolis-Hastings algorithm, a generic method for approximating the posterior distribution.

Procedure

1. Sample $\theta^* \sim J(\theta \mid \theta^{(s)})$, where $J$ is a symmetric proposal distribution;
2. Compute the acceptance ratio
   $$r = \frac{p(\theta^* \mid y)}{p(\theta^{(s)} \mid y)} = \frac{p(y \mid \theta^*)\,p(\theta^*)}{p(y \mid \theta^{(s)})\,p(\theta^{(s)})};$$
3. Set $\theta^{(s+1)} = \theta^*$ with probability $\min(r, 1)$, and $\theta^{(s+1)} = \theta^{(s)}$ with probability $1 - \min(r, 1)$.
In practice, $r$ is computed on the log scale to avoid numerical underflow: draw $u \sim \text{uniform}(0, 1)$ and accept $\theta^*$ if $\log u < \log r$, which is what the code below does. The first $B$ iterations are called the "burn-in" period and should be discarded.
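As a minimal sketch of the procedure above (not from the book), the sampler can be written as a generic R function; `log.post` is a hypothetical user-supplied function returning the unnormalized log posterior $\log p(y \mid \theta) + \log p(\theta)$, and the proposal is a normal random walk:

```r
# Minimal Metropolis sketch; log.post is a hypothetical user-supplied
# function returning the unnormalized log posterior of theta.
metropolis = function(log.post, theta0, delta2, S, B) {
  theta = theta0
  THETA = numeric(S)
  for (s in 1:S) {
    theta.star = rnorm(1, theta, sqrt(delta2))      # 1. symmetric proposal
    log.r = log.post(theta.star) - log.post(theta)  # 2. log acceptance ratio
    if (log(runif(1)) < log.r) theta = theta.star   # 3. accept w.p. min(r, 1)
    THETA[s] = theta
  }
  THETA[-(1:B)]  # discard the burn-in period
}
```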

Example: Normal distribution with known variance

Let $\theta \sim N(\mu, \tau^2)$ and $\{y_1, \ldots, y_n \mid \theta\} \overset{\text{i.i.d.}}{\sim} N(\theta, \sigma^2)$; then the posterior distribution of $\theta$ is $N(\mu_n, \tau_n^2)$, where
$$\tau_n^2 = \left(\frac{1}{\tau^2} + \frac{n}{\sigma^2}\right)^{-1}, \qquad \mu_n = \tau_n^2 \left(\frac{\mu}{\tau^2} + \frac{n\bar{y}}{\sigma^2}\right).$$
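Since the posterior is available in closed form here, we can check the constants used in the plotting code below. With the prior and data from the simulation, the posterior parameters evaluate to roughly 10.03 and 0.44:

```r
# check the true posterior parameters used in the plot below
s2 = 1; t2 = 10; mu = 5
y = c(9.37, 10.18, 9.16, 11.60, 10.33)
n = length(y)
t2.n = 1 / (1/t2 + n/s2)              # posterior variance
mu.n = t2.n * (mu/t2 + n*mean(y)/s2)  # posterior mean
c(mu.n, sqrt(t2.n))                   # approximately 10.03 and 0.44
```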
The simulation is as follows:
```r
s2 = 1; t2 = 10; mu = 5  # data variance, prior variance, prior mean
y = c(9.37, 10.18, 9.16, 11.60, 10.33)
theta = 0; delta2 = 2; S = 10000; THETA = NULL  # delta2 is the proposal variance
set.seed(1)

for (s in 1:S)
{
  # propose theta* from a normal random walk
  theta.star = rnorm(1, theta, sqrt(delta2))
  # log acceptance ratio: log likelihood + log prior, proposed minus current
  log.r = (sum(dnorm(y, theta.star, sqrt(s2), log = TRUE)) +
             dnorm(theta.star, mu, sqrt(t2), log = TRUE)) -
          (sum(dnorm(y, theta, sqrt(s2), log = TRUE)) +
             dnorm(theta, mu, sqrt(t2), log = TRUE))
  if (log(runif(1)) < log.r){
    theta = theta.star
  }
  THETA = c(THETA, theta)
}

# figure 10.3
## left: trace plot
plot(THETA)
## right: histogram of the samples
hist(THETA, probability = TRUE, breaks = 70, xlim = c(8, 12))
## true posterior distribution
lines(sort(THETA), dnorm(sort(THETA), 10.03, 0.44))
```
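As a possible follow-up (not in the original code), one might discard a burn-in and check the mixing of the chain with the effective sample size from the coda package; the burn-in length $B = 1000$ here is an arbitrary choice:

```r
# assumes the coda package is installed
library(coda)
THETA.post = THETA[-(1:1000)]       # drop a burn-in of B = 1000 iterations
effectiveSize(as.mcmc(THETA.post))  # effective sample size of the chain
mean(THETA.post); sd(THETA.post)    # should be close to 10.03 and 0.44
```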

Example: Poisson regression

The data can be found on Peter Hoff's personal page.
```r
## get data
## you can find it on Peter Hoff's website
## https://www.stat.washington.edu/~pdhoff/book.php

# yX.sparrow = dget("yX.sparrow")

y = yX.sparrow[, 1]   # response counts
X = yX.sparrow[, -1]  # design matrix
n = length(y)
p = dim(X)[2]

# prior mean and standard deviation of beta
pmn.beta = rep(0, p)
psd.beta = rep(10, p)
# proposal covariance matrix
var.prop = var(log(y + 1/2)) * solve(t(X) %*% X)
S = 10000
beta = rep(0, p)
acs = 0  # acceptance counter
BETA = matrix(0, nrow = S, ncol = p)
set.seed(1)

library(MASS)
for (s in 1:S)
{
  # propose beta* from a multivariate normal random walk
  beta.p = mvrnorm(1, beta, var.prop)
  # log acceptance ratio: log likelihood + log prior, proposed minus current
  lhr = sum(dpois(y, exp(X %*% beta.p), log = T)) -
        sum(dpois(y, exp(X %*% beta), log = T)) +
        sum(dnorm(beta.p, pmn.beta, psd.beta, log = T)) -
        sum(dnorm(beta, pmn.beta, psd.beta, log = T))
  if (log(runif(1)) < lhr){
    beta = beta.p
    acs = acs + 1
  }
  BETA[s, ] = beta
}
# trace plot for beta3
plot(BETA[, 3])
```
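A natural follow-up (not in the original code) is to look at the acceptance rate tracked by `acs` and to summarize the marginal posteriors, again after discarding an arbitrary burn-in of 1000 iterations:

```r
acs / S  # overall acceptance rate of the chain
# 2.5%, 50%, and 97.5% posterior quantiles for each coefficient
apply(BETA[-(1:1000), ], 2, quantile, probs = c(0.025, 0.5, 0.975))
```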