Linear Regression

The variable selection problem has a natural Bayesian solution: any collection of models with different sets of regressors can be compared via their Bayes factors.

A semiconjugate prior distribution

Let $\boldsymbol\beta \sim MN(\boldsymbol\beta_0, \Sigma_0)$, then

$$
\boldsymbol\beta \mid \mathbf y, \mathbf X, \sigma^2 \sim MN\left(\left(\Sigma_0^{-1} + \mathbf X^T\mathbf X/\sigma^2\right)^{-1}\left(\Sigma_0^{-1}\boldsymbol\beta_0 + \mathbf X^T\mathbf y/\sigma^2\right),\ \left(\Sigma_0^{-1} + \mathbf X^T\mathbf X/\sigma^2\right)^{-1}\right)
$$

and let $1/\sigma^2 \sim Ga(\nu_0/2, \nu_0\sigma_0^2/2)$, then

$$
1/\sigma^2 \mid \mathbf y, \mathbf X, \boldsymbol\beta \sim Ga\left([\nu_0 + n]/2,\ [\nu_0\sigma_0^2 + SSR(\boldsymbol\beta)]/2\right)
$$

Then we can construct the following Gibbs sampler:
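
A minimal Python sketch of such a sampler, alternating between the two full conditionals above, might look as follows. The function name `gibbs_semiconjugate`, its arguments, and the choice to start $\sigma^2$ at the prior guess $\sigma_0^2$ are illustrative assumptions, not part of the original notes.

```python
import numpy as np

def gibbs_semiconjugate(y, X, beta0, Sigma0, nu0, sigma20, n_iter=2000, seed=0):
    """Gibbs sampler for (beta, sigma^2) under the semiconjugate prior.

    Hypothetical helper: names and defaults are assumptions for illustration.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    XtX, Xty = X.T @ X, X.T @ y
    Sigma0_inv = np.linalg.inv(Sigma0)

    sigma2 = sigma20  # assumed starting value: the prior guess
    betas = np.empty((n_iter, p))
    sigma2s = np.empty(n_iter)

    for s in range(n_iter):
        # beta | y, X, sigma^2 ~ MN(m, V)
        V = np.linalg.inv(Sigma0_inv + XtX / sigma2)
        V = (V + V.T) / 2  # symmetrize against round-off
        m = V @ (Sigma0_inv @ beta0 + Xty / sigma2)
        beta = rng.multivariate_normal(m, V)

        # 1/sigma^2 | y, X, beta ~ Ga((nu0 + n)/2, (nu0*sigma20 + SSR(beta))/2)
        resid = y - X @ beta
        ssr = resid @ resid
        shape = (nu0 + n) / 2
        rate = (nu0 * sigma20 + ssr) / 2
        # NumPy's gamma takes a scale parameter, i.e. 1/rate
        sigma2 = 1.0 / rng.gamma(shape, 1.0 / rate)

        betas[s], sigma2s[s] = beta, sigma2

    return betas, sigma2s
```

The returned arrays hold one draw of $\boldsymbol\beta$ and $\sigma^2$ per iteration; in practice one would discard an initial burn-in segment before summarizing them.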

Weakly informative prior distributions

  1. unit information prior

  2. a prior under which the parameter estimation is invariant to changes in the scale of the regressors (the g-prior).

For the second case, we can derive a Monte Carlo approximation:

since

$$
p(\sigma^2, \boldsymbol\beta \mid \mathbf y, \mathbf X) \propto p(\boldsymbol\beta \mid \mathbf y, \mathbf X, \sigma^2)\, p(\sigma^2 \mid \mathbf y, \mathbf X)
$$
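
The factorization suggests drawing $\sigma^2$ from its marginal posterior and then $\boldsymbol\beta$ from its conditional posterior, which yields independent joint samples without any Markov chain. The sketch below assumes that the scale-invariant prior is Zellner's g-prior, $\boldsymbol\beta \mid \sigma^2 \sim MN(\mathbf 0, g\sigma^2(\mathbf X^T\mathbf X)^{-1})$, for which the standard results are $1/\sigma^2 \mid \mathbf y, \mathbf X \sim Ga([\nu_0+n]/2, [\nu_0\sigma_0^2 + SSR_g]/2)$ with $SSR_g = \mathbf y^T(\mathbf I - \tfrac{g}{g+1}\mathbf X(\mathbf X^T\mathbf X)^{-1}\mathbf X^T)\mathbf y$, and $\boldsymbol\beta \mid \mathbf y, \mathbf X, \sigma^2 \sim MN(\tfrac{g}{g+1}\hat{\boldsymbol\beta}_{ols}, \tfrac{g}{g+1}\sigma^2(\mathbf X^T\mathbf X)^{-1})$. The function name `gprior_monte_carlo` and the default $g = n$ are assumptions made for illustration.

```python
import numpy as np

def gprior_monte_carlo(y, X, g=None, nu0=1.0, sigma20=1.0, n_samples=1000, seed=0):
    """Direct Monte Carlo draws of (beta, sigma^2) under an assumed g-prior.

    Hypothetical helper: names and defaults are assumptions for illustration.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    if g is None:
        g = n  # assumed default, not specified in the notes

    XtX_inv = np.linalg.inv(X.T @ X)
    beta_ols = XtX_inv @ (X.T @ y)
    shrink = g / (g + 1)

    # SSR_g = y'y - g/(g+1) * y'X (X'X)^{-1} X'y
    ssr_g = y @ y - shrink * (y @ X @ beta_ols)

    # Step 1: sigma^2 from its marginal posterior (inverse-gamma)
    shape = (nu0 + n) / 2
    rate = (nu0 * sigma20 + ssr_g) / 2
    sigma2 = 1.0 / rng.gamma(shape, 1.0 / rate, size=n_samples)

    # Step 2: beta | sigma^2 ~ MN(shrink * beta_ols, shrink * sigma^2 * (X'X)^{-1})
    L = np.linalg.cholesky(shrink * XtX_inv)
    z = rng.standard_normal((n_samples, p))
    beta = shrink * beta_ols + np.sqrt(sigma2)[:, None] * (z @ L.T)

    return beta, sigma2
```

Because every draw is independent, no burn-in or convergence diagnostics are needed here, in contrast to a Gibbs sampler.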