Linear Regression
The variable selection problem has a natural Bayesian solution: any collection of models with different sets of regressors can be compared via their Bayes factors.

A semiconjugate prior distribution

Let

$$\boldsymbol\beta \sim MN(\boldsymbol\beta_0, \Sigma_0),$$

then

$$\boldsymbol\beta \mid \mathbf y, \mathbf X, \sigma^2 \sim MN\left((\Sigma_0^{-1}+\mathbf X^T\mathbf X/\sigma^2)^{-1}(\Sigma_0^{-1}\boldsymbol\beta_0+\mathbf X^T\mathbf y/\sigma^2),\ (\Sigma_0^{-1}+\mathbf X^T\mathbf X/\sigma^2)^{-1}\right),$$

and let

$$1/\sigma^2 \sim Ga(\nu_0/2,\ \nu_0\sigma_0^2/2),$$

then

$$1/\sigma^2 \mid \mathbf y, \mathbf X, \boldsymbol\beta \sim Ga\left([\nu_0+n]/2,\ [\nu_0\sigma_0^2+SSR(\boldsymbol\beta)]/2\right),$$

where $SSR(\boldsymbol\beta) = (\mathbf y-\mathbf X\boldsymbol\beta)^T(\mathbf y-\mathbf X\boldsymbol\beta)$.
Then we can construct the following Gibbs sampler: given current values $(\boldsymbol\beta^{(s)}, \sigma^{2(s)})$, draw $\boldsymbol\beta^{(s+1)}$ from $p(\boldsymbol\beta \mid \mathbf y, \mathbf X, \sigma^{2(s)})$, then draw $\sigma^{2(s+1)}$ from $p(\sigma^2 \mid \mathbf y, \mathbf X, \boldsymbol\beta^{(s+1)})$, using the two full conditionals above.
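A minimal sketch of this sampler in Python/NumPy follows. The function name `gibbs_lm`, the starting value $\sigma^{2(0)} = \sigma_0^2$, and the argument names are illustrative choices, not fixed by the text.

```python
import numpy as np

def gibbs_lm(y, X, beta0, Sigma0, nu0, sigma02, n_iter=5000, seed=0):
    """Gibbs sampler for the semiconjugate normal linear model.

    Alternates between the two full conditionals above:
      beta | y, X, sigma^2    ~ MVN(m, V)
      1/sigma^2 | y, X, beta  ~ Gamma((nu0+n)/2, (nu0*sigma02 + SSR(beta))/2)
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    XtX, Xty = X.T @ X, X.T @ y
    iSigma0 = np.linalg.inv(Sigma0)
    sigma2 = sigma02                                  # illustrative starting value
    beta_draws, sigma2_draws = np.empty((n_iter, p)), np.empty(n_iter)
    for s in range(n_iter):
        # 1) beta from its full conditional
        V = np.linalg.inv(iSigma0 + XtX / sigma2)
        m = V @ (iSigma0 @ beta0 + Xty / sigma2)
        beta = rng.multivariate_normal(m, V)
        # 2) 1/sigma^2 from its full conditional
        #    (NumPy's gamma takes a *scale* parameter, i.e. 1/rate)
        ssr = np.sum((y - X @ beta) ** 2)
        precision = rng.gamma((nu0 + n) / 2, 2.0 / (nu0 * sigma02 + ssr))
        sigma2 = 1.0 / precision
        beta_draws[s], sigma2_draws[s] = beta, sigma2
    return beta_draws, sigma2_draws
```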

Weakly informative prior distributions

  1. the unit information prior;
  2. a prior under which parameter estimation is invariant to changes in the scale of the regressors (one concrete choice is sketched after this list).
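One standard prior satisfying the second requirement is Zellner's g-prior, $\boldsymbol\beta \mid \sigma^2 \sim MN(\mathbf 0,\ g\sigma^2(\mathbf X^T\mathbf X)^{-1})$. The notes do not name the prior, so take the closed forms below, which hold under the g-prior combined with $1/\sigma^2 \sim Ga(\nu_0/2, \nu_0\sigma_0^2/2)$, as one concrete instance rather than the only possibility:

$$\boldsymbol\beta \mid \mathbf y, \mathbf X, \sigma^2 \sim MN\left(\tfrac{g}{g+1}(\mathbf X^T\mathbf X)^{-1}\mathbf X^T\mathbf y,\ \tfrac{g}{g+1}\sigma^2(\mathbf X^T\mathbf X)^{-1}\right),$$

$$1/\sigma^2 \mid \mathbf y, \mathbf X \sim Ga\left(\tfrac{\nu_0+n}{2},\ \tfrac{\nu_0\sigma_0^2+SSR_g}{2}\right), \qquad SSR_g = \mathbf y^T\left(\mathbf I - \tfrac{g}{g+1}\mathbf X(\mathbf X^T\mathbf X)^{-1}\mathbf X^T\right)\mathbf y.$$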
For the second case, we can derive a Monte Carlo approximation without a Gibbs sampler: since

$$p(\sigma^2, \boldsymbol\beta \mid \mathbf y, \mathbf X) = p(\boldsymbol\beta \mid \mathbf y, \mathbf X, \sigma^2)\, p(\sigma^2 \mid \mathbf y, \mathbf X),$$

we can first draw $\sigma^2$ from its marginal posterior $p(\sigma^2 \mid \mathbf y, \mathbf X)$ and then draw $\boldsymbol\beta$ from its conditional $p(\boldsymbol\beta \mid \mathbf y, \mathbf X, \sigma^2)$, which yields independent draws from the joint posterior.
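A short sketch of this two-step Monte Carlo procedure, using the g-prior closed forms given above; the name `gprior_mc` and all argument names are illustrative assumptions.

```python
import numpy as np

def gprior_mc(y, X, g, nu0, sigma02, n_draws=5000, seed=0):
    """Direct Monte Carlo draws from p(sigma^2, beta | y, X) under a g-prior.

    Uses p(sigma^2, beta | y, X) = p(beta | y, X, sigma^2) p(sigma^2 | y, X):
    draw sigma^2 from its marginal posterior, then beta from its conditional,
    so the draws are independent (no Markov chain is involved).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    H = X @ XtX_inv @ X.T                          # hat matrix
    shrink = g / (g + 1.0)
    ssr_g = y @ (np.eye(n) - shrink * H) @ y       # SSR_g from the formula above
    m = shrink * (XtX_inv @ (X.T @ y))             # conditional mean of beta
    # 1) sigma^2 from its marginal posterior (NumPy's gamma takes scale = 1/rate)
    precision = rng.gamma((nu0 + n) / 2, 2.0 / (nu0 * sigma02 + ssr_g), size=n_draws)
    sigma2_draws = 1.0 / precision
    # 2) beta | sigma^2 from its multivariate normal conditional
    beta_draws = np.array([
        rng.multivariate_normal(m, shrink * s2 * XtX_inv) for s2 in sigma2_draws
    ])
    return beta_draws, sigma2_draws
```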