# Linear Regression

The variable selection problem has a natural Bayesian solution: any collection of models with different sets of regressors can be compared via their Bayes factors.

## A semiconjugate prior distribution

Let $$\boldsymbol\beta \sim N(\boldsymbol\beta\_0, \Sigma\_0)$$ (multivariate normal); then the full conditional is

$$
\beta\mid \mathbf y,\mathbf X, \sigma^2 \sim N((\Sigma\_0^{-1}+\mathbf X^T\mathbf X/\sigma^2)^{-1}(\Sigma\_0^{-1}\boldsymbol\beta\_0+\mathbf X^T\mathbf y/\sigma^2), (\Sigma\_0^{-1}+\mathbf X^T\mathbf X/\sigma^2)^{-1})
$$

and let $$1/\sigma^2\sim Ga(\nu\_0/2,\nu\_0\sigma\_0^2/2)$$; then the full conditional is

$$
1/\sigma^2\mid \mathbf y,\mathbf X,\boldsymbol\beta \sim Ga(\[\nu\_0+n]/2, \[\nu\_0\sigma\_0^2+SSR(\boldsymbol \beta)]/2)
$$

These two full conditionals yield the following Gibbs sampler:

![](https://666993855-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LJfsESZOIJn_3uGIecs%2F-LNYSYxVCQgeUDnZJGri%2F-LNYSZXGeWm4T8qThu7y%2Fconjugate_gibbs.png?generation=1538193247703999\&alt=media)
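The sampler above can be sketched in NumPy as follows. This is a minimal illustration, not a tuned implementation: the function name, the starting value $$\sigma^2 = \sigma\_0^2$$, and the default iteration count are choices made here for demonstration.

```python
import numpy as np

def gibbs_lm(y, X, beta0, Sigma0, nu0, sigma02, n_iter=1000, seed=0):
    """Gibbs sampler for linear regression with a semiconjugate prior.

    Alternates between the two full conditionals:
      beta      | y, X, sigma^2 ~ N(m, V)
      1/sigma^2 | y, X, beta    ~ Gamma((nu0 + n)/2, (nu0*sigma02 + SSR(beta))/2)
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    XtX, Xty = X.T @ X, X.T @ y
    iS0 = np.linalg.inv(Sigma0)
    iS0_b0 = iS0 @ beta0

    sigma2 = sigma02  # starting value (an arbitrary but reasonable choice)
    BETA = np.empty((n_iter, p))
    S2 = np.empty(n_iter)
    for s in range(n_iter):
        # 1) sample beta | y, X, sigma^2 from its multivariate normal conditional
        V = np.linalg.inv(iS0 + XtX / sigma2)
        m = V @ (iS0_b0 + Xty / sigma2)
        beta = rng.multivariate_normal(m, V)

        # 2) sample 1/sigma^2 | y, X, beta from its gamma conditional
        #    (NumPy's gamma is parameterized by shape and scale = 1/rate)
        ssr = np.sum((y - X @ beta) ** 2)
        precision = rng.gamma((nu0 + n) / 2, 2 / (nu0 * sigma02 + ssr))
        sigma2 = 1 / precision

        BETA[s], S2[s] = beta, sigma2
    return BETA, S2
```

In practice one would discard an initial burn-in portion of the chain and check convergence diagnostics before summarizing the draws.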

## Weakly informative prior distributions

1. the unit information prior;
2. a prior under which parameter estimation is invariant to changes in the scale of the regressors (this requirement leads to Zellner's g-prior).

For the second case, the marginal posterior $$p(\sigma^2\mid \mathbf y,\mathbf X)$$ is available in closed form, so we can sample from the joint posterior directly by Monte Carlo, with no Gibbs iteration needed:

![](https://666993855-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LJfsESZOIJn_3uGIecs%2F-LNYSYxVCQgeUDnZJGri%2F-LNYSZXIPe2a37Zgo7fu%2Fmc_approximation.png?generation=1538193247358051\&alt=media)

since, by the product rule,

$$
p(\sigma^2,\boldsymbol\beta\mid \mathbf y, \mathbf X)= p(\boldsymbol\beta\mid \mathbf y,\mathbf X,\sigma^2)\,p(\sigma^2\mid \mathbf y,\mathbf X)
$$
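A sketch of this direct Monte Carlo sampler under Zellner's g-prior, where $$\boldsymbol\beta \sim N(0, g\sigma^2(\mathbf X^T\mathbf X)^{-1})$$: draw $$\sigma^2$$ from its marginal posterior, then $$\boldsymbol\beta$$ from its conditional. The function name and the default $$g = n$$ are illustrative choices, not fixed by the text.

```python
import numpy as np

def gprior_mc(y, X, g=None, nu0=1.0, sigma02=1.0, n_samples=1000, seed=0):
    """Direct Monte Carlo sampling under Zellner's g-prior.

    Draws are i.i.d. (no Markov chain): first sigma^2 from its marginal
    posterior, then beta from p(beta | y, X, sigma^2).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    if g is None:
        g = n  # a common default choice for g

    XtX_inv = np.linalg.inv(X.T @ X)
    beta_ols = XtX_inv @ X.T @ y
    shrink = g / (g + 1)  # shrinkage factor toward the prior mean of zero

    # residual term appearing in the marginal posterior of sigma^2
    ssr_g = y @ y - shrink * (y @ X @ XtX_inv @ X.T @ y)

    # 1) sigma^2 | y, X : draw the precision from its gamma marginal posterior
    precision = rng.gamma((nu0 + n) / 2, 2 / (nu0 * sigma02 + ssr_g),
                          size=n_samples)
    S2 = 1 / precision

    # 2) beta | y, X, sigma^2 : multivariate normal, shrunk OLS mean
    BETA = np.empty((n_samples, p))
    for s in range(n_samples):
        BETA[s] = rng.multivariate_normal(shrink * beta_ols,
                                          shrink * S2[s] * XtX_inv)
    return BETA, S2
```

Because each $$(\sigma^2, \boldsymbol\beta)$$ pair is an independent draw from the joint posterior, no burn-in or convergence checking is required, unlike the Gibbs sampler.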
