
Linear Regression Models: A Bayesian perspective

Ingredients of a linear model include an $n \times 1$ response
vector $y = (y_1, \ldots, y_n)^T$ and an $n \times p$ design matrix (e.g.
including regressors) $X = [x_1, \ldots, x_p]$, assumed to have
been observed without error. The linear model:
$$y = X\beta + \epsilon; \quad \epsilon \sim N(0, \sigma^2 I)$$
The linear model is the most fundamental of all serious
statistical models encompassing:
ANOVA: $y$ is continuous, the $x_i$'s are categorical
REGRESSION: $y$ is continuous, the $x_i$'s are continuous
ANCOVA: $y$ is continuous, some $x_i$'s are continuous, some
categorical.

Unknown parameters include the regression parameters $\beta$
and the variance $\sigma^2$. We assume $X$ is observed without
error and all inference is conditional on $X$.
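As a running example for the snippets that follow, here is a minimal Python sketch that simulates data from this model; the sample size, number of regressors, and the values of $\beta$ and $\sigma$ are illustrative choices, not taken from the slides.

```python
import numpy as np

# Minimal sketch: simulate data from y = X beta + eps, eps ~ N(0, sigma^2 I).
rng = np.random.default_rng(0)
n, p = 100, 3
X = np.column_stack([np.ones(n), rng.standard_normal((n, p - 1))])  # intercept + 2 regressors
beta_true = np.array([1.0, 2.0, -0.5])
sigma_true = 1.5
y = X @ beta_true + rng.normal(0.0, sigma_true, size=n)
```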

Linear Regression Models: A Bayesian perspective

The classical unbiased estimates of the regression
parameter $\beta$ and $\sigma^2$ are
$$\hat{\beta} = (X^T X)^{-1} X^T y; \quad \hat{\sigma}^2 = \frac{1}{n-p} (y - X\hat{\beta})^T (y - X\hat{\beta}).$$
The above estimate of $\beta$ is also a least-squares estimate.
The predicted value of $y$ is given by
$$\hat{y} = X\hat{\beta} = P_X y, \quad \text{where } P_X = X(X^T X)^{-1} X^T.$$
$P_X$ is called the projector of $X$. It projects any vector to the
space spanned by the columns of $X$.
The model residual is estimated as:
$$\hat{e} = (y - X\hat{\beta})^T (y - X\hat{\beta}) = y^T (I - P_X) y.$$
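Continuing the simulated example above, these classical quantities can be computed in a few lines (variable names such as beta_hat and P_X are just illustrative):

```python
# Classical (least-squares) estimates and the projector P_X,
# following the formulas above; X, y, n, p come from the previous snippet.
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y          # (X^T X)^{-1} X^T y
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - p)  # unbiased estimate of sigma^2
P_X = X @ XtX_inv @ X.T               # projects onto the column space of X
y_hat = P_X @ y                       # equals X @ beta_hat
```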

Bayesian regression with flat reference priors

For Bayesian analysis, we will need to specify priors for the
unknown regression parameters $\beta$ and the variance $\sigma^2$.
Consider independent flat priors on $\beta$ and $\log \sigma^2$:
$$p(\beta) \propto 1; \quad p(\log(\sigma^2)) \propto 1, \quad \text{or equivalently} \quad p(\beta, \sigma^2) \propto \frac{1}{\sigma^2}.$$

Neither of the above two distributions is a valid probability
distribution (they do not integrate to any finite number). So why is it
that we are even discussing them?
It turns out that even if the priors are improper (that's what
we call them), as long as the resulting posterior
distributions are valid, we can still conduct legitimate
statistical inference based on them.

Marginal and conditional distributions

With a flat prior on $\beta$ we obtain, after some algebra, the
conditional posterior distribution:
$$p(\beta \mid \sigma^2, y) = N\left(\beta \mid (X^T X)^{-1} X^T y, \; \sigma^2 (X^T X)^{-1}\right).$$
The conditional posterior distribution of $\beta$ would have been
the desired posterior distribution had $\sigma^2$ been known.
Since that is not the case, we need to obtain the marginal
posterior distribution by integrating out $\sigma^2$ as:
$$p(\beta \mid y) = \int p(\beta \mid \sigma^2, y)\, p(\sigma^2 \mid y)\, d\sigma^2$$
Can we solve this integration using composition sampling?
YES: if we can generate samples from $p(\sigma^2 \mid y)$!

Marginal and conditional distributions

So, we need to find the marginal posterior distribution of
$\sigma^2$. With the choice of the flat prior we obtain:
$$p(\sigma^2 \mid y) \propto \left(\frac{1}{\sigma^2}\right)^{(n-p)/2 + 1} \exp\left(-\frac{(n-p)s^2}{2\sigma^2}\right) = IG\left(\sigma^2 \,\Big|\, \frac{n-p}{2}, \frac{(n-p)s^2}{2}\right),$$
where $s^2 = \hat{\sigma}^2 = \frac{1}{n-p}\, y^T (I - P_X) y$.
This is known as an inverted Gamma distribution (also
called a scaled inverse chi-square distribution)
$IG\left(\sigma^2 \mid (n-p)/2, (n-p)s^2/2\right)$.
In other words: $\left[(n-p)s^2/\sigma^2 \mid y\right] \sim \chi^2_{n-p}$ (with $n-p$
degrees of freedom). A striking similarity with the classical
result: the distribution of $\hat{\sigma}^2$ is also characterized by
$(n-p)s^2/\sigma^2$ following a chi-square distribution.
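As a quick numerical check (a sketch assuming scipy is available and reusing n, p, rng, and sigma2_hat from the earlier snippets), the inverted Gamma draws below and the draws obtained through the chi-square characterization have the same distribution:

```python
from scipy import stats

# Draw from p(sigma^2 | y) = IG((n-p)/2, (n-p)s^2/2) and, equivalently,
# via (n-p)s^2 / sigma^2 ~ chi^2_{n-p}.
s2 = sigma2_hat
a_post, b_post = (n - p) / 2.0, (n - p) * s2 / 2.0
sigma2_ig = stats.invgamma.rvs(a_post, scale=b_post, size=5000, random_state=rng)
sigma2_chisq = (n - p) * s2 / stats.chi2.rvs(n - p, size=5000, random_state=rng)
```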

Composition sampling for linear regression

Now we are ready to carry out composition sampling from
$p(\beta, \sigma^2 \mid y)$ as follows:
Draw $M$ samples from $p(\sigma^2 \mid y)$:
$$\sigma^{2(j)} \sim IG\left(\frac{n-p}{2}, \frac{(n-p)s^2}{2}\right), \quad j = 1, \ldots, M$$
For $j = 1, \ldots, M$, draw $\beta^{(j)}$ from $p(\beta \mid \sigma^{2(j)}, y)$:
$$\beta^{(j)} \sim N\left((X^T X)^{-1} X^T y, \; \sigma^{2(j)} (X^T X)^{-1}\right)$$
The resulting samples $\{\beta^{(j)}, \sigma^{2(j)}\}_{j=1}^{M}$ represent $M$
samples from $p(\beta, \sigma^2 \mid y)$.
$\{\beta^{(j)}\}_{j=1}^{M}$ are samples from the marginal posterior
distribution $p(\beta \mid y)$. This is a multivariate t density:
$$p(\beta \mid y) = \frac{\Gamma(n/2)}{\left((n-p)\pi\right)^{p/2}\, \Gamma\!\left((n-p)/2\right)\, \left|s^2 (X^T X)^{-1}\right|^{1/2}} \left[1 + \frac{(\beta - \hat{\beta})^T (X^T X) (\beta - \hat{\beta})}{(n-p)s^2}\right]^{-n/2}$$
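A minimal sketch of this composition sampler, reusing beta_hat, XtX_inv, s2, and rng from the earlier snippets and scipy's IG(shape, scale) parameterization:

```python
# Composition sampling: sigma^2 first, then beta given sigma^2.
M = 5000
sigma2_samps = stats.invgamma.rvs((n - p) / 2.0, scale=(n - p) * s2 / 2.0,
                                  size=M, random_state=rng)
beta_samps = np.empty((M, p))
for j in range(M):
    # beta | sigma^2, y ~ N((X^T X)^{-1} X^T y, sigma^2 (X^T X)^{-1})
    beta_samps[j] = rng.multivariate_normal(beta_hat, sigma2_samps[j] * XtX_inv)
```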

Composition sampling for linear regression

The marginal distribution of each individual regression
parameter $\beta_j$ is a non-central univariate $t_{n-p}$ distribution.
In fact,
$$\frac{\beta_j - \hat{\beta}_j}{s \sqrt{(X^T X)^{-1}_{jj}}} \sim t_{n-p}.$$
The 95% credible intervals for each $\beta_j$ are constructed
from the quantiles of the t-distribution. The credible
intervals exactly coincide with the 95% classical confidence
intervals, but the interpretation is direct: the probability of $\beta_j$
falling in that interval, given the observed data, is 0.95.
Note: an intercept-only linear model reduces to the simple
univariate $N(\bar{y} \mid \mu, \sigma^2/n)$ likelihood, for which the marginal
posterior of $\mu$ is:
$$\frac{\mu - \bar{y}}{s/\sqrt{n}} \sim t_{n-1}.$$
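For instance, a 95% interval for a single coefficient follows directly from the t quantiles (continuing the earlier snippets; the coefficient index below is arbitrary):

```python
# 95% credible interval for beta_j from its t_{n-p} marginal posterior;
# numerically identical to the classical 95% confidence interval.
j = 1
se_j = np.sqrt(s2 * XtX_inv[j, j])
t_crit = stats.t.ppf(0.975, df=n - p)
ci_j = (beta_hat[j] - t_crit * se_j, beta_hat[j] + t_crit * se_j)
```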

Bayesian predictions from the linear model

Suppose we have observed the new predictors $\tilde{X}$, and we
wish to predict the outcome $\tilde{y}$. We specify $p(y, \tilde{y} \mid \beta, \sigma^2)$ to be a
normal distribution:
$$\begin{pmatrix} y \\ \tilde{y} \end{pmatrix} \sim N\left( \begin{pmatrix} X\beta \\ \tilde{X}\beta \end{pmatrix}, \; \sigma^2 \begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix} \right)$$
Note $p(\tilde{y} \mid y, \beta, \sigma^2) = p(\tilde{y} \mid \beta, \sigma^2) = N(\tilde{y} \mid \tilde{X}\beta, \sigma^2 I)$.
The posterior predictive distribution:
$$p(\tilde{y} \mid y) = \int p(\tilde{y} \mid y, \beta, \sigma^2)\, p(\beta, \sigma^2 \mid y)\, d\beta\, d\sigma^2 = \int p(\tilde{y} \mid \beta, \sigma^2)\, p(\beta, \sigma^2 \mid y)\, d\beta\, d\sigma^2.$$
By now we are comfortable evaluating such integrals:
First obtain: $(\beta^{(j)}, \sigma^{2(j)}) \sim p(\beta, \sigma^2 \mid y)$, $j = 1, \ldots, M$
Next draw: $\tilde{y}^{(j)} \sim N(\tilde{X}\beta^{(j)}, \sigma^{2(j)} I)$.
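A sketch of this posterior predictive sampling scheme, reusing the composition-sampling draws from the earlier snippet; X_tilde is a hypothetical matrix of new predictor values:

```python
# For each posterior draw, simulate y_tilde ~ N(X_tilde beta^(j), sigma^2(j) I).
X_tilde = np.column_stack([np.ones(5), rng.standard_normal((5, p - 1))])
y_tilde = np.empty((M, X_tilde.shape[0]))
for j in range(M):
    y_tilde[j] = rng.normal(X_tilde @ beta_samps[j], np.sqrt(sigma2_samps[j]))
```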

Gibbs sampler for the linear regression model

Consider the linear model with $p(\sigma^2) = IG(\sigma^2 \mid a, b)$ and
$p(\beta) \propto 1$.
The Gibbs sampler proceeds by computing the full
conditional distributions:
$$p(\beta \mid y, \sigma^2) = N\left(\beta \mid (X^T X)^{-1} X^T y, \; \sigma^2 (X^T X)^{-1}\right)$$
$$p(\sigma^2 \mid y, \beta) = IG\left(\sigma^2 \,\Big|\, a + n/2, \; b + \tfrac{1}{2}(y - X\beta)^T (y - X\beta)\right).$$
Thus, the Gibbs sampler will initialize $(\beta^{(0)}, \sigma^{2(0)})$ and
draw, for $j = 1, \ldots, M$:
Draw $\beta^{(j)} \sim N\left((X^T X)^{-1} X^T y, \; \sigma^{2(j-1)} (X^T X)^{-1}\right)$
Draw $\sigma^{2(j)} \sim IG\left(a + n/2, \; b + \tfrac{1}{2}(y - X\beta^{(j)})^T (y - X\beta^{(j)})\right)$
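A minimal Gibbs sampler sketch under these full conditionals; the hyperparameters a and b and the initialization are illustrative, and X, y, n, p, M, beta_hat, XtX_inv, rng carry over from the earlier snippets:

```python
# Gibbs sampler for a flat prior on beta and p(sigma^2) = IG(a, b).
a, b = 2.0, 1.0                                  # illustrative IG prior hyperparameters
beta_gibbs = np.empty((M, p))
sigma2_gibbs = np.empty(M)
sigma2_curr = 1.0                                # initialization sigma^2(0)
for j in range(M):
    # beta^(j) | sigma^2(j-1), y
    beta_curr = rng.multivariate_normal(beta_hat, sigma2_curr * XtX_inv)
    resid = y - X @ beta_curr
    # sigma^2(j) | beta^(j), y
    sigma2_curr = stats.invgamma.rvs(a + n / 2.0, scale=b + 0.5 * resid @ resid,
                                     random_state=rng)
    beta_gibbs[j], sigma2_gibbs[j] = beta_curr, sigma2_curr
```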

Metropolis algorithm for the linear model

Example: For the linear model, our parameters are $(\beta, \sigma^2)$. We write $\theta = (\beta, \log(\sigma^2))$ and, at the $j$-th
iteration, propose $\theta^* \sim N(\theta^{(j-1)}, \Sigma)$. The log transformation on $\sigma^2$ ensures that all components of $\theta$
have support on the entire real line and can have meaningful proposed values from the multivariate normal.
But we need to transform our prior to $p(\beta, \log(\sigma^2))$.

Let $z = \log(\sigma^2)$ and assume $p(\beta, z) = p(\beta)\,p(z)$. Let us derive $p(z)$. REMEMBER: we need to adjust
for the Jacobian. Then $p(z) = p(\sigma^2)\,|d\sigma^2/dz| = p(e^z)\,e^z$. The Jacobian here is $e^z = \sigma^2$.

Let $p(\beta) \propto 1$ and $p(\sigma^2) = IG(\sigma^2 \mid a, b)$. Then the log-posterior is:
$$-(a + n/2 + 1)z + z - e^{-z}\left\{b + \frac{1}{2}(Y - X\beta)^T (Y - X\beta)\right\}.$$

A symmetric proposal distribution, say $q(\theta^* \mid \theta^{(j-1)}, \Sigma) = N(\theta^{(j-1)}, \Sigma)$, cancels out in $r$. In practice
it is better to compute $\log(r)$: $\log(r) = \log p(\theta^* \mid y) - \log p(\theta^{(j-1)} \mid y)$. For the proposal,
$N(\theta^{(j-1)}, \Sigma)$, $\Sigma$ is a $d \times d$ variance-covariance matrix, and $d = \dim(\theta) = p + 1$.

If $\log r \geq 0$ then set $\theta^{(j)} = \theta^*$. If $\log r < 0$ then draw $U \sim U(0, 1)$. If $U \leq r$ (or $\log U \leq \log r$) then
$\theta^{(j)} = \theta^*$. Otherwise, $\theta^{(j)} = \theta^{(j-1)}$.

Repeat the above procedure for $j = 1, \ldots, M$ to obtain samples $\theta^{(1)}, \ldots, \theta^{(M)}$.
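A random-walk Metropolis sketch on $\theta = (\beta, \log \sigma^2)$ using the log-posterior above; the proposal covariance Sigma_prop, the number of iterations, and the starting value are illustrative choices, and X, y, n, p, a, b, M, beta_hat, s2, rng carry over from the earlier snippets:

```python
# Log-posterior of theta = (beta, z) with z = log(sigma^2), flat prior on beta,
# IG(a, b) prior on sigma^2, including the Jacobian term (+z).
def log_post(theta):
    beta_t, z = theta[:p], theta[p]
    resid = y - X @ beta_t
    return -(a + n / 2.0 + 1.0) * z + z - np.exp(-z) * (b + 0.5 * resid @ resid)

d = p + 1
Sigma_prop = 0.01 * np.eye(d)                     # illustrative proposal covariance
theta = np.concatenate([beta_hat, [np.log(s2)]])  # starting value
chain = np.empty((M, d))
for j in range(M):
    prop = rng.multivariate_normal(theta, Sigma_prop)
    log_r = log_post(prop) - log_post(theta)
    if np.log(rng.uniform()) <= log_r:            # accept with probability min(1, r)
        theta = prop
    chain[j] = theta
```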

