
Stat 211, 2014. Due: Monday Feb 17 at 5 pm, on the 7th floor of the Science Center

Homework 1
Joe Blitzstein and Tirthankar Dasgupta

Collaboration policy: You are free to discuss the problems with others, though it is strongly recommended that you try the problems on your own first. Copying is not allowed, and write-ups must be your own explanations in your own words.

1. (Frequentist, Bayesian, and information-theoretic views on sufficiency) Let Y = (Y_1, ..., Y_n) be the observed data, and let T be a statistic, where the Y_j are i.i.d. draws from a statistical model {f_θ(y) : θ ∈ Θ} with some dominating measure (for concreteness and to simplify notation, you can assume that Y_j is discrete and that the f_θ are PMFs, or you can assume that Y_j is continuous and the f_θ are PDFs with respect to Lebesgue measure, but if you make such an assumption make sure to state it). Show that the following three definitions of T being sufficient for θ are equivalent:

(a) (frequentist) The conditional distribution of Y given T does not depend on θ.

(b) (Bayesian) The conditional distribution of θ given T is the same as the posterior distribution of θ given Y, for any prior distribution on θ.

(c) (information theory) The chain θ → T → Y is Markovian for any distribution on θ, i.e., θ and Y are conditionally independent given T.

2. (Is conditioning on an ancillary statistic always a good idea?) This is an example provided by Basu (1964) in which he raised a question about the appropriateness of conditioning on the ancillary statistic. Consider a single observation Y from Unif[θ, θ + 1), where θ > 0.

(a) Show that Y is minimal sufficient for θ.

(b) Argue that the statistic T(Y) = ⌊Y⌋, the greatest integer less than or equal to Y, is an MLE of θ. Is the MLE unique?

(c) Define the statistic A(Y) = Y − ⌊Y⌋ and show that A(Y) is ancillary for θ.

(d) Argue that the vector (T(Y), A(Y)) is also minimal sufficient for θ.

(e) Derive the conditional distribution of T(Y) given A(Y). Does the result raise a question about the appropriateness of conditioning on the ancillary statistic? Explain your answer.
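Below is a small Python sketch (not part of the original assignment; NumPy, the seed, and the particular θ values are illustrative assumptions) that simulates part (c) of Problem 2: the fractional part A(Y) = Y − ⌊Y⌋ behaves like a Uniform(0, 1) draw no matter what θ is, while T(Y) = ⌊Y⌋ shifts with θ.

    # Numerical illustration of Problem 2(c): A(Y) is ancillary, T(Y) is not.
    import numpy as np

    rng = np.random.default_rng(0)
    n_sims = 100_000

    for theta in [0.3, 2.7, 5.0]:                        # illustrative values of theta
        y = rng.uniform(theta, theta + 1.0, size=n_sims) # Y ~ Unif[theta, theta + 1)
        t = np.floor(y)                                  # T(Y) = floor(Y)
        a = y - t                                        # A(Y) = Y - floor(Y)
        # A(Y) should look Unif(0, 1) for every theta (mean 0.5, variance 1/12),
        # while the values taken by T(Y) move with theta.
        print(f"theta={theta:3.1f}  mean(A)={a.mean():.3f}  var(A)={a.var():.3f}  "
              f"values of T: {np.unique(t).astype(int)}")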

3. (Completeness in an NEF) Show that the natural sufficient statistic in an NEF is complete. That is, if Y follows an NEF and h is any function such that E_θ(h(Y)) = 0 for all θ in the natural parameter space, then h(Y) = 0 a.s. Hint: Decompose h = h⁺ − h⁻ into positive and negative parts, and use the fact that an MGF uniquely determines a distribution.

4. (Basu's Theorem and Linear Regression) Assume an OLS (ordinary least squares) model, in which a data vector y = (y_1, ..., y_n) is observed with a single vector of n covariates, x_i, i = 1, ..., n. With β unknown, assume y_i ∼ N(βx_i, σ²) independently for i = 1, 2, ..., n. There is no constant term (mainly to minimize difficulty), and the x_i are known constants, not all 0. We use the notation S_{y,x} = Σ_i y_i x_i (and, analogously, S_{x,x} = Σ_i x_i²).

Let β̂ = S_{y,x}/S_{x,x} (this is the OLS estimator for β), and let σ̂² be given by σ̂² = Σ_i (y_i − β̂ x_i)²/(n − 1) (this is an unbiased estimator of σ²). Denote the vector of residuals as res = y − β̂x.

(a) Find the distribution of β̂. Does it follow an NEF, if σ² is known?

(b) Use Basu's Theorem to show that the residual vector res is independent of β̂ (being careful to justify the assumptions and specific about what model you are working with). Then show that β̂ and σ̂² are independent.

(c) Show that σ̂² is unbiased for σ² and that it has a scaled χ²_{n−1} distribution, without doing tedious, brute-force calculations. There are at least two ways to do this: one is to use the identity (which follows from res being orthogonal to x, which you can assume)

Σ_i (y_i − β x_i)² = Σ_i (y_i − β̂ x_i)² + (β̂ − β)² Σ_i x_i²,

and another is to use matrices (especially projection matrices), writing res as a projection matrix times y.

(d) Reasoning by representation, find the distribution of (β̂ − β)/σ̂.
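As a quick numerical companion to Problem 4 (a sketch only, not part of the assignment; the sample size, covariate values, seed, and NumPy tooling are illustrative assumptions), the simulation below checks that β̂ and σ̂² are essentially uncorrelated across replications and that (n − 1)σ̂²/σ² has mean close to n − 1, as a χ²_{n−1} variable should.

    # Monte Carlo check for Problem 4: independence of beta_hat and sigma2_hat,
    # and the scaled chi-square behavior of the residual sum of squares.
    import numpy as np

    rng = np.random.default_rng(1)
    beta, sigma, n, reps = 2.0, 1.5, 10, 50_000
    x = rng.uniform(1.0, 3.0, size=n)                  # known covariates, not all 0

    beta_hats = np.empty(reps)
    sigma2_hats = np.empty(reps)
    for r in range(reps):
        y = beta * x + sigma * rng.standard_normal(n)  # y_i ~ N(beta*x_i, sigma^2)
        b = (y @ x) / (x @ x)                          # beta_hat = S_yx / S_xx
        res = y - b * x                                # residual vector
        beta_hats[r] = b
        sigma2_hats[r] = (res @ res) / (n - 1)         # unbiased estimator of sigma^2

    print("corr(beta_hat, sigma2_hat):", np.corrcoef(beta_hats, sigma2_hats)[0, 1])
    print("mean of (n-1)*sigma2_hat/sigma^2:",
          ((n - 1) * sigma2_hats / sigma**2).mean(), "(target:", n - 1, ")")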

5. (Shifted Logistic) The shifted Logistic family has CDF

F_θ(y) = exp(y − θ) / (1 + exp(y − θ)), y ∈ ℝ,

with parameter space Θ = ℝ. Assume an independent sample y = (y_1, ..., y_n) from F_θ.

(a) Find the score function S(y, θ).

(b) Verify directly that for this family, E_θ(−∂S(y, θ)/∂θ) = Var_θ(S(y, θ)).

(c) Find the equation (a likelihood equation) that identifies the MLE θ̂ (you do not need to solve the equation).
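The following sketch (not part of the assignment; it assumes the density f_θ(y) = exp(y − θ)/(1 + exp(y − θ))² obtained by differentiating the stated CDF in y, and the use of NumPy/SciPy is an illustrative choice) simulates a shifted logistic sample and locates the MLE of θ by maximizing the log-likelihood numerically, which is one way to sanity-check whatever likelihood equation you derive in part (c).

    # Numerical MLE for the shifted logistic model of Problem 5.
    import numpy as np
    from scipy.optimize import minimize_scalar

    rng = np.random.default_rng(2)
    theta_true, n = 1.0, 200
    y = theta_true + rng.logistic(size=n)        # sample from the shifted logistic

    def neg_log_lik(theta):
        z = y - theta
        # log f_theta(y_i) = (y_i - theta) - 2*log(1 + exp(y_i - theta));
        # logaddexp(0, z) = log(1 + exp(z)) is used for numerical stability.
        return -np.sum(z - 2.0 * np.logaddexp(0.0, z))

    fit = minimize_scalar(neg_log_lik, bounds=(y.min(), y.max()), method="bounded")
    print("theta_true =", theta_true, " numerical MLE =", fit.x)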
