
With compliments of statisticsmentor.com, the site for online statistics help

Basic statistics formulas


Sets
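De Morgan's laws: $(A \cup B)^c = A^c \cap B^c$ and $(A \cap B)^c = A^c \cup B^c$
Commutativity: $A \cup B = B \cup A$ and $A \cap B = B \cap A$
Associativity: $(A \cup B) \cup C = A \cup (B \cup C)$ and $(A \cap B) \cap C = A \cap (B \cap C)$
Distributivity: $(A \cup B) \cap C = (A \cap C) \cup (B \cap C)$ and $(A \cap B) \cup C = (A \cup C) \cap (B \cup C)$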
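The set identities in the Sets section can be spot-checked on a small example. The sketch below uses Python's built-in sets; the universe and the sets A, B, C are made-up values chosen only for illustration, and a check on one example is of course not a proof.

```python
# Illustrative check of the set identities on a small, made-up universe.
universe = set(range(10))
A = {0, 1, 2, 3}
B = {2, 3, 4, 5}
C = {3, 5, 7, 9}


def complement(s):
    """Complement of s relative to the (assumed) universe."""
    return universe - s


# De Morgan's laws
de_morgan_union = complement(A | B) == (complement(A) & complement(B))
de_morgan_intersection = complement(A & B) == (complement(A) | complement(B))

# Distributivity
distributive_1 = ((A | B) & C) == ((A & C) | (B & C))
distributive_2 = ((A & B) | C) == ((A | C) & (B | C))

print(de_morgan_union, de_morgan_intersection, distributive_1, distributive_2)
```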

Measures of Location
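Sample mean
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$

Median (for raw data, i.e. an ungrouped list of numbers)

List the numbers in ascending order. The median is the $\frac{n+1}{2}$th value if $n$ is odd, or the mean of the $\frac{n}{2}$th and $\left(\frac{n}{2}+1\right)$th values if $n$ is even.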
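As a rough illustration of the two measures of location, the following sketch computes the sample mean and the median of a raw list. The function names are hypothetical, and the median uses the two middle values when the list has an even number of entries.

```python
def sample_mean(values):
    """Sum of the observations divided by their count."""
    return sum(values) / len(values)


def median(values):
    """Middle value of the sorted data (mean of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        # The (n+1)/2-th value, converted to a 0-based index.
        return ordered[(n + 1) // 2 - 1]
    # Mean of the (n/2)-th and (n/2 + 1)-th values.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


print(sample_mean([1, 2, 3, 4]))
print(median([3, 1, 2]))
print(median([4, 1, 3, 2]))
```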

Measures of spread
Sample variance
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$

Sample standard deviation
$s = \sqrt{\text{variance}}$

Range
Largest value - smallest value

Interquartile range (IQR)
Upper quartile - lower quartile

Coefficient of variation
$s/\bar{x}$

Probability
$p(A) = 1 - p(A^c)$
$p(A \cup B) = p(A) + p(B) - p(A \cap B)$
If events A and B are independent, then $p(A \cap B) = p(A)p(B)$
$p(A|B) = p(A \cap B)/p(B)$, so long as $p(B) > 0$
$p(A \cap B) = p(A|B)p(B) = p(B|A)p(A)$, so long as $p(A) > 0$ and $p(B) > 0$
If $\{B_1, B_2, \ldots, B_k\}$ is a set of mutually exclusive and exhaustive events, then
$p(A) = p(A|B_1)p(B_1) + p(A|B_2)p(B_2) + \cdots + p(A|B_k)p(B_k)$
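The law of total probability and the conditional-probability identity from the Probability section can be worked through numerically. This is a minimal sketch with made-up probabilities for three mutually exclusive and exhaustive events; the variable and function names are hypothetical.

```python
# Hypothetical numbers: B1, B2, B3 partition the sample space.
p_B = [0.5, 0.3, 0.2]              # p(B_j); these sum to 1
p_A_given_B = [0.10, 0.40, 0.25]   # p(A | B_j)


def total_probability(p_a_given_b, p_b):
    """p(A) = sum over j of p(A | B_j) * p(B_j)."""
    return sum(pa * pb for pa, pb in zip(p_a_given_b, p_b))


p_A = total_probability(p_A_given_B, p_B)

# p(A and B1) = p(A | B1) p(B1) = p(B1 | A) p(A), so p(B1 | A) follows by division.
p_A_and_B1 = p_A_given_B[0] * p_B[0]
p_B1_given_A = p_A_and_B1 / p_A

print(p_A, p_B1_given_A)
```

Dividing by `p_A` is only valid because `p_A > 0` here, which mirrors the "so long as" conditions in the formulas.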

www.Statisticsmentor.com 22nd November 2009


Expectation, variance, and covariance


If random variable X is discrete:
$E(X) = \sum_i x_i \, p(x_i)$, which is calculated over all possible values of X.
Let g(X) denote a function of discrete X, then: $E(g(X)) = \sum_i g(x_i) \, p(x_i)$

Expectation rules

Let X, Y, Z denote random variables; a, b denote constants.
E1. $E(a) = a$
E2. $E(aX) = a\,E(X)$
E3. $E(X+Y) = E(X) + E(Y)$

Variance rules

V1. $\mathrm{var}(a) = 0$
V2. $\mathrm{var}(aX) = a^2\,\mathrm{var}(X)$
V3. $\mathrm{var}(X \pm Y) = \mathrm{var}(X) + \mathrm{var}(Y) \pm 2\,\mathrm{Cov}(X,Y)$

Covariance rules

C1. $\mathrm{Cov}(a,X) = 0$
C2. $\mathrm{Cov}(aX,bY) = ab\,\mathrm{Cov}(X,Y)$
C3. $\mathrm{Cov}(X+Y,Z) = \mathrm{Cov}(X,Z) + \mathrm{Cov}(Y,Z)$

Normal density function

The normal density function (aka normal distribution) has 2 parameters: mean $\mu$ and variance $\sigma^2$. For the normal distribution:
90% of data falls within $\mu \pm 1.65\sigma$
95% of data falls within $\mu \pm 1.96\sigma$

Standardizing (Z-score)
$z = \frac{x - \mu}{\sigma}$

Sampling distribution of sample mean

Suppose $X \sim N(\mu, \sigma^2)$, then the distribution of the sample mean $\bar{X}$ is
$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$

The standard error of $\bar{X}$ is $\sqrt{\frac{\sigma^2}{n}} = \frac{\sigma}{\sqrt{n}}$

This result also holds approximately if X does not follow the normal distribution but n is large (the result then follows from the Central Limit Theorem).
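The standardizing formula, the standard error of the sample mean, and the coverage of a normal distribution within a given number of standard deviations can be sketched with Python's standard library. The function names and the numbers are illustrative assumptions; `statistics.NormalDist` requires Python 3.8 or later.

```python
from math import sqrt
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Standardize x given the mean mu and standard deviation sigma."""
    return (x - mu) / sigma


def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)


def coverage(k):
    """Probability that a normal value lies within k standard deviations of its mean."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.cdf(k) - std_normal.cdf(-k)


# Made-up example values.
z = z_score(x=130, mu=100, sigma=15)
se = standard_error(sigma=15, n=25)

print(z, se)
print(round(coverage(1.65), 3), round(coverage(1.96), 3))
```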


Confidence intervals for the mean


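Parameter: Mean $\mu$
Assumptions: Data normally distributed or n is large (n > 30); $\sigma^2$ known
Formula: $\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$

Parameter: Mean $\mu$
Assumptions: Data normally distributed; n small; $\sigma^2$ unknown
Formula: $\bar{x} \pm t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}}$

Parameter: Difference in means $\mu_X - \mu_Y$ (case of 2 independent distributions)
Assumptions: Data are normally distributed; $\sigma_X^2$, $\sigma_Y^2$ are known
Formula: $(\bar{x} - \bar{y}) \pm z_{\alpha/2} \sqrt{\frac{\sigma_X^2}{n_X} + \frac{\sigma_Y^2}{n_Y}}$

Parameter: Difference in means $\mu_X - \mu_Y$ (case of 2 independent distributions)
Assumptions: Variances unknown; large samples
Formula: $(\bar{x} - \bar{y}) \pm z_{\alpha/2} \sqrt{\frac{s_X^2}{n_X} + \frac{s_Y^2}{n_Y}}$

Parameter: Difference in means $\mu_X - \mu_Y$ (case of 2 independent distributions)
Assumptions: Data are normally distributed; $\sigma_X^2$, $\sigma_Y^2$ are unknown but $\sigma_X^2 = \sigma_Y^2$
Formula: $(\bar{x} - \bar{y}) \pm t_{\alpha/2,\,n_X+n_Y-2} \sqrt{s_p^2\left(\frac{1}{n_X} + \frac{1}{n_Y}\right)}$
Where the estimate of the pooled variance is
$s_p^2 = \frac{(n_X - 1)s_X^2 + (n_Y - 1)s_Y^2}{n_X + n_Y - 2}$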
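Two of the confidence intervals above can be sketched in code: the interval for a single mean with known variance, and the pooled-variance interval for a difference in means. The function names are hypothetical. Because the Python standard library has no t-distribution quantile function, the t critical value is passed in as an argument and is assumed to come from a t-table.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence=0.95):
    """Interval xbar +/- z_{alpha/2} * sigma / sqrt(n), for known sigma."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


def pooled_variance(s2_x, n_x, s2_y, n_y):
    """Pooled estimate of a common variance from two sample variances."""
    return ((n_x - 1) * s2_x + (n_y - 1) * s2_y) / (n_x + n_y - 2)


def pooled_t_interval(xbar, ybar, s2_x, n_x, s2_y, n_y, t_crit):
    """Interval for the difference in means; t_crit is t_{alpha/2, n_x + n_y - 2}."""
    sp2 = pooled_variance(s2_x, n_x, s2_y, n_y)
    half_width = t_crit * sqrt(sp2 * (1 / n_x + 1 / n_y))
    diff = xbar - ybar
    return diff - half_width, diff + half_width


# Made-up summary statistics; 2.086 is the tabulated t value for 20 df at 95%.
low, high = z_interval(xbar=50, sigma=10, n=100)
d_low, d_high = pooled_t_interval(10, 8, 4.0, 10, 6.0, 12, t_crit=2.086)

print(low, high)
print(d_low, d_high)
```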


Hypothesis test for the mean


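Hypothesis: Testing a single mean equals a value a; $H_0: \mu = a$
Assumption: Data normally distributed, or large sample; $\sigma^2$ known
Test equation: $z = \frac{\bar{x} - a}{\sigma/\sqrt{n}}$
Critical value: Z-table

Hypothesis: Testing a single mean equals a value a; $H_0: \mu = a$
Assumption: Data normally distributed; $\sigma^2$ unknown
Test equation: $t = \frac{\bar{x} - a}{s/\sqrt{n}}$
Critical value: t-table with df = n - 1

Hypothesis: Testing the difference between 2 means equals a number a (which includes the case a = 0, a test for no difference between means); $H_0: \mu_X - \mu_Y = a$; case of 2 independent distributions
Assumption: Data normally distributed, or large sample; independent samples; $\sigma_X^2$, $\sigma_Y^2$ are known
Test equation: $z = \frac{\bar{x} - \bar{y} - a}{\sqrt{\frac{\sigma_X^2}{n_X} + \frac{\sigma_Y^2}{n_Y}}}$
Critical value: Z-table

Hypothesis: $H_0: \mu_X - \mu_Y = a$; case of 2 independent distributions
Assumption: Data normally distributed, or large sample; independent samples; $\sigma_X^2$, $\sigma_Y^2$ are unknown
Test equation: $z = \frac{\bar{x} - \bar{y} - a}{\sqrt{\frac{s_X^2}{n_X} + \frac{s_Y^2}{n_Y}}}$
Critical value: Z-table

Hypothesis: $H_0: \mu_X - \mu_Y = a$; case of 2 independent distributions
Assumption: Data are normally distributed; $\sigma_X^2$, $\sigma_Y^2$ are unknown but $\sigma_X^2 = \sigma_Y^2$
Test equation: $t = \frac{\bar{x} - \bar{y} - a}{\sqrt{s_p^2\left(\frac{1}{n_X} + \frac{1}{n_Y}\right)}}$
Where the estimate of the pooled variance is $s_p^2 = \frac{(n_X - 1)s_X^2 + (n_Y - 1)s_Y^2}{n_X + n_Y - 2}$
Critical value: t-table with df = $n_X + n_Y - 2$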
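The test statistics above differ only in which spread estimate goes in the denominator, which a short sketch may make easier to see. The function names and inputs are hypothetical; the code computes the statistic only, and the comparison against a Z-table or t-table critical value is left to the reader.

```python
from math import sqrt


def one_sample_statistic(xbar, a, sd, n):
    """(xbar - a) / (sd / sqrt(n)).

    Pass sd = sigma for the z-test, or sd = s for the t-test with n - 1 df.
    """
    return (xbar - a) / (sd / sqrt(n))


def two_sample_statistic(xbar, ybar, a, var_x, n_x, var_y, n_y):
    """(xbar - ybar - a) / sqrt(var_x / n_x + var_y / n_y).

    var_x and var_y are the known variances, or the sample variances
    when the variances are unknown.
    """
    return (xbar - ybar - a) / sqrt(var_x / n_x + var_y / n_y)


def pooled_two_sample_statistic(xbar, ybar, a, s2_x, n_x, s2_y, n_y):
    """Equal-variance case: uses the pooled variance estimate s_p^2."""
    sp2 = ((n_x - 1) * s2_x + (n_y - 1) * s2_y) / (n_x + n_y - 2)
    return (xbar - ybar - a) / sqrt(sp2 * (1 / n_x + 1 / n_y))


# Made-up summary statistics.
print(one_sample_statistic(xbar=52, a=50, sd=10, n=25))
print(two_sample_statistic(10, 8, 0, 4, 16, 9, 36))
print(pooled_two_sample_statistic(10, 8, 0, 4, 8, 4, 8))
```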


Covariance and correlation


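Parameter: Covariance between variables X and Y
Population formula: $\mathrm{Cov}(X,Y) = E(XY) - E(X)E(Y)$
Sample formula: $s_{XY} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{n-1}$

Parameter: Pearson's correlation
Population formula: $\rho_{XY} = \frac{\mathrm{Cov}(X,Y)}{\sigma_X \sigma_Y}$
Sample formula: $r_{XY} = \frac{\sum_i x_i y_i - n\bar{x}\bar{y}}{(n-1)\, s_X s_Y}$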
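A minimal sketch of the sample covariance and Pearson's sample correlation, written directly from the deviations-from-the-mean form with an n - 1 divisor. The function names and the data are made up for illustration.

```python
from math import sqrt


def sample_covariance(xs, ys):
    """Sum of (x_i - xbar)(y_i - ybar), divided by n - 1."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    return sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / (n - 1)


def sample_sd(xs):
    """Sample standard deviation: square root of the covariance of xs with itself."""
    return sqrt(sample_covariance(xs, xs))


def pearson_r(xs, ys):
    """Sample correlation: covariance divided by the product of the standard deviations."""
    return sample_covariance(xs, ys) / (sample_sd(xs) * sample_sd(ys))


xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]   # exactly linear in xs, so the correlation should be close to 1

print(sample_covariance(xs, ys))
print(pearson_r(xs, ys))
```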

Simple linear regression


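Model: $y_i = \alpha + \beta x_i + u_i$ for $i = 1, 2, \ldots, n$

OLS estimators
Intercept: $\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}$
Slope: $\hat{\beta} = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{\sum x_i^2 - n\bar{x}^2}$

$\mathrm{var}(\hat{\alpha}) = \frac{\sigma^2 \sum x_i^2}{n\left(\sum x_i^2 - n\bar{x}^2\right)}$

$\mathrm{var}(\hat{\beta}) = \frac{\sigma^2}{\sum x_i^2 - n\bar{x}^2}$

Estimator for variance of error term u is
$\hat{\sigma}^2 = \frac{\sum (y_i - \hat{y}_i)^2}{n-2}$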
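The OLS formulas for simple linear regression can be sketched in a few lines of plain Python. The function name and the data are hypothetical. One assumption to note: the variance estimates for the intercept and slope plug the estimated error variance (residual sum of squares over n - 2) in place of the unknown error variance.

```python
def ols_fit(xs, ys):
    """Fit y = alpha + beta * x by ordinary least squares (needs n > 2)."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sum_x2 = sum(x * x for x in xs)
    sxx = sum_x2 - n * xbar ** 2                                # sum x_i^2 - n xbar^2
    sxy = sum(x * y for x, y in zip(xs, ys)) - n * xbar * ybar  # sum x_i y_i - n xbar ybar

    beta = sxy / sxx
    alpha = ybar - beta * xbar

    # Estimated error variance: residual sum of squares over n - 2.
    residuals = [y - (alpha + beta * x) for x, y in zip(xs, ys)]
    sigma2_hat = sum(e * e for e in residuals) / (n - 2)

    return {
        "alpha": alpha,
        "beta": beta,
        "sigma2_hat": sigma2_hat,
        # Assumption: sigma2_hat stands in for the unknown error variance.
        "var_alpha": sigma2_hat * sum_x2 / (n * sxx),
        "var_beta": sigma2_hat / sxx,
    }


# Made-up data lying exactly on y = 1 + 2x, so the residuals should be essentially zero.
fit = ols_fit([1, 2, 3, 4], [3, 5, 7, 9])
print(fit)
```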
