Estimators
Unbiased estimator
Standard deviation
Variance and standard deviation are measures of spread.
They are never negative, and are zero only when all the data values are equal.
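The formulas for this section did not survive in these notes, so the following is only a sketch using the standard definitions: dividing the sum of squared deviations by n gives the variance of the data itself, while dividing by n - 1 gives the usual unbiased estimator of the population variance. The data values are made up for illustration.

```python
import math
import statistics

# Made-up sample, for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
mean = sum(data) / n
sum_sq_dev = sum((x - mean) ** 2 for x in data)

# Variance of the data itself (divisor n).
variance_n = sum_sq_dev / n
# Unbiased estimator of the population variance (divisor n - 1).
variance_unbiased = sum_sq_dev / (n - 1)

# Standard deviation is the square root of the variance.
sd_n = math.sqrt(variance_n)
sd_from_estimator = math.sqrt(variance_unbiased)

# The standard library agrees with the hand calculation.
assert math.isclose(statistics.pvariance(data), variance_n)
assert math.isclose(statistics.variance(data), variance_unbiased)

print(variance_n, variance_unbiased)
```

The n - 1 version is always slightly larger, and the difference shrinks as the sample grows.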
Probability
$P(A') = 1 - P(A)$, the probability that A does not happen
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
$P(A \cap B) = P(A) \times P(B \mid A)$ (conditional probability)
$P(A \cap B) = P(A) \times P(B)$ (independent events)
$P(A \cap B) = 0$ (mutually exclusive events)
$P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$ (conditional probability)
Questions on conditional probability are sometimes worded "given that a particular
event has happened (or not happened), what is the probability that some other event
will happen?" For $P(A \mid B)$, B is the event that has already occurred and A is the event
whose probability is required.
$P(A' \mid B) = 1 - P(A \mid B)$
A and B are independent $\iff P(A \mid B) = P(A \mid B') = P(A)$
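As a quick check of these rules, here is a small sketch that counts outcomes for one roll of a fair die. The events A (even score) and B (score greater than 3) are made up for illustration, and the helper names `prob` and `cond_prob` are hypothetical.

```python
from fractions import Fraction

# One roll of a fair die; A and B are illustrative events.
outcomes = set(range(1, 7))
A = {2, 4, 6}   # even score
B = {4, 5, 6}   # score greater than 3


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & outcomes), len(outcomes))


def cond_prob(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    return prob(event & given) / prob(given)


p_a, p_b = prob(A), prob(B)
p_a_and_b = prob(A & B)
p_a_or_b = prob(A | B)
not_A = outcomes - A

# Addition rule.
assert p_a_or_b == p_a + p_b - p_a_and_b
# Multiplication rule with a conditional probability.
assert p_a_and_b == p_a * cond_prob(B, A)
# Complement rule, also inside a condition.
assert prob(not_A) == 1 - p_a
assert cond_prob(not_A, B) == 1 - cond_prob(A, B)

# Independence test: does P(A and B) equal P(A) x P(B)?
independent = p_a_and_b == p_a * p_b
print(cond_prob(A, B), independent)
```

Here A and B turn out not to be independent, because knowing the score is greater than 3 changes the chance that it is even.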
Questions can often make more sense when the probabilities are written on a tree
diagram.
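A tree diagram calculation can be sketched in code: multiply along the branches, then add the branches that end in the outcome you want. The scenario and all the probabilities below are hypothetical, chosen only to show the method.

```python
from fractions import Fraction

# Hypothetical branch probabilities (not from the notes).
p_rain = Fraction(3, 10)
p_late_given_rain = Fraction(2, 5)
p_late_given_dry = Fraction(1, 10)

# Second first-stage branch is the complement of the first.
p_dry = 1 - p_rain

# Multiply along each branch that ends in "late".
p_rain_and_late = p_rain * p_late_given_rain
p_dry_and_late = p_dry * p_late_given_dry

# Add the branches that end in "late".
p_late = p_rain_and_late + p_dry_and_late

# Conditional probability working backwards through the tree.
p_rain_given_late = p_rain_and_late / p_late

print(p_late, p_rain_given_late)
```

The last line answers a "given that" question: of all the ways of being late, what fraction came down the rain branch.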
Correlation and Regression
Binomial distribution
Normal distribution
When making a comment about correlation, remember to use the words strong, fairly
strong, weak, etc., as well as positive and negative.
Other comments can include what the relationship is in everyday language, being
careful to be precise, and the presence of Anomalous Values (freak results or outliers)
and Influential Data Points (when the x value is much bigger or smaller than all of the
other x values).
Consider the relationship between two variables carefully in case it is spurious,
i.e. is there really a cause and effect? Correlation on its own does not show that one variable causes the other.
The vertical distances between points on a scatter diagram and the regression line are
called residuals.
The regression line, $y = a + bx$, is a line of best fit and is sometimes called the least
squares regression line.
The point $(\bar{x}, \bar{y})$ always lies on the regression line. You can obtain the equation
directly from your calculator.
Extrapolation: predicting y from x when x is outside the range of the given data. Not very reliable.
Interpolation: predicting y from x when x is inside the range of the given data. More reliable.
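What the calculator does can be sketched as follows. This assumes the line is written as y = a + bx and uses the standard least squares formulas b = Sxy / Sxx and a = ȳ - b x̄, which are not stated in these notes. The data and the helper names `predict` and `is_interpolation` are made up for illustration.

```python
from math import isclose

# Made-up bivariate data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Standard least squares formulas (assumed, not quoted from the notes).
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
b = s_xy / s_xx          # gradient
a = y_bar - b * x_bar    # intercept


def predict(x):
    """Predicted y on the regression line y = a + bx."""
    return a + b * x


def is_interpolation(x):
    """True when x lies inside the range of the observed x values."""
    return min(xs) <= x <= max(xs)


# Residual = observed y minus predicted y (vertical distance to the line).
residuals = [y - predict(x) for x, y in zip(xs, ys)]

# The mean point lies on the line, and the residuals sum to zero.
assert isclose(predict(x_bar), y_bar)
assert abs(sum(residuals)) < 1e-9

for x in (3.5, 8):
    kind = "interpolation" if is_interpolation(x) else "extrapolation"
    print(f"x = {x}: predicted y = {predict(x):.2f} ({kind})")
```

The prediction at x = 8 is flagged as extrapolation because 8 is outside the observed x values, so it should be treated with caution.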
$X \sim B(n, p)$: X is a discrete random variable (the number of successes), the number of trials n is fixed, the trials
are independent of one another, and the probability p of success is the same for each trial. Each trial has
exactly two outcomes.
The probability function, mean, variance and the cumulative probability tables can be
found in the formulae booklet.
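If you want to check a table value, the standard binomial probability function can be sketched in a few lines. The values n = 10 and p = 0.3 are chosen only for illustration, and the function names are hypothetical.

```python
from math import comb, isclose


def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


def binomial_cdf(r, n, p):
    """P(X <= r), the quantity given in cumulative tables."""
    return sum(binomial_pmf(k, n, p) for k in range(r + 1))


# Illustrative parameters only.
n, p = 10, 0.3

mean = n * p                 # np
variance = n * p * (1 - p)   # np(1 - p)

p_exactly_3 = binomial_pmf(3, n, p)
p_at_most_2 = binomial_cdf(2, n, p)
# "At least 3" is the complement of "at most 2".
p_at_least_3 = 1 - p_at_most_2
# "More than 3" is the complement of "at most 3".
p_more_than_3 = 1 - binomial_cdf(3, n, p)

print(round(p_at_most_2, 4), round(p_at_least_3, 4))
```

Notice how "at least 3" and "more than 3" use different cumulative values, which is why the wording of a question matters.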
Be careful with the wording of the questions
e.g.
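$X \sim N(\mu, \sigma^2)$
Distribution of the sample mean
If $X \sim N(\mu, \sigma^2)$ then $\bar{X} \sim N\left(\mu, \dfrac{\sigma^2}{n}\right)$, so the standard deviation of $\bar{X}$ is $\dfrac{\sigma}{\sqrt{n}}$.
The central limit theorem
For a large sample of size n taken from a population with mean $\mu$ and variance $\sigma^2$, the sample mean is approximately normally distributed with mean $\mu$
and variance $\dfrac{\sigma^2}{n}$, i.e. $\bar{X} \sim N\left(\mu, \dfrac{\sigma^2}{n}\right)$ approximately, regardless of the
distribution of the population.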
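To see how much narrower the distribution of the sample mean is, here is a sketch using Python's `statistics.NormalDist`. The population values (mean 50, standard deviation 4) and the sample size 25 are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Illustrative population parameters and sample size.
mu, sigma, n = 50.0, 4.0, 25

# Standard deviation of the sample mean: sigma / sqrt(n).
standard_error = sigma / sqrt(n)

single_value = NormalDist(mu, sigma)
sample_mean = NormalDist(mu, standard_error)

# Probability that one observation is below 51.
p_single = single_value.cdf(51)
# Probability that the mean of 25 observations is below 51.
p_mean = sample_mean.cdf(51)

print(standard_error, round(p_single, 4), round(p_mean, 4))
```

The same cut-off of 51 is only 0.25 standard deviations above the mean for a single value, but 1.25 standard deviations above it for the sample mean, so the second probability is much larger.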
Confidence intervals
$\bar{x} \pm 1.96 \dfrac{\sigma}{\sqrt{n}}$ (95% confidence interval for $\mu$)
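A 95% confidence interval for the population mean can be worked through as below, assuming the population standard deviation is known. The sample mean, standard deviation and sample size are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Illustrative values only.
x_bar, sigma, n = 20.5, 3.0, 36

# 1.96 is the z value that leaves 2.5% in each tail.
z = 1.96
half_width = z * sigma / sqrt(n)

lower = x_bar - half_width
upper = x_bar + half_width

# Where 1.96 comes from: the 97.5th percentile of N(0, 1).
z_from_inverse = NormalDist().inv_cdf(0.975)

print(f"95% CI: ({lower:.2f}, {upper:.2f})")
print(f"z for 95%: {z_from_inverse:.4f}")
```

For a different confidence level, only the z value changes; for example, a 99% interval would use the 99.5th percentile instead.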
[Diagrams: areas under the standard normal curve showing $P(Z < z)$, $P(Z < |z|)$ and the region between two values $z_1$ and $z_2$.]
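The look-ups these labels refer to appear to be the usual ways of using standard normal tables, which only list P(Z < z) for positive z. The sketch below is an assumption about what was intended: it uses symmetry for negative z and a difference of two table values for the region between z1 and z2. The z values are chosen only for illustration.

```python
from math import isclose
from statistics import NormalDist

standard_normal = NormalDist()  # Z ~ N(0, 1)


def phi(z):
    """P(Z < z), the value read from standard normal tables."""
    return standard_normal.cdf(z)


z = 1.5

# Upper tail: P(Z > z) = 1 - P(Z < z).
p_above = 1 - phi(z)
# Negative z by symmetry: P(Z < -z) = 1 - P(Z < |z|).
p_below_negative = 1 - phi(abs(-z))

# Between two values: P(z1 < Z < z2) = P(Z < z2) - P(Z < z1).
z1, z2 = -1.0, 2.0
p_between = phi(z2) - (1 - phi(abs(z1)))

# The table-style working agrees with the direct calculation.
assert isclose(p_below_negative, phi(-z))
assert isclose(p_between, phi(z2) - phi(z1))

print(round(p_above, 4), round(p_between, 4))
```

A quick sketch of the curve with the required area shaded is still the safest way to decide which of these forms applies.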