Weighted Mean:
A weighted mean is a kind of average. Instead of each data point
contributing equally to the final mean, some data points contribute more “weight”
than others. If all the weights are equal, then the weighted mean equals the
arithmetic mean (the regular “average” you’re used to). Weighted means are very
common in statistics, especially when studying populations.
x̄ = Σ(xᵢ × wᵢ) / Σwᵢ   (summing over i = 1 to n)
Where:
Σ = the sum of (in other words…add them up!).
w = the weights.
x = the value.
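The formula can be sketched in a few lines of Python; the scores and weights below are made-up numbers for illustration:

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(x_i * w_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(x * w for x, w in zip(values, weights)) / total_weight

# Hypothetical course grade: exam score 90 counts 60%, homework 80 counts 40%.
print(weighted_mean([90, 80], [0.6, 0.4]))  # -> 86.0

# With equal weights, the weighted mean reduces to the arithmetic mean.
print(weighted_mean([2, 4, 6], [1, 1, 1]))  # -> 4.0
```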
As an example, suppose that scores on a test in a particular region have a population mean of 100 and a population standard deviation of 12, and that a classroom of 55 students obtains a mean score of 96. In this example, we treat the population mean and variance as known, which would be appropriate if all students in the region were tested. When population parameters are unknown, a t-test should be conducted instead.
The classroom mean score is 96, which is −2.47 standard error units from
the population mean of 100. Looking up the z-score in a table of the standard
normal distribution, we find that the probability of observing a standard normal
value below −2.47 is approximately 0.5 − 0.4932 = 0.0068. This is the one-sided
p-value for the null hypothesis that the 55 students are comparable to a simple
random sample from the population of all test-takers. The two-sided p-value is
approximately 0.014 (twice the one-sided p-value).
Another way of stating things is that with probability 1 − 0.014 = 0.986, a
simple random sample of 55 students would have a mean test score within 4
units of the population mean. We could also say that with 98.6% confidence we
reject the null hypothesis that the 55 test takers are comparable to a simple
random sample from the population of test-takers.
The Z-test tells us that the 55 students of interest have an unusually low
mean test score compared to most simple random samples of similar size from
the population of test-takers. A deficiency of this analysis is that it does not
consider whether the effect size of 4 points is meaningful. If instead of a
classroom, we considered a subregion containing 900 students whose mean
score was 99, nearly the same z-score and p-value would be observed. This
shows that if the sample size is large enough, very small differences from the null
value can be highly statistically significant.
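The worked example above can be reproduced in plain Python. The standard normal CDF is built from math.erf; the population standard deviation of 12 is the value consistent with the quoted z-score of −2.47:

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def z_statistic(sample_mean, pop_mean, pop_sd, n):
    """One-sample z-statistic: (x-bar - mu) / (sigma / sqrt(n))."""
    return (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))

# Classroom of 55 students with mean score 96 (population: mu = 100, sigma = 12).
z_class = z_statistic(96, 100, 12, 55)
p_one = normal_cdf(z_class)   # one-sided p-value, about 0.007
p_two = 2 * p_one             # two-sided p-value, about 0.013-0.014
print(round(z_class, 2), round(p_one, 4), round(p_two, 3))

# Subregion of 900 students with mean 99: nearly the same z-score.
z_region = z_statistic(99, 100, 12, 900)
print(round(z_region, 2))
```

The second computation illustrates the effect-size caveat: a 1-point difference with n = 900 produces essentially the same z-score as a 4-point difference with n = 55.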
Z-Tests for Different Purposes
There are different types of z-tests, each for a different purpose. Some of the popular types are outlined below:
1. The z-test for a single proportion is used to test a hypothesis about a specific value of the population proportion.
Statistically speaking, we test the null hypothesis H0: p = p0 against the alternative hypothesis H1: p ≠ p0, where p is the population proportion and p0 is the specific value of the population proportion we would like to test.
2. The z-test for the difference of proportions is used to test the hypothesis that two populations have the same proportion.
3. The z-test for a single mean is used to test a hypothesis about a specific value of the population mean.
Statistically speaking, we test the null hypothesis H0: μ = μ0 against the alternative hypothesis H1: μ ≠ μ0, where μ is the population mean and μ0 is the specific value of the population mean we would like to test.
Unlike the t-test for a single mean, this test is used if n ≥ 30 and the population standard deviation is known.
4. The z-test for a single variance is used to test a hypothesis about a specific value of the population variance.
Statistically speaking, we test the null hypothesis H0: σ² = σ0² against H1: σ² ≠ σ0², where σ² is the population variance and σ0² is the specific value of the population variance we would like to test.
In other words, this test enables us to determine whether the given sample has been drawn from a population with a specific variance σ0². Unlike the chi-square test for a single variance, this test is used if n ≥ 30.
5. The z-test for equality of variances is used to test the hypothesis that two population variances are equal when the sample size of each sample is 30 or larger.
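As a sketch, the single-proportion and single-mean tests (types 1 and 3) can be computed as below; the numbers are hypothetical, and the other variants follow the same pattern:

```python
import math

def z_single_proportion(successes, n, p0):
    """z = (p-hat - p0) / sqrt(p0 * (1 - p0) / n)."""
    p_hat = successes / n
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

def z_single_mean(sample_mean, mu0, sigma, n):
    """z = (x-bar - mu0) / (sigma / sqrt(n)); requires known sigma and n >= 30."""
    return (sample_mean - mu0) / (sigma / math.sqrt(n))

# Hypothetical: 60 successes in 100 trials, testing H0: p = 0.5.
print(z_single_proportion(60, 100, 0.5))   # z is approximately 2.0

# Hypothetical: sample of 100 with mean 52, testing H0: mu = 50 with sigma = 10.
print(z_single_mean(52, 50, 10, 100))      # z = 2.0
```

In each case the resulting z-value is compared against the standard normal distribution, as in the worked example earlier.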
T-test
The t-test is any statistical hypothesis test in which the test statistic follows
a Student's t-distribution under the null hypothesis. A t-test is most commonly
applied when the test statistic would follow a normal distribution if the value of a
scaling term in the test statistic were known. When the scaling term is unknown
and is replaced by an estimate based on the data, the test statistics (under
certain conditions) follow a Student's t distribution. The t-test can be used, for
example, to determine if the means of two sets of data are significantly different
from each other. This analysis is appropriate whenever you want to compare the
means of two groups, and especially appropriate as the analysis for the posttest-
only two-group randomized experimental design. The larger the t score, the more
difference there is between groups. The smaller the t score, the more similarity
there is between groups. A t score of 3 means that the difference between the group means is three times as large as the variability within the groups. When you run a t-test, the bigger the t-value, the less likely it is that the observed difference arose by chance.
A large t-score tells you that the groups are different.
A small t-score tells you that the groups are similar.
Uses
Among the most frequently used t-tests are:
A one-sample location test of whether the mean of a population has a
value specified in a null hypothesis.
A two-sample location test of the null hypothesis such that the means of
two populations are equal. All such tests are usually called Student's t-
tests, though strictly speaking that name should only be used if the
variances of the two populations are also assumed to be equal; the form
of the test used when this assumption is dropped is sometimes called
Welch's t-test. These tests are often referred to as "unpaired" or
"independent samples" t-tests, as they are typically applied when the
statistical units underlying the two samples being compared are non-
overlapping.
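A minimal sketch of the independent-samples (Student's) t-test with pooled variance, using only the standard library and made-up posttest data; equal population variances are assumed, per the naming convention above:

```python
import math
from statistics import mean, variance

def students_t(sample1, sample2):
    """Two-sample Student's t-statistic with pooled variance.

    Assumes equal population variances; returns (t, degrees of freedom).
    """
    n1, n2 = len(sample1), len(sample2)
    # Pooled estimate of the common variance (statistics.variance uses
    # n - 1 in the denominator, i.e. the sample variance).
    pooled = ((n1 - 1) * variance(sample1) +
              (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    t = (mean(sample1) - mean(sample2)) / se
    return t, n1 + n2 - 2

# Hypothetical posttest scores for two groups.
t, df = students_t([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
print(t, df)   # t = -2.0 with 8 degrees of freedom
```

The resulting t-value is then compared against a Student's t-distribution with the returned degrees of freedom; dropping the equal-variance assumption leads to Welch's t-test, which uses the two sample variances separately.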
For exactness, the t-test and Z-test require normality of the sample
means, and the t-test additionally requires that the sample variance follows a
scaled χ² distribution, and that the sample mean and sample variance be
statistically independent. Normality of the individual data values is not required if
these conditions are met. By the central limit theorem, sample means of
moderately large samples are often well-approximated by a normal distribution
even if the data are not normally distributed. For non-normal data, the distribution
of the sample variance may deviate substantially from a χ² distribution. However,
if the sample size is large, Slutsky's theorem implies that the distribution of the
sample variance has little effect on the distribution of the test statistic.
If the data are substantially non-normal and the sample size is small, the t-
test can give misleading results. See Location test for Gaussian scale mixture
distributions for some theory related to one particular family of non-normal
distributions. When the normality assumption does not hold, a non-parametric
alternative to the t-test can often have better statistical power.
In the presence of an outlier, the t-test is not robust. For example, for two
independent samples when the data distributions are asymmetric (that is, the
distributions are skewed) or the distributions have heavy tails, then the Wilcoxon
rank-sum test (also known as the Mann–Whitney U test) can have three to four
times higher power than the t-test. The nonparametric counterpart to the paired
samples t-test is the Wilcoxon signed-rank test for paired samples. One-way
analysis of variance (ANOVA) generalizes the two-sample t-test when the data
belong to more than two groups.
Reflections
Like z-tests, t-tests are calculations used to test a hypothesis, but they are
most useful when we need to determine if there is a statistically significant
difference between two independent sample groups. In other words, a t-test asks
whether a difference between the means of two groups is unlikely to have
occurred because of random chance. Usually, t-tests are most appropriate when
dealing with problems with a limited sample size (n < 30).
Both z-tests and t-tests require data with a normal distribution, which
means that the sample (or population) data is distributed symmetrically around
the mean. However, a z-test uses the known population standard deviation to
compute the standard error, whereas the t-test estimates the standard error from
the sample. Thus, when the population standard deviation really is known, the
z-test is the more accurate and more powerful of the two.