
Subject: Ed 200 Methodology of Research & Statistics in Education

Topic: Weighted Mean, T-test & Z-test


Date: May 5, 2019
Professor: Mary Arlene C. Carbonera, Ed.D
Reporter: Jamie G. Bagundol
School: Southern Mindanao College
School Year: 2019

Weighted Mean, T-test & Z-test

Weighted Mean:
A weighted mean is a kind of average. Instead of each data point
contributing equally to the final mean, some data points contribute more “weight”
than others. If all the weights are equal, then the weighted mean equals the
arithmetic mean (the regular “average” you’re used to). Weighted means are very
common in statistics, especially when studying populations.

The weighted mean is a type of mean that is calculated by multiplying the weight (or probability) associated with a particular event or outcome with its associated quantitative outcome and then summing all the products together. It is very useful when calculating a theoretically expected outcome where each outcome has a different probability of occurring, which is the key feature that distinguishes the weighted mean from the arithmetic mean.

It is important to note that all the probabilities or weights must be mutually exclusive (i.e., no two events can occur at the same time) and that the total weights and probabilities must add up to 100%.

When calculating an arithmetic mean, we assume that all numbers used in the calculation carry an equal probability of occurring, or equal weights. Thus, we do not need to account for any differences: we can simply sum the numbers we are interested in and divide the sum by the number of observations.
Formula:

x̄ = Σ(x · w) / Σw, with the sums running over all n values

Where:
Σ = the sum of (in other words…add them up!).
w = the weights.
x = the value.

To use the formula:

1. Multiply the numbers in your data set by their weights.
2. Add up the products from Step 1. Set this number aside for a moment.
3. Add up all of the weights.
4. Divide the number from Step 2 by the number from Step 3.
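
As a minimal illustration of these four steps, here is a short Python sketch (the values and weights are hypothetical, chosen only for demonstration):

    # Hypothetical data: three scores with unequal importance
    values = [80, 90, 70]       # x: the values
    weights = [0.2, 0.5, 0.3]   # w: the weights

    products = [x * w for x, w in zip(values, weights)]  # Step 1
    numerator = sum(products)                            # Step 2
    denominator = sum(weights)                           # Step 3
    weighted_mean = numerator / denominator              # Step 4

    print(weighted_mean)  # 82.0

Note that if all three weights were equal, the result would reduce to the ordinary arithmetic mean, (80 + 90 + 70) / 3 = 80.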

What are the Benefits of Using Weighted Averages?

Weighted averages, or weighted means, take a series of numbers and assign certain values to them that reflect their significance or importance within the group of numbers. A weighted average may be used to evaluate trends in accounting, investing, grading, population research, or other fields in which large quantities of numbers are gathered. The benefit of using a weighted average is that it allows the final average number to reflect the relative importance of each number that is being averaged.

Smooth Out Fluctuations

The major benefit of weighted averages for stocks and accounting is that they smooth out fluctuations in the market. A simple average can be a poor indicator of stock trends, since prices may fluctuate widely within a short amount of time. The weighted average takes these fluctuations into account according to the amount of time a stock spends at a particular price, and so reflects a more long-term and consistent valuation of the stock.
Accounts for Uneven Data
In population studies or census data, certain segments of a population may be over- or under-represented. Weighted averages take into account the portions that have uneven representation and correct for them, making the final figure reflect a more balanced interpretation of the data. This type of average is particularly useful in data dealing with demographics and population size.
Assumes Equal Values are Equal
The benefit of the weighted average system is that it treats equal values as equivalent, in proportion to how often they occur. For example, a teacher might want to determine the average age of her first graders. She knows that all of the students are 4, 5, or 6 years old. She can count the number of students in each age group and then take a weighted average to determine the average age of the students. This makes her task simple because she can assume that all children who are five will be accounted for equally and evenly in the final average.
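For instance, with hypothetical counts of 5 four-year-olds, 15 five-year-olds, and 10 six-year-olds, the weighted mean age would be (4 × 5 + 5 × 15 + 6 × 10) / (5 + 15 + 10) = 155 / 30 ≈ 5.17 years.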
Z-Test
A Z-test is any statistical test for which the distribution of the test statistic
under the null hypothesis can be approximated by a normal distribution. Because
of the central limit theorem, many test statistics are approximately normally
distributed for large samples. For each significance level, the Z-test has a single critical value (for example, 1.96 for a 5% two-tailed test), which makes it more convenient than Student's t-test, which has separate critical values for each sample size. Therefore, many statistical tests can be conveniently performed as
approximate Z-tests if the sample size is large or the population variance is
known. If the population variance is unknown (and therefore has to be estimated
from the sample itself) and the sample size is not large (n < 30), the Student's t-
test may be more appropriate.
If T is a statistic that is approximately normally distributed under the null
hypothesis, the next step in performing a Z-test is to estimate the expected value
θ of T under the null hypothesis, and then obtain an estimate s of the standard
deviation of T. After that the standard score Z = (T − θ) / s is calculated, from
which one-tailed and two-tailed p-values can be calculated as Φ(−Z) (for upper-
tailed tests), Φ(Z) (for lower-tailed tests) and 2Φ(−|Z|) (for two-tailed tests) where
Φ is the standard normal cumulative distribution function.
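As a sketch of this procedure in Python (the numbers chosen for T, θ, and s are hypothetical placeholders; scipy's norm.cdf plays the role of Φ):

    from scipy.stats import norm

    T, theta, s = 2.1, 0.0, 1.0   # hypothetical statistic, H0 expectation, SD of T
    Z = (T - theta) / s           # standard score

    p_upper = norm.cdf(-Z)            # Phi(-Z): upper-tailed p-value
    p_lower = norm.cdf(Z)             # Phi(Z): lower-tailed p-value
    p_two = 2 * norm.cdf(-abs(Z))     # 2*Phi(-|Z|): two-tailed p-value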
Conditions
For the Z-test to be applicable, certain conditions must be met.
Nuisance parameters should be known, or estimated with high accuracy
(an example of a nuisance parameter would be the standard deviation in a one-
sample location test). Z-tests focus on a single parameter, and treat all other
unknown parameters as being fixed at their true values. In practice, due to
Slutsky's theorem, "plugging in" consistent estimates of nuisance parameters can
be justified. However, if the sample size is not large enough for these estimates to be reasonably accurate, the Z-test may not perform well.
The test statistic should follow a normal distribution. Generally, one
appeals to the central limit theorem to justify assuming that a test statistic varies
normally. There is a great deal of statistical research on the question of when a
test statistic varies approximately normally. If the variation of the test statistic is
strongly non-normal, a Z-test should not be used.
If estimates of nuisance parameters are plugged in as discussed above, it
is important to use estimates appropriate for the way the data were sampled. In
the special case of Z-tests for the one or two sample location problem, the usual
sample standard deviation is only appropriate if the data were collected as an
independent sample.
In some situations, it is possible to devise a test that properly accounts for
the variation in plug-in estimates of nuisance parameters. In the case of one and
two sample location problems, a t-test does this.
Example:
Suppose that in a particular geographic region, the mean and standard deviation of scores on a reading test are 100 points and 12 points, respectively.
Our interest is in the scores of 55 students in a particular school who received a
mean score of 96. We can ask whether this mean score is significantly lower
than the regional mean—that is, are the students in this school comparable to a
simple random sample of 55 students from the region as a whole, or are their
scores surprisingly low?
First calculate the standard error of the mean:

SE = σ / √n = 12 / √55 ≈ 1.62

where σ is the population standard deviation and n is the sample size.

Next calculate the z-score, which is the distance from the sample mean to the population mean in units of the standard error:

z = (M − μ) / SE = (96 − 100) / 1.62 ≈ −2.47

In this example, we treat the population mean and variance as known, which would be appropriate if all students in the region were tested. When population parameters are unknown, a t-test should be conducted instead.
The classroom mean score is 96, which is −2.47 standard error units from
the population mean of 100. Looking up the z-score in a table of the standard
normal distribution, we find that the probability of observing a standard normal
value below −2.47 is approximately 0.5 − 0.4932 = 0.0068. This is the one-sided
p-value for the null hypothesis that the 55 students are comparable to a simple
random sample from the population of all test-takers. The two-sided p-value is
approximately 0.014 (twice the one-sided p-value).
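The same one-sided and two-sided p-values can be reproduced in a few lines of Python (a sketch of the example above; scipy's norm.cdf is the standard normal CDF):

    import math
    from scipy.stats import norm

    mu, sigma = 100, 12   # regional mean and standard deviation
    n, m = 55, 96         # sample size and sample mean

    se = sigma / math.sqrt(n)            # standard error, ~1.62
    z = (m - mu) / se                    # ~ -2.47
    p_one_sided = norm.cdf(z)            # ~ 0.0068
    p_two_sided = 2 * norm.cdf(-abs(z))  # ~ 0.014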
Another way of stating things is that with probability 1 − 0.014 = 0.986, a
simple random sample of 55 students would have a mean test score within 4
units of the population mean. We could also say that with 98.6% confidence we
reject the null hypothesis that the 55 test takers are comparable to a simple
random sample from the population of test-takers.
The Z-test tells us that the 55 students of interest have an unusually low
mean test score compared to most simple random samples of similar size from
the population of test-takers. A deficiency of this analysis is that it does not
consider whether the effect size of 4 points is meaningful. If instead of a
classroom, we considered a subregion containing 900 students whose mean
score was 99, nearly the same z-score and p-value would be observed. This
shows that if the sample size is large enough, very small differences from the null
value can be highly statistically significant.
Z-Tests for Different Purposes
There are different types of Z-tests, each for a different purpose. Some of the popular types are outlined below:
1. Z-test for a single proportion is used to test a hypothesis on a specific value of the population proportion.
Statistically speaking, we test the null hypothesis H0: p = p0 against the alternative hypothesis H1: p ≠ p0, where p is the population proportion and p0 is a specific value of the population proportion we would like to test for acceptance.
2. Z-test for the difference of proportions is used to test the hypothesis that two populations have the same proportion.
3. Z-test for a single mean is used to test a hypothesis on a specific value of the population mean.
Statistically speaking, we test the null hypothesis H0: μ = μ0 against the alternative hypothesis H1: μ ≠ μ0, where μ is the population mean and μ0 is a specific value of the population mean that we would like to test for acceptance.
Unlike the t-test for a single mean, this test is used if n ≥ 30 and the population standard deviation is known.
4. Z-test for a single variance is used to test a hypothesis on a specific value of the population variance.
Statistically speaking, we test the null hypothesis H0: σ = σ0 against H1: σ ≠ σ0, where σ is the population standard deviation and σ0 is a specific value of the population standard deviation that we would like to test for acceptance.
In other words, this test enables us to test whether the given sample has been drawn from a population with a specific variance σ0². Unlike the chi-square test for a single variance, this test is used if n ≥ 30.
5. Z-test for testing the equality of variances is used to test the hypothesis that two population variances are equal when the sample size of each sample is 30 or larger.
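
As an illustration, a z-test for a single proportion (type 1 in the list above) can be sketched in Python as follows (p0, n, and the observed proportion are hypothetical):

    import math
    from scipy.stats import norm

    p0 = 0.50      # hypothesized population proportion (H0: p = p0)
    n = 200        # sample size (n >= 30)
    p_hat = 0.56   # observed sample proportion

    se = math.sqrt(p0 * (1 - p0) / n)    # standard error under H0
    z = (p_hat - p0) / se
    p_two_sided = 2 * norm.cdf(-abs(z))  # H1: p != p0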

T-test
The t-test is any statistical hypothesis test in which the test statistic follows
a Student's t-distribution under the null hypothesis. A t-test is most commonly
applied when the test statistic would follow a normal distribution if the value of a
scaling term in the test statistic were known. When the scaling term is unknown
and is replaced by an estimate based on the data, the test statistic (under certain conditions) follows a Student's t-distribution. The t-test can be used, for example, to determine whether the means of two sets of data are significantly different from each other. This analysis is appropriate whenever you want to compare the means of two groups, and is especially appropriate as the analysis for the posttest-only two-group randomized experimental design. The larger the t-score, the more difference there is between the groups; the smaller the t-score, the more similarity there is between them. A t-score of 3, for example, indicates that the groups differ from one another three times as much as they vary within themselves. When you run a t-test, the bigger the t-value, the more likely it is that the results are repeatable.
• A large t-score tells you that the groups are different.
• A small t-score tells you that the groups are similar.

T-Values and P-Values

How big is "big enough"? Every t-value has a p-value to go with it. A p-value is the probability of obtaining results at least as extreme as those observed if only chance were at work (that is, if the null hypothesis were true). P-values range from 0% to 100% and are usually written as a decimal; for example, a p-value of 5% is 0.05. Low p-values are good: they indicate that your results are unlikely to have occurred by chance alone. For example, a p-value of 0.01 means there is only a 1% probability of obtaining such results by chance. In most cases, a p-value of 0.05 (5%) or less is taken to mean that the result is statistically significant.

Uses
Among the most frequently used t-tests are:
• A one-sample location test of whether the mean of a population has a value specified in a null hypothesis.
• A two-sample location test of the null hypothesis that the means of two populations are equal. All such tests are usually called Student's t-tests, though strictly speaking that name should only be used if the variances of the two populations are also assumed to be equal; the form of the test used when this assumption is dropped is sometimes called Welch's t-test. These tests are often referred to as "unpaired" or "independent samples" t-tests, as they are typically applied when the statistical units underlying the two samples being compared are non-overlapping.
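
Both forms of the two-sample test are available in scipy; a minimal sketch with hypothetical data:

    from scipy.stats import ttest_ind

    group_a = [85, 90, 78, 92, 88, 76, 95, 89]   # hypothetical scores
    group_b = [80, 75, 83, 79, 71, 86, 77, 82]

    # Student's t-test: assumes the two population variances are equal.
    t_student, p_student = ttest_ind(group_a, group_b, equal_var=True)

    # Welch's t-test: drops the equal-variance assumption.
    t_welch, p_welch = ttest_ind(group_a, group_b, equal_var=False)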

Alternatives to the T-test for Location Problems

The t-test provides an exact test for the equality of the means of two normal populations with unknown, but equal, variances. (Welch's t-test is a nearly exact test for the case where the data are normal but the variances may differ.) For moderately large samples and a one-tailed test, the t-test is relatively robust to moderate violations of the normality assumption.

For exactness, the t-test and Z-test require normality of the sample
means, and the t-test additionally requires that the sample variance follows a
scaled χ2 distribution, and that the sample mean and sample variance be
statistically independent. Normality of the individual data values is not required if
these conditions are met. By the central limit theorem, sample means of
moderately large samples are often well-approximated by a normal distribution
even if the data are not normally distributed. For non-normal data, the distribution
of the sample variance may deviate substantially from a χ2 distribution. However,
if the sample size is large, Slutsky's theorem implies that the distribution of the
sample variance has little effect on the distribution of the test statistic.

If the data are substantially non-normal and the sample size is small, the t-
test can give misleading results. See Location test for Gaussian scale mixture
distributions for some theory related to one particular family of non-normal
distributions. When the normality assumption does not hold, a non-parametric
alternative to the t-test can often have better statistical power.

In the presence of an outlier, the t-test is not robust. For example, for two
independent samples when the data distributions are asymmetric (that is, the
distributions are skewed) or the distributions have large tails, then the Wilcoxon
rank-sum test (also known as the Mann–Whitney U test) can have three to four times higher power than the t-test. The nonparametric counterpart to the paired samples t-test is the Wilcoxon signed-rank test for paired samples. One-way
analysis of variance (ANOVA) generalizes the two-sample t-test when the data
belong to more than two groups.
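
These nonparametric alternatives are also available in scipy; a brief sketch with hypothetical, skewed data:

    from scipy.stats import mannwhitneyu, wilcoxon

    group_a = [1.2, 3.4, 0.8, 9.9, 2.1, 1.7]   # hypothetical skewed sample
    group_b = [0.5, 0.9, 1.1, 0.7, 1.3, 0.6]

    # Wilcoxon rank-sum / Mann-Whitney U test: independent samples.
    u_stat, p_u = mannwhitneyu(group_a, group_b, alternative="two-sided")

    # Wilcoxon signed-rank test: paired samples (here the two lists are
    # treated as paired measurements purely for illustration).
    w_stat, p_w = wilcoxon(group_a, group_b)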
Reflections

The weighted mean, z-test, and t-test are significant in testing hypotheses. These statistical tools are commonly used in research methods. However, the tools have different uses. The weighted mean determines the degree, or weight, of responses. The z-test is used to compare data whose sample size is at least 30. The t-test, on the other hand, handles comparisons of data whose sample size is less than 30.

A weighted average is most often computed with respect to the frequency of the values in a data set, though one can calculate a weighted average in different ways; certain values in a data set may be given more importance for reasons other than frequency of occurrence. Each data point is multiplied by its assigned weight, the products are summed, and the total is divided by the sum of the weights. A weighted average is extremely useful in that it allows the final average to reflect the relative importance of each observation and is thus more descriptive than a simple average. It also has the effect of smoothing out the data, thereby enhancing accuracy.

Meanwhile, z-tests and t-tests are statistical methods involving data analysis that have applications in business, science, and many other disciplines. A z-test is a statistical calculation that can be used to compare a population mean to a sample's. The z-score tells you how far, in standard deviations, a data point is from the mean of a data set. A z-test compares a sample to a defined population and is typically used for problems involving large samples (n > 30). Z-tests can also be helpful when we want to test a hypothesis; generally, they are most useful when the population standard deviation is known.

Like z-tests, t-tests are calculations used to test a hypothesis, but they are
most useful when we need to determine if there is a statistically significant
difference between two independent sample groups. In other words, a t-test asks
whether a difference between the means of two groups is unlikely to have
occurred because of random chance. Usually, t-tests are most appropriate when
dealing with problems with a limited sample size (n < 30).

Both z-tests and t-tests require data with an approximately normal distribution, which means that the sample (or population) data are distributed symmetrically around the mean. However, a z-test uses the known population standard error, whereas the t-test uses an estimated standard error. Thus, when the population standard deviation is truly known, the z-test is the more accurate and more powerful of the two.

In essence, the weighted mean is used to find the average of responses. A t-test is a statistical method used to see whether two sets of data are significantly different. A z-test is a statistical test used to judge whether a sample statistic differs significantly from a hypothesized population value when the population variance is known. When using statistical tools, researchers must know the usage and conditions of each of these tools in order to choose the most appropriate one for testing the hypothesis. In addition, research analyses and interpretations become more reliable when the research methods used are accurate.
