
An answer to a question is accompanied by a statistical measure of uncertainty,
which is based on a probability model. When that probability model is based on a
chance mechanism, like the flipping of coins, measures of uncertainty and the
inferences drawn from them are formally justified.
Often, however, chance mechanisms are invented as conceptual frameworks for
drawing statistical conclusions.
Creativity Case Study
Subjects with considerable experience in creative writing were randomly assigned
to one of two treatment groups, one receiving an “intrinsic” and the other an
“extrinsic” motivational questionnaire.
Judges were not told about the study’s purpose.

Inference - This experiment provides strong evidence that receiving the “intrinsic”
rather than the “extrinsic” questionnaire caused students in this study to score
higher on poem creativity (two-sided p-value = 0.005 from a two-sample t-test as
an approximation to a randomization test). The estimated treatment effect, that is,
the increase in score attributed to the “intrinsic” questionnaire, is 4.1 points
(95% confidence interval: 1.3 to 7.0 points) on a 0–40-point scale.

Scope of Inference - Since this was a randomized experiment, one may infer that
the difference in creativity scores was caused by the difference in motivational
questionnaires. Because the subjects were not selected randomly from any
population, extending this inference to any other group is speculative.

The inferences one may draw from any study depend crucially on the study’s
design. Two distinct forms of inference—causal inference and inference to
populations—can be justified by the proper use of random mechanisms.

Statistical inferences of cause-and-effect relationships can be drawn from
randomized experiments, but not from observational studies.
Confounding variable – effect of education – Sex discrimination case study
Inference to population - The subjects of the creativity study volunteered their
participation. The decision to volunteer can have a strong relationship with the
outcome (creativity score), and this precludes the subjects from representing a
broader population.

Inferences to populations can be drawn from random sampling studies, but not
otherwise.

A statistical inference is an inference justified by a probability model linking the
data to the broader context.
Probability models provide measures of uncertainty to accompany inferential
conclusions (for example, a p-value).

Questions of interest are translated into questions about parameters in
probability models.

In general, the null hypothesis (H0) is the one that specifies a simpler state of
affairs; typically, as in the creativity case study, an absence of an effect.

A test statistic is a statistic used to measure the plausibility of an alternative
hypothesis relative to a null hypothesis.

A histogram of the values of Ȳ2 − Ȳ1 computed over all possible regroupings of
the data describes the randomization distribution of Ȳ2 − Ȳ1 if the null
hypothesis is true.

P-value
In a randomized experiment the p-value is the probability that randomization
alone leads to a test statistic as extreme as or more extreme than the one
observed. The smaller the p-value, the more unlikely it is that chance assignment
is responsible for the discrepancy between groups, and the greater the evidence
that the null hypothesis is incorrect.

One-sided or Two-sided p-value?


A p-value that counts as extreme only those outcomes with test statistics as large as or larger
than the observed one is a one-sided p-value (here, for the alternative hypothesis that δ > 0).
Statistics that are smaller than −4.14 may provide equally strong evidence against the null
hypothesis, favoring the alternative hypothesis that δ < 0. If those (1,302 of them) are included,
the result is a two-sided p-value of 2,637/500,000 = 0.005274, which would be appropriate for
the two-sided alternative hypothesis that δ ≠ 0.
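
The arithmetic behind these two p-values can be checked with a short sketch. It uses only the counts quoted above; the upper-tail count is not stated in these notes and is derived here by subtraction, on the assumption that the two-sided count is the sum of the two tails. The variable names are illustrative.

```python
# Counts quoted in the notes (500,000 simulated randomizations).
n_randomizations = 500_000
n_two_sided = 2_637    # statistics at least as extreme as observed, either tail
n_lower_tail = 1_302   # statistics smaller than -4.14

# Assumption: the upper-tail count is what remains after removing the lower tail.
n_upper_tail = n_two_sided - n_lower_tail

one_sided_p = n_upper_tail / n_randomizations
two_sided_p = n_two_sided / n_randomizations

print(f"upper-tail count: {n_upper_tail}")
print(f"one-sided p-value: {one_sided_p:.6f}")
print(f"two-sided p-value: {two_sided_p:.6f}")
```

Because the two tails hold slightly different counts, the two-sided p-value here is close to, but not exactly, twice the one-sided value.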
Computing p-values from Randomized Experiments
1. Enumeration of all possible regroupings of the data would represent all the ways that
the data could turn out in all possible randomizations, absent any treatment effect. This
would determine the answer exactly, but it is often over-burdensome.
2. Estimate the p-value by simulating a large number of randomizations and finding the
proportion of these that produce a test statistic at least as extreme as the observed one.
3. The most common method is to approximate the randomization distribution with a
mathematical curve, based on certain assumptions about the distribution of the
measurements and the form of the test statistic.
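
The first two methods can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the case study: the scores are hypothetical toy values (not the creativity data), the test statistic is the difference in sample means, and the function names are invented for this example.

```python
import itertools
import random
from statistics import mean

# Hypothetical toy scores, NOT the creativity study data.
group1 = [12.0, 15.5, 17.0, 19.5]
group2 = [18.0, 21.5, 22.0, 24.5]

def diff_in_means(g2, g1):
    """Test statistic: mean of group 2 minus mean of group 1."""
    return mean(g2) - mean(g1)

observed = diff_in_means(group2, group1)
pooled = group1 + group2
n1 = len(group1)
TOL = 1e-12  # guards the >= comparison against floating-point rounding

def exact_p_value(pooled, n1, observed):
    """Method 1: enumerate every regrouping and count the extreme ones (two-sided)."""
    indices = range(len(pooled))
    extreme = total = 0
    for idx1 in itertools.combinations(indices, n1):
        chosen = set(idx1)
        g1 = [pooled[i] for i in chosen]
        g2 = [pooled[i] for i in indices if i not in chosen]
        total += 1
        if abs(diff_in_means(g2, g1)) >= abs(observed) - TOL:
            extreme += 1
    return extreme / total

def simulated_p_value(pooled, n1, observed, n_sim=20_000, seed=0):
    """Method 2: shuffle the pooled data many times and count the extreme ones."""
    rng = random.Random(seed)
    values = list(pooled)
    extreme = 0
    for _ in range(n_sim):
        rng.shuffle(values)
        if abs(diff_in_means(values[n1:], values[:n1])) >= abs(observed) - TOL:
            extreme += 1
    return extreme / n_sim

print("exact p-value:    ", exact_p_value(pooled, n1, observed))
print("simulated p-value:", simulated_p_value(pooled, n1, observed))
```

With eight observations there are only 70 regroupings, so enumeration is cheap; the number of regroupings grows very quickly with sample size, which is why simulation or a curve approximation is used in practice.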

A histogram is a graph where the horizontal axis displays ranges for the measurement and the
vertical axis displays the relative frequency per unit of measurement. Relative frequency is
therefore depicted by area.
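
A small sketch may make the "relative frequency per unit" scale concrete. The values and bin edges below are hypothetical, and `density_histogram` is a name made up for this example. Bar height is relative frequency divided by bin width, so each bar's area equals the relative frequency of its bin, even when the bins have unequal widths.

```python
def density_histogram(values, edges):
    """Return bar heights: relative frequency divided by bin width."""
    n = len(values)
    heights = []
    last_bin = len(edges) - 2
    for i, (lo, hi) in enumerate(zip(edges[:-1], edges[1:])):
        # Bins are [lo, hi), except the last bin, which also includes its right edge.
        count = sum(1 for v in values if lo <= v < hi or (i == last_bin and v == hi))
        heights.append(count / n / (hi - lo))
    return heights

values = [1.0, 2.5, 3.0, 4.5, 5.0, 7.5, 8.0, 9.0]  # hypothetical measurements
edges = [0.0, 2.0, 5.0, 10.0]                       # deliberately unequal bin widths

heights = density_histogram(values, edges)
areas = [h * (hi - lo) for h, lo, hi in zip(heights, edges[:-1], edges[1:])]

print("heights:", heights)
print("areas:  ", areas)       # each area is the bin's relative frequency
print("total area:", sum(areas))
```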
Statistical Hypothesis Test Logic

1. Based on the research question, select the null hypothesis H0.
   No difference between the groups.
   H0: δ = 0, where δ is the treatment effect.

2. Select a test statistic.
   Something we can calculate from the data to decide whether H0 is plausible,
   e.g. the difference in sample means: test statistic = Ȳ2 − Ȳ1.

3. Determine (or approximate) the distribution of the test statistic when H0 is true.
   Think of a histogram of the statistic if the study were repeated many times:
   the distribution of the difference in sample means under the null hypothesis.

4. Compare the test statistic calculated from the data to the sampling
   distribution when H0 is true.
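
The four steps can be traced in code using a mathematical curve as the approximating distribution. The sketch below makes several assumptions: the data are hypothetical, the statistic is the pooled two-sample t-statistic (the difference in sample means divided by its standard error), and the reference curve is the standard normal rather than a t-distribution, chosen only because the Python standard library has no t-distribution. For groups this small, a t-distribution with n1 + n2 − 2 degrees of freedom would be the usual reference curve, so the p-value printed here understates the uncertainty.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

# Step 1: H0 is delta = 0 (no treatment effect).
# Hypothetical toy scores, NOT the creativity study data.
group1 = [12.0, 15.5, 17.0, 19.5]
group2 = [18.0, 21.5, 22.0, 24.5]

def pooled_t_statistic(g1, g2):
    """Step 2: difference in sample means scaled by its pooled standard error."""
    n1, n2 = len(g1), len(g2)
    # Pooled variance weights each sample variance by its degrees of freedom.
    sp2 = ((n1 - 1) * variance(g1) + (n2 - 1) * variance(g2)) / (n1 + n2 - 2)
    se = sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean(g2) - mean(g1)) / se

def two_sided_p_normal(t):
    """Steps 3-4: compare |t| to a standard normal curve (rough approximation)."""
    return 2 * (1 - NormalDist().cdf(abs(t)))

t_stat = pooled_t_statistic(group1, group2)
p_approx = two_sided_p_normal(t_stat)

print(f"t-statistic: {t_stat:.3f}")
print(f"approximate two-sided p-value (normal curve): {p_approx:.4f}")
```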

Homework Guidelines (These may be augmented during the term):

• Homework must be typed. Include your name, assignment number, and lab time at
the top of the first page. Staple pages together at the top left-hand corner. These
requirements are to streamline handling and grading.
• Aim for conciseness and clarity. Perform the specific tasks and answer the
specific questions requested. You may want to include extra information for your
own records, but please omit this extraneous material in what you submit.
• Do not include computer output unless it is specifically requested. When
reporting results from an analysis, you may want to create your own table of values
and refer to this table in a summary statement.
• Do include R graphs when appropriate. These should have an appropriate title
and axis labels.
