
Teaching statistics using simulation-based methods: proposing significant change to our approach

Alia Maw
Sep 3 2015
Hello colleagues,
I sent the following to the full-time faculty in the mathematics department last spring
(3/25/2015). Some discussion ensued and I did a demo of teaching hypothesis testing using
simulation-based methods. I'd like to continue this discussion this school year and hope that we
can implement significant curricular changes for the coming school year.

I view the statewide changes in the curriculum of the MATH/STAT 1040 class as an opportunity
to align our curriculum with the current international reform efforts incorporating simulation-based
methods in the introductory statistics course, and to draw closer to current practice in the
field of Statistics. I propose that we now move our course curriculum to emphasize simulation,
permutation tests, and randomization-based inference.
I think it will be helpful to approach this opportunity with some background about the field of
Statistics.
Early on in Statistics, the Central Limit Theorem put the Normal model at the center of all
statistical inference. Adjustments to this model (using Student's t distribution, adjusting for
unequal standard deviations, and so on) have made it increasingly complicated. In about 1937,
Fisher and Pitman found a significantly simpler model for inference, based on randomization.
However, the model couldn't be used effectively because we didn't have the computing power.
This is no longer the case. (Interview with George Cobb, Mount Holyoke College, by Allan
Rossman, California Polytechnic State University, Journal of Statistics Education, Volume 23,
Number 1 (2015), www.amstat.org/publications/jse/v23n1/rossmanint.pdf)
In 1992 (over 20 years ago), George Cobb published an article in the MAA (Mathematical
Association of America) volume "Heeding the Call for Change: Suggestions for Curricular
Action," reporting on recommendations for teaching statistics from a focus group of statisticians
across the nation. The recommendations from this group were: 1) Teach statistical thinking;
2) More data and concepts, less theory and fewer recipes; 3) Foster active learning. This charge to
emphasize data and concepts over a theoretical approach mirrors the work of professional
statisticians, where technology has informed theory for years.
In 2005 the ASA (American Statistical Association) endorsed the GAISE (Guidelines for
Assessment and Instruction in Statistics Education) report. This report had six recommendations,
expanding those of the earlier report:

1. Emphasize statistical literacy and develop statistical thinking
2. Use real data
3. Stress conceptual understanding, rather than mere knowledge of procedures
4. Foster active learning in the classroom
5. Use technology for developing conceptual understanding and analyzing data
6. Use assessments to improve and evaluate student learning

A key point is this: Technology has changed the way statisticians work and should change what
and how we teach. For example, statistical tables such as a normal probability table are no longer
needed to find p-values, and we can implement computer-intensive methods. It is important to
view the use of technology not just as a way to compute numbers but as a way to explore
conceptual ideas and enhance student learning as well.
http://www.amstat.org/education/gaise/GaiseCollege_Full.pdf
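A quick illustration of that point, using a hypothetical observed z-value of 2.15 of my own choosing: the tail area that once required a printed normal table can be computed directly.

    # Hypothetical example: two-sided p-value for an observed z of 2.15,
    # computed directly rather than looked up in a normal probability table.
    from scipy.stats import norm

    z = 2.15
    p_value = 2 * norm.sf(abs(z))   # survival function = upper-tail area
    print(p_value)                  # roughly 0.03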
Now, ten years post-GAISE, there are two NSF (National Science Foundation) grant-funded
randomization-based inference curricula currently under development in the nation, as well as
other smaller initiatives. I have spent considerable time with both of these curricula, including
attending conferences and workshops, participating in research studies, and reading professional
journals.
The first is the CATALST project out of the University of Minnesota
(http://www.tc.umn.edu/~catalst/about), headed by Joan Garfield, Bob delMas, et al. CATALST
has been in development for several years. I like this curriculum, especially its metaphor that many
introductory statistics classes teach students how to follow recipes, but not how to really cook:
even if students leave a class able to perform routine procedures and tests, they do not have the
big picture of the statistical process that will allow them to solve unfamiliar problems and to
articulate and apply their understanding. Still, I don't think it will be the best fit for us. The
course depends strongly upon a particular learning software called TinkerPlots
(http://www.tinkerplots.com/) and on a group-learning focus in the classroom.
I recommend that we adopt much of the second NSF project: ISI (Introduction to Statistical
Investigations), http://www.math.hope.edu/isi/. This large project involves key leaders in
Statistics education across four universities. The materials developed include a text, applets and
datasets, instructional materials, and assessments.
The text differs from traditional texts in both content and pedagogy. Statistical inference is
introduced using simulation-based methods in Chapter 1. This allows students to intuitively
understand the process of inference from the very beginning of their course. Concepts of
statistical inference are then developed throughout the entire text, instead of being confined to
the last half as in many traditional texts.
A semester might be organized like this:
Statistical Significance (One Proportion)
Normal Distributions
Sampling
Confidence Intervals (One Proportion)
Experiments
Comparing Two Proportions
Comparing Two Means
Inference for One Mean
Paired Data
Comparing Multiple Groups
Association and Correlation
Regression Analysis
The basic approach is to introduce each topic of inference with randomization methods first, then
follow up with the parametric model. In this way the more traditional, parametric models are not
abandoned, but presented as alternative models to the simulation models. This also takes a lot of
the black box out of the introductory statistics class: the students will still not be prepared for a
calculus-based, proof-driven derivation of the parametric models, but they will have the
reasoning and logic of the randomization methods to give credence to the parametric models,
whereas previously these were introduced with little to no backing in the intro stats course. Basic
concepts of sampling, experimental design, and descriptive statistics are integrated throughout the
course, rather than all presented at the beginning without the context of the statistical process.
I think that the ISI approach would be a good fit for our school, and in fact, for programs across
the state of Utah. We have an opportunity to be among the first adopters of a randomization-based
curriculum in the state. By putting randomization methods first, we will positively impact
students' understanding of Statistics and align with current professional practice in the field. By
still including parametric models, we will smoothly meet the needs of other departments and
courses still using traditional methods. Our students will have a strong foundation in data-driven
methods in Statistics; they will be able to move forward in programs that rely on the traditional
parametric approach, and they will also be able to use simulations to solve problems that cannot
be modeled by a Normal distribution. This unique advantage will make for more versatile
problem-solvers with a grasp of the complete process of statistical analysis.
I believe this matter merits thoughtful and prompt attention, and that the timing is right given the
statewide curricular changes in the MATH/STAT 1040 course. I'd appreciate your considered
feedback and discussion.
Alia Criddle Maw

Some further reading:


Cobb, G., and Moore, D. (1997), "Mathematics, Statistics, and Teaching," American
Mathematical Monthly, 104, 801-823.
http://www.stat.ucla.edu/~rakhee/attachments/moorecobb.pdf

Rossman, A., Chance, B., and Medina, E., "Some key comparisons between Statistics and
Mathematics and why teachers should care," chapter in the 2006 National Council of Teachers
of Mathematics Yearbook, Thinking and Reasoning with Data and Chance.

"Combating anti-statistical thinking through the use of simulation-based methods throughout the
undergraduate curriculum," a white paper by the ISI team discussing the role of simulation-based
inference throughout the undergraduate statistics curriculum.
http://www.math.hope.edu/isi/presentations/white_paper_sim_inf_thru_curriculum.pdf

Here are some additional responses I sent out to the faculty last spring (4/12/2015).

First, it is important to recognize that Quantitative Literacy > mathematics


From the AACU VALUE rubric: Quantitative Literacy (QL), also known as Numeracy or
Quantitative Reasoning (QR), is a "habit of mind," competency, and comfort in working with
numerical data. Individuals with strong QL skills possess the ability to reason and solve
quantitative problems from a wide array of authentic contexts and everyday life situations. They
understand and can create sophisticated arguments supported by quantitative evidence and they
can clearly communicate those arguments in a variety of formats (using words, tables, graphs,
mathematical equations, etc., as appropriate). http://www.aacu.org/value/rubrics/quantitative-literacy
The Salt Lake Community College-wide Student Learning Outcome #3 reads:
Develop quantitative literacies necessary for their chosen field of study. For example:
Approach practical problems by choosing and applying appropriate mathematical techniques.
Use and interpret information represented as data, graphs, tables, and schematics in a variety of
disciplines.
Apply mathematical theory, concepts and methods of inquiry appropriate to program-specific
problems.
Develop financial literacy.

And the SLCC Quantitative Literacy Rubric Development Guide clarifies:


Is QL math? Well... sort of. There is no question that math plays a supporting role in QL. But
math is not the end goal, nor is it the majority of what QL is all about. The simplest explanation
of the difference I found between math and QL was that standard math is learning more-and-more
sophisticated abstract math, while QL utilizes simple mathematics in sophisticated ways. Nearly
every author I've read says QL is not math per se. Rather, it is reasoning and communicating
mathematically. Thus, the main elements of nearly any QL examination would focus on different
things than a standard mathematics multiple-choice exam. Instead, QL assessments would place
great emphasis on determining if the student could identify a method for solving a problem or
making a decision, do the math involved, and then communicate this in narrative, numeric, and
graphic form as needed.
http://www.slcc.edu/assessment/docs/Quantitative%20Literacy%20Rubric%20Development%20Guide%202014.pdf
Mathematics is not what makes an introductory Statistics course a Quantitative Literacy course,
either at our college or across the nation.
Second, the place for probability
While I included the Moore/Cobb article as a reference, there are many other papers, as well
as the GAISE standards and other more recent work in statistics education, all arriving at a
strong consensus that probability theory is not the focus of an introductory statistics course. This
course needs to be data-centered, focused on reasoning in the face of uncertainty. As we've
discussed previously, topics like formally presented conditional probability are superfluous to
inferential statistics at this level.
However, I disagree that the reformed curriculum removes probability from the course.
Understanding probabilities is crucial to modeling, simulations, and inference, including
confidence intervals and hypothesis tests. Perhaps you are responding to not seeing it as a
separate topic in my list of topics for the course. The list was meant to give an idea of the
ordering of topics in the course, not to be an exhaustive list of the student learning outcomes.
For example, the proposed ordering of topics starts with Statistical Significance (One
Proportion). We could start the course with some reading of articles, discussing the statistical
investigation method. We'd need to run through the very basics of probabilities and how to
interpret probability as a long-run frequency. Using articles, we can work on defining
observational units and variables, types of variables (quantitative or categorical), and introduce
concepts of distribution, shape, center, and variability. Then we start working on inference on the
proportion and introduce the tool of a hypothesis test. By starting with concrete simulations
(cards, dice, other simple probability models), then moving to the more abstract (computerized
simulations), and lastly to the very abstract (traditional parametric theoretical models), we help
advance students' understanding. Students will start by interpreting p-values as a relative-frequency
probability from their simulation, then move to the parametric test and interpret the
model-based p-value.
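As a concrete sketch of that opening unit (again my own illustration with invented numbers, not an excerpt from the ISI applets), suppose 34 of 50 subjects choose the correct option when chance alone would predict one half. A coin-flip simulation gives the p-value as a relative frequency, and the theory-based one-proportion z-test can then be compared against it.

    # Illustrative one-proportion simulation: 34 successes in 50 trials (invented
    # numbers), null hypothesis p = 0.5. A sketch, not the ISI applet itself.
    import numpy as np
    from scipy import stats

    n, observed = 50, 34
    rng = np.random.default_rng(7)
    reps = 10_000

    # Simulate many sets of 50 fair coin flips and record how often the result
    # is at least as far from 25 heads as the observed 34 (two-sided).
    sims = rng.binomial(n, 0.5, size=reps)
    p_sim = np.mean(np.abs(sims - n * 0.5) >= abs(observed - n * 0.5))
    print("simulation p-value:", p_sim)

    # Theory-based follow-up: the normal-approximation (one-proportion z) test.
    p_hat = observed / n
    z = (p_hat - 0.5) / np.sqrt(0.5 * 0.5 / n)
    p_theory = 2 * stats.norm.sf(abs(z))
    print("z-test p-value:", p_theory)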
I assert that this curriculum will ensure that students have a lot of exposure to probability
throughout the course.
Third, introductory statistics as an applied course versus a theory course
The current intro statistics course offers very, very little in statistical theory. For example, we
declare the Central Limit Theorem to exist, but do nothing to prove it. This is because we
cannot: students would need at least three semesters of Calculus and an Analysis course before
we could do that. Mathematical Statistics is a 5000-level graduate course. Even a beginning
Statistics for Scientists course that attempts some of the proofs requires Calculus. In our 1040
class we just give students the parametric models while glossing over their limited utility due to
model restrictions and other assumptions. We establish the basics of a probability distribution,
but only in the discrete case. Moving to the continuous case requires hand-waving. We have
never been teaching a theoretical statistics course. By choosing to teach using simulations first
(parametric models second), we are giving the students something we've never given them
before: a reason to believe the theoretical models.
From a sample syllabus posted with the ISI curriculum materials:
Curriculum:
We use a nontraditional approach to teaching introductory statistics. While we end up with
basically the same outcomes as those of a traditional course, our path getting there is a bit
different. A traditional course consists of three sections: descriptive statistics,
probability/sampling distributions, and inferential statistics. With more and more statistics being
taught in the K-12 curriculum, most of you already have a grasp of descriptive statistics. We will
quickly include the descriptive topics that are needed for inference throughout the course, but
will not devote as much time to these topics as is traditionally done. The second part of a
traditional course (probability and sampling distributions) is typically included to help students
understand the theory behind inferential statistics. We, however, believe that introducing
students to inferential statistics is better done using simulations called permutation tests or
randomization to learn the statistical inference process. Introducing inference this way is more
intuitive (and thus more understandable) and allows us to spend much more time on it.
Therefore, you should gain a better understanding of the inferential process, as we will
thoroughly cover the entire statistical investigative method throughout the entire semester. We
will still cover the theory-based methods that are traditionally taught including tests and
confidence intervals for a single mean and proportion, matched pairs, comparing two means,
comparing two proportions, comparing multiple means (ANOVA) and proportions (chi-square),
correlation and regression.
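One more sketch of my own (invented data, and not the exact construction the ISI applets use): simulation can also produce a confidence interval. Here a bootstrap percentile interval for a single mean sits alongside the traditional t-interval, mirroring the simulation-then-theory sequence above.

    # Illustrative bootstrap percentile confidence interval for one mean,
    # followed by the traditional t-interval. The data values are made up.
    import numpy as np
    from scipy import stats

    sample = np.array([12.1, 9.8, 11.4, 10.9, 13.2, 10.1, 11.8, 12.5, 9.5, 11.0])
    rng = np.random.default_rng(3)
    reps = 10_000

    # Resample the data with replacement many times and record each mean.
    boot_means = np.array([rng.choice(sample, size=len(sample), replace=True).mean()
                           for _ in range(reps)])
    lo, hi = np.percentile(boot_means, [2.5, 97.5])
    print("bootstrap 95% CI:", (round(lo, 2), round(hi, 2)))

    # Theory-based follow-up: the usual t-interval for the mean.
    t_lo, t_hi = stats.t.interval(0.95, df=len(sample) - 1,
                                  loc=sample.mean(), scale=stats.sem(sample))
    print("t-interval 95% CI:", (round(t_lo, 2), round(t_hi, 2)))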
