
Evaluation Research

Evaluation research, sometimes called program evaluation, refers to a research purpose rather than a specific method.

This purpose is to evaluate the impact of social interventions such as new teaching methods, innovations in parole, and a host of others.

Evaluation research is a form of applied research—it is intended to have some real-world effect.
Many methods, such as surveys and experiments, can be used in evaluation research.

In recent years, the field of evaluation research has become an increasingly popular and active research specialty, as reflected in textbooks, courses, and projects.
Evaluation research reflects social scientists’ increasing desire to make a difference in the world.
At the same time, two other influences are at work:
• an increase in federal requirements that program evaluations accompany the implementation of new programs, and
• the availability of research funds to fulfill those requirements.
Topics Appropriate to Evaluation Research
Evaluation research is appropriate whenever some social intervention
occurs or is planned.

Social intervention is an action taken within a social context for the purpose of producing some intended result.

In its simplest sense, evaluation research is the process of determining whether a social intervention has produced the intended result.

The topics appropriate for evaluation research are limitless.

The questions appropriate for evaluation research are of great practical significance: jobs, programs, and investments as well as values and beliefs.
Formulating the Problem: Issues of Measurement
Problem: What is the purpose of the intervention to be
evaluated?
This question often produces vague answers.
A common problem is measuring the “unmeasurable.”
Evaluation research is a matter of finding out whether
something is there or not there, whether something
happened or did not happen.
To conduct evaluation research, we must be able to
operationalize, observe, and measure.
What is the outcome, or the response variable?

If a social program is intended to accomplish something, we must be able to measure that something.

It is essential to achieve agreement on definitions in advance.

In some cases you may find that the definitions of a problem and a sufficient solution are set by law or by agency regulations; if so, you must be aware of such specifications and accommodate them.
Whatever the agreed-upon definitions, you must
also achieve agreement on how the measurements
will be made.

There may be several outcome measures: for instance, surveys of attitudes and behaviors, existing statistics, and the use of other resources.
Measuring Experimental Contexts
Measuring the dependent variable directly involved
in the experimental program is only a beginning.

It is often appropriate and important to measure those aspects of the context of an experiment that researchers think might affect the experiment.

For example, what is happening in the larger society, beyond the experimental group, may affect the experimental group.
Specifying Interventions

Besides making measurements relevant to the outcomes of a program, researchers must measure the program intervention—the experimental stimulus.

If the research design includes an experimental and a control group, handling the experimental stimulus is straightforward: assigning a person to the experimental group is equivalent to scoring that person “yes” on the stimulus, and assigning a person to the control group is equivalent to scoring “no.”
Considerations: who participates fully; who misses participation in the program periodically; who misses participation a lot?
Measures may need to be included to capture each subject’s level of participation.
The problems may be more difficult than that, so the factors to consider should be addressed thoroughly.
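The stimulus coding and participation measures described above can be sketched in code. This is a hypothetical illustration; the function names, the 0/1 coding scheme, and the session counts are assumptions, not from the text:

```python
import random

def assign_groups(subjects, seed=42):
    """Randomly assign each subject to the experimental (1) or control (0)
    group; the group code doubles as the yes/no score on the stimulus."""
    rng = random.Random(seed)
    return {subject: rng.choice([0, 1]) for subject in subjects}

def participation_level(sessions_attended, sessions_offered):
    """Beyond a yes/no stimulus, measure the *level* of participation
    as the share of program sessions a subject actually attended."""
    return sessions_attended / sessions_offered

groups = assign_groups(["s1", "s2", "s3", "s4"])
print(participation_level(8, 10))  # 0.8: a partial participant
```

The point of the second function is the one the text raises: a simple yes/no stimulus score hides differences between full, periodic, and rare participants.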
Specifying the Population
It is important to define the population of possible
subjects for whom the program is appropriate.

Ideally, all or a sample of appropriate subjects will then be assigned to experimental and control groups as warranted by the study design.

Beyond defining the relevant population, the researcher should make fairly precise measurements of the variables considered in the definition.
New versus Existing Measures
If the study addresses something that’s never been
measured before, the choice is easy—new
measures.

If the study addresses something that others have tried to measure, the researcher will need to evaluate the relative worth of various existing measurement devices in terms of her or his specific research situation and purpose.
Advantages of creating measures:
• They can offer greater relevance and validity than using existing
measures.

Advantages of using existing measures:
• Creating good measures takes time and energy, both of which could be saved by adopting an existing technique.

• Of greater scientific significance, measures that have been used frequently by other researchers carry a body of possible comparisons that might be important to the current evaluation.

• Finally, measures with a long history of use usually have known degrees of validity and reliability, but newly created measures will require pretesting or will be used with considerable uncertainty.
Operationalizing Success/Failure

Potentially one of the most taxing aspects of evaluation research is determining whether the program under review succeeded or failed.
Definitions of “success” and “failure” can be rather difficult to pin down.
Cost-benefit analysis
How much does the program cost in relation to what it returns
in benefits?
• If the benefits outweigh the cost, keep the program going.
• If the reverse, ‘junk it’.
• Unfortunately, this analysis is not appropriate if we think only in terms of money, because many program costs and benefits cannot be expressed in monetary terms.
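In purely monetary terms, the decision rule above reduces to a simple ratio. The figures and the function name below are hypothetical, for illustration only:

```python
def cost_benefit_ratio(total_benefits, total_costs):
    """Return the benefit-to-cost ratio: > 1 suggests keeping the
    program going, < 1 suggests junking it (money terms only)."""
    return total_benefits / total_costs

# Hypothetical program: $120,000 in measurable benefits, $100,000 in costs.
ratio = cost_benefit_ratio(120_000, 100_000)
print(ratio > 1)  # True: in purely monetary terms, keep the program
```

The sketch also shows why the text's caveat matters: any benefit that cannot be priced simply never enters the ratio.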

Ultimately, the criteria of success and failure are often a matter of agreement.

The people responsible for the program may commit themselves in advance to a particular outcome that will be regarded as an indication of success.
• If that’s the case, all you need to do is make absolutely certain that the research
design will measure the specified outcome.
Researchers must take measurement quite seriously
in evaluation research, carefully determining all the
variables to be measured and getting appropriate
measures for each.

Such decisions are often not purely scientific ones.

Evaluation researchers often must work out their measurement strategy with the people responsible for the program being evaluated.

There is also a political aspect.


Types of Evaluation Research Designs
Evaluation research is not itself a method, but
rather one application of social research methods.
As such, it can involve any of several research
designs. To be discussed:

1. Experimental designs
2. Quasi-experimental designs
3. Qualitative evaluations
Experimental Designs

Many of the experimental designs introduced in Chapter 8 can be used in evaluation research.
Quasi-Experimental Designs: distinguished from “true” experiments primarily by the lack of random assignment of subjects to experimental and control groups. In evaluation research, it’s often impossible to achieve such an assignment of subjects.
Rather than forgo evaluation altogether, there are some other possibilities.

• Time-Series Designs
• Nonequivalent Control Groups
• Multiple Time-Series Designs
Time-Series Designs:
• Studies that involve measurements taken over time. See Figures 12-1 & 12-2.
• Figure 12-1 involves only an experimental group, without a control group.
Nonequivalent Control Groups:
• Using an existing “control” group that appears
similar to the experimental group, used when
researchers cannot create experimental and
control groups by random assignment from a
common pool.

• A nonequivalent control group can provide a point of comparison even though it is not formally a part of the study.
Multiple Time-Series Designs:
• Using more than one time-series analysis.

• These are an improved version of the nonequivalent control group design.

• This method is not as good as one in which control groups are randomly assigned, but it is an improvement over assessing the experimental group’s performance without any comparison.
• See page 352, Figure 12-3
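A multiple time-series comparison can be sketched numerically: compute the before/after change in the experimental series, then subtract the corresponding change in the nonequivalent comparison series. All numbers below are invented for illustration:

```python
def mean(values):
    return sum(values) / len(values)

def series_change(series, intervention_index):
    """Mean outcome after the intervention minus mean outcome before it."""
    return mean(series[intervention_index:]) - mean(series[:intervention_index])

# Hypothetical yearly measurements; the intervention occurs at index 4.
experimental = [50, 51, 49, 50, 60, 62, 61, 63]  # group receiving the program
comparison   = [48, 49, 50, 49, 50, 51, 49, 50]  # nonequivalent control group

effect = series_change(experimental, 4) - series_change(comparison, 4)
print(effect)  # 10.5: the experimental series rose well beyond the comparison trend
```

Subtracting the comparison series' change is what separates this design from a single time series: it filters out shifts that affected both groups.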
Qualitative Evaluations
• Evaluations can be less structured and more
qualitative.

• Sometimes important, often unexpected information is yielded from in-depth interviews.
The Social Context
• Evaluation research has a special propensity for
running into problems.
• Logistical problems
• Ethical problems
Logistical Problems
Problems associated with getting subjects to do what
they are supposed to do, getting research instruments
distributed and returned, and other seemingly
unchallenging tasks that can prove to be very
challenging.

The special logistical problems of evaluation research grow out of the fact that it occurs within the context of real life.
Although evaluation research is modeled after the
experiment—which suggests that the researchers
have control over what happens—it takes place
within frequently uncontrollable daily life.

Lack of control can create real dilemmas for the researchers.
Administrative control:
• The logistical details of an evaluation project often fall
to program administrators.

• What happens when the experimental stimulus changes in the middle of the experiment due to unforeseen problems (e.g. escaping convicts, inconsistency of attendance, or replacing original subjects with substitutes)?

• Some of the data will reflect the original stimulus; other data will reflect the modification.
Ethical Issues
• Ethics and evaluation are intertwined in many ways.

• Sometimes the social interventions being evaluated raise ethical issues; they may involve political, ideological, and ethical questions about the topic itself.

• Maybe the experimental program is of great value to those participating in it.
• But what about the control group, which is not receiving help?
• For example, the Tuskegee, Alabama study (page 356).
Use of Research Results
Because the purpose of evaluation research is to determine
the success or failure of social interventions, you might think
it reasonable that a program would automatically be
continued or terminated based on the results of the research.

It’s not that simple.

Other factors intrude on the assessment of evaluation research results, sometimes blatantly and sometimes subtly.
There are three important reasons why the implications of evaluation research results are not always put into practice:

• The implications may not always be presented in a way that nonresearchers can understand.

• Evaluation results sometimes contradict deeply held beliefs.

• Vested interests in the programs underway.
Social Indicators Research
Combining evaluation research with the analysis of existing data.

A rapidly growing field in social research involves the development and monitoring of social indicators, aggregated statistics that reflect the social condition of a society or social subgroup.

Researchers use indicators to monitor social life.

It’s possible to use social indicators data for comparison across groups either at one time or across some period of time.

Often doing both sheds the most light on the subject.
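Both kinds of comparison, across groups at one time and within a group over time, can be sketched as follows. The indicator, groups, and values below are invented for illustration, not real data:

```python
# Hypothetical indicator values for two subgroups over three years.
indicator = {
    "group_a": {2000: 8.1, 2005: 7.4, 2010: 6.9},
    "group_b": {2000: 12.3, 2005: 11.8, 2010: 11.5},
}

def gap_between_groups(data, year):
    """Compare two groups at a single point in time."""
    return round(data["group_b"][year] - data["group_a"][year], 2)

def change_over_time(data, group, start, end):
    """Compare one group with itself across a period of time."""
    return round(data[group][end] - data[group][start], 2)

print(gap_between_groups(indicator, 2010))                 # 4.6
print(change_over_time(indicator, "group_a", 2000, 2010))  # -1.2
```

Running both functions together shows why doing both sheds the most light: group_a is improving over time, yet a sizable gap between the groups persists.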


The use of social indicators is proceeding
on two fronts:
• Researchers are developing ever more-refined
indicators; finding which indicators of a general
variable are the most useful in monitoring
social life

• Research is being devoted to discovering the relationships among variables within whole societies.
Computer Simulation:
As researchers begin compiling mathematical equations
describing the relationships that link social variables to
one another (for example, the relationship between
growth in population and the number of automobiles),
those equations can be stored and linked to one another in
a computer.

With a sufficient number of adequately accurate equations on tap, researchers will one day be able to test the implications of specific social changes by computer rather than in real life.
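A minimal sketch of such a simulation, using the text's population-and-automobiles example. The growth rate and car-ownership rate are invented for illustration:

```python
def simulate(population, growth_rate, cars_per_person, years):
    """Project population forward with exponential growth and derive
    the number of automobiles from a fixed ownership rate, one linked
    equation per variable."""
    trajectory = []
    for _ in range(years):
        trajectory.append((round(population), round(population * cars_per_person)))
        population *= 1 + growth_rate
    return trajectory

# Hypothetical inputs: 1,000,000 people, 2% annual growth, 0.4 cars per person.
for people, cars in simulate(1_000_000, 0.02, 0.4, 3):
    print(people, cars)
```

Once the equations are stored this way, a researcher can vary an input (say, the growth rate) and read off the projected consequences instead of waiting for them to unfold in real life.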
Evaluation research provides a means for us to learn
right away whether a particular “tinkering” really
makes things better.

Social indicators allow us to make that determination on a broad scale; coupling them with computer simulation opens up the possibility of knowing how much we would like a particular intervention without having to experience its risks.
