
Research Methods and Statistics for Public and Nonprofit Administrators
The authors wish to dedicate this book to current and future
professionals in the field of public service.
Masami Nishishiba wishes to dedicate this book to her loving parents, Tetsuo and
Akiko Kawai, and to the memory of her late brother, Takeshi Kawai.
Mariah Kraner wishes to dedicate this book to her supportive husband, Josh,
and loving children, Kennedy and Logan. 
Research Methods and Statistics for Public and Nonprofit Administrators
A Practical Guide

Masami Nishishiba
Matthew Jones
Mariah Kraner
Portland State University
FOR INFORMATION:

SAGE Publications, Inc.
2455 Teller Road
Thousand Oaks, California 91320
E-mail: order@sagepub.com

SAGE Publications Ltd.
1 Oliver’s Yard
55 City Road
London EC1Y 1SP
United Kingdom

SAGE Publications India Pvt. Ltd.
B 1/I 1 Mohan Cooperative Industrial Area
Mathura Road, New Delhi 110 044
India

SAGE Publications Asia-Pacific Pte. Ltd.
3 Church Street
#10-04 Samsung Hub
Singapore 049483

Copyright © 2014 by SAGE Publications, Inc.

All rights reserved. No part of this book may be reproduced or utilized in any form
or by any means, electronic or mechanical, including photocopying, recording, or by
any information storage and retrieval system, without permission in writing from
the publisher.

All character cartoon images © Kyoko Hosoe-Corn

Printed in the United States of America

Library of Congress Cataloging-in-Publication Data

Nishishiba, Masami.
Research methods and statistics for public and nonprofit administrators : a practical
guide / Masami Nishishiba, Matthew Jones, Mariah Kraner.
pages cm
Includes bibliographical references and index.
ISBN 978-1-4522-0352-2 (pbk. : alk. paper)
ISBN 978-1-4833-0141-9 (web pdf)
ISBN 978-1-4833-2146-2 (epub)
1. Public administration—Research Methodology. 2. Nonprofit organizations—Management. I. Title.

JF1338.A2N57 2013
001.4′2024658—dc23
Acquisitions Editor:  Patricia Quinlin
Associate Editor:  Maggie Stanley
Assistant Editor:  Megan Koraly
Editorial Assistant:  Katie Guarino
Production Editor:  Brittany Bauhaus
Copy Editor:  Brenda White
Typesetter:  C&M Digitals (P) Ltd.
Proofreader:  Bonnie Moore
Indexer:  Kathy Paparchontis
Cover Designer:  Anupama Krishnan
Marketing Manager:  Liz Thornton

This book is printed on acid-free paper.

13 14 15 16 17 10 9 8 7 6 5 4 3 2 1

Brief Contents

Preface xvi
Acknowledgments xx

SECTION I.  RESEARCH DESIGN AND DATA COLLECTION 1

Chapter 1  When a Practitioner Becomes a Researcher 2
Chapter 2  Research Alignment 13
Chapter 3  Identifying the Focus of the Research:
Research Objective and Research Question 26
Chapter 4  Research Design 47
Chapter 5  Sample Selection 72
Chapter 6  Data Collection 87

SECTION II.  DATA ANALYSIS 115

Chapter 7  Quantitative Data Preparation and Descriptive Statistics 116
Chapter 8  Hypothesis Testing and Statistical
Significance: Logic of Inferential Statistics 151
Chapter 9  Comparing Means Between Two Groups 171
Chapter 10  Comparing Means of More Than Two Groups: Analysis of Variance (ANOVA) 193
Chapter 11  Bivariate Correlation 222
Chapter 12  Chi-Square Analysis 239
Chapter 13  Regression Analysis 253
Chapter 14  Qualitative Data Analysis 281
SECTION III.  SUMMING UP: PUTTING THE PIECES TOGETHER 297

Chapter 15  Writing Reports 298


Chapter 16  Using Research Methods for Continuous Improvement: Program Evaluation and Performance Measurement 313

Appendix A: Additional SPSS and Excel Instructions 333


Appendix B: Emily’s Survey Form 339
Glossary 342
Index 356
About the Authors 366

Contents

Preface xvi
Acknowledgments xx

SECTION I.  RESEARCH DESIGN AND DATA COLLECTION 1

Chapter 1  When a Practitioner Becomes a Researcher 2


Case Descriptions of Practitioners Becoming Researchers 3
Emily: HR Director at a City 3
Jim: Deputy Fire Chief 4
Mary: Manager at a Nonprofit Organization 6
Purpose of This Book 6
Research Skills as Leadership Skills 8
1. Research skills help you develop problem solving, solution
construction, and social judgment skills. 8
2. Research skills help you acquire resources. 8
3. Research skills help you allocate resources more effectively. 9
4. Research skills help you advocate. 10
5. Research skills help you become a better decision maker. 10
6. Research skills support ethical leadership. 10
Chapter Summary 11
Review and Discussion Questions 11
Key Terms 12

Chapter 2  Research Alignment 13


When the Research Process Is Not Aligned 14
Emily’s Case 14
Research Flow and Components 18
Overall Flow of Research 18
Step 1. Identifying the Focus of the Research (Research Objective) 18
Step 2. Identifying the Specific Questions You Are
Trying to Answer (Research Question) 20
Step 3. Identify How You Collect the Information
You Need (Research Design) 20
Step 4. Identify From Whom or What You Collect the
Information (Sample Selection) 21
Step 5. Collect the Data (Data Collection) 21
Step 6. Analyze the Data (Data Analysis) 22
Step 7. Interpret and Report the Results (Reporting) 23
Chapter Summary 24
Review and Discussion Questions 25
Key Terms 25

Chapter 3  Identifying the Focus of the Research: Research Objective and Research Question 26
Identifying the Focus of the Research 27
Jim’s Case 27
Mary’s Case 29
Research Objectives 30
Jim’s Case 31
Identifying Research Objectives 32
Types of Research 32
Theory Building Approaches: Inductive Versus Deductive 33
Types of Data Analysis 35
Mary’s Case 36
Research Questions 37
Jim’s Case 37
Focusing Your Research Questions 38
Identifying Types of Research Questions 40
Literature Review 42
Chapter Summary 44
Review and Discussion Questions 44
Key Terms 46

Chapter 4  Research Design 47


Identifying Research Design 48
Emily’s Case 48
Mary’s Case 49
Research Design: A Game Plan 49
Types of Research Design 50
Conditions for Cause and Effect 51
Temporal Precedence 52
Covariation of Cause and Effect 52
No Plausible Alternative Explanation 53
Key Elements of Experimental Research Design 56
Variations of Quasi-Experimental Research Design 59
Jim’s Case 59
Making a Causal Argument Based on the
Experimental Design 63
Jim’s Case (Continues) 63
Other Variations of Experimental and Quasi-Experimental Design 66
Ethical Considerations in Experimental and
Quasi-Experimental Design 69
Chapter Summary 69
Review and Discussion Questions 70
Key Terms 71

Chapter 5  Sample Selection 72


Identifying Samples 73
Emily’s Case 73
Mary’s Case 74
Sample Selection 74
Identify an Appropriate Sampling Frame 75
Identify an Appropriate Sample Size 77
Identify an Appropriate Sampling Technique 78
Probability Sampling 79
Simple Random Sampling 79
Systematic Random Sampling 79
Stratified Random Sampling 80
Cluster Sampling 81
Non-Probability Sampling 82
Convenience Sampling 83
Purposive Sampling 84
Emily’s Case 84
Chapter Summary 85
Review and Discussion Questions 85
Key Terms 86

Chapter 6  Data Collection 87


Identifying Data Collection Methods 88
Emily’s Case 88
Jim’s Case 90
Mary’s Case 90
Types of Data 91
Survey 91
Advantages of Surveys 92
Survey Errors 92
Writing Survey Questions 94
Types of Questions 94
Key Considerations in Wording Survey Questions 94
Key Considerations for Response Options 95
Operationalizing the Concept 96
Mode of Survey Administration 98
Emily’s Case 99
Interview 101
Interview Guide: Instrument for Qualitative Data Collection 101
Focus Group 102
Other Qualitative Data Collection Methods 106
Mary’s Case 107
Using Secondary Data 109
Jim’s Case 109
Ethical Considerations in Data Collection 110
Chapter Summary 112
Review and Discussion Questions 112
Key Terms 113

SECTION II.  DATA ANALYSIS 115

Chapter 7  Quantitative Data Preparation and Descriptive Statistics 116


Preparing for Analysis and Using Descriptive Statistics 118
Emily’s Case 118
Jim’s Case 119
Starting Data Analysis 120
Preparing Data for Analysis 121
Levels of Measurement 122
Descriptive Statistics: Overview 126
Measures of Central Tendency 126
Mean 127
Median 128
Mode 130
Which Measure of Central Tendency to Use? 131
Measures of Variability 131
Range 133
Variance 134
Standard Deviation 136
Measures of the Shape of a Distribution 137
Chapter Summary 144
Review and Discussion Questions 144
Statistics Exercise 145
1. Emily’s Data 145
2. Jim’s Data 145
Step-by-Step Instructions for Running Descriptive
Statistics Using SPSS 145
Step-by-Step Instructions for Running Descriptive
Statistics Using Excel 147
Key Terms 149

Chapter 8  Hypothesis Testing and Statistical Significance: Logic of Inferential Statistics 151
Using Inferential Statistics 152
Emily’s Case 152
Jim’s Case 153
What Are Inferential Statistics? 154
Developing Hypotheses 155
Types of Variables in the Hypothesized Relationship 155
Emily’s Case 156
Hypothesis Testing 158
Statistical Significance 160
Level of Significance 160
Probability, Normal Distribution, and Sampling
Distribution of the Mean 162
Normal Distribution 162
Sampling Distribution of the Mean 162
Summary of Hypothesis Testing Steps 166
Errors and Risks in Hypothesis Testing 166
Statistical Significance Versus Practical Significance 168
Chapter Summary 169
Review and Discussion Questions 169
Key Terms 170

Chapter 9  Comparing Means Between Two Groups 171


Comparing Two Groups 173
Emily’s Case 173
Jim’s Case 174
Types of Research Questions T-Tests Can Answer 174
Why Conduct T-Tests? 175
Background Story of the T-Test 175
One-Sample T-Test 175
Running One-Sample T-Test Using Software Programs 176
Independent Samples T-Test 178
Equality of Variance 179
Jim’s Case 179
Running Independent Samples T-Test Using SPSS 180
Independent Samples T-Test Using Excel 184
Jim’s Case 185
Paired Samples T-Test 186
Running Paired Samples T-Test Using SPSS 187
Running Paired Samples T-Test Using Excel 189
Chapter Summary 190
Review and Discussion Questions 191
Exercises 191
Key Terms 192

Chapter 10  Comparing Means of More Than Two Groups: Analysis of Variance (ANOVA) 193
Comparing More Than Two Groups 195
Emily’s Case 195
Jim’s Case 196
Introduction to ANOVA 196
Two Types of ANOVA 196
Why Conduct ANOVA? 197
Understanding F-Statistic 197
What ANOVA Tells Us 199
Post Hoc Tests 200
Effect Size: Eta Squared 201
One-Way ANOVA 201
Note on Sample Sizes for the One-Way ANOVA 202
Running One-Way ANOVA Using SPSS 203
Running One-Way ANOVA Using Excel 209
Side Note: Omnibus Test Is Significant but Post Hoc
Test Is Not Significant 209
Repeated Measures ANOVA 210
Running Repeated Measures ANOVA Using SPSS 211
Running Repeated Measures ANOVA Using Excel 215
Other Types of ANOVA 217
Factorial ANOVA 217
Mixed Design ANOVA 217
Chapter Summary 218
Review and Discussion Questions and Exercises 219
Key Terms 220

Chapter 11  Bivariate Correlation 222


Examining Relationships 223
Emily’s Case 223
Mary’s Case 224
Pearson Product Moment Correlation 224
Direction of the Relationship 225
Strength of the Relationship 226
Visual Presentation of a Correlation: The Scatterplot 226
Note on Linear Versus Curvilinear Relationship 229
Testing Hypothesis and Statistical Significance for Correlation 230
Running Correlation Using Software Programs 231
Running Pearson Product Moment Correlation Using SPSS 231
Running Correlation Using Excel 235
Correlation Does Not Imply Causality 236
Chapter Summary 237
Review and Discussion Questions and Exercises 237
Key Terms 238

Chapter 12  Chi-Square Analysis 239


Examining Relationships Between Two Categorical Variables 240
Emily’s Case 240
Mary’s Case 241
Chi-Square Analysis 242
Calculating Chi-Square Statistics and Testing Statistical Significance 243
Note on Sample Size for Chi-Square Analysis 245
Running Chi-Square Analysis Using Software Programs 245
Running Chi-Square Using SPSS 245
Running Chi-Square Using Excel 249
Chapter Summary 251
Review and Discussion Questions and Exercises 252
Key Terms 252

Chapter 13  Regression Analysis 253


Predicting Relationships 255
Emily’s Case 255
Mary’s Case 256
Linear Regression Analysis 257
Regression Equation and Regression Line: Basis for Prediction 258
Assessing the Prediction: Coefficient of Determination (R2) 262
Assessing Individual Predictors: Regression Coefficient (b) 265
Running Bivariate Regression Using Software Programs 265
Running Bivariate Regression Using SPSS 265
Running Bivariate Regression Using Excel 269
Multiple Regression 270
Multicollinearity 271
Using Dummy Variables in the Multiple Regression 271
Running Multiple Regression Using Software Programs 273
Running Multiple Regression Using SPSS 273
Running Multiple Regression Using Excel 277
Mary’s Case 278
Brief Comment on Other Types of Regression Analyses 278
Chapter Summary 279
Review and Discussion Questions and Exercises 279
Key Terms 280

Chapter 14  Qualitative Data Analysis 281


Collecting and Analyzing Qualitative Data 282
Emily’s Case 282
Mary’s Case 283
Qualitative Versus Quantitative Data Analysis 284
Approaches to Qualitative Data Collection 285
Preparing Data for Qualitative Analysis 285
Thematic Analysis of the Qualitative Data 286
Mary’s Case 287
Brief Comment on the Qualitative Data Analysis Software 289
Analyzing Qualitative Data by Converting Them Into Numbers 291
Mary’s Case 291
Issues in Qualitative Data Collection and Analysis 292
Selection of Study Participants 292
Interviewer Effect 293
Subjective Nature of the Analysis 294
Chapter Summary 294
Review and Discussion Questions and Exercises 295
Key Terms 296

SECTION III.  SUMMING UP: PUTTING THE PIECES TOGETHER 297

Chapter 15  Writing Reports 298


Data Collected and Analyzed—Then What? 299
Emily’s Case 299
Jim’s Case 300
Mary’s Case 301
Key Issues When Writing Reports 302
Understanding Your Audience 302
Academic Style Reporting Versus Nonacademic Style Reporting 302
Key Components of the Report 304
Abstract or Executive Summary 304
Table of Contents 305
Introduction 305
Review of the Literature or Project Background 305
Methods 305
Results 307
Discussions and Conclusions or Recommendations 308
References 308
Notes 308
Appendix 309
Alternative Forms of Reporting 309
Chapter Summary 310
Review and Discussion Questions and Exercises 310
Key Terms 311

Chapter 16  Using Research Methods for Continuous Improvement: Program Evaluation and Performance Measurement 313
Using Research in Program Evaluation and Performance
Measurement 314
Emily’s Case 314
Jim’s Case 315
Mary’s Case 316
Program Evaluation and Performance
Measurement as Research 316
Difference Between Program Evaluation and
Performance Measurement 317
Ty and Mary at the Conference 318
Key Issues in Program Evaluation 319
Types of Evaluation 319
Key Issues in Performance Measurement 322
Types of Performance Measurement 323
Who Conducts Program Evaluation and Performance
Measurement? 323
Ethical Considerations in Program Evaluation and
Performance Measurement 324
Practitioners Becoming Researchers: Making Sense of It All 325
Round Table Discussion at the Conference 325
Chapter Summary 330
Review and Discussion Questions 330
Key Terms 332

Appendix A: Additional SPSS and Excel Instructions 333


Appendix B: Emily’s Survey Form 339
Glossary 342
Index 356
About the Authors 366

Preface

Why Another Research Methods and Statistics Book?

We wrote this textbook on research methods and statistics for public and nonprofit
administrators to emphasize two aspects of research that we believe do not receive
sufficient attention in existing textbooks: the direct relevance of research methods and
statistics to practical issues in everyday work situations for public and nonprofit
administrators, and the importance of aligning research components in a logical man-
ner to produce meaningful results. Students and professionals in public and nonprofit
administration need research skills to become effective decision makers. Managers
need to understand how to conduct and evaluate research to become effective leaders.

Relevance
In our research methods and statistics classes, we found ourselves spending a lot of
time explaining to students why knowledge of research methods and statistics is
important to become effective managers. Many were skeptical. Yet, year after year, we
hear from students who got a job or an internship opportunity, coming back to tell us,
“I’m glad I kept the statistics book. It saved my life,” or “I use what I learned in your
class every day in managing my unit.” Our main purpose in writing this book was to
make this point clear to other students and to practitioners who may find themselves
needing help with their research design, statistics skills, and knowledge to solve practi-
cal issues at work. In our experience, public and nonprofit administrators, managers,
analysts, coordinators, and others are routinely using what they learned in their
research methods and statistics classes to help make decisions, shape policies, evaluate
programs, and manage resources. We think you will, too, and we want you to be pre-
pared for it. We provide practical examples to illustrate ways to use research methods
and statistics in different public and nonprofit work settings.

Alignment
Over the years, we have observed that even after completing research design and meth-
ods courses, students and practitioners often struggle with framing the basic elements
to develop and implement a successful research project. A good understanding of

xvi
Preface ❖  xvii

specific components of a research project, such as experimental design or survey data
collection or using a chi-square statistical test with categorical data, does not mean all
the components fit together in a logical manner to provide convincing results for a
specific research objective. In this book, we want to help students and practitioners
develop viable research projects that work as intended. Partly, this means aligning the
components of the research process. We cover simple steps to help with this alignment
in the first few chapters. In later chapters, we describe research designs, data collection
methods, and types of data, followed by appropriate use of different statistical tests
with quantitative data; and how to code, analyze, and report qualitative data. We hope
this step-by-step discussion, from concept to reporting, will help guide you as a
researcher while developing your own projects. This is especially important for
students and practitioners in public and nonprofit administration who aspire to attain
higher-level managerial positions that may involve overseeing research activities.
Understanding the alignment of research components is also important for graduate
students who need to develop and implement a master’s thesis or doctoral dissertation.

Who Is This Book For?

Research Methods and Statistics for Public and Nonprofit Administrators: A Practical
Guide is intended for upper-division and graduate-level research design and research
methods courses, particularly for students in the fields of public affairs, public admin-
istration, nonprofit management, and public policy. Students in other social science
disciplines, as well as business and executive courses, should also find the textbook
useful as an accessible guide to research methods and statistics. Graduate students can
use this book as a supplemental guide to clarify the research process.
We also wrote this book for practitioners working in the public and nonprofit
sectors. This book can be used as a textbook for professional development courses for
researchers and analysts. Practitioners who engage in research projects may want to
keep this book as a resource guide.

What Are the Key Features of This Book?

Case Stories
At the beginning of this book, the reader is introduced to three main characters: Jim,
the fire chief; Emily, the HR director; and Mary, the program manager of a nonprofit
organization. Throughout the book, the case stories show how each of the characters
approaches the research process to address different organizational challenges. The
case stories illustrate how research is relevant to specific public and nonprofit prac-
tices. The case stories also help the reader to understand how different research com-
ponents should be aligned throughout the whole research process. The case stories
are accompanied by cartoon illustrations to help the reader identify and follow the
different cases.

Address Both Research Design and Data Analysis in One Book


This book discusses both research design and data analysis together to highlight
the connection among research questions, research design, and data analyses. Instead
of focusing on research design or statistical analysis independently, we make the con-
nections among the components explicit by discussing the whole research process.

Focus on Application
This book is a practical guide to research methods for students and practitioners in
public and nonprofit administration and policy. We want students to recognize the
importance and utility of research in their professional lives and to master research
methods that are immediately applicable in professional settings. Consequently, we
keep mathematical explanations of statistical concepts to a minimum. We introduce
mathematical formulas only when we think it will help the reader understand the logic
of the statistical test under discussion. We hope this approach will help those who have
a statistics phobia develop an initial understanding of the use of statistics in
research and help relieve the phobia down the line.

Include Step-by-Step Instructions for SPSS and Excel for Statistical Analysis
Most of the statistical analysis in the field of public and nonprofit administration is
conducted using statistical analysis software. While there are a variety of software pro-
grams available to students and practitioners, SPSS and Excel are among the most
popular and widely used. In this book, we provide visual, step-by-step instructions for
the analyses introduced in the book, using both SPSS and Excel. By following the
instructions, readers can immediately practice using these software programs
and apply the processes in their own research projects.

Review and Discussion Questions and Exercises


Each chapter concludes with a list of review and discussion questions. These questions
can be used for in-class group discussions or for individual written assignments. In
Part II, where data analysis and statistical tests are introduced, the review and discus-
sion questions include exercises that use web-based data sets. Students can practice
each statistical analysis with these data sets, using SPSS and Excel. The exercises are
drawn from questions that emerge in the case examples included in each chapter,
which offer a practical context to engage in the research process. We hope students will
enjoy this manner of presentation, with fictional characters facing real issues.

Ancillaries

Instructor Teaching Site


Password-protected instructor resources are available at www.sagepub.com/
nishishiba1e to help instructors plan and teach their courses. These resources have
been designed to help instructors make the classes as practical and interesting as
possible for students.
The instructor resources include the following:

• A Microsoft® Word® test bank is available, containing multiple choice, true/
false, and essay questions for each chapter. The test bank provides you with a
diverse range of pre-written options as well as the opportunity to edit any
question and/or insert your own personalized questions to effectively assess
students’ progress and understanding.
• A Respondus electronic test bank is available and can be used on PCs. The test
bank contains multiple choice, true/false, and essay questions for each chapter
and provides you with a diverse range of pre-written options as well as the
opportunity for editing any question and/or inserting your own personalized
questions to effectively assess students’ progress and understanding. Respondus
is also compatible with many popular learning management systems so you can
easily get your test questions into your online course.
• Editable, chapter-specific Microsoft® PowerPoint® slides offer you complete
flexibility in easily creating a multimedia presentation for your course.
Highlight essential content, features, and artwork from the book.
• Suggested classroom exercises are designed to promote students’ in-depth
engagement with course material.

Student Study Site


An open-access student study site can be found at www.sagepub.com/nishishiba1e.
The student study resources include the following:

• Data sets are posted to the web site for students to use as they apply their
knowledge through hands-on activities in the book.
• Sample result write-ups are provided for several chapters in order to give
students valuable examples of how to write research findings.

Acknowledgments

We are grateful to many people who helped us make this book possible. We
thank all the students at Portland State University who took our Analytic
Methods classes and gave us valuable feedback. We thank our teachers and mentors
who taught us all we know about research methods and statistics. In particular, we
appreciate the instruction of Dr. David Morgan, Dr. Jason Newsom, Dr. Margaret Neal,
Dr. Brian Stipak, Dr. David Ritchie, Dr. Susan Poulsen, and Dr. Peter Ehrenhaus.
Special thanks go to the following people for their help and support.
Jillian Girard and Caroline Zavitkovski helped us review and edit the manuscript.
They also went through the manuscript in detail and helped us develop the glossary.
Kyoko Hosoe did the illustrations of Emily, Jim, and Mary and helped us draw the bell
curves used in the figures. Dr. Terry Hammond provided extensive editing support
and helped us incorporate reviewers’ comments and streamline the manuscript.
We also appreciate Phil Keisling, director of the Center for Public Service, and Sara
Saltzberg, assistant director of the Center for Public Service, who gave the authors spe-
cial consideration and relieved us from other duties so we could concentrate on writing
this book.
The editorial staff at Sage Publications, in particular Maggie Stanley, Patricia
Quinlin, and Katie Guarino, provided valuable guidance and patience throughout the
project. We are also grateful for production support by Megan Koraly and the initial
contribution of Lisa Cuevas Shaw, who helped us develop the idea for this book.
Many reviewers commissioned by Sage Publications gave us valuable insights and
feedback. This book benefited from their suggestions. We want to thank the thoughtful
contributions of the following reviewers to the chapter drafts:

Matthew Cahn, California State University Northridge


David W. Chapman, Old Dominion
Natasha V. Christie, University of North Florida
A. Victor Ferreros, School of Public Policy and Administration, Walden University
Sheldon Gen, San Francisco State University
John D. Gerlach, II, Western Carolina University
Marcia Godwin, University of La Verne


James S. Guseh, North Carolina Central University


Peter Fuseini Haruna, Texas A&M International University
Dan Krejci, Jacksonville State University
Edward Kwon, Northern Kentucky University
Aroon Manoharan, Kent State University
Charles Menifield, University of Missouri, Columbia
James A. Newman, Western Carolina University
Lee W. Payne, Stephen F. Austin State University
Holly Raffle, Ohio University
Manabu Saeki, Jacksonville State University
Robert Mark Silverman, University at Buffalo
Feng Sun, Troy University
William Wallis, California State University, Northridge
Matthew Witt, University of La Verne
SECTION I: Research Design and Data Collection
1 ❖ When a Practitioner Becomes a Researcher

Learning Objectives 3
Case Descriptions of Practitioners Becoming Researchers 3
Emily: HR Director at a City 3
Jim: Deputy Fire Chief 4
Mary: Manager at a Nonprofit Organization 6
Purpose of This Book 6
Research Skills as Leadership Skills 8
1. Research skills help you develop problem solving, solution
construction, and social judgment skills. 8
2. Research skills help you acquire resources. 8
3. Research skills help you allocate resources more effectively. 9
4. Research skills help you advocate. 10
5. Research skills help you become a better decision maker. 10
6. Research skills support ethical leadership. 10
Chapter Summary 11
Review and Discussion Questions 11
Key Terms 12


Learning Objectives

In this chapter you will

1. Get acquainted with the three main characters in the case descriptions used
throughout the book
2. Become familiar with situations where research is an important component of
practice in the public and nonprofit sectors
3. Learn how research skills are an important component of leadership for public
and nonprofit administrators

Case Descriptions of Practitioners Becoming Researchers

Research is not just for professors in academia or scientists at a lab. Research is part
of every practitioner’s job when important decisions are being made. Practitioners
often face challenges that require research to identify the nature of problems and pos-
sible solutions.
Students with career goals to work in the public or nonprofit sectors should be
prepared to conduct, manage, and evaluate research. Throughout this book, the cases
of Emily, Jim, and Mary will illustrate instances when research is an important part of
routine work activities in public and nonprofit organizations. Each case represents a
different type of organization, roles within the organization, and goals that introduce
research into everyday practice.

Emily: HR Director at a City


Emily is the human resource (HR) director of the city of Westlawn,
with a population of 35,000. The city has approximately 500
full-time employees. Emily reports directly to the city manager
and oversees the human resources of the city departments,
including the Library, Parks and Recreation, Street, Water and
Sewer, Public Works, Community Development, Transit Authority,
Fire, Police, Finance, IT, HR, City Manager’s Office, and the City
Attorney’s Office.
Emily joined the city three years ago as the HR director.
Previously, she worked five years as a training manager at a neigh-
boring county. She was also active in promoting diversity and cultural awareness at the
county. At the city of Westlawn, Emily noticed the employees seemed to be less sensitive
about cultural issues. Comments and jokes sometimes referenced women or racial or sex-
ual minorities in ways that came across as insensitive and inappropriate. In the three years
she had worked at the city, Emily noticed the demographic composition of the community
had changed dramatically with a growing Hispanic population, drawn partly by the
booming farming industry in the area. Emily was concerned that the lack of cultural sen-
sitivity among the city employees could affect community and workplace relations as the
local population grew more diverse.
About six months ago, an employee filed a complaint against a supervisor for making
angry ethnic slurs. Emily was required to investigate. The results indicated that the
supervisor’s comments and behavior were discriminatory. This incident brought the issues of
diversity and cultural competence Emily had been pondering into focus. After three years on the
job, she felt she had gained enough trust from the employees as HR director to take an
active position. The city manager, Bob, gave her the go-ahead on a proposal to conduct
diversity training classes for the city’s employees.
To get started, Emily applied for a grant from the Community Foundation. In her pro-
posal, she specified four objectives:

• Increase awareness and understanding among city employees about diversity issues
• Develop cultural competence among city employees
• Decrease tension within work units due to poor communication and lack of
knowledge about diversity and cultural issues
• Provide tools for supervisors and managers to assess potential diversity issues in the
workplace and a system for increasing communication and promoting tolerance

She also stated that she would (1) conduct an assessment of the organizational culture
and (2) implement training and events to promote diversity and enhance cultural compe-
tence among the employees.
To Emily’s delight, the Community Foundation granted her the full amount she
requested. The award letter noted that Emily and her project team should meet with the
foundation’s program officer to finalize the project design. The letter also stated that the
foundation required all grantees to conduct an evaluation of the funded program and
include a statement of impact in the final report. Reading the letter from the Community
Foundation, Emily thought, “Hmm—it sounds like I will need to collect data to show the
training makes an impact. It says I need to do a ‘baseline assessment’ and report the result
of my ‘program evaluation.’ I’ll have to think how to do that.”

Jim: Deputy Fire Chief


Jim is the deputy fire chief for the city of Rockwood Fire Department, which serves
a population of 250,000. The department has eight fire stations and employs
500 uniformed and nonuniformed personnel. The city is proud of the Fire Department,
which has been in existence for 125 years and is consistently one of the top-rated
units in the city administration. In a recent citizen survey of service efforts and
accomplishments across all city departments, the Fire Department received the
highest grade.
Last year, a new fire chief was hired from outside the department, and Jim
was promoted to his new role as deputy chief. Chief Chen, the new chief,
Chapter 1  When a Practitioner Becomes a Researcher  ❖  5

came from a special district that served a population of 175,000. He was well known
and respected in the fire service as an innovative thinker and practitioner. The city
manager’s decision to hire the new chief was partly based on Chief Chen’s accom-
plishment in transforming the fire district into a data-driven, lean organization.
Although the Rockwood Fire Department received the highest citizen satisfaction rat-
ings, the city manager believed there was room for improvement. He forecasted dwin-
dling budgets for the department and wanted the department to operate more
efficiently.
The new chief placed Jim in charge of planning and operations for the department. The
department had collected various data for several years, but no one had analyzed it in any
systematic manner. Jim was given the task to analyze the performance of the organization
and evaluate some of its programs. Chief Chen and the city manager agreed that this
approach would help determine what did and did not work and what factors might con-
tribute to better outcomes. In addition, Chief Chen wanted to seek accreditation from an
international standardization body. Accreditation would serve as a “stamp of approval”
within the fire service. To obtain accreditation, the organization must have a comprehensive
performance analysis plan and evaluation capacity.
Jim recognized the importance of his new management tasks but was nervous, because
he did not have experience with research or data analysis. He had heard about other fire
departments pushing a “data-driven approach,” but he was not sure what that meant
exactly. The department did not have anyone dedicated to research or data analysis, so Jim
was on his own. Chief Chen asked Jim to look into two things: first, Jim was to analyze the
department’s response time to calls for service, compare the different stations to each other
and to national standards, and observe any changes in performance during the past few
years. Response time is a common performance measure for fire service and is required
information for accreditation. Response time is also important in citizen perception and
outcomes. This focus made sense, and Jim thought he might be able to find examples to
help him do the analysis.
The second task appeared more challenging. Chief Chen asked Jim to explore the
effectiveness of an “alternative service model.” He was hoping to deploy fewer, focused
resources to each service call. In Rockwood, a fire engine with four firefighters was dis-
patched to every service call. Emergency medical calls made up 92% of all calls, of
which 85% were for minor incidents such as sprained ankles. Chief Chen thought send-
ing out a fire engine with four firefighters to all calls was costly and ineffective, especially
when a second call at the same time required another engine to be dispatched from a
farther distance away. Furthermore, for the remaining 15% of medical emergency calls
that were more serious, the four firefighters with the engine were sometimes unable to
provide sufficient medical aid, because there were not enough firefighters at Rockwood
who qualified as paramedics. This situation had implications for mortality rates
associated with the emergency calls, which are frequently used as a performance measure for the
fire service. Chief Chen wanted to explore alternative service models, perhaps dispatching
a car with one trained nurse and one firefighter or just a physician’s assistant to first
evaluate what was needed. He hoped such a change could be more efficient and more
effective at saving lives.
After a long meeting with Chief Chen, Jim said to himself, “Both of these issues will
require some research. Can I do it? Should I hire a consultant? Do we have a budget
for it?”

Mary: Manager at a Nonprofit Organization


Mary is a program manager at Health First, a large urban nonprofit organization.
The organization has 200 paid employees and 60 volunteer positions, primarily
operating from grant funds and private donations. As with most nonprofit organiza-
tions, resources are tight. Much of the work depends on volunteers, who generally
have little incentive to stay or show loyalty to the organization. The organization
invests valuable resources in training and managing the volunteers. Mary is respon-
sible for recruiting and retaining volunteer workers who conduct several canvassing
activities, including direct mailings and organization-sponsored events. Volunteers
can be trained relatively quickly for specific functions, but their overall performance
also needs to align with the organization’s mission and values. Because fund-raising is
central to the organization, public perception is extremely important. Consequently,
Mary needs to vet new volunteers, which adds to the work involved.
Mary was finding it increasingly difficult to recruit and retain volunteers. She was
25% below the organizational goal for the number of volunteers. Fifteen volunteer posi-
tions were vacant. Additionally, the volunteers were not staying very long. The turnover
rate appeared to be increasing, but Mary did not have data to confirm the impression.
She did not know how long volunteers typically stayed. She did not have clear informa-
tion as to why they were leaving, because she was not conducting exit interviews.
The organization had a commitment to understand the needs of the volunteers, the
reasons they do the work, and the conditions that may entice them to stay. The executive
director noticed the gaps in the volunteer workforce, and she asked Mary to assess the sit-
uation and report her findings and suggested solutions. Mary’s first thought was to admin-
ister a survey to the existing volunteers. With survey results, she could create charts and
figures and deliver a clear presentation to the executive director and board of directors. Plus,
she could use her number-crunching skills to impress her boss. When she started thinking
what questions to ask in the survey, however, she started to feel confused. “What am
I supposed to ask them? And what does the survey tell us?”

Purpose of This Book

This book is a practical guide to research methods for students and practitioners in
public and nonprofit administration. The premise of this book is that research is an
integral part of the job for those who work in the public and nonprofit sectors.
As a practitioner, you are lucky if you have a choice to delegate research to oth-
ers. Even if your organization has a research department, you are likely to find the
analysts are overloaded and do not welcome extra work. You could consider hiring a
consultant, but then you need to have a budget to pay a high rate for the service. In
any case, finding someone else inside or outside your organization to do the research
does not mean you are totally off the hook. You are the one who is ultimately
accountable for the results.
Let us suppose that Jim convinced the chief to hire a consultant to do a study on
response time. After waiting a few months, the consultant comes back with a report
that says, “The result of the one-sample t-test shows the response time for Rockwood
Fire Department is .53 minutes lower than that of the national standard with statis-
tical significance (p < .05).” So what does that mean? Jim needs to understand the
language in the report, determine if the methods are sound and the data are accurate,
and finally, interpret the practical significance of the results. The quality of the work
overall is important, too. The presentation needs to be clean, consistent, and com-
plete, so Jim can be confident in pulling out parts to present to others and can answer
questions. In other words, as a practitioner, even if you do not do research, you need
to be able to judge the quality of the research and be an educated consumer.
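Consider what Jim would have to unpack in the consultant’s sentence. A one-sample t-test like the one in the report can be run in a few lines of Python using scipy; the response times and the 6.5-minute standard below are invented for illustration and are not taken from the Rockwood case.

```python
# A hypothetical one-sample t-test, similar in form to the consultant's
# report. The response times (in minutes) and the 6.5-minute "national
# standard" are invented for illustration.
from scipy import stats

response_times = [5.2, 6.1, 5.8, 6.4, 5.5, 6.0, 5.9, 6.3, 5.7, 6.2]
national_standard = 6.5

t_stat, p_value = stats.ttest_1samp(response_times, national_standard)
sample_mean = sum(response_times) / len(response_times)

print(f"Sample mean: {sample_mean:.2f} minutes")
print(f"Difference from the standard: {sample_mean - national_standard:+.2f} minutes")
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# If p < .05, the difference is "statistically significant" in the sense
# the consultant's report uses the phrase.
```

The report’s conclusion is simply a verbal rendering of the last line: the department’s mean response time differs from the standard by an amount unlikely to be due to chance alone.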
This book emphasizes practical applications of research. With the help of Emily,
Jim, and Mary and their research challenges, we will cover basic issues in designing,
implementing, and using research in the public and nonprofit sectors. Our goal is to
help you recognize the importance of research in professional practice and master
research methods that are immediately applicable.
The cases of Emily, Jim, and Mary are fictional, but the research examples repre-
sent real projects. Facing three different research situations from the beginning will
highlight specific challenges and decisions, procedures, and methods through the
course of the projects. Occasionally, we introduce mistakes. Our characters are not
experienced researchers, but their mistakes should be seen as part of the nature of
research, not flaws in the researchers. Every researcher needs to be prepared to detect
and correct wrong turns and validate results.
A key feature of this book is its emphasis on the alignment of research compo-
nents. Sound research requires an approach that integrates and aligns the following
research components (in bold):

• How does one identify and articulate a focused research objective?
• How does one formulate a research question that addresses an existing operational, social, or community problem?
• How does the research question guide the research design, sample selection, and data collection?
• What data analysis approach best answers the research question?
• What do the results of the analysis tell us? Was the research question answered? How should the results of the research be reported to inform practice?

More details on research alignment will be covered in Chapter 2. The examples
of research represented in the cases of Emily, Jim, and Mary will help ground the con-
cepts. You will see each one formulate a research question, design data collection,
analyze and interpret the data, and report the results.
Let us briefly note what this book is not. This book is not about mathematical
statistics. We devote a minimum amount of space to statistical formulae and terminol-
ogy. Statistics is one of the many analytical tools available to public and nonprofit managers,
so we include some discussion of basic applications—enough to understand what sta-
tistical analysis can do for you. But statistics is only a small part of what makes research
effective. Getting the basics of research design right is the first critical step. We want to
clarify the course of research from beginning to end and make the journey accessible to
everyday practitioners in the public and nonprofit sectors. Along the way, we will discuss
when and how to use t-tests, ANOVA, chi-square analysis, and regression analysis. We
will not cover advanced analytic techniques, such as linear programming, structural
equation modeling, or factor analysis. By the same token, we will touch on qualitative
data analysis but will not go into details. Many good books are available for more
in-depth discussion of various statistical applications and qualitative data analysis.

Research Skills as Leadership Skills

We believe developing research skills helps practitioners become better managers and
leaders. Trainers and academic instructors also emphasize that training public and
nonprofit managers in the skills and techniques for leadership should be a priority
(e.g., Day, Harrison, & Halpin, 2009; Sims, 2002). Here are the reasons why:

1. Research skills help you develop problem solving, solution construction, and social judgment skills.
Organizational psychologists Mumford, Zaccaro, Harding, Jacobs, and Fleishman (2000)
propose a model that states that leader performance is based on (1) complex problem
solving skills, (2) solution construction skills, and (3) social judgment skills. They say
leaders need to have skills to identify significant problems that need to be solved in con-
texts where oftentimes the problems are complex and ill defined. Consequently, leaders
need to gather information. Based on the information gathered, they formulate ideas and
come up with a plan to solve the problem. Leaders then need to find out how to persuade
people to work with them and implement the solution. This mirrors the research process
in many ways. First, you identify the problem (or research question), then gather infor-
mation (or data), analyze the information, interpret the results, and find a recommended
solution. As with research, documenting the process is important. Once you have a solu-
tion, you may need to tell your audience how you got there to persuade them to adopt it.
The elements of research are familiar features of group decision making. Understanding
research skills will help you develop key leadership skills.

2. Research skills help you acquire resources.


Leaders in public and nonprofit organizations are often expected to acquire resources
for their programs and projects. When financial resources are scarce, managers need
to look for external funding sources to be able to implement new projects. Like Emily,
the HR director, you may have to seek grant funding from foundations and govern-
ment entities. Oftentimes, when you apply for the grant, the funding decision is made
based on (1) whether you identify compelling issues, (2) evidence that the project you
are suggesting would solve the issues you identified, and (3) the soundness of your
plans to evaluate whether your project has produced the desired outcome. Your research
skills will help you address these three areas. First, if you know how to identify a
research problem, you can apply the same logic to identify a compelling issue that
needs funding. Second, if you know how to review and assess the reports or projects
that were tried and tested by others, then you can easily compile evidence that sup-
ports why you think the approach you are taking for your project is sound. In
research, this process is known as a literature review. Third, you need to be able to
provide a plan to assess the effectiveness of your program. This means you need to
know what information to collect and in what way and how to organize the informa-
tion to demonstrate effectiveness. Applying research skills to write a grant proposal
will increase your chances of getting funded.
Your ability to successfully evaluate and demonstrate the effectiveness of your pro-
gram is especially important. If you do not do a good job demonstrating the outcome
of your project, then the funders may not continue financing your project in the future.
Just saying, “This project produced a good outcome; I just knew it would,” will not work.
Applied research is most commonly used by public and nonprofit managers in program
evaluation. More details on program evaluation are covered in Chapter 16.
Resources also arrive through budget decisions. Applied research may be useful
here to make a case to fund your program proposals over others. This use of research
skills is called a needs assessment. Look at Mary’s situation as an example. She could
use an assistant to help her with volunteer management. With the number of volunteers
dwindling, however, she does not have much cachet to request an increase in
budget and personnel for her unit. She can make a stronger case if she makes a pro-
posal that provides a plan for increasing the number of volunteers and their commit-
ments. Her proposal will be more convincing if it is based on testimonials from
potential and existing volunteers. Managers most commonly use needs assessment in
strategic planning. This topic will also be covered in more detail in Chapter 16.

3. Research skills help you allocate resources more effectively.


Leaders and managers in public and nonprofit organizations are not only expected
to acquire resources but also manage and allocate available resources to achieve
results (Osborne, Plastrik, & Miller, 1998). Considering resource shortages and the
lean management initiatives taking hold in the public, private, and nonprofit sec-
tors, allocating resources efficiently is increasingly important. Through research,
you will be able to assess performance and evaluate results and make well-supported
allocation decisions.
Jim’s case at the Rockwood Fire Department provides a good example. Chief Chen
thinks sending out four firefighters for all medical emergency calls may be inefficient,
but he is not willing to make changes without first examining the costs and outcomes
of the proposed alternatives. He does not want to sacrifice effectiveness in saving lives.
Jim’s research is expected to provide the chief with valid and reliable information on
the pros and cons of the different models.

4. Research skills help you advocate.


Advocacy becomes important when support is needed from political leaders or the
general public outside the organization. Occasionally, public-sector organizations ask
voters to approve a bond to support their operations or a policy change to provide bet-
ter service. Nonprofit organizations usually need to keep convincing donors to con-
tribute. In both situations, strong research skills will help.
Following Jim’s case again, what if the results of his research favor the alternative
of dispatching a nurse and a firefighter in a car, but the mayor, members of the city
council, or citizens do not like the idea? To be effective advocates—and believe in the
alternative service idea themselves—Jim and Chief Chen will need solid research with
convincing evidence.

5. Research skills help you become a better decision maker.


Knowledge from research will help you make better decisions. How effectively the
research is utilized in the final decision depends on the decision makers’ ability to assess
and comprehend the information. Suppose a consultant you hired to examine your
employees’ work satisfaction delivers a report that says something like the following.

Mean ratings of the level of satisfaction for the three departments (Human
Resources, Finance, and Community Outreach) were 3.45, 3.01, and 3.66
respectively. A one-way analysis of variance showed statistical significance
(p < .001). A post hoc analysis (Tukey, 1977) for all pair-wise comparisons
confirmed statistically significant differences between Human Resources–
Finance and Finance–Community Outreach pairs.

You as a manager need to be able to understand what this means and decide how to
apply the result in your day-to-day operations. Knowledge of the research
process and analytic methods will help you digest academic reports and critically eval-
uate the reliability of the research. The best way to become an educated consumer of
research is to develop research skills yourself.
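As a sketch of what lies behind such a paragraph, the analysis can be reproduced with invented data. The ratings below are hypothetical, chosen only so the group means fall near those in the report, and the example assumes scipy 1.8 or later, which provides tukey_hsd.

```python
# Hypothetical satisfaction ratings (1-5 scale) for three departments,
# invented so the group means fall near 3.45, 3.02, and 3.67.
from scipy import stats

hr       = [3.1, 3.7, 3.3, 3.7, 3.4, 3.5]   # Human Resources
finance  = [2.7, 3.3, 2.9, 3.3, 3.0, 2.9]   # Finance
outreach = [3.3, 3.9, 3.5, 3.9, 3.7, 3.7]   # Community Outreach

# One-way analysis of variance: do the three group means differ at all?
f_stat, p_value = stats.f_oneway(hr, finance, outreach)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")

# Tukey's HSD post hoc test: which specific pairs of groups differ?
result = stats.tukey_hsd(hr, finance, outreach)
print(result)
```

Reading such output is exactly the skill the consultant’s paragraph demands: the ANOVA says some difference exists among the departments, and the pairwise table says where it lies.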

6. Research skills support ethical leadership.


Good leaders are concerned about ethics (Price, 2008). As Ciulla (1998) indicates, ethics
is “the heart of leadership.” Research brings a number of ethical principles into focus.
For example, researchers learn to balance their desire to obtain and disseminate infor-
mation with the right to privacy and dignity of those who have the information. The
conduct of research must be transparent to assure confidence in the data and the
results and allow replication by others, and this public nature of research amplifies
considerations of ethical behavior. In addition, data need to be reliable and unbiased
by personal interests on the part of the researcher or information sources. Ethical
thinking learned in research will help you become mindful of ethics in other contexts
as well. Our case examples, as they proceed, will offer opportunities to explore typical
ethical issues in research.

Chapter Summary
In this chapter, we explained the purpose and principal theme of this book. We introduced the
cases of Emily (the HR director of the city of Westlawn), Jim (deputy fire chief of the city of
Rockwood), and Mary (manager at a nonprofit organization, Health First). We described the
situations they face that require them to conduct research. We also discussed reasons why research
is relevant for practitioners, or future practitioners, in the public and nonprofit sectors. We argued
that learning about research and developing research skills will help you become a better leader.

Review and Discussion Questions

1. Think about the projects you have conducted in the past. Did any of the projects involve
research? Share your experience. What was the experience like? What did you like about it?
What kinds of challenges did you face?
2. What kind of leader do you aspire to be? For you to be an effective leader, what kinds of skills
do you think you need to develop? Discuss how learning research will help you become an
effective leader.
3. If you were in Emily’s position, how would you evaluate the impact of the diversity training?
4. If you were in Jim’s position, what steps would you suggest to Chief Chen that you will take to
study the effectiveness of the alternative service model?
5. If you were in Mary’s position, what would you do to find the way to recruit and retain more
volunteers?
6. Think about an organization you are familiar with. (It can be an organization at which you are
currently working, or where you used to work. It can be an organization that you just have a
lot of information about.) List problems and challenges this organization is currently facing.
Can you think about how you would suggest obtaining information to address these problems
and challenges?

References
Ciulla, J. B. (1998). Ethics, the heart of leadership. Westport, CT: Quorum Books.
Day, D. V., Harrison, M. M., & Halpin, S. M. (2009). An integrative approach to leader development:
Connecting adult development, identity, and expertise. New York, NY: Psychology Press.
Mumford, M. D., Zaccaro, S. J., Harding, F. D., Jacobs, T. O., & Fleishman, E. A. (2000). Leadership skills for
a changing world: Solving complex social problems. Leadership Quarterly, 11(1), 11.
Osborne, D., Plastrik, P., & Miller, C. M. (1998). Banishing bureaucracy: The five strategies for reinventing
government. Political Science Quarterly, 113(1), 168.
Price, T. L. (2008). Leadership ethics: An introduction. Cambridge, NY: Cambridge University Press.
Sims, R. (2002). Understanding training in the public sector. In C. Ban & N. Riccucci (Eds.), Public personnel
management: Current concerns, future challenges (pp. 194–209). New York, NY: Longman.

Key Terms
Literature Review  9
Needs Assessment  9
Research Alignment  7

Student Study Site


Visit the Student Study Site at www.sagepub.com/nishishiba1e for these additional learning tools:

• Data sets to accompany the exercises in the chapter


2 ❖ Research Alignment

Learning Objectives 14
When the Research Process Is Not Aligned 14
Emily’s Case 14
Research Flow and Components 18
Overall Flow of Research 18
Step 1. Identifying the Focus of the Research (Research Objective) 18
Step 2. Identifying the Specific Questions You
Are Trying to Answer (Research Question) 20
Step 3. Identify How You Collect the Information
You Need (Research Design) 20
Step 4. Identify From Whom or What You Collect the Information
(Sample Selection) 21
Step 5. Collect the Data (Data Collection) 21
Step 6. Analyze the Data (Data Analysis) 22
Step 7. Interpret and Report the Results (Reporting) 23
Chapter Summary 24
Review and Discussion Questions 25
Key Terms 25
Figure 2.1 Research Flow and Components 19
Table 2.1 Emily’s Research Steps 24



13
Learning Objectives

In this chapter you will

1. Understand the seven research components: (1) research objective; (2) research
question; (3) research design; (4) sample selection; (5) data collection; (6) data
analysis; and (7) reporting
2. Learn the overall flow of the research process, corresponding to the seven
research components
3. Understand the importance of aligning the research components throughout
the research process

When the Research Process Is Not Aligned

Emily’s Case
Since Emily received the grant award notification from the Community
Foundation to implement a new diversity training at the city of Westlawn, she
and the training manager, Mei Lin, have been busy designing the curriculum and
planning for the training. One week before her meeting with Ahmed, the
Community Foundation program officer, Emily considered how she would evalu-
ate the outcome of the diversity training program. She knew this was something
Ahmed wanted to discuss.
Emily had developed new training programs many times in her previous job
but had never designed an evaluation. Colleagues in a separate department
had taken care of that. “It shouldn’t be too hard,” she thought. In the proposal, she
wrote that she would conduct an assessment of the organizational culture. She could
do a baseline assessment, conduct the training, and then evaluate how much things
changed after the training. She thought of using employee profile data to summarize
the demographic background of the employees for the baseline assessment. Then
she started to think about how to evaluate the training directly. She still had the
survey form used for trainings at her last job, which was developed by evaluation spe-
cialists. She found the form and read through the questions. It was short and to the
point:

• How do you rate the effectiveness of the instructor?
• How satisfied are you with the location of the training?
• How satisfied are you with today’s training?

“This will work,” Emily said to herself. She quickly typed up her evaluation plan to
present to Ahmed.
When Emily met with Ahmed, he first went over the rules on the use of the grant money
and requirements for financial reporting. Then he asked how she planned to evaluate the
outcome of the program. Emily was glad she had prepared for this discussion. She
outlined her plan and showed Ahmed the survey form. Ahmed listened intently and took notes
and then reviewed the survey.
“Emily, you have good ideas,” he said finally. “It appears to me, though, that there is a
misalignment between the objective of your training and what you are measuring to eval-
uate whether the training program accomplished its objective. Let’s think this through
together.”
Emily thought, “What does he mean by ‘misalignment’?” She was puzzled.
“Can you tell me the objective of the data collection process you outlined?” Ahmed asked.
“Well,” she said a little defensively, “the objective is to evaluate the effectiveness of the
training, isn’t it?”
“OK,” Ahmed said. “Now tell me, how do you know whether the training was effective
or not?”
Noticing Emily looked confused, Ahmed added, “Maybe you can think of the kinds of
issues or problems you would like to see go away as a result of the training. If these issues
or problems go away after the training, then you know the training was effective, right?”
Emily recalled the problems she described in her grant proposal. “What I really want to
do,” she said, “is to train the employees to be more sensitive to cultural differences. There
are lots of examples of high workplace tension due to some insensitive comments and
discriminatory practices.”
Ahmed jotted a note and replied, “So, is it fair to say that if you could see the level of
the employees’ cultural competence improve and the level of workplace tension decrease as
a result of the training, then you can say the training was effective?”
“Sure,” Emily agreed.
Ahmed continued, “Then let’s say, for now, that your research objective is to evaluate if
the training improves people’s cultural competence and decreases workplace tension. Can
you try to rephrase that objective into a research question?”
“OK—does the training improve cultural competence and decrease workplace tension?”
“Sounds right,” Ahmed responded. “Do you notice anything about that question?”
Emily pondered. “I guess I am actually asking two questions. First, does the training improve
people’s cultural competence? And second, does the training decrease workplace tension?”
Ahmed looked pleased. “Now, how would you design your evaluation to answer those
two questions?”
Emily repeated her idea about a baseline assessment and a follow-up assessment to
observe changes.
“That’s good,” said Ahmed, “that’s a before-and-after design. You can get some idea
from that, but it’s also prone to confounding factors. What if other events occur at the
same time as the training? Say the president gives a speech on race relations that makes
people think more about cultural differences, or the city sponsors an employee picnic that
helps people get to know each other better, and they feel happier at work.”
“Good point,” Emily thought, “I guess collecting data before and after the training is not
enough. I need to somehow account for external influences.”
Ahmed continued. “Let’s think more specifically on how you are going to roll out the
training, who is going to attend, and how you will collect data. Do you intend to train every
employee in the city of Westlawn?”
“Ideally, yes,” Emily acknowledged. “But I will probably have to roll it out in phases. I
can’t get all 500 employees through the training at once.”
“Are you thinking of rolling out the training by department?” asked Ahmed.
“To be honest, I haven’t thought about it. But I don’t think it’s realistic to expect that
the whole Police Department or the Public Works Department will be able to attend the
training at the same time. Each department will only be able to send ten to fifteen people
at a time.”
Ahmed made more notes. “What you just said gave me an idea. Since you have to split
each department into those who attend the training and those who do not attend the
training at any one session, let’s assume you reach a stage where half of the employees in
each department have attended and half have not. In that case, you could compare the
level of cultural competence between those who attended the training versus those who did
not attend the training. Everything else being equal, if those who attended the training
have a higher level of cultural competence, then you can be more confident that the differ-
ence in the level of cultural competence is likely to be due to the training.”
Emily did not respond. She was thinking this through.
Ahmed continued. “This is called an experimental design. The people who take the
training are in the experimental group, and the people who don’t are in the control group.
With this design, everyone will experience about the same historical events, so external
confounding factors are likely to affect both groups equally. The only significant difference
in their experiences will be the training.”
Emily kept listening. Ahmed went on.
“Of course, if you select certain categories of workers to take the training, you might end
up with systematic differences between the experimental group and the control group, and
they might not be comparable. To make the groups as much alike as possible, it would be
ideal if you could randomly assign people to either take the training or not during your first
few sessions until you reach the halfway mark. Then test for differences.”
“Randomly assign? How do I do that?” Emily asked.
Ahmed paused and changed direction. “I know you have done a lot of work studying
your topic. You developed a strong proposal for what you want to do. I think at this point
it will help you to look at the literature again and focus specifically on research on evalua-
tion of a training related to cultural competence and workplace tension or any one of those
things separately—as close as you can find to what you are doing—to see how other
researchers designed their studies, what they measured, how they defined and selected their
comparison groups, and how they collected the data.”
Emily understood. “I see. So you want me to do a literature review on training, cultural
competence, and workplace tension.”
“That’s right.” Ahmed said. “By talking this through, I think we’ve clarified what you
want to accomplish. Can you work on this evaluation part some more? It appears there are
several complications to consider. You’ll have to decide what will best document the effec-
tiveness of your program and will also be feasible to do.”
Chapter 2  Research Alignment  ❖  17

On the way home in her car, Emily realized the evaluation as Ahmed described it was a
research project. This view had not occurred to her before. She could see now what he meant
by “misalignment” regarding her evaluation proposal. Looking at employee profile data
was not likely to tell her much about cultural competence or workplace tension. The survey
feedback would be worthwhile to show how many people attended and appreciated the
training, but it would not say anything about cultural competence or workplace tension either.
She wanted to know if the training really made a difference!
She started to think about the comparison groups Ahmed suggested and tried to imag-
ine a different survey to learn about employee attitudes. She would have to develop the
questions later, after looking at the literature—for that matter, maybe she would find some-
thing else to measure, without a survey—but if a survey, how would she administer it? With
Ahmed, she had been a little embarrassed to be confronted with difficulties that had not
occurred to her. She wanted to be prepared. “I could do a web-based survey, since we have
access to survey software,” she thought, “but then, not everyone in the city has access to a
computer at work. Maybe combine it with a paper survey. That might work.” The timing
was an issue, too. “I thought I could just do a survey before I rolled out the training and
then another survey after all the people completed the training, but this idea to compare
groups means I have to do it halfway through. And some people will take the survey a
month or so after they had the training, and some people will take it just a few days after
the training. Will that affect the responses?”
Then Emily remembered the issues Ahmed had raised just before she left, things she
should start thinking about up front. “You are going to want to think about the ethical
implications of your research,” he said. “For example, assure the employees that their
responses to a survey will not be identified with their names. All results, even individual
feedback, should be reported only as aggregated summaries, so individuals cannot be
identified.” He gave her a brochure produced by the Community Foundation that outlined
ethical guidelines for conducting research.
Then he asked about how she would analyze the data she collected. That seemed too
far away to comprehend. She told him they had a good analyst in the HR department who
could do anything she needed. She noticed Ahmed’s look of concern. “Just to make sure,”
he said, “after you clarify your research objective and research questions and make a final
decision on your research design and data collection methods, let’s discuss your data anal-
ysis before you get started.” Reflecting on this comment, Emily realized she thought of
analysis as comparing numbers, simple as that. It occurred to her that she should look at
the analysis in the studies she found in her literature review, so she could understand some-
thing about what Ahmed wanted to discuss.
“One more thing,” he said at the end. “It may feel too early to talk about the final report,
but it will help if you start preparing for it from the beginning. It’s natural to think about
writing the report only after everything is done, but you should try to document your steps
as you go. Keep records. It’s much harder to reconstruct the details later, and it’s easy to
forget things. For example, to start, you could write up your literature review and describe
how you identified your research questions. Think about your audience while you do it. The
Community Foundation is your first audience, of course, but I imagine others in the city,
where you work, and perhaps citizens or special interest or professional groups might be
interested in reading your report. Think about who you want to read it. We encourage, and
actually expect, grantees to disseminate the information they produce with their projects.”
After two hours with Ahmed, Emily felt a bit overwhelmed. Her project proposal had
morphed into areas she only half understood. At least so far. Ruminating over the details,
she noticed she was eager to get started. She wanted a solid plan.

Research Flow and Components

In the early phase of the research process, it is important to identify the specific
components required. There are seven steps involved in the overall flow of
research: (1) identify the research objective, (2) identify the research question,
(3) determine the research design, (4) select the sample, (5) collect the data,
(6) analyze the data, and (7) report the results.
These steps need to align with each other in a logical manner. In other words, the
research question you ask needs to address the research objective, the research design
you selected needs to answer the research question, how samples are selected and data
are collected need to match with the research design, data need to be analyzed in the
way that answers your research question and addresses your research objective, and you
need to focus on reporting findings relevant to your research question and research
objective. Aligning the research components may sound like common sense, but
individuals new to research frequently mismatch components and end up with confusing,
unconvincing, or irrelevant results. This typically happens when the researcher
focuses on a particular component and does not see how each component relates to
the others all the way through the process. In this chapter, we will first describe the
typical flow of the research process and then discuss in more detail how the research
components fit together.

Overall Flow of Research


The research process generally follows a set of suggested steps. The seven basic
steps are summarized in Figure 2.1. These steps are listed sequentially, but research is
usually an iterative process, and you may have to go back and forth between steps 1
and 2, for example, before you move forward. Be prepared to rethink previous steps
and envision how future steps will fit with the plan.

Step 1. Identifying the Focus of the Research (Research Objective)


The first step of research is to identify the purpose. There are various reasons to do
research. The common thread is that every research project has a problem (or problems)
that the researchers want to address. So in doing research, your objective is to address
the problem. For example, Emily wants to solve the problem of workplace tension due to cultural
differences by raising people’s cultural competence in the city. So her research objective
is to identify a way to raise people’s cultural competence and decrease workplace tension.
Figure 2.1   Research Flow and Components

Step 1. Research objective (Identifying the focus of the research)
  •  Review the literature
Step 2. Research question (Identifying the specific questions you are trying to answer)
  •  Review the literature
Step 3. Research design (Identify how you will collect valid and reliable information)
  •  Review the literature
Step 4. Sample selection (Identify from whom/what information will be collected)
  •  Review the literature
Step 5. Data collection (Define what will be collected and how)
  •  Review the literature
Step 6. Data analysis (Analyze the data with appropriate methods)
  •  Review the literature
Step 7. Reporting (Interpret and report the results)
  •  Review the literature

Having a clear understanding of the problem you are addressing is the first step in
identifying the research objective. The objective of your research should focus on
providing answers to the problem you are facing in your practice. If you conduct research
without being clear about what problems you are trying to solve, you may end up with
answers that have nothing to do with the problems. Having a clear understanding of the
problem is necessary to align your research objective with what you want to know.
Researchers typically go through several iterations of a thinking process before
settling on the final research objective. At the very early stages, it is common for
researchers to start with a broad objective that only gradually focuses on specific
points. When Emily decided to apply to the Community Foundation for a grant, for
example, she had a broad recognition of the problem she wanted to solve. She was
concerned about discrimination complaints and evidence of tension in the workplace
due to insensitive comments about race and sexual orientation. She also thought the
lack of awareness of cultural differences among the employees could pose problems
as demographic change occurred in the community. Writing a grant proposal gave
Emily an opportunity to think about how to address these problems. To get an idea of
what programs or activities she could introduce to address the problems, she looked
into what other jurisdictions did; she read stories in trade journals that discussed
cultural competence issues. She even read some academic journal articles and books
that were referenced in the reports she read. In research terms, this process of initial
information gathering is referred to as a literature review. Based on the information
she gathered, Emily decided she would offer training to the employees to address the
problems she wanted to address. She chose training, because the literature suggested
it was a common approach to improve cultural competence.
After the Community Foundation agreed to provide Emily with the funds to
implement the training, and she was asked to make a plan to demonstrate the effective-
ness of her training, she was faced with a situation where she needed to clearly identify
the problems she was addressing and how she would measure results. She needed to
articulate her research objectives. At this juncture, as Ahmed suggested to Emily, it is
a good idea to conduct another literature review with more specific questions about
your research. The more you think through the problems you are trying to address,
and the more you gather information and review the literature, you may keep revising
your research objective. This kind of iteration may continue even after you move for-
ward through later stages of the research process.

Step 2. Identifying the Specific Questions You Are Trying to Answer (Research Question)
Identifying the research objective and the research question in steps 1 and 2 are
closely related. A research question focuses and clarifies the research objective
simply by rephrasing it as a question. In some cases, with a broad research objec-
tive, you may have multiple research questions. Reviewing the literature to find out
what others have done on the topic will help you develop a better, more focused
research question. You will also get ideas on how you can move on to develop your
research design (Step 3), sample selection (Step 4), data collection (Step 5), and
data analysis (Step 6). We will discuss the research objective and research ques-
tions further in Chapter 3.

Step 3. Identify How You Collect the Information You Need (Research Design)
Developing the research design in Step 3 is where you decide how the research will be
conducted and the process of data collection. The research objective and research
question identified in steps 1 and 2 should inform the research design. Having a clear
research objective and research question makes it easier to identify what information
needs to be collected: when, where, from whom, and how. We will discuss options for
research designs in Chapter 4.
Step 4. Identify From Whom or What You Collect the Information (Sample Selection)
As you think about the research design, you will also need to think about from whom, or
from what entity, you are going to collect data. If you are surveying or interviewing people
because you are interested in what people think about a certain topic, you are collecting
data from people. You are interested in knowing about the individuals you are collecting
data from, so your unit of analysis is an individual. When your unit of analysis is an indi-
vidual, you will need to think about which individuals you are collecting data from and
how you will identify them. Some projects may have a research objective and research
questions that require you to collect data from nonhuman entities, such as organizations,
departments, or communities. In this case, you are interested in the information about
certain organizational or institutional entities. So your unit of analysis is an entity, and you
have to think about how to select and collect information from them. Individuals or entities
you select for your study are called samples. They are called samples because in research
you rarely get an opportunity to obtain data from every individual or every entity that you are
interested in. We will discuss sample selection further in Chapter 6.
Looking at Emily’s study, you may have noticed that she has two different units of
analysis corresponding to her two research questions. Her first research question is,
“Does the training improve people’s cultural competence?” In this research question,
she is interested in the level of cultural competence of the individual people who attend
the training. So the unit of analysis is an individual. Her second research question is,
“Does the training decrease workplace tension?” In this research question, she is inter-
ested in the level of tension in the workplace. So the unit of analysis is a workplace, not
an individual. This means she will need to identify two sets of samples, a group of
individuals and a group of entities defined as the workplace.
Notice how the discussion between Emily and Ahmed regarding the sample
needed to collect data circled back to the topic of the research design. Ahmed mentioned
a before-and-after design and an experimental design. This is a good example of an
iterative research process. When Emily started thinking about the sample selection,
she had a before-and-after design in mind. In the discussion with Ahmed, an alterna-
tive experimental design appeared to be a possibility that could produce better results.
In other words, the thinking process required for step 4, selecting the sample, gave
ideas for a better research design, step 3. This kind of back-and-forth thinking
between the research design and sample selection is an important feature of research
alignment. In an applied research project like Emily’s, the research design and sample
selection also need to be realistic and feasible. Clearly, Emily has more thinking to do
before she will have a workable plan for her research design and sample selection. We
will discuss the iterative process further in Chapter 4.
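For readers who like to see an idea made concrete, the random assignment Ahmed suggests can be sketched in a few lines of code. This is purely an illustration, not part of Emily’s case: the roster of 500 names, the even split, and the seed are all invented for the example.

```python
import random

def randomly_assign(roster, seed=None):
    """Shuffle a copy of the roster and split it into two halves:
    an experimental (training) group and a control group."""
    rng = random.Random(seed)
    shuffled = list(roster)
    rng.shuffle(shuffled)  # every ordering is equally likely
    midpoint = len(shuffled) // 2
    return shuffled[:midpoint], shuffled[midpoint:]

# A hypothetical roster of 500 city employees
employees = [f"employee_{i}" for i in range(500)]
training_group, control_group = randomly_assign(employees, seed=1)
print(len(training_group), len(control_group))  # 250 250
```

Because assignment depends only on the shuffle, no category of worker is systematically more likely to end up in one group than the other, which is exactly the point of random assignment.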

Step 5. Collect the Data (Data Collection)


Data can be collected in many ways. If you are collecting data on individuals, you can
conduct surveys or interviews, or look for a database that has information about the individuals.
You can also directly observe the individuals. There are many ways to conduct surveys,
interviews, and observations. Examples of different modes of surveys include mail
surveys, web-based surveys, and face-to-face surveys. Examples of different modes of
interviews include individual interviews, group interviews, and focus-group discus-
sions. Examples of different modes of observation include participant observation and
nonparticipant observation. Audio and video recording can also take place during
observation. If you are interested in collecting data about entities (e.g., organizations,
departments, communities), you can look at documentation of the entity’s activities,
obtain information from individuals about the entity they represent, or observe its
operations.
It is important to keep in mind the research objective and research questions when
determining the mode of data collection. In Emily’s case, Ahmed was skeptical about
her evaluation plan for the training, and he told her there was a misalignment between
the objective of her training and what she was proposing to collect as the data to eval-
uate the effectiveness of the training. One part of the data Emily suggested to collect
involved employee profiles on record in the human resources department. There is
nothing wrong about using data from an existing database for research. In Emily’s case,
however, it was not clear how her data collection would address her research objective
and questions about the levels of cultural competence and workplace tension. Before
her brainstorming session with Ahmed, she had not yet articulated her research ques-
tions. Emily also intended to survey the training participants. Again, there is nothing
wrong with administering a survey to collect data, but the survey form she decided to
use asked about satisfaction with the training and did not have questions to help her
assess cultural competence or workplace tension. Reviewing the existing literature can
also help you determine what data collection approaches were used, and specifically
what data collection instruments are available.
Ethical implications are another important thing to consider in data collection. In
what circumstances are people providing information to the researcher? Are people
being fully informed about the purpose of the study and what they are expected to do?
Is there any possibility that the data collection process could cause harm or stress for
participants? Are the study participants given the option to decline answering any ques-
tions without penalty? How are the data collected, shared, and disseminated? These are
some of the questions that the researcher needs to think about in determining the spe-
cific mode of data collection. We will discuss data collection further in Chapter 6.
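The ethical practice of reporting results only as aggregated summaries can also be made concrete with a small sketch. The department names, scores, and minimum group size of five are invented for illustration; an actual project would follow the funder’s ethical guidelines for what may be reported.

```python
# Report survey results only as group averages, and suppress any
# group with too few responses to protect individual anonymity.
MIN_GROUP_SIZE = 5

def aggregate_by_group(rows, min_n=MIN_GROUP_SIZE):
    groups = {}
    for dept, score in rows:
        groups.setdefault(dept, []).append(score)
    summary = {}
    for dept, scores in groups.items():
        if len(scores) >= min_n:
            summary[dept] = round(sum(scores) / len(scores), 2)
        else:
            summary[dept] = None  # suppressed: group too small to report
    return summary

responses = [
    ("Police", 3.8), ("Police", 4.1), ("Police", 3.5),
    ("Police", 4.0), ("Police", 3.9),
    ("Public Works", 4.2), ("Public Works", 3.7),  # only two responses
]
print(aggregate_by_group(responses))
# {'Police': 3.86, 'Public Works': None}
```

Suppressing small groups matters because an “average” of two responses can reveal almost as much as the individual answers themselves.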

Step 6. Analyze the Data (Data Analysis)


Once the data are collected, the next step is data analysis. It is important to plan the
data analysis method ahead. Novice researchers often collect the data and then won-
der, “How should I analyze them?”—at which point, it may be too late to make changes
in the research design or data elements to assist the analysis. You should be thinking
about how the data are going to be analyzed from the beginning. When you are formu-
lating your research objective and research questions, what kind of analysis is appro-
priate to provide convincing answers? When you are deciding your research design,
what kind of analysis can be applied? When identifying your data collection methods,
will you capture the data elements you need? If you are capable of conducting a broad
range of data analysis approaches, you will have more flexibility in the range of
research designs you can adopt and the nature of the data you collect.
Emily’s case illustrates what appears to be a widely shared myth, that number
crunching is data analysis, and as long as you have numbers, a statistician can do the
analysis for you. First of all, not all number crunching is data analysis. Only when you
are crunching numbers to answer a specific question can you call it data analysis.
That means you need to know what questions you are trying to answer and how the
numbers contribute to an answer. Second, a statistician can do the analysis only if you
have the kind of data that will allow the statistician to run an analysis that fits your
research question. Otherwise it will be garbage in, garbage out.
Another important thing to keep in mind is that there are two types of data: quan-
titative data and qualitative data. Quantitative data capture what you want to know
as a measurement of some kind, represented in numbers. Qualitative data capture what
you want to know in narratives or statements, represented in words. It may be possible
to convert qualitative data into quantitative data, and it may be desirable to do so
to more adequately address your research question. Typically, however, the analysis
approaches for quantitative data and qualitative data are fundamentally different. We will discuss
quantitative and qualitative data further in Chapter 14. Various approaches to statisti-
cal analysis, appropriate for different kinds of research questions, are presented in
Chapters 7 through 13.
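As a preview of the kind of quantitative analysis covered in later chapters, here is a minimal sketch of comparing group averages. The cultural-competence scores (on a 1-to-5 scale) are invented; a real analysis would also test whether the observed difference is statistically significant rather than due to chance.

```python
from statistics import mean

# Hypothetical cultural-competence scores for employees who attended
# the training and for the control group. Numbers are invented.
trained_scores = [4.2, 3.9, 4.5, 4.0, 4.3]
control_scores = [3.6, 3.8, 3.5, 3.9, 3.7]

# A simple descriptive comparison: the difference between group means
difference = mean(trained_scores) - mean(control_scores)
print(round(difference, 2))  # 0.48
```

Notice that computing this difference only answers the question because the research design put comparable people into the two groups in the first place: the numbers and the design have to work together.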

Step 7. Interpret and Report the Results (Reporting)


Once you have analyzed the data, then you need to interpret the results and make sense
out of what you found. The first task is to articulate the implications of the research
results in relation to your research questions and research objective. Your stakeholders—
those who funded the research or could be affected by it—will want to know what the
results mean for the problems you are trying to solve. At this point, another literature
review may help you interpret the meaning of your data analysis results. You may want
to look at related research with a new set of questions that might have arisen in the
course of your data analysis.
In Emily’s case, she will need to produce an interpretation that tells her audience
whether or not the training was effective in raising cultural competence among the
employees and reducing workplace tension. If the training works as she expects, then
she has a good foundation to recommend that such trainings continue. If the training
does not produce demonstrable results, then she will need to discuss whether to continue
the training, revise the curriculum, or suggest alternative approaches to the
problems she identified.
Once the interpretation of results is complete, the next important task is to sum-
marize the results and your interpretation in a brief conclusion. Reporting the research
results in a summary is an important component of the research process, especially
when your stakeholders may be making key decisions based on your research.
It is also important to report your research in a format that allows you to share the
details of the process and your results with colleagues in your field. This is an important
part of your public service. You know as a practitioner that you are not alone in
the challenges you face. If you come up with interesting findings that help you address
your problems, most likely the findings—and the research that produced them—will
be useful to your fellow practitioners. And they will appreciate your collegial efforts.
We will discuss reporting further in Chapter 15.

Chapter Summary
In this chapter, we introduced the seven steps of the research flow and corresponding research
components (i.e., research objective, research question, research design, sample selection, data
collection, data analysis, and interpretation/reporting). We also highlighted the importance of
having the research components aligned with each other. To illustrate the point, we described the
challenges Emily faced with her research proposal. Table 2.1 provides an overview of Emily’s
research process.
We emphasized in this chapter that research is an iterative process. You take one step forward
and realize you have to make adjustments in the steps you already completed. You may find it
necessary to take two steps backward and make the necessary adjustments. You feel frustrated
having to take two steps backward, but you realize that by going back and making the
adjustments now, you can move three steps forward more smoothly. You also need to
think far ahead and anticipate what is coming up three steps ahead. This can be a
complex process.

Table 2.1  Emily’s Research Steps

Step 1. Research Objective: Evaluate whether the training improves people’s cultural
competence and decreases workplace tension
Step 2. Research Question: 1. Does the training improve people’s cultural competence?
2. Does the training decrease workplace tension?
Step 3. Research Design: Before-and-after design or experimental design (need to find
a way to control for external influences)
Step 4. Sample Selection: Random assignment into experimental and control groups
Step 5. Data Collection: Web-based and paper-based surveys
Step 6. Data Analysis: Statistical analysis based on the survey data
Step 7. Reporting: Emily will start to work on the bibliography and literature review
and take notes on the research process now
The more you know about the research process and the research components, the easier it will
be for you to make sure the process flows smoothly and the components are aligned. Our goal is
to empower you with the knowledge to avoid unnecessary setbacks and mistakes. At the same
time, emphasizing the iterative process is intended to show you that moving back and forth in the
research flow is natural and no reason to get discouraged.

Review and Discussion Questions


1. Read Emily’s case description. Discuss what misalignment challenges she faces.
2. Review the list of problems and challenges of the organization you created in Discussion
Question 6 in Chapter 1. Identify a research objective that addresses the problems you listed.

References
Booth, W. C., Colomb, G. G., & Williams, J. M. (2008). The craft of research. Chicago, IL: University of
Chicago Press.
Loseke, D. R. (2013). Methodological thinking: Basic principles of social research design. Los Angeles, CA:
Sage.

Key Terms

Before-and-After Design  21
Comparison Groups  16
Confounding Factors  15
Data Analysis  22
Data Collection  22
Ethical Implications  22
Experimental Design  21
Iterative Process  18
Literature Review  20
Qualitative Data  23
Quantitative Data  23
Random Assignment  24
Reporting  23
Research Design  20
Research Objective  19
Research Question  20
Sample  21
Sample Selection  21
Unit of Analysis  21

Student Study Site


Visit the Student Study Site at www.sagepub.com/nishishiba1e for these additional learning tools:

•  Data sets to accompany the exercises in the chapter


3  ❖  Identifying the Focus of the Research

Research Objective and Research Question


Learning Objectives 27
Identifying the Focus of the Research 27
Jim’s Case 27
Mary’s Case 29
Research Objectives 30
Jim’s Case 31
Identifying Research Objectives 32
Types of Research 32
Theory Building Approaches: Inductive Versus Deductive 33
Types of Data Analysis 35
Mary’s Case 36
Research Questions 37
Jim’s Case 37
Focusing Your Research Questions 38
Identifying Types of Research Questions 40

Chapter 3  Identifying the Focus of the Research  ❖  27

Literature Review 42
Chapter Summary 44
Review and Discussion Questions 44
Key Terms 46
Figure 3.1 Types of Research 33
Figure 3.2 Theory Building Approaches 35
Figure 3.3 Ty’s Board 39
Table 3.1 Steps in Focusing the Research Question 40


Learning Objectives

In this chapter you will

1. Learn the importance of having a clear understanding of the problems that
need to be solved and identifying the right research objective that will address
the problems
2. Learn how to formulate a focused research objective and research questions
that will provide answers to the research problem

Identifying the Focus of the Research

Jim’s Case
Jim, deputy fire chief of the city of Rockwood, is faced with two tasks
related to the performance analysis of the organization: (1) analyze
if the department’s response time meets the national standard, and
(2) identify an efficient alternative service delivery model that can
replace sending four firefighters to every emergency medical call.
After their first discussion of the two projects, Chief Chen asked Jim
to compose a study design for the alternative service delivery pro-
posal with an estimated budget to present to the city council.
Jim browsed the Internet to get help figuring out how to develop
a study on whether an alternative model is “effective” and “feasible.”
After going over some websites and reading some articles that discussed feasibility studies,
Jim concluded that he would include the following three things in the proposal to Chief
Chen: (1) he would do some kind of cost-benefit analysis, comparing the cost of operation
between the existing model and the alternative model, (2) he would contact other
28  ❖  SECTION I  RESEARCH DESIGN AND DATA COLLECTION

jurisdictions in the state and ask them what strategies they have used and if those strate-
gies have been effective, and (3) he would identify key stakeholders and ask them what
they would want and if they thought the alternative service model would be of value to
them. The articles Jim read discussed using a survey or interviews as a way to collect data,
so he decided he would do a survey and interviews. He concluded that the fire department
employees were also stakeholders, so he would include them, too. While he was at it, he
thought he could also ask the other jurisdictions as well as the fire department employees
what they thought the alternative service model would cost.
In terms of a budget estimate, Jim thought the study would not cost much. “It’s just
sending out surveys and calling people to ask questions by phone. I may need a temp or
an intern to do this, but other than that, what would I need? The department has access
to a free online survey tool, so it shouldn’t be too hard to make a survey and send it out.”
Jim calculated a budget for a half-time temp for three months and some long-distance
call charges. Jim put a hard copy of the proposal in Chief Chen’s in-box.
When Jim met with Chief Chen a few days later, the chief first congratulated him for
writing a memo that conveyed the need for the study in a professional and convincing
manner. “I have some questions, though, about the study design,” Chief Chen continued.
“As far as I know, there is no other jurisdiction in our state that has adopted the alternative
model of service delivery we are considering. So I’m not sure I see the point of calling other
jurisdictions in the state.” He paused to see if Jim had any immediate response. When Jim
was silent, he went on. “I can see the benefit of asking our employees what they think about
adopting an alternative model; it would be difficult to implement if they do not buy in to
the idea. However, I do not think asking them about the cost and effectiveness of the alter-
native model will give us the information we need. We need to find a way to measure its
actual cost and effectiveness. Can you think of ways to get more concrete information?”
As Jim left the chief’s office, he thought to himself, “I should call my buddy, Ty, who
works at the university. I need help.”
Ty was working as a professor at the University of Rockwood. Jim and Ty had known each
other since they were both young firefighters in their hometown near Rockwood. Jim contin-
ued his career as a firefighter and climbed up the organizational ladder, becoming the
deputy fire chief, whereas Ty went to graduate school while he was working as a firefighter.
After completing his master’s degree, Ty quit working as a firefighter to pursue his doctorate
at a prestigious university out of state. When there was a position open at the University of
Rockwood a couple of years ago, Ty applied to be closer to his family. Jim and Ty got back
in touch. Jim was initially worried that Ty might have become one of those intellectual types
who use big words and ideas that Jim would not understand, but he was relieved to find Ty
was pretty much the same, still down to earth and practically minded.
On the phone, after listening to Jim explain the two projects he was struggling with,
Ty said, “I can probably help you do some brainstorming. Maybe we can meet tomorrow
at the fire station conference room. I know you have a big chalkboard there and that can
be handy.”
“Sure,” Jim replied.
“Between now and then,” Ty continued, “I want you to do some homework.”
Jim thought Ty was indeed talking like a professor.
Chapter 3  Identifying the Focus of the Research  ❖  29

“I want you to write down what problems you are trying to solve with these projects, and
then write down the objectives of your research. It might also help if you can think about
how the problems could be solved or not solved as the result of your project.”
“I can do that,” Jim responded, though he wondered as he got off the phone why Ty did
not just tell him where to get the data he needed and how to analyze it. He needed answers,
not a paper exercise, but he decided to do his “homework” and see what Ty had in mind.

Mary’s Case
Mary, a program manager at Health First, had been contemplat-
ing a survey to help find out why Health First was having diffi-
culties in recruiting and retaining volunteers. Despite her initial
excitement about the survey and presenting the results with
charts and graphs to the board members, the more she thought
about the survey—what questions to ask, who to survey—the
more daunted she felt.
The easiest group of people for Mary to survey was the exist-
ing volunteers. She thought she might list reasons why they decided to volunteer at Health
First and why they continue to volunteer. In the survey, she could ask them to choose the
option that best fit their reasons to continue volunteering, but when she tried to list all the
possible reasons, she could only think of three. There must be more reasons why people
volunteer, she thought. Not satisfied, but not knowing what to do about it, Mary started
on another question to ask where people learned about volunteering at Health First. Again,
she listed everything she could imagine for places where a person could learn about Health
First—newspapers, websites, community newsletters, flyers—and made little boxes for the
volunteers to choose one that applied to them.
While working on the survey questions, staring into space thinking, Mary saw one of the
old-time volunteers, Ruth, walk past. Mary jumped up. Maybe she could get some ideas
from Ruth. When she caught up with Ruth in the hallway, Mary asked her, “I’m just won-
dering, how did you find out about volunteer opportunities at Health First?” Ruth paused
a bit and said, “I don’t remember exactly, but I think it was my doctor who told me that I
needed to get out of the house more and suggested getting involved with Health First as a
volunteer. After coming to Health First for a couple weeks, I enjoyed the activity so much I
told my neighbor, John, about the opportunity. I remember he joined as a volunteer soon
after that.”
Mary thought, “Very interesting. Ruth just told me two more ways people get informa-
tion about volunteer opportunities: they hear from their health care providers and from
other volunteers. I wonder how many other ways people find out about us that I did not
consider?”
The next day, walking by the lunchroom, Mary overheard a group of volunteers com-
plaining how they felt unappreciated by the organization for what they do. Mary thought,
“I had no idea the volunteers felt that way. I need to get this kind of candid feedback from
them and understand more about their experiences and how they feel.” Her survey ques-
tionnaire seemed more daunting than ever.

Later that week, Mary attended a breakfast meeting organized by the local association
for nonprofit organizations. The association organized a quarterly breakfast meeting, usu-
ally with a speaker, to help the local nonprofit members network with each other and
exchange information. The Health First executive director asked Mary to attend this quar-
ter’s meeting, because the topic of the session was “volunteer motivation.”
Mary saw many familiar faces in the room when she arrived. Most of them were volun-
teer coordinators like her. She noticed a woman sitting at one of the round tables waving
at her. “Good, Yuki is here,” she thought, smiling. Mary walked across the room and took a
seat next to Yuki. Mary knew Yuki from their time together in graduate school, studying
nonprofit management. At school, Yuki was known as a “research guru.” Many graduate
students in the program, including Mary, consulted Yuki when they got stuck with their
thesis research. Yuki was now head of the research department at one of the major foun-
dations in the region.
“How are things going?” asked Mary.
Yuki gave a brief description of the projects she was working on, and then asked, “What
about you? Anything new?”
Mary told Yuki about her challenge recruiting and retaining more volunteers. She made
sure to add that her current task required research.
Yuki smiled wryly and said, “I know that look on your face. Let me guess, you need
someone to talk through the process with you?” Mary nodded. “No problem,” Yuki said. “If
you have time, we can talk after this meeting.”
Mary effused thanks and felt relieved. She knew Yuki would be able to help her.

Research Objectives

Research originates from a problem that needs to be solved. This is especially true in
applied research (Remler & Van Ryzin, 2011). Managers face a range of problems and
issues that need solutions: programs do not run efficiently, clients express dissatisfac-
tion with service, employees lose motivation, or stakeholders question the worth of
your program. When faced with such a problem, a practitioner may need to conduct
some kind of research to find a solution.
Research is an information gathering activity that will help you identify solutions
to problems (Loseke, 2013). The better the quality of the information you gather, the
better your solutions will be. A key aspect of good information is whether it is directly
relevant to the problem you are trying to solve. Good information also needs to be
accurate and detailed enough to provide insights, but accuracy and detail will not help
you if the information has nothing to do with your problem. This is why we started
with the issue of alignment in the last chapter, matching the data you collect to the
problem you are trying to solve.
The first step to ensure you collect information aligned with the problems you are
trying to solve is to be clear about the objective of your research (Thomas & Hodges,
2010). A research objective is a statement of the purpose of your research—to what
end you are conducting your research (Polonsky & Waller, 2011). Defining a clear
research objective is easier said than done. We notice many novice researchers fall into
the trap of focusing on what to do and lose track of the objective of the research. Let
us look in again on Jim’s case and see how he gets help identifying the research objec-
tives for his two projects.

Jim’s Case
“OK, give me your homework,” Ty joked, with his hand out, after he
greeted Jim in the conference room at the fire station.
“Got it right here,” Jim said, waving the yellow note pad he was
carrying.
Ty accepted the pad and read:

Problem: The operation of Rockwood Fire Department is not efficient. City of Rockwood is facing financial difficulties, and we need to run our department more efficiently.
Objective: Identify efficient ways to operate Rockwood Fire Department.
Result: The Rockwood Fire Department is efficiently operated.

“OK, this is a start,” Ty said, looking up. “You told me, though, that you have two projects
related to the fire department’s efficiency problem. I think we need to get down to the
details. What do you have to do?”
As Jim explained the two projects, Ty drew a line in the middle of the board and wrote
a heading “Response Time” on the left side and “Alternative Service Delivery Model” on the
right. When Jim was done, Ty stood aside from the board and said, “Let’s think about these
projects separately. They are really different.” Getting poised to write, he asked, “For your
response‑time project, how would you describe your problem?”
Jim recalled how Chief Chen presented the issue. “We need to know if we meet the
national standard for response time. We need to provide the data for accreditation. The
response time should be under 5 minutes 90% of the time.”
Ty wrote:
Problem: no info on response time—meet standard.
Then Ty asked Jim, “So, if your problem is that you don’t know if you meet the national
standard for response time, what would you say the objective of your research should be?”
Jim responded with an edge of sarcasm, indicating he thought the answer was pretty
obvious, “To find out if our response time meets the national standard?”
“Exactly,” Ty agreed. He wrote:
Objective: To explore/describe if response time meets the national standard.
Moving to the right side of the board, Ty said, “This is good for the first project, for now.
Let’s do the same thing for the alternative service delivery model. What is the problem here?”
Answering Ty’s question, Jim started to describe what he said before about the ineffi-
ciency of sending four firefighters and an engine to all medical emergency calls.

Ty stopped him. “When you say ‘inefficient,’ what do you mean?”


“Well, it costs us money to send an engine. And firefighters are not medical experts, so
it is also not very efficient in saving people’s lives.”
Ty wrote on the board:
Problem: Current model costly and not efficient in saving lives.
“Now what is the objective of this research?” Ty asked.
Jim relayed Chief Chen’s thinking again. “It might be more efficient to send a physician’s
assistant and a firefighter by a regular car to the scene first without sending the engine
and four firefighters. But Chief Chen wants to be sure this alternative model is indeed more
efficient than the existing model before we adopt it citywide.”
Thinking, Ty said, “So Chief Chen has a specific model in mind that he wants to compare
with the existing service delivery. It also sounds like he has a working hypothesis that this
alternative model is more cost efficient and effective in saving lives, in comparison to the
existing model.”
“Yes,” Jim agreed.
“How about this,” Ty said, and wrote on the board:
Objective: To confirm/test if alternative model is more efficient than the existing model.
Ty looked back at Jim and asked, “What do you think? Does this capture the research
objective?”

Identifying Research Objectives


We see in Jim’s example that he initially thought he knew his research objective, but the
brainstorming with Ty added more focus. When you are in the process of identifying
and focusing your research objective, it helps to be specific about the problems you are
trying to solve.
Jim first described his research objective broadly, stating, “Identify efficient ways
to operate Rockwood Fire Department.” This statement really represents the research
topic. Research topics are broad descriptions or areas of interest, such as alcoholism,
poverty, leadership, performance management, motivation, or organizational behavior
(Booth, Colomb, & Williams, 2008). All of these topics imply that there are problems
to address. The topic needs to be articulated as a specific problem to reach a definition
for the research objective. In the following section, we will introduce things you should
consider when clarifying your research objectives.

Types of Research

There are two major types of research objective. One type is to explore and describe
the phenomenon of interest. The second type is to confirm or test the hypothesized
relationship. (See Figure 3.1 Types of Research.) In Jim’s case, we saw both of these
types of research objectives appear in his two research projects. Ty characterized the
research objective for the response-time research as to explore/describe and the
research objective for the alternative service delivery model research as to confirm/test.
Ty underlined the terms to emphasize how each problem would be approached.

Figure 3.1  Types of Research

Types of research objective → Types of theory building approaches → Types of data analysis

To explore and describe the phenomenon → Inductive → Descriptive statistics; Qualitative themes
To confirm or test the hypothesized relationships → Deductive → Inferential statistics

Types of relationship (under inferential statistics): Group differences; Correlation; Cause and effect

Identifying the type of research objective you have will help you state it clearly.
Understanding the type of research objective you have will also assist in how you build
theory, approach your data collection, and analyze the data. Let us look first at how the
types of research objective align with theory building.
Earlier in this chapter we noted research is an information gathering activity that
will help you identify solutions to problems. Information helps you identify solutions by
its relationship to a theory. Before we go further, we need to discuss how research and
theory fit together.

Theory Building Approaches: Inductive Versus Deductive


In research, you gather information through myriad types of observations. Based
on this information, you develop an explanation of an event or experience. The
proposed explanation forms a theory. Subsequently, when you have what appears to
be a similar situation that needs explanation, you can apply the theory you know
and see if it makes sense. In other words, when you have a problem you need to
solve, one way to approach it is to first gather information through systematic
observation of the problem (i.e. research) and develop a theory that explains why
and how the problem is occurring. Once you have a theory, then you can apply the
knowledge of why and how a problem is occurring to identify a solution in similar
situations.
Research is the process we take to develop theories. Despite a common portrayal
of theory as abstract ideas that have little to do with reality or practice, theory and
practice are integrally connected. Theory is an explanation of how things are, and
practice is the application of the knowledge (i.e. theory) to solve real-world problems
(Robson, 2011).
The relationship of research to theory building has two basic forms: an inductive
approach and a deductive approach (Hoover & Donovan, 2010). The inductive
approach starts with specific observations (Loseke, 2013). With an accumulation of
observations, you begin to identify patterns. When the patterns seem to be prevalent
in your observations, you can develop a hypothesis, which is like a tentative theory. If
your observations keep confirming your hypothesis, then your hypothesis becomes a
theory—an explanation that may help you understand some characteristic of what you
are observing. This approach is sometimes referred to as a bottom-up approach,
grounded approach, or exploratory approach.
A deductive approach starts from the opposite direction, with a general idea or
set of principles that suggest more specific ideas on how things are (Loseke, 2013).
In this case, a pattern of ideas forms a hypothesis, or tentative theory, that can be
tested to see if it is true, or perhaps, in what specific instances it is true. If your
observations confirm the hypothesis, then your hypothesis becomes a theory,
related to the original general ideas as a form of explanation that may help you
understand some characteristic of what you are observing. This approach is some-
times referred to as a top-down approach, hypothesis-testing approach, or confirma-
tory approach.
These two approaches to theory building, inductive and deductive, have different
starting points. Observation¹ (i.e. data collection) occurs at different stages. Yet they
are not completely distinct or opposed to each other. The two processes can be
sequenced in such a way that they inform each other. For example, a hypothesis for-
mulated as a result of an inductive approach can provide the starting point for a deduc-
tive approach, taking constructed ideas to confirm with further observation. (See
Figure 3.2 Theory Building Approaches.)
As a practitioner–researcher, it is important to understand the basic difference
between the inductive approach and the deductive approach, because each is con-
nected to a different type of research objective. When you are taking an inductive
approach, you do not have a presupposed idea on how things are or should be. You are

1. Here we are using the term observation to refer to data collection in general. This is not the same as observation as a technique for data collection, such as the participant observation and nonparticipant observation we mentioned in the previous chapter.

Figure 3.2  Theory Building Approaches

Inductive Approach: Observation → Pattern → Hypothesis → Theory
Deductive Approach: General Idea (Theory) → Hypothesis → Observation → Confirmation
basically making observations, describing the patterns, and exploring to identify a the-
ory. If your research objective is to describe the phenomenon, identify patterns, and
explore how things are, chances are you should be thinking in terms of an inductive
approach. On the other hand, when your research objective is to confirm or test your
hypothesis, most likely you should be thinking in terms of a deductive approach. The
different approaches will affect how you define your research components. (See Figure 3.1
Types of Research.)

Types of Data Analysis


The way you analyze your research data needs to align with your research objective
(Berman & Wang, 2012). The analysis needs to produce results that address your
research question and ultimately the problem you are interested in solving. We will
discuss specific data analysis techniques in later chapters of this book. Here, we want
to emphasize alignment.
For a research objective to explore and describe, if you have data captured as num-
bers (quantitative data), you can use descriptive statistics to present your results.
When the objective is to explore and describe, you can also use data captured as state-
ments (qualitative data) and present themes. For a research objective to confirm or
test a hypothesis, quantitative data are typically necessary. There are a number of
statistical analysis techniques called inferential statistics that can be used to analyze
the data. Qualitative data are ordinarily not suitable for confirming and testing hypoth-
eses in research. (See Figure 3.1 Types of Research.) Returning to Mary’s case will
elucidate practical issues in the approach to theory building and data collection. We
will discuss this issue further in Chapter 14.
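To ground the explore/describe side, here is a minimal Python sketch of the kind of descriptive statistics Jim's response-time question calls for. The response times below are invented for illustration; they are not data from the case.

```python
import statistics

# Hypothetical response times in minutes for ten emergency calls
# (illustrative numbers only, not Rockwood Fire Department data)
response_times = [3.2, 4.8, 5.6, 2.9, 4.1, 6.3, 3.7, 4.4, 5.1, 3.9]

mean_time = statistics.mean(response_times)      # average response time
median_time = statistics.median(response_times)  # middle value
pct_under_5 = 100 * sum(t < 5 for t in response_times) / len(response_times)

print(f"Mean: {mean_time:.1f} min, Median: {median_time:.1f} min")
print(f"Calls under 5 minutes: {pct_under_5:.0f}%")  # compare with the 90% standard
```

With these invented numbers, only 70% of calls come in under 5 minutes, short of the 90% national standard. That simple summary is exactly the descriptive answer Jim's first research question asks for.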

Mary’s Case
After the breakfast meeting, Mary and Yuki went to a nearby coffee shop. Mary
was thinking about the speaker’s research on “volunteer motivation.” His talk was
based on a study he conducted over the years interviewing close to 200 volun-
teers from different organizations. At the end of his talk, he emphasized the
importance of talking to the volunteers to gain a full understanding of the broad
range of their motivations. Mary thought, “Maybe I should do something similar,
though not to that scale.”
Once they had their coffee and were seated, Yuki started, “So tell me about
your research challenges.”
Mary described the project and her thoughts about it over the past weeks. She men-
tioned her idea for a survey and her uncertainty about what questions to ask. She also
shared how she overheard volunteers saying they felt unappreciated. She concluded that
she needed to know more about the volunteers and what their experiences are like, but she
was not sure how to get that information.
Yuki thought a moment and said, “It appears to me that you need to take a so-called
‘grounded’ approach and start out exploring why your volunteers came to your organiza-
tion, what they think, why they are still there, and what things are on their wish list. You
could also track down some of your past volunteers, if you can, and ask why they left. You
might also want to find people who are currently not volunteering, but might be interested,
and ask them what would motivate them to volunteer. That would be very similar to the
study we just heard about this morning.”
Mary felt hesitant. “Those are good ideas,” she said, “but how do I approach them? How
do I get answers?”
“It sounds like you really don’t know enough about them to ask anything very specific,”
Yuki responded. “You should take an inductive approach and conduct in-depth interviews with
a few open-ended questions. Let them answer however they want. This would be a qualitative
study, exploring volunteer motivations to join Health First and continue volunteering.”
Mary looked troubled. “I don’t know anything about qualitative research. All I did in
grad school was statistics. And the speaker earlier said he had something like 200 people
he interviewed. I cannot interview that many people.”
Yuki sat back with an amused expression. “Don’t worry! I can find you a couple of books
that will give you some background on qualitative research. And I’ll help you, as a sort of
informal consultant.”
Mary brightened. Yuki’s offer encouraged her. “OK then, I am taking an inductive
approach with a qualitative study.”
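Even in an inductive, qualitative study like the one Mary has settled on, the analysis often ends with a simple tally of themes hand-coded from open-ended answers. A minimal Python sketch of such a tally, using invented theme labels rather than anything from the case:

```python
from collections import Counter

# Hypothetical open-ended answers to "How did you learn about Health First?"
# hand-coded into short theme labels (illustrative assumptions only)
coded_answers = [
    "health care provider", "word of mouth", "newspaper",
    "word of mouth", "website", "health care provider",
    "word of mouth", "community newsletter",
]

theme_counts = Counter(coded_answers)
for theme, count in theme_counts.most_common():
    print(f"{theme}: {count}")
```

The counting is trivial; the real qualitative work is the coding step, deciding that Ruth's story and the lunchroom complaints each represent a theme worth a label.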

Research Questions

After clarifying the research objective, the second step in the research process is to
rephrase the objective into a question or in some cases multiple questions. You need to
answer the research question to know how well you reached your objective. As a prac-
titioner, this means you have new knowledge that may help you improve work pro-
cesses or services (Berman & Wang, 2012). Returning to Jim’s case gives us a practical
example of how developing a research question, or questions, helps organize and align
the research process.

Jim’s Case
Ty looked at the chalkboard and back to Jim. “Let’s turn these
research objectives into questions. This will give us something
concrete to answer. Make sure you can see that the answer to
the question will help you solve the problem you started out
with.” He raised his hand to write. “What about the response-
time project?”
Jim stared at the board. “OK, the objective is ‘To explore/
describe if the response time meets the national standard.’ So—the
research question should be ‘Does the response time meet the
national standard?’ That is exactly what I need to find out.”
“Good,” Ty said. He repeated the words deliberately as he wrote
to let the question sink in: “Does the—response time—meet the—national—standard?” He
turned back to Jim. “But before you ask this question, I think you need to know something
else. How will you know if the Department’s response time meets the standard? What about
the ‘explore/describe’ part of the objective?”
Jim scowled. “What about it?”
“Well, do you know the current average response time for the Rockwood Fire Department?
I think you mentioned that you keep response-time data for each call. Do you know the
average?”
Jim shook his head.
“So, you need to ask first,” Ty articulated as he wrote, “What is the average response time
at the Rockwood Fire Department?” He sketched (1) next to this question and (2) next
to the other question. Without pausing, Ty then moved to the right side of the board and
looked at Jim. “Now let’s look at the Alternative Service Delivery Model project. This is prob-
ably going to be more complicated, but let’s see how it works out. How would you phrase
your question here?”
Jim looked at the objective written on the board: To confirm/test if an alternative model
is more efficient than the existing model. “How about,” he said uncertainly, “What service
delivery models are more efficient?”
Ty nodded, but looked unconvinced. “All right. But we need more focus. When you say
‘efficient’, how do you know the model is efficient or not? What do you have in mind to
gauge the efficiency of the model?”

Jim answered, “Well, I guess there are many ways to gauge efficiency, but for me the key
things are monetary cost and mortality rate. If one model costs less per call than the other,
then the one that costs less is more efficient, but only if the mortality rate is no worse.”
While listening, Ty wrote on the board “efficiency-cost” and “effectiveness-mortality
rate.” He said thoughtfully, “Let’s make this into two parts. Your idea of efficiency involves
two things: how much it costs in resources, and how effective it is in producing valuable
results. Would you agree?” When Ty saw Jim nodding his head, he continued, “Then let’s
rephrase your problem statement and objective to reflect these two features. It all adds up
to efficiency, but let’s add the word ‘effectiveness’ in there to represent the two things you
want to know.” Ty wrote on the board and drew insertion arrows under the problem state-
ment and objective, so the objective now read “To confirm/test if an alternative model is
more efficient and effective than the existing model.” “Does this help focus your question?”
Jim read the board and said, “How about, ‘What service delivery models are more effi-
cient and effective?’” He hoped this was the right answer.
Ty responded carefully. “That is really two questions. We could break it apart. But first,
let’s think about what we want to know. You don’t intend to test any alternative model you
can imagine. You have a specific alternative in mind, right?”
“That’s right,” Jim said. “Chief Chen wants to know if sending a physician’s assistant and
a firefighter by regular car to the emergency call first is more efficient than the existing
model. So, how about ‘Does sending a physician’s assistant and a firefighter in a car to
medical calls reduce cost and mortality’?”
As he wrote on the board, Ty said, “That’s good. I’m just going to split that into two
questions.” Under Efficiency–Cost he wrote, “Does sending a physician’s assistant and a
firefighter in a car to medical calls reduce call cost in comparison to the existing model?”
Under Mortality–Effectiveness he wrote: “Does sending a physician’s assistant and a fire-
fighter in a car to medical calls reduce mortality in comparison to the existing model?”
Looking at the board, Jim was a little surprised. The process seemed too simple to be
helpful, but he admitted to himself, “This really organizes what I want to know.”

Focusing Your Research Questions


Jim’s case illustrates how your research question needs to have a tight focus (Putman,
2009). When a research question is broad and amorphous, the research will be difficult
to contain and complexities will accumulate. Take the following example: a researcher
is interested in poverty (a broad topic) and would like to understand the circumstances
that contribute to poverty in the United States. The question, “What contributes to
poverty in the United States?” is a fair one, but so broad that it would take the
researcher an enormous amount of time (perhaps a career) and an entire team of
researchers to answer. The question obviously relates back to an important social prob-
lem (problem statement) and passes the muster of the so what test (poverty is an obvi-
ous social ill that we would like eliminate), but for most researchers and their
organizations, it is not manageable to answer the question with some degree of rigor.
As we will see in the rest of this book, the research question guides the type of
research design, the data to be collected, and the subsequent method of analysis
(Thomas & Hodges, 2010). The research question is a road map that indicates a basic structure for the following steps in the research process. You can also say that a research question defines the project’s scope of work. As with any project, there is always a danger that the scope of a research project will fall into the trap of scope creep. Having an interest in a large topic area makes it tempting during a project to visit interesting side trails along the way. As a researcher, you see these trails when you review the literature or collect your data or analyze the data. Although not completely unproductive, taking side trails delays progress on the original objective and can overburden your capacity to complete the project. Researchers should be mindful of the consequences of side trails. You may want to note these junctions for future projects.

Figure 3.3  Ty’s Board
It is natural to start a research process with a broad research question or a
research question that has multiple questions embedded within it (Booth et al.,
2008). It may take several steps and discussions with colleagues to narrow the focus
to a specific question, or questions, you can answer. Table 3.1 illustrates the process
in Jim’s case. Jim’s initial research question about efficient service delivery models was
broad and open-ended. Testing every possible model to see which one is more effi-
cient than the existing service delivery model would involve several research projects.
The question gained focus, first, by recognizing two components to the concept of
efficiency: input (cost efficiency) and output (effectiveness at saving lives). The ques-
tion gained further focus by acknowledging that a particular alternative model was
being considered. The result was two questions: comparing cost, and then mortality,
between the current service delivery model and sending a physician’s assistant and firefighter by car for emergency medical calls. The two research questions are focused and appear manageable. Each one indicates a focused path toward answering the question and solving the problem.

Table 3.1  Steps in Focusing the Research Question

Broad Question: What service delivery models are more efficient?
Focused Question: What service delivery models reduce cost and reduce mortality?
Even More Focused Question: Does sending a physician’s assistant and a firefighter in a car to medical calls reduce call cost and reduce mortality?
Actual Research Questions:
1. Does sending a physician’s assistant and a firefighter in a car to medical calls reduce call cost in comparison to the existing model?
2. Does sending a physician’s assistant and a firefighter in a car to medical calls reduce mortality in comparison to the existing model?
Taking time to formulate a research question by rephrasing your research objec-
tive allows you to examine what you want answered. As in Jim’s case, you may find
you have two questions, or multiple questions, that need separate attention. This is
not always clear in the statement of the research objective by itself. Articulating and
distinguishing multiple questions embedded in your research objective is an impor­
tant part of the process of tightening the focus of your research. As you proceed, you
will appreciate the clear focus at the beginning, because it keeps you on track to
achieve your objective.

Identifying Types of Research Questions


Earlier, we described two types of research objectives: (1) to explore and describe a
phenomenon, and (2) to confirm or test a hypothesized relationship. We then dis-
cussed briefly how these two different approaches related to theory and analysis.
Returning to the research question as the next step following the research objective, we
first emphasized the importance of using the research question to apply a clear focus
to the research objective. Now we are prepared to discuss how the two types of research
objectives—to explore/describe, and to test/confirm—help frame the research ques-
tion by anticipating a certain type of results.
In Jim’s case, we saw the research objective and questions for his response-time
project fit the explore/describe approach. In this case, the research question needs to
be phrased as a descriptive research question, where the answer is expected to docu-
ment the existence and status of something (McNabb, 2008). In Jim’s case, the
response-time question needed to be phrased so the answer told him the response time
at Rockwood Fire Department. Only after he knows the current status of the response
time will he be able to answer whether it meets the national standard. We will
discuss descriptive research further in Chapter 6 and Chapter 14.
Chapter 3  Identifying the Focus of the Research  ❖  41

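A descriptive research question is typically answered with simple summary statistics. The sketch below is a hypothetical illustration (the response times are fabricated, not Rockwood’s actual data) of how recorded response times might be summarized:

```python
from statistics import mean, median

# Fabricated emergency response times in minutes; in practice these
# would be retrieved from the department's dispatch records.
response_times = [5.2, 6.1, 4.8, 7.0, 5.5, 6.3, 5.0, 5.9]

# A descriptive question simply documents current status: here,
# the average and the typical (median) response time.
print("Mean response time:", round(mean(response_times), 2))
print("Median response time:", round(median(response_times), 2))
```

The resulting figures could then be compared against a national standard, which is the comparison Jim’s question ultimately calls for.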
With the confirm/test approach, your research question needs to be phrased so the
answer will confirm or deny a hypothesis. This is more complicated than the descrip-
tive approach, because a hypothesis seeks to explain a relationship of something to
something else, and the relationship can take several forms. There are three principal
types of hypothetical relationship: (1) differences among groups, (2) correlation, and
(3) cause and effect. This means, when your research objective is “to confirm or test a
hypothesis,” you have three different ways you can phrase your research question.
These types of questions are briefly described below. We will discuss hypothesis testing
further in Chapter 7.
When you have a group difference research question, you need to be able to
define what distinguishes the groups. In Jim’s case, notice how the research ques-
tion for the alternative service delivery project specified the alternative model
group in comparison to the existing model group. The groups in this kind of ques-
tion need to be clear enough that you know you will be able to identify and mea-
sure them.
Groups can be defined in many ways. A naturally occurring grouping of people
defines and distinguishes groups by a combination of individual characteristics and
conditions, such as gender, race, educational level, status, location, participation in
a certain activity, or membership in an organization. In this kind of grouping, the
researcher specifies the qualities of interest and studies the people found to fit the
definition. A researcher can also create groups through an assigned grouping of
people. We saw an example of this approach in Emily’s case in the last chapter where
she considered assigning half of the employees at the city to attend training as an
experimental group to compare to a second group of employees who did not attend
the training. As we saw there in the discussion of random assignment, the researcher
needs to consider a systematic manner to assign individuals to the comparison
groups. We will examine how to analyze group differences in Chapter 8 and
Chapter 9.
A correlational research question hypothesizes that a characteristic of one thing
is related to a characteristic of another thing (Spector, 1981). The thing can refer to
individuals, conditions, objects, or events. An example of a correlational research ques-
tion might be as follows: What is the relationship between family size and median
household income in the United States? This kind of question expects to find a system-
atic pattern of differences according to the quantity or type of one thing in relation to
the other; in other words, a correlation. We will discuss correlational research further
in Chapter 10 and Chapter 12.
A causal research question hypothesizes that one factor X is a cause of the effect
Y. For example: Does giving tax incentives to a business cause more businesses to come
to the region? Various research objectives can be framed as a causal research question.
In Jim’s case, he might have phrased one of his research questions to ask: Does the
alternative service delivery model cause the cost to be lower? Jim has an interest in
identifying some causal relationship between the alternative service delivery model
and lower cost. We will discuss how a group difference research question can be set up
to address causal relationships in Chapter 4.

Literature Review
We suggested in the last chapter that the first steps in the research process should
include reviewing the existing research and literature. This will help your thinking
process in setting up your research. One reason for the literature review is to gather
information on the topic of your research. It will also be helpful to know what has
already been done by other researchers.
You can use a variety of sources in a literature review, but the information you
gather should be credible. It is recommended that you look at the academic literature—
books, journals, and reports—documenting research on your topic. Other credible
sources include various government agencies, and with somewhat more caution, var-
ious nonprofit interest organizations and think tanks. You will want to find the most
current research available or nearest to the dates you are researching. Very good
research may be found that is distant either in time or place, and you may be able to
include it, but some caution will be necessary, due to cultural changes and differences
from your own situation—particularly if the source is decades old or originates from a
different country.
To start your literature review, you will need to find a way to search for relevant
information sources. You can use a regular Web search engine or a dedicated search
engine, like Google Scholar. General searches can prove rewarding. Some academic
articles may get listed that are available for free. You may also find organizations that
specialize in your topic and provide research reports and other resources. Recognize,
however, that a wealth of materials on your topic exists that will not show up in a gen-
eral Internet search. You will need to find search engines that cover sets of academic
journals and other materials related to your topic, or tap an index of reports by a par-
ticular government agency or organization.
For academic sources, your public library is a good place to start. Members may
be able to use a number of search engines that will access academic journals, either in
general or with a specific focus. Free access to full articles may be available for a num-
ber of journals, and others will at least show the article abstracts, so you can determine
if making the effort to track down the full article will be worthwhile. University and
college libraries have the best online access to academic journals, but there are strict
legal restrictions that prevent them from sharing online resources with persons unaf-
filiated with the institution. You may acquire access, however, by using computer ter-
minals in the library itself.
Note that even a university library will not provide online access to everything
that is available. Libraries are selective in what they purchase. Different libraries will
provide full online access to a different list of journals. Also, much more is likely to
be available in the paper stacks. You may be able to find the journal with the article
you want listed in the physical collection of the library, and you can locate and copy
it. This may apply to new, more obscure journals, and to older articles published
before the electronic age. Admittedly, though, online access is easier. Even older
materials may be available from your desk. Many journals have digitized their entire
publication archive. Some online search engines, like JSTOR, specialize in older
articles back to the 19th century. Others specialize in newspapers and other miscel-
laneous materials.
Books have also become more accessible with online resources. Library search
engines can access book reviews published in journals, and online book vendors or
reader groups frequently offer synopses and reviews. These resources can help you
identify books you want to look at, as well as authors you can use in your online
searches.
A few tips may help your harvest. Start your literature review by typing a few key
words into a search engine. Initially, you will most likely get a huge number of hits,
with many irrelevant items. The first few sources you review can help you narrow your
search by giving you more specific terms: an author, a title, or a publication year. Also,
pay attention to the references cited in the materials you review. A general Internet
search on a full title obtained from a reference can return a surprising number of items:
possibly a free copy of the article or book and other materials that reference the title.
Once you find a few good sources, you can check an online citation index (if you have
access to one) and look for other books and articles that cite the ones you have. These
newer sources are likely to be relevant to your topic.
The search process is a big part of the literature review, and any one effort is
likely to be incomplete. This is why conducting more than one search is recommended.
New sources will emerge with each search. You will also want to keep
searching the literature during the course of your research to stay informed of
newly published materials.
Once you have a stack of promising materials to review, you will want to pay
attention to common topics that are well researched and established and any com-
mon findings and themes. Note specific findings related to points you are interested
in. At the same time, note the gaps in the research. In Jim’s case, for example, we can
imagine that he found studies that examined the existing model of sending four
firefighters to emergency medical calls, with results showing that the majority of the
medical emergency calls did not require firefighters or fire engines. This would sup-
port the idea of an alternative service delivery model. Some studies may have tested
alternative models, but none tested the model he was interested in: sending out a
physician’s assistant and a firefighter in a car. Nor could he find anything on effec-
tiveness at reducing mortality. Information about what he learned, and where there
were gaps, will be important when he develops his own research project to show
what is needed.
A good strategy for combining what you learned from your review is to first write
out the relevant findings, with key data elements, from each study. Then look at the
whole and note gaps where essential points were not addressed. When you start writ-
ing your review, organize the key information in a logical order according to your
objective. This will take a few iterations. You will probably include a number of things
in the first draft that appear interesting and relevant but are only tangentially related.
The literature review should not be a list of summaries like an annotated bibliography.
The more it drifts, the less convincing and the more tedious it becomes. Stay on point. Also, the
analysis needs detail, but not too much detail. The literature review is usually not the place
to insert tables of results. Draw in only enough detail to make your points. At the end,
summarize what you learned and state how your research contributes to what you still
want to know.
The content of your topic is not the only thing to glean from your literature
review. You will also want to pay attention to how other researchers approached the
topic. How did they focus their research? Were they exploring and describing, or
confirming and testing hypotheses? How did they design the research? What did
they choose to measure? How did they collect their data? How did they analyze the
data? Learning from others will inform your own research process. Probably, you
will miss the importance of some of these issues when you first start, but they could
become riveting once you face a specific challenge, like needing to design a survey
or align your research design with an appropriate statistical test. Document all your
sources as potential items for your reference list, and keep the most important ones
handy to reread.
The literature review is a critical part of your research process. This is where you
develop an in-depth understanding of your topic and the research practices related to
your topic. That familiarity is not really an understanding, though, until you manage
to write it out in a crisp, logical summary. The writing part of a literature review is not
easy if you are not used to doing it, even if you are a good writer. It takes practice to
get a handle on the topic, first of all, and then point it to support the objective of your
research. When you review other research reports, among the points of interest, also
pay attention to how researchers write their literature reviews. Good examples can be
found by authors such as Aveyard (2011), Fink (2010), Galvan (1999), Machi and
McEvoy (2012), and Ridley (2012).

Chapter Summary
This chapter corresponds to steps 1 and 2 of the research flow introduced in Chapter 2.
We described how to focus your research, first by establishing a clear research objective centered
on a problem or issue of concern. The next step is to formulate a focused research question, or
multiple questions, based on the research objective. Jim’s case illustrated this process. Further,
continuing the theme from Chapter 2, we emphasized the importance of aligning the research
objective with approaches to theory building and data analysis. We distinguished types of
research according to different types of research objectives, approaches to theory building, data
analysis, and three types of research questions. Finally, we provided guidance on how to conduct
a literature review.

Review and Discussion Questions


1. How would you describe the research objective and research question for Mary’s research?
2. Identify the unit of analysis for Jim and Mary’s research projects.
3. Find a research report or a journal article that is based on research. Identify the research objec-
tive and the research question(s). Are they taking an inductive or deductive approach? Is the
type of the research objective to explore and describe or to confirm or test a hypothesis?
4. Take the same research report or a journal article, and outline the literature review. What are
the main ideas summarized in the literature review? What is the connection between the ideas
summarized in the literature review and the research questions?
5. What are some of the potential issues that might arise if your project does not have a clear
research objective?
6. Compare the inductive and deductive approaches to research. What areas of research are you
interested in that might be more amenable to one approach over the other?
7. Why is quantitative data often more suited to deductive approaches?
8. How can the literature review assist with the alignment of your research process?

References
Aveyard, H. (2011). Doing a literature review in health and social care: A practical guide. Maidenhead, UK:
McGraw-Hill/Open University Press.
Berman, E. M., & Wang, X. (2012). Essential statistics: For public managers and policy analysts. Thousand
Oaks, CA: Sage.
Booth, W. C., Colomb, G. G., & Williams, J. M. (2008). The craft of research. Chicago, IL: University of
Chicago Press.
Fink, A. (2010). Conducting research literature reviews: From the internet to paper (3rd ed.). Thousand Oaks,
CA: Sage.
Galvan, J. L. (1999). Writing literature reviews: A guide for students of the social and behavioral sciences. Los
Angeles, CA: Pyrczak.
Hoover, K., & Donovan, T. (2010). The elements of social scientific thinking (10th ed.). Boston, MA:
Wadsworth Cengage Learning.
Loseke, D. R. (2013). Methodological thinking: Basic principles of social research design. Los Angeles, CA: Sage.
Machi, L. A., & McEvoy, B. T. (2012). The literature review: Six steps to success. Thousand Oaks, CA:
Corwin.
McNabb, D. E. (2008). Research methods in public administration and nonprofit management: Quantitative
and qualitative approaches. Armonk, NY: M. E. Sharpe.
Polonsky, M. J., & Waller, D. S. (2011). Designing and managing a research project: A business student’s guide
(2nd ed.). Thousand Oaks, CA: Sage.
Putman, W. H. (2009). Legal research, analysis, and writing (3rd ed.). Clifton Park, NY: Delmar Cengage
Learning.
Remler, D. K., & Van Ryzin, G. G. (2011). Research methods in practice: Strategies for description and causa-
tion. Thousand Oaks, CA: Sage.
Ridley, D. (2012). The literature review: A step-by-step guide for students. London, UK: Sage.
Robson, C. (2011). Real world research: A resource for users of social research methods in applied settings.
Chichester, UK: Wiley.
Spector, P. (1981). Research designs. Thousand Oaks, CA: Sage.
Thomas, D. R., & Hodges, I. D. (2010). Designing and managing your research project: Core knowledge for
social and health researchers. London, UK: Sage.

Key Terms
Assigned Grouping  41
Causal Research Question  41
Confirm/Test the Hypothesized Relationship  32
Correlational Research Question  41
Deductive Approach  34
Descriptive Research Question  40
Descriptive Statistics  35
Explore and Describe the Phenomenon  32
Group Difference Research Question  41
Inductive Approach  34
Inferential Statistics  36
Naturally Occurring Grouping  41
Qualitative Data  35
Quantitative Data  35
Research Objective  30
Research Question  37
Research Topic  32

Student Study Site


Visit the Student Study Site at www.sagepub.com/nishishiba1e for these additional learning tools:

•• Data sets to accompany the exercises in the chapter


4 ❖
Research Design

Learning Objectives 48
Identifying Research Design 48
Emily’s Case 48
Mary’s Case 49
Research Design: A Game Plan 49
Types of Research Design 50
Conditions for Cause and Effect 51
Temporal Precedence 52
Covariation of Cause and Effect 52
No Plausible Alternative Explanation 53
Key Elements of Experimental Research Design 56
Variations of Quasi-Experimental Research Design 59
Jim’s Case 59
Making a Causal Argument Based on the Experimental Design 63
Jim’s Case (Continues) 63
Other Variations of Experimental and Quasi-Experimental Design 66
Ethical Considerations in Experimental and Quasi-Experimental Design 69
Chapter Summary 69
Review and Discussion Questions 70
Key Terms 71
Figure 4.1 Types of Research Design Based on
When the Data Are Collected 50
Figure 4.2 Temporal Precedence 52
Figure 4.3 Experimental Research Design Illustration 57

Figure 4.4 Quasi-Experimental Research Design Illustration 58


Figure 4.5 Jim’s Design Options 63
Figure 4.6 Jim’s Time Line 65
Figure 4.7 Graph of Change in Outcome: Suggesting Causation 66



Learning Objectives

In this chapter you will

1. Learn different types of research design


2. Learn the concept of validity
3. Learn about threats to validity
4. Learn how to align the research design to answer the research question

Identifying Research Design

Emily’s Case
Emily, HR director at the city of Westlawn, brought her research team together
to share what she discussed with Ahmed, the Community Foundation program
officer. The team included the training manager, Mei Lin, and a graduate student
intern, Leo. Emily explained that she now had two research questions that would
help them focus their evaluation of the training: “Does the training improve
people’s cultural competence?” And, “Does the training decrease workplace
tension?” She shared her idea to measure cultural competence and workplace ten-
sion before and after the training to assess the impact. She mentioned the idea
of splitting each department into two groups, so half would participate in the training
before the others. Then the team could measure the level of cultural competence and
workplace tension and compare the two groups at that point. She tasked Leo to find as
much literature as possible that discusses training evaluation, measuring cultural com-
petence, and workplace conflict. She asked Mei Lin to identify multiple scenarios for
rolling out the training. They decided to have a weekly meeting to discuss how to imple-
ment the project.
Emily told Mei Lin and Leo, “It looks like this is going to be a lot of work, but I really
want to do this right. I don’t want to be doing the training for the sake of training without
knowing what kind of impact it has on our employees. I believe focusing on the research
design before we launch the training is important. I appreciate both of your help on this.”
Chapter 4  Research Design  ❖  49

Mary’s Case
Mary, volunteer manager at Health First, was thinking of her friend Yuki’s advice to con-
duct a series of long interviews with her volunteers instead of administering a survey. She
agreed that the idea of an in-depth interview was more likely to
give her the information she wanted about recruiting and retaining
volunteers, but she was concerned. Her experience in graduate
school was with surveys, using quantitative data analysis and sta-
tistics. She knew how to interview people, of course, but she was not
sure how this could qualify as research. She had always thought
collecting a bunch of statements from people was too “soft” to be
scientifically legitimate. She worried what the board members
would think. There would be no numbers and charts to help her
make an impressive presentation. “How do I convince them of any-
thing?” she thought.
Later in the day, a package arrived at the office from Yuki. Two books were inside, and
a jotted note: “Mary, knowing you, I’m sure you have millions of questions about qualitative
research. Read these books first. Then call me. Enjoy!”
Mary was moved by Yuki’s thoughtfulness and prompt attention. She chose one of the
books and eagerly started reading.

Research Design: A Game Plan


Every research project needs a game plan to determine how an answer will be pro-
duced for the research question. The game plan is called a research design. In the
research flow outlined in Chapter 2 (Figure 2.1), the research design is Step 3, fol-
lowing the research objective and research question. A research design will estab-
lish a plan that includes the following elements: (1) the structure of when the data
are collected, (2) the time frame of the research, and (3) the number of data collec-
tion points.
There are numerous variations in research designs. Ethridge (2002) notes that
“research designs are custom-made rather than mass-produced, and we will rarely find
two that are identical” (p. 20). However, there are basic types of research, and it will be
useful to understand the strengths and weaknesses of common types of research design
applied to types of research. Some designs are suitable for a particular type of research and
not others.
When choosing a research design, there are some key factors that need to be
taken into consideration. Most important, the selected research design should
match the purpose of the research (Kumar, 2011). It should allow the researcher to
collect appropriate data that provides answers to the research question. Also, the
selected research design should fit the research objective. A research design to
describe and explore would be different from a research design to confirm and test
a hypothesis. In other words, the research design needs to be in alignment with the
research objective, research question, type of research, and the type of data
required.

Figure 4.1   Types of Research Design Based on When the Data Are Collected

[Schematic: timing of data collection (past, present, future) plotted against the number
of data collection points]

1. Collect now about now: 1 time
2. Collect now about the past: 1 time or multiple times
3. Collected in the past about the past: 1 time or multiple times
4. Collect now and in the future: 2 or more times

Types of Research Design

There are various ways to categorize types of research design. We have chosen to
organize them by when and how many times data are collected. Figure 4.1 provides a
schematic depiction of the organizing framework. In this framework, when the data
collection occurs is represented along the horizontal axis in three categories: past,
present, and future. Along the vertical axis, when refers to the character of the data,
in four categories: collected now about now, collected now about the past, collected
in the past about the past, and collected now and in the future. Along the vertical
axis, we also took into account how many times the data are collected: one time or
multiple times.
In this format, considering only when data are collected, four types of research
design are distinguished:
(1)  Collect data one time now about now. This research design is appropriate
when you are interested in finding out how things are at the present moment. We see
an example of this in Emily’s case, with her interest in identifying the current level of
cultural competence among city employees. If this is all she wanted to know, she could
administer a one-time survey to obtain the information. This type of survey approach
is referred to as a cross-sectional survey design.

(2)  Collect data now about the past. In this research design, the data could focus
on one event at a single time point or multiple events across multiple time points. We
see an example of this research design in Mary’s case, in her interest to ask volunteers
why they volunteered, which refers to information about past events. Sometimes this
kind of data can be collected in a survey. We saw, however, that Mary had difficulty
finding a way to capture what she wanted to know in a survey. When collecting data
about the past that stretches over a longer time period, not just one time point, a
researcher may want to consider an in-depth interview or oral history to capture the
information.
(3)  Collect data in the past about the past. A researcher might be interested in
data collected in the past only one time or multiple times over a period. Unlike the
previous type of research design, this research design does not depend on the recall of
an informant. We see an example of this research design in Jim’s case, with his interest
in response-time data since 2009. The times were recorded, so Jim can retrieve the data
from archived records, ranging from 2009 to 2011, and analyze the trend (trend
analysis) over multiple time points. This type of approach is referred to as secondary
data analysis.
(4)  Collect data now and in the future. This research design is typically used to
assess change over time. Data collected at present as baseline data are compared to
remeasurement at some point in the future. Remeasurement can occur multiple times,
according to the resources of the researcher. Data collected multiple times in the future
can be used to assess trends. This is similar to the previous research design, using
secondary data from the past for trend analysis. Typically, though, this research design
is used to assess the impact of an intervention and ascertain a cause-and-effect rela-
tionship. We see an example of this research design in Emily’s case, in her intention to
assess the impact of her cultural competence training. She is planning to conduct a
baseline measurement with a survey, and repeat the same survey at a later time to
observe any changes she could attribute to the effects of the training.

Conditions for Cause and Effect

When the objective of a research project is to confirm or test a hypothesized causal


relationship, the research design requires special attention (Shadish, Cook, & Campbell,
2002). The selection of the research design affects the level of rigor in making claims of
causality based on study results. Generally speaking, in order to establish a causal
relationship between A and B we need to meet the following three conditions:

•• Temporal precedence: Changes in A precede the observed changes in B;


•• Covariation of the cause and effect: Changes in B are related in a systematic
way to changes in A;
•• No plausible alternative explanation: No other factors are responsible for the
observed changes in B (Trochim & Donnelly, 2007).

Figure 4.2   Temporal Precedence

A → B (the change in A precedes the change in B)

Diversity Training (A) → Change in Level of Cultural Competence (B1)
Diversity Training (A) → Change in Level of Workplace Conflict (B2)

Temporal Precedence
When you attempt to establish that A causes B, one of the minimum conditions you
need to meet is that the changes in A happened before the changes in B. A change
that occurs prior to an event cannot be claimed to be caused by it. In Emily’s case,
the change in employees’ experience due to the diversity training (A) needs to pre-
cede any observed changes in the level of cultural competence (B1) and level of
workplace conflict (B2) observed in comparison to employees who did not receive
the training.

Covariation of Cause and Effect


Another condition you need to meet to claim that A causes B is that changes in A are
systematically related to changes in B. If the changes in B happen at random, regardless
of the presence of changes in A, then you cannot make a claim that A caused B. In
other words, A and B need to have a relationship.
If you observe that whenever A is present B is also present, and whenever A is
absent, so is B, there is a systematic relationship between A and B. This relationship is
typically described as a syllogism:

If A, then B
If not A, then not B

In Emily’s case, any changes in the level of cultural competence (B1) and the
level of workplace conflict (B2) need to be systematically related to the change
introduced by the diversity training (A). Putting the example of cultural compe-
tence in the syllogism illustrates why Emily needs a control group to establish a
systematic relationship:

If diversity training (A) is offered, then there is an outcome in cultural competence (B1).
If diversity training (A) is not offered, then there is no outcome in cultural
competence (B1).

A cause-and-effect relationship is not always binary (yes/no; present/absent).


In some cases you may be looking for a situation where a different amount of A
leads to a different amount of B. This relationship is described in a slightly different
syllogism:

If more A, then more (or less) B


If less A, then less (or more) B

In Emily’s case, she might later be interested in examining if any observed


changes in cultural competence from the first training increased still more for
employees who took additional diversity trainings. In this case, she would be looking
for a relationship between the quantity of training and the quantity of improved cul-
tural competence.

No Plausible Alternative Explanation


Once you establish a relationship between A and B through temporal precedence and
covariation, you then need to make sure the observed effect is not really due to some
other factor C that is also systematically related to A and B. To be certain that A is the
cause of B, you will need to eliminate all plausible alternative explanations for the
changes in B. In Emily’s case, she will need to show that no factors other than diversity
training (A) are responsible for any observed changes in the level of cultural competence
(B1) and workplace conflict (B2).
When you claim a cause-and-effect relationship exists between A and B, the extent
to which your claim is valid is referred to as internal validity. The presence of plausible
alternative explanations is a threat to internal validity. Following are eight common
threats to internal validity (Campbell, Stanley, & Gage, 1963; Cook & Campbell, 1979;
Shadish et al., 2002):

(1)  History threat. An external event can be a threat to the causal argument.
Recall Emily’s case in Chapter 2, when she was thinking of administering a survey and
measuring the level of cultural competence and workplace conflict before and after
the diversity training. She expected that any observed increase in cultural competence
would be due to the training. Ahmed, the Community Foundation program officer,
then countered that external events, such as the president giving a speech on race
relations or a work picnic, might occur at the same time and influence the results.
Ahmed was raising the possibility of a historical effect that could pose a plausible
threat to validity.
(2)  Maturation threat. People change over time. They learn and mature from their
daily experiences. They also grow physically and get older. This natural maturation can
impact the outcomes you are observing in your research and could be a plausible alterna-
tive to your causal argument. This possibility makes it important to consider, for exam-
ple, age or work experience differences among individuals in groups you are measuring.
(3)  Instrumentation threat. Researchers use a variety of instruments to measure
the phenomenon of interest. Instrumentation threat refers to a case when the instru-
ment itself could be influencing the result. In Emily’s case, consider the survey she
intends to use to measure cultural competence. If she uses a different survey before and
after the diversity training, with different wording or order for the same questions, her
results could be affected by differences in the instrument. A common example of the
instrumentation threat to validity appears in face-to-face interviews on sensitive topics,
where a respondent may shape answers to avoid a negative appearance. The instrument
can also refer to the people who collect the data. Over time, an observer might get
bored and pay less attention, or might learn from experience and change the way
observations are interpreted.
(4)  Testing threat. Similar to the instrumentation threat, the testing threat oper-
ates when measurement takes place more than once. Here, instead of the instrument
itself, the issue relates to a learning effect by the subjects being measured. In Emily’s
case, if she uses the same survey before and after the diversity training, there is a pos-
sibility that some employees will have thought about their earlier responses and
decided to change answers to a “right” answer that they think Emily wants to hear.
Similarly, if students are given the same math exam a second time, they might show
improvement that reflects experience with the particular questions on the exam, not
improved skill in the math involved.
(5) Mortality or Attrition threat. During a research study participants will
often drop out. The term mortality is used metaphorically (usually) to refer to the
attrition of the study participants. Participants may drop out for particular reasons,
perhaps because they performed poorly in a baseline assessment or for other reasons
that distinguish them from participants who stay in the study. If attrition is random,
there may be no consequence, but a systematic change in the people in your study is
likely to affect your results. It will at least make it plausible that there is a threat to
validity, and you will need to address the issue to avoid criticism of the results. In
Emily’s case, suppose a number of employees refuse to take the cultural competence
survey following the training. If the results of the survey show improvement, she will
need to consider the possibility that those who dropped out do not endorse the idea
of embracing diversity in the organization and were responsible for lower average
scores on the initial survey. Dropping out can be a form of protest among individuals
who are systematically different from those who continue to participate.
(6)  Regression threat. The regression threat is also known as regression artifact or
regression to the mean. It refers to a statistical phenomenon in which extreme scores
from a nonrandom sample tend to move closer to the population mean when the sample
is measured a second time. (Note: we will discuss nonrandom samples and populations
further in Chapter 5.) Scores from a nonrandom sample may include extreme scores in
the first measurement. However, when the same sample is measured again, it is less
likely that the extreme scores will persist. In other words, even if you do nothing to the
sample, an extreme score is likely to move closer to the mean when measured a second
time. This threat to validity depends on the amount of variation possible in the value
being measured.
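A short simulation can make regression to the mean concrete. In this sketch (all values invented), each person’s noisy score is measured twice; nothing happens between measurements, yet the most extreme first-round scorers drift back toward the population mean the second time:

```python
# Simulate regression to the mean: true scores average 50, but each
# measurement adds random noise. Select an extreme (nonrandom) sample
# based on the first measurement, then look at their second measurement.
import random

random.seed(1)  # fixed seed only so the illustration is reproducible
true_scores = [random.gauss(50, 5) for _ in range(1000)]
first = [t + random.gauss(0, 10) for t in true_scores]
second = [t + random.gauss(0, 10) for t in true_scores]

# The 50 highest scorers at time 1 form a nonrandom, extreme sample.
top = sorted(range(1000), key=lambda i: first[i], reverse=True)[:50]

mean1 = sum(first[i] for i in top) / len(top)
mean2 = sum(second[i] for i in top) / len(top)
print(f"extreme group mean, time 1: {mean1:.1f}; time 2: {mean2:.1f}")
# The time-2 mean falls back toward 50 even though nothing was done.
```

If such an extreme group had also received an intervention between the two measurements, part of its apparent improvement would be this statistical artifact rather than a treatment effect.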
(7)  Selection threat. Comparing two groups is a common procedure research-
ers take to establish causality for an intervention offered to one of the groups. If an
outcome changes for the group with the intervention (called the experimental
group), but the outcome does not change for the group with no intervention (called
the control group), then you have a basis to argue that the intervention caused the
change in outcome. This approach is called an experimental design. We will discuss
this kind of research design in more detail in the next section of this chapter. When
two or more groups are compared, attention needs to be given to the possibility that
the compositions of the groups are systematically different from each other and thus
not comparable. If the groups are different to begin with (selection bias), then any
difference in results observed following an intervention could be due to the original
difference in the groups and not from the intervention. Notice in Emily’s case, when
she decided with Ahmed to offer the training first to half of the employees in each
department to compare to the other half who would not take the training, the issue arose
how the employees would be selected. If Emily allows people to sign up for the train-
ing (self-selection), then a selection bias could occur. Very possibly, those employees
most interested in the diversity training would sign up first. In that case, improved
survey results on cultural competence following the training could be due to their
interest and predisposition to be influenced by the training. Emily and Ahmed
agreed that the employees would need to be randomly assigned to take the training
to avoid this kind of selection bias in the composition of the groups. Note that even
when groups are selected by random assignment, researchers usually examine the
resulting composition of the groups by age and other factors to assess any differences
that might have occurred in the selection process.
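Random assignment is straightforward to carry out in practice. The sketch below shows one way Emily might do it; the employee names and fifty-fifty split are invented for illustration:

```python
# Randomly assign a list of employees to a treatment (training) group
# and a control (no training) group by shuffling and splitting the list.
import random

employees = [f"employee_{i}" for i in range(1, 21)]  # hypothetical roster

random.seed(42)  # fixed seed only so the illustration is reproducible
random.shuffle(employees)

half = len(employees) // 2
treatment_group = employees[:half]  # takes the diversity training
control_group = employees[half:]    # does not take the training

print(len(treatment_group), len(control_group))  # → 10 10
```

Because the shuffle gives every employee the same chance of landing in either group, no preexisting characteristic, including eagerness to sign up, can systematically bias the composition of the groups.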
(8)  Selection interaction threats. The selection threat to validity can also interact
with other threats to internal validity. Variations are described below:

•• A selection–history threat could occur if individuals in two groups experi-


ence an external event differently; for example, due to differences in a
preexisting attitude or different reporting of the event.
•• A selection–maturation threat could occur if two groups mature differently;
for example, due to gender or socioeconomic differences.
•• A selection–instrumentation threat could occur, for example, when responses
from two groups are measured with two different survey instruments.
•• A selection–testing threat could occur if differences between two groups


influence the way they respond or learn from exposure to repeated testing;
for example, due to perceived burden and inattention or learning to avoid
stigma by finding the “right” answer.
•• A selection–regression threat could occur whenever an undetected selection
bias exists in the composition of two groups. At
first measurement, the groups could appear similar and only later appear
different, due to regression toward different original conditions. Random
assignment should control for this possibility. The threat is more com-
mon in situations where researchers select extreme cases for an interven-
tion and then find improvement occurs. If variation is possible in the
value that was used to select the sample, then part of the improvement
could be attributable to regression to the mean (Barnett, van der Pols, &
Dobson, 2005).

Key Elements of Experimental Research Design

When the purpose of your research is to test if there is a cause-and-effect


relationship, you must develop a research design that meets the three conditions
elaborated in the above section: (1) temporal precedence, (2) covariation of the
cause and effect, and (3) no plausible alternative explanation. Specifically, you will
need an experimental design or quasi-experimental design (Fisher & Bennett,
1990; Shadish et al., 2002). In an experimental design, data are collected before and
after an intervention or treatment (i.e. pretest/posttest) with an experimental group
and a control group, both randomly assigned. This design meets all three conditions
for causality and is considered the most rigorous research design for making a
causal argument. The quasi‑experimental design has the same kind of group
comparison before and after a treatment or intervention, but group assignment is
not random. In Figure 4.1, both of these research designs belong to the type collect
now and in the future.
There are five key elements in the experimental design: (1) observations, (2) treat-
ments or interventions, (3) groups, (4) assignment to group, and (5) time. In this
section, we will explain each element and introduce notations that are fre-
quently used. Figure 4.3 shows how the notations are used to illustrate a research
design:

(1) Observations. Observations are your measurement results, focused on the


outcome or effect you are testing in your study. For example, in Emily’s case, she is
hypothesizing that the diversity training will have an outcome in cultural competence
and workplace conflict. She has two observations: one before the training and one after
the training. The notation O is typically used to refer to the observations. Subscripts,
such as O1, O2, and so on, are used to distinguish different sets of results or different
types of measures used for the observations.
Figure 4.3   Experimental Research Design Illustration

            Time:   T1         T2
R                   O1    X    O2
R                   O1         O2

[In the diagram, R indicates random assignment; T1 and T2 indicate the first and second times an observation was made; O1 and O2 indicate the observations made at Time 1 and Time 2 for each group; X indicates the intervention.]

(2)  Treatments or interventions. Treatments or interventions are the hypothesized
cause that is supposed to lead to a desired outcome (Judd & Kenny, 1981). In Emily’s
example, diversity training is the hypothesized cause for a desired increase in the
employees’ cultural competence and reduction in workplace conflict. The notation X is
typically used to refer to an intervention or treatment.
(3)  Groups. When multiple groups are involved in a study, each group
is given a line in the description of the design. For example, if the notation of the research
design has three lines, that means the study involves three different groups. In Emily’s
case, if she decides to split the city employees into two groups—one group of employees
to take the diversity training and another group to not take the training—then the
description of her research design will have two lines. In this kind of experimental or
quasi-experimental research design, the group that has a treatment or intervention is
called the experimental group and will have an X in the line. The group that does not have
any treatment or intervention is called the control group and will not have an X in the line.
(4)  Assignment to Group. When there are multiple groups involved in a study,
you will need to decide how to assign study subjects to the groups. There are two ways
Figure 4.4   Quasi-Experimental Research Design Illustration

            Time:   T1         T2
NR                  O1    X    O2
NR                  O1         O2

[In the diagram, NR indicates nonrandom assignment; T1 indicates the first observation, made before the training, and T2 the second, made after the training; O1 and O2 indicate the first and second measurements of cultural competence; X indicates the intervention, i.e., the training.]

to assign study subjects to the groups: random assignment and nonrandom assign-
ment. Random assignment refers to the case when all study subjects are given an
equal chance to be assigned to one of the groups in the study. Nonrandom assignment
refers to the case when the assignment of the study subject is not randomized. Random
assignment is preferred to assure the groups are roughly equivalent and comparable.
Many factors, both known and unknown, could make the individuals in one group
different from a second group. Even deliberately matching certain characteristics to
make the groups appear comparable could still leave a selection bias in the composi-
tion of the groups. Random assignment is designed to overcome any selection bias by
giving each study subject an equal chance to be assigned to one of the groups.
Nonrandom assignment of study subjects to groups creates what is called nonequiva-
lent groups (Fisher, 1970). We will discuss different methods for randomly or nonran-
domly selecting study participants in Chapter 5. The notation R is used to denote
random assignment, and NR is used for nonrandom assignment.
(5) Time. One of the conditions in establishing causality is the temporal prece-
dence of cause before effect. If a treatment or intervention is hypothesized as the cause
of a certain effect, then it must occur prior to the effect. A researcher must be careful
about the timing of observation and intervention periods to make sure the temporal
order is maintained. In the description of the research design, time moves from left to
right; elements listed on the left take place before the elements listed on the right. The
typical notation of time is T. When the outcome is measured multiple times, sub-
scripts, such as T1, T2 and so on, are used to distinguish different time points or times
for certain measures in an observation.

Variations of Quasi-Experimental Research Design

In the social sciences and applied research, it is frequently not possible to randomly
assign participants to groups. For example, people are already residents of certain geo-
graphic areas, children are already assigned to classrooms, programs may already be in
operation, and policies already implemented. In addition, in public service work, ethi-
cal and legal constraints may prevent randomly exposing a particular group of people
to a specific service. Whatever the reason, the given reality sometimes makes it impos-
sible for a researcher to randomly assign study subjects into groups. Considering such
factors in applied settings, a cause-and-effect research design may need to use a quasi-
experimental design as the only feasible choice. In the following sections, concluding
this chapter, we will introduce a variety of quasi-experimental approaches. Jim’s case
will allow us to describe practical examples of an after-only design (or posttest only
design), a before-and-after design (or pretest/posttest design), and a before-and-after
two group design (or pretest/posttest two group design). In the final section, we will
describe other variations that use additional groups or observation time points.

Jim’s Case
Jim, deputy fire chief at the city of Rockwood, felt more confident
with his research projects after meeting with his professor friend, Ty.
He decided he needed to work on the alternative service model
project first, because Chief Chen wanted to submit a budget pro-
posal to the city council. Originally, he intended to call a few other
jurisdictions to see what experience they had with alternative ser-
vice models, but Chief Chen pointed out that no other jurisdiction
in the state had adopted the alternative model he had in mind:
sending a physician’s assistant and a firefighter in a car to medical
calls. Also, Ty had stated that the research objective was to “test a
hypothesis,” which made Jim think of setting two cars side-by-side on a race track. He
needed to test the model in Rockwood, or make it apply to Rockwood. Other jurisdictions
were different in size and population, and he wasn’t sure results from somewhere else would
be applicable. The trouble was he still didn’t know how to start. He called Ty and asked if
he had time to meet again.
In the fire station conference room, both men sat across from each other at the long
table. Ty asked Jim what he had so far.
“I want to test this alternative service delivery model,” Jim started, “but I really don’t see
how, unless we just do it and track those things we talked about last time, track the cost
of the operation for efficiency, and see if the mortality rate goes down for effectiveness.”
Ty stood up, gesturing to Jim that he was going to follow up on that idea. He wrote on
the board:

After-Only Design
X O

Ty turned back to Jim and said, “If that’s the only way you can implement the program,
that’s one way to do it. This is called an ‘after-only design.’ The ‘X’ here represents the new
model, and the ‘O’ represents your measurement of cost and mortality sometime after you
implement it.” He wrote the text in parentheses under the symbols: “alternative service
program” and “cost/mortality rate.”
“But wait,” Jim interjected. “We have data on the cost of operation and the mortality
rate for medical calls under the current service model. So if we introduce an alternative
model, we can compare before and after we introduce the alternative model.”
Ty grinned, turned around and continued writing on the board. “Good. Now you have a
‘before-and-after model’.” He started a new line:

Before-and-After Design
O X O

“This is better than the ‘after-only’ design,” Ty said as he wrote. “You compare service
data before and after the implementation of the alternative model. In research design lan-
guage, you have a pretest and a posttest, with an intervention in between.”
Jim was glad Ty liked his idea. But then Ty asked another question.
“Jim, do you see any problem with this approach?”
Jim thought, “Problem? What’s the problem?” He stared at the symbols on the board.
Then something occurred to him. “Actually, there could be a problem,” he said. “I know the
number of medical calls changes during the year, and even from year to year, depending
on the weather, certain holidays, like July fourth and New Year’s, and I don’t know what
else, but I do know the numbers go up and down. And the severity of the incidents can be
different, too. So if we start the alternative model and find good results, we still can’t be
sure that it’s due to the model, or due to fluctuations in what’s happening.”
Ty looked pleased again. “Exactly. We need to rule out all other plausible explanations
for any improvement we observe. Any thoughts on how to do that?”
“It would help, I guess, if we ran the model for a whole year, so it covers the same holi-
days, but I’m not sure that would account for everything. Plus, I don’t think we could run a
test that long without knowing if it’s working. So, I don’t know.” He looked mischievously
at Ty, “You tell me, professor.”
Ty laughed. “All right. First let me tell you that you’re right in everything you just said.
What we need here to solve the problems you mentioned is a control group. We need to
start the alternative model with one group—call that the experimental group—and continue
with another group that keeps the existing model of service delivery—call that the control
group. Then we set them in operation with the same external circumstances over the same
period of time.”
Ty started to write a new set of lines on the board. “This way,” he said, “you can compare
the results between the two groups and decide if the alternative model had an effect.”
When Ty stepped aside from the board, Jim could see the new lines:

Before-and-After Two Group Design


O    X    O
O         O

Staring at the notation, Jim got the idea. “So, you want me to have some stations adopt
the alternative service model, and some stations continue with the existing model, and
measure them both sometime before and after we implement the alternative model?”
“That could be one way to do it,” Ty answered. “Is that feasible?”
“I guess,” Jim replied. “We have eight stations in Rockwood, so four could adopt the
alternative model, and the remaining four could continue usual practice.” As he formulated
this idea, the advantages became apparent. “Actually, the council may like that idea. We
won’t have to change everything at once, just test the alternative model on a smaller scale
for a while. That will be cheaper.”
Then a new problem occurred to him. Jim knew the different stations served neighbor-
hoods with different numbers and kinds of medical calls. “Wait a minute,” he said suddenly.
“This doesn’t solve anything. We still have one group with a set of external circumstances
that are different from the other group. You can’t compare these groups either.”
Ty took the challenge in stride. “Good point. That’s exactly what I was going to ask you
next. How do you think we should select who uses the alternative model and who uses the
current model? Ideally, we would toss a coin whenever a medical call comes in to any of
the stations, and depending on whether we get heads or tails, send out either four firefight-
ers and an engine or a physician’s assistant and one firefighter in a car. Then add up our
observations for cost and mortality for the two groups at the end.”
Ty drew a line down the center of the board and at the top of a new column wrote:
Group Assignment.
“A coin toss?” Jim muttered. “How can that be scientific?”
Ty picked up on Jim’s unease. “A coin toss would assure that all the medical calls have
an equal chance to be assigned to one model or the other. This kind of random assignment
would eliminate bias in selecting which calls go into each group, and the two groupings of
calls would be, in theory, as equal as possible in their characteristics.”
Jim thought a moment about the different kinds of medical calls coming in—some severe
and life threatening, some with urgent injuries or health problems, and some more fright-
ening than anything else—and thought of a coin toss sending each one randomly to one
model or the other. Sure, that might make a fair distribution. But the practical issue con-
cerned him more. “That’s impossible,” he said firmly. “We can’t equip every station with both
models and then keep everyone on call, waiting for the dispatcher to toss a coin and send
only some of them out.”
“I suspected that,” Ty said. “So, tell me about the stations. At first you thought you could
assign the stations to one model or the other, but then you decided the medical calls com-
ing into the stations might be too different from each other to be comparable. What are
you thinking?”
Jim answered, “We have four stations closer to downtown, and four stations in subur-
ban and rural areas. The four stations downtown overlap to some degree and cover areas
that are probably comparable, but the rural stations cover independent areas that are a
little different from each other, but are more like each other, I think, than the urban sta-
tions. How do we decide which stations go into each group to make the groups equal?
With only eight stations, I don’t see how a coin toss will help us assign the stations to one
model or the other. What if we end up with all four stations in the rural area assigned to
the alternative model and all four stations near downtown assigned to the current model,
just by chance?”
Ty listened to Jim and responded carefully. “You make a good point. Let’s think this
through. You say the location of the stations matters, because the kinds of medical calls
some of them receive, overall, are different from other stations. The urban stations are
more like each other, and different from the rural stations. The rural stations are also
more like each other than they are to the urban stations. With these differences, you are
worried that when we select which stations adopt the alternative model, there are too few
stations to be confident that random assignment will give us equal groups in terms of
the kinds of medical calls they receive. For example, one group might get all the urban
stations. Is that right?”
Jim nodded.
“So, let’s try something else,” Ty said. He turned and wrote under the “Group Assignment”
heading, first one line saying, “1. Random Assignment: coin toss,” and under that, “2.
Nonrandom Assignment—matching.” Then he scribbled in boxes and lines and arrows
underneath. (See Figure 4.5.)
Turning back to Jim, Ty explained what he had in mind. “The trouble here is that once
we gather a number of calls together by fire station, as a matter of convenience, then each
call no longer has an equal chance of being randomly assigned to one group or the other,
because it’s dependent on the selection of any other call at that fire station. If one call is
assigned to a group—the alternative model or the current model—then all the other calls at
that station go, too. We need to correct for that. What we can do is use a form of nonran-
dom assignment called matching. We’ll group the urban stations on one side and the rural
stations on the other, representing two different populations of calls. The calls are fairly
similar within the matched groups, urban station or rural station, but different between the
groups. You see?” Ty pointed to the drawings of boxes on the board, representing the urban
stations on one side and rural stations on the other. “This is a nonrandom assignment,
because we are choosing. Once we’ve matched the stations this way, we can make a sepa-
rate random assignment of stations within each matched group, so we will be sure to get
two urban stations and two rural stations for each of the service delivery models.”
Jim took in the drawing. It made sense. “I can do that.”
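Ty’s matching-then-randomizing idea can be sketched in a few lines. Station names are hypothetical; the point is that the random draw happens separately within each matched group:

```python
# Nonrandom assignment by matching, followed by random assignment
# within each matched group: two urban and two rural stations end up
# in each service delivery model. Station names are invented.
import random

urban = ["Station 1", "Station 2", "Station 3", "Station 4"]
rural = ["Station 5", "Station 6", "Station 7", "Station 8"]

random.seed(7)  # fixed seed only so the illustration is reproducible
alternative_model, current_model = [], []
for stratum in (urban, rural):
    chosen = random.sample(stratum, 2)  # random draw within the stratum
    alternative_model.extend(chosen)
    current_model.extend(s for s in stratum if s not in chosen)

print("alternative model:", alternative_model)
print("current model:", current_model)
```

This guards against the outcome Jim feared: by construction, neither group can end up with all four rural stations or all four urban stations by chance.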
Figure 4.5    Jim’s Design Options

“Great,” Ty replied. “I was worried that you might say something like, you will need to let
each station decide whether to try out the alternative model or not. If you do that, you
could end up having two groups that look very different. Allowing the participants to
choose the group they are in is likely to result in nonequivalent groups. It’s a good com-
promise in a lot of situations, if you have to do it, but not an ideal design.”

Making a Causal Argument Based on the Experimental Design


Jim’s case illustrates the development of a quasi-experimental approach to a research
design. The intent is to make a causal argument about a particular intervention, in this
case, the alternative service delivery model. Jim and Ty discovered that making a ran-
dom assignment of service calls to the experimental group (alternative model) or
control group (current model) was not feasible. Consequently, they determined that a
combination of matching and randomized assignment would be the most likely
method to make the groups comparable. To complete the research design for a causal
argument, Jim now needs to determine how his data will be collected.
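Once the observations are in, the analysis for this before-and-after two group design amounts to comparing each group’s change from the first observation to the second. A minimal sketch, using invented mortality figures (deaths per 100 medical calls):

```python
# Compare the change in the experimental (alternative model) group with
# the change in the control (current model) group. All numbers invented.
before = {"alternative": 6.0, "current": 6.1}  # O1 for each group
after = {"alternative": 3.5, "current": 5.9}   # O2 for each group

change_alt = after["alternative"] - before["alternative"]  # -2.5
change_cur = after["current"] - before["current"]          # about -0.2
effect = change_alt - change_cur                           # about -2.3

print(f"estimated effect of the alternative model: {effect:.1f}")
# → estimated effect of the alternative model: -2.3
```

If the matching worked and the groups started out comparable, a gap like this between the two changes, rather than the raw after-values alone, is the basis for attributing the improvement to the alternative model.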

Jim’s Case (continues)


“We’ve made good progress,” Ty continued. “Now we have an idea how to implement the
alternative service delivery model so we can compare it to a control group before and after
the implementation. Since we couldn’t rely completely on random assignment for the two
groups, we ended up with a quasi-experimental research design. That’s OK in applied
research like this. We still have a strong basis to assess the effect of the alternative service
delivery model. What we need to do now is figure out how you will collect the data to
measure cost and mortality. Why don’t you walk me through your data collection process
step-by-step?”
“All right,” Jim said, and got up to go to the chalkboard. He had been thinking about
this part of the research. “We collect data on operating cost for every station as well as the
mortality rate, so I was thinking I would compile the data from the last six months.”
On the left side of the board Jim wrote: “Compile cost & mortality rate for the last 6
months Jan–June FY 00.” He drew a box around the text and over it wrote: “FY 00.” To the
right, he wrote “FY 01.”
Turning to Ty, Jim said, “I figure I can collect data from January to June as baseline
data. The next fiscal year starts in July. We use the next six months, July to December,
to set up a system for the alternative model, then run it for six months next year, January
to June”—he pointed at the “FY 01” at the top of the board—“during the same time of
the year, you see, because I think that’s important. Then I collect data again and see
how it works.”
Jim wrote in more information for the planning phase and the idea of four stations
adopting the new model. Ty was impressed. He moved to the board next to Jim.
“Let’s add the notation for a research design we talked about earlier,” Ty said.
Underneath what Jim had written on the board, Ty added notation to illustrate the
research design:

4 stations (Alternative model)     O1 (A)     X     O2 (A)
4 stations (Traditional model)     O1 (T)           O2 (T)

Ty explained, “You are going to collect data from all eight stations while they are oper-
ating with the traditional model, but four of the stations are going to be the experimental
group, and will adopt the alternative model.” He pointed to the “X” in the middle of the
board in line with the top row of notation. “The other four stations will continue with the
traditional model.”
“I see,” Jim said. “The ‘O’ is an observation period, the ‘X’ is the start of the alternative
model.”
“That’s right,” Ty said, and moved to another board on the wall. He drew a graph, and
along the bottom wrote in the two time periods, “This year Jan~ June” and “Next year
Jan~June.” On the vertical axis he wrote in numbers, from 0 to 9. “This is just for illustra-
tion,” Ty said as he chalked in a heavy dashed line for the traditional model, and a heavy
solid line for the alternative model. The dashed line was almost flat as it moved from the
first time period to the second. The solid line started in about the same place as the dashed
line, and then dropped dramatically.
Chapter 4  Research Design  ❖  65

Figure 4.6    Jim’s Time Line

“If the matching works when you select your two groups of stations,” Ty explained,
“then the cost and mortality values you get during the first observation period should be
about the same for both models.” He pointed to the starting point for the lines. “If the
alternative model really reduces cost or mortality—you could use a graph like this for either
one—then you will see the difference at the second observation period.” He pointed to the
wide gap at the end points. “If it works, this could be a good way to make your argument
to the city council.”
Jim nodded.
“The real reason I wanted to show you this graph,” Ty continued, “is to get you to start
thinking about how you are going to get the numbers for your results. Notice I just made
up the numbers here from zero to nine. We don’t know yet what your numbers are going to
look like. If you have good data, like you say, then calculating a number for mortality rate
or the cost should be pretty straightforward.”
Ty tilted his arms up in surrender and smiled, indicating he was done. Jim looked around
the room at all the writing on the chalkboards and said in a low voice, “I think I can do
this.” The two friends joked and gathered up their things.
In the foyer outside the conference room, they shook hands, and Jim looked straight at
Ty. “You are really a boring guy,” he beamed, “but thanks to you, I know I can make a good
proposal to the council. And I know how to get this project going.”
66  ❖  SECTION I  RESEARCH DESIGN AND DATA COLLECTION

Figure 4.7    Graph of Change in Outcome: Suggesting Causation

Other Variations of Experimental and Quasi-Experimental Design

So far we have looked at experimental and quasi-experimental designs
with one experimental group and one control group. There are other ways you can
structure and design the research. For example, it is possible to have more than one
treatment group. One of the most influential and widely cited policy experiments,
the Minneapolis Domestic Violence Experiment (Sherman & Berk, 1984), used three differ-
ent interventions that were compared to each other. The focus of the study was to
determine which strategy was most effective at reducing domestic violence assaults.
Among a pool of offenders where probable cause for an arrest existed, officers were
directed to randomly choose how to proceed by opening instructions at the scene that
were sealed in an envelope. Three different instructions could be in the envelopes:
(1) arrest the suspect, (2) separate the parties for 8 hours, or (3) advise and mediate.
The notation would appear as follows:

Arrest R O1 X1 O2
Separate R O1 X2 O2
Mediate R O1 X3 O2

The researchers observed police records for subsequent assaults six months later
and calculated the percentage of repeat offenses for the three different interventions:
arrest 19%, separate 33%, and mediate 37%. Among the options police officers had
available to them, represented by the three interventions, arrest was shown to be the
most effective. Comparing the randomly assigned groups to each other provided a
clear result.
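The comparison of repeat-offense rates across the three randomly assigned groups can be sketched in a few lines of code. The counts below are hypothetical round numbers chosen to reproduce the reported percentages, not the actual study data:

```python
# Hypothetical illustration of comparing the three interventions. Group sizes
# of 100 are invented so the rates match the reported percentages.
repeat_counts = {"arrest": 19, "separate": 33, "mediate": 37}
group_sizes = {"arrest": 100, "separate": 100, "mediate": 100}

rates = {k: repeat_counts[k] / group_sizes[k] for k in repeat_counts}
most_effective = min(rates, key=rates.get)  # lowest repeat-offense rate

for name, rate in rates.items():
    print(f"{name}: {rate:.0%} repeat offenses")
print("Most effective intervention:", most_effective)
```

Because the groups were formed by random assignment, a direct comparison of the rates is meaningful without further adjustment.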
Another variation is found in the placebo design, commonly used in clinical
trials for pharmaceuticals. In medical interventions, it is known that when patients
believe they are receiving treatments, they may improve even when the treatment has
no therapeutic benefit. This psychological effect is called a placebo effect. To control
for this possible result, medical researchers have learned to imitate an intervention
with a placebo that appears just like the intervention, so the subjects (and usually the
researchers) do not know if they are getting the real treatment. The research design
has three groups: a treatment group, a placebo group, and no treatment. With this
design, an experimental treatment needs to demonstrate not only that it is better
than no treatment, but also that it is better than a placebo. The notation would
appear as:

Treatment group R O1 X1 O2
Placebo group R O1 X2 O2
Control group R O1 O2

Limitations in applied research can sometimes determine the research design.
For example, pretest information may not always be available, especially in program
and policy evaluations where a decision was made to assess effectiveness only after
the fact. In this situation, researchers could use an after-only design (introduced in
Jim’s case). This research design includes an experimental group and a control
group, but with only one measurement after the implementation. The results can
suggest the effectiveness of an intervention, but the design is not ideal in terms of
rigor, and it limits the ability to make a causal argument. The notation would appear
as in the example:

X   O
    O

The after-only design is used as a kind of control in a more complex research
design, called the Solomon Four-Group Design. This design utilizes four groups in
a hybrid experimental design: the first group (A) has a pretest and posttest with
intervention; the second group (B) is a control group to Group A, with a pretest and
posttest, but no intervention; the third group (C) receives an intervention like
Group A, and a posttest, but no pretest; and the fourth group (D) is a control group

for Group C, with a posttest, but no pretest and no intervention. The notation
would appear as shown:

Group A   R   O   X   O
Group B   R   O       O
Group C   R       X   O
Group D   R           O

This Solomon four-group design is useful to control for a possible testing threat
to validity, where a subject’s exposure to the test or measurement at the pretest may
have affected the posttest scores. There are a number of possible comparisons built
in. First, the researcher can compare the difference in posttest scores between
groups A and B versus the difference in posttest scores between groups C and D. If
the difference score between A and B is similar to the difference score between C
and D, then the researcher can rule out the testing threat. A comparison can also be
made between Group A and Group C for posttest scores, as both groups received
the treatment, and a comparison can be made between Group B and Group D, as
both groups did not receive the treatment. If Group A and Group C have similar
scores, and Group B and Group D have similar scores, then the researcher can rule
out the testing threat.
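The comparison logic described above can be sketched in Python. The posttest means below are hypothetical values invented for illustration, and the tolerance used to judge "similar" scores is an arbitrary choice:

```python
# Sketch of the Solomon four-group testing-threat check. The posttest means
# are hypothetical values invented for illustration.
post = {"A": 78.0, "B": 65.0, "C": 77.0, "D": 64.0}  # mean posttest scores

diff_pretested = post["A"] - post["B"]    # treated vs. control, both pretested
diff_unpretested = post["C"] - post["D"]  # treated vs. control, no pretest

# If the two difference scores are similar, pretesting did not distort the
# treatment effect, so the testing threat can be ruled out. The tolerance of
# 2.0 points is arbitrary for this sketch.
testing_threat_ruled_out = abs(diff_pretested - diff_unpretested) < 2.0
print(diff_pretested, diff_unpretested, testing_threat_ruled_out)
```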
Another variation of the research design is called time series design, which
takes measures or observations of a single variable at many consecutive periods in
time. These designs are sometimes also referred to as interrupted time series
designs. In this version, several observations are conducted before a treatment is
introduced, and then there are another series of observations. The notation would
appear as written below:

O O O O X O O O O

This research design has an advantage over before-and-after observations,
because it controls for history and any immediate effects the treatment may have that
could possibly dissipate as time progresses. In Jim’s case, this design could be adopted
to track cost and mortality before the implementation date of the alternative model,
and then after implementation, to detect any changes that may be occurring due to
external factors.
To increase the rigor of this design, a control group can be introduced. The advan-
tage here is that the researcher can get more precise information on the trends that lead
up to the intervention, and how things change afterward even when there is no inter-
vention. The notation would appear as the following:

O O O O   X   O O O O
O O O O       O O O O
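As a rough illustration of how such a design might be analyzed, the sketch below compares the change in a treated series against the change in a control series over the same periods. The monthly cost figures are invented; in Jim's case they would come from station records:

```python
# Minimal interrupted-time-series sketch with a control group.
# Values are hypothetical monthly costs; the intervention occurs after month 4.
from statistics import mean

treated = [9.0, 9.1, 8.9, 9.0, 6.1, 6.0, 5.9, 6.0]  # receives intervention
control = [9.0, 8.9, 9.1, 9.0, 9.0, 9.1, 8.9, 9.0]  # no intervention

# Change from the pre-intervention period to the post-intervention period
change_treated = mean(treated[4:]) - mean(treated[:4])
change_control = mean(control[4:]) - mean(control[:4])

# Subtracting the control change nets out history effects common to both groups
net_effect = change_treated - change_control
print(f"net effect of intervention: {net_effect:.2f}")
```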

Ethical Considerations in Experimental and Quasi-Experimental Design
In designing an experimental or a quasi-experimental study, researchers need to
consider the ethical implications of subjecting study participants to a treatment, or of
denying a certain group the opportunity to benefit from the experimental treatment.
In a placebo study, is it ethical for a researcher to subject study participants to
treatments that are known to have no effect on the outcome, though the study
participants believe they are receiving a treatment? In Emily’s case, is it ethical for her
to randomly assign a group of employees to benefit from diversity training, and not
allow another group to take the training? In Jim’s case, is it ethical to introduce an
alternative model of service delivery when the impact on residents is unknown?
These are the kinds of important considerations that a researcher needs to weigh
before finalizing the research design. One way to address some of these ethical con-
cerns is to obtain informed consent from the study participants. In a placebo study,
participants should be informed prior to their participation in the study that they may
be receiving a treatment that is not effective, and that they are taking that chance. In
Emily’s case, she could make sure that employees who did not originally take the train-
ing received the opportunity later. In Jim’s case, he might inform the residents of the
City of Rockwood that the fire department is implementing the experimental alterna-
tive service model, discuss possible pros and cons of the alternative service model, and
get citizen consent. Researchers need to consider these issues and be aware that there
may be some instances where experimental or quasi-experimental approaches may not
be appropriate, due to ethical implications.

Chapter Summary
In this chapter we introduced different types of research design. Research design is a game plan
for your research. You will need to identify your research design in Step 3 of your research process
after you have determined your research objective (Step 1) and research questions (Step 2).
Research design can be categorized based on when data were collected and what information the
data captured. The four types of research design we identified are research that (a) collects data
now about now, (b) collects data now about the past, (c) uses data already collected in the past
about the past, and (d) collects data now about now and again in the future.
We also discussed key principles that the research design needs to meet in order to establish
a causal argument: (1) temporal precedence, (2) covariation of the cause and effect, and (3) no
plausible alternative explanation. In ruling out plausible alternative explanations in the research
design, researchers can eliminate threats to validity. The eight threats we discussed are: history,
maturation, instrumentation, testing, mortality (attrition), regression to the mean, selection, and
interaction with selection. As a way to address these threats, we introduced the basic idea of
experimental and quasi-experimental design. Jim’s case illustrated the development of a quasi-
experimental design. We also introduced some variations on experimental and quasi-experimental
designs. Finally, we introduced ethical implications researchers need to consider, with a few examples
of situations that could impact study participants.

There are many things to think about when deciding what type of research design most suits
your research. There are also some practical aspects that need to be taken into account, such as
availability of personnel, funding, time, and existing data. Your role as a researcher is to make the
final determination on what type of research design is most appropriate for the research question
you are pursuing, and is also balanced with practical constraints.

Review and Discussion Questions


1. Review the approaches Emily, Jim, and Mary are considering for their study. How would
you classify their approach in terms of the four types of research design introduced in this
chapter?
2. Consider yourself as a research consultant (like Ty). Suppose Emily came to you to get help
deciding the details of her research. Imagine your conversation with Emily and develop a
research design for her. What insights would you offer, and what would be your rationale for
the approach chosen?
3. How would you describe the primary difference between experimental and quasi-experimental
designs? What are the implications of adopting an experimental design versus quasi-
experimental design in an applied setting?
4. How does random assignment in a research design assist in increasing internal validity?
5. Discuss a possible internal threat to validity if Jim adopts an after-only design.
6. A municipality has had a problem with crashes in some intersections due to motorists running
red lights. To combat this problem, the city decided to install red light cameras that photograph
a violator in the intersection and send a citation through the mail. To evaluate the effectiveness
of this program (if any) and determine if it was due to the intervention, why might a time series
design be beneficial? Is there a threat to validity?
7. Find a research-based article for a topic that you are interested in. After reading the author’s
description of the research methods, categorize the approach into one of the four types of
research design. What other research design approaches can you think of to address the
research questions?

References
Barnett, A. G., van der Pols, J. C., & Dobson, A. J. (2005). Regression to the mean: What it is and how to deal
with it. International Journal of Epidemiology, 34(1), 215–220.
Campbell, D. T., Stanley, J. C., & Gage, N. L. (1963). Experimental and quasi-experimental designs for research.
Chicago, IL: Rand McNally.
Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design & analysis issues for field settings.
Chicago, IL: Rand McNally.
Ethridge, M. E. (2002). The political research experience: Readings and analysis. Armonk, NY: M. E. Sharpe.
Fisher, R. A. (1970). Statistical methods for research workers. Darien, CT: Hafner.

Fisher, R. A., & Bennett, J. H. (1990). Statistical methods, experimental design, and scientific inference. Oxford,
UK: Oxford University Press.
Judd, C. M., & Kenny, D. A. (1981). Estimating the effects of social interventions. Cambridge, NY: Cambridge
University Press.
Kumar, R. (2011). Research methodology: A step-by-step guide for beginners. Los Angeles, CA: Sage.
Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for gen-
eralized causal inference. Boston, MA: Houghton Mifflin.
Sherman, L. W., & Berk, R. A. (1984). The Minneapolis domestic violence experiment. Washington, DC: Police
Foundation.
Trochim, W. M. K., & Donnelly, J. P. (2007). Research methods knowledge base. Mason, OH: Thomson Custom.

Key Terms
After-Only Design With Comparison Group  59
Baseline Data  51
Control Group  55
Covariation of the Cause and Effect  51
Cross Sectional Survey Design  50
Experimental Design  55
Experimental Group  55
Group Assignment  57
History Threat  53
Informed Consent  69
Instrumentation Threat  54
Internal Validity  53
Interrupted Time Series Designs  68
Matching  63
Maturation Threat  54
Mortality (Attrition) Threat  54
No Plausible Alternative Explanation  51
Nonequivalent Groups  58
Nonrandom Assignment (Nonequivalent Groups)  58
Observations  56
Oral History  51
Placebo Effect  67
Quasi-Experimental Design  56
Random Assignment  58
Regression Threat or Regression Artifact or Regression to the Mean  55
Secondary Data Analysis  51
Selection Bias  55
Selection Interaction Threats  55
Selection Threat  55
Solomon Four-Group Design  67
Syllogism  52
Temporal Precedence  51
Testing Threat  54
Time  58
Time Series Design  68
Trend Analysis  51

Student Study Site


Visit the Student Study Site at www.sagepub.com/nishishiba1e for these additional learning tools:

•• Data sets to accompany the exercises in the chapter


5 ❖
Sample Selection

Learning Objectives 73
Identifying Samples 73
Emily’s Case 73
Mary’s Case 74
Sample Selection 74
Identify an Appropriate Sampling Frame 75
Identify an Appropriate Sample Size 77
Identify an Appropriate Sampling Technique 78
Probability Sampling 79
Simple Random Sampling 79
Systematic Random Sampling 79
Stratified Random Sampling 80
Cluster Sampling 81
Non-Probability Sampling 82
Convenience Sampling 83
Purposive Sampling 84
Emily’s Case 84
Chapter Summary 85
Review and Discussion Questions 85
Key Terms 86
Figure 5.1 Sixty Random Numbers Generated Using
Online Random Number Generator 80
Figure 5.2 Illustration of Systematic Random Sampling 81
Figure 5.3 Proportional and Disproportional Stratified Sampling 82
Figure 5.4 Illustration of Emily’s Cluster Sampling Approach 83

Chapter 5  Sample Selection  ❖  73

Learning Objectives

In this chapter you will

1. Learn how sample selection determines the generalizability of the research
2. Learn the basic principles in determining sample size
3. Learn about the sampling frame
4. Learn about different methods of probability sampling
5. Learn about different methods of non-probability sampling

Identifying Samples

Emily’s Case
Emily, HR director at the city of Westlawn, together with the city’s
training manager Mei-Lin and a graduate student Leo, were working
on a diversity training for city employees. The training was funded
by a grant from the Community Foundation, and the foundation
required an evaluation to determine if the training was effective at
improving the employees’ cultural competence and decreasing work-
place conflict, which Emily had proposed was the purpose of the
training.
Leo summarized what Emily told them about the evaluation plan
she developed with Ahmed, the program officer at the Community
Foundation, “We are going to set up an experimental design for our data collection and
randomly assign half of the employees to take the training and another half to not take the
training, then compare the two groups on measures of cultural competence we get from a
survey conducted before and after the training.”
“That seems to be the best way to do it,” Emily confirmed.
“What about workplace conflict? Are we going to test the impact of the training on the
workplace conflict by selecting some departments to participate in the training and some
not? That’s how we can assess the workplace conflict at the departmental level, right?” Leo
asked.
Emily paused with a little bit of a concerned look on her face. “Earlier, I agreed with you
that the departments are the ‘unit of analysis’ for workplace conflict. I am afraid, though,
that if we include only some departments in the training and not all of them, we may get
complaints from the department heads. We have about five hundred full-time employees,
but it’s only going to be possible to train at best a quarter of them. I think we are going to
need to select individuals, not departments, for our training participants. We can still com-
bine the cultural competence and workplace conflict measures in the same survey instru-
ment for everyone.”

Leo looked disappointed. He had invested some time thinking through the issue of work-
place conflict. Now he would need to come up with a new strategy to assess the workplace
conflict at the departmental level.
Mei-Lin had been planning the trainings. She offered details: “For the trainings, we
decided we can conduct one training a month, and the grant period is four months, so we
can do four sessions. The room I scheduled can accommodate about 20 people, so let’s say
80 people get trained. That’s less than one-sixth of the employees.”
Emily picked up on the conclusion. “We are going to have to select smaller groups for
the experimental group of employees who take the training and for the control group who
don’t take the training.” She looked at Leo. “Any ideas about how to do that?”
Leo looked at the list of employees Emily had given him. “We will need to select a sam-
ple,” he answered. “We already talked about randomly assigning people to an experimental
group and a control group. Now we just need to reduce the number we put into each group.
I think we will still have enough people to generalize the results to the whole employee
population, but I’ll have to check that.”
“We need to generalize?” Mei-Lin asked.
“Yes,” Leo replied. “I’m sure the foundation would want to know that our study result
represents the general employee response to the training, not just the opinions of a few
people who happened to attend the training.”
Leo then went on to explain different ways they could sample the employees.
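The two-stage procedure Leo has in mind, drawing a smaller sample and then randomly assigning it to groups, might look like the sketch below. The employee IDs and the group size of 80 are illustrative assumptions:

```python
# Sketch of sampling and random assignment for Emily's study.
# Employee IDs and group sizes are placeholders for illustration.
import random

random.seed(42)  # fixed seed so the sketch is reproducible

employees = [f"emp{i:03d}" for i in range(500)]   # ~500 full-time employees

study_sample = random.sample(employees, 160)      # draw sample without replacement
training_group = study_sample[:80]                # random assignment: first half
control_group = study_sample[80:]                 # second half

print(len(training_group), len(control_group))
# The two groups are mutually exclusive because random.sample draws
# each employee at most once.
```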

Mary’s Case
Mary, volunteer manager at Health First, had been reading the qualitative
research books from her friend Yuki, who was a research director at a large
foundation. She was now considering a series of long interviews to answer her
questions about recruiting and retaining volunteers. Despite her concern that
interviews are too soft and not scientific, she was surprised to learn that it was
possible to conduct interviews in a rigorous manner and get useful results. She
would need to think about how to structure the interviews. The books had given
her many examples of how to collect and analyze qualitative data.
Mary’s immediate concern, though, was the time it would take to conduct long interviews.
She would need to make personal appointments, then meet with each person for the inter-
view, then transcribe the interviews, and then figure out how to code what was said. Per
person, that was a bigger job than a survey. She glanced over the list of things she had
written down to think about, and underlined with a red pen—Decide who to interview. As
she thought about it, she realized that was not enough. She jotted next to the entry, Decide
how many people to interview, and underlined that, too.

Sample Selection

Step 4 of the research flow is to identify from whom or what you collect information
you need for your research. This is typically a step you take after determining your

research objectives (Step 1), research questions (Step 2), and research design (Step 3)
(Review research flow discussed in Chapter 2).
Ideally, a researcher would like to collect data from every person or entity of inter-
est. Very often, however, real-world constraints of time and resources make it neces-
sary to select a small subset of people or entities for the study. The process of
identifying a subset of people or entities for a research project is called sampling. The
subset itself is called a sample. The complete set of people or entities of interest is
called the population (Groves et al., 2009).
In the cases of Emily and Mary above, we see how their research designs, and the
constraints of time and resources, have forced them to consider how they will select
their research participants. In both cases, they will need to decide how to select the
participants in such a way that they can be confident that the sample represents the
whole population. After all, both practitioner–researchers are interested in how to
apply their knowledge. In Emily’s case, she can only test the impact of her diversity
training on a subset of employees, yet she wants to be able to say that the training could
work for all employees of the city of Westlawn. In Mary’s case, she has concluded she
can only interview a small number of volunteers and potential volunteers to get their
perspectives, yet she wants results that can help her recruit and retain volunteers in
general.
Researchers use sampling to obtain information that can be used to make infer-
ences about the whole population of interest, while saving time and resources (Weller &
Romney, 1988). The extent to which the research results can be used to draw conclu-
sions about the whole population of interest is referred to as generalizability, or exter-
nal validity of the research. In the following sections, we will discuss three critical
steps in sampling that a researcher must follow to improve the generalizability of the
research results: (1) identify the sampling frame, (2) identify an appropriate sample
size, and (3) identify an appropriate sampling technique (Cronbach, Gleser, Nanda, &
Rajaratnam, 1972).
In our case examples, we can see both Emily and Mary are facing these sampling
issues. Emily’s research team needs to select training participants that will assure that
the result can be generalized to the whole city. But they are confronted with practical
concerns on their capacity to implement training within a limited amount of time and
duration. So far, it is unclear how they will actually conduct the sampling. In Mary’s
case, she is still trying to figure out how many people she can afford to interview. It
remains unclear how she will select her sample and what population it will represent.

Identify an Appropriate Sampling Frame


Before a sample can be selected, a researcher needs to have a clear definition of what
population the sample is supposed to represent. The definition needs to be specific so
the population includes all those individuals or entities of interest and no others. The
specific criteria to define the population are called inclusion criteria (Brink, Van der
Walt, & Van Rensburg, 2012). In Emily’s case, the basic criterion is that an individual
must be an employee at the city of Westlawn. In Mary’s case, the basic criterion is that
an individual must be a past, present, or future volunteer at Health First. The list of

individuals that qualify to be included in the population of interest is called the sam-
pling frame. The research sample will be selected from this list.
In some instances, it may be important to explicitly identify categories of indi-
viduals who will not be included in the study population. The criteria to exclude
individuals from the study population are called exclusion criteria. For example, in
Emily’s case, the list of Westlawn employees also includes temporary workers who
work in the Parks Department during the spring and summer months. She assumes
these workers will have less impact on the level of cultural competence and work-
place conflict among city employees, and she does not want to include them in the
training. She could define her inclusion criterion as full-time employees, but some
of the temporary employees work full-time when they work, and some of the regu-
lar employees she is calling full-time technically work only part-time, and she wants
to include them. Thus, to clearly define the population of interest, she needs to add
an exclusion criterion for employees who work on a temporary contract, or what-
ever other definition fits her purpose. The best way to decide how to specifically
define the inclusion and exclusion criteria is to apply the criteria and make the
sampling frame, then check to see if everyone of interest, and only those of interest,
are included.
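Applying inclusion and exclusion criteria amounts to filtering a list of records. In the sketch below, the employee records, field names, and contract labels are hypothetical stand-ins for Emily's actual personnel data:

```python
# Sketch of building a sampling frame from inclusion and exclusion criteria.
# Records and field names are hypothetical.
employees = [
    {"name": "Avery", "employer": "Westlawn", "contract": "regular"},
    {"name": "Blake", "employer": "Westlawn", "contract": "temporary"},
    {"name": "Casey", "employer": "Westlawn", "contract": "regular"},
]

def in_frame(emp):
    included = emp["employer"] == "Westlawn"   # inclusion criterion
    excluded = emp["contract"] == "temporary"  # exclusion criterion
    return included and not excluded

sampling_frame = [e["name"] for e in employees if in_frame(e)]
print(sampling_frame)  # temporary workers are screened out
```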
Researchers need to be mindful of potential problems in the composition of a
sampling frame. Leslie Kish (1995) cautions researchers to pay attention to four prob-
lems: missing elements, foreign elements, duplicate entries, and clusters. All of these
issues could affect the generalizability of the research and the research conclusions.
The researcher needs to carefully define and examine the sampling frame before select-
ing a sample (Shavelson & Webb, 1991).
Missing elements refers to individuals who are not included in the study popula-
tion, but should be of interest for the research objective. In Emily’s case, for example, it
could be argued that temporary workers are important in the overall levels of cultural
competence and workplace conflict. She needs to justify her reasons for excluding
them. Otherwise, her sampling frame could be considered inadequate as a definition
for the study population. If any doubt exists, the researcher should clearly describe why
the identified population captures everyone of interest.
Foreign elements refers to individuals who may be included in the sampling
frame according to the inclusion criteria, but are not relevant to the interest of the
research or might add spurious information. For example, in Mary’s case, she might
believe that volunteers who stopped volunteering more than five years ago will no
longer have accurate recall of their volunteer experience or may have memories that
are no longer relevant to the current situation. In that case, she could either make
one of her inclusion criteria more specific, to include only volunteers separated
from the organization within the past five years, or add an exclusion criterion that
specifies the same thing. Some flexibility exists in how the inclusion and exclusion
criteria fit together to define the population of interest, according to how the data
are collected.
Duplicate entries are a common occurrence in certain data sets used to compose a
sampling frame. In Mary’s case, for example, she will be looking at volunteer lists over
Chapter 5  Sample Selection  ❖  77

several years and many of the names will appear on more than one list. Once she col-
lects all the names into a single list, she will need to sort them to discover the duplicates
and delete them.
Clusters refer to entries in a list that include multiple individuals. In Mary’s case,
for example, her volunteer lists might identify a family name to represent two or three
members of a family who all volunteer. If this occurs, she will need to disaggregate the
entry to identify separate individuals to make the list consistent.
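Two of these frame problems, duplicates and clusters, can be handled with simple list processing. The sketch below uses invented volunteer names and assumes cluster entries follow a hypothetical "Family (n)" convention; Mary's actual lists would need their own parsing rule:

```python
# Sketch of cleaning a sampling frame: merge yearly lists, expand cluster
# entries, and remove duplicates. Names and the "(n)" convention are invented.
list_2012 = ["Ana Diaz", "Bo Chen", "Ruiz Family (3)"]
list_2013 = ["Bo Chen", "Dee Smith"]

merged = list_2012 + list_2013

frame = []
for entry in merged:
    if "Family" in entry:                    # cluster entry: expand to individuals
        name, size = entry.split(" (")
        for i in range(int(size.rstrip(")"))):
            frame.append(f"{name} member {i + 1}")
    else:
        frame.append(entry)

frame = sorted(set(frame))                   # drop duplicate repeat volunteers
print(frame)
```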
A further, practical issue should be considered when composing a sampling frame.
Some members of a population may be difficult to identify or locate. For example, if
you are interested in the homeless population, obtaining a list of all homeless people
in a region would probably be impossible, or if such a list was available, the persons on
the list would be difficult to locate. In such a case, you would need to define the sam-
pling frame in a way so access to those individuals is feasible, and a reasonable argu-
ment can be made for generalizing the results from your sample to the whole
population of interest.

Identify an Appropriate Sample Size


The sample size affects generalizability of the research results in a statistical analysis.
In general, a larger sample size increases the level of confidence that the sample is more
representative of the population of interest and inferences from the sample are more
likely to be accurate.
How large should the sample be? It is natural to assume that the sample size should
be a certain percentage of the population of interest, so the size of the population is the
first point of concern, but the power of statistical inference actually relies more on the
absolute size of the sample itself (Henry, 1990). For example, all else being equal, a
sample of 1,000 from a population of 1 million, which is only 0.1% of the population,
provides a more accurate representation of the population than a sample of 100 from
a population of 10,000, which is 1% of the population (Schutt, 2012).
Selecting an appropriate sample size relies on a number of factors: what is being
measured, variation in the population on what is being measured, the confidence
level and margin of error (or sampling error) one expects in the results, and the type
of statistical test that will be employed. The most familiar factor in this list is the mar-
gin of error, which is commonly reported in survey results, saying for example, 61% of
respondents favor some particular point of view, with a plus or minus 3% margin of
error. This means that the value in the population can be expected to lie somewhere
between 58% and 64%. The boundary on one side of the sample value is called the
margin of error. The entire range of possible values on both sides of the sample value
is called the confidence interval. What is not commonly reported with these estimates
is the confidence level. Researchers conventionally choose a 95% confidence level for
sample values (Cochran, 1977). In other words, even when a margin of error is stated,
there is a chance that the true value in the population lies outside it. Raising the con-
fidence level to 99% or near certainty increases the necessary sample size dramatically.
Allowing some degree of uncertainty is necessary to make sampling feasible.
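The arithmetic behind a reported margin of error can be illustrated with the normal approximation for a sample proportion. The short Python sketch below is an illustration, not a procedure from this chapter; the 1.96 multiplier corresponds to the conventional 95% confidence level.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Margin of error for a sample proportion at the confidence
    level implied by z (1.96 -> 95%), via the normal approximation."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# A survey of 1,000 respondents in which 61% favor some point of view:
e = margin_of_error(0.61, 1000)
print(f"61% ± {e:.1%}")  # the confidence interval runs from 58% to 64%
```

Note that with roughly 1,000 respondents, the margin of error comes out to about 3 percentage points, matching the example in the text.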
78  ❖  SECTION I  RESEARCH DESIGN AND DATA COLLECTION

An appropriate sample size for a survey of a population of interest can usually be
determined from three variables: the confidence level, the confidence interval, and
the variation of what is being measured in the population. The less a value
varies in the population, the smaller the sample size needed to estimate the population
value. For example, if 99% of individuals in a population prefer A and 1% prefer B, then
clearly, very few people will be needed in the sample to obtain a fairly accurate esti-
mate. The trouble here is that variation is usually unknown in advance, unless some
other data source on the population is available. Often, a researcher needs to assume
maximum variation: in this case, that 50% prefer A and 50% prefer B. In some cases, a
pilot test can be useful to estimate the variation in a population before a larger study is
conducted.
In more complex research projects, the researcher will need to take into consider-
ation the type of statistics and statistical tests that will be used. Calculating a sample
size is relatively easy when measures refer to percentages of categorical variables, as in
the examples above, or the means of continuous variables. Other inferential statistical
tests get more complicated. Free online sample size calculators are available, but they
must be used with some knowledge of what is required for the purposes and proce-
dures of the research.
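What such calculators compute for a simple percentage estimate can be sketched with the sample-size formula commonly attributed to Cochran (1977), plus a standard finite population correction. The Python function below is an illustration, not a procedure from this chapter; p = 0.5 encodes the maximum-variation assumption discussed above.

```python
import math

def required_sample_size(N, margin=0.03, z=1.96, p=0.5):
    """Cochran's sample-size formula for a proportion, with the
    finite population correction. z = 1.96 for a 95% confidence level;
    p = 0.5 assumes maximum variation in the population."""
    n0 = (z ** 2) * p * (1 - p) / margin ** 2   # infinite-population size
    return math.ceil(n0 / (1 + (n0 - 1) / N))   # adjust for population N

print(required_sample_size(1_000_000))  # about 1,066 from a million
print(required_sample_size(10_000))     # about 965 from ten thousand
```

The two printed values illustrate the point made earlier: the required sample grows only slightly even when the population is a hundred times larger.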
In research that expects to detect change following an intervention, as in Emily’s
case with her diversity training, the amount of change one expects to detect (the effect
size) is an additional factor in selecting an appropriate sample size. We will discuss
effect size and statistically significant change in relation to sample size in Chapter 8.
Research that does not use statistical analysis but bases its conclusions on qualitative
data determines sample size somewhat differently. For qualitative studies that use
interviews or focus groups, Rubin and Rubin (2012) suggest that researchers let the
data guide their determination of the sample size: continue collecting data until what
the study participants have said sufficiently covers the meaning of the concept or
process being explored. Researchers can stop collecting data when they reach the
saturation point, which occurs when new data seem to provide very little additional
information.

Identify an Appropriate Sampling Technique


The methods and techniques used to identify the sample are another important factor
that affects the generalizability of the research (Cochran, 1977; Henry, 1990). There are
two basic techniques for sampling: probability sampling and non-probability sampling.
Probability sampling is a form of sampling that always includes some way to
randomly select study participants, so that each unit in the population has a known,
nonzero chance of being selected for the sample. In non-probability sampling, the
probability of any one element being selected is not taken into account; selection is based
on other criteria. Random selection is the preferred method for making reliable inferences
about population values from a sample, but it is not always possible. Researchers
adopt non-probability sampling as a matter of necessity or convenience, particularly
Chapter 5  Sample Selection  ❖  79

when the number or identity of individuals in the population of interest is unknown,


so a sampling frame cannot be constructed or when access to any random member of
the population is not possible. We present variations in probability and non-probability
sampling separately in the sections below.

Probability Sampling

This section will introduce four techniques of probability sampling: simple random
sampling, stratified random sampling, systematic sampling, and cluster sampling. All
random sampling requires a definite sampling frame to allow every individual a chance
to be in the sample.

Simple Random Sampling


Simple random sampling draws a single sample from the population represented in
the sampling frame. There are many ways to select a random sample, including putting
all the names or a sequence of numbers in a hat, which is sometimes used in field stud-
ies of small groups. More commonly, researchers will give every individual in the
sampling frame a unique number and then use a random number table or a computer
program to generate a list of random numbers with a maximum value equal to the
number of individuals in the sampling frame. Researchers can then select the sample
based on the numbers. Many open source Web applications are available to generate
lists of random numbers. The list in Figure 5.1 illustrates 60 random numbers created
using the random number generator from http://stattrek.com/statistics/random-
number-generator.aspx.
For example, suppose Mary wants to select 15 volunteers for the interview from
the 60 current volunteers; she will first assign a unique identification number to each
one of the 60 volunteers. After creating 60 random numbers, Mary will select 15 num-
bers from the table. The numbers can be read in any direction from any starting point.
The 15 volunteers who have the identification number corresponding to the 15 num-
bers selected from the random number table will be the volunteers that Mary will be
interviewing.
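In software, the selection Mary performs with a random number table can be done directly. A minimal Python sketch (the seed is included only so the illustration is reproducible):

```python
import random

random.seed(42)                          # reproducible illustration only
volunteer_ids = list(range(1, 61))       # unique IDs assigned to the 60 volunteers
interviewees = random.sample(volunteer_ids, 15)  # 15 IDs drawn without replacement
print(sorted(interviewees))
```

`random.sample` draws without replacement, so no volunteer can be selected twice, mirroring the random number table procedure.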

Systematic Random Sampling


Systematic random sampling, sometimes called interval sampling, offers a way to
randomly select members from an existing list of members of the population. In this
technique, all members of the sampling frame are in a list and every kth element is
selected. The value of (k) is determined by dividing the size of the entire population in
the sampling frame (N) by the required sample size (n): thus, k = N/n, rounded down
to the nearest whole number (Moore, McCabe, & Craig, 2010). The starting point on
the list is determined by randomly choosing a number between 1 and (k). The random
starting point assures that the first member of the list is not always selected.

Figure 5.1   Sixty Random Numbers Generated Using Online Random Number Generator

To illustrate this technique, let us use Mary’s example. Her sampling frame com-
prises 60 (N) volunteers, from which she wants to draw a sample of 15 (n). If Mary
wants to use a systematic random sampling, it means that (k) will equal 4 (60/15 = 4).
Mary will then randomly select a whole number between 1 and 4 (by pulling a number
from a hat, for example). Say the selected number is 2. Mary would then start at the
second name on the list and select every fourth person. The full sample will be selected
by the end of the list. (See Figure 5.2.)

Stratified Random Sampling


In stratified random sampling, rather than taking a random sample from the overall
population, the population is first divided into subgroups (strata) based on certain
characteristics. A random sample is then selected from each of the subgroups.
Stratification ensures that certain segments of the population will not be accidentally
underrepresented in the sample. There are two approaches in the stratified random
sampling: proportional stratified sampling and disproportional stratified sampling.
Proportional stratified sampling means that the sample that the researcher selects
will reflect the actual proportion of the subgroups (strata) in the population. In Emily’s
case, for example, she could divide employees into five types of departments based on
the nature of their work—Administration, Culture and Recreation, Roads and Transit,
Public Safety, and Economic Development—then randomly select employees from
each of the groups. With proportional sampling, the sample from each department will
mirror the proportion of that department’s employees in the whole population. With
this approach, Emily will know that each department is equally represented in the
training according to its size.
When random sampling with a sufficient sample size is possible, proportional stratification
is typically unnecessary, because the operation of chance will approximate the correct
proportions. But the stratified sampling technique can be useful
to ensure that smaller segments of the population are in fact represented. National
surveys often stratify by state to assure representatives that their own local population
has contributed to the overall results.

Figure 5.2   Illustration of Systematic Random Sampling

In contrast, Emily could be concerned that smaller departments will have few
employees selected in the sample, and she will be unable to make generalizations about
those individual departments, due to the small sample size. She could use dispropor-
tional stratified sampling and oversample the smaller departments to boost the sample
size and increase her ability to make generalizations about each department. In this
case, she might choose to select the same number of employees from each department.
This technique is commonly used to target small segments of a population of special
interest. (See Figure 5.3.)
Some caution is necessary when using disproportional stratified sampling, because
the selection is no longer completely random when the research results are summa-
rized for the whole population. Weighting will need to be applied to adjust the overall
results to reflect the natural proportions of each segment to the whole population.
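A proportional allocation can be sketched using the department shares shown in Figure 5.3; the head counts below are the ones those percentages imply for N = 500, and the largest-remainder rounding step is an assumption added here so the allocations sum exactly to n.

```python
def proportional_allocation(strata, n):
    """Allocate a total sample of n across strata in proportion to
    stratum size, using largest-remainder rounding so sizes sum to n."""
    N = sum(strata.values())
    raw = {name: n * size / N for name, size in strata.items()}
    alloc = {name: int(r) for name, r in raw.items()}      # round down first
    leftover = n - sum(alloc.values())                     # seats still to assign
    for name in sorted(raw, key=lambda s: raw[s] - alloc[s],
                       reverse=True)[:leftover]:
        alloc[name] += 1                                   # go to largest remainders
    return alloc

departments = {"Public Safety": 195, "Roads and Transit": 120,
               "Culture and Recreation": 90, "Administration": 50,
               "Economic Development": 45}
print(proportional_allocation(departments, 80))
# Public Safety 31, Roads and Transit 19, Culture and Recreation 15,
# Administration 8, Economic Development 7 (sums to 80)
```

For a disproportional design, one would instead assign each department the same target size (for example, 16 each) and then apply weights, as described above, when summarizing results for the whole population.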

Cluster Sampling
Cluster sampling involves identifying naturally occurring groups of elements in a
population, then selecting a random sample of these clusters. For each selected cluster,

Figure 5.3   Proportional and Disproportional Stratified Sampling

[Figure: an Employee Population (N = 500) with a Proportional Sample (n = 80) and a
Disproportional Sample (n = 80) drawn from five departments: Public Safety (39%),
Roads and Transit (24%), Culture and Recreation (18%), Administration (10%), and
Economic Development (9%)]

the researcher collects information from all elements in the cluster. This technique is
useful when it is difficult to construct a sampling frame for all elements in the population,
but there are some natural groupings of elements. Each element in the population
must appear in one and only one cluster.
In Emily’s case, for example, she might find that employee turnover makes it dif-
ficult to construct a sampling frame of all employees that will remain current by the
time she conducts the sample and recruits individuals to participate in the study.
Cluster sampling could provide a solution by allowing her to focus attention on spe-
cific departments, which remain constant. Using this technique, she would make a
sampling frame of all the city departments, randomly select a number of the depart-
ments, and then include all of the employees in those departments in her diversity
training. (See Figure 5.4.)
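The cluster logic can be sketched as follows; the departments and employee names below are hypothetical, since in practice the rosters would come from the city's HR records.

```python
import random

# Hypothetical clusters: departments and their employee rosters
departments = {
    "Administration": ["Ana", "Ben", "Carla"],
    "Parks":          ["Dev", "Ella", "Finn", "Gia"],
    "Public Works":   ["Hal", "Ida", "Jon", "Kim", "Lou"],
    "Library":        ["Mia", "Ned"],
}

random.seed(3)                                  # reproducible illustration only
chosen = random.sample(sorted(departments), 2)  # randomly select 2 clusters
participants = [emp for dept in chosen for emp in departments[dept]]
print(chosen, participants)  # every employee in the chosen departments is included
```

Note that the sampling frame here lists departments, not employees, and that once a department is selected, all of its members enter the study.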

Non-Probability Sampling

With non-probability sampling, not all the elements in the population have a known
or equal probability of being selected for the sample. The selection of the sample is not as
systematic as in probability sampling due to a variety of constraints. The researcher
may not have access to the sampling frame that contains the contact information of all
members of the population of interest. Or, even when the researcher has access to the

Figure 5.4   Illustration of Emily’s Cluster Sampling Approach

sampling frame of the population of interest, it may not be practical or appropriate to


select the study participants by randomly selecting them. Usually, this approach to
sampling is less costly and more convenient. We discuss here two basic types of non-
probability sampling: convenience sampling and purposive sampling.

Convenience Sampling
In some research situations, the researcher may need to take what’s available. This
approach is called convenience sampling. The data collection is not systematic and is
somewhat haphazard, so this approach is sometimes called haphazard sampling or
availability sampling. Little effort is made to ensure that the sample selected for the
research is representative of the population of interest. Consequently, error and bias
may be included in the sample, and the research may lack generalizability. In some
research contexts or research populations, however, this approach may be the best
solution to gathering needed information.
In Mary’s case, we can imagine that she could find it difficult to get access to the
variety of volunteers she wants to interview. The current volunteers work different
schedules of only a few hours a day, many past volunteers have moved, and prospec-
tive volunteers are difficult to identify. With these constraints, she might choose
convenience sampling as a way to gather information from whomever she can locate.
The technique could prove valuable in this case, because she does not necessarily
need generalizable information to represent all volunteers but a stock of ideas about
pertinent issues.

Purposive Sampling
Purposive sampling selects the sample by targeting particular categories of interest
within the population. This technique allows the researcher to focus attention on cer-
tain issues and can be useful to gather information on different or extreme cases. In
Mary’s case, for example, she might decide that she wants to select volunteers who live
in different parts of town to examine sociocultural differences. She could use a prob-
ability sample to obtain the information, if she wanted generalizable results, or if she
found this was unimportant or not feasible, she could simply select volunteers from
each neighborhood.
In a more focused example, because retention is one of Mary’s main concerns, she
would have a compelling interest to select volunteers she finds in her lists who have
been with Health First much longer than anyone else. She might decide to interview
all of them to hear their stories and find out what common elements they share. This
example of purposive sampling is also called extreme case sampling. This approach
fits well with qualitative research that aims to examine a small number of cases and
develop a richer understanding of each case. Similarly, rather than aggregating opin-
ions from a whole population of interest, some research questions may be well
answered by soliciting expert opinions. This approach is called expert sampling.
Notice in all of these examples of purposive sampling that the selection process is not
completely a matter of convenience. The researcher selects the sample by first developing
criteria to define who will fit the purpose. A description of the criteria and the method
of selecting the actual participants will be important in the research report. In selecting
experts, for example, the researcher will need to describe how the participants were iden-
tified as experts and how the particular individuals in the sample were selected.
Snowball sampling is another purposive sampling technique. In this approach,
the researcher will first identify one person (or entity) to contact and collect information.
Then, subsequent participants are selected by asking the first study participant
to suggest others whom he or she thinks would be useful for the research to include.
The researcher continues asking study participants for new names, adding
participants the way a snowball grows.
pling hard-to-reach populations. In Mary’s case, she might find this technique useful
in finding potential volunteers, whom she otherwise has no way to identify.
Now that we have reviewed different ways of sampling, let’s take a look at how
Emily decides to select her study participants.

Emily’s Case
When Leo was done going over different ways of sampling, Emily summed up
what she heard: “It sounds like ‘probability sampling’ is the way to go if we want
to have a ‘representative sample,’ correct?”
Leo nodded and said, “Yes, if you think it’s feasible.”
Emily pondered a little while and said, “I like the idea of doing the ‘propor-
tional stratified sampling’ to be sure members of every department are represented

in the group. I gather it’s a little harder to do, but I think I can get better support from the
department heads to participate if they know they are getting specific attention.” Emily
looked at Mei-Lin, knowing she would understand that consideration, then she turned to
Leo, “Can you figure out how to conduct the sampling that way? We need two groups of
study participants, each with 80 employees. One group will take the training and one group
will not take the training. Both will take the survey before and after the training.”
Leo nodded and took notes on the details.
Emily concluded, “We can repeat this process until all the employees eventually get
trained, but we don’t have to do it all in this grant cycle.”
Emily, Mei-Lin, and Leo looked at each other, satisfied. Mei-Lin said, “OK, now we have
a plan!”

Chapter Summary
In this chapter, we discussed various approaches to selecting samples. Sampling is the way you decide
from whom or what you collect information for your research. This process corresponds to Step 4 of
the research flow. How you sample influences the generalizability of your research. The two main
approaches to sampling are probability sampling and non-probability sampling. In probability sampling,
every element in the population has a known chance of being selected for the sample.
This is the better sampling method when you use statistical analysis to make inferences about the
population based on the information you collected from the sample. Not every research project can
use probability sampling, due to constraints in identifying or accessing all elements of the population
of interest. In some research projects, it may not be appropriate to use probability sampling. Non-
probability sampling approaches provide alternative ways to sample in these situations.

Review and Discussion Questions


1. Discuss situations where probability sampling is not possible. How would you go about iden-
tifying your sample?
2. Why might a stratified random sample be more effective than a simple random sample when
you have minority populations to consider?
3. What are the advantages and disadvantages in using proportional stratified sampling and dis-
proportional stratified sampling?
4. Describe how sampling relates to the generalizability of your research.
5. Think about study populations and research objectives for which probability sampling is
impractical or inappropriate.
6. Write your own “Mary’s case” describing how she would decide whom to
interview, how she would identify them, and how she would determine her sample size.

References
Brink, H., Van der Walt, C., & Van Rensburg, G. H. (2012). Fundamentals of research methodology for health
care professionals. Cape Town, South Africa: Juta.
Cochran, W. G. (1977). Sampling techniques. New York, NY: Wiley.
Cronbach, L., Gleser, G., Nanda, H., & Rajaratnam, N. (1972). The dependability of behavioral measurements:
Theory of generalizability for scores and profiles. New York, NY: Wiley.
Groves, R. M., Fowler, F. J., Couper, M., Lepkowski, J. M., Singer, E., & Tourangeau, R. (2009). Survey
methodology. Hoboken, NJ: Wiley.
Henry, G. T. (1990). Practical sampling. Newbury Park, CA: Sage.
Kish, L. (1995). Survey sampling. New York, NY: Wiley.
Moore, D., McCabe, G., & Craig, B. (2010). Introduction to the practice of statistics (7th ed.). New York, NY:
Freeman.
Rubin, H. J., & Rubin, I. S. (2012). Qualitative interviewing: The art of hearing data. Los Angeles, CA: Sage.
Schutt, R. K. (2012). Investigating the social world: The process and practice of research. Thousand Oaks, CA: Sage.
Shavelson, R. J., & Webb, N. M. (1991). Generalizability theory: A primer. Newbury Park, CA: Sage.
Weller, S. C., & Romney, A. K. (1988). Systematic data collection. Newbury Park, CA: Sage.

Key Terms

Cluster Sampling  81
Confidence Interval  77
Confidence Level  77
Convenience Sampling  83
Disproportional Stratified Sampling  80
Effect Size  78
Exclusion Criteria  76
Expert Sampling  84
External Validity  75
Extreme Case Sampling  84
Generalizability  75
Inclusion Criteria  75
Non-probability Sampling  78
Oversample  81
Population  75
Probability Sampling  78
Proportional Stratified Sampling  80
Purposive Sampling  84
Sample  75
Sample Size  75
Sampling  75
Sampling Error or Margin of Error  77
Sampling Frame  75
Sampling Technique  75
Saturation Point  78
Simple Random Sampling  79
Snowball Sampling  84
Stratified Random Sampling  80
Systematic Random Sampling  79
Variation  77
Weighting  81

Student Study Site


Visit the Student Study Site at www.sagepub.com/nishishiba1e for these additional learning tools:

•• Data sets to accompany the exercises in the chapter


6 ❖
Data Collection


Learning Objectives 88
Identifying Data Collection Methods 88
Emily’s Case 88
Jim’s Case 90
Mary’s Case 90
Types of Data 91
Survey 91
Advantages of Surveys 92
Survey Errors 92
Writing Survey Questions 94
Types of Questions 94
Key Considerations in Wording Survey Questions 94
Key Considerations for Response Options 95
Operationalizing the Concept 96
Mode of Survey Administration 98
Emily’s Case 99
Interview 101
Interview Guide: Instrument for Qualitative Data Collection 101
Focus Group 102
Other Qualitative Data Collection Methods 106
Mary’s Case 107
Using Secondary Data 109
Jim’s Case 109


Ethical Considerations in Data Collection 110


Chapter Summary 112
Review and Discussion Questions 112
Key Terms 113
Figure 6.1 Examples of Different Response Options 97
Figure 6.2 Example of Likert Scale (With Multiple Likert Items) 98
Table 6.1 Advantages and Disadvantages of
Different Modes of Survey Administration 100
Table 6.2 Mary’s Interview Guide 103
Table 6.3 Summary of Common Approaches to Qualitative Data Collection 108


Learning Objectives

In this chapter you will

1. Learn different data collection methods


2. Learn the importance of identifying data collection methods that are congru-
ent with the research question and design
3. Identify key issues and considerations in developing survey questions
4. Learn different modes of survey administration
5. Learn about conducting interviews, focus group discussions, and other quali-
tative data collection methods
6. Learn about using secondary data
7. Review ethical considerations in data collection with more specific information  

Identifying Data Collection Methods

Emily’s Case
Emily, HR director at the city of Westlawn, and her research team—Mei-Lin, the
city’s training manager, and Leo, a graduate student intern—meet every week to
plan their diversity training for city employees and the evaluation of the training.
At their last meeting, they settled on a before-and-after two-group research design
for the evaluation and a plan to use stratified sampling to recruit study partici-
pants. Half of the selected study participants will take the diversity training, as an

experimental group, and the other half will serve as a control group. Employees who do not
attend the training during this first phase would be offered an opportunity to take the
training in the future.
Emily proposed to the Community Foundation, which funded the training, that she
intended to improve employees’ cultural competence and decrease workplace tension.
Now the team needed to figure out how to measure those attributes to demonstrate
improvement.
When it came time during the meeting to talk about the data collection, Mei-Lin took
over. “We talked about doing a survey to measure cultural competence and workplace
conflict, so I looked at reports and journal articles that use surveys to measure those things.
I found some survey questions that we may be able to adopt.”
“That’s great!” Emily exclaimed.
Mei-Lin continued, “Well, the good news is that there are many surveys out there that we
might be able to use, but —” her tone was not encouraging, “first, I think we need to have
a clearer idea of what we mean by cultural competence and what kinds of workplace con-
flict we are concerned about. There are a lot of possibilities.”
“In other words,” Leo said, “define cultural competence and workplace conflict.”
“Exactly,” Mei-Lin responded. She opened a binder in front of her on the table and
passed a few articles over to Emily and a few others to Leo. “These are some of the best
sources I found, but only a couple of the surveys seem to have the same focus as us. A
couple items I’ve circled there appear to be relevant, but notice how each survey is differ-
ent. Even if they asked similar questions, the way they set up how people should respond
to each question is a little different. We need to clarify what we are looking for and how
we want people to respond.”
“OK,” Emily said after a moment looking at the papers, “let’s put this on our to-do
list: define the concepts for our purposes and come up with some specific language.
Before we go too far in that direction, though, let’s talk about what we really want to
do to get the information we need. Are we sure it’s a survey? And if so, how do we do
it?”
Leo entered the discussion, holding one of the papers in his hand. “I think a survey is
the best option for us, especially since Mei-Lin found these things that show other
researchers are using surveys for basically the same issues. Do you have something else
in mind?”
“I was thinking of a web-based survey,” Mei-Lin interjected.
Emily smiled. “I guess we all vote for the survey. I’m not sure a web-based survey is a
good idea, though. Some of the folks in Public Works, and probably Parks and Recreation,
won’t have easy access to a computer.” She paused, seeing the others were thinking.
“Anyway, let’s commit to a survey and come back with ideas on the best way to administer
it. Leo, if you could work on some definitions and send them to us, then we can tweak them,
and Mei-Lin will be able to plug them in to make a list of items and variations that we
might be able to use on the questionnaire.”
The team agreed they had a plan, and the meeting continued to the business of the
training itself.

Jim’s Case
Jim, deputy fire chief at the city of Rockwood, had nailed down the details for the
pilot study to assess the effectiveness of the alternative delivery model and was
moving forward with the proposal writing for the city council. He decided to turn
his attention to his second task, related to response time, so he would have some-
thing to tell Chief Chen there, too. For fire operations, response time is defined as
the time it takes for firefighters to arrive on the scene from when the call is dis-
patched to the station. The National Fire Protection Association uses response
time as a performance measure and sets the target benchmark at less than 5
minutes 90% of the time.
Jim pulled out the instructions for self-study from the Commission on Fire Accreditation
International. He also went online and found examples of self-study reports submitted by
fire operations from other jurisdictions. He found that most jurisdictions provided detailed
data and analysis, including how performance changed during the past few years, and
details on individual stations.
Jim pondered, “Hmm—it looks like there are several ways to analyze and summarize our
performance. It’s not just presenting the raw response-time data. Do we have enough to go
back several years and report on each station?”
He picked up the phone to call Kathy, the operations manager in the Fire
Department. “All I need to do,” he thought, “is ask Kathy to give me the data so I can
take a look at it.”

Mary’s Case
Mary, volunteer manager at Health First, decided she would conduct a series of
long interviews with volunteers to find out how she can better recruit and retain
volunteers. Her first questions were who to interview and how many interviews she
should do. She obtained a list of volunteers from HR.
“So, this list is the sampling frame, at least for the current volunteers,” she said
to herself as she looked at the list. “But I don’t need to do a probability sampling.
I am going to select a few people to interview, based on their background and
perhaps their availability. I’d like to get feedback from both men and women and
people who live in different parts of town. And I want to talk to those who have
been with Health First a long time, as well as those who just started.” She looked through
the list and put a check by several names that fit her criteria. “I will have to narrow that
down, but that’s a start.”
Sitting back, Mary wondered what she would ask them. She started to jot down possible
questions. It didn’t take long to come up with close to 20 questions. “I’m sure this is too
many questions,” she thought. “I need to figure out how to organize these questions and
cut them down to the essential questions.”
Later, when Mary entered the break room for lunch, she found Ruth and John, two vol-
unteers she knew pretty well, together at a table. “Lots of things happening, Mary,” Ruth
said as Mary sat down next to them. “Have you heard about . . .” Ruth started talking

about some of the new volunteers and how they were doing, which led John to mention
other volunteers who were leaving and the work that needed to be done. Mary wished she
could take notes. Some of the stories gave her ideas about volunteer recruitment and reten-
tion. As she listened, she thought about “participant observation” and “focus group discussion”—
two alternatives to data collection she had found in the qualitative research books Yuki
gave her.
“There are many ways to collect information other than face-to-face interviews,” she
thought. “I should at least consider other approaches to qualitative data collection before
deciding on the interview.”

Types of Data

Once you have identified your research objectives, research questions, research design,
and the sample for your research, the next step (Step 5 in the research flow) is to collect
data. (See Figure 2.1: Research Flow and Components in Chapter 2.) In collecting data,
you need to know the difference between quantitative and qualitative data and different
data collection methods available to you. In this section, we discuss types of data. The
remaining sections in this chapter discuss methods, tools, and sources to collect data.
In thinking about data collection, it is useful to make a distinction between two
types of data: quantitative and qualitative (Salkind, 2011; Sapsford & Jupp, 2006).
Quantitative data is data in numerical form. Attributes that are defined in terms of
magnitude, using numbers, are also considered quantitative data. Qualitative data is
data that is not in numerical form. In social science research, it is typically information
that is captured by words or text. It can also be captured in other forms, such as pho-
tographs, video, sound, and so on. Qualitative data is used to describe, categorize,
label, and identify qualities of observed phenomena.
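The distinction can be made concrete with a small sketch. The record below is hypothetical (the field names and values are invented for illustration): the numeric fields are quantitative data, while the free-text comment is qualitative data.

```python
# Hypothetical record for one survey respondent. The numeric fields are
# quantitative data (magnitude expressed in numbers); the free-text
# comment is qualitative data (text describing a quality or view).
respondent = {
    "age": 42,                                                # quantitative
    "years_volunteering": 3.5,                                # quantitative
    "comment": "I enjoy the work, but shifts run too long.",  # qualitative
}

# Separate the two kinds of data by their form
quantitative = {k: v for k, v in respondent.items() if isinstance(v, (int, float))}
qualitative = {k: v for k, v in respondent.items() if isinstance(v, str)}
```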
Data can also be distinguished as primary data or secondary data. Primary data
refers to data that is collected by the researcher for a given study. Secondary data refers
to data that has already been collected for another purpose, but is used by the
researcher for a given study (Moore, McCabe, & Craig, 2010). In our case examples, we
see that Emily and Mary are preparing to collect primary data, while Jim is going to
use secondary data for his response-time study. In Jim’s case, the data were already
collected, and he is going to use the data to assess the department’s performance for
the accrediting organization.
There are many data sources that provide secondary data for researchers.
Prominent sources for general population characteristics are found in the Census, the
Current Population Survey, and the General Social Survey.

Survey

Surveys are a popular method for collecting primary data (Coxon, 1999; Groves
et al., 2009). Many secondary data archives are developed from surveys, too. A survey
92  ❖  SECTION I  RESEARCH DESIGN AND DATA COLLECTION

collects data by asking questions in a standardized form. Surveys can collect quanti-
tative data, qualitative data, or both.

Advantages of Surveys
Schutt (2012) summarizes the advantages of surveys in three key features: versatility,
efficiency, and generalizability. A survey is a versatile data collection method in that it
can be used for research with different objectives, theory building, and data analysis.
Survey data can be used to explore and describe phenomena or confirm and test
hypothesized relationships; to pursue either deductive or inductive theories; and to
analyze descriptive statistics, inferential statistics, or qualitative themes. Surveys can
also be used to study a broad range of topics in a variety of settings. Surveys are used
by politicians to obtain polling data during campaigns, by businesses to learn about
market demographics, by policymakers to assess community needs, by public agencies
to identify program needs, by nonprofit organizations to evaluate service quality, by
managers to solicit feedback from employees, and so on.
Another advantage of the survey is its potential to collect data efficiently. Surveys
provide researchers a way to collect a large set of data fairly quickly at a relatively low
cost. In many cases, a survey can be administered to the whole population of interest.
This capacity of the survey to reach a large number of people with relative conve-
nience for both researchers and participants makes it a popular method of data col-
lection. Researchers can also ask a broad range of questions in one sweep. The
efficiency of a survey is determined by the design and mode of survey administration.
Administering a survey face-to-face, or by mail, telephone, or website affects the cost,
speed, and size of the data. The mode of administration also affects access to the
population of interest.
A survey’s ability to generalize its findings to a larger population comes from the ease with which it meets the demands of probability sampling, given relatively convenient access to a whole sampling frame. A survey can also accommodate larger samples.

Survey Errors
There are two types of errors that researchers need to try to minimize while conduct-
ing a survey. One type of error is referred to as errors of observation (Groves et al.,
2009). This error is also called measurement error and stems from the poor wording
of questions or inappropriate selection of questions. Errors of observation happen
when the survey questions are presented in a way that will lead to inaccurate or unin-
terpretable answers. To minimize the risk of errors of observation, it is important to
construct questions that are clear and presented in a well-organized manner. It is also
important to select questions that will provide adequate answers to the research ques-
tions at hand. We will discuss survey questions further in the next section.
A second type of error is called errors of nonobservation (Groves et al., 2009).
This is the error of not including every case that needs to be included in the survey.
Excluding or omitting some cases that should be included in the survey will affect
the accuracy of the survey results. There are three possible sources of errors of nonobservation:

(1) Inadequate coverage of the population (poor sampling frame)
(2) Sampling error
(3) Nonresponse

Inadequate coverage of a population occurs when the sampling frame developed to represent the population is incomplete. As we discussed in Chapter 5, the sampling frame is a list of all elements or units in the population of interest. If the sampling frame is incomplete, the sample drawn from it may produce biased results.
Sampling error refers to a difference between the characteristics of the population
and the sample drawn from it, due to the partial representation of the population in
the sample and chance differences. With probability sampling, the sampling error is
typically represented as the margin of error or confidence interval in the final results.
As we described in Chapter 5, a larger sample size generally lowers the sampling error
and increases the level of confidence in the sample’s representation of the population.
We also discussed in Chapter 5 how a more homogeneous population will reduce the
sampling error. In any research that uses sampling, the sample size and variation in
what is being measured in the population need to be carefully considered to minimize
the risk of a nonobservation error.
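As a rough illustration of how sample size drives sampling error, the margin of error for a sample proportion at the 95% confidence level can be sketched as follows. This is a minimal sketch with hypothetical numbers; z = 1.96 is the standard multiplier for 95% confidence, and p = 0.5 is the most conservative assumption about the population proportion.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Margin of error for a sample proportion at ~95% confidence (z = 1.96)."""
    return z * math.sqrt(p * (1 - p) / n)

# Quadrupling the sample from 400 to 1,600 cuts the margin of error in half:
moe_400 = margin_of_error(0.5, 400)    # about 0.049, i.e., +/- 4.9 points
moe_1600 = margin_of_error(0.5, 1600)  # about 0.0245, i.e., +/- 2.45 points
```

Notice that halving the margin of error requires four times the sample, which is one reason larger samples yield diminishing returns.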
Nonresponse occurs when individuals in a selected population or sample refuse
to respond or cannot be contacted. Survey results may not be affected if nonresponse
occurs randomly, but if nonresponse occurs in some systematic manner, then the
collected data may not adequately represent the population of interest. Researchers
need to be cautious about nonresponse and examine possible reasons that could lead
to biased results. Ideally, every selected person in a sample would respond to a sur-
vey, but in reality it is almost impossible to achieve a 100% response rate (Rogelberg &
Stanton, 2007). Researchers have suggested different lower bounds to an acceptable
response rate, ranging from 50% (Babbie, 2013; Dillman, Smyth, & Christian, 2009)
to 80% (De Vaus, 1986). Baruch (1999) reported that the average response rate of
surveys published in a sample of academic articles in organizational studies from
1975, 1985, and 1995, was 55.6%. More recently, Baruch and Holtom (2008) reported a range of response rates in organizational studies of 35% to 50%. Survey experts provide suggestions to help increase survey response rates, which have been noticeably
declining for the past few decades worldwide (Panel on a Research Agenda for
the Future of Social Science Data Collection, 2013). Suggestions to increase response
rates include better questions, better implementation, less burden for respon-
dents, rewards, and efforts to gain respondent trust (Dillman et al., 2009; Millar &
Dillman, 2011).
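The response rate itself is simple arithmetic: the share of contacted sample members who completed the survey. A minimal sketch, using made-up numbers, shows how a rate might be checked against the lower bounds cited above.

```python
def response_rate(completed, contacted):
    """Share of contacted sample members who completed the survey."""
    return completed / contacted

# Hypothetical survey: 88 completed responses from 160 people contacted
rate = response_rate(88, 160)          # 0.55, i.e., a 55% response rate
meets_50_percent_bound = rate >= 0.50  # the 50% lower bound cited above
meets_80_percent_bound = rate >= 0.80  # the stricter 80% lower bound
```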

Writing Survey Questions

Types of Questions
There are two types of questions you can use in surveys: open-ended and closed-ended.
Open‑ended questions allow respondents to answer in any way they like and add addi-
tional commentary. Open-ended questions typically produce qualitative data that describes
a respondent’s views. Open-ended questions are typically phrased using who, what, when, where, why, and how, referred to in journalism as the “Five Ws and one H” questions.
Closed-ended questions limit the response. With closed-ended questions you typically get quantitative data that is captured numerically. Questions that start with verbs, such as Are, Will, Is, Have, and Did, lead to a “yes” or “no” response. Closed-ended questions can also be constructed by asking respondents to
choose answers from a predetermined list of options. Even questions using the 5Ws
and 1H can be made into closed-ended questions by providing a list of predeter-
mined options, saying for example: “Please choose from the following options.”
Closed-ended questions can allow for multiple choices by providing a list of options
and saying something like, “Select all that apply.” The options can also be presented
as a scale, where the respondent rates the answer on some dimension, such as
amount of agreement (strongly disagree to strongly agree), level of liking (like to
dislike), or judgment about a given situation (e.g., good to bad, strong to weak, active
to passive).
Many things need to be considered when you construct survey questions. One of
the most important things is to make sure the survey questions are aligned with your
research questions and research objectives. You can ask any question you like in a survey, but if the answers do not help you answer your research questions, you defeat your purpose.

Key Considerations in Wording Survey Questions


The way survey questions are worded has a great impact on the way they are
answered. It is, therefore, important to pay close attention to the wording of the ques-
tions and make the intended meaning of the questions as clear as possible. Survey
experts, such as Dillman and his colleagues (2009), Fowler (1993), and Groves and his
colleagues (2009) provide extensive suggestions on how to construct better survey
questions. Here we will introduce some key principles.

Use simple, direct, and short wording. One of the ways to avoid confusing phrases
in survey questions is to use words that are likely to be understood by more people. For
example, Dillman et al. (2009) suggest using words such as tired over exhausted,
honest over candid, and correct over rectify. They also suggest simplifying phrases
by using shorter combinations of words. For example, say your answers rather
than your responses to this questionnaire, or job concerns rather than work-related
employment issues.

Avoid using double negative questions. When two negatives are used in one sen-
tence, it is called a double negative, which is typically understood to cancel out to
become affirmative. For example, “I do not disagree” means “I agree,” or “That picture
is not unattractive” means “That picture is attractive.” Using double negatives in survey questions adds complexity and can make it hard for respondents to figure out the intent of the questions. Imagine how to answer a survey question such as: “Do you
disagree that diversity training should not be mandatory?” or “Did you not dislike the
diversity training?” It is typically safer to avoid negative words in survey questions,
such as don’t or not.

Avoid using double-barreled questions. A double-barreled question asks about more than one issue yet allows for only one answer. Examples of double-barreled questions
are: “How satisfied are you with your department’s support for diversity and the degree
of diversity attained in your department?” or “How often and how much time did you
spend attending diversity training during the last year?” Notice how each of these
questions asks for two separate answers. Double-barreled questions add a burden for
respondents by adding uncertainty in how to answer, or if respondents fail to notice
the two intermingled questions and answer, the burden goes to the researcher who
may find there is no way to know which question is being answered. Consequently, the
double-barreled question will lead to nonresponse or inaccuracies.

Avoid biased, leading phrasing. Biased or loaded words or phrases in survey questions
can lead respondents to answer in a certain way that produces misleading information.
Consider the following question: “Racism affects everyone in the city in a negative
manner. Please indicate if you agree or disagree with the following statement: The City of
Westlawn should take a strong stand against racism.” The opening statement establishes
that the city disapproves of racism, which makes it harder to disagree with the position
to take a strong stand against it. This is an overt example. Many forms of bias can be
insinuated into questions in subtle ways. Researchers should examine words and phras-
ing in survey questions to remain as neutral as possible.

Key Considerations for Response Options


Closed-ended questions with fixed response options need to be exhaustive and mutu-
ally exclusive. In other words, the choices must provide a full spectrum of possible
responses, and each option must be distinct from all other options. When options
include ranges—as for age, income, years of service, and so on—the ranges must not
overlap, and all possible ranges must be provided. In this way, a respondent with a
particular answer will be able to find a suitable option and only one suitable option. In
many cases, this may require adding an option for “Don’t know” or “Not applicable.”
These answers may not seem worthwhile, but they are more informative than a nonre-
sponse and can be quantified. One exception to this rule for exhaustive and mutually
exclusive response options is when you offer multiple choices with the instruction to
“Select all that apply.”
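These two rules can be checked mechanically. The sketch below uses a hypothetical helper (the function name and range format are invented for illustration) and assumes integer-valued ranges, such as age in whole years, to verify that a set of range options has no gaps and no overlaps.

```python
def ranges_ok(options, low, high):
    """Return True if (start, end) options are mutually exclusive and
    exhaustive over the integers from low to high, inclusive."""
    ordered = sorted(options)
    if ordered[0][0] != low or ordered[-1][1] != high:
        return False  # not exhaustive at the ends
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start != prev_end + 1:
            return False  # a gap (or an overlap) between adjacent options
    return True

# Age options "18-25" and "25-34" overlap at 25, so the check fails;
# "18-24" and "25-34" are mutually exclusive, so it passes.
bad = ranges_ok([(18, 25), (25, 34), (35, 120)], 18, 120)
good = ranges_ok([(18, 24), (25, 34), (35, 120)], 18, 120)
```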

Closed-ended questions with ordered response categories are frequently used in surveys. Typically, a statement is given that describes a point of view, and the respondent is asked to evaluate the statement by choosing a response on a scale. Common
options for the kind of response asked for include the following:

•• Strongly agree to strongly disagree
•• Very favorable to very unfavorable
•• Extremely satisfied to extremely dissatisfied
•• Excellent to poor
•• High priority to low priority
•• Very important to unimportant
•• Very frequently to never

There are many ways to construct scaled responses. The scales can be constructed
with a different number of response categories or points. The number typically ranges
from 3 to 7 points and sometimes 10 points. (See Figure 6.1.) The scale of responses,
with however many points, needs to follow the rules above about being exhaustive and
mutually exclusive. Some care must be taken when defining the categories. The order
of items should also be constructed to indicate an equal distance between them. This
is important to keep the categories intelligibly distinct for the respondent. In addition,
equal intervals between the items allow the researcher to interpret the responses as
continuous numbers to quantify the responses. We will discuss the quantification of
interval responses further in Chapter 7.
The most common scale is known as the Likert scale, composed of multiple question items, with response options arranged horizontally on 5 or 7 points that include a neutral midpoint. (See Figure 6.2.) Each point on the scale is associated with a consecutive number (usually 1, 2, 3, 4, and 5). Each question item in the Likert scale is called a Likert item and is composed of the statement the respondent is asked to evaluate and the response options. Likert scale response options indicate equal distance between each option. Survey response options that do not fit the format described here are not a Likert scale and might be referred to as a Likert-type scale or ordered-category items. Note that response options without a neutral midpoint, as with most four-point scales, may leave the respondent without an adequate option for a neutral response.
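Because the equal intervals let the researcher treat responses as numbers, Likert responses are commonly coded 1 through 5 and summarized numerically. A minimal sketch, with a hypothetical label-to-number mapping and made-up responses:

```python
# Hypothetical coding of a 5-point Likert item's response options
SCALE = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neither agree nor disagree": 3,
    "Agree": 4,
    "Strongly agree": 5,
}

# Made-up responses from four respondents to one Likert item
responses = ["Agree", "Strongly agree", "Neither agree nor disagree", "Agree"]
scores = [SCALE[r] for r in responses]  # [4, 5, 3, 4]
mean_score = sum(scores) / len(scores)  # (4 + 5 + 3 + 4) / 4 = 4.0
```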

Operationalizing the Concept


Operationalization refers to the process of developing research procedures (opera-
tions) that will allow empirical observations to represent the concepts in the real world
(Babbie, 2013). With a quantitative approach this means finding ways to turn the con-
cepts into measurable quantities. With qualitative research, it may mean finding ways
to elicit responses to observe the point of interest.
To operationalize a concept, a researcher first needs to refine and specify the abstract concepts in the research (conceptualization). We saw the beginning of this process of conceptualization in Emily’s case. Her research team discovered that they needed explicit definitions of cultural competence and workplace conflict. How the team will measure these things remains unclear so far. The process can take several steps. For example, the team could conceptualize cultural competence as a variety of behaviors and attitudes about culture and skills to communicate with those from different cultural backgrounds. For Emily to operationalize the concepts, she will need to decide if she wants quantitative or qualitative data, or both, and determine what she expects to observe to represent the existence of the concepts. These definite observations are sometimes called indicators. With a survey, Emily’s team may be able to construct survey questions that elicit direct responses about the characteristic indicators of cultural competence that were identified. In this case, the survey questions and the possible responses would reflect the operational definitions of the concepts the team wants to measure.

Figure 6.1   Examples of Different Response Options

A. Three point scale item

“What do you think the city’s priority for diversity should be?”

    1 = Low priority    2 = Neutral    3 = High priority

B. Four point scale item

“How do you rate the importance of offering diversity training to the employees?”

    1 = Unimportant    2 = Not so important    3 = Important    4 = Very important

C. Five point scale item

“How satisfied are you with the city’s diversity efforts?”

    1 = Very dissatisfied    2 = Dissatisfied    3 = Neither satisfied nor dissatisfied    4 = Satisfied    5 = Very satisfied

D. Seven point scale item

“How frequently have you encountered comments that you consider racist?”

    1 = Strongly disagree    2 = Disagree    3 = Somewhat disagree    4 = Neither agree nor disagree    5 = Somewhat agree    6 = Agree    7 = Strongly agree

Figure 6.2   Example of Likert Scale (With Multiple Likert Items)

Each item is rated on the same response options: 1 = Very rarely    2 = Rarely    3 = Sometimes    4 = Frequently    5 = Very frequently

•• To what extent are there differences of opinion in your team?
•• How often do the members of your team disagree about how things should be done?
•• How often do the members of your team disagree about which procedures should be used to do your work?
•• To what extent are the arguments in your team task-related?
•• How much are personality clashes evident in your team?
•• How much tension is there among the members of your team?
•• How often do people get angry while working in your team?
•• How much jealousy or rivalry is there among the members of your team?

Mode of Survey Administration

When you plan to conduct a survey, the mode of administering the survey is a primary
concern. Currently, using the Internet and email is becoming a common way to
administer surveys. Web-based surveys make it easier for researchers to contact a
large group of potential study participants who can complete and submit the answers
electronically. If the target population of your study does not have regular access to a
computer or other electronic device, web-based surveys may not be a preferable mode
of administering the survey.
Alternatively, many large-scale surveys are administered via telephone. Telephone surveys became more popular with the introduction of the computer-assisted telephone interview (CATI) system. Many companies that specialize in polling and surveys have used the CATI system to administer surveys. However, with caller ID
technology and more people opting out from listing their phone numbers in the phone
book, it is getting harder to administer surveys via telephone. Also, like the web-based
survey, a researcher has to consider if all individuals in the target population have
access to a telephone.
Of course, surveys can still be administered by using paper and pencil. The typi-
cal way to administer a paper and pencil survey is by mail to a home or business
address. Mail surveys are a comparatively easy approach to administering surveys, if
you can access the addresses of your target audience. However, mail surveys can eas-
ily be mistaken as junk mail, or put aside and forgotten. Another way to administer a
paper and pencil survey is to distribute the survey forms to a group of individuals in
person when they gather in one place for a particular occasion. In Emily’s case, for
example, the research team can distribute survey forms to the employees who come
to attend the training. The advantage of this approach is that the researchers have a
captive audience, and participants are more likely to complete the survey and return it on the spot.
Another variation of survey administration is to have a face-to-face interview.
In this case, the interviewer meets with the respondent face-to-face. Instead of the
respondents filling out the survey form by themselves, the interviewer fills out either
a paper or a web-based questionnaire for the respondents. This approach is espe-
cially effective if the target audience has a harder time reading the survey from paper
or the computer.
Table 6.1 summarizes the advantages and disadvantages of different modes of
survey administration. In terms of time and effort, the web-based survey does not cost
a researcher much. The email survey requires more effort to compose a mailing list.
Distributing survey forms where respondents gather in one place may be nearly as easy
as the web-based survey, with some additional printing costs. Other modes of admin-
istering surveys can be time intensive and relatively expensive. The telephone survey
and face-to-face interviews require individual attention to each respondent. Telephone
surveys can also be expensive if a firm is hired to do the job with CATI equipment.
Mail surveys can be expensive in printing and mailing costs, including return postage for responses, and may require a team to stuff envelopes.
Table 6.1   Advantages and Disadvantages of Different Modes of Survey Administration

Web-based survey
  Advantages:
  •• Easy to reach large number of respondents who have Web access.
  •• Respondents can take the survey at their convenience.
  •• Relatively low cost.
  •• Automated data entry.
  •• Can assure anonymity.
  Disadvantages:
  •• Cannot reach respondents who do not have Web access.
  •• Need to control one respondent accessing the survey multiple times.
  •• Possible technology failure can hinder data collection.
  •• Does not allow respondents to ask clarifying questions while taking the survey.

Telephone survey
  Advantages:
  •• Easy to reach large number of respondents who have telephone access.
  •• Skilled telephone survey administrators can increase response rates.
  •• Easy to track nonresponse.
  •• Data can be entered by the telephone survey administrators during the phone call.
  •• Allows respondents to ask clarifying questions during survey administration.
  Disadvantages:
  •• Cannot reach respondents who do not have a telephone.
  •• Harder to access respondents with caller ID and call blocking.
  •• Harder to access respondents at all times.
  •• Need trained telephone survey administrators.
  •• More costly to hire telephone survey administrators.
  •• Cannot assure anonymity.

Mail survey
  Advantages:
  •• Easy to reach large number of respondents with the address information.
  •• Respondents can take the survey at their convenience.
  •• Can assure anonymity.
  Disadvantages:
  •• Cannot reach respondents who do not have publicly available address information.
  •• Higher cost to send surveys by mail.
  •• Need to enter data manually after survey forms are returned.
  •• Does not allow respondents to ask clarifying questions while taking the survey.

Face-to-face interview
  Advantages:
  •• Skilled survey interviewers can increase response rates.
  •• Better interface for those who do not have easy Web access or telephone access.
  •• Better interface for respondents who have difficulties in reading survey questions on the Web or paper.
  •• Allows respondents to ask clarifying questions during survey administration.
  Disadvantages:
  •• Harder to reach large number of respondents.
  •• Need to train survey interviewers.
  •• Cannot assure anonymity.

We now have a foundation to look in again on Emily’s research team and their survey planning.

Emily’s Case

Before they concluded their weekly meeting, Emily, Mei-Lin, and Leo returned to the topic of their training evaluation survey. They still needed to finalize the definitions of cultural competence and workplace conflict, but Emily was also concerned about how to administer the survey. She reminded Mei-Lin and Leo that they had to survey the sample of employees in the control group, who would not be coming to the training, as well as those who came to the training.

“What are the options?” Emily asked.
Leo offered a quick account of the pros and cons of different ways to administer the
survey. Each way they could do it seemed to have a drawback.
“How about this,” Emily started. “We need to engage the study participants and explain
what we are doing, so what if, once the 80 training participants and the 80 control group
members are selected, we invite them to a lunchtime orientation session. I could make
participation mandatory, but in exchange, I’ll pay for lunch. At the orientation, we will ask
them to fill out a paper survey. That will be our baseline. Then, after all four sessions of
the training are completed, we will organize another lunch gathering for everyone and do the
same thing. That will give us our remeasurement. This way we get access to everyone at
the same time before and after the training session.”
Leo was impressed. “That’s brilliant!”
Emily smiled. “Thanks for laying out my options. It was your idea. Nothing else worked.”
They all laughed and adjourned.

Interview

Interviewing is another popular primary data collection method. In an interview, the researcher meets with the respondent in person, in most cases face-to-face, but
it can also be done over the phone or by using a webcam. Interviews allow the
researcher to interact with the respondent at a personal level and capture their
insights as qualitative data. Robson (1993) describes interviews as “a kind of
conversation; a conversation with a purpose” (p. 228). Interviews allow the
researcher to develop a deeper and richer understanding of the phenomenon being
researched.

Interview Guide: Instrument for Qualitative Data Collection


One of the data collection instruments used in the interview is the interview guide.
The interview guide is a list of issues and questions to be addressed in the interview.
As the name suggests, it is a general guide for the interviewer during the interview,
with some degree of flexibility. It is not meant to be a strict protocol to follow.
Qualitative research experts (Kvale & Brinkmann, 2009; Lofland, Snow, Anderson,
& Lofland, 2006) recommend including the following elements in the interview
guide:

•• Introduction: What to say when setting up the interview and at the begin-
ning of the interview, including asking for informed consent and assuring
confidentiality of the interviewees. If audio recording the interview, ask for
permission.
•• Main interview questions and possible probing questions.
•• Conclusion: What to say in concluding the interviews.

The following are some key things to consider when formulating the main inter-
view questions:

•• Identify key topic areas that are relevant to your research question. Formulate
interview questions in a way that will help obtain information to address the
research question.
•• Consider the order of the questions. Make sure there is a logical flow in the
order of the questions. Be prepared to alter the order of the questions during the
interview.
•• Do not ask leading questions. Make sure the language and the terms used in the
questions are relevant to the interviewees.
•• Phrase the question using Who, What, When, Where, Why, and How.
•• Ask the interviewees to describe the facts before asking them about their opinions.

To probe the interviewees to expand on their answers, use the following phrases:

•• Would you mind giving me an example?
•• Can you elaborate on that point?
•• Would you mind explaining that further?
•• Is there anything you would like to add?

It is usually not advisable to ask too many questions in one interview session,
especially when the purpose of the interview is to have the interviewee elaborate on his
or her observations and ideas. To develop a sense of how well the interview questions
work and how long it may take for the interviewees to answer the questions, it is useful
to conduct a couple of pilot interviews and adjust the interview questions accordingly.
In Mary’s case, for example, she will probably want to conduct a couple of pilot inter-
views and make adjustments as she learns from the experience. Table 6.2 shows a
sample interview guide for Mary.
Interview data is usually transcribed and coded. Qualitative data analysis will be
discussed in more detail in Chapter 14.

Focus Group

The focus group is a research technique to collect qualitative data from several
individuals at once “through group interaction on a topic determined by the
researchers” (Morgan, 1996, p. 130). The group interview usually includes six to 12
individuals for a period of 1 to 3 hours. A trained moderator prompts the group to
explore a set of topics with a specific focus. Outside academia, focus groups have been
used widely in marketing and political campaigns (Krueger & Casey, 2000; Morgan,
1988).
The advantage of the focus group is that the researcher gains access to several
perspectives at once from the population of interest, including in-depth discussions
prompted by the interaction of fellow participants. With effective moderation, the
focus group participants open up and follow lines of thought in the form of a conversation.

Table 6.2   Mary’s Interview Guide

Date:
Volunteer Name:
Profile Info: Gender __________________ Age __________________
Residence Area:__________________

Checklist:  Audio recorder,  Informed consent form,  Business card


NOTES:

Key points Statements/Questions

Thank you ______________ Thank you for taking the time to meet with me today.

Self intro My name is Mary and I’m a program manager at Health First. [If I know the
volunteer, personalize the information and talk where we met etc.]

Purpose I would like to talk to you about your experiences volunteering at Health First.
Specifically, I would like to have your thoughts on how we can recruit more
volunteers. I would also like to hear from you what we can do so you and
others would keep volunteering with us.

Time The interview should take less than an hour. [Check how much flexibility the
interviewee has with the time. Ask if it’s OK if it takes a little longer.]

Recording /note I will be audio recording our conversation because I don’t want to miss any of
taking your comments. I will be taking notes during our conversation, but I can’t
possibly write fast enough to get it all down. [Ask for permission to start the
recording.]

Confidentiality What we discuss today will be kept confidential. This means that your
interview responses will only be shared with me and my research team
members. We will ensure that any information we include in our report does
not identify you as the respondent. Remember, you don’t have to talk about
anything you don’t want to and you may end the interview at any time. Also,
what you share with me today will not affect your relationship with Health
First.

Opportunity for Are there any questions about what I have just explained?
question Are you willing to participate in this interview?

Informed consent The terms of this interview and what I have just explained are described in
form this form. Please read it and sign at the bottom of the form if you agree to
participate in this interview. [Ask for signature on the informed consent form.
Give one copy to the interviewee.]

(Continued)
104  ❖  SECTION I  RESEARCH DESIGN AND DATA COLLECTION

Table 6.2  (Continued)

Key points Statements/Questions

Q1 Initial First of all, can you tell me how you came to volunteer at Health First?
motivation
(How they found Health First)
(What appealed to them)
(Why they chose Health First)
(Why volunteer)

Q2 Experience Tell me how long you’ve been with Health First, and your main assigned tasks.
How would you describe your experience here at Health First, overall? What do
you like? What do you not like?

(Ask if the workload is reasonable)

Q3 Retention What are the key things that kept you volunteering at Health First?
(yourself)

Q4 Retention I’m sure you know other volunteers at Health First. Some stay and some leave.
(others) Why do you think some people stay? And why do you think some people
leave?

Q5 Retention What can we do to help you feel more valued as a volunteer?


(Improvements)

Q6 Recruitment Any thoughts or ideas on what we can do to recruit more volunteers?

Q7 Additional Is there anything you would like to add?
comments

Closing Once again, thank you very much for your time and sharing your insights. I’ve
Thank you learned a lot from our conversation today.

Follow-up Will it be all right if I contact you later if I need to clarify something you
mentioned in today’s interview?

Snowball Also, do you know anyone else you would recommend I talk to about
their experience at Health First?

Question My contact information is on the informed consent form, but just in case, here’s
my business card, too. If you have any concerns or questions please feel free to
contact me. [Give the business card]

The group approach can make study participants feel more comfortable sharing their
ideas than in one-on-one interviews with the researcher. Researchers also note that the
focus group is an effective approach to give voice to marginalized members of the
community (Morgan, 1996).
In Mary’s case, she noticed how the stories of Ruth and John about other vol-
unteers at Health First built on each other and became more detailed as they talked.
This made her reflect on the advantages of a focus group to get several volunteers
to talk among themselves while she listened and added occasional questions related
to recruitment and retention. This could make them feel more comfortable talking
about their experiences that other volunteers might share or that they hoped other
volunteers would understand. They would be talking to each other and not
expressly to her.
Morgan (1995) advises researchers to consider the following five points to conduct
effective focus groups:

1. Recruiting: Can you locate people to interview?
2. Sampling: Are you interviewing the right people?
3. Developing questions: What will you ask?
4. Moderating: How will you interact with the participants?
5. Analyzing: What will you do with the data?

Recruiting strategies suggested by focus group experts (e.g. Krueger & Casey,
2000) include repeated contacts with the potential focus group participants, offering
incentives, and overrecruiting in case some participants do not show up. In Mary’s
case, she will need to find a convenient time for 6 to 12 volunteers to meet, which
might be difficult. Arranging the time and venue is a critical feature of the focus group
and can require feedback from potential participants to discover what works. Mary will
also need to consider whom to invite to answer her questions. For example, if she
wants the perspectives of people who live in different parts of town, she will need to
decide if she will bring together volunteers from different geographical locations or
hold separate focus groups for each locality. Also, how will she mix
in the short list of volunteers she composed of individuals who had been at Health First
much longer than anyone else?
As with any interview process, the focus group moderator needs to prepare ques-
tions (Morgan, Krueger, & King, 1998). Morgan (1995) points out that a frequent
mistake researchers make in developing focus group questions is preparing too many
questions and not paying sufficient attention to the concerns raised by the focus group
participants during the session. On the other hand, the moderator will probably need
to intervene occasionally to keep the conversation on course, according to the purpose
of the meeting. Conversations can easily wander and get out of control. The skill of the
moderator will determine the quality of the data obtained in the focus group
(Greenbaum, 2000). Keeping the purpose of the focus group clearly in mind and what
kind of data is needed to answer the research questions will help the researcher identify
focused questions and help direct how closely the sessions are moderated.
In Mary’s case, she will need to prepare questions for the group to match what she
wants to know, and she will also need to decide if she is the right person to moderate.
Will volunteers feel comfortable talking freely? Will she feel comfortable controlling the
group enough to stay on track? If not, Mary may want to find someone else who can
effectively moderate the focus group.
Each focus group will produce a considerable quantity of raw data. In considering
whether or not to hold multiple focus groups, the researcher will want to consider not
only the logistics of arranging each session but also the capacity to process and analyze
the data. If possible, and acceptable to all of the participants, focus group sessions
should be audio recorded for later transcription and analysis. It is also a good idea to
have one or two assistants in the room to take detailed notes, in case voices are lost in
the recording. Planning for the focus group should include arranging the technology
to record and transcribe the sessions (Morgan et al., 1998). The researcher should also
consider what tools and procedures will be used to analyze the qualitative data and if
that will require a particular data format.

Other Qualitative Data Collection Methods

Other approaches to qualitative data collection, in addition to interviews and focus


groups, include observation and textual analysis (Coxon, 1999; King, Keohane, &
Verba, 1994). Observation is the act of watching the phenomenon or the behavior you
are interested in researching and recording it so you can describe, analyze, and
interpret what it means. In observation, the researcher can assume one of two roles:
participant observer or nonparticipant observer.
In participant observation, the researcher becomes a member of the observed
group while collecting data. In Mary’s case, for example, she could consider arranging
for an assistant to work alongside volunteers to report on the observed experience of
the volunteers in their own active environment. Participant observation is commonly
used in ethnographic research (Atkinson, 2001; Hume & Mulcock, 2004).
Nonparticipant observation maintains distance between the actor and the
observer. This approach to observation may produce less intensive insight into the
actor’s experiences, but offers more flexibility in how the data are collected. In
Mary’s case, for example, she could station herself in a position to observe volun-
teers at work and take notes in a nonstructured manner, or she could structure her
observations by establishing categories of key events beforehand and make a record
when they occur. Nonparticipant observation can also include video or audio
recordings.
Structured observation is difficult to achieve as a participant observer. However,
even in nonstructured observation, it is worth noting that the researcher directs what
kinds of data are recorded. In the same way that surveys, interviews, and focus groups
require predetermined questions and formats to obtain responses from study partici-
pants, observation requires some degree of deductive reasoning, or theory, to define
what will be observed. Action is infinite and parameters of some kind are necessary to
decide what is relevant, without neglecting novel issues that may arise. This kind of
framing for observational studies is developed in a field of qualitative research called
empirical phenomenology (Aspers, 2009).
In both approaches to observation, as a participant or nonparticipant, the
researcher should consider the possibility that the actor’s behavior is affected by an
awareness of being watched. This is known as the observer effect. More generally,
the same issue arises in any study—sometimes called the Hawthorne effect
(Gillespie, 1991)—recognizing that participant behavior as well as researcher views
may be affected by the act of research itself. In the discussion of research design in
Chapter 4, we observed a related phenomenon in the placebo effect. In quantitative
research, efforts have been made to control for these effects in strictly controlled
trials by implementing single-blind studies (where participants do not know if
they are in a treatment or control group) or double-blind studies (where both par-
ticipants and the researchers collecting the data do not know who is in which
group). In qualitative research, similar considerations should be given to unin-
tended effects on the participants and potential bias on the part of the researchers
collecting the data.
Another method of qualitative data collection involves collecting and analyzing
the content of texts called textual analysis or content analysis. Unlike the data collec-
tion methods discussed above, this approach does not involve direct participant feed-
back or observation. Interest in the texts is due to the independent purposes of the
authors who created them, prior to the research. The source of the texts could be a type
of organization, a type of media, a time period, a content area, key authors, or some
other distinguishing characteristic. Data collection might focus on corporate annual
reports, newspapers, legislation, historical documents, or electronic media, such as
websites, blogs, emails, or social media postings. Many different methods and discus-
sions surround the area of content analysis, and it is sometimes drawn into quantitative
analysis and theory testing. Both quantitative and qualitative approaches apply
coding for the presence of certain words, ideas, or characteristics, or identify
themes that are relevant to the research objective.
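The coding step described above can be illustrated with a minimal sketch (shown here in Python, although this book's own analysis tools are SPSS and Excel). The documents, theme names, and indicator words below are invented for illustration; in a real project the coding scheme would be derived from the research objective.

```python
# Minimal sketch of keyword-presence coding for content analysis.
# The documents and coding scheme are hypothetical examples.
documents = [
    "Our volunteers reported feeling valued and supported this year.",
    "Budget cuts reduced staff support for the annual health fair.",
    "Recruitment of new volunteers remains a priority for the board.",
]

# Each theme is defined by a list of indicator words or phrases.
coding_scheme = {
    "recruitment": ["recruit", "new volunteers"],
    "support": ["support", "valued"],
}

def code_document(text, scheme):
    """Return a dict marking which themes are present in one text."""
    lowered = text.lower()
    return {theme: any(word in lowered for word in words)
            for theme, words in scheme.items()}

for doc in documents:
    codes = code_document(doc, coding_scheme)
    present = sorted(theme for theme, hit in codes.items() if hit)
    print(present)
```

Note that this simple substring matching is only a starting point; qualitative researchers typically refine codes iteratively as themes emerge from the texts.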
Table 6.3 summarizes the characteristics of the qualitative data collection
approaches discussed in this chapter. With this foundation, we can return to Mary’s
case and see how she develops her approach to data collection.

Table 6.3  Summary of Common Approaches to Qualitative Data Collection

Type              Description
Interview         Using a guide of structured or semi-structured questions, the
                  researcher asks the participant questions, collecting the
                  responses as written data.
Focus Group       Small groups are organized around a specific topic and the
                  researcher will guide the group dialogue. The researcher can
                  either be part of the conversation, or a facilitator can guide
                  the conversation with the researcher observing and taking notes.
Observation       There are two types of observation—participant and nonparticipant
                  observation. As indicated by the names, participant observation is
                  conducted when the researcher is part of the activity being
                  observed, such as having coffee in a coffee shop while observing
                  other customers. In nonparticipant observation, the researcher is
                  removed from the research setting, which could also include
                  watching video surveillance.
Textual Analysis  Some qualitative research is completed from previously written
                  material or manuscripts. This is called textual analysis. For
                  example, one might be interested in perceptions of federalism
                  during the forming of the Constitution. Since we are unable to
                  interview or observe participants in a past time period, we can
                  read transcripts, interviews, and other texts to analyze an
                  event, perceptions, or other written documentation.

Mary’s Case

Mary was still leaning toward doing a long interview to collect data for her
research project on volunteer recruitment and retention, but she was starting to
think of alternatives, too, which she learned about in the qualitative research
books Yuki loaned to her. A focus group looked like a possibility. Back in her
office, weighing this idea, she looked out the window to the courtyard and saw
three older women who looked like volunteers sitting in garden chairs under a sun
umbrella, sipping coffee. Another option occurred to her. “Maybe I
should mingle with the volunteers in their own environment and do a participant observa-
tion.” She imagined herself at the annual health fair organized by Health First, working
there with the volunteers, but she quickly dismissed the idea. “No way. I can’t do that. First
of all, the volunteers know I’m their coordinator. I can’t be one of them. It will confuse them.
Plus, it makes me feel like a spy. Not comfortable.”
Mary stared blankly at the women in the courtyard while she thought. They were engaged
in a lively conversation. Then they burst into laughter. “These women would feel comfortable
if they were together in a focus group,” Mary reasoned to herself. “They might be more likely
to express what they think if they know someone else is there who would be sympathetic to
their views. Plus, I could get input from a number of volunteers all at the same time.”
As the idea of a focus group took hold over the idea of a long interview, Mary remem-
bered how the book talked about the importance of having a skilled moderator. “Well, that’s
a problem,” she thought. “I don’t think I can facilitate a focus group. I’m not a great facili-
tator. I haven’t done anything like that before, and I don’t have money to hire someone to
do it.” Then she thought of what she would need: a room, a digital recorder, and a portable
computer to take notes … unconsciously, she shook her head. “I don’t think we have any
rooms where I could record a group conversation and get everyone on it.” With that, Mary
convinced herself that a long interview was still the best option.

Using Secondary Data

Secondary data refers to data not collected by the researcher or collected for a purpose
other than the current research. In short, secondary data is data used for a second time,
or for a second purpose. There are two ways to obtain secondary data: one way is to
access data from a data archive; another way is to use administrative records and
management information. Some data collected for a particular research purpose or a
particular project are archived and made available for other researchers to use. The
U.S. Census is one of the most widely used data archives. There are many other data
archives available for researchers to use.
Administrative records and management information are another good source of
secondary data. Organizations, such as government agencies, nonprofit organizations,
schools, and hospitals typically collect records and information related to their functions
and management. Sometimes, legislation or certain governing bodies mandate the col-
lection of certain administrative data. Additional sources may be available when organi-
zations collect information to track their performance for quality improvement. The
wealth of secondary data publicly available or potentially available through private orga-
nizations makes it a prime resource for researchers. Some research questions may be well
served by secondary data. Researchers should consider at the outset of a project if the
appropriate data might already exist and needs only to be found and used.
The use of secondary data can help researchers save time, money, and administra-
tive resources. Despite these advantages, the following list presents a few cautionary
points that need to be considered before using secondary data:

•• How does the data fit the research objective?
•• What are the costs for securing the data (because not all secondary data is free)?
•• How can the researcher verify the accuracy of the data?
•• Is the data current and up to date for the research?
•• If the data is chronological, has the measurement been kept consistent across time?
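When secondary data is machine readable, some of these cautionary points (accuracy and consistency across time) can be screened with simple automated checks. The sketch below is hypothetical: the monthly response-time averages echo Jim's situation but are invented, as is the 25% deviation threshold used to flag suspicious values.

```python
# Hypothetical monthly average response times (minutes) from a records
# system; None marks a month with missing data. Values and the 25%
# threshold are invented for illustration.
monthly_avg = {
    "2011-07": 6.1, "2011-08": 6.3, "2011-09": None,  # missing month
    "2011-10": 9.8,                                   # transition-period value
    "2011-11": 6.0, "2011-12": 6.2,
}

valid = [v for v in monthly_avg.values() if v is not None]
overall = sum(valid) / len(valid)

# Flag months that are missing or far from the overall mean.
problems = []
for month, value in sorted(monthly_avg.items()):
    if value is None:
        problems.append((month, "missing"))
    elif abs(value - overall) / overall > 0.25:
        problems.append((month, "possible anomaly"))

print(f"overall mean: {overall:.2f}")
print(problems)
```

A flagged month is not proof of bad data; as in Jim's case, it is a prompt to ask the data's custodian what happened during that period before including it in the analysis.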

Jim’s Case
Jim talked in a torrent to Kathy, the operations manager at the
Rockwood Fire Department, describing the accreditation project he
was working on and how he needed response time data and break-
downs by station and several years of data to analyze trends. When
he paused for breath, Kathy inserted a question.
“How far back do you want to go?”
Jim wasn’t quite sure. He didn’t recall a specific time frame. “I’m
not sure. Maybe just a year or two? Our self-study year is the last fis-
cal year, so my guess is at minimum from last July to this June. It may
not hurt to include the year before, I guess. What do you think?”
“I can certainly give you the response-time data for the last two fiscal years,” Kathy
responded, but her voice indicated this was a good news–bad news situation. “I have to
warn you, though, two years ago, the 911 system changed in the middle of the fiscal year.
During the transition, they had a lot of trouble with the computerized data recording sys-
tem. So there is a time period when the data could be missing, or wrong. I remember taking
a quick look at the response-time data for that year, and the times looked a lot worse than
the year before and the year after. If you have to include that year in your analysis, I’m not
sure we will meet the standard.”
Jim’s stomach lurched a little. He took a mental note to himself—check the exact time
frame for the self-study.
“That’s good to know, Kathy. Sort of. Thanks. Anything else I need to be aware of?”
Kathy gave him a thin smile, trying to be reassuring. “I’ll let you know if I come across
anything else. Wait, one other thing. What format do you want for the data? I can run the
query in any way you like. Do you want the daily average, monthly average, or yearly aver-
age? And do you want it per engine average, per station average, or for the whole fire
department?”
Jim’s eyes widened. He had not thought of any of these things. This was going to take
a lot of decisions before he could even see the data. So much for walking out with it in
his hand.

Ethical Considerations in Data Collection

Before collecting data, researchers need to think about ethical considerations.


Primarily, the researcher needs to assure participants of the following points:

•• Informed, voluntary participation
•• Physical and psychological well-being
•• Objective data collection
•• Confidentiality or anonymity of the identity

Informed voluntary participation. In the process of data collection, researchers are


obligated to inform the study participants about the purpose of the study and obtain
consent to participate. Under no circumstances may study participants be coerced to
participate in a study. Participation must be voluntary. For a survey, researchers typically
attach a cover letter or introductory statement that explains the purpose of the study,
background of the researchers, information about the sponsors if the study is sponsored,
and other information required for the respondents to make an informed decision to
participate in the survey or not. In the cover letter, researchers should assure poten-
tial respondents that their participation is voluntary and the decision to participate
or not participate in the survey does not affect their relationship with the researchers,
the organization the researchers are affiliated with, or the respondents’ standing in the
organization or the community to which they belong. There should also be informa-
tion on how the survey data will be disseminated and whether the respondents’ identity
will be kept confidential or anonymous. With this information attached to the survey,
the submission of the survey response can be considered as consent to participate. For
other data collection methods—interviews, focus groups, and observation—similar


information should be shared with the study participants during recruitment or before
the researcher starts collecting the data. In these situations, the researcher should pre-
pare an informed consent form and have the study participants sign the form before
starting to collect data. Some exceptions to this rule may apply to studies related to
quality improvement and normal business operations, which are distinguished from
research (e.g. for health care: U.S. Code of Federal Regulations, 45 CFR 46, Protection
of Human Subjects).

Physical and psychological well-being. Researchers should consider possible impli-


cations of the study on the well-being of study participants before initiating the project.
Every effort needs to be made to avoid any potential harm. In asking questions, research-
ers need to be mindful of the psychological and emotional impact on the respondents,
especially if they touch on personal and sensitive issues. Special attention should
be given to vulnerable populations, such as young children, older adults, people with
disabilities, and individuals who are socially marginalized.

Objective data collection. Researchers have an ethical obligation to collect data objectively
and to not unduly influence the study subject with the researcher’s own bias. In develop-
ing questions, researchers need to pay close attention to the wording of the questions
and avoid phrasing them in a way that could lead to answers that reflect the researcher’s
preference or bias on the subject matter. Researchers also need to pay attention to the
effect of their presence on respondents during data collection. This could involve a num-
ber of issues that might make the respondents feel uncomfortable, including hygiene,
fashion, gender, race, language, or other cultural differences. Issues could also arise when
the researcher has some relationship with the study participants, particularly when the
relationship involves organizational hierarchy and a power differential.

Confidentiality or anonymity of the identity. Some research includes collecting infor-


mation that might potentially harm the respondents if their answers were disclosed.
In these instances, researchers have an ethical obligation to prevent any possibility of
harming respondents due to the disclosure of information obtained in the research.
Any information that will link a respondent’s identity with the information collected
in the research needs to be kept confidential and should not be shared with anyone
other than key research personnel. In web-based, telephone, and mail surveys, when
the researcher does not need to follow up with respondents, it is possible
to collect data with anonymity. Any identifying information that was necessary to con-
tact respondents can be destroyed. In other cases, such as surveys that require archived
contact information, email surveys, face-to-face surveys, interviews, focus groups, and
observation, researchers can identify particular individuals with specific information,
so the data cannot be anonymous. In such cases, the researcher needs to keep the iden-
tity of the study participants confidential. In either case, whether individual identities
are kept anonymous or confidential, the researcher needs to inform the study partici-
pants about how their identities are protected.
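In practice, keeping identities confidential often means replacing names with participant codes and storing the name-to-code key separately from the data. The following sketch illustrates the idea; the records and names are invented, and a real project would also restrict access to the key itself (or destroy it when anonymity, rather than confidentiality, is required).

```python
# Hypothetical interview records; the names are invented examples.
records = [
    {"name": "Ruth Alvarez", "response": "I stay because I feel useful."},
    {"name": "John Baker", "response": "Scheduling is hard for me."},
]

# Replace each name with a participant code. The name-to-code key is
# kept in a separate structure that would be stored apart from the data.
key = {}
deidentified = []
for i, rec in enumerate(records, start=1):
    code = f"P{i:03d}"              # e.g., P001, P002, ...
    key[rec["name"]] = code
    deidentified.append({"participant": code, "response": rec["response"]})

print(deidentified)
print(key)
```

The de-identified records can then be shared with the research team, while only key research personnel retain the file linking codes back to identities.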

Chapter Summary
In this chapter, we discussed data collection methods. This is step 5 of the research flow. There are
two different types of data: quantitative data and qualitative data. Methods of data collection will
be different, depending on the type of data you would like to collect for your research. We
described different methods of data collection, including the use of secondary data, and several
advantages and disadvantages associated with them. We also discussed primary ethical
considerations researchers need to follow to protect participants in the research.

Review and Discussion Questions


1. What is the difference between primary data and secondary data? Name an example of each.
2. Think of an example of a research question that would be appropriate for using quantitative or
qualitative data. How are they similar and how are they different?
3. Discuss why a researcher needs to be concerned about the survey response rate. What are the
implications of a survey having a low response rate?
4. Find a survey printed in a magazine, newspaper, or other current media. Review the questions
in the survey and critique them.
5. Identify situations where open-ended questions are more appropriate than closed-ended
questions. Think about the advantage of using closed-ended questions over open-ended
questions.
6. Discuss the advantages and disadvantages of different data collection methods. Compare
survey, interview, focus group, observation, and using secondary data.
7. What ethical issues might Emily, Jim, and Mary need to take into consideration in their data
collection?
8. Develop an interview guide for Emily to assist in interviewing employees about the diversity
training.

References
Aspers, P. (2009). Empirical phenomenology: A qualitative research approach (the Cologne Seminars). Indo-
Pacific Journal of Phenomenology, 9(2), 1–12.
Atkinson, P. (2001). Handbook of ethnography. London, UK: Sage.
Babbie, E. R. (2013). The practice of social research. Belmont, CA: Wadsworth Cengage Learning.
Baruch, Y. (1999). Response rate in academic studies—a comparative analysis. Human Relations, 52(4),
421–438.
Baruch, Y., & Holtom, B. (2008). Survey response rate levels and trends in organizational research. Human
Relations, 61(8), 1139–1160.
Coxon, A. P. M. (1999). Sorting data: Collection and analysis. Thousand Oaks, CA: Sage.
De Vaus, D. A. (1986). Surveys in social research. London, UK: Allen & Unwin.
Dillman, D. A., Smyth, J. D., & Christian, L. M. (2009). Internet, mail, and mixed-mode surveys: The tailored
design method. Hoboken, NJ: Wiley.
Fowler, F. J. (1993). Survey research methods. Newbury Park, CA: Sage.
Greenbaum, T. L. (2000). Moderating focus groups: A practical guide for group facilitation. Thousand Oaks,
CA: Sage.
Groves, R. M., Fowler, F. J., Couper, M., Lepkowski, J. M., Singer, E., & Tourangeau, R. (2009). Survey
methodology. Hoboken, NJ: Wiley.
Hume, L., & Mulcock, J. (2004). Anthropologists in the field: Cases in participant observation. New York, NY:
Columbia University Press.
King, G., Keohane, R. O., & Verba, S. (1994). Designing social inquiry: Scientific inference in qualitative
research. Princeton, NJ: Princeton University Press.
Krueger, R. A., & Casey, M. A. (2000). Focus groups: A practical guide for applied research. Thousand Oaks,
CA: Sage.
Kvale, S., & Brinkmann, S. (2009). InterViews: Learning the craft of qualitative research interviewing. Los
Angeles, CA: Sage.
Lofland, J., Snow, D. A., Anderson, L., & Lofland, L. H. (2006). Analyzing social settings: A guide to qualitative
observation and analysis (4th ed.). Belmont, CA: Wadsworth.
Millar, M. M., & Dillman, D. A. (2011). Improving response to web and mixed-mode surveys. Public Opinion
Quarterly, 75(2), 249–269.
Moore, D., McCabe, G., & Craig, B. (2010). Introduction to the practice of statistics (7th ed.). New York, NY:
Freeman.
Morgan, D. L. (1988). Focus groups as qualitative research. Newbury Park, CA: Sage.
Morgan, D. L. (1995). Why things (sometimes) go wrong in focus groups. Qualitative Health Research, 5(4),
516–522.
Morgan, D. L. (1996). Focus groups. Annual Review of Sociology, 22, 129–152.
Morgan, D. L., Krueger, R. A., & King, J. A. (1998). Focus group kit. Thousand Oaks, CA: Sage.
Panel on a Research Agenda for the Future of Social Science Data Collection, Committee on National
Statistics, Division on Behavioral and Social Sciences and Education, National Research Council.
(2013). Nonresponse in social science surveys: A research agenda. Washington, DC: The National
Academies.
Robson, C. (1993). Real world research: A resource for social scientists and practitioner-researchers. Oxford,
UK: Blackwell.
Rogelberg, S. G., & Stanton, J. M. (2007). Introduction: Understanding and dealing with organizational
survey nonresponse. Organizational Research Methods, 10(2), 195–209.
Salkind, N. J. (2011). Exploring research. Upper Saddle River, NJ: Pearson Education.
Sapsford, R., & Jupp, V. (2006). Data collection and analysis (2nd ed.). Retrieved from http://site.ebrary.com/
id/10256950
Schutt, R. K. (2012). Investigating the social world: The process and practice of research. Thousand Oaks, CA: Sage.

Key Terms

Administrative Records and Management Information 109
Anonymity 111
Closed-Ended Questions 94
Computer-Assisted Telephone Interview (CATI) 98
Conceptualization 96
Confidentiality 111
Data Archives 109
Double-Blind Studies 107
Empirical Phenomenology 107
Errors of Nonobservation 92
Errors of Observation 92
Face-To-Face Interview 99
Focus Group Interviews 102
Inadequate Coverage of Population in the Sampling Frame 93
Indicators 97
Informed Consent Form 111
Mail Survey 99
Margin of Error 93
Measurement Error 92
Nonparticipant Observation 106
Nonresponse 93
Observation 106
Observer Effect 107
Open-Ended Questions 94
Operationalization 96
Paper and Pencil Survey 99
Participant Observation 106
Primary Data 91
Qualitative Data 91
Quantitative Data 91
Sampling Error 93
Secondary Data 91
Single-Blind Studies 107
Survey 91
Telephone Survey 98
Textual Analysis or Content Analysis 107
Web-Based Survey 98

Student Study Site


Visit the Student Study Site at www.sagepub.com/nishishiba1e for these additional learning tools:

•• Data sets to accompany the exercises in the chapter


SECTION II: Data Analysis

7  Quantitative Data Preparation and Descriptive Statistics

Learning Objectives 118
Preparing for Analysis and Using Descriptive Statistics 118
Emily’s Case 118
Jim’s Case 119
Starting Data Analysis 120
Preparing Data for Analysis 121
Levels of Measurement 122
Descriptive Statistics: Overview 126
Measures of Central Tendency 126
Mean 127
Median 128
Mode 130
Which Measure of Central Tendency to Use? 131
Measures of Variability 131
Range 133
Variance 134
Standard Deviation 136


Measures of the Shape of Distribution 137


Chapter Summary 144
Review and Discussion Questions 144
Statistics Exercise 145
1. Emily’s Data 145
2. Jim’s Data 145
Step-by-Step Instructions for Running Descriptive Statistics Using SPSS 145
Step-by-Step Instructions for Running Descriptive Statistics Using Excel 147
Key Terms 149
Figure 7.1 Example of Data Structure in SPSS 122
Figure 7.2 Relationship Between Ranking and Absolute Value Judgment 124
Figure 7.3 Summary of the Level of Measurement and Its Key Characteristics 125
Figure 7.4 Boxplot 130
Figure 7.5 Range in Emily’s Data 134
Figure 7.6 Histogram and Frequency Polygon of the
Length of Service of the Training Participants 138
Figure 7.7 Leptokurtic, Mesokurtic, and Platykurtic Distribution 139
Figure 7.8 Two Distributions With Same Mean and Different Standard Deviation 141
Figure 7.9 Negative and Positive Skewed Distribution 142
Figure 7.10 Skewness and Central Tendency 143
Figure 7.11 Descriptive Statistics Using SPSS 146
Figure 7.12 Descriptive Statistics Using SPSS - Options 146
Figure 7.13 Descriptive Statistics Using SPSS - Output 147
Figure 7.14 Descriptive Statistics Using Excel 147
Figure 7.15 Descriptive Statistics Using Excel - Inputting Data Range 148
Figure 7.16 Descriptive Statistics Using Excel - Output 148
Table 7.1 Length of Service of the Employees in the
Administrative Departments (Training Participants) 127
Table 7.2 Variation of Table 7.1. Values Ordered 129
Table 7.3 Identifying Median Length of Service for 7 Employees 129
Table 7.4 Variation of Table 7.1. Values Grouped 131
Table 7.5 Which Measure of Central Tendency Should I Use? 132
Table 7.6 Comparison in Variability 132
Table 7.7 Deviance of Employee’s Length of Service 135

Table 7.8 Calculating Variance With Squared Deviance 136


Table 7.9 Frequency Table of Length of Service of the Training Participants 137
Formula 7.1 Formula for Mean 127
Formula 7.2 Formula for Range 133
Formula 7.3 Formula for Standard Deviation 136


Learning Objectives

In this chapter you will

1. Learn what needs to be done to prepare data for analysis

2. Gain an understanding of the four levels of measurement: nominal, ordinal, interval, and ratio

3. Learn about three types of descriptive statistics: measures of central tendency, measures of variability, and measures of the shape of a distribution

4. Develop an understanding of three measures of central tendency: mean, median, and mode

5. Develop an understanding of three measures of variability: range, variance, and standard deviation

6. Learn about two measures of the shape of a distribution: kurtosis and skewness

Preparing for Analysis and Using Descriptive Statistics

Emily’s Case

“Knock knock.”—Emily looked up and saw Leo standing at the door. “Do you
have time to chat about the survey?” he asked.
Emily invited Leo in. She was certain he had a lot to talk about. The post-
training workshop, where they gathered all the study participants together
following the diversity training, had gone off smoothly two weeks ago. Leo now
had both pretraining and posttraining surveys from the study participants for
the training evaluation, including responses from employees who took the
training and those who did not. Mei Lin had done a good job coordinating the four train-
ing sessions, and Emily felt confident they would get good results, at least as far as satis-
faction from those who participated. Employees who did not attend the training responded
well, too, she thought, considering they were asked to take two identical surveys, four
months apart, with no other “cultural competency” activities in between. Altogether, about
90% of those who came to the pretraining workshop also attended the posttraining work-
shop. Leo was working hard to enter the second round of survey responses into an SPSS
database.
“I think the data set is ready to go,” Leo said as he approached Emily’s desk. He handed
her a thin binder. “This is a clean version of the codebook, so you can see the variables we
have, and you’ll be able to interpret all the abbreviations on the printouts.” He paused while
Emily browsed the pages in the codebook, then he continued, “I double-checked all the data
for accuracy. Now I need to clarify what you want to do for the analysis.”
Emily had been so focused on collecting data, she had not given much attention to the
analysis part of the study. She stood up and moved to the whiteboard in her office as she
collected her thoughts. “Let’s think about what we need to find out,” she said. On the board,
she wrote: Things to find out.
Leo said, “First, I think we need to look at the background of all the study participants,
things like gender, age, race, department, years of service, those kinds of things.”
On the board, Emily wrote: Characteristics/background of participants. “OK. What
else?”
“We also need to identify the level of cultural competence and workplace conflict
reported in the survey, right?” Leo offered.
“Right, because we are interested in the difference in cultural competence and workplace
conflict between those who attended the training versus those who did not,” Emily replied.
She was not immediately sure how to write that on the board. She glanced at the codebook
on her desk and then at the clock on the wall. It was a quarter to five. She thought, “Looks
like I will be staying late today.”

Jim’s Case

“B-r-r-r-r”—the phone rang and broke the silence in the office. Jim
was staring at the computer screen like a statue. He responded
slowly and picked up the phone.
“Your three o’clock is here,” the receptionist told him in a flat
voice.
“Oh, send her in.”
One minute later, a tall young woman in a navy business suit
appeared in the hall outside Jim’s office, evidently uncertain where
she was going.
Jim stood up to attract her attention. “Lavita, right? Come on in. Take a seat.”
A few days ago, when Jim was consulting with Ty about his research projects over a
glass of beer, Ty told him he had a graduate student who might be a perfect fit to help him.
“She is assigned to me as a research assistant, but I don’t have much going on right now,”
Ty explained. “She is great with statistical analysis, and she told me she wants an oppor-
tunity to do real-life research.”
“Hey, I’m real life,” Jim laughed.

Ty clearly had a joke on the tip of his tongue at Jim’s expense, but he restrained himself.
“Maybe she can help with your analysis.”
After brief introductions, Jim described his two research projects. Lavita listened intently
and took notes. Once Jim finished, Lavita said, “OK. So where do you want me to start? It
sounds like the response-time data might be the easiest.”
Jim was delighted that Lavita was willing to jump right in. After several exchanges with
Kathy, the operations manager who got the response-time data for him, he was still not
happy with it. She had sent different spreadsheets with response-time data, each arranged
in a slightly different way, some by year, some by month, and some by station. The
sheets were filled with rows and rows of numbers, and Jim had a hard time making sense
of them.
“I hope it’s easy,” Jim said with a weak smile. He reached out and angled the computer
screen so Lavita could see it. “Actually, I was just looking at one of the response-time
spreadsheets.”
Lavita glanced at the computer screen only briefly. “Would you mind sharing the file
with me? I could run some basic descriptive statistics, so I can get a sense of the data.” She
fumbled in her pocket and pulled out a thumb drive.
Jim was impressed again. He thought a moment about her request. Naturally she should
have the data, so she could work on her own time. It was all public information, nothing
confidential in it. “Sure,” he said.
Lavita didn’t stop there. “While you have the thumb drive plugged in, is there anything
you can give me on the alternative service delivery study? I might as well get oriented on
what you have there as well.”
Jim had been thinking the same thing. “Good idea,” he said. He liked her initiative. Ty
knew what he was talking about. Lavita was just what Jim needed to make sense of his
data. “I appreciate whatever you can do to help me get the analysis done.”

Starting Data Analysis


After the data collection phase in the research flow (Step 5), you can start analyzing the
data (Step 6). (See Figure 2.1: Research Flow and Components in Chapter 2.) At this
stage of your research, you should have a clear idea of what you intend to find out as
a result of your analysis based on your research objective, research questions, research
design, sampling, and what you collected as data. In this chapter, and the following
chapters, we will focus on quantitative data, using the case examples. Data analysis
approaches are very different for quantitative and qualitative data. We will discuss
qualitative data analysis separately in Chapter 14.
As we discussed in Chapter 3, the type of research objective you have is closely
connected with the type of analysis you perform. If you have quantitative data, and
your research objective is to explore and describe the phenomenon, then you will focus
on using descriptive statistics. Alternately, if your research objective is to confirm or test
a hypothesized relationship, then you will focus on using inferential statistics. This
chapter is devoted to understanding descriptive statistics. We will discuss inferential
statistics in Chapter 8.

Preparing Data for Analysis


Before delving into data analysis, the researcher needs to prepare the collected data. In
Emily’s case, we saw that Leo, her intern, spent some time transferring all the survey
data into a computer statistics database and double-checked it for accuracy. With a
web-based survey, the responses are typically captured in some kind of database auto-
matically. In a telephone or face-to-face survey, it is possible for interviewers to enter
responses directly into a database interface while collecting the data. Various computer
programs are available for data management and analysis. In this book, we will intro-
duce statistical analysis using SPSS and Excel. Other popular statistical analysis
programs among public and nonprofit managers include SAS, Stata, and R.
When data are recorded in a data management and analysis program, the
researcher needs to know ahead of time what data structure is required for the anal-
ysis. Most programs define rows as individual cases from which the data are col-
lected, while the columns indicate variables that are numerical values representing
items of information obtained from the cases. For example, in Emily’s case, the
survey data is collected from each individual employee, so the rows in the database
Leo constructed represent the individuals who responded to the survey. The col-
umns represent the questions asked in the survey. Figure 7.1 shows an example of the
data structure in SPSS.
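Outside of SPSS, the same rows-as-cases, columns-as-variables structure can be sketched in a few lines of Python. The variable names below are invented for illustration, not taken from Emily's actual survey:

```python
# Each row (dict) is one case: a survey respondent.
# Each key is a variable (column); these names are hypothetical.
respondents = [
    {"id": 1, "gender": 2, "dept": 1, "years_service": 20},
    {"id": 2, "gender": 1, "dept": 1, "years_service": 3},
    {"id": 3, "gender": 2, "dept": 4, "years_service": 12},
]

# Pulling one column (variable) out across all rows (cases):
years = [row["years_service"] for row in respondents]
print(years)  # [20, 3, 12]
```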
As part of the data preparation, the researcher should develop a codebook
(Trochim & Donnelly, 2007). The codebook describes each data element. The follow-
ing information is typically included in the codebook:

•• Variable name
•• Description of the variable (e.g. which survey question corresponds to the variable)
•• Format of the variable (e.g. number or text)
•• Information on the instrument (e.g. web-based survey, paper survey)
•• Date data were collected
•• Date data were entered
•• Any changes made to the original variable entered in the database
•• Notes

Keeping a good codebook is especially important when the researcher conducts


any transformations or additions to the data to make the data more useful or usable.
Transformation may involve recoding within a variable to change the numbers that
represent different responses. Also, certain values can be selected for the analysis and
others excluded. If the researcher includes response categories for don’t know or not
applicable, it may be useful to specify that those responses will not be included in the
analysis. Researchers can also create new variables by calculating a new score based
on the existing variables. In Emily’s case, this might mean creating a variable called
cultural competence by averaging the responses to multiple items in the survey
related to cultural competence. It is important to record all data additions and trans-
formations in the codebook to document how the new variables were created and for
what purpose.
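As a hypothetical sketch (the item names and values are invented, not from Emily's survey), a derived variable such as cultural competence can be computed by averaging the component items; the codebook entry should record exactly this calculation:

```python
# One respondent's answers to three cultural-competence items
# (hypothetical names and values on a 1-5 scale).
respondent = {"cc_item1": 4, "cc_item2": 5, "cc_item3": 3}

items = ["cc_item1", "cc_item2", "cc_item3"]
# New variable = mean of the component items; document this in the codebook.
respondent["cultural_competence"] = sum(respondent[i] for i in items) / len(items)
print(respondent["cultural_competence"])  # 4.0
```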

Figure 7.1    Example of Data Structure in SPSS

Data cleaning is another task that is required as part of data preparation before
starting data analysis. Data cleaning is a validation process researchers take to check
the data for errors and screen for accuracy. One way to check for errors is to run a
frequency report of data values and look for out-of-range values. In Emily’s case, if she
finds anyone having an age of 3, that person is definitely out of the age range for a city
employee. Another means of screening for errors is to check for missing values. Some
errors may be impossible to detect without double-checking the data set against the
source. When the data are entered manually, it is a good idea to enter the original data
twice, preferably by two different people, and compare the two data sets to look for
discrepancies. This is called double entry (Trochim & Donnelly, 2007). Data cleaning
can be a laborious procedure, but it is an expected part of the research process.
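The two screening steps described above, a frequency report and a check for out-of-range or missing values, can be sketched as follows. The data and the plausible age range are assumptions for illustration only:

```python
from collections import Counter

ages = [34, 51, 3, 28, 46, 51, None, 62]  # hypothetical raw entries; None = missing

# Frequency report: out-of-range or oddly repeated values stand out.
print(Counter(ages))

# Flag values outside a plausible range for a city employee (assumed 16-80)
# and missing values, for follow-up against the source documents.
out_of_range = [a for a in ages if a is not None and not 16 <= a <= 80]
missing = [i for i, a in enumerate(ages) if a is None]
print(out_of_range)  # [3]
print(missing)       # [6]
```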

Levels of Measurement

In conducting a statistical analysis using quantitative data, it is important to have a


clear understanding of how the concepts being analyzed are measured and captured as
variables. Variables represent the information about the cases in a study. Different
values assigned to a variable represent attributes of each case. For example, information
on gender can be represented by a 1 or 2 to indicate male or female. The level of
measurement refers to how the values are assigned to the attributes. The level
of measurement of the variable determines the type of statistical analysis that can be
applied. For this purpose, it is important to understand four commonly adopted levels
of measurement (Stevens, 1951):

•• Nominal or categorical
•• Ordinal
•• Interval
•• Ratio

The last two levels, interval and ratio, are frequently collapsed into a single level
and referred to as continuous. We describe these different levels of measurement
below.
In the nominal or categorical level of measurement, numbers are assigned arbi-
trarily for different attributes. The numerical values assigned to the attributes do
not imply any ordering that allows mathematical interpretation. For example,
assigning a 1 or 2 for male or female does not mean one is larger than the other.
When there are only two options for the attributes, then it is described as dichoto-
mous. Variables can also have more than two categories. In Emily’s case, the data set
of employee responses to the survey identifies each employee’s department affiliation in
six department types: 1 = Administration, 2 = Public Safety, 3 = Culture and
Recreation, 4 = Roads and Transit, 5 = Economic Development, and 6 = Field and
Fleet. Again, the numbers assigned to each department type are arbitrary. They do
not indicate priority or magnitude. Categorical variables like this are also referred
to as grouping variables.
In the nominal or categorical level of measurement, a variable’s attributes should
be mutually exclusive and exhaustive. In other words, every case can have only one
attribute. A respondent has to be either male or female, and cannot be both. In the
example of departments, each employee needs to belong to only one department type.
If some individuals are affiliated with multiple attributes, the researcher will need to
decide if the categories need to be redefined (for example, any employee who works
over half the time in one department) or if that situation needs to be conceptually
distinct and assigned a separate defined value.
In the ordinal level of measurement, the numbers assigned to an attribute represent
a ranking. In Emily’s case, for example, one survey question asks, “Please rank the
following activities in terms of their usefulness in improving your work unit’s overall level
of understanding on diversity and inclusiveness,” providing three specific activities:
diversity training, diversity award event, and newsletter. Responses to this question
will be ordinal, because the numbers represent successive levels of usefulness. Note
that an ordinal ranking does not represent actual values that can be added together.
Only the relative value is known. Furthermore, the distance between the rankings may
not be equal.
Following the example in Emily’s case, let’s say three people provided the same
ranking for the usefulness of three activities for understanding diversity; all three
ranked Diversity training first, Diversity award event second, and Newsletter third. In
this case, diversity training could be recognized as the most valuable activity for under-
standing diversity among the presented alternatives, but it remains unknown just how
useful it is for each of the respondents. Figure 7.2 illustrates how the same ranking for
each person might represent a different value judgment.

Figure 7.2   Relationship Between Ranking and Absolute Value Judgment

[Figure: three respondents all give the same ranking (1. Diversity training, 2. Diversity award event, 3. Newsletter), but each places the activities at different points on an absolute scale of usefulness running from low to high.]

The interval level of measurement and ratio level of measurement are similar. The
difference between them is unimportant in most analyses in social science, and they
are frequently combined and referred to as a continuous level of measurement. In both,
the value indicates order and exact distance between the values. In familiar terms, this
means the value represents a measure on some defined scale, representing quantity or
extent. Temperature, weight, length, and volume are typical examples.
The interval and ratio levels of measurement are different in that interval mea-
sures do not have a true zero, while ratio measures do. By this definition, weight or
length are ratio measures, because there is an absolute starting point at zero. On the
other hand, conventional scales for temperature are interval measures, because there is
no starting point where zero temperature exists. The zero point in the Celsius system
starts where water freezes, while the Fahrenheit system starts at a much colder level.
Both scales count degrees of temperature below as well as above their established zero
points. Consequently, it is invalid to make a ratio statement for temperature, such as 2
degrees is twice as warm as 1 degree, because the starting point at zero, in reality, also
has temperature. The numbers are additive, but do not represent absolute values of
how much temperature exists.
Confusion sometimes occurs when distinguishing a continuous from a categorical
variable in relation to counts, as opposed to a measurement on a scale. It is possible for
a count of some attribute related to individuals to be considered as a continuous vari-
able, as for example, in response to the question, “How many diversity trainings have
you attended?” The responses would represent a ratio level of measurement, with a
starting point at zero and additive units with equal distance between them, so attending
two trainings is twice as many as one training. However, the question, “Did you
attend the diversity training?” would be a dichotomous variable: yes or no. If individuals
are the cases, it would be wrong to count the number of those individuals who
attended the training (or Training #1 versus Training #2) as a continuous variable. The
level of measurement refers to the training, not the people. A count of people who
attended a training could be interpreted as a continuous level of measurement only if
the data set represents a number of trainings as the cases, and “How many people
attended?” is one of the variables.

Figure 7.3  Summary of the Level of Measurement and Its Key Characteristics

•• Nominal (categorical/dichotomous): values can only be classified into categories (e.g. gender, race, departments)
•• Ordinal: values are measured based on their ranking (e.g. ranking of activities based on their usefulness)
•• Interval: distances between values are meaningful (e.g. temperature in Fahrenheit or Celsius)
•• Ratio: distances between values are meaningful and there is an absolute zero (e.g. number of diversity trainings attended, length of service measured in years)
A frequently asked question is whether survey questions using scaled
measures such as a Likert scale (described in Chapter 6) should be considered
an ordinal or an interval level of measurement. Although some scholars argue that the
Likert scale represents an ordinal measure (Jamieson, 2004; Stevens, 1946), consider-
able research has been conducted that demonstrates the validity of accepting the

Likert scale as an interval measure, or continuous variable, with additive properties.


When the measures assessing an individual’s feelings about a certain topic are care-
fully constructed, the psychological distance between the response options can be
presumed to be equal (Carifio & Perla, 2008). When the distance between the
response categories is uncertain and probably unequal, the variables should be con-
sidered as ordinal measures.

Descriptive Statistics: Overview

Once a data set is complete, accuracy is verified, and the variables are prepared for the
intended analysis, the researcher is ready to begin statistical analysis. Statistics refers
to the study and set of tools and techniques used to quantitatively describe, organize,
analyze, interpret, and present data (Ha & Ha, 2012). There are two types of statistics:
descriptive statistics and inferential statistics. Descriptive statistics are used to organ-
ize and describe the characteristics of the data. In some cases, this is the whole purpose
of the research. In other cases, the researcher intends to use inferential statistics to
confirm or test hypotheses about the population of interest. In all cases, the data
analysis should begin with descriptive statistics to better understand the characteristics
of the sample population, and detect patterns and unexpected incongruities.
The unsummarized and nontabulated form of data before it is analyzed is called
raw data. In Jim’s case, we saw him grappling with spreadsheets full of numbers,
which he had difficulty comprehending. By organizing the data with descriptive sta-
tistics, a researcher can summarize the information in the raw data into a few num-
bers that reveal important characteristics of the data set. Three types of measures are
commonly used:

•• Measures of central tendency


•• Measures of variability
•• Measures of the shape of a distribution

Each of these measures can be represented in several ways that provide differ-
ent insights into the underlying data. In the sections below, we discuss the measures
in detail.

Measures of Central Tendency

A measure of central tendency is a descriptive statistic that indicates the middle or


central position of a value for a variable in a data set (Moore, 2001). For example, in
Emily’s case, she can calculate a middle point to describe the middle or central point
of the training participants’ age, length of service, perception of the level of conflict in
their unit, and so on. There are three measures of central tendency—mean, median,
and mode. Each one of these measures conceptualizes a different kind of central point
in the data.

Mean
The mean is the arithmetic average. This is the most common and familiar measure of
central tendency. To compute the mean, you take all the values for each observation for
a particular characteristic in the group, add them up, and then divide the sum by the
number of observations. For example, let’s say Emily wants to know the mean length
of service for the 8 employees in Administration. The length of service reported by
each individual, and the sum for all of them, are shown in Table 7.1.
To calculate the mean, add up all the numbers (72), then divide by the total num-
ber of employees in this group (8); this produces a mean or average of 9 years (72/8 = 9).
The formula for calculating the mean is shown below.

Formula 7.1 Formula for Mean

X̄ = ∑X / n

Where

•• X̄ (the letter X with a line above, read “X bar”) is the mean value
•• X is each individual value in the group of scores
•• ∑ (the Greek letter sigma) indicates summing all the values that follow it
•• n is the size of the sample from which you are computing the mean
Table 7.1  Length of Service of the Employees in the Administrative Departments (Training Participants)

Employee number     Length of service (years)
1                   20
2                    3
3                   12
4                    4
5                    2
6                   15
7                    8
8                    8
Total               72

When you have a nominal or categorical measure, the mean is not an appropriate
measure of central tendency to use for your descriptive statistics (Babbie, Halley,
Wagner, & Zaino, 2012). In those cases, as described above, the numbers are arbitrary
identifiers for the categories, and they do not have a mathematical relationship. Therefore, the
arithmetic mean of these numbers does not mean anything. It should also be noted
that one of the properties of the mean as a central tendency is that it is sensitive to
extreme scores (called outliers). One extremely high score can make the mean much
larger, or one extremely low score can make the mean score much smaller. For example,
we saw the mean length of service for the training participants in Administration was
9 years. Let’s consider what happens to the mean length of service when the first
employee in the list worked 45 years instead of 20 years—much longer than the
others. The recalculated mean for the group would be 12.1 years. If that one person is
excluded from the list, the mean for the group would be 7.4 years. Just having one
person with an extremely long length of service produces a much different mean
value. When the data include a few extreme scores, the mean is less useful as a
measure of central tendency.
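Although the book's examples run in SPSS and Excel, the outlier sensitivity described above is easy to verify with Python's standard `statistics` module, using the Table 7.1 values:

```python
from statistics import mean

service = [20, 3, 12, 4, 2, 15, 8, 8]            # Table 7.1 values
print(f"mean = {mean(service):.1f}")              # mean = 9.0

with_outlier = [45] + service[1:]                 # first employee at 45 years
print(f"with outlier = {mean(with_outlier):.1f}")         # with outlier = 12.1
print(f"outlier dropped = {mean(with_outlier[1:]):.1f}")  # outlier dropped = 7.4
```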

Median
The median is the value found at the exact middle of the range of values for a variable,
when the values are listed in numerical order. Half of the values in the range are above
the median and half are below the median. Unlike the mean, the median does not add
specific values together but is sensitive to their rank order. Thus, the median may be
applied to ordinal as well as continuous variables.
Turning again to the example of length of service for employees in Administration
in Emily’s data set, this time we see the central point, represented as the median, is 8
years. We find this value by listing the employees so the values for length of service are
in numerical order. Then we calculate the exact midpoint in the list by taking the total
number of employees (8), adding 1 (9), and dividing by 2 (4.5), then counting that
resulting number of lines from the top or bottom of the list. In this way, you arrive at
the line where half of the values will be below and half above that point.
Table 7.2 shows the employees with their original identification numbers and a
number for their position in the list. Note that position 4.5 falls between positions 4
and 5, which both show 8 years of service; the median is therefore 8 years. When the
position of the median is a whole number, the value on that line is the median. When
the position lies between two lines, as happens whenever the number of data points is
even, the values in the two middle positions are averaged. If the two middle values
were different, their average would be the median.
Table 7.3 shows the list of employees after removing the employee with 20 years
of service. Again, we calculate the midpoint in the list by taking the total number of
employees (7), adding 1 (8), and dividing by 2 (4), then counting that resulting number
of lines from the top or bottom of the list. We see the median is still 8 years. In this
example we see how removing an extreme value is likely to have little effect on the
median.
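The same check with `statistics.median` confirms how little the median moves when the extreme value is removed or inflated:

```python
from statistics import median

service = [20, 3, 12, 4, 2, 15, 8, 8]        # Table 7.1 values
print(median(service))                        # 8.0 (average of the two middle 8s)

print(median([3, 12, 4, 2, 15, 8, 8]))        # 8   (20-year employee removed)
print(median([45, 3, 12, 4, 2, 15, 8, 8]))    # 8.0 (20 inflated to 45)
```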

Table 7.2  Variation of Table 7.1. Values Ordered

Employee number (position in order)    Length of service (years)


5(1st)  2
2(2nd)  3
4(3rd)  4
7(4th)  8
8(5th)  8
3(6th) 12
6(7th) 15
1(8th) 20

Table 7.3  Identifying Median Length of Service for 7 Employees

Employee number (position in order)    Length of service (years)


5(1st)  2
2(2nd)  3
4(3rd)  4
7(4th)  8
8(5th)  8
3(6th) 12
6(7th) 15

A related concept to the median, though not a measure of central tendency, is the
use of percentile points to examine the spread of the data. The median represents the
50th percentile point. The values at other points can be identified to show what per-
centage of the data are less than or equal to that particular value (Holcomb, 1998). For
example, if a value is described as being at the 25th percentile point, that means 25% of
the data are less than or equal to it.
Boxplots provide a way to visually display this information on the spread of the
data in percentiles. Figure 7.4 shows the boxplot of the employees’ length of service
from the earlier example. The horizontal line inside the box shows the median value.

Figure 7.4   Boxplot

[Figure: boxplot of length of service (values from 2.00 to 20.00). The box spans the 25th to 75th percentile points (the interquartile range), the horizontal line inside the box marks the median (50th percentile), and the whiskers extend to the lowest and highest non-outlier values.]

The top of the box indicates the 75th percentile point (third quartile), and the bottom
of the box indicates the 25th percentile point (first quartile). The total height of the
box, indicating the difference between the 75th percentile point and 25th percentile
point, is called the interquartile range. The box covers the middle 50% of the data. The
stems protruding from the box, sometimes called whiskers, represent the highest and
lowest values in the range of values for that variable. Outliers are not included. (In
SPSS, when observations lie more than 1.5 times the interquartile range above the
75th percentile point or below the 25th percentile point, they are identified as outliers
and appear as a dot outside the stem.)
Percentile points offer a good way to illustrate the spread of the data around the
median. Later, we will discuss other ways to measure the spread of the data as a mea-
sure of variability.
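Quartile points like those in the boxplot can be computed with `statistics.quantiles`. Note that the exact cut points depend on the interpolation method chosen (here, "inclusive"), and SPSS's defaults may produce slightly different values:

```python
from statistics import median, quantiles

service = sorted([20, 3, 12, 4, 2, 15, 8, 8])   # Table 7.1 values, ordered

# n=4 returns the 25th, 50th, and 75th percentile points.
q1, q2, q3 = quantiles(service, n=4, method="inclusive")
print(q1, q2, q3)                       # 3.75 8.0 12.75
print("interquartile range:", q3 - q1)  # interquartile range: 9.0
assert q2 == median(service)            # the 50th percentile is the median
```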

Mode
The mode is the value that occurs most frequently in the data set. This is the least
precise measure of central tendency, but it does reveal a characteristic of the values for
a particular variable that neither mean nor median capture. The easiest way to deter-
mine the mode, like the median, starts with a list of values in numerical order.
Ordering the values produces natural groups wherever the same value is repeated,
which makes it possible to locate the largest group. When we return to Emily’s data on
length of service, in Table 7.4, the list of numbers shows the mode is 8 years, represent-
ing the most commonly occurring number in the set.

Table 7.4  Variation of Table 7.1. Values Grouped

Employee number    Length of service (years)


5  2
2  3
4  4
7  8
8  8
3 12
6 15
1 20
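In Python, `statistics.mode` finds the most frequent value directly, and, unlike the mean and median, it is also meaningful for categorical codes (the department example below is a hypothetical sketch):

```python
from statistics import mode, multimode

service = [2, 3, 4, 8, 8, 12, 15, 20]   # Table 7.4 values
print(mode(service))                     # 8

# Mode works for categorical codes, where mean and median do not:
departments = [1, 2, 1, 3, 1, 2]         # 1 = Administration, 2 = Public Safety, ...
print(mode(departments))                 # 1

# multimode returns every value tied for most frequent.
print(multimode([2, 2, 8, 8]))           # [2, 8]
```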

Which Measure of Central Tendency to Use?


Each measure of central tendency says something different about the data for a par-
ticular variable. One measure may be better than another, depending on the type of
data and how the values are distributed (Berman & Wang, 2012; Moore, 2001). The
mean is the most popular measure of central tendency, though as noted above, it is
sensitive to extreme values (outliers) in the data set. Therefore, data with extreme values
can be better represented by the median, which is less affected by them. Mode,
on the other hand, is not a precise measure of central tendency in comparison to the
mean and the median. Mode is, however, particularly useful for summarizing data
based on categories, such as gender, department affiliation, race, or geographic area.
For categorical data, mode is the only one of the measures of central tendency that
makes sense. Table 7.5 summarizes when it is most appropriate to use each measure of
central tendency.

Measures of Variability

Measures of variability (also called dispersion or spread) represent how much the
values in the data differ from each other. The measures of central tendency do not
capture this information. We illustrate this concept in Table 7.6-A, B, and C. The first
table shows the same data used earlier for the employees’ length of service. The two
subsequent tables modify the values, so the mean is the same, but the variation in the
data is different. The mean length of service in all three tables is 9 years, but the differences
between each employee are much smaller in the second table, and in the third table,
the length of service is the same for all the employees.
132  ❖  SECTION II  DATA ANALYSIS

Table 7.5  Which Measure of Central Tendency Should I Use?

Measure of Central
Tendency Proper Usage
Mean When you have data that do not have extreme values. This measure is
meaningless with categorical variables.
Median When you have extreme scores (outliers) and you do not want to distort the
average.
Mode When data are categorical and values can fit into only one class.

There are three measures of variability: range, variance, and standard deviation.
We will explain each one of these measures in the following section.

A: Original Data

Table 7.6a  Comparison in Variability

Employee number    Length of service
5                   2
2                   3
4                   4
7                   8
8                   8
3                  12
6                  15
1                  20
Total              72
Mean                9

B: Hypothetical Data With Less Variability

Table 7.6b  Comparison in Variability

Employee number    Length of service
5                   7
2                   7
4                   8
7                   9
8                   9
3                  10
6                  11
1                  11
Total              72
Mean                9

C: Hypothetical Data With No Variability

Table 7.6c  Comparison in Variability

Employee number    Length of service


5  9
2  9
4  9
7  9
8  9
3  9
6  9
1  9
Total 72
Mean  9

Range
The range is simply the difference between the highest value and the lowest value in
the data set. The range provides a general indicator of how widely the data are
distributed. You can calculate the range by taking the highest value of the observations
and subtracting the lowest value. The formula for the range is

Formula 7.2 Formula for Range

r = h - l

Figure 7.5   Range in Emily's Data

(The figure charts Emily's eight length-of-service values, with years on the vertical
axis from 0 to 25; the range is marked as the distance between the lowest and highest
values.)

Where

r is the range
h is the highest value of the observation in the data set
l is the lowest value of the observation in the data set

Figure 7.5 charts Emily’s data for the employees’ length of service that we used
earlier. The range is 20 − 2 = 18 years.
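Formula 7.2 translates directly into a single line of code. A quick sketch, again using Emily's length-of-service values:

```python
service = [2, 3, 4, 8, 8, 12, 15, 20]

# r = h - l: highest observation minus lowest observation
r = max(service) - min(service)
print(r)  # 18
```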

Variance
Variance provides an idea of the differences between values for a particular variable by
subtracting the mean from each value. Each difference from the mean value is called
deviance. Variance provides a measure of the average deviance for the set of values, but
what is called variance in statistics actually applies to the squared deviance. We illus-
trate the reason for this by looking again at Emily’s data on the employees’ length of
service in Table 7.7. Computing a simple average of the deviance does not work,
because some of the values fall above and some below the mean, which results in both
positive and negative numbers. Adding all the differences from the mean together will
always equal zero. See what happens when you sum all the measures of deviance in
Emily’s data (mean = 9).

Table 7.7  Deviance of Employee's Length of Service

Employee number    Length of service (X)    Deviance (X − 9)


5  2 −7
2  3 −6
4  4 −5
7  8 −1
8  8 −1
3 12  3
6 15  6
1 20 11

Sum of Deviance = (-7) + (-6) + (-5) + (-1) + (-1) + 3 + 6 + 11 = 0

What we really want to know with deviance is the absolute distance of each value
from the mean, whether the value is above or below it. With this in mind, statisticians
came up with the idea to square the deviance for each data point to eliminate the neg-
ative values. Once we square the deviance, we can then calculate a meaningful average.
Table 7.8 shows the squared deviance in Emily’s data.
The sum of the squared deviance, or sum of squares, for the data in Table 7.8 equals
278. To calculate the average requires another step with a slight difference from the
normal procedure. In the normal calculation of an average, the total score is divided
by the number of observations (n). In Emily’s data, we have 8 observations, so we
would expect to divide the sum by 8. In calculating variance, however, we subtract 1
from the count of observations, which produces a slightly larger number. This is done
to improve the estimate of variance for the population our sample of data is supposed
to represent. Statisticians assume that the variance in the overall population is likely to
be greater than what is represented in the sample. Subtracting 1 from the count of
values (n–1) generates a small correction. In this case, therefore:

Variance = Average of the squared deviances

         = Sum of squared deviance / (n − 1)
         = 278 / (8 − 1)
         = 39.71

Notice that the measure of variance (39.7) is larger than any of the values in
Emily's data. Using the squared deviance to compute an average for variance results
in a number that is no longer in the same unit as the original data and is difficult to
interpret (Remler & Van Ryzin, 2011). The sum of squares and variance are useful in

Table 7.8  Calculating Variance With Squared Deviance

Employee number    Length of service (X)    Deviance (X − 9)    Squared deviance
5  2 -7 49
2  3 -6 36
4  4 -5 25
7  8 -1  1
8  8 -1  1
3 12  3  9
6 15  6 36
1 20 11 121

the mathematics for certain statistical analyses (e.g. analysis of variance [ANOVA],
regression), but variance does not give us a very good idea about the spread of the data
in descriptive statistics (Rumsey, 2009).
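The variance calculation above can be reproduced step by step. This sketch (in Python rather than the book's SPSS/Excel) computes the deviances, the sum of squares, and the variance for Emily's data, and checks the result against the standard library's built-in sample variance:

```python
import statistics

service = [2, 3, 4, 8, 8, 12, 15, 20]
mean = sum(service) / len(service)               # 9.0

deviances = [x - mean for x in service]          # these sum to zero
sum_of_squares = sum(d ** 2 for d in deviances)  # 278.0

# Divide by n - 1 (not n) to estimate the variance of the population
variance = sum_of_squares / (len(service) - 1)
print(round(variance, 2))                        # 39.71

# statistics.variance() applies the same n - 1 formula
print(round(statistics.variance(service), 2))    # 39.71
```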

Standard Deviation
The standard deviation is the square root of the variance. This measure represents the
average deviance of the values from the mean in the same unit as the original data. The
standard deviation is easier to understand than variance, and it is an essential measure
to report in many instances to comprehend the spread of the data.
In Emily’s data on the employees’ length of service, the standard deviation is the
square root of the variance (39.7), which equals 6.3. Now we can interpret the variability
of Emily’s data by recognizing that, on average, an employee's length of service deviates
from the mean by 6.3 years. Formula 7.3 shows how the standard deviation is calculated.

Formula 7.3 Formula for Standard Deviation

s = √( Σ(xi − X̄)² / (n − 1) )

Where

s is the standard deviation


Σ is the summation sign to sum everything that follows
xi is each individual observed score


X̄ is the mean of all scores
n is the sample size
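Formula 7.3 is simply the square root of the variance computed in the previous section. A short sketch, again on Emily's data:

```python
import math
import statistics

service = [2, 3, 4, 8, 8, 12, 15, 20]

mean = sum(service) / len(service)
variance = sum((x - mean) ** 2 for x in service) / (len(service) - 1)

sd = math.sqrt(variance)                    # Formula 7.3
print(round(sd, 1))                         # 6.3

# statistics.stdev() performs the same calculation
print(round(statistics.stdev(service), 1))  # 6.3
```

Because the standard deviation is back in the original unit (years), 6.3 is directly interpretable, unlike the variance of 39.7.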

Measures of the Shape of a Distribution

Among the measures of central tendency, we introduced the mode as the value in a
range of data that occurs most frequently. When we measure the distribution of the
data overall, we apply this idea to chart the relative frequencies of all the data. To
accomplish this, we first make categories of numbers to smooth out small variations.
Then we can plot how many values fit in the different categories. The plot is known as
a histogram, representing a frequency distribution.
To chart a frequency distribution, we first create a frequency table, with categories
of numbers and a corresponding count of how many values fit in each category. Table 7.9
shows a frequency table for Emily’s length of service data set, this time for all 80
employees who participated in the training. It is easier to chart a meaningful frequency
distribution with a larger set of data.
As you can see in Table 7.9, the frequency table groups length of service in four-
year categories, then tallies the number of people in each group. From these data, we
can create a histogram for a visual representation of a frequency distribution. The
grouped values for length of service go on the horizontal axis. The count for each

Table 7.9  Frequency Table of Length of Service of the Training Participants

Length of service Frequency


~4  3
5~8  4
9~12  8
13~16 15
17~20 16
21~24 15
25~28  9
29~32  5
33~36  3
37~40  2

group of values goes on the vertical axis. This will create a series of bars that show how
many times each range of values occurred in the data set. Figure 7.6 shows the result.
The line drawn around the shape of the histogram is called a frequency polygon.
Drawing a frequency polygon over a histogram is a common approach to highlight the
shape of the distribution. Both the histogram and the frequency polygon provide a
visual representation of the data set.
This graphic illustration of the data allows the researcher to observe how the data are
distributed. Examining the shape of the distribution is important, because many statisti-
cal analyses are based on the assumption that the data are normally distributed. (We will
discuss normal distribution further in Chapter 8.) When the shape of the distribution is
too flat, too pointed, lopsided with data bunched up on one side (skewed), or bimodal, then
the data will not be suitable for certain types of statistical tests. Understanding the shape
of the distribution of your data is important to prepare for your analysis.
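A frequency table like Table 7.9 can be built by assigning each value to a bin and counting. This sketch groups Emily's eight-value sample into the same four-year categories (the full 80-person data set behind Table 7.9 is not reproduced in the text, so the counts here differ from the table):

```python
from collections import Counter

service = [2, 3, 4, 8, 8, 12, 15, 20]

# Map each value to a four-year category: 1~4, 5~8, 9~12, ...
def category(years):
    low = ((years - 1) // 4) * 4 + 1
    return f"{low}~{low + 3}"

freq = Counter(category(v) for v in service)
print(dict(freq))
# {'1~4': 3, '5~8': 2, '9~12': 1, '13~16': 1, '17~20': 1}
```

Plotting these counts as bars, with the categories on the horizontal axis, produces exactly the histogram described above.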
In the remainder of this section, we will describe different recognized shapes of
data distribution. The shape of the distribution in Figure 7.6 is very close to a normal
curve, with the majority of the data in middle ranges and sloping down from the

Figure 7.6  Histogram and Frequency Polygon of the Length of Service of the
Training Participants

(The figure charts the frequencies from Table 7.9 as a histogram, with length of
service on the horizontal axis and frequency, from 0 to 18, on the vertical axis;
a frequency polygon is drawn over the bars.)

middle on each side. Different terms are applied to frequency polygons with slopes
that differ from a normal distribution.
Kurtosis is a measure of the shape of the distribution that indicates its degree of
pointedness or flatness. A kurtosis value of zero indicates a normal or mesokurtic
distribution. Positive values of kurtosis indicate a pointed or leptokurtic distribution.
Negative values of kurtosis indicate a flat or platykurtic distribution. Kline (2011)
suggests that a kurtosis value larger than 10 in absolute terms (i.e., higher than 10 or
lower than −10) indicates that the distribution is not normal. Figure 7.7 provides an
illustration of the shape of the data frequencies for leptokurtic, mesokurtic, and
platykurtic distributions.
The kurtosis of your curve is related to the size of the standard deviation, which, as
described earlier, represents the average deviance of all the data values for a particular
variable (Mann, 1995). When the standard deviation is large relative to the mean, the
distribution tends to be flatter. When the standard deviation is small, the values of the
observations cluster around the mean, and the frequency distribution illustrated in a
histogram will be pointier. In Figure 7.8, you can see two distributions, both with the
same mean value of 50 but different standard deviations; one has a standard deviation
of 25, the other has a standard deviation of 15.
Note that the distribution with the larger standard deviation is flatter.
Although the size of the standard deviation gives you an idea of kurtosis in your
distribution, the actual computation of kurtosis uses the standard deviation in a complex

Figure 7.7a    Leptokurtic, Mesokurtic, and Platykurtic Distribution

(Leptokurtic: a tall, sharply peaked curve, plotted over standardized values from −4 to 4.)

Figure 7.7b    Leptokurtic, Mesokurtic, and Platykurtic Distribution

(Mesokurtic, i.e., a normal distribution: a bell-shaped curve plotted over standardized
values from −4 to 4.)

Figure 7.7c    Leptokurtic, Mesokurtic, and Platykurtic Distribution

(Platykurtic: a low, flat-topped curve plotted over standardized values from −4 to 4.)

formula that includes the deviation from the mean for each of the original values, raised
to the fourth power, which emphasizes values farther from the mean (as variance does
with the sum of squares). Obtaining a number for kurtosis gives you a more exact basis
for judging whether a distribution that looks normal really is normal. Your statistics
software should be able to produce a kurtosis value you can use to judge if the
distribution is too far from normal to run certain statistical tests.
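To make the idea concrete, here is a sketch of the population (moment-based) excess kurtosis: the average fourth-power deviation divided by the squared variance, minus 3 so that a normal distribution scores zero. SPSS reports a sample-adjusted version with small-sample corrections, so its numbers will differ somewhat from this simpler formula.

```python
def excess_kurtosis(values):
    """Population excess kurtosis: m4 / m2**2 - 3 (zero for a normal curve)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n   # population variance
    m4 = sum((x - mean) ** 4 for x in values) / n   # fourth central moment
    return m4 / m2 ** 2 - 3

# A flat, evenly spread sample is platykurtic (negative kurtosis)
print(round(excess_kurtosis([1, 2, 3, 4, 5]), 1))   # -1.3

# A sample tightly clustered with a few extreme values is leptokurtic
print(excess_kurtosis([5, 5, 5, 5, 5, 5, 1, 9]))    # 1.0
```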

Figure 7.8a   Two Distributions With Same Mean and Different Standard Deviation

(A relatively flat frequency distribution with mean = 50 and a large standard
deviation, SD = 25; values run from 0 to 100 on the horizontal axis.)

Figure 7.8b    Two Distributions With Same Mean and Different Standard Deviation

(A more peaked frequency distribution with the same mean = 50 but a small standard
deviation, SD = 15; values run from 0 to 100 on the horizontal axis.)

The symmetry of the data distribution is also important. Skewness is a measure of
the degree of lopsidedness of the frequency distribution. When the skewness is zero, the
shape of the distribution is symmetric. When the skewness has a negative value, there is
a long tail on the left side and a larger “hump” on the right side of the distribution. This
is called a negatively skewed distribution; it looks like the chart marked East in the upper
portion of Figure 7.9. Alternately, when the skewness has a positive value, there is a larger
“hump” on the left side of the distribution and the long tail is on the right side. This is
called a positively skewed distribution; it looks like the chart marked West in the lower
portion of Figure 7.9. The distribution curves superimposed over the two histograms in
Figure 7.9 illustrate where the skewed data diverge from a normal distribution.
Kline (2011) suggests that an absolute value of 3 should be used as a guideline to
assess the skewness of the data; if the skewness value is lower than −3, then your data
may be substantially negatively skewed, and if it is higher than 3, then your data may be
substantially positively skewed. Again, like kurtosis, the measure of skewness relies on a
complex formula that a statistics software package should be able to calculate for you.
Getting an exact

Figure 7.9    Negative and Positive Skewed Distribution

(Upper panel, marked East: a negatively skewed distribution, with the hump on the
right and the tail region trailing to the left. Lower panel, marked West: a positively
skewed distribution of the number of bankruptcies in 2000, with the hump on the left
and the tail trailing to the right; frequencies are plotted on the vertical axis.)

Figure 7.10a    Skewness and Central Tendency

(Negatively skewed distribution: the mean lies farthest toward the tail on the left,
the median in the middle, and the mode at the hump on the right.)

Figure 7.10b    Skewness and Central Tendency

(Positively skewed distribution: the mode sits at the hump on the left, the median in
the middle, and the mean farthest toward the tail on the right.)

number will help confirm the visual observation of skewness in the chart of your frequency
distribution. However, skewness is usually readily apparent in the histogram.
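The moment-based (population) skewness can be sketched the same way as kurtosis: the average cubed deviation divided by the standard deviation cubed. Applied to Emily's eight-value length-of-service sample, the one high value (20 years) produces a positive skew. As with kurtosis, SPSS applies a small-sample adjustment, so its reported skewness will differ slightly from this version.

```python
def skewness(values):
    """Population skewness: m3 / m2**1.5 (zero for a symmetric distribution)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n   # population variance
    m3 = sum((x - mean) ** 3 for x in values) / n   # third central moment
    return m3 / m2 ** 1.5

service = [2, 3, 4, 8, 8, 12, 15, 20]
print(round(skewness(service), 2))   # 0.54 (positively skewed)
```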
Typically, significant skewness happens when you have outliers in your data set, as
shown in the chart marked West. If a small number of values are extremely low, then
the data distribution will be negatively skewed; if a small number of values are
extremely high, then the data distribution will be positively skewed.
Since the mode is the highest point of a frequency distribution, it corresponds to
the hump of the distribution. One obvious problem with skewed data is that the mode
will shift toward the hump on one side of the data distribution, while the mean is

influenced by the long tail of extreme values on the other side, and will shift in the
other direction. Neither the mode nor the mean will be useful as measures of central
tendency (Remler & Van Ryzin, 2011). When the data distribution is skewed, the
median will be the most stable measure of central tendency. The relationship between
skewness and the three measures of central tendency is depicted in Figure 7.10.
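Emily's length-of-service sample illustrates this ordering: its high outlier (20 years) pulls the mean above the median, the pattern expected in a positively skewed distribution. A quick check with Python's standard library:

```python
import statistics

service = [2, 3, 4, 8, 8, 12, 15, 20]

mode = statistics.mode(service)      # 8
median = statistics.median(service)  # 8.0
mean = statistics.mean(service)      # 9

# For positively skewed data we expect: mode <= median <= mean
print(mode <= median <= mean)  # True
```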

Chapter Summary
In this chapter, we outlined the steps needed to prepare your data for
analysis. We introduced key steps involving data cleaning, creating a codebook, checking the data
structure and level of measurement in your variables, and finally, running descriptive statistics to
characterize the central tendencies of the data and the shape of the data distribution. All of these
things will give you confidence that your data are in shape to proceed.
We described four levels of measurement: nominal or categorical (also includes dichoto-
mous), ordinal, interval, and ratio. The interval and ratio levels of measurement are usually com-
bined and described as continuous variables. The level of measurement that characterizes your
variables is the first thing to consider when setting up your analysis.
The next step is to understand your data as a whole. Descriptive statistics provide you with
frequencies for the range of values in any particular variable. We introduced several measures to
summarize the frequencies and characterize the data overall. For measures of central tendency, we
discussed the mean, median, and mode. Each of these measures captures relevant information
about the middle point of the data in a different way. Measures of variability include range, vari-
ance, and the standard deviation. We also illustrated different possibilities in how the data may be
distributed, which is important for the applicability of certain statistical tests. The shape of the
distribution can be described by its flatness or lopsidedness. The degree of flatness is indicated by
kurtosis, and the degree of lopsidedness is indicated by skewness. Understanding descriptive sta-
tistics will help you understand your data and provide the necessary foundation to proceed with
your analysis.

Review and Discussion Questions


1. In addition to the demographic backgrounds of the study participants, what other descriptive
statistics should Emily be examining for her study?
2. Discuss the advantages and disadvantages of all three measures of central tendency: mean,
median, and mode. Give specific examples of situations in which you would find these mea-
sures useful.
3. Name the different measures of variability. What does each measure tell you about your data?
4. How do skewed data affect data analysis?
5. Find a report published by a government agency or a nonprofit organization. Examine what
kinds of descriptive statistics are reported.

Statistics Exercise
1. Print out Emily’s survey data from the book website http://www.sagepub.com/nishishiba1e. Fill
out the survey yourself, pretending you are an employee of the City of Westlawn. Ask a few peo-
ple to do the same. Create a data file in SPSS and Excel, and enter the survey data you collected.
2. Create a codebook for your data file.
3. Run descriptive statistics for Emily’s survey data and Jim’s response-time data according to the
instructions below.

1. Emily’s Data
Open Emily’s survey data “Emily survey” from http://www.sagepub.com/nishishiba1e
a. List all the variables measured at the nominal level.
b. List all the variables measured at the ordinal level.
c. List all the variables measured at the continuous level (interval and ratio).
d. Run descriptive statistics of the demographic background of the training participants.
e. Run descriptive statistics of the demographic background of those who did not participate
in the training.
f. What do you notice in the descriptive statistics for Emily’s data?

2. Jim’s Data
Open Jim’s data, “Response time by station by year” from http://www.sagepub.com/nishishiba1e

a. Run descriptive statistics of the response-time data from the eight stations for year09,
year10, and year11 respectively.
b. What do you notice comparing the descriptive statistics from year09, year10, and year11?
c. What do you notice in the descriptive statistics for Jim’s data? What are the differences
between the data sets for these time periods?

Step-by-Step Instructions for Running Descriptive Statistics


Using SPSS
To obtain the descriptive statistics for age from Emily’s survey, perform the following:

1. Click Analyze → Descriptive Statistics → Descriptives.


2. Enter the variable “how old are you” (q18) into the Variables(s) box.

Figure 7.11   Descriptive Statistics Using SPSS

3. Click Options.
4. Mean, minimum, maximum, and standard deviation are selected by default. Also click on
variance and range.
5. Click Continue.
6. Click OK.

Figure 7.12   Descriptive Statistics Using SPSS - Options



You will obtain the following output in the SPSS Statistics Viewer window.

Figure 7.13   Descriptive Statistics Using SPSS - Output

Step-by-Step Instructions for Running Descriptive Statistics


Using Excel
To run the same analysis in Excel, perform the following:

1. Click Data → Data Analysis (Windows Excel).


2. Select Descriptive Statistics.
3. Click OK.
4. Activate the input range by clicking in the text box.
5. Highlight cells U1 through U236.

Figure 7.14   Descriptive Statistics Using Excel

6. Click “labels in first row” (this tells Excel that U1 is a qualitative label and not part of the
data).
7. Check Summary Statistics.
8. Click OK.

Figure 7.15   Descriptive Statistics Using Excel - Inputting Data Range

The following output is obtained on a new worksheet.

Figure 7.16   Descriptive Statistics Using Excel - Output
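If you later want to reproduce the same descriptive statistics outside SPSS or Excel, the standard library of a general-purpose language can compute them all in one pass. This sketch uses Emily's eight length-of-service values, since the full survey file (with the age variable used in the instructions above) is not reproduced in the text:

```python
import statistics

values = [2, 3, 4, 8, 8, 12, 15, 20]

# The same fields SPSS's Descriptives procedure reports
summary = {
    "N": len(values),
    "Minimum": min(values),
    "Maximum": max(values),
    "Range": max(values) - min(values),
    "Mean": sum(values) / len(values),
    "Std. Deviation": round(statistics.stdev(values), 2),
    "Variance": round(statistics.variance(values), 2),
}
print(summary)
# {'N': 8, 'Minimum': 2, 'Maximum': 20, 'Range': 18, 'Mean': 9.0,
#  'Std. Deviation': 6.3, 'Variance': 39.71}
```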



References
Babbie, E., Halley, F., Wagner III, W.E., & Zaino, J. (2012). Adventures in social research: Data analysis using
IBM SPSS (8th ed.). Thousand Oaks, CA: Sage.
Berman, E. M., & Wang, X. (2012). Essential statistics: For public managers and policy analysts. Thousand
Oaks, CA: Sage.
Carifio, J., & Perla, R. (2008). Resolving the 50-year debate around using and misusing Likert scales. Medical
Education, 42(12), 1150–1152.
Ha, R. H., & Ha, J. C. (2012). Integrative statistics for the social & behavioral sciences. Thousand Oaks, CA:
Sage.
Holcomb, Z. C. (1998). Fundamentals of descriptive statistics. Los Angeles, CA: Pyrczak.
Jamieson, S. (2004). Likert scales: How to (ab)use them. Medical Education, 38(12), 1217–1218.
Kline, R. B. (2011). Principles and practice of structural equation modeling. New York, NY: Guilford.
Mann, P. S. (1995). Introductory statistics (2nd ed.). West Sussex, UK: Wiley.
Moore, D. S. (2001). Statistics: Concepts and controversies (5th ed.). New York, NY: Freeman.
Remler, D. K., & Van Ryzin, G. G. (2011). Research methods in practice: Strategies for description and causa-
tion. Thousand Oaks, CA: Sage.
Rumsey, D. J. (2009). Teaching bits: “Random thoughts on teaching.” Journal of Statistics Education, 17(3).
Retrieved from http://www.amstat.org/publications/jse/v17n3/rumsey.html
Stevens, S. S. (1946). On the theory of scales of measurement. Science, 103(2684), 677–680.
Stevens, S. S. (1951). Mathematics, measurement, and psychophysics. In S. S. Stevens (Ed.), Handbook of
experimental psychology (pp. 1–49). New York, NY: Wiley.
Trochim, W. M. K., & Donnelly, J. P. (2007). Research methods knowledge base. Mason, OH: Thomson
Custom.

Key Terms
Boxplot 129
Cases 121
Central Tendency 126
Codebook 121
Data Cleaning 122
Descriptive Statistics 126
Deviance 134
Frequency Distribution 137
Frequency Polygon 138
Frequency Table 137
Histogram 137
Inferential Statistics 126
Interquartile Range 130
Kurtosis 139
Leptokurtic 139
Levels of Measurement 122
Mean 127
Measures of Variability or Dispersion or Spread 131
Median 128
Mesokurtic 139
Mode 130
Normal Curve 138
Outliers 128
Percentile Points 129
Platykurtic 139
Range 133
Raw Data 126
Skewness 142
Standard Deviation 136
Statistics 126
Variables 121
Variance 134

Student Study Site


Visit the Student Study Site at www.sagepub.com/nishishiba1e for these additional learning tools:

•• Data sets to accompany the exercises in the chapter


8  ❖  Hypothesis Testing and Statistical Significance
Logic of Inferential Statistics


Learning Objectives 152
Using Inferential Statistics 152
Emily’s Case 152
Jim’s Case 153
What Are Inferential Statistics? 154
Developing Hypotheses 155
Types of Variables in the Hypothesized Relationship 155
Emily’s Case 156
Hypothesis Testing 158
Statistical Significance 160
Level of Significance 160
Probability, Normal Distribution, and Sampling Distribution of the Mean 162
Normal Distribution 162
Sampling Distribution of the Mean 162


Summary of Hypothesis Testing Steps 166


Errors and Risks in Hypothesis Testing 166
Statistical Significance Versus Practical Significance 168
Chapter Summary 169
Review and Discussion Questions 169
Key Terms 170
Figure 8.1 Leo’s Whiteboard Schematic on the Types of
Variables in the Hypothesized Relationship 158
Figure 8.2 Normal Distribution Curve 163
Figure 8.3 Normal Distribution Curve and Percentage
of Cases Between Selected Standard Deviation Points 163
Figure 8.4 Schematic Illustration of Sampling Distribution 165
Table 8.1 Four Possible Outcomes in Hypothesis Testing 167
Formula 8.1 Standard Error (Calculated With Standard
Deviation of the Population) 164
Formula 8.2 Standard Error (Calculated With Standard
Deviation of a Sample) 164


Learning Objectives

In this chapter you will

1. Understand the logic behind hypothesis testing


2. Utilize the concept of statistical significance in hypothesis tests
3. Understand how inferential statistics are used to make generalizations from
the sample to the population

Using Inferential Statistics

Emily’s Case
Emily, HR director at the city of Westlawn, stood at the whiteboard in her office,
while her intern Leo sat in a chair at the small table in one corner. They stared at the
list they were making for “Things to find out” for their data analysis. After thirty
minutes working on it and discussing each bullet point, the list on the board said:

•• Group (attended training or not)


•• Background (gender, ethnicity, age)

•• Work status (department, length of employment)


•• Average level of cultural competence (multiple items)
•• Average level of perceived workplace conflict (multiple items)

Leo looked at the list and said, “I can get all these by running descriptive statistics.”
Emily noticed Leo was thinking something more. “Anything else you have in mind?”
Leo hesitated. “Well, this information only describes the characteristics of the people
who attended the training and who didn’t attend the training. I am wondering about how
we assess the impact of the diversity training based on the information from the sample
we selected.”
Emily wasn’t sure she followed. “I thought we could just look at the average level of
cultural competence and the average level of perceived workplace conflict of the training
group and nontraining groups, and compare the scores. If we have different scores between
the two groups, can’t we say that the training had an impact?”
“That’s the basic idea,” Leo answered, “but I believe in order for us to make an inference
about the whole employee population based on the sample data we have, we need to do
some statistical tests to see if there is a statistically significant difference between the two
groups.” Leo stood up and moved to the whiteboard, and started drawing squares and
arrows, saying more to himself, “We are looking at higher cultural competence and lower
workplace conflict as the outcomes. So they are the dependent variables. And—”
Emily could see that Leo was in his zone. She decided to let Leo do his own thing and
see where it led. Once he gets his thoughts together, she thought, he can explain.

Jim’s Case
Jim, deputy fire chief at the city of Rockwood, had been nearly para-
lyzed facing the barrage of data for his response-time project. Then
Lavita stepped in. Lavita was a graduate student that Jim’s professor
friend Ty “loaned” to him. She wanted experience with a research
project in a “real” setting. Lavita took Jim’s data home with her and
sounded like she knew what she was doing. A few days later, when
Jim saw Lavita standing at the open door to his office, he beckoned
her in. He stood up.
“I have some results to show you,” Lavita said. She held out papers
and laid the sheets on Jim’s desk so he could read the numbers. “I’ve been looking at the
response-time data for 2009 to 2011. Here are some descriptive statistics for the response
times.”
Jim focused on the numbers on the pages, blocked into tables with abbreviations he did
not recognize. He wasn’t sure what he was looking at. He waited for Lavita to guide him
through it.
“There are various ways we can take a look at the response time,” she explained, “and I
probably need your guidance as to how you want me to analyze the data. For a start, we
can look at the mean response time year to year and see if we meet the national five-minute
standard.” Lavita pulled out a highlighter and marked numbers as she talked. “For example,
in 2011 the annual mean response time was 4.55 minutes.”

Jim’s face brightened. “That’s good, I think. It’s better than what I saw on other stations’
reports. Can we claim that our response time is significantly lower than the national
standard?”
“I don’t know about that yet,” said Lavita. “I wanted to check with you first to be sure
I’m on the right track. If you like the way these results are calculated, I can run a statistical
test and see if our data is significantly lower than the national standard. I can get back to
you with a result.”

What Are Inferential Statistics?

In the cases above, we see the researchers taking the first steps to explore their data
with descriptive statistics. We discussed descriptive statistics in Chapter 7. In both
cases, the researchers are moving to compare groups to detect if they are different. The
statistical tests they intend to use rely on inferential statistics, which apply to samples
drawn from a larger population of interest (Coolidge, 2013).
The logic of inferential statistics starts with a hypothesis about a relationship of
two or more attributes or concepts observed in a population. As we saw in Chapter 6,
these concepts are first operationalized to identify them with specific items that can be
measured and observed as data. In Chapter 7, we saw that these items become vari-
ables, once they are entered into a database with a set of recorded values for each item
associated with the observed cases in the sample. In Chapter 4, we saw that testing the
hypothesis with these variables involves making a comparison between groups in a
research design that allows differences to be detected or examining the relationships
among these variables.
With inferential statistics, a researcher can assess if the results are statistically sig-
nificant to indicate that the hypothesized relationship exists. The process of assessing
statistical significance is called a significance test. The steps to confirm the hypotheses
are called hypothesis testing. We will discuss more about significance tests and
hypothesis testing later in this chapter. In Chapters 9 through 13, we will introduce
some commonly used inferential statistical approaches available to researchers in the
field of public administration and nonprofit management:

•• t-tests
•• Analysis of variance (ANOVA)
•• Bivariate correlation
•• Chi-Square
•• Regression analysis

Before discussing the details of these inferential statistical tests, we will first
describe how hypotheses are developed. We will also discuss different types of vari-
ables included in hypothesized relationships. We will then explain the logic of hypothesis testing and the significance test.
Chapter 8  Hypothesis Testing and Statistical Significance  ❖  155

Developing Hypotheses

A hypothesis is a tentative statement about the plausible relationship between two or more variables that is subject to empirical verification. It is an educated guess that
proposes an explanation for some phenomenon (Leedy & Ormrod, 2010; Schutt,
2012). In an applied research field, such as public administration and nonprofit
management, hypotheses frequently originate from the practitioners’ observations,
informed by day-to-day experience in the operations. As we have seen in Emily’s
case, she has a hunch, based on her professional experience as a training manager
and HR director, that providing diversity training will help improve employees’
cultural competence and reduce workplace conflict. In Jim’s case, Chief Chen has a
hunch that dispatching a car with a trained nurse and one firefighter will be more
efficient and effective than sending a fire engine with four firefighters. Although these professional hunches are a good place to start, in research these hunches need
to be developed into hypotheses.
The process of developing hypotheses requires the researcher to identify what
variables to include in the study, defend why these variables are relevant, and postu-
late their relationships. As O’Sullivan, Rassel, and Taliaferro (2011) note, this process
is neither as systematic nor as linear as research reports or journal articles make it appear. It
requires creativity and good insights on the topic at hand. Researchers frequently
rely on existing knowledge by reviewing literature to help them develop hypotheses.
A literature review will help you gain knowledge on what variables have been
included in similar studies, how the variables are measured, what sources of data
were used, and what relationships were examined. While conducting a literature
review, researchers should also pay attention to different social theories relevant to
the topic of the research. Identifying a theory provides the researcher an overarching
framework to approach the study topic and will help clarify specific hypotheses
(Babbie, 2013).
It should be noted that not all research has a hypothesis. When the objective of
your research is to explore and describe the phenomenon, taking an inductive theory
building approach, you may not have a formal hypothesis. The process we discuss in
this chapter applies specifically to research taking a deductive approach to theory
building, which formulates a specific hypothesis up front and tests it by collecting data.

Types of Variables in the Hypothesized Relationship


There are three basic types of variables in hypothesis testing, based on the role they
play in the hypothesized relationships: dependent variables, independent variables,
and extraneous variables. A hypothesis typically involves a relationship, which states
that a change in one variable will effect a change in another variable. The affected
variable is called the dependent variable (DV), or sometimes an outcome or criterion
variable. The variable (or variables) that you hypothesize as causing the change in the
dependent variable is called the independent variable (IV). When your research
objective is to confirm and test a hypothesized cause-and-effect relationship, what you
hypothesized as a cause is the independent variable, and what you hypothesized as the
outcome is the dependent variable.
In Emily’s case, for example, she hypothesized that the diversity training will have
an impact on the level of cultural competence of the employees and will also affect
the level of workplace conflict. In her case, therefore, attending the diversity training
is the independent variable, and the level of cultural competence and perceived work-
place conflict are two dependent variables. She could also include other potentially
causal factors as independent variables that influence the effect. If Emily decides she is
interested in examining if the effect of the training is different among employees from
different city departments, then department affiliation would be an additional
independent variable in the hypothesized relationship.
Not all hypothesized relationships clearly differentiate between the independent
and dependent variables. For example, a researcher may be interested in a correlation
of two variables, without defining an explicit causal relationship. In this case, no clear
distinction can be made between the two variables as to which one is the independent
variable and which one is the dependent variable. For example, Emily may decide to
examine if there is a correlation between the number of diversity trainings people
attended and their level of cultural competence. In this particular situation, it is possi-
ble that higher attendance caused people to attain a higher level of cultural compe-
tence, or it is also possible that people who had a higher level of cultural competence
sought more opportunities to participate in diversity training. Emily does not need to
hypothesize either one of the variables is the cause of the other. She can simply test if
the two variables are associated with each other.
Extraneous variables refer to other factors that may influence change in the depen-
dent variable that were not considered in the hypothesized relationship of independent
and dependent variables. They are sometimes called confounding variables, because they
confound the expected results and provide alternative explanations for the relationship
between the variables you are examining in your study. Once such extraneous variables
are recognized, their effects can be controlled in the research design as well as in the
statistical analysis. In this context, they can also be called control variables.
With this foundation, we can return briefly to Emily’s case, and see what Leo has
in mind for the disposition of the variables in their research.

Emily’s Case
“Tell me what you are thinking,” Emily asked.
Leo was pondering over his diagram on the whiteboard. He glanced at her
apologetically. “Sorry. I was just trying to get that out.”
“I can see,” Emily chuckled.
“I’ve been so busy getting the data file right that I haven’t thought about what
we’re doing for a while,” Leo explained. “OK, let me run this by you. It’s simple.”
He erased his messy doodling and wrote on top of the board: Hypothesized
Relationship. “We are hypothesizing that diversity training affects the employees’
level of cultural competence and workplace conflict.”

Emily nodded.
“In other words, we believe the diversity training will influence a change in cultural
competence.” Leo wrote neatly on one side of the board: Attend Diversity Training. On the
other side, he wrote: Cultural Competence. He drew an arrow from left to right to connect
them. “The training is the cause and the change in cultural competence is the effect, or an
outcome in our hypothesized relationship.”
Leo repeated the same schematic on another line below, this time with the line pointing
to Workplace Conflict. He drew a dotted square around the right-side terms and labeled
it: Dependent Variable (Effect). He drew a dotted square around the left-hand terms
and labeled it: Independent Variable (Cause).
“What we presume to be the cause in our hypothesized relationship is called an inde-
pendent variable, and what we consider to be the effect is called the dependent variable,”
Leo explained. “We have one independent variable, attending the diversity training and two
dependent variables, cultural competence and workplace conflict.”
Emily nodded again, “OK.” This part they knew already.
“Another thing we need to consider,” Leo continued, “is extraneous variables that could
be influencing the effect.”
“What do you mean?” Emily asked.
“Extraneous variables are the kinds of things that can potentially affect the change in
the dependent variable. I was trying to think about what would be some extraneous vari-
ables. Maybe you can help.”
“If I get your meaning, there may be something we should consider that may affect
people’s level of cultural competence, or workplace conflict, other than the training, right?
I think the level of education could make a difference; not just certificates or professional
training, but I mean years in high school and college, basic education. I believe we did ask
that on the questionnaire.”
“Yes,” Leo responded. “I have it listed in the codebook, so you can see the question and
the response categories we used. Mei-Lin and I noticed that years of education were asked in
the cultural competence surveys we found in the literature.” On the board under the box
for the independent variable, he wrote: Years of education.
“And what about their previous experience in diversity trainings?” Emily added. “Some
people said they had attended diversity trainings before. It seems like that could affect their
cultural competence and the level of workplace conflict—even for those employees who did
not take our training this time.”
“Ah! That’s a good one,” Leo said. “We put that on the questionnaire, too.” Under the
previous addition, he wrote: Number of diversity trainings attended.
“So what do we do about them?” Emily asked.
“When we assigned people to the training and nontraining groups, we made a random
assignment, so if that worked the way it is supposed to, we should have people with similar
backgrounds in both groups. I can check their education level and experience with other train-
ings in the descriptive statistics. If there are differences, I can control for these two extraneous
variables when we conduct our analysis. We can do it with and without the controls and see
if it makes a difference.” Leo drew a circle around the two items they added and wrote under-
neath: Extraneous variable. He drew an arrow over to the dependent-variable box.

Emily looked at the diagram and understood. She could see this was a simple beginning
and suspected Leo had a lot more coming.

Hypothesis Testing

The hypothesis that researchers want to verify is called the research hypothesis (or
alternative hypothesis). It is phrased as a statement that a particular relationship
between two or more variables exists. The development of the hypotheses is
informed by the researchers’ interests, observations of the world, and the review of
the literature. In Emily’s case, it is clear that the hypothesis needed to be developed
as much as possible prior to the data collection. The key concepts in the hypothesis
such as cultural competence and workplace conflict needed to be measured using the
survey. Other items in the survey instrument—education level and experience with
prior diversity trainings—were modeled on items found in the literature review by
Leo and Mei-Lin.
The process of hypothesis testing is to formally test if the result obtained from the
sample can be used to infer what’s happening in the population of interest. It confirms
and tests the hypothesized relationship using the data from the sample. In the
­hypothesis-testing process, the first thing researchers do is set up a null hypothesis. As

Figure 8.1   Leo’s Whiteboard Schematic on the Types of Variables in the Hypothesized Relationship

the name suggests, the null hypothesis states the opposite of your research hypothesis:
that your idea is wrong and the relationship between the variables of interest
you identified does not exist. The researcher needs a null hypothesis, because testing a
hypothesis involves disproving it, not proving it. By demonstrating through a statistical
test that the null hypothesis is likely to be wrong, we can logically conclude that our
original research hypothesis is likely to be right.
This approach to hypothesis testing—setting up a null hypothesis as a straw man
to refute and thus verify the research hypothesis—is based on the philosophy of fal-
sifiability advocated by philosopher of science Karl Popper (Popper, 1959, 1962).
Popper noted that in a scientific inquiry, the hypothesis needs to be falsifiable,
because no matter how many observations you have, you cannot verify that your
observation is universally generalizable. To illustrate, when you have a hypothesis that
states, “All swans are white,” you cannot possibly observe all swans in the world and
verify the hypothesis. However, it is logically possible to falsify (or refute or nullify)
the statement by observing one black swan. Statistical approaches to hypothesis testing
that apply this logic of falsifying hypotheses were introduced by R. A. Fisher in
the 1930s (Fisher, 1935; Fisher & Bennett, 1990). Fisher coined the term
null hypothesis.
The notation used to indicate the research hypothesis is HR (or HA for the alternative
hypothesis). The notation used to indicate the null hypothesis is H0.
Let’s walk through the process of testing our hypothesis with this logic. Instead of
verifying all possible observations of the relationship between the variables of interest,
the researcher establishes a null hypothesis that states the relationship does not exist
(Relationship = 0). If the researcher is successful in demonstrating that the null
hypothesis statement is unlikely to be true, then the research helps confirm the
research hypothesis that the relationship does exist (Relationship ≠ 0).
In Emily’s case, since she is examining the relationship between diversity training
attendance and level of cultural competence, her null hypothesis is that there is no
relationship between diversity training attendance and level of cultural competence.
Therefore, she can express her null hypothesis as:

H0: level of cultural competence of group that attended the diversity training =
level of cultural competence of group that did not attend the diversity training.

A research hypothesis can be either directional or nondirectional: a directional
research hypothesis specifies a direction of change; a nondirectional research
hypothesis specifies only that there will be a difference, with no direction indicated.
Focusing on one of the two hypothesized relationships in Emily’s case for illustration—
the relationship between attending the diversity training and the level of cultural
competence—we see she expects an increase in cultural competence in the group
attending training compared to the group that did not attend the training, so she has a
directional hypothesis, which can be stated as follows:

HR: level of cultural competence of group that attended the diversity training >
level of cultural competence of group that did not attend the diversity training.

If Emily’s research hypothesis was nondirectional, it would be expressed as:

HR: level of cultural competence of those who attended the diversity training ≠
level of cultural competence of those who did not attend the diversity training.

Whether to use a directional or nondirectional hypothesis depends on the research
purpose and, in some cases, the information at hand. If Emily did not care whether the level
of cultural competence of those who attended the diversity training increased or
decreased compared to those who did not attend the training, then a nondirectional
hypothesis would be appropriate.
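This choice matters when the p-value is computed: a nondirectional hypothesis calls for a two-tailed test, while a directional hypothesis calls for a one-tailed test. The following sketch illustrates the difference using only Python’s standard library; the test statistic (z = 1.80) is a hypothetical value chosen for illustration, not a result from Emily’s study:

```python
# One-tailed vs. two-tailed p-values for a hypothetical test statistic,
# using the standard normal distribution (Python standard library only).
from statistics import NormalDist

z = 1.80  # hypothetical standardized test statistic

# Nondirectional hypothesis (two-tailed): probability of a result at least
# this extreme in either direction.
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))

# Directional hypothesis (one-tailed): probability of a result at least
# this extreme in the hypothesized direction only.
p_one_tailed = 1 - NormalDist().cdf(z)

print(round(p_two_tailed, 4))  # 0.0719, not significant at the .05 level
print(round(p_one_tailed, 4))  # 0.0359, significant at the .05 level
```

Notice that the same statistic can be significant with a one-tailed test but not with a two-tailed test, which is one more reason the choice between a directional and a nondirectional hypothesis must come from the research purpose, set before the data are examined.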
With the null hypothesis and research hypothesis established, Emily can now
proceed to conduct a statistical significance test to determine the likelihood that the
null hypothesis is wrong in the population of interest. Based on the result of the
statistical significance test, if Emily identifies that the null hypothesis is likely to be
wrong, she rejects the null hypothesis that there is no difference in the level of cul-
tural competence between those who attended the training and those who did not
attend the training. She then can claim that her research hypothesis is supported by
her research. With this evidence in hand, she could then argue that her diversity
training helped increase the level of cultural competence among city employees.

Statistical Significance

Determining if the null hypothesis can be rejected involves a statistical significance test.
When the null hypothesis is rejected as a result of the significance test, then the result
of the study is considered statistically significant. The researcher will decide in
advance at what point a result will qualify as significant. This cutoff point is called the
level of significance. In this section, we will describe statistical significance and the
level of significance in more detail.
First, we will examine the logic of statistical significance according to the charac-
teristics of the sample data and sampling error. We will see in the discussion that
determining statistical significance in testing a hypothesis assumes that probability
sampling was used in selecting the study sample. This is important for the theoretical
foundation of computing standard error, which underlies the estimate of sampling
error and the level of significance.

Level of Significance
A statistical significance test calculates the probability that a result observed in the
data is due to chance. This probability is given the shorthand notation p-value. The
cutoff probability that the researcher sets in advance for calling a result significant
is the level of significance, or the alpha (α) level. Conventionally, researchers in social
science set the level of significance at 5% (Faherty, 2008). A statistical test in inferential statistics
will calculate a p-value with the results. When the p-value falls below the level of
significance (e.g., p < .05), a researcher can feel confident in rejecting the null
hypothesis and claim that an observed result supports the research hypothesis. There
is only a small possibility that the result is due to chance. Conversely, if the p-value
for test statistics is greater than the set level of significance, then the researcher does
not have enough justification to reject the null hypothesis. Researchers need a higher
degree of confidence to feel comfortable rejecting the null hypothesis.
When we see research reports that say the result is statistically significant, it
means that the researchers have some confidence that they did not obtain the result
just by chance. The relationship observed in the study is not just a fluke. If Emily
finds that there is a statistically significant difference in the level of cultural compe-
tence between those who attended the diversity training and those who didn’t, that
means Emily can declare with some confidence that the difference between the two
groups is not observed in her sample totally by chance. The diversity training must
have had some impact on people’s cultural competence, and the result can be gen-
eralized to the whole employee group of the city of Westlawn. Or if Jim found out
that the response time for the Rockwood Fire Department is significantly lower
than that of the national standard, that means Jim has some confidence that the
difference in the response time between the Rockwood Fire Department and the
national standard is not observed just by chance, and therefore, there must be
something Rockwood Fire Department is doing right to have a lower response time.
It is important to distinguish significance level from confidence level. These are
companion concepts. Setting the level of significance at a certain cutoff point allows
the researcher a level of confidence in the results. This level of confidence is expressed
as a percentage, too, thus perhaps 90%, 95%, even 99% confidence, depending on the
significance level. Confidence that a result is statistically significant can never be 100%.
As discussed in Chapter 5, when sampling from a larger population, the researcher will
never be completely certain that the result is accurate. Researchers conventionally
select a confidence level of 95%, which corresponds to the 5% significance level. This
means there is a good possibility that an observed result in the sample is genuine and
can be generalized to the population of interest, with only a 1 in 20 possibility that the
result is due to chance.
In Emily’s case, she will be looking for a statistically significant difference in the
level of cultural competence between the training and nontraining groups. In Jim’s
case, the relationship involves the fire department’s average response time compared to
a national standard to determine that there is no statistically significant difference (or,
if there is a difference, that the average response time is faster than the standard). These are
the most common applications for statistical significance that practitioner–researchers
are likely to use. Statistical significance can also apply to observed correlations between
variables or groups.
The p-value is always reported with test statistics to indicate the probability of a
chance result. As a researcher, or a consumer of research, it is important to understand
how to interpret the p-value. If a researcher sets the significance level at .05 (p < .05), then any
value over that cutoff, such as p = .1 or p = .06, does not mean the data result is almost
significant; it means the result is not significant. The cutoff point for the significance level means the researcher is not willing
to make a statement about the data with any confidence if the p-value is not within
that range. On the other side, in cases where the p-value is small, such as p < .01 or
p < .001, this does not mean the results are more significant, or the effect is greater, or
the original research hypothesis is more true; it simply means the null hypothesis can
be rejected with more confidence. This in no way affects the judgment of the alternate
hypothesis, its practical significance, or the size of the effect (Salkind, 2011).
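The decision rule described here can be reduced to a comparison of two numbers. A minimal sketch (the function name and example values are ours, for illustration only):

```python
# Decision rule for a significance test: compare the obtained p-value to the
# preset level of significance (alpha). There is no "almost significant."
def decide(p_value, alpha=0.05):
    if p_value < alpha:
        return "reject the null hypothesis"
    return "do not reject the null hypothesis"

print(decide(0.06))   # do not reject: .06 is not "almost significant"
print(decide(0.04))   # reject
print(decide(0.001))  # reject; a smaller p-value means more confidence in
                      # rejecting the null, not a larger or "more true" effect
```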

Probability, Normal Distribution, and Sampling Distribution of the Mean
In this section, we want to explore what it means in inferential statistics to say a result
may be due to chance. This involves principles of statistical probability, based on
characteristics of the sampling distribution, and particularly the normal distribution,
which we introduced in Chapter 7. Examining the normal distribution provides a
basis for understanding the statistical probability of any given data outcome. We will
then discuss sampling distribution and how it is used to estimate the population
parameter, and the probability of a test statistic being obtained by chance.

Normal Distribution
Normal distribution refers to a theoretically ideal distribution. Although an empirical
distribution of real data rarely matches this theoretical distribution, many examples do
closely resemble a normal distribution. A normal distribution is represented by a bell-
shaped frequency polygon with perfect symmetry. It is sometimes referred to as the bell-
shaped curve. One of the properties of the normal distribution is that mean, median, and
mode are the same and are located at the exact midpoint of the distribution. (See Figure 8.2.)
Another property of the normal distribution is that a certain percentage of cases
fall between selected points under the curve. As you can see in Figure 8.3, 34.13% of
the cases fall between the mean and one standard deviation above the mean, and another
34.13% fall between the mean and one standard deviation below the mean, so 68.26%
of the cases fall within one standard deviation of the mean. Furthermore, as the
horizontal arrows in Figure 8.3 indicate, 95% of the cases fall within plus or minus 1.96
standard deviations of the mean, and 99% of the cases fall within plus or minus 2.58 standard
deviations. These properties of the normal distribution are important for determining
the probability level in the results.
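These percentages can be verified directly from the cumulative distribution function of the standard normal distribution. A quick check using Python’s standard library:

```python
# Percentage of cases falling within selected standard deviation points of
# the mean in a normal distribution.
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def pct_within(k):
    """Percentage of cases within plus or minus k standard deviations."""
    return 100 * (std_normal.cdf(k) - std_normal.cdf(-k))

print(round(pct_within(1.00), 2))  # 68.27 (i.e., 34.13% on each side)
print(round(pct_within(1.96), 2))  # 95.0
print(round(pct_within(2.58), 2))  # 99.01
```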

Sampling Distribution of the Mean


Even with a probability sampling approach (discussed in Chapter 5), there is a chance
that data collected from a particular sample may not be representative of the population.
The degree of difference between the sample statistics and the population parameter
(the value of the population) is called sampling error. When the sampling error is
higher, the sample data is less representative of the population, and the researcher will
find it more difficult to make a case that the result found in the sample reflects what is
expected to be found in the population. We saw one form of expressing sampling error
in the confidence interval in Chapter 5. Some degree of sampling error will be associated

Figure 8.2   Normal Distribution Curve

[Bell-shaped curve with the mean, median, and mode coinciding at the exact midpoint]

Figure 8.3  Normal Distribution Curve and Percentage of Cases Between Selected Standard Deviation Points

[Bell-shaped curve divided at standard deviation points −3 to +3; areas under the curve from left to right: .13%, 2.15%, 13.59%, 34.13%, 34.13%, 13.59%, 2.15%, .13%. Horizontal arrows mark 95% of cases between −1.96 and +1.96 standard deviations and 99% between −2.58 and +2.58.]



with tests of statistical significance in inferential statistics and may affect the research-
er’s ability to confirm a hypothesis.
An estimate of the size of sampling error in any given sample is calculated by ref-
erence to a standard error, which establishes a theoretical relationship between sample
values and the population parameter. The formula for standard error is derived from a
theoretical approach that draws all possible samples of the same size from a population
and plots the mean value of the samples. The result is called the sampling distribution
of the mean (see Figure 8.4). It turns out that as the number of samples gets larger,
the mean of the sampling distribution will be the same as the population mean. Also,
the set of mean values will approximate a normal distribution. This is because each one
of the sample means will not be exactly the same as the population mean, and there
will be a spread of values around the population mean.
As we explained earlier, the variability of any given distribution can be estimated
by calculating the standard deviation. We can apply the same approach to the sampling
distribution of the mean. The variability of sample means around the population mean
in the sampling distribution of the mean can be estimated by calculating the standard
deviation of the sampling distribution. The standard deviation of the sampling distri-
bution is the standard error. This characteristic of the sampling distribution of the
mean is one of the basic statistical principles underlying statistical inference and is
referred to as the central limit theorem.
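The central limit theorem can be watched in action with a small simulation. In the sketch below (our own illustration, not data from the book’s cases), samples are drawn repeatedly from a decidedly non-normal population, a uniform distribution, yet the sample means cluster around the population mean with a spread equal to the standard error:

```python
# Simulating the sampling distribution of the mean.
# Population: uniform on [0, 1], so the population mean is 0.5 and the
# population standard deviation is sqrt(1/12), about 0.2887.
import math
import random
from statistics import mean, pstdev

random.seed(42)  # fixed seed so the illustration is reproducible

n = 30              # size of each sample
num_samples = 2000  # number of samples drawn

sample_means = [mean(random.random() for _ in range(n))
                for _ in range(num_samples)]

theoretical_se = math.sqrt(1 / 12) / math.sqrt(n)  # sigma / sqrt(n), ~0.0527

print(round(mean(sample_means), 3))    # close to 0.5, the population mean
print(round(pstdev(sample_means), 3))  # close to 0.053, the standard error
```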
The standard error is calculated by dividing the standard deviation of
the population by the square root of the sample size (the number of cases in the
sample drawn from the population):

Formula 8.1 Standard Error (Calculated With Standard Deviation of the Population)

σȲ = σY / √N

where

σȲ is the standard error of the sampling distribution
σY is the standard deviation of the population
N is the size of the sample drawn from the population

Of course, in most cases the actual population parameters are unknown, but
knowledge about the sampling distribution allows an estimate of the mean of the
population based on just one sample. When you have a sample with a known mean (X̄),
standard deviation (SD), and a sample size of n, the standard error σȲ can be estimated
with the following equation:

Formula 8.2 Standard Error (Calculated With Standard Deviation of a Sample)

σȲ = SD / √n

Figure 8.4    Schematic Illustration of Sampling Distribution

[Schematic: many samples (Sample 1 through Sample N) are drawn from the population distribution; the mean of each sample is plotted, forming the sampling distribution. The mean of the sampling distribution equals the population mean.]

Because the sampling distribution is a normal distribution, about 95% of the
sample mean estimates will fall within plus or minus 2 standard errors of the mean
(because the standard error is the standard deviation of the sampling distribution).
Therefore, we can say that we are 95% confident that the population mean is
between X̄ − 2×σȲ and X̄ + 2×σȲ (the sample mean plus or minus 2 standard errors of the
mean). This is the procedure for estimating sampling error and the basis for calculating
the probability (p-value) of getting any given result from a sample totally by chance.
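Formula 8.2 and the plus-or-minus-two-standard-errors interval can be applied to a single sample in a few lines. A sketch with made-up numbers (the data are hypothetical, chosen only to illustrate the arithmetic):

```python
# Estimating the standard error and a rough 95% interval for the population
# mean from one sample (Formula 8.2: standard error = SD / sqrt(n)).
import math
from statistics import mean, stdev

# Hypothetical sample of ten response times, in minutes
sample = [4.2, 5.1, 3.8, 4.9, 4.4, 5.3, 4.0, 4.7, 4.6, 5.0]

x_bar = mean(sample)              # sample mean
sd = stdev(sample)                # sample standard deviation
se = sd / math.sqrt(len(sample))  # standard error of the mean

low = x_bar - 2 * se   # "plus or minus 2 standard errors" rule of thumb
high = x_bar + 2 * se  # (1.96 is the exact multiplier for 95%)

print(round(x_bar, 2))  # 4.6
print(round(se, 3))     # 0.156
print(round(low, 2), round(high, 2))
```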

Summary of Hypothesis Testing Steps

The following steps summarize the discussion in this chapter on hypothesis testing with
statistical analysis. Further steps in the analysis will be developed in subsequent chapters.

1. State a null hypothesis. Once you have your research hypothesis (based on your
literature review), state your null hypothesis, negating the relationship you
hypothesized in your research hypothesis. The null hypothesis is a tentative state-
ment about the population that there is no relationship between X and Y or there
is no difference between X and Y. You can have multiple hypotheses in one study.
2. Set the level of significance you will use to reject the null hypothesis. The con-
vention is a p-value of .05. When multiple measurements are being made or
some other circumstance raises the possibility for sampling error, the researcher
may wish to set a more stringent p-value of .01 or .001.
3. Select the appropriate test statistics. In the following chapters of this book you
will learn what test is appropriate for what type of research question.
4. Compute the test statistic value and the associated p-value. In most of the sta-
tistical analyses you will be conducting, this calculation can be done by using a
statistical package. In this book, we will illustrate how to obtain test statistics
and the p-value using SPSS and Excel.
5. Examine the results of the statistical test and see if the p-value is below the set
level of significance. If the p‑value is below the set level of significance, reject
the null hypothesis and consider the research hypothesis to be supported by the
test results. If the p-value is above the set level of significance, do not reject the
null hypothesis and consider that the result could have happened by chance.
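To make the five steps concrete, here is the whole sequence run on made-up scores for a training group and a nontraining group. Since the book demonstrates these tests in SPSS and Excel, and the t-test itself is not introduced until Chapter 9, this sketch substitutes a simplified large-sample z-test computed by hand; the scores and group labels are hypothetical:

```python
# The five hypothesis-testing steps on hypothetical data, using a simplified
# z-test for the difference between two group means.
import math
from statistics import NormalDist, mean, stdev

# Hypothetical cultural competence scores (illustrative only)
trained   = [78, 82, 75, 88, 90, 84, 79, 85, 91, 83, 77, 86]
untrained = [72, 70, 80, 68, 74, 77, 71, 69, 75, 73, 78, 70]

# Step 1: H0: mean(trained) = mean(untrained); HR: the means differ.
# Step 2: set the level of significance.
alpha = 0.05

# Step 3: select a test. Here: z-test for a difference of means (a t-test,
# covered in Chapter 9, would be more appropriate for samples this small).
se_diff = math.sqrt(stdev(trained) ** 2 / len(trained)
                    + stdev(untrained) ** 2 / len(untrained))
z = (mean(trained) - mean(untrained)) / se_diff

# Step 4: compute the test statistic's two-tailed p-value.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Step 5: compare the p-value to the level of significance.
reject_null = p_value < alpha
print(round(z, 2), round(p_value, 4), reject_null)
```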

Errors and Risks in Hypothesis Testing

When you test your hypothesis with statistical analysis, you will make inferences about the
population of interest from the results obtained in your research sample. The decisions
you will make are either to reject the null hypothesis and conclude that the research
hypothesis applies to the population of interest, or to not reject the null hypothesis and
conclude that the research hypothesis does not apply to the population of interest.
As you can see in the two-by-two table (Table 8.1), there are four possible out-
comes of your decision. When you make a decision in your hypothesis testing, the
result can be correct in two ways and wrong in two ways.
Let’s start with how you can be wrong in your decision. One way that you can be
wrong is to reject the null hypothesis when it should not be rejected—or in other
words, finding significance when it does not exist. This type of wrong decision mak-
ing is called Type I Error. A Type I error is a false positive. As we discussed earlier, in
research, it is impossible to remove all likelihood of committing this error. Instead, we
set in advance the extent to which we are willing to take the risk of committing this
error. That is the level of significance. The Greek symbol used to indicate the level of
significance is α, and therefore, the level of significance is sometimes referred to as the
alpha level.

Table 8.1   Four Possible Outcomes in Hypothesis Testing

True condition in the population: the null hypothesis should not be rejected

•• Decision: do not reject the null hypothesis. You are correct in not rejecting the null
(true negative). The probability of correctly not rejecting the null hypothesis is 1 - α
(equivalent to the confidence level).
•• Decision: reject the null hypothesis. You made a mistake! This is a Type I error:
rejecting the null hypothesis when you should not (false positive). The risk of making
a Type I error is α (equivalent to the significance level).

True condition in the population: the null hypothesis should be rejected

•• Decision: do not reject the null hypothesis. You made a mistake! This is a Type II
error: the research hypothesis is true, but you decide to stick with the null hypothesis
(false negative). The risk of making a Type II error is β.
•• Decision: reject the null hypothesis. You are correct in rejecting the null hypothesis
and accepting the research hypothesis (true positive). The probability of being correct
is 1 - β (also called statistical power).
Another way that you can be wrong in your decision is not rejecting the null hypoth-
esis when in reality you should be rejecting it—or in other words, not finding
significance when it does exist. This is called Type II Error. A Type II error is a false
negative. The risk of making a Type II error is represented by the Greek symbol β (Beta).
Now let’s take a look at how you can be correct. You can be correct in not rejecting
the null hypothesis because in the population the null hypothesis is true. This is a true
negative situation. The probability of being correct in not rejecting the null hypothesis
is 1 - α. This is equivalent to the confidence level and indicates the probability of not
committing the Type I error.
You can also be correct in rejecting the null hypothesis because in the population
the research hypothesis is true. This is a true positive situation. The probability of
correctly rejecting the null hypothesis is 1 - β. This is also called power.
It is important to understand the types of errors and the risks of committing them in
hypothesis testing. A key factor that affects these risks is the sample size: at a given
level of significance, a larger sample reduces the risk of making a Type II error (that is,
it increases statistical power), while the risk of a Type I error remains fixed at the α
you set.
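The relationship among sample size, α, and β can also be seen by simulation. The sketch below, in Python with the numpy and scipy libraries, uses hypothetical normally distributed data; it is an illustration, not part of Emily's or Jim's analyses.

```python
# A small simulation of Type I and Type II error risks with hypothetical
# data. When the null hypothesis is true, the rejection rate stays near the
# alpha we set; when it is false, a larger sample lowers the Type II error
# rate (that is, it raises statistical power).
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
alpha, trials = 0.05, 2000

def rejection_rate(true_mean, n):
    """Fraction of simulated samples of size n, drawn from a normal
    distribution with the given true mean, that reject H0: mean = 5."""
    rejections = 0
    for _ in range(trials):
        sample = rng.normal(loc=true_mean, scale=1.0, size=n)
        _, p = stats.ttest_1samp(sample, popmean=5)
        if p < alpha:
            rejections += 1
    return rejections / trials

# H0 is true: the rejection rate is the Type I error rate, near alpha.
type1 = rejection_rate(true_mean=5.0, n=30)

# H0 is false (true mean 4.5): the rejection rate is the power, 1 - beta.
power_small = rejection_rate(true_mean=4.5, n=15)
power_large = rejection_rate(true_mean=4.5, n=60)

print(f"Type I rate: {type1:.3f}; power (n=15): {power_small:.3f}; "
      f"power (n=60): {power_large:.3f}")
```

Running this, the Type I rate hovers near .05 for any sample size, while power climbs substantially as the sample grows.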

Statistical Significance Versus Practical Significance

When there is a statistical analysis, there is a tendency to place emphasis on whether
the results were statistically significant. In Jim’s case, the first thing Jim asked Lavita
about the response-time data was if the mean time was significantly lower than the
national standard. Of course, detecting statistical significance is important to have
confidence in your results, but it needs to be emphasized that statistical significance is
not equal to the meaningfulness of the research.
As we discussed in this chapter, what statistical significance tells you is that the
result you obtained from the study is not a haphazard result obtained by chance, and
therefore, it is more likely that the result indicates a certain pattern that you can
observe in your population of interest. Nevertheless, just because the results of the
analysis are statistically significant does not mean they have practical value. Once you
find a statistically significant result, you then need to consider its implications in the
real world.
In Emily’s case, for example, suppose she finds in her survey results that the group
that attended her diversity training had a statistically significant increase in cultural
competence compared to the group that did not attend the training, but the difference
was only 0.1 points on a 5-point scale. What does this mean?
Further, is the difference meaningful enough for the city council of Westlawn to decide
it will allocate a budget to continue diversity training? In addition to the statistical
significance, you need to evaluate the implications of the magnitude of the effects and
relationships identified in your study results.
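One common way to express the magnitude of a difference is a standardized effect size such as Cohen's d. The sketch below, in Python, applies it to a hypothetical version of Emily's situation; all of the group statistics are made-up numbers for illustration.

```python
# Cohen's d for a hypothetical version of Emily's result: a 0.1-point
# difference on a 5-point scale. All group statistics are made-up numbers.
import math

mean_trained, sd_trained, n_trained = 3.6, 0.8, 40
mean_control, sd_control, n_control = 3.5, 0.8, 40

# Pooled standard deviation of the two groups.
pooled_sd = math.sqrt(((n_trained - 1) * sd_trained ** 2 +
                       (n_control - 1) * sd_control ** 2) /
                      (n_trained + n_control - 2))

# d expresses the difference in standard deviation units; by one common
# convention, values near 0.2 are "small," 0.5 "medium," and 0.8 "large."
d = (mean_trained - mean_control) / pooled_sd
print(f"Cohen's d = {d:.2f}")
```

With these assumed numbers, d is about 0.12, a small effect, which matches the intuition that a 0.1-point shift on a 5-point scale may not by itself justify a budget decision.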
The practical significance also needs to be examined when no statistically significant
results were found in your analysis. Sometimes a nonsignificant result provides
important information. In Emily’s case, for example, a nonsignificant result could
indicate she needs to examine why the training did not work. Is it because of the
trainer? The curriculum? The work environment? Or perhaps, despite the widely
accepted belief in the HR profession, training does not work to increase employees’
cultural competence? Emily might then consider ways to improve the training or other
strategies
to improve cultural competence. In this way, even when her research produces a negative
finding, it can help inform her future activities to improve cultural competence.
The key point here is that even though it is important for us to pay attention to
statistical significance in interpreting the results of hypothesis testing, we should not
lose sight of the implications and meaningfulness of the research results from a prac-
tical or policy perspective.
Chapter 8  Hypothesis Testing and Statistical Significance  ❖  169

Chapter Summary
In this chapter we introduced inferential statistics. When your data come from a sample
drawn from the population of interest, you need to determine whether the result you obtained
from your sample data can be applied to your population of interest. Inferential statistics are used
to make an inference about the population of interest based on the result you obtained from the
sample. Statistical significance is the criterion applied in determining whether the result can be
generalized to the population. This chapter explained the process of hypothesis testing and the
idea of a significance test. The decision researchers make based on the result of a hypothesis
test can be wrong. We explained two types of errors in making the decision in hypothesis
testing—Type I and Type II errors. As a researcher, you will also need to pay attention to the
difference between statistical significance and practical significance when interpreting the
results of the analysis.

Review and Discussion Questions


1. Take a look at the survey form Emily administered to the employees.
a. What are the possible extraneous variables that may affect the level of cultural competency
among the employees?
b. What are the possible extraneous variables that may affect the level of the perceived work-
place conflict?
2. Take an area of research that you are interested in and develop a null hypothesis as well as a
research hypothesis.
3. Describe the characteristics of a normal distribution.
4. Describe the difference between standard deviation and standard error.
5. What is the relationship between the p-value and committing a Type I Error?
6. What does it mean to say that your statistical analysis was statistically significant at the p-value
of .05? What does this really mean?
7. Find a research report in your specific field or discipline where the authors conduct a statistical
analysis and report statistically significant results. Discuss if their results are also practically
meaningful. What does this mean for practice in your field?

References
Babbie, E. R. (2013). The practice of social research. Belmont, CA: Wadsworth Cengage Learning.
Coolidge, F. L. (2013). Statistics: A gentle introduction (3rd ed.). Thousand Oaks, CA: Sage.
Faherty, V. E. (2008). Compassionate statistics: Applied quantitative analysis for social services. Thousand
Oaks, CA: Sage.
Fisher, R. A. (1935). The design of experiments. Edinburgh, UK: Oliver and Boyd.
Fisher, R. A., & Bennett, J. H. (1990). Statistical methods, experimental design, and scientific inference. Oxford,
UK: Oxford University Press.
Leedy, P. D., & Ormrod, J. E. (2010). Practical research: Planning and design. Upper Saddle River, NJ: Merrill.
O’Sullivan, E., Rassel, G. R., & Taliaferro, J. D. (2011). Practical research methods for nonprofit and public
administrators. Boston, MA: Longman.
Popper, K. R. (1959). The logic of scientific discovery. New York, NY: Basic Books.
Popper, K. R. (1962). Conjectures and refutations: The growth of scientific knowledge. New York, NY: Basic Books.
Salkind, N. J. (2011). Statistics for people who (think they) hate statistics. Los Angeles, CA: Sage.
Schutt, R. K. (2012). Investigating the social world: The process and practice of research. Thousand Oaks, CA: Sage.

Key Terms

Central Limit Theorem  164
Control Variables  156
Dependent Variable (DV) or Outcome Variable or Criterion Variable  155
Directional Research Hypothesis  159
Extraneous Variables  156
Falsifiability  159
Hypothesis  155
Hypothesis Testing  154
Independent Variable (IV)  155
Level of Significance (Alpha [α] level)  160
Nondirectional Research Hypothesis  159
Normal Distribution  162
Null Hypothesis  158
Power  168
P-Value  160
Research Hypothesis or Alternative Hypothesis  158
Sampling Distribution  164
Sampling Error  162
Significance Test  154
Standard Error  164
Statistical Probability  162
Statistically Significant  160
Type I Error  166
Type II Error  167

Student Study Site


Visit the Student Study Site at www.sagepub.com/nishishiba1e for these additional learning tools:

•• Data sets to accompany the exercises in the chapter


9  ❖  Comparing Means Between Two Groups

Learning Objectives 172
Comparing Two Groups 173
Emily’s Case 173
Jim’s Case 174
Types of Research Questions T-Tests Can Answer 174
Why Conduct T-Tests? 175
Background Story of the T-Test 175
One-Sample T-Test 175
Running One-Sample T-Test Using Software Programs 176
Independent Samples T-Test 178
Equality of Variance 179
Jim’s Case 179
Running Independent Samples T-Test Using SPSS 180
Independent Samples T-Test Using Excel 184
Jim’s Case 185
Paired Samples T-Test 186
Running Paired Samples T-Test Using SPSS 187
Running Paired Samples T-Test Using Excel 189
Chapter Summary 190


Review and Discussion Questions 191


Exercises 191
Key Terms 192
Figure 9.1 Menu Selections for One-Sample T-Test 177
Figure 9.2 Input for One-Sample T-Test 178
Figure 9.3 SPSS Output for One-Sample T-Test 178
Figure 9.4 Menu Selections for Independent Samples T-Test 181
Figure 9.5 Input Variables for Independent Samples T-Test 182
Figure 9.6 SPSS Output for Independent Samples T-Test 182
Figure 9.7 SPSS Output for Independent Samples T-Test. Group Statistics 182
Figure 9.8 SPSS Output for Independent Samples T-Test.
Independent Samples Test 183
Figure 9.9 Input Variables for Independent Samples T-Test in Excel 185
Figure 9.10 Excel Output for Independent Samples T-Test 185
Figure 9.11 Menu Selections for Paired Samples T-Test 187
Figure 9.12 Input Variables for Paired Samples T-Test in SPSS 188
Figure 9.13 SPSS Output for Paired Samples T-Test 188
Figure 9.14 Rockwood Fire Department Response Times for 2009–2011 189
Figure 9.15 Input Variables for Paired Samples T-Test in Excel 189
Figure 9.16 Excel Output for Paired Samples T-Test 190
Table 9.1 Summary of T-Tests 191


Learning Objectives

In this chapter you will

1. Learn about three types of t-tests: a one-sample t-test, an independent samples
t-test, and a paired samples t-test
2. Develop an understanding of the theoretical basis behind the use of each of the
three t-tests
3. Learn how to formulate appropriate hypotheses for each of the t-tests
4. Learn how to choose the appropriate t-test given the research question and data
5. Develop an understanding of the assumptions for t-tests
6. Learn how to perform a comparison of means using t-tests

Comparing Two Groups

Emily’s Case
Emily looked at Leo’s schematic drawing on the whiteboard showing
the hypothetical relationship in their research, with an independent
variable (diversity training), two dependent variables (cultural com-
petence and workplace conflict), and two extraneous variables (years
of education and previous diversity trainings attended). She felt she
understood the analysis. Then she remembered that the two depen-
dent variables, the outcomes they wanted to observe, combined sev-
eral survey questions. She wondered how that would work.
Emily turned to Leo and asked, “Can you remind me how you plan to measure cultural
competence and workplace conflict? I mean, I know we have a set of eight questions on
each topic, but how will you plug all those questions into the analysis?”
Leo thought a moment. He knew the items on the survey intimately after entering them
into a database, but he had to recall what the team decided when they set up the ques-
tions. Then he explained to Emily, “Every question for both measures uses a Likert scale for
the response. For the analysis, I can create a new variable that computes an average for the
combined values from the eight questions for cultural competence and another new vari-
able for workplace conflict the same way. So both of the new variables are continuous
variables. That is actually one thing I wanted to talk to you about today, before I go chang-
ing the database.”
“All right,” Emily said. “Of course you should do what you need to do. I would just say to
archive the master file so you have a backup in case you need to restore the original data.”
“I know,” Leo chuckled, “I will make sure I record the new variables in the codebook.
Thanks for the reminder. I was going to do that.”
Emily pursued the analysis plan. She wanted to be clear on the process. “Once you make
one variable for each outcome, what statistical test will you use to compare the groups?”
“We said we would use a t-test. We have a categorical independent variable—a group in
the training, a group not in the training—and a continuous dependent variable. Pretty
easy,” Leo answered.
Emily smirked.
“I’m not sure yet how to control for the extraneous variables we just decided to put into the
analysis,” Leo continued. “I’ll have to look into that. We don’t have to do that on the first round
anyway. I’ll just run an independent samples t-test and bring the results to our next meeting.”
Jim’s Case
When Jim saw Lavita next, she showed him another data printout from her
statistics package, this time with the numbers for him to notice already circled
with a bright orange marker. “I ran a one-sample t-test on the response-time
data for 2011,” she told him, “because you wanted to know if the average time
of 4.55 minutes is significantly better than the national standard of 5 minutes.
I randomly sampled 33 calls during 2011 from each one of the stations and
used those samples for the analysis. In that way, I can meet the data assumption
for a one-sample t-test. . . . Well, that’s probably too much information for
you—But anyway, on this printout you can see the average time and the com-
parison to the standard time,” she pointed to a set of orange circles, “and here you can see
the p-value is less than .05, which means, yes, you can say your average response time is
significantly lower than the national standard.”
Jim smiled at Lavita gratefully. She looked at him seriously.
“It’s up to you whether you think that has any practical significance. I don’t know what
it means to have .45 minutes lower than the national standard response time.”
“Of course.” Jim responded. He sensed that Lavita wanted something more challenging.
He thought of his second project on the alternative service delivery model.
“Lavita, do you remember we talked about another project? We are doing an experiment
to see if sending a physician’s assistant to the scene first without sending the engine and
firefighters is a viable alternative service delivery model that will save more lives and reduce
cost for the department.”
Lavita nodded. Ty, her graduate advisor, also talked to her about the project. This is the
one that prompted her to volunteer.
Jim continued, “We have been using this alternative service delivery model for four of our
stations and have been collecting data. The chief and the city council members now want
to see some results to consider if the alternative delivery model is any better than the tra-
ditional model. Can you take a look at the data and see what we have?”
Lavita replied with renewed enthusiasm, “Sure! That sounds like fun. What do you have?”

Types of Research Questions T-Tests Can Answer

T-tests are the statistical tests you can use when you have a research question that
requires a comparison of two means. This requires a dependent variable that is a
continuous measure. There are three different types of t-tests, based on the type of
groups you are comparing: a one-sample t-test, an independent samples t-test, and a
paired samples t-test. We will discuss the underlying assumptions of the t-tests in
applied examples in subsequent sections.
The one-sample t-test is used when you have only one sample, and you are com-
paring its mean to some other set value. In Jim’s case, he has data from one sample, the
Rockwood Fire Department, and he wants to compare its mean response time with the
national standard. The national standard can be conceptualized as the ideal response
mean time based on the population of fire departments nationwide.
The independent samples t-test is used when you have two groups in your sample
that are independent from each other, and you would like to compare their means to
see if they are significantly different. Emily’s case illustrates this research design. She
wants to compare her experimental and control groups to see if her diversity training
increased the level of cultural competence among employees who attended the train-
ing. Her measure of cultural competence is a continuous variable, based on values
applied to the response categories in a Likert scale.
The paired samples t-test is used when you want to compare the means of two
groups that are closely related or matched or when one group is measured twice. For
example, in Jim’s case, he has response-time data for the years 2009, 2010, and 2011. If
he decides to chart the trend of response times from year to year, he will be comparing
results from the same eight stations each year. He will be comparing groups of data that
are related. The paired samples t-test will be the appropriate test to determine if any
statistically significant changes in response time occurred from year to year.

Why Conduct T-Tests?


You might wonder why we can’t just look at the means of the two groups and figure out
if they are different or similar. In Jim’s case, for example, he found that the average
response time of the Rockwood Fire Department’s eight stations was 4.55 minutes in
2011, so it was lower than the national standard of 5 minutes. However, what Jim cannot
tell from just observing the descriptive statistics is if the difference is statistically signifi-
cantly lower than the national standard. A t-test evaluates the variance in the values that
make up the mean and determines whether or not 4.55 is simply due to chance, taking
this one sample at this one time. A statistically significant difference indicates that a new
sample is likely to show a result that is also below the national standard of 5 minutes.

Background Story of the T-Test


Sometimes the t-test is referred to as the Student’s t-test. This is because William
Gosset, an employee of the Guinness Brewery in the early 20th century, developed the
use of the t-statistic as part of his work in improving industrial operations at the brew-
ery (the t-test was actually developed as a way to monitor the quality of Guinness’
world famous stout beer). The Guinness brewery viewed the development of this sta-
tistical test as a trade secret and did not permit Gosset to publish his work. To publish
his work and protect his identity, he published under the nom de plume of “student.”
Hence, the test commonly became referred to as the Student’s t-test. (For more infor-
mation, see Salsburg, 2001.) To this day, the t-test, in its various forms, is one of the
most widely used statistical tests (Vercruyssen & Hendrick, 2012).

One-Sample T-Test

As described earlier, the one-sample t-test is used when the researcher wants to
compare the mean of a single sample against a specified value. In the context of public
administration and nonprofit management, this specified value might be a benchmark
or some other performance standard (Wang, 2010), as we saw in Jim’s case. Before
describing how to conduct the t-test, it is important to discuss the assumptions behind
the one-sample t-test. All inferential statistical tests have assumptions that must be met
to correctly use and interpret the results of the test. There are three primary
assumptions of the one-sample t-test:

1. The variable from which the mean is calculated (which is a dependent variable or
outcome variable) must be a continuous measure, representing either an interval or
ratio level of measurement. As we discussed in Chapter 7, it makes little sense to
compute a mean for a categorical variable, such as gender. In the database, attributes
for the categories such as male and female may be assigned a value (e.g. male=0,
female=1), but these values are arbitrary and the mean will not refer to anything
that is being measured. In the case of the labels 0 and 1 for a dichotomous variable,
the mean will represent the proportion of the category labeled 1 in the group.
2. The observations in the sample must be independent of one another.
3. The variable from which the mean is calculated must be normally distributed.
(We discussed normal distribution in Chapters 7 and 8.) This assumption is
frequently violated as a great deal of social data often exhibits some degree of
skewness or kurtosis. Moore, McCabe, and Craig (2010) offer some guiding
thoughts when the distribution deviates from normality:
•• If there is a sample size of less than 15, then the one-sample t-test is not an
appropriate choice as outliers heavily influence the data.
•• If the sample size is at least 15, then the one-sample t-test will be fairly
robust against the normality violation.
•• If the sample is at least 40, the researcher need not worry about deviations
from the normal distribution, as the test is robust.
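For readers working outside SPSS, one way to examine the normality assumption is the Shapiro-Wilk test. The sketch below uses Python's scipy library with hypothetical response times; it is an illustration, not the procedure Lavita follows in the text.

```python
# Checking the normality assumption with the Shapiro-Wilk test from scipy.
# The response times below are hypothetical.
from scipy import stats

response_times = [4.2, 5.6, 4.9, 5.1, 4.4, 4.8, 5.3, 4.6, 5.0, 4.7,
                  4.3, 5.2, 4.5, 5.4, 4.9, 4.6, 5.1, 4.8]

stat, p = stats.shapiro(response_times)

# A p-value above .05 gives no evidence against normality; a p-value below
# .05 suggests a deviation, in which case the sample-size guidelines above
# determine whether the t-test is still a reasonable choice.
print(f"Shapiro-Wilk W = {stat:.3f}, p = {p:.3f}")
```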

Running One-Sample T-Test Using Software Programs


In the following example, we use SPSS to reproduce the one-sample t-test described in
Jim’s case, run by Lavita to answer Jim’s question if the Rockwood Fire Department’s
mean response time of 4.55 minutes in 2011 was significantly lower than the national
standard of 5 minutes. In SPSS, you can run a one-sample t-test from the drop-down
menus. If you choose to use Microsoft Excel, note that the program has no built-in
menu option for a one-sample t-test; you will need to enter the one-sample t-test
formula in a cell and use the TDIST function to obtain the p-value.
Jim is hypothesizing that his fire department’s response time is not the same as the
5-minute standard. His null hypothesis and research hypothesis can be described as
follows:
H0: Rockwood Fire Department response time = 5 minutes
HR: Rockwood Fire Department response time ≠ 5 minutes
Figure 9.1   Menu Selections for One-Sample T-Test

To test the comparison of the Rockwood Fire Department’s mean response time
for 2011 to the national standard, complete the following steps:

1. Open the data file Rockwood 2011 Response Time.sav


2. Select Analyze → Compare Means → One-Sample T Test.
3. Select yr_11 as the test variable and specify 5 as the test value (this is the 5-minute
standard that we are comparing against), then click OK.

Figure 9.3 presents the output that you should obtain after running the procedure
above. The box labeled One-Sample Statistics provides the descriptive statistics. The
sample mean for the response time in 2011 is 4.55 minutes (rounded), with a standard
deviation of 1.46 minutes. The box labeled One-Sample Test reports a t value of −5.03.
Under the column Sig. (2-tailed), you see the p-value. In this output, it says .000. Note
that when SPSS output indicates the p-value is .000, it means the p-value is smaller
than .0005, rounded to three decimal places (you cannot have a p-value of exactly
zero). The output’s p-value is lower than the significance level α = .05, and thus the
null hypothesis is rejected and the research hypothesis is supported. Therefore, the
one-sample t-test using the sampled response times from 2011 gives Jim a foundation
to state that in 2011 the fire department’s response time, on average, was statistically
significantly below the 5-minute national standard.
Figure 9.2   Input for One-Sample T-Test

Figure 9.3   SPSS Output for One-Sample T-Test
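The same test can be reproduced outside SPSS. The sketch below, in Python with the numpy and scipy libraries, simulates a sample of response times matching the reported mean and standard deviation, since the actual Rockwood data file is not reproduced here.

```python
# Reproducing the logic of the one-sample t-test in Python. The actual
# Rockwood data file is not reproduced here, so we simulate 264 response
# times (33 calls from each of the 8 stations) from a normal distribution
# with the reported mean (4.55 minutes) and standard deviation (1.46).
import numpy as np
from scipy import stats

rng = np.random.default_rng(2011)
response_times = rng.normal(loc=4.55, scale=1.46, size=264)

# H0: mean response time = 5;  HR: mean response time != 5.
t_stat, p_value = stats.ttest_1samp(response_times, popmean=5)

print(f"t = {t_stat:.2f}, two-tailed p = {p_value:.4f}")
if p_value < 0.05:
    print("The mean differs significantly from the 5-minute standard.")
```

Because the simulated sample shares the reported mean, standard deviation, and size, the resulting t value lands close to the −5.03 shown in the SPSS output.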

Independent Samples T-Test

The independent samples t-test, also known as the two-samples t-test, evaluates
whether the means of two samples are different from one another. The following
assumptions apply to the independent samples t-test:

1. The variable from which the mean is calculated (which is a dependent variable
or outcome variable) must be a continuous measure, representing either an
interval or ratio level of measurement.
2. The independent variable (or grouping variable) must be dichotomous.
3. The dependent variable must be normally distributed. Again, as indicated
above for the one-sample t-test, once a single sample moves beyond 40 cases,
the t-test becomes fairly robust against this violation (Lehman, 1999).
4. Observations between the two groups must be independent of each other. In
other words, the data from one group cannot have some dependency or rela-
tionship to the data from the other group.
5. The variances for the two populations are equal.

Equality of Variance
The fifth assumption, that the variances between the two groups should be equal, is
also called the assumption of homogeneity of variance. When there is a departure from
this assumption, the variances are then considered heterogeneous (heterogeneity of
variances). Why does this matter? Because of the way the test statistic is calculated, it
can be influenced by the variance of each group, which in turn may affect the p-value
and, therefore, your interpretation of the results. To see if the population variances for
the two groups you are comparing in your analysis are equal, you should conduct
Levene’s test. If the result of the Levene’s test is significant, you can conclude that there
is a statistically significant difference in the population variances between the two
groups you are comparing in the analysis, and therefore, the assumption of homogene-
ity of variance is violated. In SPSS, you can obtain the t-value adjusted for the unequal
variances between the two groups. If the result of the Levene’s test is not significant,
then the assumption of homogeneity of variance is met, and you can report the
unadjusted t-value.
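The two-step logic described above (Levene's test first, then the plain or adjusted t-test) can be sketched in Python with the scipy library; the two groups of scores below are hypothetical.

```python
# Levene's test for homogeneity of variance, followed by the matching
# independent samples t-test, using Python's scipy library. The two groups
# of scores below are hypothetical.
from scipy import stats

group_a = [3.1, 3.4, 2.9, 3.8, 3.3, 3.6, 3.0, 3.5]
group_b = [4.0, 4.3, 3.9, 4.5, 4.1, 4.4, 3.8, 4.2]

# Step 1: Levene's test. A nonsignificant result means the homogeneity of
# variance assumption is met.
_, levene_p = stats.levene(group_a, group_b)

# Step 2: if the assumption is met, use the ordinary (pooled) t-test;
# otherwise use the adjusted test (equal_var=False, Welch's t-test).
equal_var = levene_p >= 0.05
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=equal_var)

print(f"Levene p = {levene_p:.3f}; t = {t_stat:.2f}, p = {p_value:.4f}")
```

This mirrors what SPSS prints in a single table: the Levene result determines which row of the t-test output you read.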
Experimental design studies with data from one experimental group and one con-
trol group can be analyzed using an independent samples t-test. Emily’s case with her
diversity training represents this kind of research design. Jim’s case with his test for the
alternative delivery model also represents this kind of experimental design. Let’s see
how Lavita proceeds with Jim’s data testing the alternative model.

Jim’s Case
Lavita left Jim’s office with the data for the alternative service delivery
study on her thumb drive. On her computer at school, she opened the
file and found the data set labeled “Rockwood Mortality and Cost.sav.”
This data set included information for the eight stations. In the
“Data View” of SPSS, each row represented a station, and the columns
provided information on the following: (1) type of service delivery
(with 1 indicating that the station adopted the alternative service
model, and 0 indicating that the station maintained the traditional
service model), (2) mortality rate for each station for fiscal years
2010 and 2011, and (3) average cost per emergency runs for each station for fiscal years
2010 and 2011 (entered as thousand dollar amounts). Based on the original data set,
Lavita created two new variables by calculating: (1) raw gains for mortality rate by sub-
tracting the mortality rate of 2010 from that of 2011, and (2) raw gains for cost by sub-
tracting the average cost of emergency runs in 2010 from that of 2011. Since the
implementation of the alternative service delivery started at the beginning of 2011, 2010
data can be considered as pretest data, and the 2011 data can be considered as posttest
data. By subtracting the pretest scores (2010 data) from the posttest scores (2011 data),
Lavita can examine how much “gain” there was after the intervention was introduced.
When Lavita examined the data set, she noticed that most of the values are negative. “Well,
this is good!” she said to herself, “This means overall, there was reduction in mortality and
cost across the department.” But Lavita knows that Jim needs to determine whether there
is a statistical difference between the four stations that used the alternative service delivery
and the four stations that stuck to business as usual. Lavita thought it would be a good
practice for her to write down the null hypothesis and the research hypothesis, so it would
be clear in her mind. Ty, her graduate advisor, told her to do that. She pulled out a notepad.
“I have two sets of hypotheses. First, for the mortality rate—” She wrote down the
null hypothesis and the research hypothesis.

H0: Mean raw mortality gain scores for traditional delivery stations = Mean raw mortality
gain scores for alternative delivery stations.
HR: Mean raw mortality gain scores for traditional delivery stations ≠ Mean raw mortality
gain scores for alternative delivery stations.

She then started writing another set of hypotheses. “OK, then for the cost—”

H0: Mean raw cost gain scores for traditional delivery stations = Mean raw cost gain scores
for alternative delivery stations.

HR: Mean raw cost gain scores for traditional delivery stations ≠ Mean raw cost gain scores
for alternative delivery stations.

With these sets of hypotheses in mind, Lavita started running the analysis.

Running Independent Samples T-Test Using SPSS


To run the independent samples t-tests using SPSS for the alternative service model
mortality and cost evaluation, Lavita will be doing the following:

1. Open the data set Rockwood Mortality & Cost.sav.


2. Click Analyze → Compare Means → Independent-Samples T Test.
3. Move the variables Mort_Gain and Cost_Gain into the test variables box.
4. Move the variable Delivery into the Grouping Variable box.
5. Since the grouping variable may have more than two attributes, you must click
the Define Groups button below the Grouping Variable box.
Figure 9.4   Menu Selections for Independent Samples T-Test

6. For Group 1 enter 0, and for Group 2 enter 1 (this refers to the SPSS coding for
each of these levels of the grouping variable), then click Continue.
7. Click OK to obtain the output.

Figure 9.6 shows the output Lavita got after running the independent samples
t-test with SPSS.
SPSS will produce two tables in the output. The first box, labeled Group
Statistics, provides means and standard deviations for the two dependent variables
(i.e., Mortality Raw Gain and Cost Raw Gain) for the stations that kept the
traditional service model (Traditional) and for the stations that adopted the
alternative service model (Alternative), respectively (Figure 9.7). Just by examining the
descriptive statistics, Lavita could see that at the stations where the alternative ser-
vice delivery model was used, mortality rates were reduced at a much higher rate
than at the traditional service delivery stations. Also, the cost per call at
the alternative delivery stations declined more than at the stations that maintained a
traditional service model.
The table labeled Independent Samples Test provides the results of the two inde-
pendent samples t-tests Lavita conducted (Figure 9.8). The result of the first indepen-
dent samples t-test she conducted examines if there is a significant difference in the
mortality rate between the traditional and alternative service delivery model. The
result of this independent samples t-test appears in the first row of the table labeled
Mortality Raw Gain. The second independent samples t-test she conducted examines
if there is a significant difference in the cost per call between the traditional and alter-
native service delivery model. The result of this independent samples t-test appears in
the second row of the table labeled Cost Raw Gain.
182  ❖  SECTION II  DATA ANALYSIS

Figure 9.5    Input Variables for Independent Samples T-Test

Figure 9.6   SPSS Output for Independent Samples T-Test

Figure 9.7   SPSS Output for Independent Samples T-Test. Group Statistics

Figure 9.8   SPSS Output for Independent Samples T-Test

Because SPSS conducts Levene's test to examine if the assumption of homogeneity
of variance is met, Lavita should first examine the column that has the heading
Levene's Test for Equality of Variances. The column labeled Sig. indicates whether
the result of Levene's test is significant. When the
p-value for Levene’s test is below .05, then Levene’s test is significant. This means
there is a significant difference in the variance between the two groups in the
population, and the assumption of homogeneity of variance is not met. In this case, the
t-value should be reported from the row marked as Equal variances not assumed. On
the other hand, when the p-value for Levene’s test is larger than .05, then Levene’s
test is not significant. This means there is no significant difference in the variance
between the two groups in the population, and the assumption of homogeneity of
variance is met. In this case, the t-value should be reported from the row marked as
Equal variances assumed.
The p-value for the Levene’s test for Mortality Raw Gain is .897 and is well above
the standard significance level of .05. This means that she can assume equal variance
between the alternative service model stations and the traditional service model sta-
tions in the change in mortality rate. However, the p-value for the Levene’s test for Cost
Raw Gain is .031. This is below the standard significance level of .05. This means she
cannot assume equal variance between the alternative service model stations and the
traditional service model stations in the change in the cost. Although this violates
an assumption of the independent samples t-test, there is a way to adjust the test so
that it takes the difference in variance into account when calculating the t-statistic,
and SPSS provides this adjusted t-statistic. For the Cost Raw Gain, therefore, Lavita
interprets the t-statistic reported in the row marked Equal variances not assumed.
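The decision rule described above can be written out explicitly. The sketch below is illustrative only (the function name is ours); the two p-values are the ones reported in Lavita's output.

```python
def t_test_row(levene_p, alpha=0.05):
    """Pick which SPSS output row to report, based on Levene's test p-value."""
    if levene_p < alpha:
        # Variances differ significantly: homogeneity of variance is not met
        return "Equal variances not assumed"
    # Levene's test is not significant: equal variances can be assumed
    return "Equal variances assumed"

print(t_test_row(0.897))  # Mortality Raw Gain
print(t_test_row(0.031))  # Cost Raw Gain
```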

Based on the results of the Levene’s tests, Lavita can identify the right t-statistics
for the independent samples t-tests. The t-statistic for the Mortality Raw Gain is
t = .468, and the associated p-value, marked as sig., is p = .656. Since the p-value is
well above the significance level of .05, Lavita fails to reject the null hypothesis. In
other words, there is no statistically significant difference in the mean change
in the mortality rate between traditional and alternative service model stations.
Even though the descriptive statistics show an obvious difference between the two
service delivery models, the difference is not statistically significant and cannot be
generalized.
On the other hand, the result of the independent samples t-test for the Cost Raw
Gain shows t = 5.867, and the associated p-value = .009. Since the p-value is below
the significance level of .05, Lavita rejects the null hypothesis. In other words, there
is a significant difference in the mean change in the per call cost between traditional
and alternative service model stations. By examining the descriptive statistics,
Lavita can conclude that the cost savings were greater for the alternative service
delivery model.

Independent Samples T-Test Using Excel


You can also run an independent samples t-test by using Microsoft Excel. Here are the
steps:

1. After opening Rockwood Mortality & Cost.xlsx, click on the Data tab near the
top of the screen and select Data Analysis.
2. You will see the option for three different t-tests.
3. Excel does not allow you to test for homogeneity of variances; therefore, you
must make an assumption. If you are unsure whether your variances are equal,
it is always better to be a little more conservative and use the assumption of
unequal variances. In this case, choose t-test: Two-Sample Assuming Unequal
Variances.
4. To test for a difference in the mortality raw gain, fill in the data as shown in the
box below.

Variable 1 Range contains cells G3 through G6. These represent the raw mortality
gain scores for the alternative delivery service group. Variable 2 Range is the range
you are comparing Variable 1 against; in this case, cells G7 through G10 represent the
raw mortality gain scores for the traditional service delivery group. Clicking on
Output Range allows you to place the output within the worksheet.
Once you click OK, you will obtain the following results.
The results are the same as presented in SPSS. To obtain the p-value, we move
down to the row that is labeled, P(T<=t) two-tail.

Figure 9.9   Input Variables for Independent Samples T-Test in Excel

Figure 9.10   Excel Output for Independent Samples T-Test
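As a cross-check outside SPSS and Excel, the independent samples t-statistic can also be computed directly from its definition: the difference in means divided by its standard error. The sketch below uses only the Python standard library; the gain scores are hypothetical values for illustration, not the actual Rockwood data, and the unequal-variance form matches Excel's "Assuming Unequal Variances" option.

```python
import statistics

def independent_t(group1, group2):
    """Independent samples t-statistic, not assuming equal variances:
    difference in means divided by its standard error."""
    m1, m2 = statistics.mean(group1), statistics.mean(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    se = (v1 / len(group1) + v2 / len(group2)) ** 0.5
    return (m1 - m2) / se

# Hypothetical raw gain scores (change from baseline) for four stations each
alternative = [-2.1, -1.8, -2.5, -2.0]  # alternative service delivery
traditional = [-0.4, -0.9, -0.2, -0.6]  # traditional service delivery

print(round(independent_t(alternative, traditional), 2))
```

With unequal variances the degrees of freedom are also adjusted (the Welch-Satterthwaite correction), which is what SPSS reports in the Equal variances not assumed row.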

Jim’s Case
The next day, Lavita visited Jim’s office.
“What’s the result of the analysis for the alternative service
model study?” Jim asked. “Well, sort of mixed,” Lavita replied.
“While there was no statistical difference in mortality rates, there

were certainly significant and meaningful cost savings for using the alternative service
delivery model.”
She laid out the SPSS output and went through the analysis with Jim. After listening,
Jim said, “In an operational sense, this means that Rockwood could change its service
delivery model and save money, while not having any adverse impact on human life. While it
would have been nice if the alternative service model would save more lives, the outcome
of the experiment is still interesting. I’m sure the chief and the city council members will
be happy to see the results.”

Paired Samples T-Test

If you have a study where data were collected from the same group twice, you have a
repeated measures design. In the repeated measures design with the data collected
twice, the paired samples t-test is the appropriate statistical test to compare the means
of the data from the first time (time 1) and the second time (time 2). In this situation,
since the data are collected from the same group, it is unreasonable to assume that the
two data sets, one from time 1 and another from time 2, are totally independent; the
data must be somewhat related to each other. For example, Jim has response-time data
from all eight stations in the Rockwood Fire Department on an annual basis. If he
wants to see if the response time changed significantly between 2010 and 2011, he can
conduct a paired sample t-test to find out the answer to that question.
Another situation where the paired samples t-test is appropriate is when you have
pairs of people assessed once on the same measure. This is called a matched subjects
design. Hypothetically, if Mary has married couples volunteering for Health First, and
she wants to see whether the level of satisfaction with the volunteering activities differs
between husbands and wives, she has a matched subjects design.
The following assumptions apply to the paired samples t-test:

1. The variable from which the mean is calculated (which is a dependent variable
or outcome variable) must be a continuous measure, representing either an
interval or ratio level of measurement.
2. The independent variable is a pair of two conditions that the data represent
([Time 1]-[Time 2] pair, or husband-wife pair).
3. The difference score in the dependent variable between the two conditions
must be normally distributed in the population. However, with a sample size of
30 pairs or above, the paired samples t-test becomes fairly robust against this
violation (Green & Salkind, 2010; Lehman, 1999).
4. The difference score in the dependent variable between the two conditions
must be independent of each other. In other words, the data from one pair can-
not have some dependency or relationship to the data from another pair (Green &
Salkind, 2010).

In conducting the paired samples t-test, the null hypothesis and the research
hypothesis for Jim’s case, where he compares the response times between 2010 and
2011, will be expressed as follows:

H0 : Mean response time by stations in 2010 = Mean response time by stations in 2011.
HR: Mean response time by stations in 2010 ≠ Mean response time by stations in 2011.

Running Paired Samples T-Test Using SPSS


In Jim’s example of comparing the response times between 2010 and 2011, the follow-
ing steps are performed in SPSS:

1. Open the data set response time by station year.sav.


2. Click Analyze → Compare Means → Paired Samples T-Test.
3. Click on the variable year11, then using the arrow button in the middle of the
window, move it over to Variable 1.
4. Repeat step 3 using the variable year10, moving it to Variable 2.
5. Click OK.

The output for the paired samples t-test is shown in Figure 9.13. The first table in
the output that is labeled Paired Samples Statistics provides descriptive statistics for
each group for the paired samples. In Jim’s case, the paired groups are the response

Figure 9.11   Menu Selections for Paired Samples T-Test



Figure 9.12    Input Variables for Paired Samples T-Test in SPSS

time of each station in 2010 and the response time of each station in 2011. It appears
that the mean response time in 2011 is slightly higher than in 2010. To find out
whether this difference is statistically significant, we move down to the next two tables.
The table labeled as Paired Samples Correlations simply provides us with a coefficient
that examines the strength of the relationship between the two pairs. (We will explain
correlation more in detail in Chapter 10.) The next table, labeled Paired Samples Test,
provides the information in which Jim is most interested. The first column, labeled
Mean, provides the difference in means between the two years: a difference of .18
minutes (rounded) between the mean response times in 2010 and 2011. Moving
farther to the right, you can see that t = .761 and the p-value = .472. Since the p-value
is well above the significance level α = .05, the null hypothesis will not be rejected. In
other words, there is no statistical difference between response time in 2010 and 2011.

Figure 9.13   SPSS Output for Paired Samples T-Test



Figure 9.14   Rockwood Fire Department Response Times for 2009–2011

Running Paired Samples T-Test Using Excel


You can also run a paired samples t-test using Microsoft Excel. On tab 1 of the file
Rockwood Response times.xlsx, the data appear as follows:

To run the analysis, follow the steps below:

1. Click Data → Data Analysis.


2. Click t-test: paired two sample for means, then click OK.

Figure 9.15   Input Variables for Paired Samples T-Test in Excel



Figure 9.16   Excel Output for Paired Samples T-Test

3. Activate the Variable 1 range box by clicking inside of it.


4. Highlight cells C4 through C12; this will automatically transfer over into the
variable 1 range box.
5. Repeat Step 4 in the Variable 2 range box, highlighting cells D4 through D12.
6. Since cells C4 and D4 contain a title for the column, make sure that the labels
icon is checked.
7. Select where you would like the output placed, and click OK.

When interpreting the Excel output, look for the P(T<=t) two-tail row. This pro-
vides the p-value for the paired samples t-test.
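The paired samples t-statistic itself is simple to compute by hand: it is the mean of the difference scores divided by the standard error of those differences. The sketch below, using only the Python standard library, illustrates the calculation; the response times are made-up values for illustration, not the actual Rockwood data.

```python
import math
import statistics

def paired_t(time1, time2):
    """Paired samples t-statistic: mean difference score over its standard error."""
    diffs = [b - a for a, b in zip(time1, time2)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))

# Hypothetical mean response times (minutes) for eight stations in two years
year2010 = [4.2, 5.1, 4.8, 5.5, 4.9, 5.0, 4.6, 5.2]
year2011 = [4.4, 5.0, 5.1, 5.4, 5.1, 5.2, 4.7, 5.3]

print(round(paired_t(year2010, year2011), 2))
```

With n pairs, the statistic is compared against the t-distribution with n − 1 degrees of freedom; here df = 7.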

Chapter Summary
This chapter explained t-tests as the statistical approach to use when comparing two groups. Three
different types of t-tests were explained: one-sample t-test, independent samples t-test, and paired
samples t-test. Table 9.1 summarizes the purpose and conditions for each one of the t-tests. The
t-test is one of the most frequently used statistical tools across all disciplines. In its three forms,
the t-test provides the researcher the ability to answer a variety of research questions that involve
a continuous measure as a dependent variable: a sample mean can be compared to a benchmark
or other single value, two group means can be compared, and the means can be compared from a
single group measured twice. As an instrument of analysis, the t-test can make inferences about
differences in the population of interest that simple descriptive statistics cannot tell us. Thank you
Mr. Gosset!

Table 9.1  Summary of T-Tests

One-sample t-test. Purpose: compare the mean of a sample with another value (the test value). IV: (NA). Number of groups: (NA). DV: 1, continuous.

Independent samples t-test. Purpose: compare the means of two independent groups. IV: 1, dichotomous. Number of groups: 2. DV: 1, continuous.

Paired samples t-test. Purpose: compare the means of two related groups. IV: 1 (one pair of variables, 2 conditions). Number of groups: 2 (conditions). DV: 1, continuous.

Review and Discussion Questions


1. Consider how Emily can compare the differences in cultural competence and workplace con-
flict between those who attended the diversity training and those who did not, and identify the
impact of the diversity training. Conduct the analysis, and interpret the results.
2. What is the primary data requirement difference for the paired samples and independent sam-
ples t-tests? Be as specific as possible.
3. Find a research article that utilizes one of the t-tests. After reviewing the article, describe the
justification for using the t-test.
4. When comparing means, why is it important to perform a t-test rather than just looking at the
differences in the descriptive statistics? How would you explain this to a policymaker?
5. Describe what heterogeneity of variance is and how it applies to the t-test.
6. Identify different types of research questions where it would be appropriate to use a one-sample
t-test, an independent samples t-test, and a paired samples t‑test.

Exercises
1. Emily wants to find out if the cultural competence of the training participants changed before
and after the training. Run an appropriate test and report the result.
2. Jim wants to find out if the response times in 2010 and 2011 are significantly different.
Run an appropriate test and report the result.
3. Emily wants to identify if the levels of cultural competence of men and women (prior to
the training) are significantly different. Run an appropriate test and report the result.

References
Green, S. B., & Salkind, N. J. (2010). Using SPSS for Windows and Macintosh: Analyzing and understanding
data (6th ed.). Upper Saddle River, NJ: Prentice Hall.
Lehman, E. (1999). Elements of large sample theory. New York, NY: Springer-Verlag.
Moore, D., McCabe, G., & Craig, B. (2010). Introduction to the practice of statistics (7th ed.). New York, NY:
Freeman.
Salsburg, D. (2001). The lady tasting tea: How statistics revolutionized science in the twentieth century. New
York, NY: Freeman.
Sheshkin, D. (2004). Handbook of parametric and nonparametric statistical procedures. Boca Raton, FL: CRC.
Vercruyssen, M., & Hendrick, H. (2012). Behavioral research and analysis: An introduction to statistics within
the context of experimental design (4th ed.). Boca Raton, FL: CRC.
Wang, X. (2010). Performance analysis for public and nonprofit organizations. Sudbury, MA: Jones and
Bartlett.

Key Terms

Independent Samples T-Test  175
Levene's Test  179
Matched Subjects Design  186
One-Sample T-Test  174
Output Range  184
Paired Samples T-Test  175
Repeated Measures Design  186
Robust  176

Student Study Site


Visit the Student Study Site at www.sagepub.com/nishishiba1e for these additional learning tools:

•• Data sets to accompany the exercises in the chapter


•• Result write-ups
10 ❖ Comparing Means of More Than Two Groups

Analysis of Variance (ANOVA)


Learning Objectives 195
Comparing More Than Two Groups 195
Emily’s Case 195
Jim’s Case 196
Introduction to ANOVA 196
Two Types of ANOVA 196
Why Conduct ANOVA? 197
Understanding F-Statistic 197
What ANOVA Tells Us 199
Post Hoc Tests 200
Effect Size: Eta Squared 201
One-Way ANOVA 201
Note on Sample Sizes for the One-Way ANOVA 202
Running One-Way ANOVA Using SPSS 203


Running One-Way ANOVA Using Excel 209


Side Note: Omnibus Test Is Significant but Post Hoc Test Is Not Significant 209
Repeated Measures ANOVA 210
Running Repeated Measures ANOVA Using SPSS 211
Running Repeated Measures ANOVA Using Excel 215
Other Types of ANOVA 217
Factorial ANOVA 217
Mixed Design ANOVA 217
Chapter Summary 218
Review and Discussion Questions and Exercises 219
Key Terms 220
Figure 10.1 Illustration of Variability Between Groups Versus Within Groups 198
Figure 10.2 Menu Selections for One-Way ANOVA in SPSS 204
Figure 10.3 Input Variables for One-Way ANOVA 204
Figure 10.4 Options Menu for One-Way ANOVA 205
Figure 10.5 One-Way ANOVA SPSS Output 205
Figure 10.6 SPSS Windows After Step 7 206
Figure 10.7 One-Way ANOVA Post Hoc Tests 206
Figure 10.8 SPSS Output for One-Way ANOVA Homogeneity of Variances Test 207
Figure 10.9 SPSS Output for One-Way ANOVA Post Hoc Tests 207
Figure 10.10 Tukey HSD Post Hoc Comparisons for One-Way ANOVA 208
Figure 10.11 One-Way ANOVA Output From Excel 209
Figure 10.12 Menu Selections for Repeated Measures ANOVA 212
Figure 10.13 Defining Factors for Repeated Measures ANOVA in SPSS 212
Figure 10.14 Input Variables for Repeated Measures ANOVA in SPSS 213
Figure 10.15 Options for Repeated Measures ANOVA in SPSS 213
Figure 10.16 Descriptive Statistics From Repeated Measures ANOVA in SPSS 214
Figure 10.17 Mauchly’s Test of Sphericity in SPSS 214
Figure 10.18 Test of Within-Subjects for Repeated Measures ANOVA in SPSS 214
Figure 10.19 Repeated Measures ANOVA Pair-Wise Comparisons in SPSS 216
Figure 10.20 Output From Repeated Measures ANOVA in Excel 216
Table 10.1 One-Way ANOVA With One Grouping Variable 217

Table 10.2 Factorial ANOVA With Two Grouping Variables 218


Table 10.3 Mixed Design ANOVA With One Grouping
Variable and One Repeated Measure 218
Table 10.4 Summary of Different Types of ANOVA 219
Formula 10.1 Calculating the F-Statistic 198
Formula 10.2 Calculating the Sum of Squares 199
Formula 10.3 Calculating the Mean Sum of Squares 199
Formula 10.4 Calculating the F-Statistic From Sum of Squares 199
Formula 10.5 Calculating Eta Square 201


Learning Objectives

In this chapter you will

1. Learn the purpose for using ANOVA


2. Understand the F-statistic
3. Learn about post hoc tests
4. Learn about effect size and eta square
5. Understand the purpose and assumptions for one-way ANOVA and repeated
measures ANOVA
6. Be able to perform a one-way ANOVA in SPSS and Excel
7. Be able to perform repeated measures ANOVA in SPSS and Excel

Comparing More Than Two Groups

Emily’s Case
Emily sat at her desk, reflecting on the discussion that morning at the
weekly meeting of department heads with the city manager. The public
works director mentioned concerns he had about recent incidents with
employee-manager conflicts. He also said his managers do not get
along very well. He told the city manager and Emily, as HR director, that
he might need help to resolve some of these issues before they blew up.

The public works director was fairly new, and he was afraid employee morale was slipping.
With that comment, the police chief also spoke up, saying he was observing some tension in
his department, too. In his case, there had been a long-standing tension between sworn officers
and the nonsworn civilian employees. He also added that he recently hired a couple of female
police officers and one Latina civilian employee as a community liaison, and he wondered if
the addition of these new employees was contributing to the heightened sense of tension in the
police department. Other departments reported nothing different or alarming. The parks and
recreation director emphasized that his current staff was the happiest group ever.
Emily wondered how accurate these assessments were. She knew the directors usually
had a fairly reliable sense of what was going on in their departments. She then realized, “I
have data. I can look for myself.” She jotted a note to ask Leo to look at the workplace
conflict measures on their survey and compare the different departments. This could give
her some insight on what to do.

Jim’s Case
Chief Chen liked Jim’s report on the response-time study, including the finding from
the one-sample t-test that showed Rockwood Fire Department’s average response
time in 2011 was significantly lower than the national standard of 5 minutes. Going
over the report page by page in the chief’s office, Jim was glad he already discussed
the findings carefully with Lavita, so he could answer the chief’s questions.
After going over the report, Chief Chen paused and said, “So, here is some
additional information I would like to have. First, I want to know the response
time for each of the eight stations we have and see if there are any statistical
differences across different stations. Second, I want to find out if the overall
response time changed significantly during the last three years from 2009 to
2011. Do you think you can do that?”
Jim took a deep breath, and said, “Let me see what I can do. You know, I’ve been getting
a lot of help from a graduate student intern. Her name is Lavita, and she’s a statistics
expert. I’m pretty sure she will know how to set up this kind of analysis.”
Chief Chen smiled. “I’m glad you are getting some help. I imagine it’s also good experi-
ence for her to get her hands on a real-world project.”
Jim laughed and agreed, saying he hoped she was getting a fair exchange. Walking out
of Chief Chen’s office, he thought to himself, “Well, if there’s a statistical analysis that
allows us to compare the means of two groups, there should be a way to compare more
than two groups.”

Introduction to ANOVA

Two Types of ANOVA


In Chapter 9, we discussed how you could statistically compare the means of two
groups. In the two cases above, we now have research questions that require a

comparison of more than two groups. The appropriate statistical test for comparisons
with more than two groups is analysis of variance (ANOVA). As with t-tests, there are
several variations on the ANOVA test. We are going to discuss the two most common
versions: one-way ANOVA and repeated measures ANOVA.
One-way ANOVA is used when you want to compare the means of several inde-
pendent groups. You can consider one-way ANOVA as a counterpart to the indepen-
dent samples t-test discussed in Chapter 9, except with more than two groups. Repeated
measures ANOVA is used when you want to compare the means of groups that are
related. You can consider repeated measures ANOVA as a counterpart to the paired
samples t-test discussed in Chapter 9, except with more than two related groups of data.
In the case examples, we see that Emily is interested in comparing survey responses
from employees who work in several different city departments. It looks like one-way
ANOVA will be the appropriate test to detect differences between the departments. Jim
has a similar situation in the chief's request to compare response time among the eight
fire stations. For that, he can use one-way ANOVA. Jim has another comparison to make
to answer the chief's question about change over the years 2009, 2010, and 2011. In that
case, Jim will be comparing three sets of related data from the same eight stations. For
this second question, a repeated measures ANOVA will be the appropriate test.

Why Conduct ANOVA?


Before explaining how to use ANOVA in your analysis, we should consider why it is
better to conduct ANOVA instead of simply performing multiple t-tests to
compare all the combinations of pairs among the groups. In Jim’s case, for example, to
compare the mean response times for 2009, 2010, and 2011, why conduct a repeated
measures ANOVA, instead of running three paired samples t-tests? The simple answer
is Alpha (α) inflation. When conducting a t-test with a 5% chance of a Type I error
(finding a difference when no difference actually exists), each test compounds the
chance of an error.
The probability of avoiding a Type I error across multiple tests is calculated by
multiplying the confidence level by itself as many times as tests are conducted. For
example, suppose Jim conducted three
paired sample t-tests, comparing the means for 2009 to 2010, 2010 to 2011, and 2009
to 2011 using a significance level of p < .05 as the cutoff point for rejecting the null
hypothesis. This gives a 95% confidence level. The overall probability for not commit-
ting a Type I error for the three tests is calculated by multiplying .95 three times: .95 ×
.95 × .95 = .857. The confidence level is reduced to 85.7%. Thus, the alpha level would
be 14.3% (1 – .857 = .143). Repeating the test three times increases the probability of
making one Type I error across the three tests from 5% to 14.3%. ANOVA controls this
kind of alpha inflation by conducting multiple comparisons within one test.
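The compounding described above is easy to verify numerically. A quick sketch (the function name is ours):

```python
def familywise_error(alpha, num_tests):
    """Probability of making at least one Type I error across a series of
    independent tests, each conducted at significance level alpha."""
    return 1 - (1 - alpha) ** num_tests

# Three paired samples t-tests at alpha = .05, as in Jim's example
print(round(familywise_error(0.05, 3), 3))  # → 0.143
```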

Understanding F-Statistic
ANOVA produces an F-statistic (named after R. A. Fisher, the creator of the statistic)
to test the hypothesis that there are overall differences between groups. The F-statistic

is basically a ratio of the amount of variability between the groups, due to systematic
differences between the groups, compared to the amount of variability within each
group. (See Formula 10.1.) Variability between groups is based on the comparison of
the mean of each group. Variability within groups refers to how the scores within each
group vary due to chance. (See Figure 10.1.)

Figure 10.1   Illustration of Variability Between Groups Versus Within Groups


The formula for calculating the F-statistic can be expressed as follows:

Formula 10.1 Calculating the F-Statistic

F = Variability between groups / Variability within groups

If the ratio for the F-statistic is 1, that means the variability due to the within group
differences and the variability due to the between group differences are the same. As
the variability between groups gets larger in comparison to the variability within
groups, the F-statistic becomes larger. A larger F-value suggests that the difference
between the groups is not due to chance.
To calculate variability between groups and variability within groups, we use the
sum of squares. As we saw in Chapter 7 in the discussion of variance, the within-group
sum of squares is the sum of differences between each individual score in a group and
the mean of each group, squared. With a similar calculation, the between-group sum
of squares is the difference between the mean of all scores and the mean of each group’s
score, squared. (See Formula 10.2.)

Formula 10.2 Calculating the Sum of Squares

Between Sum of Squares = ∑ (Mean of all scores − Mean of each group)²

Within Sum of Squares = ∑ (Individual score − Mean of each group)²

With the results for the within-group sum of squares and the between-group sum
of squares, we can then calculate the mean sum of squares for each. This is done by
dividing each sum of squares by the appropriate degrees of freedom (df): the total
number of scores minus the number of groups for the within-group term, and the
number of groups minus 1 for the between-group term. This adjustment approximates
the greater variance within the population compared to the sample. (See Formula
10.3.) We saw this same
adjustment in the calculation of the standard deviation in Chapter 7. Thus:

Degrees of freedom for the within-group sum of squares = N − k

Degrees of freedom for the between-group sum of squares = k − 1

Where N is the total sample size from all groups, and k is the number
of groups.

Formula 10.3 Calculating the Mean Sum of Squares

Mean Sum of Squares for Between Groups = Between Sum of Squares / (k − 1)

Mean Sum of Squares for Within Groups = Within Sum of Squares / (N − k)

With the mean sum of squares for both within and between groups, we know the
variance for each and can then calculate the F-statistic. (See Formula 10.4.)

Formula 10.4 Calculating the F-Statistic From Sum of Squares

F = Mean Sum of Squares Between Groups / Mean Sum of Squares Within Groups
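Formulas 10.2 through 10.4 can be chained together in a few lines of code. The sketch below uses only the Python standard library, and the data are invented for illustration. Note that in computing the between-group sum of squares, each group's squared deviation from the grand mean is counted once per observation in that group (i.e., multiplied by the group size).

```python
import statistics

def one_way_f(groups):
    """F-statistic for a one-way ANOVA, following Formulas 10.2-10.4."""
    all_scores = [score for group in groups for score in group]
    grand_mean = statistics.mean(all_scores)
    k = len(groups)      # number of groups
    n = len(all_scores)  # total sample size

    # Formula 10.2: between- and within-group sums of squares
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)

    # Formula 10.3: mean sums of squares
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)

    # Formula 10.4: the F ratio
    return ms_between / ms_within

# Three small hypothetical groups
print(one_way_f([[1, 2, 3], [2, 3, 4], [6, 7, 8]]))  # → 21.0
```

The large F here reflects group means (2, 3, and 7) that are far apart relative to the small spread of scores within each group.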

What ANOVA Tells Us


ANOVA uses the F-statistic to compare within-group and between-group variability.
If all the scores in each group are the same, and there is no variability in each group,
then any difference between groups suggests a systematic difference. When the
F-statistic = 1, then the within-group variability is equal to the between-group variability,

and any difference between groups could also be by chance. As the difference in the
mean score between groups becomes larger, the F-statistic gets larger, and it becomes
increasingly unlikely that the group difference in the sample is totally by chance (thus,
one can reject the null hypothesis).
The null hypothesis in ANOVA is that the means of the samples are equal. If we
have three groups, the mean score of each group can be described as: Mean score for
Group 1 = X̄1, Mean score for Group 2 = X̄2, and Mean score for Group 3 = X̄3. The null
hypothesis, therefore, can be expressed as the following: H0: X̄1 = X̄2 = X̄3. When ANOVA
gives a result indicating a significant difference between groups, it is simply indicating
that the null hypothesis is not true. However, there are different ways that the means
of the three groups can differ. A statistically significant difference (according to the
significance level for the F-statistic) can mean that the means of all three groups differ
significantly (X̄1 ≠ X̄2 ≠ X̄3), or that only Group 1 and Group 2 differ (X̄1 ≠ X̄2 = X̄3), or
only Group 1 and Group 3 differ (X̄1 ≠ X̄3 = X̄2), or only Group 2 and Group 3 differ
(X̄1 = X̄2 ≠ X̄3). ANOVA does not tell us which of the four options is the case. For this
reason, ANOVA is referred to as an omnibus test (Gonzalez, 2009).

Post Hoc Tests


In some cases, detecting an unspecified difference among several groups with ANOVA
may be enough for the research question. Usually, however, you will want to conduct a
further analysis to find out which of the groups differ. These additional tests are called
post hoc tests (Toothaker, 1993). Post hoc tests can be conducted within the ANOVA
test in SPSS. The procedure consists of pair-wise comparisons of the groups, similar to
running a separate t-test on each pair. You may notice that this contradicts what we just
discussed about the problem of alpha inflation. Recognizing the problem of alpha infla-
tion with multiple comparisons at once, a number of statisticians have devised ways to
correct the overall Type I error, so the level of significance will remain close to .05.
One of the easiest ways to control for alpha inflation is to divide the chosen alpha
level by the number of comparisons and apply the lower level to judge statistical sig-
nificance. For example, if you have 10 pairs to compare, you apply .05/10 = .005 as a
criterion in determining significance for each one of the pair-wise comparisons. In this
way, the cumulative Type I error rate can be kept below .05. This approach is called
Bonferroni correction, after its originator, Carlo Bonferroni.
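The arithmetic behind the correction is easy to verify. The short sketch below uses the chapter's example of 10 pair-wise comparisons: running each test at the corrected level keeps the cumulative (familywise) Type I error rate just under .05, whereas uncorrected tests would push it to roughly 40%.

```python
# Bonferroni correction for 10 pair-wise comparisons at an overall alpha of .05.
alpha = 0.05
comparisons = 10

per_test_alpha = alpha / comparisons  # .05 / 10 = .005

# Cumulative Type I error = 1 - P(no Type I error on any of the tests).
corrected_familywise = 1 - (1 - per_test_alpha) ** comparisons  # about .049
uncorrected_familywise = 1 - (1 - alpha) ** comparisons         # about .40
```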
There are other correction methods, including Tukey’s HSD (honestly signifi-
cant difference) test, Fisher’s LSD (least significant difference) test, Scheffe’s test,
and many more. Each one of these tests uses different methods to adjust for the
Type I error in multiple comparisons. Some tend to be more conservative than oth-
ers. One of the things to take into consideration when choosing a post hoc test is
that if the test is conservative in controlling the Type I error and lowers the alpha
level, it will increase the probability of a Type II error—not rejecting the null
hypothesis when you should.
The number of comparisons also affects which post hoc test is more appropriate.
Fisher’s LSD is known to be the least conservative and almost equivalent to performing
Chapter 10  Comparing Means of More Than Two Groups  ❖  201

multiple t-tests. Although Scheffe’s test and the Bonferroni test have been popular, they
are known to be more conservative and tend to lead to Type II errors. Many experts
recommend Tukey’s HSD and the Ryan, Einot, Gabriel, and Welsch Q procedure
(REGWQ) when the group sizes are similar and equal variances can be assumed in the
population across the comparison groups. When the group sizes are different and
the variances in the population across the groups cannot be assumed to be equal, then the
Games Howell and Dunnett’s C post hoc tests are recommended (J. Stevens, 1990;
Toothaker, 1993).

Effect Size: Eta Squared


In Chapter 8, we discussed the difference between statistical significance and practical
significance. In ANOVA, along with the statistically significant differences between
groups, the test will also provide a measure that tells you the magnitude of the differ-
ence, or effect size, to assist in making an assessment as to whether the difference is
meaningful (Cortina & Nouri, 2007). There are different measures of effect size as a
standardized measure of the observed effect: Cohen's d, Pearson's correlation coefficient r,
omega squared (ω²), and eta squared (η²), to name a few. Squared measures such as
omega squared and eta squared range from 0 to 1. The general rule of thumb for
interpreting the effect size in social science is as follows:

0.00–0.20  a small effect size
0.20–0.50  a medium effect size
> 0.50     a large effect size

(Cohen, 1988; Salkind, 2011)

The common measure of effect size for ANOVA is eta squared (η²), which is
calculated by taking the ratio of the between sum of squares over the combined
between sum of squares and within sum of squares. (See Formula 10.5.) These are
the same sums of squares we observed earlier in the calculation of the F-statistic
(Formula 10.2).

Formula 10.5 Calculating Eta Squared

Eta squared (η²) = Between Sum of Squares / (Between Sum of Squares + Within Sum of Squares)
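Formula 10.5 and the rule-of-thumb thresholds above can be combined in a small helper function. The sums of squares passed in at the bottom are made-up values for illustration.

```python
# Eta squared (Formula 10.5) with the rule-of-thumb labels from the text.

def eta_squared(ss_between, ss_within):
    return ss_between / (ss_between + ss_within)

def effect_label(eta_sq):
    # Thresholds from the general rule of thumb (Cohen, 1988; Salkind, 2011).
    if eta_sq > 0.50:
        return "large"
    if eta_sq > 0.20:
        return "medium"
    return "small"

eta = eta_squared(14.0, 6.0)  # 14 / (14 + 6) = 0.70
label = effect_label(eta)     # "large"
```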

One-Way ANOVA

One-way ANOVA is used to compare the means of two or more independent groups.
There are four primary assumptions for the one-way ANOVA that need to be met to
use and interpret the results of the test correctly (Coolidge, 2013; McNabb, 2008;
Warner, 2013):

1. The observations of the groups that are compared must be independent of each
other.
2. The dependent variable must be normally distributed in each group.
3. The dependent variable must be a continuous measure (interval or ratio), and
the grouping variable, called the factor in ANOVA, must be nominal.
4. The variances of the dependent variable in each group must be equal (referred
to as homogeneity of variance).

As with the t-test, research has shown that ANOVA is fairly robust against the
violation of the normal distribution and equality of variances assumption (Peers, 1996;
Warner, 2013).
We can apply one-way ANOVA to Jim’s case to compare the mean response time
for the eight fire stations. The hypotheses can be described as follows:

H0: Station 1 mean response time = Station 2 mean response time = (etc.) =
Station 8 mean response time
HR: Station 1 mean response time ≠ Station 2 mean response time ≠ (etc.) ≠
Station 8 mean response time

The null hypothesis states that the mean response times of all eight stations are the
same. The alternative hypothesis states that at least one of the means across the groups
differs from at least one of the other means. If the result indicates a statistically signif-
icant difference (p < .05), then Jim should reject the null hypothesis and conclude there
is an overall difference between the mean response times at the eight stations. Since the
ANOVA result does not indicate which stations differ, Jim would then need to run a
post hoc test to identify which station or stations are significantly different. On the
other hand, if the result from ANOVA indicates the probability of getting that partic-
ular F-statistic by chance is greater than 5% (p > .05), then Jim should not reject the
null hypothesis. He would conclude there is no statistically significant difference
among the eight stations in their response times.

Note on Sample Sizes for the One-Way ANOVA


When creating a research design where a one-way ANOVA is to be used, a sufficient
sample size is an important consideration. As is the case with many statistical tests, a
small sample size can have an adverse effect on the statistical power of the test. In
ANOVA, a small sample size increases the likelihood of violating the homogeneity of
variance assumption (Howell, 2010). Green and Salkind (2010) recommend at least
15 cases per group be used to conduct a one-way ANOVA capable of fairly accurate
p-values. Thus, with three groups in an ANOVA test, the total sample should be at a
minimum of 45. With a larger number of groups, the overall sample size will need to
be correspondingly larger. In Jim’s case, for example, when he compares mean
response times among the eight stations, he will want to have at least 15 service calls
with response times from each station, making a minimum overall sample size of 120
(15 × 8 groups).
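These planning guidelines are simple enough to capture in two small helper functions. Both are sketches of the rules of thumb stated in this section (15 cases per group, and the largest group no more than half again as large as the smallest), not part of any statistics library.

```python
# Planning checks for a one-way ANOVA design, following the guidelines above.

def minimum_total_n(n_groups, per_group=15):
    # At least 15 cases per group (Green & Salkind, 2010).
    return n_groups * per_group

def groups_balanced(group_sizes):
    # Largest group no more than half again as large as the smallest.
    return max(group_sizes) <= 1.5 * min(group_sizes)

n_min = minimum_total_n(8)            # Jim's eight stations: 120 calls minimum
balanced = groups_balanced([33] * 8)  # True: equal groups of 33
```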
When planning a study with group comparisons like this, it is also important to
make the groups as equal in size as possible. The one-way ANOVA is known to be
robust against violating the homogeneity of variance assumption if the sample size for
each group being compared is equal (J. P. Stevens, 2009). Making groups exactly equal
is unlikely in real social settings, but the researcher planning to use ANOVA should
define groups so the largest group is no more than half again as large as the smallest
group. The more unbalanced the group sizes become, the more the violation of the
homogeneity of variances assumption will affect the test result and identify the p-value
incorrectly. Even when the homogeneity of variance assumption is not violated, unbal-
anced group size will artificially increase the size of the F-statistic and increase the
likelihood of Type I error, finding a difference where none actually exists.

Running One-Way ANOVA Using SPSS


As a practical example in using one-way ANOVA, let’s follow Jim’s case and his
research objective to compare the mean response times at the eight stations in the
Rockwood Fire Department. Lavita runs the analysis by creating a data file with
response times for 33 randomly selected calls from each station for 2011. Each call is
a case. This gives her a data set with 264 cases. With this data set, the sample size for
each station is greater than 15, and the samples are an equal size for each group. Lavita
then takes the following steps to compare the mean response times with one-way
ANOVA, using SPSS.

1. Click Analyze → General Linear Model → Univariate.


2. Enter the variable yr_11 into the Dependent Variable box.
3. Enter the variable Station into the Fixed Factor box.
4. Click Options.
5. In the Factor and Factor Interactions box, click on Station and move it into the
box labeled Display Means for.
6. Select Estimate of Effect Size, Descriptive Statistics, and Homogeneity Tests.
7. Click Continue.
8. Click OK.

To interpret the one-way ANOVA output, look at the Tests of Between Subjects
Effects table (Figure 10.5). Ignore the first two rows labeled Corrected Model and
Intercept, and go to the third row labeled Station—this is the name of the indepen-
dent variable that specifies the grouping that we are comparing. On that row, you
see the F-statistic in the F column (F = 2.49) and the associated p-value in the Sig.
column (p = .017). The p-value is below .05—the level of significance Lavita is using

Figure 10.2   Menu Selections for One-Way ANOVA in SPSS

Figure 10.3   Input Variables for One-Way ANOVA

to judge statistical significance. Based on this result, she rejects the null hypothesis
that there is no difference in the mean response time across the eight stations and
accepts the research hypothesis that at least one pair of the stations is significantly
different in the mean response time. Note, however, that the effect size in the last
Partial Eta Squared column is very small (η² = .064), which indicates the magnitude
of the difference is not great.

Figure 10.4   Options Menu for One-Way ANOVA

Figure 10.5   One-Way ANOVA SPSS Output

The SPSS output gives you all the important information for the calculation of the
F-statistic. If you look at the output on the same Station row, the between sum of
squares is 35.899 (in the Type III Sum of Squares column), and between-group degrees
of freedom is 7 (in the df column), which makes the mean sum of squares for between
groups 35.899/7 = 5.128 (in the Mean square column). On the next row, labeled Error,
the within sum of squares is shown as 527.3 (in the Type III Sum of Squares column),
the degrees of freedom is 256 (in the df column), and the mean sum of squares for
within groups is 527.3/256 = 2.06 (in the Mean square column).
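The arithmetic Lavita reads off the SPSS output can be reproduced directly from the two sums of squares and their degrees of freedom in Figure 10.5:

```python
# Reproducing the F-statistic and effect size from the SPSS output (Figure 10.5).
ss_between, df_between = 35.899, 7  # Station row
ss_within, df_within = 527.3, 256   # Error row

ms_between = ss_between / df_between  # 5.128
ms_within = ss_within / df_within     # 2.06
f_statistic = ms_between / ms_within  # 2.49, the value in the F column

# Formula 10.5; with a single factor, this equals the partial eta squared
# reported in the output.
eta_sq = ss_between / (ss_between + ss_within)  # .064
```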
Lavita found a statistically significant difference in her one-way ANOVA test, and
now she wants to run a post hoc test to discover which pair-wise comparisons among
the eight stations show a difference in mean response times. To run a post hoc test,
Lavita takes the following steps:

1. Repeat the above steps up to 7.


2. Click the post hoc button on the right.
3. Move station from the Factor box into the Post Hoc Test for box.
4. Under Equal Variance Assumed, select Tukey and/or R-E-G-W-Q (or other
post hoc test of your preference); or under Equal Variance Not Assumed, select
Dunnett’s C and/or Games-Howell (or other post hoc test of your preference).
5. Click Continue, and then OK.

Figure 10.6    SPSS Windows After Step 7

Figure 10.7    One-Way ANOVA Post Hoc Tests



Figure 10.8    SPSS Output for One-Way ANOVA Homogeneity of Variances Test

In the output, note the table for Test of Homogeneity of Variances (Figure 10.8),
which shows the p-value for the Levene statistic is above .05 (p = .062). This means we
can assume the group variances are not significantly different. This is good news,
because it confirms we are not violating the ANOVA assumption on homogeneity of
variances. With this information, we also now know that we should select the Tukey
HSD test (or REGWQ) for the post hoc test.
Since there are eight levels to our grouping factor, the output is quite extensive.
Examining the descriptive statistics (Figure 10.9), it appears that Station A has the
lowest mean response time, and Station G has the highest mean response time, with an
average difference of a little over 1 minute per call. The post hoc tests perform all of
the possible pair-wise comparisons between the groups (Figure 10.10) and show that
the only statistical difference is between Station A and Station G (p = .015), with a 1.22
mean difference.
Given these results of the post hoc test, we must conclude that apparent differ-
ences between most of the stations may be due to chance, and the mean values for
response time should not be taken as a generalizable performance of these stations.

Figure 10.9    SPSS Output for One-Way ANOVA Post Hoc Tests

Figure 10.10    Tukey HSD Post Hoc Comparisons for One-Way ANOVA

The effect size for the observed difference among the whole set of eight stations was
small. No one station stands out from all the others. Yet we do have this one signifi-
cant difference between a high and a low mean response time, and it appears to be
fairly large. In fact, the high mean response time appears to not meet the national
benchmark of 5 minutes (though this would have to be tested to confirm a significant
difference). Note that the results of the statistical analysis provide information but do
not necessarily answer policy and management questions. Often, the analysis raises
questions for further research.

Running One-Way ANOVA Using Excel


You can perform the same one-way ANOVA in Excel by taking the following steps.

1. On the Data tab, click Data Analysis.


2. In the window that opens, click ANOVA: Single Factor, then OK.
3. Click the cursor on the input range box and then highlight cells A2 through H35.
4. Make sure the Labels in First Row box is also checked.
5. Decide where you would like your output displayed and click OK.

Excel produces an output that is similar to SPSS. What Excel does not provide in
this test, though, is rather critical. There is no option to determine if the homogeneity
of variance assumption is met as is done in SPSS (Figure 10.8). Additionally, there is
not an option to perform a post hoc test as is done in SPSS (Figure 10.10). The Excel
output only provides the omnibus ANOVA test, showing a significant difference in
mean response time for at least one of the stations (Figure 10.11).

Side Note: Omnibus Test Is Significant but Post Hoc Test Is Not Significant
What happens if your ANOVA result (the omnibus test) is significant, but all of the
post hoc tests are not significant? This situation can happen. Post hoc tests are typi-
cally more conservative, because they control for the Type I error and adjust the alpha

Figure 10.11    One-Way ANOVA Output From Excel



level, as discussed in the earlier section. So if you reject the null hypothesis when the
omnibus ANOVA is significant but then find that none of the post hoc tests are signifi-
cant, you are most likely committing a Type II error, missing a significant difference that
actually exists. When you face a situation like this, you may have one pair-wise
comparison where the p-value is very close but slightly higher than .05. Small sample
sizes may be another reason the post hoc test may not show a significant result for any
of the comparisons. With small samples, a statistical test loses power to detect differ-
ences and is more prone to Type II error (Cardinal & Aitken, 2006).

Repeated Measures ANOVA

Repeated measures ANOVA is used to compare means for more than two related (not
independent) groups. It is also referred to as a within-subjects ANOVA.
Assumptions for the repeated measures ANOVA are similar to that of the one-way
ANOVA. For the repeated measures ANOVA, however, the independence of the
grouped observations is not required. There are four primary assumptions for the
repeated measures ANOVA:

1. The dependent variable must be normally distributed at each measurement


level (e.g. time points).
2. The dependent variable must be a continuous measure (interval or ratio).
3. The variances of the differences between all combinations of related groups
(levels) are equal. This is called sphericity, and violation of the sphericity
assumption will increase the risk of Type I error in a repeated measures
ANOVA (Upton & Cook, 2008). Sphericity is similar to homogeneity of vari-
ances in the one-way ANOVA. Sphericity only applies if there are more than
two levels of measurement. If sphericity is violated, the analysis can be adjusted.
We will discuss this in more detail in the example below.
4. If there are separate groups in addition to the repeated measurement levels,
then the variances of the dependent variable in each group must be equal
(homogeneity of variance).

Jim’s case gives us an example for a repeated measures ANOVA test, to answer the
question if response time among Rockwood Fire Department’s eight stations changed
over the years 2009, 2010, and 2011. The hypotheses can be stated as follows:

H0: Mean response time for 2009 = Mean response time for 2010 = Mean response
time for 2011
HR: Mean response time for 2009 ≠ Mean response time for 2010 ≠ Mean response
time for 2011

The null hypothesis states that the mean response times for 2009, 2010, and 2011
are the same. The research hypothesis states that the mean for at least one of the pair
of the years comparing the response time differs. If the repeated measures ANOVA
indicates a statistically significant difference (p < .05), then Jim can reject the null
hypothesis and conclude there is an overall difference in the mean response times for
these three years. The repeated measures ANOVA is an omnibus test, like the one-way
ANOVA, and will not distinguish which set of means differ, and a post hoc test will be
required to identify which pair-wise comparison is significantly different.

Running Repeated Measures ANOVA Using SPSS


We can follow Lavita as she runs the analysis with a repeated measures ANOVA in
SPSS. Lavita composed a data set with the mean response times for each of the eight
stations in the Rockwood Fire Department for three measurement levels: 2009, 2010,
and 2011. Then she went through the following steps:

1. Click Analyze → General Linear Model → Repeated Measures (Figure 10.12).


2. The Within-Subject Factor name will always default to factor 1. Replace it with
a name that better represents your factor; in this case, we will call the factor
Year (Figure 10.13).
3. You must also enter the number of levels the factor has. Since we have an obser-
vation for 2009, 2010, and 2011, we will enter 3.
4. Click Add, then Define.
5. In the Within-Subjects box that appears, you will see _?_(1) and _?_(2), and so
on for the number of levels you indicated for your factor. In the variable box,
click on yr_09, yr_10 and yr_11 and move them over to the Within-Subjects
Variables (Figure 10.14).
6. Click Options and select the year factor variable. Then move it over to the box
labeled Display means for. Also click Descriptive statistics and Estimates of
effect size in the Display box.
7. Check the Compare main effects box. Select Bonferroni from the drop-down menu
labeled Confidence interval adjustment. (This specifies your post hoc test.)
8. Click Continue, then OK.

The SPSS repeated measures ANOVA output includes multiple tables. We will
highlight some key tables to inspect. It is always a good practice to review the descrip-
tive statistics. The Descriptive Statistics table (Figure 10.16) provides the overall means
and standard deviations of the station mean response times for 2009, 2010, and 2011.
We can see that the mean of the response times for all eight stations together does not
differ very much from year to year.
Before looking at other tables, we first need to check the test of sphericity (Figure 10.17).
If the test result is significant (p < .05), it means a significant difference exists in the
variance in the difference of all combinations of the different levels of measurement,

Figure 10.12    Menu Selections for Repeated Measures ANOVA

Figure 10.13    Defining Factors for Repeated Measures ANOVA in SPSS

and the assumption of sphericity has been violated. In this example, we see that sphe-
ricity has not been violated (p = .655), and we can continue normally. We will discuss

Figure 10.14    Input Variables for Repeated Measures ANOVA in SPSS

Figure 10.15    Options for Repeated Measures ANOVA in SPSS

other options below for instances when the sphericity test is significant, and the
assumption of sphericity is violated.

Figure 10.16    Descriptive Statistics From Repeated Measures ANOVA in SPSS

We next check the test results in the table for Within-Subjects Effect (Figure 10.18).
Since we already know that Sphericity assumed applies in this case, we can look at the
results on the top line in the table. The F-statistic is .372 and the p-value is .696. No
significant differences were detected in the mean response-time performance of the
eight stations over the three years examined.
The lines in the table below Sphericity assumed show adjustments for instances
when sphericity cannot be assumed, labeled as Greenhouse-Geisser and Huynh-Feldt.
In this case, only the Greenhouse-Geisser gives a different result. Notice for

Figure 10.17    Mauchly’s Test of Sphericity in SPSS

Figure 10.18    Test of Within-Subjects for Repeated Measures ANOVA in SPSS



both adjustments that the sum of squares and the F-statistic are exactly the same.
The adjustment is applied to the degrees of freedom (in the df column), which
changes the mean sum of squares (in the Mean Square column) and increases the
p-value (in the Sig. column). In other words, the adjustment makes the test more
conservative, which reduces the chance for a Type I error (observing a difference
where there is none).
In addition to the standard univariate ANOVA test, based on the sphericity
assumption, SPSS produces output for a multivariate test when there are more than
two measurement levels. The multivariate test (MANOVA) does not require the
sphericity assumption, but like the adjustments described above when the sphericity
assumption is violated, the multivariate test is more conservative and could increase
the chance of a Type II error (missing a difference when there is one). If sphericity can
be assumed, applying the sphericity assumed test results for the repeated measures
ANOVA will have more power and less chance of a Type II error.
As with the one-way ANOVA, the results for the repeated measures ANOVA do
not specify where the significant difference is located among the pair-wise
comparisons. In the example here, no significant difference was observed, so Lavita
will not need to conduct a post hoc test. She could look at post hoc results easily,
however, because the repeated measures ANOVA output includes the pair-wise
comparisons when you specify Bonferroni under the Confidence interval adjustment
menu (Figure 10.19). In SPSS, the default is LSD, which is equivalent to conducting
multiple paired-samples t-tests without making any adjustment for alpha inflation;
therefore, it is not recommended. The recommended option is the Bonferroni
correction. Another option, the Sidak correction, can be used if you are concerned
that the Bonferroni correction is overly conservative and is costing you statistical
power.
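The two corrections can be compared with a few lines of Python. The Sidak formula, 1 − (1 − α)^(1/m), is the standard one, though it is not spelled out in the text; the example below uses three pair-wise comparisons.

```python
# Bonferroni vs. Sidak per-comparison alpha for m pair-wise comparisons.
alpha = 0.05
m = 3

bonferroni_alpha = alpha / m              # about .0167
sidak_alpha = 1 - (1 - alpha) ** (1 / m)  # about .0170

# Sidak's level is slightly higher (less strict), so it preserves
# a little more statistical power than Bonferroni.
```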

Running Repeated Measures ANOVA Using Excel


The repeated measures ANOVA can also be conducted using Excel. To perform the
analysis in Excel, use the following procedure:

1. In the Rockwood Fire Department spreadsheet, click on Tab 1, Yearly Averages.


2. On the Data tab, click Data Analysis.
3. Once the Data Analysis window opens, select ANOVA: Two-factor without
replication.
4. Click on the input range box to activate and then highlight cells A4 through D12.
5. Check the Labels box.
6. Select where you would like output displayed and then click OK.

In Excel, you are provided with output that tests significance differences for both
the rows and the columns. The rows are the scores for each station across the years.

Figure 10.19    Repeated Measures ANOVA Pair-Wise Comparisons in SPSS

Figure 10.20    Output From Repeated Measures ANOVA in Excel

The columns represent each of the years for all the stations. To test for differences
across years for all the stations, we would look at the output under Source of
Variation → Columns. As in SPSS, this p-value is not statistically significant.
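Behind this Excel output, the total sum of squares is partitioned into rows (stations), columns (years), and error. The sketch below reproduces that standard partition in Python for a tiny made-up data set of two stations over three years:

```python
# Sketch of the computation behind "ANOVA: Two-Factor Without Replication."
# Rows are stations, columns are years; the values are made up.
data = [
    [4.0, 5.0, 6.0],  # Station A mean response times for three years
    [6.0, 8.0, 7.0],  # Station B
]
n_rows, n_cols = len(data), len(data[0])
values = [x for row in data for x in row]
grand_mean = sum(values) / len(values)

row_means = [sum(row) / n_cols for row in data]
col_means = [sum(row[j] for row in data) / n_rows for j in range(n_cols)]

ss_total = sum((x - grand_mean) ** 2 for x in values)
ss_rows = n_cols * sum((m - grand_mean) ** 2 for m in row_means)
ss_cols = n_rows * sum((m - grand_mean) ** 2 for m in col_means)
ss_error = ss_total - ss_rows - ss_cols

# The Columns F-statistic tests for differences across years.
df_cols, df_error = n_cols - 1, (n_rows - 1) * (n_cols - 1)
f_columns = (ss_cols / df_cols) / (ss_error / df_error)  # 3.0 for this data
```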

Other Types of ANOVA

In this book, we cover only one-way ANOVA and repeated measures ANOVA. There
are other, more complex versions of ANOVA, including factorial design ANOVA and
mixed design ANOVA, which we describe briefly here to give some idea of other
applications.

Factorial ANOVA
The difference between one-way ANOVA and factorial ANOVA is in the number of
grouping variables included in the analysis. In Jim’s case, for example, in the compari-
son of mean response times among the eight stations using one-way ANOVA, he had
only one grouping variable: fire stations. The design for the one-way ANOVA in Jim’s
example is illustrated in Table 10.1.
In contrast, factorial design ANOVA is used with multiple grouping variables.
Suppose Jim wanted to explore how the response time differs depending on whether
the emergency call came in during daytime or nighttime, in addition to the difference
between the eight fire stations. Then he would have two grouping variables: fire sta-
tions and daytime and nighttime calls. A factorial design for this example is illustrated
in Table 10.2.
In the factorial ANOVA, in addition to examining the difference in response times
between the eight stations, he could also examine the difference in response times
between daytime calls and nighttime calls at all the stations, or differences in daytime
and nighttime calls at each station. For further study of factorial ANOVA, you may
want to review Coolidge (2013), Field (2009), Jaeger (1983), and Salkind (2011).

Mixed Design ANOVA


Mixed design ANOVA is used when you want to explore the effect of one or more
grouping variables on one or more repeated measures. This is a mixture of one-way
ANOVA and a repeated measures ANOVA, or a mixture of factorial ANOVA and
repeated measures ANOVA. The scenario in which you use the mixed design ANOVA
can be complex. We can use Jim’s case again for an example. In the repeated measures

Table 10.1  One-Way ANOVA With One Grouping Variable

Grouping variable: Fire Stations


A B C D E F G H
Response Response Response Response Response Response Response Response
time time time time time time time time

Table 10.2  Factorial ANOVA With Two Grouping Variables

Grouping variable:              Grouping variable: Fire stations
Time of call         A         B         C         D         E         F         G         H

Daytime           Response  Response  Response  Response  Response  Response  Response  Response
                    time      time      time      time      time      time      time      time
Nighttime         Response  Response  Response  Response  Response  Response  Response  Response
                    time      time      time      time      time      time      time      time

ANOVA described above, the comparison of the mean response times in each year—
2009, 2010, and 2011—combined all eight stations. If Jim wanted to examine changes
over the years for each station, and see if those changes differ between the stations, he
could use a mixed design ANOVA. This example is illustrated in Table 10.3. For fur-
ther study of mixed design ANOVA, you may want to review Coolidge (2013), Field
(2009), Howell (2010), and Gamst, Meyers, and Guarino (2008).

Table 10.3  Mixed Design ANOVA With One Grouping Variable and One Repeated Measure

Repeated measures:              Grouping variable: Fire stations
Years                A         B         C         D         E         F         G         H

2009              Response  Response  Response  Response  Response  Response  Response  Response
                    time      time      time      time      time      time      time      time
2010              Response  Response  Response  Response  Response  Response  Response  Response
                    time      time      time      time      time      time      time      time
2011              Response  Response  Response  Response  Response  Response  Response  Response
                    time      time      time      time      time      time      time      time

Chapter Summary
This chapter explained analysis of variance (ANOVA) as a statistical approach to use when
comparing more than two groups. Basic principles of ANOVA were introduced, along with
detailed descriptions of how to conduct a one-way ANOVA and a repeated measures ANOVA and
how to interpret the output of the tests in SPSS and Excel. Two other types of ANOVA—factorial
ANOVA and mixed design ANOVA—were briefly introduced. Table 10.4 summarizes
characteristics of the different types of ANOVA discussed in this chapter. ANOVA provides the
researcher the ability to answer research questions that require comparison of more than two
means to discover if there are any statistically significant differences among them.

Table 10.4  Summary of Different Types of ANOVA

One-way ANOVA
  Purpose: Compare means of more than two independent groups.
  IV: 1 categorical grouping variable (more than 1 group).
  DV: 1 continuous variable.

Repeated measures ANOVA
  Purpose: Compare means of more than two related groups.
  IV: 1 (one set of variables with more than 2 conditions).
  DV: 1 continuous variable.

Factorial ANOVA
  Purpose: Compare means of independent groups using more than one grouping variable.
  IV: More than 1 categorical grouping variable (more than 1 group each).
  DV: 1 continuous variable.

Mixed design ANOVA
  Purpose: Compare means of independent and related groups with multiple grouping factors.
  IV: 1 or more categorical grouping variables and 1 or more repeated measures conditions.
  DV: 1 continuous variable.

Review and Discussion Questions and Exercises


1. Consider how Emily can compare the difference in workplace conflict among different
departments. Create a new variable called Deptype by transforming Q24 (“Which department
do you work for?”) into a variable with four department types. Label it as Department Types.
Combine City Hall Administrator and City Hall Technician and call it City Hall, with a value
of 1. Combine Field and Fleet and Transit as one group and call it Field, with a value of 2.
Keep Culture and Recreation as 3, and recode Public Safety as 4. Currently, value 7 is assigned
for Rather not say. Using the Missing Values dialogue box, enter value 7 under the Discrete
Missing Values option. Now value 7 will be defined as a missing value and will be excluded
from the analysis. Conduct the analysis and interpret the results. (See Appendix A for
instructions on creating new variables.)
2. Think about research examples where you would use (1) one-way ANOVA and (2) repeated
measures ANOVA. Identify the dependent variable and independent variable for each case.
How many groupings does each one of the independent variables have?

3. What are the similarities and differences between a one-way ANOVA and an independent
sample t-test?
4. What are the similarities and differences between a repeated measures ANOVA and the paired
sample t-test?
5. If you chose to use independent sample t-tests rather than a one-way ANOVA, what are the
possible consequences of your choice?
6. One-way ANOVA assumes that variances are homogeneous. What are your options if you violate this assumption?
7. Is it possible to have a significant omnibus one-way ANOVA but fail to find any differences in
the post hoc tests? Why?
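For readers working outside SPSS, the recoding described in exercise 1 can be sketched in Python with pandas. The numeric codes assumed for Q24 below are illustrative only (the survey's actual codebook should be checked); what matters is the pattern of collapsing codes and treating “Rather not say” as missing.

```python
import numpy as np
import pandas as pd

# Assumed codebook for Q24 ("Which department do you work for?"):
# 1 = City Hall Administrator, 2 = City Hall Technician,
# 3 = Culture and Recreation, 4 = Field and Fleet, 5 = Transit,
# 6 = Public Safety, 7 = Rather not say
df = pd.DataFrame({"Q24": [1, 2, 3, 4, 5, 6, 7, 2, 6]})

deptype_map = {
    1: 1, 2: 1,      # City Hall
    4: 2, 5: 2,      # Field
    3: 3,            # Culture and Recreation
    6: 4,            # Public Safety
    7: np.nan,       # Rather not say -> missing, excluded from analysis
}
df["Deptype"] = df["Q24"].map(deptype_map)
print(df["Deptype"].value_counts(dropna=True))
```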


Key Terms
Alpha (α) Inflation  197
Degrees of Freedom  199
Effect Size  201
Factorial Design ANOVA  217
Mixed Design ANOVA  217
Omnibus Test  200
One-Way ANOVA  197
Post Hoc Test  200
Repeated Measures ANOVA (Within Subjects ANOVA)  210
Sphericity  210

Student Study Site


Visit the Student Study Site at www.sagepub.com/nishishiba1e for these additional learning tools:

•• Data sets to accompany the exercises in the chapter


•• Result write-ups
11 ❖
Bivariate
Correlation

Learning Objectives 223
Examining Relationships 223
Emily’s Case 223
Mary’s Case 224
Pearson Product Moment Correlation 224
Direction of the Relationship 225
Strength of the Relationship 226
Visual Presentation of a Correlation: The Scatterplot 226
Note on Linear Versus Curvilinear Relationship 229
Testing Hypothesis and Statistical Significance for Correlation 230
Running Correlation Using Software Programs 231
Running Pearson Product Moment Correlation Using SPSS 231
Running Correlation Using Excel 235
Correlation Does Not Imply Causality 236
Chapter Summary 237
Review and Discussion Questions and Exercises 237
Key Terms 238
Figure 11.1 Example of a Scatterplot 227
Figure 11.2 Scatterplot for Positive and Negative Correlation 228
Figure 11.3 Scatterplot for Positive and Negative Perfect Relationship 228
Figure 11.4 Scatterplot With Strong and Weak Negative Relationship 229

Chapter 11  Bivariate Correlation   ❖  223

Figure 11.5 Example of Curvilinear Relationship 230


Figure 11.6 Menu Selections for Bivariate Correlation in SPSS 232
Figure 11.7 Input Variables for Bivariate Correlation in SPSS 232
Figure 11.8 SPSS Output for the Pearson Product Moment Correlation for Cultural
Competence and Workplace Conflict 233
Figure 11.9 Menu Selections for Creating Scatterplot 233
Figure 11.10 Scatterplot Selection Menu in SPSS 234
Figure 11.11 Input Variables for Scatterplot in SPSS 234
Figure 11.12 Scatterplot of Cultural Competence and Workplace Conflict 235
Figure 11.13 Input Variables for Correlation in Excel 236
Figure 11.14 Correlation Output in Excel 236
Table 11.1 Types of Correlation Coefficients 225
Table 11.2 Guidelines for Interpreting Correlation Coefficients 227



Learning Objectives

In this chapter you will

1. Learn a statistical technique used for examining relationships between variables


2. Understand the use and interpretation of the Pearson product moment cor-
relation coefficient
3. Examine how to graphically illustrate correlation analysis
4. Learn how to conduct the analysis in SPSS and Excel

Examining Relationships

Emily’s Case
After the discussion at the department heads meeting about a height-
ened level of tension and conflict observed in some departments, Emily
had been thinking a lot about workplace conflict and its relationship to
diversity. She decided she needed to learn more about the topic and
downloaded a number of articles, including academic journal articles,

that addressed issues related to workplace conflict, diversity, and cultural competence.
Some articles included language like “hypotheses” and “statistical analyses.” In the past,
she probably would have stayed away from these articles, but after working with Leo on
the survey, she realized it is important to get acquainted with statistical analyses. She also
realized that it’s not that hard to read these articles and understand the content. “After all,”
she reasoned, “I did take some research classes in school, and now that I have a specific
project that uses these analyses, they make more sense.”
One Saturday afternoon, after reading through some of the articles she downloaded,
Emily noticed some articles discussed the idea that more cultural competence among
employees may reduce the level of workplace conflict. She wondered if she could examine
if that was true at the city of Westlawn with her survey results.
“We have a set of questions that measure the level of cultural competence among the
employees and the level of perceived conflict in the workplace. I should be able to assess if
there is any relationship between the two.”

Mary’s Case
Mary, volunteer manager at Health First, had been thinking about conducting
a series of in-depth interviews with volunteers. Her main goal was to find a
better way to recruit and retain volunteers. To identify potential participants for
her study, Mary obtained lists of volunteers from the HR department. Health
First had 60 volunteer positions, but currently only 45 active volunteers were
registered.
When Mary reviewed the volunteer list, she saw that along with names and
contact information, the spreadsheet contained additional information about the
volunteers, such as age, the number of hours they wanted to contribute, income
level, gender, and active or inactive status. Mary thought about a recent article she read in
the nonprofit association’s newsletter that discussed the demographic profiles of volunteers.
It noted that age and income level seemed to have a relationship with the way people
volunteer. Mary thought, “I wonder if there is a relationship between our volunteers’ age
and the amount of time they said they will put in volunteering with us. For that matter,
there may be a relationship between the volunteers’ income level and the amount of time
they are willing to volunteer.”
Doing qualitative interviews was new to Mary, but number crunching was familiar. “OK,
this is going to be fun,” she thought. Mary fired up the statistical software on her computer,
loaded the batch of volunteer data into it, and then jumped right into the analysis.

Pearson Product Moment Correlation

Some research questions involve examining relationships (Aldrich, 1995; Ha & Ha,
2012). As illustrated in the case descriptions, Emily wants to examine the relationship
between cultural competence and workplace conflict. Mary wants to examine the
relationship between amount of volunteer time, volunteer age, and income level.

The relationship can be examined by looking at the correlation between the two
concepts of interest, represented as variables in a data set (Cohen, Cohen, West, &
Aiken, 2002). Correlation is examined based on how much the two variables co-vary,
or how the value of one variable changes when the value of another variable changes.
For example, in Emily’s case, if the levels of cultural competence of employees at the
city of Westlawn are associated with levels of workplace conflict they perceive, then the
two variables may be shown to co-vary.
In statistical analysis, a correlation coefficient is used as a numerical index to
represent the relationship between two variables. Pearson product moment correla-
tion coefficient (r) is one of the correlation coefficients used to represent the relation-
ship between two variables that are continuous in nature (Remler & Van Ryzin, 2011;
Yule & Kendall, 1973). Other types of correlation coefficients are available for examin-
ing the relationship between variables that are not continuous. See Table 11.1 for other
types of correlation coefficients.
Continuous variables can take any value between two points along an underlying continuum. Examples include height, age, weight, and test scores. Responses to survey questions, such as those in Emily’s survey that use a five-point Likert scale, can also be analyzed as continuous variables. (See Chapter 7 for the discussion of the nature of variables.)

Direction of the Relationship


There are two patterns in the way two variables can co-vary. One of the patterns is
that the values of the two variables change in the same direction. In other words,

Table 11.1  Types of Correlation Coefficients

Types of two variables | Type of correlation coefficient used | Example | In SPSS
Nominal (Categorical) / Nominal (Categorical) | Phi coefficient | The correlation between gender (male/female) and diversity training attendance (yes/no) | Available under “Descriptive statistics/Crosstab”
Dichotomous / Interval (Continuous) | Point biserial | The correlation between gender (male/female) and the level of cultural competence | Use the same procedure as for Pearson product moment correlation
Ordinal / Ordinal | Spearman rank coefficient; Kendall’s tau-b coefficient | The correlation between level of cultural competence converted into ranking and workplace conflict converted into ranking | Available under “Correlate/Bivariate”

when the value of one variable increases, the value of the other variable increases.
Or when the value of one variable decreases, the value of the other variable decreases.
When the variables co-vary in this pattern, the two variables have positive correlation,
and the value of the Pearson product moment correlation coefficient (r) will have a
positive value. In Mary’s example, if the older the volunteer is the more the amount of
time that volunteer puts in, then there is a positive correlation between age and
volunteer time.
Another pattern is that the values of the two variables change in the opposite
direction: when the value of one variable increases, the value of the other variable
decreases. When variables co-vary in this pattern, the two variables have a negative
correlation, and the value of the Pearson product moment correlation coefficient (r)
will have a negative value. In Emily’s example, if the employees who report higher
cultural competence report lower workplace conflict, and the employees who report
lower cultural competence report higher workplace conflict, then there is a negative
correlation between cultural competence and workplace conflict.
The terms “positive” and “negative” correlation suggest only the direction of the relationship—whether the variables change in the same direction or in opposite directions. They do not mean that a positive correlation is better and a negative correlation is worse.

Strength of the Relationship


The correlation between the two variables can be strong or weak. The value of the cor-
relation coefficient indicates the amount of variability shared between the two varia-
bles and the strength of the relationship. The correlation coefficient always falls
between −1 and +1 (Cohen et al., 2002).
As noted earlier, the negative or positive value of the correlation coefficient indi-
cates the direction of the relationship, and the strength of the relationship is indicated
by the absolute value of the coefficient. A coefficient that is closer to the absolute value
of 1 indicates a stronger relationship. A coefficient that is closer to 0 indicates a weaker
relationship. The coefficient of an absolute value of 1 (i.e. −1 or +1) means that the two
variables have a perfectly linear relationship. Conversely, the coefficient of 0 means the
two variables have no linear relationship.
There are no absolute criteria in interpreting the strength of the relationship based
on the value of the correlation coefficient, but Table 11.2 shows the rule of thumb
generally accepted by social science researchers in interpreting the correlation coeffi-
cient (e.g. Salkind, 2008).

Visual Presentation of a Correlation: The Scatterplot


A correlation between two variables can be visually presented by a graph called a scat-
terplot (Anscombe, 1973). A scatterplot displays values of two variables as a collection
of points on the space determined by horizontal axis (X) and vertical axis (Y). The
position of the points indicates the value of one variable (X) and the value of another
variable (Y). (See Figure 11.1.)

Table 11.2  Guidelines for Interpreting Correlation Coefficients

Correlation coefficient | General interpretation of the strength of relationship
0.8 to 1.0 / −0.8 to −1.0 | Very strong (±1 = perfect relationship)
0.6 to 0.8 / −0.6 to −0.8 | Strong
0.4 to 0.6 / −0.4 to −0.6 | Moderate
0.2 to 0.4 / −0.2 to −0.4 | Weak
0 to 0.2 / 0 to −0.2 | Very weak (0 = no relationship)

Figure 11.1   Example of a Scatterplot

[Scatterplot of points plotted against a horizontal X axis and a vertical Y axis, each ranging from 0 to 10.]

The scatterplot visually represents the direction and the strength of the relationship
between the two variables. When the points on the scatterplot form a slope-like shape
that goes from the lower left corner toward the upper right corner of the graph, then the
two variables have a positive correlation. On the other hand, when the points on the

scatterplot form a slope-like shape that goes from the upper left toward the lower right
corner of the graph, then the two variables have a negative correlation. (See Figure 11.2.)
The scatterplot also visually represents the strength of the relationship (Anscombe,
1973; Ha & Ha, 2012). When the two variables have a perfect correlation, the points on
the scatterplot align perfectly on a straight line. Figure 11.3 shows scatterplots with a
perfect positive and a perfect negative relationship.
The strength of the relationship is indicated on the scatterplot by how the points are
aligned in relation to the straight sloped line that will be formed when the two variables

Figure 11.2    Scatterplot for Positive and Negative Correlation

[Two scatterplots: a positive relationship (r = .87), with points sloping from the lower left to the upper right, and a negative relationship (r = −.87), with points sloping from the upper left to the lower right.]

Figure 11.3    Scatterplot for Positive and Negative Perfect Relationship

[Two scatterplots: a perfect positive relationship (r = 1) and a perfect negative relationship (r = −1), with every point falling exactly on a straight line.]

have a perfect relationship (Yule & Kendall, 1973). The stronger the relationship, the
closer the points are clustered to the straight sloped line. In a weaker relationship, the
points are more scattered. Figure 11.4 shows two scatterplots with negative relation-
ships. The scatterplot on the left represents a strong negative relationship with r = −.87.
The scatterplot on the right represents a weak negative relationship with r = −.28.
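The link between scatter and |r| can also be reproduced directly. The following sketch is an illustration, not taken from the book: it draws two scatterplots from simulated data, one with little noise around the line and one with a lot, and puts the resulting r in each title.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # draw off-screen so no display is required
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
x = rng.uniform(0, 10, 50)
y_strong = x + rng.normal(0, 1.0, 50)  # points hug the line -> large |r|
y_weak = x + rng.normal(0, 6.0, 50)    # points scatter widely -> small |r|

fig, axes = plt.subplots(1, 2, figsize=(8, 4), sharey=True)
for ax, y, label in ((axes[0], y_strong, "Strong"), (axes[1], y_weak, "Weak")):
    ax.scatter(x, y)
    ax.set_title(f"{label} positive (r = {np.corrcoef(x, y)[0, 1]:.2f})")
    ax.set_xlabel("X")
axes[0].set_ylabel("Y")
fig.savefig("scatterplots.png")
```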

Note on Linear Versus Curvilinear Relationship


The Pearson product moment correlation and other bivariate correlations introduced so
far are based on the assumption that the two variables are related in a linear manner. In
other words, the perfect relationship is represented as a straight line. Not all relationships,
however, can be represented in a straight line. Some relationships may be better repre-
sented along a curved line. For example, let’s assume in Emily’s case that she has a variable
that measures employee job satisfaction, and she finds a curvilinear relationship between
job satisfaction and the level of cultural competence, explained by the following situations:
employees who have lower cultural competence have higher job satisfaction, because they
are unaware of the cultural issues in the workplace; employees with a moderate level of
cultural competence show lower job satisfaction, because they are more aware of the
problems at work related to cultural differences; and employees with higher cultural
competence have learned to manage cultural differences more effectively and show higher
job satisfaction. In this example, the relationship between the two variables is not linear.
Figure 11.5 provides a visual depiction of the curvilinear relationship described in this
example (X represents cultural competence, Y represents job satisfaction).
When the two variables have a curvilinear relationship, Pearson product moment
correlation coefficient (r) will be close to 0 and suggest that there is no relationship. It
is, therefore, not an appropriate measure to use. Other advanced statistical approaches
(e.g. curvilinear regression) are used to examine curvilinear relationships.

Figure 11.4    Scatterplot With Strong and Weak Negative Relationship

[Two scatterplots: a strong negative relationship (r = −.87), with points clustered tightly around the sloped line, and a weak negative relationship (r = −.28), with points widely scattered.]

Figure 11.5    Example of Curvilinear Relationship

[Scatterplot of observed points with a quadratic fit line, illustrating a curvilinear relationship. X represents “cultural competence”; Y represents “job satisfaction” (r = 0.1).]
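The warning about curvilinear relationships is easy to verify numerically. In this sketch, an illustration rather than an example from the book, y is an exact inverted-U function of x, yet the Pearson coefficient comes out at essentially zero.

```python
import numpy as np

# A perfect inverted-U: y rises, peaks at x = 5, then falls
x = np.linspace(0, 10, 101)
y = -(x - 5) ** 2 + 25

r = np.corrcoef(x, y)[0, 1]
print(round(r, 6))  # near 0 despite a perfectly predictable relationship
```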

Testing Hypothesis and Statistical Significance for Correlation

Following Emily’s case as an example, we can examine if there is a relationship between


cultural competence and the level of perceived conflict among the employees, using a
Pearson product moment correlation for the analysis. As with t-tests and ANOVA, in a
correlational analysis you develop a null hypothesis and a research hypothesis, conduct
a significance test, and determine whether the result you obtained is likely due to chance. The null hypothesis and research hypothesis for Emily’s analysis are as follows:
H0: There is no relationship between employees’ cultural competence and the
level of perceived workplace conflict (Pearson product moment correlation
between employees’ cultural competence and their level of perceived conflict = 0)

HR: There is a relationship between employees’ cultural competence and the level of perceived workplace conflict (Pearson product moment correlation between employees’ cultural competence and their level of perceived conflict ≠ 0)

Just as in the other statistical tests, after obtaining the Pearson product moment correlation coefficient, we must determine whether the probability of obtaining that coefficient under the null hypothesis (the p-value) is lower than .05. If the p-value is lower than .05, reject the null hypothesis and conclude that there is a statistically significant relationship between employees’ cultural competence and their level of perceived workplace conflict. If the p-value is .05 or higher, do not reject the null hypothesis and conclude that there is no statistically significant relationship between employees’ cultural competence and their level of perceived workplace conflict.
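In Python, the coefficient and its p-value come from a single call to SciPy's `pearsonr`. The data below are simulated for illustration (they are not the Westlawn survey data), built so that conflict tends to fall as cultural competence rises.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
competence = rng.normal(3.5, 0.6, 120)  # scores on a 1-5 scale
conflict = 5.0 - 0.8 * competence + rng.normal(0, 0.4, 120)

r, p = stats.pearsonr(competence, conflict)
reject_null = p < 0.05  # True -> statistically significant relationship
print(f"r = {r:.2f}, p = {p:.3g}, reject H0: {reject_null}")
```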

Running Correlation Using Software Programs

When it comes to running the correlational analysis of cultural competence and


perceived workplace conflict, Leo was busy with finals and sent Emily the following
instructions to follow.

Running Pearson Product Moment Correlation Using SPSS


1. Click Analyze → Correlate → Bivariate.
2. Enter the variable Culturalcompetence and Conflict from the box on the left
into the Variables box.
3. Make sure Pearson is checked in the Correlation Coefficient box.
4. Make sure Two-tailed option is checked.
5. Make sure Flag significant correlations is checked.
6. Click OK.

Figure 11.8 is the output Emily gets after running a Pearson product moment correlation with SPSS. Note that the correlations table is symmetric about its diagonal: the results in the upper right and lower left cells are exactly the same. The Pearson product moment correlation coefficient r for cultural competence
and workplace conflict is –.76, and the p-value for the correlation is .000, which means
p < .001. The correlation between the two variables is statistically significant. The
negative sign in the correlation indicates that cultural competence and workplace
conflict are negatively correlated: the higher the cultural competence, the lower the
perceived workplace conflict. The absolute value of the coefficient is .76, which indi-
cates that the relationship is strong.

Figure 11.6    Menu Selections for Bivariate Correlation in SPSS

Figure 11.7    Input Variables for Bivariate Correlation in SPSS



Figure 11.8  SPSS Output for the Pearson Product Moment Correlation for
Cultural Competence and Workplace Conflict

Figure 11.9   Menu Selections for Creating Scatterplot

Leo also gave Emily instructions on how to visually examine the correlation with
a scatterplot:

1. Click Graphs → Legacy Dialogs → Scatter/Dot.


2. In the Scatter/Dot dialogue box, click Simple Scatter. Click Define.

Figure 11.10    Scatterplot Selection Menu in SPSS

Figure 11.11    Input Variables for Scatterplot in SPSS

3. In the Simple Scatterplot dialogue box, move variable Conflict into the Y axis
box and variable Culturalcompetence into the X axis box.
4. Click OK.

Figure 11.12   Scatterplot of Cultural Competence and Workplace Conflict

[Scatterplot with Cultural Competence Score on the X axis and Workplace Conflict Score on the Y axis, each ranging from 1 to 5; the points trend downward from the upper left to the lower right.]

Running Correlation Using Excel


Leo also gave Emily instructions on how to run the correlation in Excel:

1. Click on the Data tab and open the Data Analysis window.
2. From the choice of options click Correlation, then OK.
3. Click in the Input Range box and then highlight cells AF1 through AG236.
4. Make sure that the box Labels in First Row is checked.
5. Request where you would like the output sent, in this example cell AI2 was selected.
6. Click OK.

Figure 11.14 shows the output you get when you run the correlation with Excel.
Note that Excel does not provide the user with a p-value of the correlation coefficient.

Figure 11.13    Input Variables for Correlation in Excel

Figure 11.14    Correlation Output in Excel
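Since Excel reports only the coefficient, the p-value has to be computed separately. One workaround (a sketch, not part of the book's procedure) converts r to a t statistic with n − 2 degrees of freedom; the values below reuse the r = −.76 from the SPSS example and the 235 cases implied by the Excel range (labels in row 1, data in rows 2 through 236).

```python
import math
from scipy import stats

r = -0.76  # correlation reported in the chapter's example
n = 235    # data rows in the Excel range AF2:AG236

# Convert r to a t statistic with n - 2 degrees of freedom,
# then take the two-tailed tail probability
t = r * math.sqrt((n - 2) / (1 - r ** 2))
p = 2 * stats.t.sf(abs(t), df=n - 2)
print(f"t = {t:.2f}, p = {p:.3g}")
```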

Correlation Does Not Imply Causality

An important point to keep in mind when interpreting correlation is that correlation


does not mean causation. Change in one of the variables does not necessarily cause
change in the other variable. In Chapter 4, we discussed three conditions to establish
causality: temporal precedence, covariation, and no plausible alternative explanation.
Covariation suggests causality—one element out of three necessary conditions—
but covariation alone does not indicate temporal precedence of one variable over the
other or account for possible other explanations.
In Emily’s case, just because the variables for cultural competence and the per-
ceived level of workplace conflict display a statistically significant correlation, there is
no way for her to know if one is causing the other. It is possible that people having
higher cultural competence contribute to lower workplace conflict. It is also possible
that lower workplace conflict contributes to higher cultural competence. There might
also be a third, extraneous variable that affects the apparent correlation. It could be that
the leadership styles of the managers in some units are causing both cultural competence

to increase and workplace conflict to decrease. In other units, the opposite may be
happening. The leadership styles of the managers across the organization could be
causing the appearance of a correlation between the two variables of interest.

Chapter Summary
This chapter introduced bivariate correlation using the Pearson product moment correlation coefficient (r) as a statistical approach to examining the linear relationship between two continuous variables. The Pearson product moment correlation coefficient, ranging from −1 to +1, indicates both the strength of the correlation and its direction, either positive or negative. When examining the correlation of variables that are categorical or ordinal, other types of correlation coefficients, such as the phi, point biserial, Spearman rank, and Kendall’s tau-b coefficients, are used. It is important to remember that correlation does not imply a cause-and-effect relationship.

Review and Discussion Questions and Exercises


1. Take a look at Mary’s volunteer profile data (Mary_Volunteer_profile.sav) from http://www
.sagepub.com/. Examine if there is a statistically significant correlation between the income
level of the volunteers and the number of hours they are willing to put in. Create a scatterplot.
Interpret the result and report it.
2. Take a look at Mary’s volunteer profile data (Mary_Volunteer_profile.sav) from http://www
.sagepub.com/. Examine if there is a statistically significant correlation between the age of the
volunteers and the number of hours they are willing to put in. Create a scatterplot. Interpret
the result and report it.
3. You ran a correlation between variables A & B and obtained a Pearson correlation coefficient of −.87. You also examined the correlation between variables C & D and obtained a Pearson correlation coefficient of .35. What is the direction of the relationship between A & B and between C & D? Which pair has the stronger correlation?
4. When the two variables are correlated, they are associated with each other, but it doesn’t mean
that one is causing the other. Explain why.
5. Find a journal article in your field that uses correlation in the analysis. Discuss how the authors
present the result and the substantive implications of the result.

References
Aldrich, J. (1995). Correlations genuine and spurious in Pearson and Yule. Statistical Science, 10(4), 364–376.
Anscombe, F. J. (1973). Graphs in statistical analysis. The American Statistician, 27(1), 17–21.
Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2002). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). Mahwah, NJ: Erlbaum.
Ha, R. H., & Ha, J. C. (2012). Integrative statistics for the social & behavioral sciences. Thousand Oaks, CA:
Sage.

Remler, D. K., & Van Ryzin, G. G. (2011). Research methods in practice: Strategies for description and
causation. Thousand Oaks, CA: Sage.
Salkind, N. J. (2008). Statistics for people who (think they) hate statistics. Thousand Oaks, CA: Sage.
Yule, G. U., & Kendall, M. G. (1973). An introduction to the theory of statistics. London, UK: Griffin.

Key Terms
Correlation  235
Correlation Coefficient  225
Co-Vary  225
Curvilinear Relationship  229
Negative Correlation  226
Pearson Product Moment Correlation Coefficient (r)  225
Positive Correlation  226
Scatterplot  226

Student Study Site


Visit the Student Study Site at www.sagepub.com/nishishiba1e for these additional learning tools:

•• Data sets to accompany the exercises in the chapter


•• Result write-ups
12 ❖
Chi-Square
Analysis

Learning Objectives 240
Examining Relationships Between Two Categorical Variables 240
Emily’s Case 240
Mary’s Case 241
Chi-Square Analysis 242
Calculating Chi-Square Statistics and Testing Statistical Significance 243
Note on Sample Size for Chi-Square Analysis 245
Running Chi-Square Analysis Using Software Programs 245
Running Chi-Square Using SPSS 245
Running Chi-Square Using Excel 249
Chapter Summary 251
Review and Discussion Questions and Exercises 252
Key Terms 252
Figure 12.1 Menu Selection for SPSS Cross Tab 246
Figure 12.2 Input Variables for SPSS Cross Tabs 246
Figure 12.3 Selecting Statistics in SPSS Cross Tabs 247
Figure 12.4 SPSS Cross Tab Options 247
Figure 12.5 Output From SPSS Cross Tabs 248
Figure 12.6 SPSS Chi-Square Output 248
Figure 12.7 Reconfiguring Excel Data File 250


Figure 12.8 Excel Pivot Table 251


Figure 12.9 Observed and Expected Counts Output in Excel 251
Table 12.1 Two-Way Contingency Table for People’s Attitude
Towards Making Diversity Training Required
Versus Diversity Training Attendance 243
Table 12.2 Observed Frequencies for 2 x 2 Contingency Table 244
Table 12.3 Expected Frequencies for 2 x 2 Contingency Table 244
Formula 12.1 Calculating Chi-Square 243
Formula 12.2 Calculating Expected Frequency 244
Formula 12.3 Shorthand Formula for Calculating Chi-Square 245



Learning Objectives

In this chapter you will

1. Learn an analytical approach to examine the relationship between two categorical variables
2. Learn a chi-square analysis (or two-way contingency table analysis)
3. Understand how chi-square analysis differs from correlation analysis
4. Develop an understanding of the difference between parametric and nonpara-
metric statistical tests
5. Learn how to conduct a chi-square analysis using SPSS and Excel

Examining Relationships Between Two Categorical Variables

Emily’s Case
When Mei-Lin peeked into Emily’s office to see if she was ready for their meeting,
Emily had a serious look on her face.
“Are you ready for me?” Mei-Lin asked softly.
Emily smiled when she looked up and invited Mei-Lin into her office, but
Mei-Lin figured this was no time for chit-chatting.
“We need to discuss what we are going to do with the diversity training for next
year and beyond. This year we got the foundation money, but I understand it is a one-
time fund that we can’t count on for ongoing trainings. What’s your thought on it?”
Chapter 12  Chi-Square Analysis  ❖  241

“I would like to continue offering the training, if we can,” Emily replied. “Of course, it is
contingent upon whether we have the funding for it, but I think I can make it happen.”
Mei-Lin was happy to hear that Emily was committed to continuing the diversity train-
ing. “If that’s the case,” she continued, “I could start making plans for the upcoming year.
I have one question, though. Do you think we should require all employees to take the
training? Sort of make it a mandatory training?”
Emily paused for a while and said, “Hmm—my first reaction is, of course we should. But
I also remember hearing in one of the HR directors’ meetings that it is not a good idea to
make diversity trainings mandatory. They said if it is going to be a mandatory training, the
idea should come from the employees, not from the HR director. What’s your sense about
where people stand on this?”
Now it was Mei-Lin’s turn to pause. She then said, “Didn’t we ask in our survey whether
people think diversity training should be a requirement or not? I can look into how people
responded to that question.”
Emily looked satisfied that Mei-Lin brought up the survey. “That’s a great idea, Mei-Lin.
Actually, you might also want to take a look at whether there is any relationship between
how people think about diversity training, to make it a requirement or not, and their atten-
dance at the last training. If people who attended the last training tend to say we’d better
make the training a requirement, that means we can count on them to support the idea to
make the training a requirement.”
On the way from Emily’s office back to her cubicle, Mei-Lin thought she might need
help with this. She pulled out her cell phone. “Hello? Leo? When are you coming into the
office?”

Mary’s Case
Ever since Mary started to think about interviewing volunteers, she had
been observing each individual volunteer more closely. As a program
manager, who was also responsible for managing volunteers, Mary
knew the volunteers pretty well, especially the active ones, but with the
new lens as a researcher trying to select interviewees, Mary started to
pay attention to things she did not notice before. Who is taking a
leadership role among the volunteers? Who seems to know a lot about
the volunteers? Which volunteers are more actively engaged at Health
First, and which are not? Who hangs out with whom? Where do they live?
In one of the weekly managers’ meetings, people started to talk about increased work-
load, and naturally the conversation shifted into the problem of the declining number of
volunteers and the challenges of recruiting new volunteers.
“Not again,” Mary sighed to herself. She had been in numerous discussions exactly like
this. She stayed quiet, so the managers would not start asking her how things were going
with the volunteer recruitment and retention efforts. Even if they asked, she would not have
much to offer at this point.
Instead of turning to Mary, the managers started sharing observations about the volunteers.
“There are definitely more women.”
242  ❖  SECTION II  DATA ANALYSIS

“That’s for sure, but I think we have more men than we used to.”
“Yeah, more retired men.”
“It feels like most of the women are from the west side of town.”
“Do you think so? I didn’t notice. I wonder why that’s the case.”
“Maybe they have more money and time to volunteer?”
In the past, Mary would have just tuned out. She did not like participating in gossiping
about volunteers. But today, she listened with a new interest. She had been thinking about
the same things. She thought to herself, “I wonder if it is true that more volunteers are
women from the west side. It may be worth analyzing it in the volunteer profile data. And
if that’s the case, I should also ask questions in the interview to find out why women from
the west side are volunteering. It may give me some ideas to do some target recruiting.”

Chi-Square Analysis

In the last chapter, we focused on a statistical approach to examine the relationship between two continuous variables. When the two variables under consideration are categorical, a different statistical approach is necessary. In this chapter, we will focus on examining the relationship of categorical variables with a statistical approach called chi-square (Welkowitz, Cohen, & Lea, 2012).
In Emily’s case above, a new question has arisen: Is there a relationship between
whether or not people think the diversity training should be required (yes/no) and
whether they attended the recent diversity training (yes/no)? Survey responses provide
the variables to help answer the question. Both variables are categorical—and in this
case dichotomous, with only two options (yes/no).
In Mary’s case, she also has a new question that she is able to answer with available
data from the volunteer lists: Is there a relationship between the volunteer gender
(male/female) and the area where they live (East/West)? Mary also has a pair of categorical, or dichotomous, variables for her analysis.
Chi-square is the appropriate statistical test to use for both of these questions. Chi-square is referred to as a two-way contingency table analysis because it uses a contingency table (here, a 2 × 2 table) that cross-tabulates the frequency of the cases within each possible combination of categories when the two variables are combined: yes/yes, yes/no, no/yes, no/no (Agresti, 2013). In the two-way contingency table, the rows represent the grouping of one variable, and the columns represent the grouping of the second variable.
Table 12.1 provides an example of a contingency table for the two variables in
Emily’s question.
Chi-square analysis belongs to a family of statistical analyses called nonparametric tests (Corder & Foreman, 2009). All statistical tests covered in this book, except for chi-square analysis, are parametric tests, which means the statistical analysis is based on the assumption that the underlying population data are normally distributed and that the measures used for the data are continuous (interval or ratio scales). Nonparametric tests apply to categorical data and do not require a normal distribution of the data. This makes sense, because the values assigned to categorical

Table 12.1  Two-Way Contingency Table for People's Attitude Towards Making Diversity Training Required Versus Diversity Training Attendance

                                     "Diversity training should be required"
                                       Yes           No            Total
"Attended the            Yes           49            69            118 (50.2%)
 diversity training"     No            40            77            117 (49.8%)
                         Total         89 (37.9%)    146 (62.1%)   235 (100%)

variables are arbitrary (e.g., female = 1, male = 2) and the numeric distance between them has no meaning.
Although chi-square analysis does not rely on a normal distribution, it still examines the relationship between variables in terms of statistical significance and determines whether the observed frequencies across the different categories differ from what would be expected to occur by chance. We still use the same approach to hypothesis testing. In Emily's case, her null hypothesis and research hypothesis appear as follows:

H0: There is no relationship between people's views on whether or not diversity training should be required and their training attendance.

HR: There is a relationship between people's views on whether or not diversity training should be required and their training attendance.

Calculating Chi-Square Statistics and Testing Statistical Significance
The chi-square analysis is based on a comparison between what is observed in your data set and what would be expected by chance. The formula for the chi-square statistic (depicted as χ²) is as follows:

Formula 12.1 Calculating Chi-Square

χ² = Σ [(Observed frequency − Expected frequency)² / Expected frequency]

This formula shows that the chi-square statistic is calculated by subtracting the expected frequency from the observed frequency for each cell, squaring the difference, dividing it by the expected frequency, and then adding the resulting scores from all cells (Agresti, 2013; Corder & Foreman, 2009).
The observed frequency is the actual frequency of occurrences for each category
cell. The expected frequency is what occurrences would be expected if the distribution
among the cells is left to chance.
For any cell, the expected frequency is calculated by the following formula:

Formula 12.2 Calculating Expected Frequency

Expected Frequency = (Row total × Column total) / Overall total

Table 12.2 shows the observed frequencies for a 2 x 2 table. Table 12.3 shows how
the expected frequencies can be obtained for the 2 x 2 table shown in Table 12.2.
Statistical software will calculate chi-square statistics, but the formula is also simple
enough that you could do the calculation by hand with the following shorthand formula:

Table 12.2  Observed Frequencies for 2 × 2 Contingency Table

                              Variable 1
                              Grouping 1    Grouping 2    Total
Variable 2    Grouping 1      a             b             a + b
              Grouping 2      c             d             c + d
              Total           a + c         b + d         a + b + c + d = N

Table 12.3  Expected Frequencies for 2 × 2 Contingency Table

                              Variable 1
                              Grouping 1              Grouping 2              Total
Variable 2    Grouping 1      (a + b)(a + c) / N      (a + b)(b + d) / N      a + b
              Grouping 2      (c + d)(a + c) / N      (c + d)(b + d) / N      c + d
              Total           a + c                   b + d                   a + b + c + d = N

where N = a + b + c + d.

Formula 12.3 Shorthand Formula for Calculating Chi-Square

χ² = [(ad − bc)² (a + b + c + d)] / [(a + b)(c + d)(b + d)(a + c)]

After obtaining the chi-square statistic, you can then check to see if the result is statistically significant. As with other statistical tests, if the p-value is equal to or lower than .05, then you can reject the null hypothesis and conclude that there is a statistically significant relationship between your two categorical variables. If the p-value is larger than .05, then you do not reject the null hypothesis and conclude that there is no statistically significant relationship between the groupings of the two variables. (SPSS will give you a warning if you run a chi-square analysis with an expected frequency score of less than 5.)
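The arithmetic can be verified with a short sketch in Python (our illustration language here; the book's own procedures use SPSS and Excel). Using the observed counts from Table 12.1 (a = 49, b = 69, c = 40, d = 77), the shorthand formula reproduces the chi-square statistic reported later in this chapter, and for a 2 × 2 table (1 degree of freedom) the p-value can be obtained with the complementary error function.

```python
import math

# Observed counts from Table 12.1:
#   a = attended & "should be required",  b = attended & "should not"
#   c = did not attend & "should",        d = did not attend & "should not"
a, b, c, d = 49, 69, 40, 77
n = a + b + c + d  # overall total (235)

# Shorthand formula (Formula 12.3) for a 2 x 2 table
chi_square = ((a * d - b * c) ** 2 * n) / ((a + b) * (c + d) * (b + d) * (a + c))

# With 1 degree of freedom, the chi-square p-value reduces to
# P(chi2 > x) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(chi_square / 2))

print(round(chi_square, 3))  # 1.344, matching the SPSS and Excel output
print(round(p_value, 3))     # 0.246 -> fail to reject the null hypothesis
```

Because .246 is larger than .05, the sketch reaches the same conclusion as the software output discussed below.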

Note on Sample Size for Chi-Square Analysis


One important thing to remember in conducting a chi-square analysis is that it requires an expected frequency of at least 5 in each category cell to meet the requirements of the analysis. Green and Salkind (2010) note that when the number of category cells is larger, the validity of the test should be questioned when more than 20% of the cells have an expected frequency of less than 5. Therefore, using a chi-square analysis with a very small sample size is not advisable, and attention needs to be paid to the expected frequency of the cells when determining the overall sample size.
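This rule of thumb can be checked programmatically. The following Python sketch (the helper names are ours, not the book's) computes each cell's expected frequency from the row and column totals (Formula 12.2) and flags a table when more than 20% of cells have an expected frequency below 5.

```python
def expected_frequencies(observed):
    """Expected counts for an r x c table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    overall = sum(row_totals)
    return [[r * c / overall for c in col_totals] for r in row_totals]

def violates_sample_size_rule(observed, minimum=5, max_share=0.20):
    """True if more than 20% of expected cell frequencies fall below 5."""
    cells = [e for row in expected_frequencies(observed) for e in row]
    low = sum(1 for e in cells if e < minimum)
    return low / len(cells) > max_share

# Table 12.1 easily satisfies the rule: its smallest expected count is about 44.3
print(violates_sample_size_rule([[49, 69], [40, 77]]))  # False
```

A table built from only a handful of cases, by contrast, would be flagged, which is exactly why very small samples are not advisable for chi-square.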

Running Chi-Square Analysis Using Software Programs

When Leo heard from Mei-Lin about the question of making diversity training a requirement, and whether employees' views on that idea differed depending on whether they attended the training, he became immediately curious. He turned on his
computer and booted up SPSS. Variable Q26 was the answer to the survey question,
“Should the diversity training be required?” and the response category was Yes = 1, No = 2.
Variable Q27 was the answer to the survey question, “Did you attend the diversity
training offered?” and the response category was Yes = 1, No = 2. Leo thought, “Both
are categorical variables. So I cannot use the Pearson product moment correlation.
This needs to be a chi-square analysis.”

Running Chi-Square Using SPSS


Here’s the procedure Leo used to run the chi-square analysis using SPSS.

1. Select Analyze → Descriptive Statistics → Cross Tabs.


2. Select Q26 (Should the diversity training be required) as the row variable and
Q27 (Did you attend the diversity training) as the column variable.

Figure 12.1   Menu Selection for SPSS Cross Tab

Figure 12.2   Input Variables for SPSS Cross Tabs

3. Click Statistics and select Chi-Square.


4. Click on Cells and select Observed and Expected in the Counts box. In the Percentages box, select Row. Click Continue.
5. Click OK.

Figure 12.3   Selecting Statistics in SPSS Cross Tabs

Figure 12.4   SPSS Cross Tab Options



Figure 12.5 shows the contingency table provided in the output for the chi-square
analysis, using Cross Tabs function in SPSS. The Count row shows the observed fre­
quencies, and the Expected Count row shows the expected frequencies.
Figure 12.6 shows the chi-square statistics under the row marked Pearson Chi-Square. The result indicates that the chi-square statistic is 1.34, with degrees of freedom equal to 1. (Note: Degrees of freedom are calculated as [number of rows − 1] × [number of columns − 1]; for a 2 × 2 table, that is 1.) The p-value, in the Asymp. Sig. (2-tailed) column, is .246. This means that the p-value is higher than the significance level of .05, and therefore, the null hypothesis cannot be rejected. In other words, the result indicates that there is no statistically significant relationship between people's views on the diversity training requirement and their training attendance.

Figure 12.5   Output From SPSS Cross Tabs

Figure 12.6   SPSS Chi-Square Output



Running Chi-Square Using Excel


Chi-square analysis can also be conducted using Excel. It is, however, a little more
laborious than SPSS. Part of the reason is that Excel does not automatically provide the
expected counts; therefore, to obtain the chi-square statistic and its associated p-value, the user must calculate the expected counts separately and then apply Excel's chi-square functions. With the same hypothesis and
the same survey variables (Q26 & Q27), the chi-square analysis can be conducted in
Excel with the following steps:

 1. Copy the two variable columns (Q26 and Q27) from the data file and paste them into a new Excel workbook.


 2. To create output that is easier to interpret, and to avoid errors while constructing the cross-tab table, create two more columns next to the two variables pasted from the original data file. The numerical values that designate the categories need to be transformed into labels; that way, Excel will perform counts rather than automatically summing the numerical values.
 3. Create a new column labeled, Required Training.
 4. Create another new column labeled, Attended Training.
 5. To turn the numerical values into labels, use an IF statement rather than changing them by hand.
 6. In Cell C2, enter the following formula: =IF(A2=1,"yes","no"), then hit return. This tells Excel that if the value of A2 is 1, it should display the label "yes"; otherwise, it will display "no."
 7. Activate this cell and drag its fill handle down to the last row to copy the formula into all the cells below.
 8. Repeat the process for column D, starting with cell D2. This time enter the formula =IF(B2=1,"yes","no"). This does the same thing as step 6; it just references a new cell so that the values of the second variable are also changed to labels. The data file should now look similar to Figure 12.7.

 9. The next step is to create a pivot table (a cross tab). Click Insert → PivotTable.
10. In the Create PivotTable dialog box, there will be a Table/Range field. Highlight C1:D236 so that range appears in the box, then click OK.
11. There will be a pivot table field list containing the two variables.
12. Place Required Training in the Column Labels box.
13. Place Attended Training in the Row labels box.
14. Click Required Training and drag it into the ∑ Values section.

The result of the above procedure should produce a contingency table that
looks like Figure 12.8.
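The counting that the pivot table performs can be mimicked in a few lines of Python (a sketch with made-up responses, not the book's data file): tally each (required, attended) label pair, which is exactly the cross-tab cell count.

```python
from collections import Counter

# Hypothetical recoded survey responses (1 = yes, 2 = no), as in Q26/Q27
required = [1, 1, 2, 2, 1, 2, 2, 1]
attended = [1, 2, 1, 2, 1, 2, 1, 2]

label = {1: "yes", 2: "no"}
pairs = Counter((label[r], label[a]) for r, a in zip(required, attended))

# Cell counts of the 2 x 2 contingency table (observed frequencies)
for cell in [("yes", "yes"), ("yes", "no"), ("no", "yes"), ("no", "no")]:
    print(cell, pairs[cell])
```

Each printed count corresponds to one cell of the pivot table in Figure 12.8; summing the counts recovers the overall total.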

Figure 12.7   Reconfiguring Excel Data File

The pivot table calculates the cell counts (observed frequencies) for the basic 2 x 2
contingency table for the two variables. The chi-square test also requires the expected
frequencies, so we need to create an expected frequencies table. Figure 12.9 shows the
contingency table for expected frequencies. The user has to construct this table manually, using the formula we showed earlier (Formula 12.2): (Row Grand Total × Column Grand Total) / Overall Grand Total.

1. In Cell B27 enter the formula, =(B20*D18)/D20. This references the observed col­
umn grand total (146), the row grand total (117), and the overall grand total (235).
2. For Cell C27, enter the formula =C20*D18/D20.
3. For Cell B28, enter the formula =B20*D19/D20.
4. For Cell C28, enter the formula =C20*D19/D20.

Now that the expected frequencies are calculated, you can use the chi-square func­
tion in Excel. To do this, perform the following:

1. Click on any cell in the output area; this example arbitrarily uses Cell B32. In the cell, enter the following formula: =CHITEST(B18:C19,B27:C28). This formula calculates the p-value for the chi-square test by comparing the observed and expected frequencies. The function produces a p-value of .246.
2. To obtain the chi-square statistic, activate cell B33 and enter the formula =CHIINV(B32,1). The corresponding result should be 1.344.

Figure 12.8   Excel Pivot Table

Figure 12.9   Observed and Expected Counts Output in Excel

Chapter Summary
This chapter introduced chi-square analysis (or two-way contingency table analysis) to test the relationship between two categorical variables. Chi-square is the most common of the nonparametric statistical tests, which do not assume a normal distribution in the data. Chi-square analysis examines the frequencies for each combination of categories in the two variables. By calculating the expected frequencies (those that would occur by chance) and comparing them to the actual observed frequencies, the test determines whether there is a statistically significant relationship between the two categorical variables.

Review and Discussion Questions and Exercises


1. Write a short memo from Leo to Emily reporting the result of the chi-square analysis.
2. Conduct an analysis that addresses Mary’s question as to whether there is any relationship
between the volunteers’ gender and the area in which they live.
3. Write a report summarizing the result of the analysis you conduct based on Mary’s question.
4. Explain why you use frequencies to analyze two categorical variables.
5. Explain how the relationships you examine are the same or different in correlation and chi-
square analysis.
6. What are the limitations to chi-square analysis? How are these related to sample size?

References
Agresti, A. (2013). Categorical data analysis. Hoboken, NJ: Wiley.
Corder, G. W., & Foreman, D. I. (2009). Nonparametric statistics for non-statisticians: A step-by-step approach.
Hoboken, NJ: Wiley.
Green, S. B., & Salkind, N. J. (2010). Using SPSS for Windows and Macintosh: Analyzing and understanding
data (6th ed.). Upper Saddle River, NJ: Prentice Hall.
Welkowitz, J., Cohen, B. H., & Lea, R. B. (2012). Introductory statistics for the behavioral sciences. Hoboken,
NJ: Wiley.

Key Terms

Chi-Square Analysis 242
Nonparametric Test 242
Parametric Tests 242
Two-Way Contingency Table Analysis 242

Student Study Site


Visit the Student Study Site at www.sagepub.com/nishishiba1e for these additional learning tools:

•• Data sets to accompany the exercises in the chapter
•• Result write-ups
13  ❖  Regression Analysis

Learning Objectives 255
Predicting Relationships 255
Emily’s Case 255
Mary’s Case 256
Linear Regression Analysis 257
Regression Equation and Regression Line: Basis for Prediction 258
Assessing the Prediction: Coefficient of Determination (R²) 262
Assessing Individual Predictors: Regression Coefficient (b) 265
Running Bivariate Regression Using Software Programs 265
Running Bivariate Regression Using SPSS 265
Running Bivariate Regression Using Excel 269
Multiple Regression 270
Multicollinearity 271
Using Dummy Variables in the Multiple Regression 271
Running Multiple Regression Using Software Programs 273
Running Multiple Regression Using SPSS 273
Running Multiple Regression Using Excel 277
Mary’s Case 278
Brief Comment on Other Types of Regression Analyses 278
Chapter Summary 279
Review and Discussion Questions and Exercises 279
Key Terms 280

253

Figure 13.1 Scatterplot of the Volunteers’ Income Level and the Volunteer
Hours 259
Figure 13.2 Scatterplot of the Volunteers’ Income Level and the
Volunteer Hours with Regression Line 260
Figure 13.3 Relationship Between the Regression Equation and the
Visual Representation of Regression Line 261
Figure 13.4 Visual Representation of the Total Sum of Squares (SST) 263
Figure 13.5 Visual Representation of the Sum of Squares (SSR) 264
Figure 13.6 Menu Selection for Linear Regression 266
Figure 13.7 Input Variables for Linear Regression in SPSS 266
Figure 13.8 Bivariate Linear Regression Model Summary Output From SPSS 267
Figure 13.9 Bivariate Linear Regression ANOVA Output From SPSS 267
Figure 13.10 Bivariate Regression Coefficients SPSS Output 268
Figure 13.11 Input Variables for Bivariate Regression in Excel 269
Figure 13.12 Bivariate Regression Output From Excel 270
Figure 13.13 Menu Selections for Linear Regression 273
Figure 13.14 Input Variables for Multiple Regression in SPSS 274
Figure 13.15 Statistics Options for Linear Regression in SPSS 274
Figure 13.16 Multiple Regression Model Summary SPSS Output 275
Figure 13.17 Multiple Regression ANOVA SPSS Output 276
Figure 13.18 Multiple Regression Coefficients SPSS Output 277
Table 13.1 Dummy Variable Coding 272
Formula 13.1 Basic Equation for the Regression Line 259
Formula 13.2 The Formula for the Slope (b) of a Regression Line 261
Formula 13.3 The Formula for the Intercept (a) of a Regression Line 261
Formula 13.4 Calculating the Coefficient of Determination (R²) 262
Formula 13.5 Calculating the Total Sum of Squares (SST) 263
Formula 13.6 Calculating the Total Sum of Residuals (SSR) 264
Formula 13.7 Equation for Multiple Regression 270
Formula 13.8 Equation for Multiple Regression With Categorical
Gender Variable 271

Formula 13.9 Equation for Multiple Regression With Categorical Gender


Variable and Dummy Coded Region Variable 273
Formula 13.10 Regression Equation That Predicts Volunteer Hours 276



Learning Objectives

In this chapter you will

1. Understand and use bivariate and multiple linear regression analysis


2. Understand the concept of the regression line and how it relates to the regres-
sion equation
3. Understand the assumptions behind linear regression
4. Be able to correctly interpret the conceptual and practical meaning of coeffi-
cients in linear regression analysis
5. Be able to use SPSS and Excel to conduct linear regression analysis

Predicting Relationships

Emily’s Case
“It was a great conference,” Leo exclaimed as he slipped into the back-
seat of Emily’s car.
Mei-Lin agreed enthusiastically as she got in the front passenger
side. “This was really good. Thank you, Emily.”
“My pleasure,” Emily replied with a laugh as she settled behind the wheel. “People liked Leo, don’t you think?”
Mei-Lin turned toward the back. “I was so proud of you!”
“It’s true,” Emily continued, “you did a great job making the statis-
tical analysis on the impact of the diversity training understandable.
HR professionals are not usually into statistics, but I think they liked it.”
As they drove back to Westlawn, they talked about the things they learned at the con-
ference. Emily and Mei-Lin were particularly interested in a presentation where the speaker
talked about the cumulative effects of training and employee education. The point was that
a one-time training is not enough to make an impact on employee development. The
speaker emphasized the importance of having a long-term strategic plan for training and
employee education and to track the results.

“She made a good point,” Mei-Lin argued. “We need to continue the diversity trainings
if we really want to make an impact.”
Emily smiled. “We just need to secure the resources to keep it going.”
Leo leaned forward between them. “You know, it occurs to me that if this ‘cumulative effect’ of training is real, it should show up in our survey data.” Neither Emily nor Mei-Lin responded, so Leo explained what he meant. “We have a question that asks
how many diversity trainings the employee has attended in the past. We had quite a
few who responded. I wonder if we can predict the level of cultural competence by the
number of diversity trainings they attended in the past. If it’s cumulative, then the level
of cultural competence should be higher with the people who attended more trainings,
right?”
Mei-Lin picked up on the significance of the idea first. She was writing a proposal to
justify a new round of diversity training, and this looked like a useful piece of evidence.
“That’s a great idea, Leo.” Mei-Lin responded. “I would really like to see what that looks
like. Is it easy to run that analysis?”
“Hey, he’s Leo,” Emily joked. They all laughed.
“Sure, I can do that pretty quickly,” Leo confirmed. He was already curious. “I’ll get on it
tonight.”

Mary’s Case
Mary was a little frustrated that nobody at Health First showed much interest in
her research-based approach to volunteer recruitment and retention. She needed a
sounding board. Yuki, her grad-school friend who headed the research department
at one of the major foundations in the area, would certainly understand. She had
helped Mary get started on this project. Mary sent Yuki an invitation and met her
the next day at a coffee shop about halfway between their two offices.
“It’s such a nice day, let’s sit outside,” Yuki said, holding a latte in her hand.
As soon as they sat down at a metal table outside the coffee shop, Yuki jumped
right to the topic she knew was on Mary’s mind. “So, how’s your research on the
volunteers going?”
Typical Yuki style, Mary thought. No “how’s your parents?” or “how’s your boyfriend?” or
“how’s your dog?” niceties. This was one thing she liked about Yuki. She smiled apprecia-
tively as she responded.
“I read your qualitative research books. Thanks for loaning them to me. I want to keep
them a little longer, if you don’t mind.”
“That’s fine,” replied Yuki.
Mary sipped her coffee and continued, “I’ve been spending a lot of time thinking about
whom I should interview and what questions I should ask.”
Yuki nodded and said, “As you should be.”
“In the meantime, I obtained data from HR on the background of the volunteers, and
I’ve been analyzing it.” Mary told Yuki about the correlation and chi-square analyses she
had conducted using the volunteer profile data.

Yuki answered with a knowing look: “Not surprising you are doing statistics. You like
quantitative data. Sounds like you are getting interesting results.”
Encouraged by Yuki’s interest, Mary shared her thoughts on another analysis she was
thinking about. “At Health First, we don’t have much money to put into a major volunteer
recruitment campaign, so we need to focus our resources on the most efficient ways to
recruit volunteers.” Mary paused to be sure Yuki was following.
“Go on,” Yuki encouraged.
“We don’t have very much information about the current volunteers, but we do ask when
they start how many hours they are willing to put in for their volunteer work. It appears
from feedback from other managers that the number of hours the volunteers actually work
is pretty close to what they said they would work. So—”
Yuki leaned forward, anticipating the punch.
“—I thought, in addition to increasing the number of volunteers themselves, I should
focus on volunteers who are willing to work more hours. That gives us a better return on our
investment.”
Yuki laughed and said, “I can’t believe you use a phrase like ‘return on investment’ about
the volunteers. You were always so opposed to the business approach in nonprofit manage-
ment.” She saw Mary was a little startled. “But what you say makes sense.”
Yuki looked down, stirred her latte, and continued, “So, I suppose this idea of yours
means you have a new plan for your research?”
“You are right on.” Mary was glad she had a friend she could talk to about research
without worrying about coming across too geeky. She opened up: “Literature suggests that
people who have more income tend to volunteer more. I don’t know if that’s the case with
the volunteers in our region, but I was thinking about running a regression analysis that
predicts volunteer hours with our volunteers’ level of income.”
“Interesting idea,” Yuki nodded. “What other volunteer background information do you
have? Do you know their age and gender, and anything else? You can run a multiple regression and see which volunteer background information predicts volunteer hours significantly and also has the strongest relationship with the volunteer hours.”
“Brilliant idea, Yuki!” Mary exclaimed. She pursued Yuki’s suggestion, and the two
friends blithely slipped into geekdom, talking about regression analysis. They did not notice
their drinks getting cold.

Linear Regression Analysis


In Chapter 11, we introduced a way to examine the relationship between two continuous variables. In this chapter, we will build on this idea with an analytical tool called linear regression analysis, which uses correlation as a basis to predict the value of one variable from the value of a second variable or from a combination of several variables.
In regression analysis, the variable that the researcher intends to predict is the dependent variable (sometimes called the outcome variable or criterion variable). Typically the notation “Y” is used to describe the dependent variable. The variable that the analysis uses to predict the value of the dependent variable is the independent variable (sometimes called the predictor variable). The notation “X” is used to describe the independent variable. Linear regression analysis provides information about the strength of the relationship between the dependent variable and the independent variable. When there is only one independent variable in the regression analysis, it is called bivariate (or simple) linear regression analysis. When there are two or more independent variables involved in the analysis, it is called multiple regression analysis.
In Mary’s case, she is considering using bivariate linear regression analysis to predict volunteer hours (dependent variable) with the volunteers’ income level (independent variable). Yuki suggested a multiple regression analysis to predict volunteer hours (dependent variable) with not only the income level, but also age, gender, and other information that might be available on the volunteers (independent variables). By examining the relative strength of the relationship of each independent variable with the dependent variable, Mary can identify the kind of volunteers she needs to maximize volunteer hours.
As with all statistical tests we introduced in this book, linear regression analysis is based on a set of assumptions (Fox, 1991; Kahane, 2008), as follows:

1. Linearity: The relationship between the dependent variable and the independent variables is linear in nature.
2. Normality: The dependent variable is measured as a continuous variable and is normally distributed. The basic form of linear regression also assumes that the independent variables in the linear regression are continuous and are normally distributed. There are ways, however, to incorporate and interpret categorical independent variables in the regression analysis as dummy variables.
3. Homoscedasticity: The word homoscedasticity is derived from the Greek homo for same and skedastikos for dispersion (Merriam-Webster, 2012). It means having the same variance. This assumption requires the degree of random noise in the dependent variable to remain the same regardless of the values of the independent variables.
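The dummy-variable idea mentioned under the normality assumption can be sketched in Python (hypothetical category values; the book treats dummy variables in detail later in this chapter): a categorical variable with k groups is recoded into k − 1 indicator (0/1) variables, so arbitrary category codes never enter the regression as if they were numeric distances.

```python
def dummy_code(values, reference):
    """One 0/1 indicator column per non-reference category (k - 1 dummies)."""
    categories = sorted(set(values) - {reference})
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}

# Region with three groups -> two dummy columns, "east" as the reference group
region = ["east", "west", "north", "west", "east"]
print(dummy_code(region, reference="east"))
# {'north': [0, 0, 1, 0, 0], 'west': [0, 1, 0, 1, 0]}
```

The reference group is the one all dummy coefficients are compared against; it gets no column of its own.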

Regression Equation and Regression Line: Basis for Prediction


Let’s use Mary’s example to illustrate the logic of prediction. Starting with her first question, she wants to predict volunteer hours (dependent variable) based on the volunteers’ level of income (independent variable). Basically, prediction means estimating an unknown outcome based on a known outcome (Upton & Cook, 2011). For Mary to predict the unknown volunteer hours using income level information, she can use the known pattern of the relationship between volunteer hours and income level. So let’s look at the pattern of the relationship between volunteer hours and income level with a scatterplot (Figure 13.1).
Chapter 13  Regression Analysis  ❖  259

Figure 13.1   Scatterplot of the Volunteers’ Income Level and the Volunteer Hours

[Scatterplot: income on the X-axis, hours_week on the Y-axis]

On the scatterplot, the value of the dependent variable is plotted on the Y-axis and
the value of the independent variable is plotted on the X-axis. You can draw a line through the scatterplot that minimizes the overall distance between the line and the actual data points. This is called a regression line. As you can see in Figure 13.2,
once you identify the regression line, then you can use the line to estimate what the
dependent variable (Y) would be if you know the value of the independent variable
(X). A regression line is also called the line of best fit (Munro, 2004), because it is the
line that best represents the pattern of the relationship between the dependent variable
and the independent variable.
Once we identify the pattern of the relationship between the dependent variable Y
and the independent variable X as a regression line, then we can describe the line with
a formula. The basic equation for the regression line is as follows:

Formula 13.1 Basic Equation for the Regression Line

Y = a + bX
260  ❖  SECTION II  DATA ANALYSIS

Figure 13.2   Scatterplot of the Volunteers’ Income Level and the Volunteer Hours With Regression Line

[Scatterplot with regression line, R² Linear = 0.631: income on the X-axis, hours_week on the Y-axis; the line gives the estimated value of Y for a known value of X]

Where
Y = the dependent variable (value on the vertical axis)
X = the independent variable (value on the horizontal axis)
a = the point where the regression line crosses the Y-axis, called the intercept (the value of Y when X is zero)
b = the slope of the regression line, indicating how much the Y value changes when there is a one-unit change in the value of X. It indicates the strength of the relationship between X and Y (the regression coefficient).

The relationship between the equation and the visual representation of the regres­
sion line is presented in Figure 13.3.
The slope of the regression line can be positive (+) or negative (−). When the slope is positive, the line goes up toward the upper right corner. When the slope is negative, the line goes down toward the lower right corner. In other words,

Figure 13.3   Relationship Between the Regression Equation and the Visual Representation of Regression Line

[Scatterplot with regression line Y = a + bX, R² Linear = 0.631: income on the X-axis, hours_week on the Y-axis, annotated with the intercept (a) and the slope b = ΔY/ΔX]

the slope also indicates the direction of the relationship between X and Y. The intercept
(a) and the slope (b) can be calculated based on the value of X and Y.
The formula for the slope (b) is:

Formula 13.2 The Formula for the Slope (b) of a Regression Line

b = [ΣXY − (ΣX)(ΣY) / n] / [ΣX² − (ΣX)² / n]

Once you have the slope (b) then you can use it to calculate the intercept (a):

Formula 13.3 The Formula for the Intercept (a) of a Regression Line

a = (ΣY − bΣX) / n

Of course, you don’t have to calculate the slope and the intercept by hand. SPSS and Excel can do the calculation for you. In Mary’s example, it
turned out that the intercept (a) is 1.05 and the slope (b) is .90 (which you will see in
the SPSS and Excel output). That means, in Mary’s example, the regression equation is:

Y = 1.05 + .90 X

Once Mary obtains this regression formula, she can plug in a volunteer’s income
level and predict how many hours this particular person is likely to volunteer. For
example, with this regression equation, Mary can expect that if a volunteer reports an
income level as “5” ($60,000 to $70,000), then the estimated number of volunteer
hours will be 5.55 hours per week, as shown in the calculation below.

Y = 1.05 + .90 X
= 1.05 + .90 * 5
= 1.05 + 4.5
= 5.55
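The hand calculation above can also be sketched in a few lines of Python. This is only an illustration: the income and hours values below are hypothetical stand-ins, not Mary’s actual volunteer data.

```python
# Illustrative sketch of Formulas 13.2 and 13.3: computing the slope (b) and
# intercept (a) of a bivariate regression line, then predicting with Y = a + bX.
# The data below are hypothetical, not Mary's actual volunteer dataset.

def fit_line(x, y):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    # Formula 13.2: b = [sum(XY) - sum(X)sum(Y)/n] / [sum(X^2) - sum(X)^2/n]
    b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    # Formula 13.3: a = (sum(Y) - b*sum(X)) / n
    a = (sum_y - b * sum_x) / n
    return a, b

income = [1, 2, 3, 4, 5, 6]              # hypothetical income categories
hours = [2.0, 2.9, 4.1, 4.8, 6.2, 6.9]   # hypothetical weekly volunteer hours

a, b = fit_line(income, hours)
predicted = a + b * 5  # predicted hours for a volunteer at income level 5
print(round(a, 2), round(b, 2), round(predicted, 2))
```

Plugging a known income level into the fitted equation, exactly as Mary does with Y = 1.05 + .90X, yields the predicted weekly volunteer hours.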

Assessing the Prediction: Coefficient of Determination (R2)


Once we identify the regression line, it is important to assess how well it predicts an
outcome from the basis of a known variable. You can see from the scatterplot that the
dispersion of the points will affect how accurate the estimate is likely to be. With this
predictive model, we calculate a coefficient of determination (R2) to measure how
much of the variance in one variable is explained by variance in another.
R2 is obtained by examining first how much the actual score in the dependent
variable differs from the mean. This gives us a familiar measure of variance, with a
total sum of squares (denoted SST). Then we measure how much the actual score in
the dependent variable differs from the value estimated by the regression equation.
This is called a residual sum of squares (denoted SSR).
The formula for R2 is:

Formula 13.4 Calculating the Coefficient of Determination (R2)

R² = 1 − SSR / SST

From the formula, you can see that R2 will take a value between zero and 1. The
closer the R2 is to 1, the better the prediction. When R2 is 1, the regression equation has
a perfect prediction.
Let’s unpack these concepts a little. In Mary’s case, she could get an idea of how
many volunteer hours to expect from her volunteers by looking at the mean. However,
the actual hours put in by most of the volunteers will probably not equal the mean. The
difference between the mean and the actual volunteer hours represents what we called
deviance in the discussion of measures of central tendency and variance in Chapter 7.


Here we think of the same concept as an error in prediction when using the mean to
predict. Remember earlier that we used a sum of squares to measure variance, because
otherwise the sum of the plus and minus differences from the mean cancel each other
out and always add up to zero. We use the same procedure here. This produces a total
sum of squares (SST), as represented in the following formula (Formula 13.5) and
illustrated in Figure 13.4:

Formula 13.5 Calculating the Total Sum of Squares (SST)

SST = Σ (Observed value − Mean)²

The regression equation identifies the line that minimizes the distance between
the line and the observed values. The regression line offers a more sophisticated
approach for prediction than just using the mean, but the prediction still does not
perfectly match the observed values. There is still some inaccuracy. The differences
are referred to as the residuals, or the error in prediction. Similar to deviance, sum­
ming up the residuals will result in a zero value because the directions of the differ­
ences cancel out. Therefore, we square the residuals before we add them all up to
capture the overall error in prediction in the regression equation. This total is referred

Figure 13.4   Visual Representation of the Total Sum of Squares (SST)

[Scatterplot: income on the X-axis, hours_week on the Y-axis, showing the mean value of Y and each point’s deviance from it; Total Sum of Squares = Σ (Observed value − Mean)²]

to as the residual sum of squares (SSR), as represented in the following Formula 13.6 and illustrated in Figure 13.5:

Formula 13.6 Calculating the Residual Sum of Squares (SSR)

SSR = Σ (Observed value − Predicted value)²


The assessment of how well the regression equation predicts one outcome from another can be determined by calculating R2. In the formula for R2, we see that the quotient SSR/SST, the residual sum of squares (SSR) over the total sum of squares (SST), will equal 1 if SSR and SST are exactly the same, meaning R2 (=1−SSR/SST) will be zero. This result would indicate that the prediction using the regression equation is no different from the prediction using the mean and did not improve the prediction.
When the SSR is smaller than the SST, then SSR/SST will be less than 1, and R2 (=1−
SSR/SST) will be greater than zero, meaning the prediction using the regression line is
incrementally better than the prediction using the mean. When R2 is closer to 1, the
prediction is better (Field, 2009).
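The building blocks of R² described here (SST, SSR, and the quotient between them) can be sketched in Python as well. The x and y values below are hypothetical, not Mary’s data.

```python
# A minimal sketch of Formulas 13.4-13.6: build SST and SSR, then R^2 = 1 - SSR/SST.
# Hypothetical data; x is the predictor and y the outcome.

x = [1, 2, 3, 4, 5, 6]
y = [2.0, 2.9, 4.1, 4.8, 6.2, 6.9]
n = len(x)

# Fit the least-squares line y = a + b*x (Formulas 13.2 and 13.3).
b = (sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n) / (
    sum(xi ** 2 for xi in x) - sum(x) ** 2 / n)
a = (sum(y) - b * sum(x)) / n
fitted = [a + b * xi for xi in x]

mean_y = sum(y) / n
sst = sum((yi - mean_y) ** 2 for yi in y)               # Formula 13.5
ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # Formula 13.6
r_squared = 1 - ssr / sst                               # Formula 13.4
print(round(r_squared, 3))
```

Because the fitted line tracks these hypothetical points closely, SSR is small relative to SST and R² lands near 1; with noisier data it shrinks toward zero.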
R2 can also be explained as a measure of association between the dependent vari­
able and the independent variable. It indicates the proportion of variance explained in

Figure 13.5   Visual Representation of the Residual Sum of Squares (SSR)

[Scatterplot with regression line, R² Linear = 0.631: income on the X-axis, hours_week on the Y-axis, showing each observed point’s residual from the line; Residual Sum of Squares = Σ (Observed value − Estimated value)²]

the dependent variable by variance in the independent variable. When R2 is zero, that
means none of the variance is shared between the two variables. They are unrelated.
When R2 is 1, which would only be possible if the residual sum of squares (SSR) equaled zero, then 100% of the variance is shared.
of the value of one variable would be possible by knowing the value of the other.
Intermediate values for R2 provide a good measure of the degree of the relationship
between the independent and dependent variables (Pedhazur, 1997).
The null hypothesis for R2 would state that there is no relationship between the
independent and dependent variables. We can test the null hypothesis by calculating
an F-statistic, as we did with ANOVA in Chapter 10. If the result of the test is signifi­
cant, with the p-value below .05, then we reject the null hypothesis that R2 is zero and
accept the research hypothesis that R2 is significantly different from zero, and there is
a relationship between the independent and dependent variables in the population
(Cohen, 2010).

Assessing Individual Predictors: Regression Coefficient (b)


In the regression equation, the independent variable X that we use to predict the value
of Y has a coefficient (b). In the bivariate regression analysis, where there is only one
independent variable X, the value of b represents the slope of the regression line. It
indicates the change in the dependent variable Y, when there is a one-unit change in
the independent variable X. When the regression coefficient b is zero, then a unit
change in the value of the independent variable X results in no change in the depen­
dent variable Y. In the regression analysis, we can conduct a t-test to test the null
hypothesis that the regression coefficient b is zero. If the result of the t-test is signifi­
cant, with a p-value below .05, then we reject the null hypothesis that b is zero and
accept the research hypothesis that b is significantly different from zero. This means
the independent variable X significantly contributes to the value of the dependent
variable.
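As a rough illustration of this t-test, the statistic t = b / SE(b) can be computed by hand. The data below are hypothetical, and in practice SPSS or Excel reports the t value and its p-value for you; 2.776 is the two-tailed .05 critical value for the 4 degrees of freedom (n − 2) in this toy example.

```python
import math

# Sketch of the t-test for the regression coefficient b: t = b / SE(b),
# where SE(b) = sqrt(residual variance / sum of squared deviations of x).
# Hypothetical data, not Mary's dataset.

x = [1, 2, 3, 4, 5, 6]
y = [2.0, 2.9, 4.1, 4.8, 6.2, 6.9]
n = len(x)
mean_x = sum(x) / n

b = (sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n) / (
    sum(xi ** 2 for xi in x) - sum(x) ** 2 / n)
a = (sum(y) - b * sum(x)) / n

# Residual sum of squares from the fitted line
ssr = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
se_b = math.sqrt((ssr / (n - 2)) / sum((xi - mean_x) ** 2 for xi in x))
t = b / se_b

# With n - 2 = 4 degrees of freedom, |t| > 2.776 rejects the null that b is zero.
print(round(t, 1), abs(t) > 2.776)
```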

Running Bivariate Regression Using Software Programs


Let’s take a look at Mary’s case to see if she can predict volunteer hours by volunteer
income level, using a bivariate regression analysis. We will go through the procedure
in SPSS and then in Excel.

Running Bivariate Regression Using SPSS


The following steps outline how Mary will examine the relationship of volunteer hours
to level of income with a bivariate regression analysis in SPSS:

1. Open Mary_Volunteer_profile.sav
2. Click Analyze → Regression → Linear.

Figure 13.6    Menu Selection for Linear Regression

3. Move the variable hours_week into the Dependent Variable box.


4. Move the variable income into the Independent Variable box.
5. Click Statistics and check the Descriptives box.
6. Click OK.

Figure 13.7    Input Variables for Linear Regression in SPSS



There will be multiple tables in the output, including the descriptive statistics of
the variables in the analysis. The table labeled Model Summary (Figure 13.8) includes
information about R, R square (R2), and Adjusted R Square.

Figure 13.8    Bivariate Linear Regression Model Summary Output From SPSS

R is the square root of R2. We introduced R in Chapter 11 as the Pearson product moment correlation coefficient, indicating the strength and the direction of the linear
relationship between the dependent variable (volunteer hours) and the independent
variable (income level). In Mary’s data, volunteer hours and volunteer income level are
positively correlated, and the strength of the relationship is strong at .795.
R-Square (R2) in Mary’s analysis is .631, which suggests that volunteer income
level explains 63.1% of the variance of their volunteer hours. This indicates that the
relationship between volunteer income level and volunteer hours is moderately strong.
Adjusted R-Square (R2) adjusts the value of R2 when the sample size is small,
because an estimate of R2 obtained when the sample size is small tends to be higher
than the actual R2 in the population. The rule of thumb is to report adjusted R2 when
it substantially differs from R2 (Green & Salkind, 2010). In this analysis, the difference
is very small (adjusted R2 = .625). Therefore, Mary can report the unadjusted R2.
The SPSS output table labeled ANOVA (Figure 13.9) provides the results of a test of
significance for R and R2 using the F-statistic. In this analysis, the p-value is well below
.05 (p < .001). Therefore, Mary can conclude that the R and R2 between volunteer hours and the volunteer’s income level are statistically significant (different from zero).

Figure 13.9    Bivariate Linear Regression ANOVA Output From SPSS



The table in the SPSS output labeled Coefficients (Figure 13.10) provides informa­
tion that is useful for understanding the regression equation. Under the column
marked Unstandardized Coefficient and sub-column B, the numerical value on the first
row, labeled (Constant), is the value for the intercept (a) in the regression equation.
The numerical value on the second row, labeled as Income in this case (representing
the independent variable), is the value for the slope (b) for the regression equation.
Based on these results, Mary can report the following regression equation, predicting
volunteer hours based on level of income.

Y (Volunteer hours) = 1.05 + .895X (income level)

Taking these values for the slope and intercept in the resulting regression equation,
we can make the following statement: According to the intercept, when income is zero,
the average number of hours will be 1.05, and according to the slope, for each addi­
tional unit change in the income level (by defined income categories), the volunteer
hours (per week) will increase by .895 hours. Notice in the table that the p-value is
repeated here (p < .001).
Under the column Standardized Coefficient and the sub-column Beta, the value
shown in the second row indicates the slope (b) when the independent and dependent
variables are converted into scores that have a mean of zero and a standard deviation
of 1 (scores with these properties are called z-scores). This standardized regression
coefficient β (Beta) is useful when making comparisons of the relationship between the
variables when the units of measurement are different. We will discuss this concept
further below in the section on multiple regression.

Figure 13.10    Bivariate Regression Coefficients SPSS Output



Running Bivariate Regression Using Excel


The bivariate regression analysis can be conducted using Excel with the following steps:

1. Open the Data Analysis window and choose regression.


2. Click in the Input Y Range to activate.
3. Highlight cells C1 through C61.
4. Once you are finished highlighting these cells, C1:C61 will appear in the Input
Y Range box.
5. Click on Input X Range to activate it.
6. Highlight cells F1 through F61.
7. Again, after highlighting, they should appear in the box.
8. Be sure to click Labels.
9. Specify your output range and click OK.

Your window should look similar to the Figure 13.11 below.

Figure 13.11    Input Variables for Bivariate Regression in Excel



The output from Excel appears below in Figure 13.12. Note that the p-value in this
case is reported in scientific notation as a very small value, which we can interpret as
in the SPSS output as p < .001.

Figure 13.12    Bivariate Regression Output From Excel

Multiple Regression
Multiple regression is an extension of bivariate regression. Rather than having only one
independent variable in the regression equation, multiple regression includes more
than one independent variable in the equation. By incorporating more than one
independent variable in the analysis, multiple regression predicts the dependent
variable taking multiple factors into account. It also examines the effect of each
independent variable on the dependent variable while holding the effect of other
variables constant. In other words, multiple regression identifies the unique contribution
of the individual independent variables, while controlling for the effects of other
independent variables.
The regression equation remains essentially the same for multiple regression,
appearing as follows (the subscripts identify additional variables):

Formula 13.7 Equation for Multiple Regression

Y = a + b1X1 + b2X2 + … + biXi

In conducting multiple regression analysis, it is important to think carefully about what independent variables should be included in the analysis (Allison, 1999).
An effort should be made to include all relevant independent variables in explaining
the dependent variable, and there should be a good theoretical basis for the inclusion
of each variable. Additional independent variables should explain differences in the
dependent variable that the other independent variables do not. All the independent
variables included in the analysis, in combination, should predict the dependent
variable better than any one of the independent variables alone.
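The multiple regression equation in Formula 13.7 can be estimated with ordinary least squares outside SPSS or Excel too. Below is a sketch using NumPy; the income, age, and hours values are hypothetical stand-ins for Mary’s variables.

```python
import numpy as np

# Sketch of multiple regression (Formula 13.7): Y = a + b1*X1 + b2*X2.
# A column of 1s in the design matrix carries the intercept (a).
# Hypothetical data standing in for income (X1) and age (X2).

x1 = np.array([1, 2, 3, 4, 5, 6], dtype=float)        # income level
x2 = np.array([25, 40, 33, 51, 62, 47], dtype=float)  # age
y = np.array([2.0, 2.9, 4.1, 4.8, 6.2, 6.9])          # weekly volunteer hours

design = np.column_stack([np.ones_like(x1), x1, x2])
coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
a, b1, b2 = coefs  # intercept and the two regression coefficients

fitted = design @ coefs
sst = ((y - y.mean()) ** 2).sum()
ssr = ((y - fitted) ** 2).sum()
r_squared = 1 - ssr / sst  # variance explained by the linear combination
print(round(r_squared, 3))
```

The linear combination of the predictors can never explain less variance than either predictor alone, which is why each added variable needs a theoretical justification rather than being added to inflate R².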

Multicollinearity
In addition to the assumptions for the linear regression analysis noted earlier, in mul­
tiple regression analysis, there is one more important assumption that needs to be met.
In multiple regression, independent variables included in the analysis should not have
a strong linear relationship to each other. When there is a strong relationship among
the independent variables it is referred to as multicollinearity. When there is multi­
collinearity, the two independent variables already share much of the information
about the dependent variable and the analysis will not be able to distinguish the effects
of one over the other (Allison, 1999; Norusis, 2009).
One way to examine if there is multicollinearity among the independent variables is to
run correlations of all pairs of independent variables. When the correlation is high (rule of
thumb is above .8), there is a likelihood that you have multicollinearity. SPSS will conduct
a diagnosis for multicollinearity by computing what is called a variance inflation factor
(VIF). The general rule of thumb is when any VIF is greater than 10 there is a multicol­
linearity problem (Stevens, 2009). (Some researchers suggest using 5 to be conservative.) If
SPSS indicates there is a multicollinearity problem, examine the direct correlation between
each pair of independent variables and take out one from a pair that has a high correlation.
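The VIF diagnostic can also be sketched directly: regress each independent variable on the others and take 1 / (1 − R²). A Python illustration with hypothetical predictor data follows; SPSS produces the same diagnostic for you.

```python
import numpy as np

# Sketch of the variance inflation factor (VIF): for each independent variable,
# regress it on the remaining independent variables and take 1 / (1 - R^2).
# Rule of thumb: VIF above 10 (or 5, conservatively) flags multicollinearity.

def vif(X):
    """X: 2-D array with one column per independent variable."""
    vifs = []
    for j in range(X.shape[1]):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(len(target)), others])
        coefs, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coefs
        r2 = 1 - resid.var() / target.var()
        vifs.append(1.0 / (1.0 - r2))
    return vifs

# Hypothetical predictors (income level and age for six volunteers)
X = np.array([[1, 25], [2, 40], [3, 33], [4, 51], [5, 62], [6, 47]], dtype=float)
vifs = vif(X)
print([round(v, 2) for v in vifs])
```

With only two predictors the two VIFs are identical, since each is based on the same pairwise R².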

Using Dummy Variables in the Multiple Regression


As previously mentioned, a basic premise of linear regression analysis is that the vari­
ables are continuous. Yet, there are research questions that hypothesize categorical
variables—such as race, gender, political party affiliation—may affect the variance in
the dependent variable. Including a categorical variable in the analysis may make the
prediction of the dependent variable more accurate. Linear regression analysis allows
the inclusion of categorical independent variables as dummy variables.
Dummy variables take a value of 0 or 1. The value 0 indicates the absence of the
attributes of the category, and the value 1 indicates the presence of the attribute of the
category. For example, gender has two attributes, male and female. As a dummy vari­
able, male could be designated as 0, and female as 1. In the regression equation, then,
the coefficient for the dummy variable would indicate how the female attribute (1) has
an effect on the dependent variable in contrast, or in reference, to the male attribute
(0). The category designated as 0 in the dummy variable is called the reference group.
In Mary’s case, she is considering a second analysis that examines multiple volun­
teer characteristics to predict volunteer hours (dependent variable), including income
level (independent variable 1), age (independent variable 2), and gender (independent
variable 3). Notice that gender is a categorical variable. In this case, gender can be
added as a dummy variable to the regression equation as follows:

Formula 13.8 Equation for Multiple Regression With Categorical Gender Variable

Y (Volunteer hours) = a + b1 X1 (income level ) + b2 X2 (age) + b3 X3 (gender)

In interpreting the regression coefficients in this equation, the value of a indi­


cates the intercept, or mean volunteer time for male volunteers (the reference
group), when (hypothetically) the volunteer has no income and is zero years old. In
other words, the intercept represents the value of the dependent variable when the
values of all the independent variables are zero. We expect a relationship between
volunteer time (dependent variable) and income level (independent variable 1)
indicated by b1 and the relationship between volunteer time (dependent variable)
and age (independent variable 2) indicated by b2. We expect these relationships to
be the same for both male and female volunteers (independent variable 3). The
coefficient b3 indicates the mean difference in the dependent variable between the
group coded as 1 (female) and the reference group (male). When the regression
coefficient for the dummy variable gender is significant, it means the difference in
the mean volunteer time between male and female volunteers is significantly differ­
ent from zero.
Creating a dummy variable for a categorical variable with more than two attributes
is more complicated. For example, if Mary wanted to include a region variable to indi­
cate the part of town in which the volunteers live, she would have four categorical
attributes (or groupings): North, South, East, and West. Including region in a regres­
sion analysis would require three dummy variables as follows:

Dummy Variable 1 (North): North = 1, Other region designation = 0


Dummy Variable 2 (South): South =1, Other region designation = 0
Dummy Variable 3 (East): East =1, Other region designation = 0

Notice that we only need to define three of the four regions. In this case, West is
designated as the reference group, with a value of 0, for all three of the created dummy
variables. With a categorical variable like this with multiple attributes, all the dummy
variables need to be entered as a block. The coding in this example is summarized in
Table 13.1.
Table 13.1  Dummy Variable Coding

         Dummy Variable 1   Dummy Variable 2   Dummy Variable 3
         (North)            (South)            (East)

North    1                  0                  0
South    0                  1                  0
East     0                  0                  1
West     0                  0                  0

If Mary adds these dummy variables in the regression analysis, the equation will appear as follows:

Formula 13.9 Equation for Multiple Regression With Categorical Gender Variable
and Dummy Coded Region Variable

Y (Volunteer hours) = a + b1 X1 (income level ) + b2 X2 (age) + b3 X3 (gender) +


b4 X 4 (North) + b5 X5 (South) + b6 X6 (East)
The interpretation of the regression coefficient is similar to the case described
above with a dummy variable for gender. When the regression coefficient (b4, b5, b6) for
the dummy variable is significant, it means the difference in the mean volunteer time
between the region represented by the dummy variable (North, South, East respec­
tively) and the reference group (West) is significantly different from zero.
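The coding scheme in Table 13.1 can be written out as a short sketch. The function below is hypothetical, but it reproduces the table exactly: each region gets a 1 on its own dummy, and West, the reference group, is 0 on all three.

```python
# Dummy coding for a four-category region variable (Table 13.1).
# West is the reference group, so it is coded 0 on every dummy.

REGIONS = ["North", "South", "East"]  # one dummy per non-reference category

def dummy_code(region):
    """Return the (North, South, East) dummy values for one volunteer."""
    return [1 if region == r else 0 for r in REGIONS]

print(dummy_code("South"))  # -> [0, 1, 0]
print(dummy_code("West"))   # -> [0, 0, 0]
```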

Running Multiple Regression Using Software Programs


Now let’s look at Mary’s case to see if she can predict volunteer hours better with
multiple independent variables in a multiple regression analysis, including volunteer
income level, age, and gender. We will go through the procedure in SPSS and Excel.

Running Multiple Regression Using SPSS


1. Open Mary_Volunteer_profile.sav
2. Click Analyze → Regression → Linear.

Figure 13.13    Menu Selections for Linear Regression

3. Move the variable hours_week into the Dependent Variable box.


4. Move the variable income, age, dummy_gender into the Independent
Variable(s) box.

Figure 13.14    Input Variables for Multiple Regression in SPSS

5. Click Statistics. The Estimates and Model Fit should already be selected as a default.
6. Click Collinearity diagnostics. (You can also click Descriptives if you want to
have the descriptive statistics.)

Figure 13.15    Statistics Options for Linear Regression in SPSS

7. Click Continue.
8. Click OK.

Just as in the bivariate regression output in SPSS, the table labeled Model Summary
(Figure 13.16) includes information about R, R square (R2), and Adjusted R Square. In this
case, with multiple regression, all three R values indicate the degree to which the linear
combination of the independent variables in the regression analysis predicts the dependent
variable. We will explain the idea of linear combination in the discussion below.

Figure 13.16    Multiple Regression Model Summary SPSS Output

In the multiple regression, the value of R is different from that in the bivariate regression. Here, it represents the Pearson product moment correlation coefficient between the observed value of the dependent variable and the predicted value of the dependent
variable using the regression equation. R for multiple regression is referred to as
Multiple R (Field, 2009). The characteristics of the metric are the same, with a range
from 0 to 1, a larger value indicating a larger correlation and 1 representing an equa­
tion that perfectly predicts the observed value of the dependent variable. Multiple R is
an indicator of how well the overall regression equation predicts the observed data. In
the current multiple regression analysis for Mary, the result of .799 indicates that the
linear combination of the three independent variables (income, age, and gender)
strongly predicts the actual dependent variable.
R Square (R2) indicates the proportion of variance that can be explained in the
dependent variable by the linear combination of the independent variables. The values
of R2 also range from 0 to 1. Mary’s analysis suggests that the linear combination of
volunteers’ income, age, and gender explains 63.9% of the variance in volunteer hours.
Note that this is a slight increase from the bivariate model, which was 63.1%.
Typically, anytime more variables are added to the regression equation, the value
of R2 increases. As a note of caution, adding variables haphazardly to increase the
explanation of the variance in the dependent variable is not a good research practice.
As noted earlier, each independent variable should be added with a purpose that comes
from the research question and the theory. Sometimes there is a tendency to treat
multiple regression analysis like making soup; the cook will add a bunch of leftovers
just because they are there and need to be used. This kind of arbitrary, nontheoretical
approach can produce misleading results (Baltagi, 2011).
Adjusted R Square (R2), as noted for the bivariate regression analysis, adjusts the
value of R2 to more accurately represent the population of interest when the sample
size is small. Also when there are a large number of independent variables included in
the multiple regression equation, it tends to produce a higher estimation of the R2 in the population, and therefore, Adjusted R Square adjusts the value. In Mary’s analysis, the adjusted R2 is 61.0%, more conservative than the unadjusted R2 of 63.9%. It is different enough from the unadjusted R2 to be worth reporting.
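The adjustment itself is not shown in the SPSS output, but the standard textbook formula is Adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1), where n is the sample size and k the number of independent variables. A sketch follows, using an R² similar to Mary’s (.639) with hypothetical sample sizes:

```python
# Sketch of the usual adjustment behind Adjusted R^2: it shrinks R^2 more when
# the sample is small or the number of predictors (k) is large.

def adjusted_r_squared(r2, n, k):
    """n = sample size, k = number of independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# With a small sample the penalty is noticeable...
print(round(adjusted_r_squared(0.639, 20, 3), 3))   # -> 0.571
# ...and it fades as the sample grows.
print(round(adjusted_r_squared(0.639, 200, 3), 3))  # -> 0.633
```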
The table labeled ANOVA in the SPSS output (Figure 13.17) provides the results of
a test of significance for R and R square using the F-statistic. In this analysis, the
p-value is well below .05 (p < .001), and therefore, Mary can conclude that R, R2, and Adjusted R2 for the multiple regression she conducted predicting volunteer hours based on the linear combination of income, age, and gender are statistically significant.
The information in the table labeled Coefficients in the SPSS output (Figure 13.18)
can be interpreted in the same way as we discussed in the bivariate regression section
above. It provides information that is useful for understanding the regression equation.

Figure 13.17    Multiple Regression ANOVA SPSS Output

Again, under the column marked Unstandardized Coefficient and sub-column B is the
value for the intercept (a) in the regression equation on the first row, labeled (Constant).
The numbers below it in the same column are the values for the regression coefficients for
income, age, and gender. Based on these results, the regression equation that predicts vol­
unteer hours based on the linear combination of income, age, and gender is as follows:

Formula 13.10 Regression Equation That Predicts Volunteer Hours

Y (volunteer hours) = 1.29 + .88X1 (income) + .01X2 (age) + (−.76)X3 (gender, reference = male)

This result indicates, first, that the intercept is 1.29 hours when all independent
variables have a value of zero. Then, moving through the equation, holding volunteer age
and gender constant, the volunteer hours (per week) increase by .88 hours for each addi­
tional increase in the income level. The p-value for this coefficient is statistically signifi­
cant (p < .001), meaning that volunteer income is a significant predictor of volunteer
hours. Holding income and gender constant, the volunteer hours increase by only .01
hours (per week), according to the equation, and this coefficient is not statistically signif­
icant (p = .695). Volunteer age is not a significant predictor of the volunteer hours.
Finally, the regression coefficient for the gender dummy variable, with male as the reference group, is −.76, which means that holding volunteer income level and age constant, female volunteers put in an average of .76 hours less (per week) than male volunteers. However, the p-value for gender is also not statistically significant (p = .333).

Figure 13.18    Multiple Regression Coefficients SPSS Output
As with the bivariate regression analysis, the values in the Coefficients table under the column Standardized Coefficient and sub-column Beta are the regression coefficients when the independent and dependent variables are converted to z-scores. In the multiple regression, this standardized regression coefficient Beta (β) is useful, because it
allows you to compare the relative strength of each independent variable’s relationship
with the dependent variable. In this case, the regression coefficients (b) provide you
with information on how much change can be expected with a one-unit change in each
independent variable, but they don’t tell you the relative strength of the relationship
between the dependent variable and each of the independent variables. With the Beta
values here, we can see in Mary’s analysis that income (.777) has the strongest relation­
ship with volunteer hours, compared to age (.035) and gender (−.079). Moreover, the Betas for age and gender are not statistically significant.
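The link between b and Beta can be sketched numerically: converting every variable to a z-score and refitting yields the same coefficients as rescaling each b by sd(X)/sd(Y). The data below are hypothetical stand-ins, not Mary’s dataset.

```python
import numpy as np

# Sketch of standardized coefficients (Beta): z-score all variables and refit,
# or equivalently rescale each unstandardized b by sd(X_j) / sd(Y).
# Hypothetical data for two predictors (income level, age) and hours.

X = np.array([[1, 25], [2, 40], [3, 33], [4, 51], [5, 62], [6, 47]], dtype=float)
y = np.array([2.0, 2.9, 4.1, 4.8, 6.2, 6.9])

# Unstandardized fit: intercept plus one b per predictor
design = np.column_stack([np.ones(len(y)), X])
coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
b = coefs[1:]

# Route 1: Beta via the scaling identity
betas = b * X.std(axis=0, ddof=1) / y.std(ddof=1)

# Route 2: Beta by z-scoring and refitting (centering removes the intercept)
Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
yz = (y - y.mean()) / y.std(ddof=1)
betas_z, *_ = np.linalg.lstsq(Xz, yz, rcond=None)

print(np.allclose(betas, betas_z))  # -> True
```

Because Betas share a common scale, they can be compared to one another, which is exactly how income is judged stronger than age or gender in Mary’s output.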
In the same table, the information under the column Collinearity Statistics and sub-column VIF indicates whether there is multicollinearity among the independent variables. In the current analysis, all VIF values are lower than 5, so Mary can be assured that there is no multicollinearity problem in her analysis. If any VIF were higher than 5, Mary would need to check the correlation of that particular variable with the other independent variables and eliminate one of the highly correlated variables.
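The quantities discussed above can also be computed directly, which may help demystify the SPSS output. The following sketch uses made-up data (not Mary's actual Health First dataset — the sample size, coefficients, and variable ranges are invented for illustration) to fit a multiple regression with numpy, derive standardized Beta coefficients, and compute each predictor's VIF by hand:

```python
import numpy as np

# Illustrative (simulated) data: weekly volunteer hours predicted by
# income level (1-7), age (years), and a gender dummy (1 = female, 0 = male).
rng = np.random.default_rng(0)
n = 200
income = rng.integers(1, 8, n).astype(float)
age = rng.uniform(20, 70, n)
gender = rng.integers(0, 2, n).astype(float)
hours = 2.0 + 0.9 * income + 0.01 * age - 0.5 * gender + rng.normal(0, 1, n)

X = np.column_stack([np.ones(n), income, age, gender])  # intercept column first
b, *_ = np.linalg.lstsq(X, hours, rcond=None)           # unstandardized b's

# Standardized coefficients (Beta): b scaled by sd(x)/sd(y), which is
# equivalent to regressing z-scores on z-scores.
beta = b[1:] * X[:, 1:].std(axis=0) / hours.std()

# Variance Inflation Factor: VIF_j = 1 / (1 - R^2_j), where R^2_j comes
# from regressing predictor j on the remaining predictors.
def vif(design, j):
    others = np.delete(design, j, axis=1)
    coef, *_ = np.linalg.lstsq(others, design[:, j], rcond=None)
    resid = design[:, j] - others @ coef
    r2 = 1 - resid.var() / design[:, j].var()
    return 1.0 / (1.0 - r2)

vifs = [vif(X, j) for j in (1, 2, 3)]
print("b:", b.round(2))
print("Beta:", beta.round(2))
print("VIF:", [round(v, 2) for v in vifs])
```

Because the simulated predictors are generated independently, the VIF values come out near 1 (well under the threshold of 5), and income shows the largest Beta, mirroring the pattern in Mary's output.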

Running Multiple Regression Using Excel


Running multiple regression analysis in Excel follows the same procedure that we used for bivariate regression. The only key difference is that the multiple independent variables need to be placed in columns that are contiguous to each other. Therefore, move the independent variables to be used in the analysis into columns next to each other, or copy a column so that it is contiguous to your other independent variable(s). When the Input X Range box is activated, highlight all of the columns of the independent variables (again, they must be contiguous).

Mary’s Case
Mary ran her finger over the output she obtained from her multiple regression
analysis to predict volunteer hours by income level, age, and gender. She talked
herself through it.
“OK, R square is about .64, so this regression equation accounts for about 64% of the variance, and it is significant. That’s not too bad. So it looks like this is a good regression equation model to predict volunteer hours.”
She then directed her attention to the regression coefficients.
“Hmm … so age and gender are not significant. That means age may not be a
good predictor for volunteer hours. And there may not be a generalizable difference
between male and female volunteers. Still, the coefficient for the gender dummy
variable is a fairly strong negative value, which suggests that female volunteers put
in less volunteer time when income level and age are constant. That’s interesting.”
Mary decided to check the descriptive statistics, comparing actual volunteer time for
male and female volunteers, and indeed, female volunteers had a lower average. It may not
be generalizable, she thought, but it was interesting, because it went against the assump-
tion she had heard that women put in more time.
The regression coefficient for income level was more startling.
“Wow, it’s .875 and significant. Income level clearly matters more than anything else.
Does that mean I should try to target higher-income volunteers?”
Somehow this conclusion did not sit well with her. Although the data definitely sug-
gested this relationship of income and volunteer hours, she wondered if there might be
other factors that were not captured in the volunteer background information—something
that coincided with higher income.
“I still think I need to interview volunteers and get their perspectives.”
Mary shut down SPSS and opened the list of volunteers she had marked for interviews.

Brief Comment on Other Types of Regression Analyses


In this chapter, we introduced two types of regression analysis that can be used when
the dependent variable is continuous. There is another type of regression analysis
called logistic regression, which can be used to predict the outcome when the
dependent variable is dichotomous. To learn more about logistic regression, see
Kleinbaum and Klein (2011), and Menard (2008).
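To give a flavor of what logistic regression estimates, here is a minimal sketch with invented data — a dichotomous outcome (whether a volunteer returns next year) predicted from income level. It fits the model by hand-rolled gradient ascent on the log-likelihood purely for illustration; a real analysis would use the maximum-likelihood routines of a statistics package:

```python
import numpy as np

# Invented data: probability of returning rises with income level (1-7).
rng = np.random.default_rng(1)
income = rng.uniform(1, 7, 300)
p_true = 1 / (1 + np.exp(-(-2.0 + 0.8 * income)))       # true logit: -2 + 0.8*income
returns = (rng.uniform(size=300) < p_true).astype(float)  # observed 1/0 outcome

X = np.column_stack([np.ones_like(income), income])
w = np.zeros(2)
for _ in range(10000):
    p = 1 / (1 + np.exp(-X @ w))                  # predicted probabilities
    w += 0.1 * X.T @ (returns - p) / len(returns)  # gradient of log-likelihood

print("intercept, slope:", w.round(2))  # estimates should land near -2 and 0.8
```

The fitted slope is on the log-odds scale: each one-unit increase in income multiplies the odds of returning by roughly e raised to the slope, rather than adding a fixed number of hours as in linear regression.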
Another variation of regression analysis that is commonly used by public and nonprofit managers or policymakers is time series analysis, which is useful when observing trends and making forecasts based on past observations at equally spaced time intervals. To learn more about time series analysis, see Brockwell and Davis (2002), and Ostrom (1990).
When you evaluate a policy or program, you have multiple observation points before and after an intervention, resulting in a time series that looks like the following notation, where O indicates observations and X indicates the implementation of the policy or program:

O1  O2  O3  O4  X  O5  O6  O7  O8

With this design, you can use an interrupted time series analysis. To learn more
about interrupted time series analysis, see McDowall (1980).
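A minimal segmented-regression version of this design can be sketched as follows. The eight observations below are invented numbers chosen so the level shift at the intervention is obvious; they do not come from any real program evaluation:

```python
import numpy as np

# Hypothetical series O1..O8: four observations before the intervention X
# and four after. A simple interrupted (segmented) regression estimates
# the underlying trend plus a level shift at the intervention point.
y = np.array([10.0, 11.0, 12.0, 13.0, 18.0, 19.0, 20.0, 21.0])  # O1..O8
t = np.arange(1, 9)                  # time index
post = (t >= 5).astype(float)        # 0 before X, 1 after X

X = np.column_stack([np.ones(8), t, post])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, trend, level_shift = coef
print("trend:", round(trend, 2), "level shift:", round(level_shift, 2))
# trend comes out as 1.0 per period; the intervention adds a level shift of 4.0
```

With this toy series the pre-intervention trend of 1 unit per period continues unchanged, and the coefficient on the post-intervention dummy isolates the jump attributable to the intervention.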

Chapter Summary
This chapter introduced bivariate and multiple linear regression analyses. Linear regression analysis
identifies a regression equation that allows a researcher to predict the scores of the dependent
variable based on the scores of one or more independent variables. It also provides information on
the strength of the relationship between the dependent variable and the independent variables.

Review and Discussion Questions and Exercises


1. Based on Emily's case description at the beginning of the chapter, run a bivariate regression analysis to answer Leo's question. Write a regression equation based on the result you obtained.
2. Are there any other independent variables that are appropriate to include in the analysis you
conducted in (1) above? Conduct a multiple regression analysis and report the result.
3. Create dummy variables for region in Mary’s data and conduct multiple regression analysis
with the dummy variables. (See Appendix A for the instructions on how to recode the variable
in SPSS to create dummy variables.)
4. Describe the importance of the multicollinearity assumption in linear regression.
5. Describe Total Sum of Squares and Residual Sum of Squares and how they relate to the Coefficient of Determination.
6. Describe the difference between standardized regression coefficient β and the unstandardized
regression coefficient b.
7. When is it appropriate to report adjusted R2?

References
Allison, P. D. (1999). Multiple regression: A primer. Thousand Oaks, CA: Pine Forge Press.
Baltagi, B. H. (2011). Econometrics (5th ed.). New York, NY: Springer.
Brockwell, P. J., & Davis, R. A. (2002). Introduction to time series and forecasting. New York, NY: Springer.
Cohen, J. (2010). Applied multiple regression/correlation analysis for the behavioral sciences. New York, NY:
Routledge.
Field, A. P. (2009). Discovering statistics using SPSS: (And sex and drugs and rock ‘n’ roll). Thousand Oaks, CA: Sage.
Fox, J. (1991). Regression diagnostics. Newbury Park, CA: Sage.

Green, S. B., & Salkind, N. J. (2010). Using SPSS for Windows and Macintosh: Analyzing and understanding
data (6th ed.). Upper Saddle River, NJ: Prentice Hall.
Homoscedasticity. (n.d.). In Merriam-Webster’s online dictionary (11th ed.). Retrieved from http://www.m-w
.com/dictionary/homoscedasticity
Kahane, L. H. (2008). Regression basics (2nd ed.). Thousand Oaks, CA: Sage.
Kleinbaum, D. G., & Klein, M. (2011). Logistic regression: A self-learning text (3rd ed.). New York, NY: Springer.
McDowall, D. (1980). Interrupted time series analysis (Vol. 21). Beverly Hills, CA: Sage.
Menard, S. (2008). Applied logistic regression analysis. Thousand Oaks, CA: Sage.
Munro, B. H. (2004). Statistical methods for health care research (5th ed.). Philadelphia, PA: Lippincott,
Williams, & Wilkins.
Norusis, M. J. (2009). PASW statistics 18 statistical procedures companion. Upper Saddle River, NJ: Prentice Hall.
Ostrom, C. W. (1990). Time series analysis: Regression techniques. Newbury Park, CA: Sage.
Pedhazur, E. J. (1997). Multiple regression in behavioral research: Explanation and prediction (3rd ed.). Fort Worth, TX: Harcourt Brace.
Stevens, J. P. (2009). Applied multivariate statistics for the social sciences (5th ed.). New York, NY: Routledge.
Upton, G. J. G., & Cook, I. (2011). A dictionary of statistics (2nd ed.). New York, NY: Oxford University Press.

Key Terms
Adjusted R Square (R2)  267
Bivariate or Simple Linear Regression Analysis  258
Coefficient of Determination or R-Square (R2)  262
Dependent Variable (Outcome Variable or Criterion Variable)  257
Dummy Variables  258
Homoscedasticity  258
Independent Variable  258
Intercept (a)  260
Interrupted Time Series Design  279
Linear Regression  257
Linearity  258
Logistic Regression  278
Multicollinearity  271
Multiple Regression Analysis  258
Normality  258
R (Multiple R)  275
Reference Group  271
Regression Coefficient (b)  260
Regression Line or Line of Best Fit  259
Residual Sum of Squares (SSR)  262
Residuals or Error in Prediction  263
Slope or X Coefficient  260
Time Series Design  278
Total Sum of Squares (SST)  262
Variance Inflation Factor (VIF)  271
Z-Score  268

Student Study Site


Visit the Student Study Site at www.sagepub.com/nishishiba1e for these additional learning tools:

• Data sets to accompany the exercises in the chapter
• Result write-ups
14  ❖  Qualitative Data Analysis

Learning Objectives 282
Collecting And Analyzing Qualitative Data 282
Emily’s Case 282
Mary’s Case 283
Qualitative Versus Quantitative Data Analysis 284
Approaches to Qualitative Data Collection 285
Preparing Data for Qualitative Analysis 285
Thematic Analysis of the Qualitative Data 286
Mary’s Case 287
Brief Comment on the Qualitative Data Analysis Software 289
Analyzing Qualitative Data by Converting Them Into Numbers 291
Mary’s Case 291
Issues in Qualitative Data Collection and Analysis 292
Selection of Study Participants 292
Interviewer Effect 293
Subjective Nature of the Analysis 294
Chapter Summary 294
Review and Discussion Questions and Exercises 295
Key Terms 296
Table 14.1 Mary’s Transcript With Yuki’s Suggested Coding 290
Table 14.2 Codes or Themes by Volunteer Matrix 291
Table 14.3 Codes or Themes by Volunteer Matrix in Numbers 292


Learning Objectives

In this chapter you will

1. Understand the basics of qualitative data analysis


2. Learn to use qualitative data to explore and describe a phenomenon of interest
3. Learn to conduct a thematic analysis to interpret qualitative data
4. Understand key issues to take into consideration when analyzing qualitative
data

Collecting And Analyzing Qualitative Data

Emily’s Case
“We’ve done a lot of work during the last nine months,” Emily told the team when
they met for their weekly meeting. “Now we need to confirm our next steps. Leo
did a great job summarizing the survey results. Mei-Lin, you managed a great
training. We received a lot of good feedback. I know you really want to keep it
going.”
Leo and Mei-Lin acknowledged the compliments with small nods. The pair
were uncharacteristically quiet, because they had an important guest.
“So, you know Bob, the city manager,” Emily continued. “I asked him to join
us for a few minutes this morning, so we can—” she looked at Bob and smiled,
“hit him up for money.”
Bob chuckled and gave a friendly greeting to all.
Emily started again and laid out the situation. “The Community Foundation gave us a
start-up grant for this year’s diversity training, and I think that gave us good momentum.
If we want to continue—and we think we should,” Emily gestured to include Leo and Mei-Lin
in the remark, “then we need to allocate a budget to include the diversity training as part
of the employee professional development program.”
Bob took the cue. Obviously, he had thought about the proposal and knew what he
wanted to say. “You’ve all done a very good job,” he started. “I think it’s a good idea. I will
certainly support your request. The data you gathered to evaluate the training helped con-
vince me.” Bob glanced at Leo and smiled appreciatively, then looked down at the summary
report in front of him on the table, with tables and two pie charts on the first page. He put
his finger on it. “The mayor likes numbers and statistics. I think he will understand that your
regression analysis suggests more diversity training is better for our employees.” Bob paused
to let that good news sink in.
“Some council members, however, are not number savvy like the mayor. They are more
people oriented and are better persuaded by stories. I suggest if you want to garner support from the city council members, you may also want to include comments and feedback from
the employees. For example, did you ask in your survey what employees think about the
impact of the diversity training? Are there any testimonials from the employees about the
importance of having an ongoing diversity training?”
The team nodded noncommittally. Emily answered, “We have some information like that.
I’m not sure if we have it in a data format to make a presentation, but we can work on it.”
“If you do have that kind of information,” Bob concluded, “I think it will help your bud-
get request.”
Emily recalled Leo showing her a list of employee feedback from the survey in response
to a question: “Are there any comments you would like to add?” The comments helped the
team get an idea of how the employees felt about the training—Leo printed them out and
they all discussed the comments at one of their meetings—but they couldn’t figure out what
to do with them, so they set them aside. Emily could see that Leo and Mei-Lin were thinking
of the same thing.
Mei-Lin spoke up. “We do have some information, but it may not be a bad idea to do
some follow-up interviews or focus group discussions with the employees, so we can get a
more diverse”—she grinned and stopped, and the others caught the word, “you know, a
broader range of perspectives and give more people an opportunity to offer comments that
we haven’t heard yet.”
“Do we have time for that?” Emily looked inquiringly around the table.
Leo spoke before the question could hang there. He realized Bob needed a firm conclu-
sion. “Let’s see what we already have and decide what more we need.” He looked at Emily,
“I’ll work on the survey responses,” then he turned to Mei-Lin, “and Mei-Lin and I can put
together a plan for what else we might need.”
Bob looked satisfied. “Sounds like you have it under control.” He started to get up. “I
do think it will help your case to turn your numbers into narratives, as they say.” He
reflected momentarily that he wasn’t sure who had said that, but he knew this team
would get it.

Mary’s Case
“Another latte?” Mary raised an eyebrow at Yuki. They were outside
again at the same table at the same coffee shop. She sat forward
conspiratorially, “They also have a great homemade apple pie. We
can share one.”
Yuki laughed, “Well that’s hard to resist. Sure. But if you keep
feeding me like this every time we meet to talk about your research,
pretty soon I won’t fit into my clothes.”
Mary went inside and returned shortly with pie and two lattes
clutched together in her hands. Once she had her fork, Yuki got back
to business, “So now, tell me more about your qualitative data collection. Last time you said
you were trying to decide who to interview and what you would ask. Have you started with
your interviews yet?”

“I actually have,” Mary replied. “I’m doing sort of a combination of purposive sampling
and snowball sampling to get my interviews. Purposive sampling in the sense that I am
‘purposively’ trying to get a broad representation of men and women, and those who have
been there a long time versus those who are fairly new, and people who live on the west
side and the east side. And snowball, because I’m also asking each interviewee if they know
anyone else who might have something to tell me.”
“Great!” Yuki exclaimed. “How many have you interviewed so far?”
“Five. I have a couple more lined up next week.”
“How’s that going?” Yuki knew Mary was a novice at qualitative data collection, so she
was a little concerned.
“It’s been a learning experience for me,” Mary said reflectively. “Since I didn’t have much
experience in interviewing, I decided to do a dry run with a couple of the volunteers whom
I know very well. I’m glad I did the pilot interview like the book suggested. I quickly found
out that I had too many questions. I didn’t realize how much people talk in responding to
one question. Initially, I prepared close to 20 questions for a one-hour interview. Obviously,
I couldn’t cover them all. I had to rush through to fit in even half that.” Mary smiled apol-
ogetically at Yuki. “After listening to the tapes of the pilot interviews, I reduced the number
of questions, and during the interview I tried to let the interviewee talk more freely. I started
using follow-up questions, such as ‘Tell me more about it’—like your book suggested. I think
I’m getting more insights and personal stories now.”
“Sounds like you are becoming quite an expert interviewer,” Yuki congratulated. “You
don’t seem to need any help from me on conducting interviews. So what’s bothering you?”
“Well …” Mary dabbed her fork in the whipped cream tousled on top of the pie, “I need
your guidance on how to analyze qualitative data. I have interviews, but I don’t know what
to do with it all.”
“Ahh—” Yuki acknowledged. She pressed her fork through the middle of the whipped-
cream curl down through the pie slice, and then scooped her half onto a separate plate.
“For that, I need a bit of bribery.”

Qualitative Versus Quantitative Data Analysis

With quantitative data, researchers can summarize results using statistics, as we have
demonstrated in the last several chapters. In contrast, qualitative data capture the
phenomena using words, statements, and sometimes visuals (Denzin & Lincoln, 2011).
This requires some kind of textual or thematic data analysis. For some purposes,
qualitative data can also be converted into numbers. In this chapter, we will review data
collection procedures (introduced in Chapter 6) and examine how to analyze
qualitative data by identifying key themes.
Qualitative data provide a richer description of the phenomena of interest than
can be accomplished with numbers. Narrative or graphic information is easier to com-
prehend as a direct representation of the phenomena. Numbers and statistics are more
abstract and require additional knowledge to understand what is represented.
As an example of this distinction, let's take Mary's regression analysis results introduced in Chapter 13. Her results suggested there was a moderately strong relationship between volunteer hours and volunteer income level. Compare this regression analysis
result with the following email message Mary received from one of the volunteers.

Dear Mary, I want to take a moment to thank you for the opportunities Health
First has given me. I love volunteering for Health First! I recently quit a cushy
job as a corporate lawyer. I have enough set aside and invested that I can pursue
things I care about: in this case, improving our health care system and helping
people in need. I meet people at Health First who share my passion, and that’s
wonderful! I intend to keep working at Health First, and if possible expand my engagement with other projects. I just want to thank you and Health First for the wonderful job you do. Let me know if you need any more volunteers.

The regression analysis directed attention to an association between income level and volunteer hours but did not give Mary any information about the personal motivations of volunteers to explain the how and why behind the association. On the other
hand, the brief email message does not inform Mary how much average volunteer
hours may increase with a higher income level. Qualitative and quantitative data
answer different questions about the phenomena of interest (Creswell, 2007; Giddens,
1990; Ziman, 2000). As we discussed in Chapters 2 and 3, the decision to use quanti-
tative or qualitative data in research needs to consider what answers are needed for the
research question and the specific objectives. As we saw above in Mary's and Emily's cases, practical research in the public and nonprofit sectors may be enhanced by incorporating both types of data.

Approaches to Qualitative Data Collection

We discussed different ways to collect qualitative data in Chapter 6, including open-ended questions in a survey—questions that start with Who, What, When, Where,
Why, and How—and through interviews, focus groups, and participant and
nonparticipant observation (DeWalt & DeWalt, 2002; Patton, 2002). All of these data
collection methods produce narratives or descriptive text. This is typically what we
mean by qualitative data. Collecting graphic materials or artifacts of one sort or
another may also be thought of as qualitative data, and these data could involve special
techniques and challenges, but the basic analytic approaches described below should
fit all qualitative data from any source. We will discuss the process of qualitative data coding, thematic analysis, and converting categories into numbers.

Preparing Data for Qualitative Analysis

You may find yourself, like Emily and Mary in the cases above, in charge of a stack of
raw text or recordings that are supposed to be qualitative data, but you are not sure
how to analyze them as data. Although it may not seem quite as daunting, note that we
faced a similar situation with quantitative data. Between the data collection and the
analysis, we needed to prepare the data by setting up a database with cases and variables (as discussed in Chapter 7), and we had to check for accuracy and be sure the
format and the variables we had available were suitable for the intended statistical
analysis. Quantitative data is a little easier to manage at this stage, because most of the
decisions about what to count were made before the data were collected. With qualitative data, the researcher may have planned just as carefully how to elicit responses from participants, but once the responses are in hand, it is not immediately clear how they qualify as data.
Before starting the process of working with qualitative data, audio recordings from
interviews and focus group discussions need to be transcribed. This is important to
document the narrative as source material and allow easy access for review by the
researcher or a second reader. Always plan for transcription time and costs in the
research process (Miles & Huberman, 1994).

Thematic Analysis of the Qualitative Data

One of the most common approaches to the qualitative data analysis is called thematic
analysis. This approach focuses on identifying themes that adequately represent the
data. Themes are key patterns identified in the data that may be important features of
the phenomenon in question, according to the purposes of the research question. A
researcher identifies themes by going through multiple examinations of the data. First,
familiarize yourself with the narrative by reading it repeatedly. At this stage, your
purpose is purely inductive, focusing on patterns that emerge from the data itself. On
further examination, you may take a deductive approach, looking for patterns that fit
a theoretical model of what you expect to find or issues you want to address. (We
discussed inductive and deductive approaches to research in Chapter 3.)
The next step involves documenting the patterns you find by generating initial
codes as labels for the recurring patterns. The labels attach a categorical meaning to
bits of text to represent a single concept, even though the specific examples may be a
little different from each other. Sometimes a single bit of text may be given several
codes to emphasize different parts, or to represent different interpretations of the
same parts.
Typically, researchers repeat the process of coding a few times. Coding involves
subjective judgment on the meaning of the text, and a second or third pass helps to
refine the codes as the researcher gains a better grasp of the patterns and combinations
of ideas involved in the raw materials. Judgment improves with experience.
Coding is the first step to systematically organize your qualitative data. Once you
have an array of codes, you can review them to search for themes in a similar way as
you did for the original text. A code itself is a kind of theme for the text contained
within it, and you can think of the themes as a new level of coding to identify broader
patterns in the data. Here, too, a theory or established conceptual framework can be
helpful to organize different levels of codes and themes in a coherent whole. Be aware,
however, that any initial framework is likely to change, due to the inherent properties
of the data. Thematic analysis can take several iterations before a satisfactory structure
is developed that adequately represents the available data.
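The mechanical part of coding — attaching labels to bits of text and tallying which codes co-occur — can be sketched in a few lines. The sketch below uses keyword matching purely as a stand-in; real coding rests on the researcher's judgment through repeated readings, not on keywords, and the codes, keywords, and excerpts here are all invented for illustration:

```python
# Hypothetical code-to-keyword mapping and interview excerpts.
codes = {
    "Friendship": ["friend", "fellow volunteer"],
    "Giving back": ["give back", "community"],
    "Relevant background": ["nurse", "social work", "skills"],
}

excerpts = [
    "I made so many friends among the fellow volunteers.",
    "I wanted to give back to my community after I retired.",
    "As a former nurse, I finally get to use my skills again.",
]

# Tag each excerpt with every code whose keyword appears in it.
tagged = {
    i: [label for label, words in codes.items()
        if any(w in text.lower() for w in words)]
    for i, text in enumerate(excerpts)
}
print(tagged)
# → {0: ['Friendship'], 1: ['Giving back'], 2: ['Relevant background']}
```

A spreadsheet column of codes next to the transcript accomplishes the same bookkeeping; what no tool automates is deciding which labels are meaningful and which codes group into themes.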

If you manage to develop a hierarchical outline of themes, sub-themes, and codes, it should be fairly easy to recognize key themes, partly by the resulting structure of
concepts and partly by the prominence of particular concepts that were frequently
mentioned. This is a little tricky, because you have to remember when dealing with
qualitative data that the number of times something is mentioned is not a definitive
mark of its importance as it would be with quantitative data. (We will discuss using
numbers with qualitative data in a separate section below.) In addition, you will
want to choose key themes that help answer your research question, yet also not
ignore other elements that might emerge in the analysis as an important theme for the
participants.
The results are really a story told by the data and the way you have organized it.
When you present the results of your qualitative research, you will want to discuss each
theme with enough evidence from the actual data—using direct quotations from the
original text—to capture the essence of the story in the experiences of real individuals.
Let’s look in on Mary’s case again to illustrate the points we just outlined in the
qualitative data analysis process.

Mary’s Case
“I transcribed the five interviews I’ve done so far,” Mary told Yuki,
once she was settled on a small couch in Yuki’s office. Mary needed
immediate help with her research on volunteer recruitment and reten-
tion at Health First, and she took the hour in the morning Yuki was
able to give her. She pulled a sheaf of papers from her satchel and
set them on the low table in front of Yuki, who was scrunched up
next to her on the couch. Only two friends could sit there together.
“I don’t know what to do with the transcripts,” Mary said with
resignation as Yuki picked up the pages. “I heard there are qualitative data analysis soft-
ware programs. I want one where I can plug in all these transcripts and click ‘analyze’ and
have it produce some kind of output like SPSS.”
Yuki sensed Mary’s downcast mood and recognized by the office visit, rather than latte
and pie, that this was time to be serious.
“I see you organized the transcripts by the questions you asked. That’s good,” Yuki said
encouragingly.
“I got that from the qualitative research methods book you loaned me. It said to orga-
nize the data and identify patterns. The easiest pattern I could see was by questions.”
Yuki knew that reading a textbook about qualitative data analysis is not the same as
actually doing it.
“So, you have a good start. What you do now is first read your transcripts a few times.
Look for key information that may be important for your research question and highlight
those portions of the text. Once you have the highlights, it will be easier to spot when you
see the same ideas coming up. Then you can start to add marks to identify the same point
when you see it repeated.”
Yuki glanced at Mary and saw she was not rallying.

“Let’s do one together. Let me grab a highlighter.”


Yuki reached over to a cup of pens on her desk, and then laid one of the pages of the
transcript on the table. “Take a look at this page. You have responses from two volunteers
on the same question: ‘What are the key things that kept you volunteering at Health First?’
Let’s read what they say.”
Yuki and Mary both skimmed through the transcript. Yuki followed with her finger and
deliberately went over the page twice. “You probably remember this, because you talked to
these people. Anything you notice here?”
“I remember all the volunteers talked about how they like the fellow volunteers and how
they make friends. I see that for both the people here,” Mary replied. She pointed to the
parts, and Yuki highlighted them.
“Anything else?”
“Well,” Mary answered slowly, “they say it a little differently, but they both talk about
wanting to give back to the community.” She pointed to the parts again, and Yuki high-
lighted them.
“There’s also, let’s see —here,” Mary pointed, “this one talks about how she was a nurse,
and she can ‘capitalize’ on her skill set, and this other one was into social work and feels
like she gets to use her skills, too. They are both educated and want to use their knowledge.”
Yuki highlighted some more. She then pulled out her pen and started writing short phrases in the right-hand column. “This was a good idea to make a narrow column of text, so not too much appears on each line, and you have enough space to write notes.”
“Got it from a model in the book,” Mary responded. She was perking up a little.
Yuki jotted down short phrases, such as “Like volunteers” and “Friends” and “Giving back
to community.” She then read through the transcript again, highlighted more parts, and
added short phrases in the margin. She turned the paper so Mary could see it.
“So, you just identified specific ideas and developed codes. This is a rough coding. What
I wrote in the margin here could be the labels for your codes. You can basically go through
the whole transcript and identify key points just like this. Maybe you will notice the same
thing said in different ways or with different particulars, like the one here with the nurse
and the social worker both representing a professional background. But don’t think too
much at first about how one person fits together with others. You can just keep highlighting
anything you think might be relevant. Then come up with a short phrase that captures the
meaning of the phrase you highlighted. You can go back several times and change the
name of the codes if you want to. That way you may find common ideas where two things
fit together under one code. You can keep adding or eliminating the codes. As you go
through this coding process, you will probably start noticing some themes. Write those
down, too. A theme summarizes or categorizes the codes into groups. Can you think of any
themes, based on what we just coded?”
Mary looked at the marked-up transcript. “Just what we’ve coded, I guess. There’s definitely a ‘Friendship’ theme. Everyone seemed to mention it. Then maybe a ‘Public service’ theme that refers to the sense of fulfillment and contributing to their community. I guess ‘Having a relevant background’ is another recurring theme.”
“That seems right,” Yuki said. “What you need to do is keep track of your themes and see how many different ideas, or codes, can be grouped under the same theme. There may be different aspects of it that you want to recognize with separate codes. Maybe you will
find a lot of ex-nurses, or social workers, or something else.”
Yuki glanced at Mary again and was relieved to see she looked interested.
“Finally,” Yuki concluded, “you will want to figure out what codes and themes help you
answer your research question, decide which ones are relevant or could be relevant. Once
you get that, your analysis is done!”
“Really, that’s it?” Mary felt like something was missing. “How do I know my analysis is
right? How can I be sure that what my interviewees say represents all the volunteers?”
Yuki was glad to see the familiar side of Mary’s quantitative thinking surfacing. She
knew she would need to address that eventually.
“All you want to do right now is tell a story from what you’ve been told,” Yuki said firmly.
“First, figure out what you have. Code your text. Analyze and organize your themes. Tell a
good story, with quotes. Make it relevant to what you want to know.”
Mary looked abashed. “All right. I can do that.”
“If you want to generalize,” Yuki continued, “well, that’s not really the same for qualita-
tive data as it is for quantitative data.” She smiled, “Do you remember reading about the
idea of ‘saturation’ in identifying the sample size in the qualitative data collection?”
Mary paused, reflected a bit, and nodded, “Yeah—”
“In qualitative data analysis, once you start seeing similar themes coming up again and again from your study sample, then you can be confident that these themes are generalizable,” Yuki continued. “But I can also show you some other things.”

A Brief Comment on Qualitative Data Analysis Software


Reading through your text on paper, as Mary and Yuki did, can be a productive way
to familiarize yourself with the embedded ideas (see Table 14.1). As Mary mentioned,
though, there are some qualitative data analysis software programs available that can
facilitate the coding process once you get going. Two popular programs are ATLAS.ti
and NVivo. These programs allow you to do more refined coding, and
also assist in identifying themes by helping organize the codes and visualize how they
relate with graphic representations. As you can imagine, the number of codes can be
fairly large, and organizing them can be difficult. A software program can make the
task much more manageable. But that’s all they do. The programs will not tell you
what codes you should use or how to code the data. Identifying the codes, applying
the codes to the data, and eventually identifying the key themes are the task of each
individual researcher. You will need to use your knowledge about the topic, your
familiarity with the data, and your analytical skill and judgment to determine the
codes and themes.
Coding can be done without using a specialized software program. Some research-
ers use a basic word processing or spreadsheet program, or just paper and pencil with
index cards, post-it notes and highlighters, and still do a good data analysis. As long as
you have a system for organizing and coding data, low-tech approaches can be as effec-
tive as specialized qualitative data analysis programs.
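If you keep your notes electronically, a low-tech system like the one described above can be as simple as a short script. The sketch below is ours, not a feature of any particular program; the codes, excerpts, and themes are illustrative, echoing Mary's practice session.

```python
# A minimal "system for organizing and coding data": map each code to the
# excerpts it was applied to, then group codes under themes.
from collections import defaultdict

coded_excerpts = defaultdict(list)

def apply_code(code, excerpt):
    """Record that an excerpt of transcript text was tagged with a code."""
    coded_excerpts[code].append(excerpt)

# Illustrative codes and excerpts from the practice session.
apply_code("Friends", "I met a few people here, whom I became good friends with.")
apply_code("Social network", "People I meet here are my main social network.")
apply_code("Relevant background", "I used to work as a nurse.")

# Themes group related codes, just as on the marked-up transcript.
themes = {
    "Friendship": ["Friends", "Social network"],
    "Having a relevant background": ["Relevant background"],
}

for theme, codes in themes.items():
    excerpts = sum(len(coded_excerpts[c]) for c in codes)
    print(f"{theme}: {excerpts} coded excerpt(s)")
```

Index cards and highlighters accomplish exactly the same thing; what matters is that codes, excerpts, and themes stay linked so they can be revisited and regrouped.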
290  ❖  SECTION II  DATA ANALYSIS

Table 14.1  Mary's Transcript With Yuki's Suggested Coding

Q3: What are the key things that kept you volunteering at Health First?
1 “I really like the other volunteers who work here. They are good Like volunteers
people. I get along so well with all of them. I met a few people here,
whom I became good friends with. I’ve been feeling somewhat Friends
isolated since my husband passed away five years ago. So people I Felt isolated/Widow
meet here are my main social network right now.” Social network

(That’s nice to hear that you made friends with other volunteers.
Any other things that keep you coming back to volunteer with us?)

“Well, the same reason I initially decided to come here to volunteer.


I believe the work Health First does is very important for our
community. I feel fulfilled when I spend time giving back to my Giving back to community
community. Of course, there are other ways I can give back to my
community. But I can’t do everything. And since I used to work as a
nurse, the work I can do at Health First allows me to capitalize on Relevant background
my skill sets.”
2 “Definitely the people! I love the other volunteers and the friends I Like volunteers
have made. When I started, I just moved to the area and I didn’t Friends
know anybody. Now I feel like I know everyone in town. I Felt isolated/new to the area
frequently run into my fellow volunteers, at the grocery stores, Social network
restaurants, the gym— they are everywhere. (Laugh). Oh, I love the
staff, too, you know. They are all such dedicated people. I really
admire them.”

(I like the people at Health First, too. Yes, they are great.)

“I can’t remember if I already told you this, but I have a degree in


social work, and before I moved here, I used to work in the public
health department. When I moved here, my kids were still very
young, and I didn’t want to go back to a full-time job. Luckily, my
husband makes enough money for us to live comfortably, so I didn’t
feel the urgency to go get a full-time job. But, like I said earlier, I
wanted to do something. Something that makes me fulfilled. I like Sense of fulfillment
dogs so at one point I tried volunteering at the Humane Society. It
was ok, and I liked the people over there, too, but somehow it was
different. It didn’t give me the sense that I was contributing to Contributing to the
society. I think I have a very strong sense of public service, and community
volunteering at Health First gives me the satisfaction that I’m
contributing to the public good. Plus, I can use my background in Relevant background
social work. Actually, once my younger one graduates high school,
I’m thinking of working full time again. It would be nice if I could Job opportunity
have full-time paid work here at Health First. Having volunteer
experience would help land a full-time position, right?”

Analyzing Qualitative Data by Converting Them Into Numbers

Qualitative data can also be analyzed by converting the coding into numerical
values. These numerical values can then be analyzed using quantitative data
analysis techniques to gain insights into the meaning of the data. You can use
descriptive statistics to get a sense of the prominence of certain ideas or even test
hypotheses using statistical approaches (Miles & Huberman, 1994; Trochim &
Donnelly, 2007).
Let’s eavesdrop on Mary’s case again to get an idea how to use quantitative analysis
with qualitative data.

Mary’s Case
Yuki explained to Mary how she could use numbers with her qualita-
tive research to help get a sense of how her results might represent the
population of volunteers in which she was interested. “If it makes you
feel more comfortable,” Yuki explained, “you can convert your coding
into numbers.”
Mary was intrigued, “How do I do that?”
Yuki retrieved a legal pad from her desk and started drawing a
table. “You create a data matrix like this, the same way you would
make a database in SPSS, with each row representing a case and the
columns representing your variables.”
Yuki titled the table in the upper left corner as Codes/themes, then labeled the first
column in the table as Volunteers and labeled the columns to the right with the three codes
or themes they identified in their practice session: Friends, Public service, and Relevant
background. She listed three cases: Volunteer 1, Volunteer 2, and Volunteer 3.
“OK, that should look familiar,” she said. “Now you look at each volunteer and make a
check mark under each code or theme they mention.” She added check marks in some of the
cells in the table to illustrate. (See Table 14.2.) “Think of your codes or themes as categorical
data, either yes or no, with a check mark counting as 1 and a blank cell counting as zero.”
Yuki quickly drew another matrix and put numbers in the cells. “Now you really
feel at home, right?” (See Table 14.3.)

Table 14.2  Codes or Themes by Volunteer Matrix

Codes or themes
Volunteers        Theme 1 (Friends)    Theme 2 (Public service)    Theme 3 (Relevant background)

Volunteer 1       ✓                                                ✓
Volunteer 2       ✓                    ✓                           ✓
Volunteer 3       ✓

Table 14.3  Codes or Themes by Volunteer Matrix in Numbers

Codes or themes
Volunteers        Theme 1 (Friends)    Theme 2 (Public service)    Theme 3 (Relevant background)
Volunteer 1       1                    0                           1
Volunteer 2       1                    1                           1
Volunteer 3       1                    0                           0

Mary smiled for the first time. She was still uncomfortable, though. “But I don’t have a
random sample. Plus, I asked questions, but I let people talk as long as they wanted about
anything they wanted to talk about. They didn’t all have an equal chance to comment on
some of these topics. I don’t really know if the things that get the highest count are really
the most important topics.”
“That’s right,” Yuki turned to look directly at Mary. “It’s good you recognize that.
Quantifying qualitative data will only give you a sense of prominence, what emerged spon-
taneously in a sample of the population; it’s not really prevalence as you would normally
think of it. It’s not exact.”
That said, Yuki relaxed and put a hand on Mary’s shoulder. “Anyway, remember that you
wanted to do long interviews so you could hear the stories. Don’t be embarrassed by the
stories now that you have them. If you rely too much on numbers, you lose the richness of
your data and miss the purpose of what you wanted to achieve. Numbers here just give you
another way to get a sense of what you heard.”
Mary brightened. Yuki could see she was starting to think about this new context for her
numbers proficiency.
“Got it,” Mary chimed. “Really, I think I got it. This is going to be fun.”
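The tallying Yuki demonstrates can also be done in a few lines of code. Python is our choice of tool here (the dialogue has Mary thinking in terms of SPSS); the 0/1 entries below are the ones from Table 14.3.

```python
# Code-or-theme by volunteer matrix from Table 14.3: rows are cases,
# columns are themes, 1 = mentioned, 0 = not mentioned.
theme_names = ["Friends", "Public service", "Relevant background"]
matrix = {
    "Volunteer 1": [1, 0, 1],
    "Volunteer 2": [1, 1, 1],
    "Volunteer 3": [1, 0, 0],
}

# Column totals give a rough sense of each theme's prominence in the sample.
prominence = {
    theme: sum(row[i] for row in matrix.values())
    for i, theme in enumerate(theme_names)
}
print(prominence)  # → {'Friends': 3, 'Public service': 1, 'Relevant background': 2}
```

As Yuki cautions, such counts indicate prominence, not prevalence, and they are no substitute for the stories behind them.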

Issues in Qualitative Data Collection and Analysis

A few issues discussed in earlier chapters for quantitative data require special attention
in relation to qualitative data. The details are different. Two issues relate to data
collection, regarding the population sample and the potential for bias. A third issue
relates to bias in the interpretation of codes and themes during the analysis.

Selection of Study Participants


In quantitative studies that use inferential statistics, the ability to generalize the study
results to the population of interest depends on probability sampling and the size of the
sample. Due to the nature of the data collection in qualitative research, sample sizes
tend to be smaller. Also, participants tend to be selected with nonprobability sampling

approaches to target particular sources of information (or because the sampling frame
is unknown). In any case, qualitative data is basically exploratory, and the richness of
the data prohibits a strict quantification of every possible input. Even with probability
sampling, the open-ended form of data collection would make each individual unique
and no longer equally likely to respond to any one particular issue.
In Mary’s case, we saw that she has a genuine concern about the generalizability of
her results. Even if she interviewed every volunteer in her population, how could she
be sure she gave every person an equal opportunity to address something raised by
another person? With that limitation in mind, how many volunteers should she be
interviewing? Is five enough? Should she strive to interview 10, or 20, or more?
Determining the appropriate number of participants for a qualitative study is
not as exact as the quantitative description of 95% confidence for a sample from a
population with a normal distribution on the item being measured. Instead, qualita-
tive researchers, particularly with interviews as in Mary’s case, look for a data satu-
ration point. This is a point when no new information is being obtained as more
individuals are interviewed; the variety of arguments is exhausted. Some qualitative
researchers suggest that saturation occurs within a homogeneous population with
something like 25 to 35 respondents (Delbecq, van de Ven, & Gustafson, 1975;
Seidman, 1991).
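The saturation idea can even be made mechanical. The sketch below is one possible way to operationalize it, not a standard procedure from the literature; the interview code sets and the threshold of two consecutive interviews are illustrative assumptions.

```python
def saturation_point(interview_codes, no_new_streak=2):
    """Return the 1-based interview number at which saturation is reached,
    defined here as `no_new_streak` consecutive interviews contributing
    no code not already seen. Returns None if saturation never occurs."""
    seen = set()
    streak = 0
    for i, codes in enumerate(interview_codes, start=1):
        new_codes = set(codes) - seen
        seen |= set(codes)
        streak = 0 if new_codes else streak + 1
        if streak >= no_new_streak:
            return i
    return None

# Illustrative sequence of interviews, each reduced to the set of codes found.
interviews = [
    {"friends", "public service"},
    {"friends", "relevant background"},
    {"friends"},                      # nothing new
    {"friends", "public service"},    # nothing new: saturation
]
print(saturation_point(interviews))  # → 4
```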

Interviewer Effect
Many qualitative data collection approaches involve in-person contacts between the
researchers or interviewers and the study participants. Researchers should be mindful
that this personal contact can affect the quality of the data. On the one hand, in-
person contact allows a researcher to probe and get more in-depth information, and
in that sense, the interaction may help to obtain better data. On the other hand, the
presence of the researcher can affect what and how participants share information. In
Mary’s case, for example, all volunteers know that she works for Health First and that
she knows the employees and other volunteers who work there. Naturally, they may
try to please her by highlighting more positive experiences and appraisals of others.
In face-to-face contact, participants could find it harder to be direct and critical.
Qualitative researchers need to bear in mind a general tendency for people to offer
socially desirable comments in an interview situation. (We observed a version of this phenomenon, which can affect any research data collection, in the example of the Hawthorne effect in Chapter 6.)
In Mary’s case, as a manager at Health First in charge of volunteers, she must also
consider that she is not independent of the situation she is researching. This could
definitely affect what people want to share with her. For example, what if volunteers
consider Mary herself as a problem? What if they believe many volunteers left the
organization because they didn’t like Mary’s approach, or they didn’t get along with
Mary? In a practical research setting in public or nonprofit organizations, this feature
of bias in data collection could be a real cause of concern. It may be appropriate to hire
an external consultant as an interviewer.

The skills of the interviewer also affect the quality of the data. To conduct good
interviews requires skill and practice. Researchers who do not have much experience
in interviewing are advised to conduct a few pilot interviews for practice. Of course, as
with any survey approach to data collection, a pilot interview is likely to improve the
process even for a skilled interviewer.

Subjective Nature of the Analysis


Coding text and interpreting themes in qualitative data analysis rely heavily on a
researcher’s subjective judgments of the meanings found in the data. What an inter-
viewee said can be misinterpreted or misunderstood by the researcher. This is the
principal reason why we recommended earlier that coding the text should be repeated,
and thematic analysis should be iterative. Due to the prominent role of subjective
interpretation in qualitative data analysis, the researcher should keep a heightened
awareness of the potential for researcher bias.
To some extent, it is in our human nature to find what we want to find. In Mary’s
case, for example, if she approached her research with a strong conviction that people
should volunteer to serve the public or contribute to the community, she might find
those things because she is looking for them. She might miss the fact that a volunteer
is really talking about convenience or fun. Or perhaps she would not find what she was
looking for, because the comments do not reach her standard of commitment. A
researcher needs to be constantly vigilant against overinterpreting, or missing altogether, what exists in the text. In qualitative data analysis, the researcher is the instrument of analysis, and special care must be taken when applying subjective judgments
on what the data mean.
Some qualitative research experts recommend that those who conduct qualitative
data collection and analysis should create a list of possible assumptions and biases
before starting the study. This will make the researcher more aware of personal incli-
nations in defining meaning in the data (e.g. Denzin & Lincoln, 2005; Lofland, Snow,
Anderson, & Lofland, 2006; Patton, 2002).
Bias is not the only issue here, though. The researcher may simply lack the expe-
rience to understand the issues raised by the respondents. This is another reason why
it is important to read through all of the transcripts at least once before starting the
coding process. The cumulative experiences of all the respondents together may finally
strike a chord of recognition, and the researcher will then see an idea in the data that
was previously missed.

Chapter Summary
This chapter introduced key aspects of qualitative data analysis. First, we briefly reviewed key
differences between quantitative data and qualitative data, and basic qualitative data collection
methods covered in the previous chapters. We then discussed how to prepare qualitative data for
analysis. Two approaches for qualitative data analysis were introduced. One approach is called

thematic analysis. In this approach, researchers code the data, identifying recurring themes that answer the research question. Another approach to qualitative data analysis is to convert the qualitative data into numerical data by counting how many times each idea was expressed by the interviewees. We concluded the chapter by
discussing three issues related to qualitative data collection and analysis. In qualitative
data collection, researchers need to pay attention to issues related to study participant
selection and interviewer effect on the interviewee. Also, the researchers need to be
mindful of the subjective nature of qualitative data analysis and take steps to avoid
researcher bias in the data analysis.

Review and Discussion Questions and Exercises


1. Discuss when you should consider using a qualitative research approach.
2. Compare the advantages and disadvantages of quantitative and qualitative data
analysis approaches.
3. Discuss what considerations Emily would need to address if she were to conduct focus groups with the employees.
4. Select a topic of interest, interview a classmate for about 15 to 30 minutes, and audio-record the conversation. Transcribe the interview. How long did the transcription take? What, if anything, did you notice while transcribing the interview?
5. Analyze the transcription using the thematic analysis approach discussed in this chapter. How long did it take you to analyze the data? What, if anything, did you notice while analyzing the data?

References
Creswell, J. W. (2007). Qualitative inquiry & research design: Choosing among five approaches.
Thousand Oaks, CA: Sage.
Delbecq, A. L., van de Ven, A. H., & Gustafson, D. H. (1975). Group techniques for program
planning: A guide to nominal group and Delphi processes. Palo Alto, CA: Scott Foresman.
Denzin, N. K., & Lincoln, Y. S. (2005). The SAGE handbook of qualitative research (3rd ed.).
Thousand Oaks, CA: Sage.
Denzin, N. K., & Lincoln, Y. S. (2011). The SAGE handbook of qualitative research (4th ed.).
Thousand Oaks, CA: Sage.
DeWalt, K. M., & DeWalt, B. R. (2002). Participant observation: A guide for fieldworkers. Walnut
Creek, CA: AltaMira.
Giddens, A. (1990). The consequences of modernity. Stanford, CA: Stanford University Press.
Lofland, J., Snow, D. A., Anderson, L., & Lofland, L. H. (2006). Analyzing social settings: A guide to qualitative observation and analysis (4th ed.). Belmont, CA: Thomson Wadsworth.
Miles, M. B., & Huberman, A. M. (1994). Qualitative data analysis: An expanded sourcebook.
Thousand Oaks, CA: Sage.
Patton, M. Q. (2002). Qualitative research and evaluation methods. Thousand Oaks, CA: Sage.

Seidman, I. E. (1991). Interviewing as qualitative research: A guide for researchers in education and the social sciences. New York, NY: Teachers College Press.
Trochim, W. M. K., & Donnelly, J. P. (2007). Research methods knowledge base. Mason, OH:
Thomson Custom.
Ziman, J. M. (2000). Real science: What it is, and what it means. New York, NY: Cambridge
University Press.

Key Terms

Codes  286
Data Saturation Point  293
Researcher Bias  294
Socially Desirable  293
Thematic Analysis  286

Student Study Site


Visit the Student Study Site at www.sagepub.com/nishishiba1e for these additional
learning tools:

• Data sets to accompany the exercises in the chapter


SECTION III: Summing Up: Putting The Pieces Together
15
Writing Reports

Learning Objectives 299
Data Collected and Analyzed—Then What? 299
Emily’s Case 299
Jim’s Case 300
Mary’s Case 301
Key Issues When Writing Reports 302
Understanding Your Audience 302
Academic Style Reporting Versus Nonacademic Style Reporting 302
Key Components of the Report 304
Abstract or Executive Summary 304
Table of Contents 305
Introduction 305
Review of the Literature or Project Background 305
Methods 305
Results 307
Discussions and Conclusions or Recommendations 308
References 308
Notes 308
Appendix 309
Alternative Forms of Reporting 309
Chapter Summary 310
Review and Discussion Questions and Exercises 310
Key Terms 311
Table 15.1 Key Components of Reports 304

Chapter 15  Writing Reports  ❖  299

Learning Objectives

In this chapter you will

1. Understand the role of reports in the research process


2. Understand the difference between academic and nonacademic reports
3. Understand the importance of tailoring the report to your target audience
4. Learn key components of a research report
5. Understand different forms of reporting

Data Collected and Analyzed—Then What?

Emily’s Case
Nine months had passed since Emily, HR director at the city of
Westlawn, received the grant award from the Community
Foundation to conduct a diversity training for city employees. She
and her team worked diligently to implement the training. They
also conducted a research project to evaluate the training as a
requirement of the grant. Now it was time to write a final report.
“I should meet with Ahmed, the foundation’s program officer,” Emily thought. “I’ve gone
through the detailed instructions the foundation provided, but it won’t hurt to meet with
Ahmed again now that we’re done and see if he has any tips.”
At the Community Foundation office, Ahmed greeted Emily with a big smile, “Glad to
see you again, Emily.” Once settled in his office, he beamed again, “I’m glad you wanted to
meet. I’d like to hear about my favorite project.”
Emily was surprised. His favorite project? She hoped that was a good thing.
“Sure. I brought some materials you can look at.”
Emily opened a big binder of materials she compiled from the training and the research
and passed a few things over the desk to Ahmed. He listened attentively while Emily
explained what she and her team did for the training, data collection, and analysis. She
also showed him Leo’s recent conference presentation.
“Very impressive, Emily. You’ve done a great job,” Ahmed remarked.
Emily felt relieved. “The reason I wanted to see you is to help me get a better feel for
what you want for the final report. The requirements recommended ‘academic style report-
ing,’ and I guess I’m not sure what that means exactly.”
“Of course.” Ahmed swiveled to the bookshelf beside him and grabbed a couple of spiral-bound
booklets. “These are some recent reports from other projects. They may give you some idea.”
“Wow—these are pretty big,” Emily said, as she thumbed through the booklets.
Ahmed smiled, “It doesn’t have to be this size. But looking at the size of your binder
there, I won’t be surprised if you give me a final report a little bigger than these.”
300  ❖  SECTION III  SUMMING UP: PUTTING THE PIECES TOGETHER

“I see you need a fairly detailed description of the project, plus all the statistical results,”
Emily noted as she looked through one of the booklets. “I have to say, if this is academic
style, my council members would not want to read this.”
Ahmed chuckled, “I understand. You may want to create a different type of report for
your city manager, elected officials, and the citizens.” Ahmed then cited some specific
requirements for information in the appendices and the financial reports.
Emily thought, “I’m glad I decided to meet with Ahmed in person. This report writing
requires a lot of thinking. More than I originally thought.”

Jim’s Case
“Jim, do you have a moment?” Chief Chen stopped Jim, deputy fire chief at the
city of Rockwood, and beckoned him into his office. “I’ve been reading the reports
you gave me for the response-time analysis and the alternative service delivery
study.”
Jim submitted the reports to Chief Chen a couple of weeks earlier. He was
wondering when he would hear back. Jim was relieved that the chief sounded
happy and satisfied.
“First off, thanks for spending a lot of time on these studies. I appreciate that
you took them seriously,” Chief Chen said. “You produced a detailed report on your
analyses.” The chief pulled out a report with the picture of a fire engine on the
cover and continued. “I understand the response-time analysis is mainly for the accredita-
tion. Is that right?”
“Yes, sir. This will eventually be compiled as a part of the accreditation self-study report.
So it’s organized using the format they required,” Jim explained.
“That’s what I thought. OK then.”
Chief Chen pulled out the second report. This one was thicker, with a plain cover and a
big bold title: Alternative Service Delivery Study.
“Actually,” Jim thought, “Lavita did most of the work compiling this one.” He remem-
bered when Lavita showed him the report draft; he thought the overall tone was a little too
academic, and the cover was boring, but the key information was there. So he didn’t make
a lot of changes before he submitted it to Chief Chen.
“This one had a lot of good information in it, but it’s a little dense,” the chief said, flip-
ping the pages. “I really like it, though.”
Chief Chen had a reputation for being numbers oriented, so Jim believed him.
Then Chief Chen got to the point he intended to make. “I’d like you to present the results
of these two studies to the mayor and the city council. Can you consolidate the results from
both studies into one short executive summary? No longer than two or three pages.” The
chief smiled when he saw Jim’s eyes widen. “You know the drill. Also, prepare to give a
presentation, so put together some slides.”
“No problem, chief,” Jim replied. “I can give you something next week to review.” Indeed,
Jim knew the drill. He had done short reports for the mayor and city council members
many times.

As Jim walked to his office, he thought, “I need to think carefully how many statistical
results to include in the summary and how to visually present them in the slides.” This was
the kind of challenge he enjoyed. Once at his desk, he started right away on the executive
summary.

Mary’s Case
Hours elapsed, days passed, and time blurred as Mary completed her
coding and thematic analysis of her volunteer interviews. Mary, a
program manager at Health First, was finally ready to write up the
results. She felt comfortable with the themes in the data and knew
the story she needed to tell. After writing awhile, she took a break
and chuckled to herself, “This story should win a prize. Yuki would be
surprised to know how much I’m enjoying doing a qualitative study.”
Her phone rang.
“Hello? Mary? This is Ruth. Is this a good time?”
Mary knew Ruth pretty well. She was a long-time Health First volunteer and was one of
the interviewees in Mary’s study. Ruth helped a lot with the snowball sampling; she called
Mary several times on her own initiative to offer new names for interviews. Mary immedi-
ately thought Ruth must be calling to recommend another volunteer for an interview.
“Oh, sure, Ruth. How can I help you?”
“It’s about the interviews,” Ruth said, a little nervously.
“I’m done with the interviews, Ruth. I’m writing up the results now.”
“Actually, I was calling about your report on the interviews,” Ruth countered. “I’ve heard
from some of the volunteers that they are a little worried about some things they told you.”
“Worried?” Mary wondered what Ruth was talking about. She got curious.
“They are concerned that other program managers who read your report might get
offended by what they said,” Ruth explained. “I know you assured us of confidentiality and
said you would not put our names in the report. People know that. But they are worried
that people might be able to guess who said what by reading the context.”
Mary’s gut reaction was to tell Ruth there was nothing in what she heard in the inter-
views that might offend the program managers. There was no reason to be concerned. In
any case, no names. How could anyone know? But she knew she needed to address these
concerns.
“OK, I see what you mean. How about this? Once I finish writing the report, I will send
the draft to everyone who participated in the interviews before I send it to our executive
director. All of you can review the draft and tell me if there’s anything you are concerned
about. I can make changes before I send it out. Do you think if I tell everyone I interviewed
that I am going to do this it will make people feel more comfortable?”
“That’s a great idea,” Ruth responded, sounding relieved. “I’m glad I talked to you, Mary.
I know we all signed that consent form, but I just didn’t want your interviews to become an
issue in any way.”
“Ruth, I really appreciate your telling me this. I really do,” Mary closed.

After hanging up the phone, Mary reflected that there was always some new twist in
this project, something new to learn at every step. Who would have thought that people
would be so worried about offending others by what they said in an interview?
“Oh well,” she said out loud.
When she got back to her page on the computer screen, she paid more attention to
keeping the identities of her interviewees unrecognizable in the report.

Key Issues When Writing Reports


In this chapter we discuss reporting—the last step of the research flow. Writing a
research report is an important part of the research process. More often than not,
results are reported in a written format. There are many formats. The main purpose of
the report is to convey your findings and their implications to readers, and there are
different target audiences with different capacities, interests, and purposes. To maxi‑
mize the impact of your research, it is important to pay attention to this final step in
the process and determine how you want to present your results.

Understanding Your Audience


Before starting to write your report, you need to consider who the report is for and what
they will be looking to find. Identifying the main target audience will determine the for‑
mat, the tone, and the style of the report. In Emily’s case, her main target audience is the
Community Foundation. As Emily discovered, the Community Foundation is looking for
detailed coverage of the study results written in an academic reporting style. From the
sample reports Emily was shown by Ahmed, it looked like there was no limit on the over‑
all length of the report. On the other hand, in Jim’s case, he needed an executive summary
for the city council: something short, to the point, and written in plain language. In an
executive summary, it is better to reduce academic and technical terminology.
Frequently, an applied research report in the public and nonprofit sectors will have
multiple audiences. In Emily’s case, although the Community Foundation is the main
stakeholder, it is not the only entity interested in knowing the results of her diversity
training evaluation. The mayor, other elected officials, and the city manager will also
be interested in the results, as well as some city employees and residents of the City of
Westlawn. Jim and Mary also have multiple audiences. Jim’s audience includes the
accreditation organization, Chief Chen, elected officials, and citizens. Mary’s report
will most likely be read by the executive director, board members, program managers,
and the volunteers. When there are multiple audiences, a researcher has to think about
writing multiple reports in different formats, all based on the same study.

Academic Style Reporting Versus Nonacademic Style Reporting


Academic journals and reports require the writer to follow a specific format (Beins & Beins,
2008; Rocco & Hatcher, 2011). Style manuals document the specifications in minute

detail. Common styles in academic writing are APA style by the American Psychological
Association (American Psychological Association, 2011), and Chicago style by the
University of Chicago Press (University of Chicago Press, 2010). Style specifications include
how to report statistical results and formats for tables and charts. For example, in Emily’s
case, academic style reporting will require her to report the results of her chi‑square analy‑
sis in the following manner, so readers will get all the information they need:

A two-way contingency table analysis was conducted to evaluate whether those
who attended the recent diversity training were likely to support making the
diversity training a requirement for all employees. The two variables were
diversity training attendance, with two levels (yes, no), and the response to
“should the diversity training be a requirement,” with two levels (yes, no).
Diversity training attendance and attitude toward making the training a
requirement were not significantly related, Pearson χ2 (1, N = 235) = 1.34, p = ns.
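A result line like this is typically produced by statistical software such as SPSS. As a rough sketch of what that computation involves, the Pearson chi-square for a 2 × 2 contingency table can be worked out in plain Python. The counts below are purely illustrative and are not the data behind the result reported above:

```python
import math

# Hypothetical 2 x 2 contingency table (counts are illustrative only):
#                      require: yes   require: no
# attended training         52             38
# did not attend            68             77
observed = [[52, 38], [68, 77]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson chi-square: sum over cells of (observed - expected)^2 / expected,
# where expected = (row total x column total) / N
chi2 = sum(
    (observed[i][j] - row_totals[i] * col_totals[j] / n) ** 2
    / (row_totals[i] * col_totals[j] / n)
    for i in range(2)
    for j in range(2)
)

# A 2 x 2 table has 1 degree of freedom; the upper-tail p-value of
# chi-square(1) equals erfc(sqrt(chi2 / 2))
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"Pearson chi-square(1, N = {n}) = {chi2:.2f}, p = {p_value:.3f}")
```

In practice you would rely on a library routine (for example, `scipy.stats.chi2_contingency` in Python, or the equivalent procedure in SPSS) rather than hand-coding the formula; the sketch only shows what stands behind a reported result line.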

When reporting to elected officials or citizens, academic style reporting is not
expected and probably not recommended (Dysart-Gale, Pitula, & Radhakrishnan,
2010). Although some elected officials and citizens may have knowledge of research
methods and statistical analysis, it is best to assume that the majority will not want
to read a report of the chi-square analysis in the academic style illustrated above.
For a lay audience, reporting the same chi-square analysis could be written in the
following manner:

There is an ongoing discussion among local government HR professionals on
whether the diversity training should be required for all employees to attend.
Using our survey response, we examined if attendance at the recent diversity
training was associated with a person’s views on whether to make the diversity
training a requirement. The result of the statistical test, using chi-square anal‑
ysis, did not suggest any generalizable pattern in the relationship between
training attendance and opinions on making the training a requirement. (The
detailed result of the statistical analysis can be found in the attachment.)

A research report for elected officials and citizens is not the time to show off your
analytic prowess. Understandably, after putting so much time into the statistical
analysis for your research, it might be tempting to explain the study results using
mathematical formulae and tables of numbers, but the readers need to be engaged to get
your point, and the finer details will be lost altogether if they don’t read it. Worse, a lay
audience could interpret the use of academic style in your report as a way to intention‑
ally hide information from them. For some audiences, the use of technical phrases and
statistics may remind them of the phrase used by Mark Twain and others, “There are
three kinds of lies: lies, damned lies, and statistics.” You can always provide additional
statistical information as an appendix so those who are more inclined to examine the
results of the statistical analyses have the opportunity to review them.
304  ❖  SECTION III  SUMMING UP: PUTTING THE PIECES TOGETHER

Key Components of the Report


Reports in an academic style need to follow a specific format. Nonacademic reports
usually have more flexibility in the format, though there may be some suggested tem‑
plates that each organization uses. Academic reports document the research flow
(see Table 15.1), including a problem statement, a theoretical framework based on a
review of the literature, data collection and analysis methods, results, and discussion
(Ridley, 2012; Rocco & Hatcher, 2011). A nonacademic report may include similar
elements for context, but some elements such as a formal problem statement, theoret‑
ical framework,  and an extensive review of the literature could be omitted, and the
order is flexible. More emphasis is placed on the readability and utility of the study
results (Bogg, 2012). Table 15.1 shows the key components in reports.

Abstract or Executive Summary


Academic journals typically require an abstract, which is a brief summary of the key
points of the study. An abstract identifies the research topic and research questions and
typically describes the study participants, data collection and analysis methods, results,
and implications of the research. Although different journals specify different lengths
for the abstract, it is usually one paragraph of between 150 and 250 words.
Nonacademic reports usually require an executive summary at the beginning of
the report. The executive summary includes information similar to the abstract in an
academic journal, though it is usually longer and ranges from a few paragraphs to a
couple of pages. Enough detail and information should be provided in the executive
summary so the reader can get a general understanding of the purpose and the
outcome of the study without having to read the full report.

Table 15.1  Key Components of Reports

 1. Abstract (or Executive summary)
 2. Table of contents
 3. Introduction
 4. Review of the literature (or Project background)
 5. Methods
 6. Results
 7. Discussion and conclusion (or Recommendation)
 8. References
 9. Notes
10. Appendices

Table of Contents
A table of contents is not required for academic journal articles. For longer academic
and nonacademic reports, it is helpful for the reader to have a navigation device at the
front of the report, which a table of contents provides.

Introduction
Almost all reports have some sort of introduction. The introduction states the purpose
and importance of the research. This section also provides a place to give the reader a
road map for the organization of the research process and the report.

Review of the Literature or Project Background


It is important to summarize in the report what you learned in your literature review.
This will help the reader understand the context for your research question and the
value of your research. Nonacademic reports typically call this a background section,
explaining what influences shaped the selection of the topic. In the public and nonprofit
sectors, this section will probably refer more to events and to the interests of the people
and programs involved than to peer-reviewed literature. For example, in Emily’s case, in
addition to the results of the literature review she conducted on diversity training, she
may include information on the changing demographics of the City of Westlawn, past efforts
in the city to address the issue of diversity, and some current challenges the city is facing.

Methods
In the methods section, you describe the study participants, what you measured, how
you collected the data, and how you analyzed the data. The reader will need to see
specific details to understand the study results and be confident in their validity. This
will be more important in an academic report, but even in a nonacademic report, you
should consider what will help the reader understand who and what you studied and
how you arrived at your results. The following details refer to essential elements in the
methods section for an academic report.

1. Study participants. In the description of study participants, you will want to
include any demographics or other population characteristics related to the purpose
of the study that influenced your selection. You will also need to describe how you
selected the sample from the population, the reason for selecting a particular sam‑
pling method, and the size of the sample. Describing your study participants will also
provide an outline of your research design by identifying the comparison groups in
the research.

2. Measurement. For the description of what you measured, specify how the concepts
you were studying were operationalized. This will involve the measurement tools used
and possibly how information was recorded. In Jim’s case, for the alternative delivery
model study, the reader will want to know how he measured his outcome variables for
cost and mortality and something about the sources of the information. In Emily’s case,
she will need to describe the survey questions that represented the concepts of cul‑
tural competence and workplace conflict and possibly provide a list of the exact ques‑
tions, the response categories, and how the questions were combined into one measure.
Mary, who did qualitative interviews, should describe the standard questions
she asked and explain how these questions were developed to elicit responses to her
research questions.

3. Data collection. Here you will describe how, when, and where data were col‑
lected. Explicitly address the research design and describe the comparison groups,
if this was not done earlier in the description of the study population. The reader
needs to be assured that the data collected were exactly the same for all groups in the
study and measured the same thing. Different details will be expected here, depend‑
ing on the data. In Jim’s case, both research projects involved data that were already
recorded during operations at the individual fire stations; he should say something
about how the administrative records were obtained and processed. For a survey, as
in Emily’s case, you will want to include details on how it was administered, espe‑
cially if the sample represents a population that is difficult to reach. The reader of
Emily’s research report may have more confidence in the coverage of the survey if
she includes information about the pre- and post-training events when she gathered
all the study participants together to give them the survey. These details can also
alert the reader to potential sources of bias. In Mary’s case, it will be appropriate to
describe how she contacted the interviewees and where the interviews took place.

4. Data analysis. The description of how the data were analyzed starts with the data
source. It may help the reader to know how the collected data were compiled into a
specific data set. In our
case examples, Jim can say the data were compiled in Excel files; Emily can say the
survey data were recorded in SPSS; and Mary can say she transcribed the interviews,
analyzed printed copies with highlighters and notes in the margin, and completed the
coding by using a software program. For quantitative research, you will want to describe
the level of measurement for each variable in the analysis, whether it is categorical or
continuous, independent or dependent, and how these features fit with the type of sta‑
tistical test employed in the analysis. Identify the test used and the applied confidence
level. For any study, you will also want to describe issues with the integrity of the data,
particularly how you validated the data and dealt with missing data.
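As one concrete illustration of the last point, documenting missing data can start from a simple count of incomplete responses. A minimal sketch, with entirely hypothetical survey records loosely modeled on Emily’s case:

```python
# Hypothetical survey records (illustrative only); None marks a skipped answer
responses = [
    {"id": 1, "attended": "yes", "require": "no"},
    {"id": 2, "attended": "no",  "require": None},
    {"id": 3, "attended": "yes", "require": "yes"},
    {"id": 4, "attended": "no",  "require": "no"},
]

# Separate complete cases from those with a missing answer, and report the
# numbers a methods section would need
missing_ids = [r["id"] for r in responses if r["require"] is None]
complete = [r for r in responses if r["require"] is not None]

print(f"{len(missing_ids)} of {len(responses)} responses had a missing answer; "
      f"the analysis is based on n = {len(complete)}")
```

Reporting the count, along with how the incomplete cases were handled (for example, dropped from the analysis), gives the reader what is needed to judge the integrity of the data.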
This looks like a lot to include in the methods section, and it is. It can be challeng‑
ing to crisply cover the essential details for a process that took you a long time and
numerous trials to complete. An academic report will probably include more specifics
but has the advantage of being able to run through terms like experimental design or
probability sampling without explanation. In a nonacademic report, you will need to be
careful to provide enough information to give the reader confidence in the results and
also make sure any technical terms you use will be understood. The purpose of the
methods section is to describe everything the reader needs to know about the research
process, so that when you reach the results section you can state briefly and clearly
what you found, without digression.

Results
Make sure to report the results of your research concisely and accurately. This is the
section readers may flip to first, and you need the message to be direct and readily
accessible. An important thing to remember here is to refrain from discussing the
implications of the results. This can be challenging, as it is sometimes hard to distin‑
guish a result from corollary information or an interpretation of a result, but you will
definitely improve your results section by culling all commentary. You will get a chance
to add these things later in the discussion section.
In presenting qualitative research results, it is appropriate to include direct quota‑
tions from the data to illustrate the points. As we saw in Mary’s case, you will need to
make sure that the confidentiality of the interviewees is protected.
The results section is the place to use graphs, figures, and tables. These tools can
be useful as a way to present complex information and highlight group comparisons.
Tables are good for summarizing large amounts of information. Graphs and figures
provide a visual illustration of key results. When creating graphs, figures, and tables,
pay attention to the following points:

1. In academic reports, check the style requirements for graphs, figures, and tables
and follow the specifications.
2. Be explicit with the scales you use. Label the axes on charts. Make sure column
heads for tables make sense, and spell out abbreviations in notes. Be consistent
in the scales when making comparisons.
3. Do not simply cut and paste your raw output from SPSS or Excel. Construct
your own tables to highlight key information.
4. Number all of the graphs, figures, and tables, and use the numbering to refer‑
ence them in the narrative. Label each graph, figure, and table with a brief
description of what it represents.
5. Well-constructed graphs, figures, and tables will stand by themselves
and convey complete information without the reader needing to read the
narrative.
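As a small illustration of points 2 through 5, a clean summary table can be built directly rather than pasted from raw output. A sketch, with illustrative counts only:

```python
# Build a labeled, numbered summary table by hand instead of pasting raw
# SPSS/Excel output (counts are illustrative only)
title = "Table 1. Training Attendance by Support for a Training Requirement"
header = ("", "Supports requirement (n)", "Opposes requirement (n)")
rows = [
    ("Attended training", 52, 38),
    ("Did not attend", 68, 77),
]

print(title)
print(f"{header[0]:<20}{header[1]:>26}{header[2]:>26}")
for label, yes, no in rows:
    print(f"{label:<20}{yes:>26}{no:>26}")
```

Every column carries an explicit label and unit (counts), and the numbered title lets the narrative reference the table directly.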

The results section is another place where attention needs to be given to the
difference between academic and nonacademic reports. Using scientific notations—
such as “Pearson χ2 (1, N = 235) = 1.34, p = ns”—is appropriate and expected in
academic reports. In nonacademic reports, you have to be tactful about how you use
these scientific notations. Some audiences may appreciate the inclusion of such
information, but you should convey the sense of the test results in lay language first, and
perhaps put the numbers in a subordinate position in parentheses, a footnote, or
appendix.

Discussion and Conclusions or Recommendations


In this section, you need to first remind the reader of your research questions and
describe in a general way how the study results contribute to the objective of the
research. You may then want to address different aspects of the results and discuss the
implications. In academic reports, especially journal articles, it is important to discuss
the implications of your research for theory. In nonacademic reports, the emphasis will
probably be more on the implications for policy and practice.
Nonacademic reports often provide specific recommendations derived from the
results. You may choose to create a separate recommendations section, especially if you
have an extensive list of recommendations. When making your recommendations, be
very clear and specific about the course of action. Suggestions should be based upon
the information in the report and not stray beyond it.
You can also discuss the limitations of the research in this section. In any project,
unforeseen circumstances arise that limit the scope of the research. For example, you
might have had difficulty obtaining data, or sample sizes might have been smaller than
you would have liked. Limitations offer a frank acknowledgment of the roadblocks you
encountered in the research process or issues you did not resolve. Transparency is
important. This information can help readers interpret the meaning of your research
results and may help researchers or practitioners in future projects.
You can also discuss any ideas you have for future research. This can point other
researchers and practitioners toward areas you judge to be important after your
experience with the project.

References
The reference section gives a list of all the resources mentioned in your report. These
can include articles, books, interviews, newspapers, organizational policies, and data
sets. If a resource is mentioned within the narrative of the report, it is important that
the full citation appear in the reference section. Style guides specify different ways to
cite the references in the text and how to format the reference. In academic reports, the
reference list is restricted to sources mentioned in the text. In nonacademic reports,
references are typically not cited in the text or may appear in footnotes. The reference
list may be used to offer additional materials for suggested reading.

Notes
If you included supplemental information in notes indicated by superscript numbers,
you can add the list of notes as endnotes in your report. Alternatively, you can
incorporate these notes as footnotes. In academic journals, whether to use
endnotes or footnotes is determined by the journal.

Appendix
The appendix section includes detailed supplementary information that is important
to a reader but would be distracting if included in the main body of the report.
Examples of the information appropriate for an appendix include the following: a
detailed explanation of the statistical analysis, a copy of a survey instrument, a copy of
an interview guide, detailed figures, tables, or diagrams, and policy documentation. As
an example, in Jim’s case, he might want to append a description of the accreditation
standards for response time. In Emily’s case, she could append the survey question‑
naire. In Mary’s case, she might append her interview guide or the list of codes in her
thematic analysis.
All of the information included in the appendix should be relevant to the research
you conducted and provide specific information that could be helpful for the reader.
Information in the report, however, should be able to stand on its own without the
appendix. The reader should not be required to turn to the appendix to understand
elements in the report.

Alternative Forms of Reporting


Research results are commonly communicated in alternate formats to reach a variety
of audiences. A written report may be only one mechanism you use to communicate
your results. The most common alternative reporting formats include oral and slide
presentations and websites. Oral presentations can include public testimony to elected
officials, as we saw in Jim’s case, or a conference presentation, as we saw in Emily’s case.
Another venue for oral presentation could be a workshop or training, as might be
appropriate in Mary’s case, where the research results could inform program managers
on how to recruit and retain more volunteers.
Visual aids help in an oral presentation, and electronic slide shows are a current
standard. This is not the place to say too much about graphic design, but it is important
to note that you should not try to pack all of your results onto slides in a dense array.
Keep the font size large enough to read from a distance and tables of numbers simple
enough to readily grasp. Bullet points should be short, and they do not need to copy
everything you intend to say. Attractive design is important to help engage interest, but
be aware that fancy graphics and animation can also be more distracting than useful
(Tufte, 2006).
In some oral presentations, low-tech visual aids such as flip-charts can be more
effective than slides. In a workshop setting, flip-charts can facilitate interactive brain‑
storming with participants. A handout that complements slides or other aspects of a
presentation can also be useful. With a handout, the audience can review the informa‑
tion at their own pace, use it to add their own notes, and take it with them.

Websites offer a way to widely disseminate your research results. You might start
by uploading your research report in a link at your organization’s website. You can
also design a Web page to summarize the research. Other options include posting a
podcast or other audiovisual material. Interactive features can be incorporated to
solicit feedback.
Chavkin and Chavkin (2008) recommend that websites have the following features
when used for disseminating research:

•• Make it interactive and allow audience input
•• Make the materials downloadable
•• Focus on targeted audiences
•• Include links to other resources
•• Provide other relevant publications and resources
•• Provide online technical assistance
•• Provide timely and regular updates
•• Make the interface user-friendly

Additional social media options are fairly new and are growing in popularity. You
might consider these options as a suitable way to further share your research results
with a broader audience (Mergel & Greeves, 2013).

Chapter Summary
The report writing process is an important part of the overall research process. A report draws
together what you know about the topic and what you learned from your research. It is important
to dedicate enough time and resources to reporting so the effort expended in the research itself
will receive attention. The report needs to follow a format and style that will be most likely to
engage the target audience. Academic reports require a specific format and style. Nonacademic
reports are more flexible. Start by identifying your key audience. Seek feedback from your peers
and stakeholders during the writing process. If done correctly, the presentation of the final report
product can be a rewarding culmination of your research effort. (For more
information on report writing, see Bogg, 2012; Emerson, 2009; Polonsky & Waller, 2011; Rocco &
Hatcher, 2011; Thomas & Hodges, 2010; Turk & Kirkman, 1989.)

Review and Discussion Questions and Exercises


1. Take one or two of the data analyses you conducted in the statistical analysis exercises in the
previous chapters. Write two short reports: one for an academic audience and another for a
nonacademic audience.

2. Discuss the kinds of considerations you made in drafting your academic and nonacademic
reports.

3. Find one academic report and one nonacademic report. Compare and discuss the similarities
and differences between the two reports.

4. Find two nonacademic reports and review them. Are they both similar in their structure and
headings? Are you able to easily identify the intended audience? Do you believe the headings
easily convey the essence of the research to the intended audience?

5. What kinds of advice and recommendations would you give to Jim, who is preparing for his
oral presentation to the city council? List your recommendations.

6. What kinds of considerations do you have to make when posting your research report on a
website or using social media to disseminate it?

References
American Psychological Association. (2011). Publication manual of the American Psychological Association
(6th ed.). Washington, DC: Author.
Beins, B., & Beins, A. (2008). Effective writing in psychology: Papers, posters, and presentations. Malden, MA:
Blackwell.
Bogg, D. (2012). Report writing. Maidenhead, UK: Open University Press.
Chavkin, N. F., & Chavkin, A. (2008). Promising website practices for disseminating research on family-
school partnerships to the community. School Community Journal, 18(1), 79–92.
Dysart-Gale, D., Pitula, K., & Radhakrishnan, T. (2010). Improving professional writing for lay practitioners:
A rhetorical approach. Transactions on Professional Communication, 53(3), 293–303.
Emerson, L. (2009). Writing guidelines for business students (4th ed.). South Melbourne, VIC: Cengage
Learning Australia.
Mergel, I., & Greeves, B. (2013). Social media in the public sector field guide: Designing and implementing
strategies and policies. San Francisco, CA: Jossey-Bass.
Polonsky, M. J., & Waller, D. S. (2011). Designing and managing a research project: A business student’s guide
(2nd ed.). Thousand Oaks, CA: Sage.
Ridley, D. (2012). The literature review: A step-by-step guide for students. London, UK: Sage.
Rocco, T. S., & Hatcher, T. (2011). The handbook of scholarly writing and publishing. San Francisco, CA:
Jossey-Bass.
Thomas, D. R., & Hodges, I. D. (2010). Designing and managing your research project: Core knowledge for
social and health researchers. London, UK: Sage.
Tufte, E. R. (2006). The cognitive style of PowerPoint: Pitching out corrupts within. Cheshire, CT: Graphics
Press.
Turk, C., & Kirkman, J. (1989). Effective writing: Improving scientific, technical, and business communication.
London, UK: Taylor & Francis.
University of Chicago Press. (2010). The Chicago manual of style (16th ed.). Chicago, IL: Author.

Key Terms
Abstract 304
Academic Report 304
Executive Summary 304
Nonacademic Report 304
Oral Presentation 309

Student Study Site


Visit the Student Study Site at www.sagepub.com/nishishiba1e for these additional learning tools:

•• Data sets to accompany the exercises in the chapter


16 ❖ Using Research Methods for Continuous Improvement

Program Evaluation and Performance Measurement


Learning Objectives 314
Using Research in Program Evaluation and Performance Measurement 314
Emily’s Case 314
Jim’s Case 315
Mary’s Case 316
Program Evaluation and Performance Measurement as Research 316
Difference Between Program Evaluation and Performance Measurement 317
Ty and Mary at the Conference 318
Key Issues in Program Evaluation 319
Types of Evaluation 319
Key Issues in Performance Measurement 322
Types of Performance Measurement 323


Who Conducts Program Evaluation and Performance Measurement? 323


Ethical Considerations in Program Evaluation and
Performance Measurement 324
Practitioners Becoming Researchers: Making Sense of It All 325
Round Table Discussion at the Conference 325
Chapter Summary 330
Review and Discussion Questions 330
Key Terms 332
Table 16.1 Matrix of the Type of Evaluation and Examples 321
Table 16.2 American Evaluation Association Guiding
Principles for Evaluators (2004) 325


Learning Objectives

In this chapter you will

1. Understand how research is an integral part of program evaluation and performance
measurement
2. Learn the differences between program evaluation and performance measurement
3. Learn different approaches to program evaluation and performance measurement
4. Understand ethical considerations in program evaluation and performance
measurement

Using Research in Program Evaluation and Performance Measurement

Emily’s Case
Emily was surprised to hear Ahmed’s cheerful voice over the phone.
“I have a favor to ask.”
“OK. I’m game,” Emily laughed.
“Every year, the foundations in the region cohost a conference to provide
opportunities for our grantees to network and help their capacity building,”
Ahmed started. “It’s also an opportunity for us to feature some of the projects we
fund. That’s where you come in. This year’s conference theme is ‘Program Evaluation
and Performance Measurement: Practitioner as Researcher’—and I would like to
invite you and your team to participate.”

Emily could tell Ahmed was excited about this conference. She felt honored to be invited
to participate.
“Thank you, Ahmed. That’s an incredible opportunity for us,” Emily replied without
hesitation.
“Great! I’m going to put you in the program evaluation track. Another group in your
session is from JPB Research. They do a lot of evaluation of the projects we fund.”
“JPB Research! That’s one of the major research firms, right?” Emily was astounded and
a bit intimidated.
”Yes, they are, but don’t worry,” Ahmed responded soothingly. “They will focus on the role
of an external evaluator. Your perspective will be as a practitioner who has done a program
evaluation. You won’t be competing.” Ahmed paused, and when Emily did not respond, he
added, “Your project exemplifies the conference theme perfectly. That’s why I’m asking you
personally to help us out. You just have to be yourself.”
Emily then said, “I would like to bring Leo, our graduate student intern, with me, if you
don’t mind. He will be graduating soon and will be on the job market, so this would be a
great opportunity for him to network.” “Of course,” Ahmed replied and continued,
“Actually, I was going to ask you to bring him. We are also planning a facilitated
round table discussion session for practitioners and new professionals to share their experi-
ence. It will be nice to have his perspective in the discussion.”
Right after getting off the phone with Ahmed, Emily picked up the phone and called Leo
to deliver the exciting news.

Jim’s Case
Jim pushed open the heavy glass door of the trendy bar and saw it
was crowded inside. He took off his rain jacket and looked around
for Ty. The air was heavy and vibrated with chatter. He reflected on
how he knew this place before it was remodeled. The room was more
plain then, a hangout for cops and firefighters.
When Ty spotted Jim shuffling between the occupied tables
toward him, he closed his iPad and nodded a greeting. They
shook hands. “Haven’t seen you for awhile,” Ty said. “I have been
hearing about you, though, from Lavita. Sounds like a lot is going
on at the station.”
“Yeah,” Jim agreed with a skeptical grin as he sat down. A waiter appeared, and Jim
ordered a beer. “Chief Chen is very systematic. He likes to be completely informed before he
makes a decision, and it looks like I am now his research guy. He loved the results from
these last two projects. Lavita is a godsend.”
“Or a Ty-send,” Ty quipped. “What would you do without me?”
The waiter reappeared and dropped a frothing pint on the table. Jim thanked him.
Before he could turn back to beat wit for wit with his old station partner, Ty leaned forward
with an earnest look.
“Talking about data and research, I’m invited as a keynote speaker for a regional
conference on program evaluation and performance measurement. The theme of the
conference is ‘practitioner as researcher.’ I’m thinking you should come. Lavita, too. I want
to use your case as an example. You did both a program evaluation and performance
measurement.”
Jim was taken by surprise. “Is that what I did?”
“You are a prince that looks like a frog, sir.”
Jim scowled.
“Really, I want you to attend this conference,” Ty pressed.
Jim considered. “I don’t quite see myself as a researcher, but I guess Chief Chen, he
would like it. Sounds interesting. When is it?”

Mary’s Case
Mary’s smartphone beeped at her. The tone told her she had a Facebook noti-
fication. She found a message from Yuki, with a Web link: “Check this out. I
think this will be good for you.”
The link connected to a website for a regional conference, cohosted by the
foundation where Yuki worked. Yuki’s name was among the organizing committee
members. The conference was called “Program Evaluation and Performance
Measurement: Practitioner as Researcher.” It sounded interesting.
Yuki called later that afternoon to follow up. “Did you look at the conference
website? Can you come?”
“I’m thinking about it,” Mary replied. She had decided the topic of the conference
meshed with her recent volunteer research project. She might get something out of it.
“Good,” Yuki said with a determined tone, as if that was settled. “I’m going to put you
on a roundtable discussion with the HR director from the city of Westlawn and the assistant
chief of the Rockwood Fire Department.”
“Whoa, I didn’t say I’m coming,” Mary protested, “I said I’m thinking about it.”
“Oh, come on,” Yuki cajoled, “this is a good opportunity for you, and you’ll enjoy it. I’m
facilitating this roundtable discussion. I got this idea from you. Remember you said it would
be nice to have a group of practitioners to use as a sounding board for the research projects
you are doing? This is your group.”
Mary could tell Yuki was not going to take no for an answer, and Yuki made it sound
like she had worked to make a place specifically for her. Mary was obliged to participate.
She owed Yuki.
“OK, OK, I’ll come.”

Program Evaluation and Performance Measurement as Research

Public and nonprofit organizations are faced with increased expectations to monitor
and evaluate the performance of their programs. They have to be accountable to
stakeholders and achieve intended outcomes. Heightened emphasis on the use of
program evaluation and performance measurement in the public and nonprofit sectors
is partly attributed to the New Public Management (NPM) movement. Introduced in
the mid-1970s, NPM emphasizes improving the efficiency of public-sector organizations
by measuring and rewarding performance. The intent is to introduce market-like
principles in the way public services are delivered so public and nonprofit organizations
will operate more like a commercial business: with clearly stated objectives and
planning and management that pays attention to efficiency and the bottom line (Hood,
1995; Osborne, Plastrik, & Miller, 1998).
Ongoing discussions among practitioners and academics question the appropri-
ateness of NPM principles in public service operations (e.g. Denhardt, 2011; Hood,
2000). It is, however, undeniable that NPM has influenced contemporary thinking.
The practice in public and nonprofit sectors appears to be committed to program eval-
uation and performance measurement regardless of the underlying philosophy. A large
body of literature on program evaluation and performance measurement has grown
over the last few decades (e.g. Hatry, 2007; Julnes & Holzer, 2008; McDavid &
Hawthorn, 2006; Patton, 2002, 2011, 2012). A variety of approaches to program eval-
uation have been proposed, such as utilization-focused evaluation (Patton, 2011),
developmental evaluation (Patton, 2011), theory-driven evaluation (Chen, 1990), and
mixed-method evaluation (Greene & Caracelli, 1997). All of these theories and
approaches acknowledge that the research process is an integral part of both program
evaluation and performance measurement.
A textbook on program evaluation by Bingham and Felbinger (2002) states:
“[g]ood evaluations use scientific methods. These methods involve the systematic
process of gathering empirical data to test hypotheses indicated by [the] program’s or
policy’s intent” (p. 3). Practitioners who are expected to conduct program evaluation
and performance measurement need to know how to conduct research. Our case
examples with Emily, Jim, and Mary have illustrated program evaluation and perfor-
mance measurement activities where practitioners have needed to learn and apply
basic research approaches.

Difference Between Program Evaluation and Performance Measurement
Practitioners and academics have provided several views on the distinction and relationship between program evaluation and performance measurement. The two activities are recognized to be complementary (e.g. McDavid & Hawthorn, 2006; Newcomer, 1997). Program evaluation is more inclusive than performance measurement (e.g. Bingham & Felbinger, 2002). Some scholars make a distinction between the two based on who
conducts the activity and how. Program evaluation is typically conducted by an exter-
nal entity on an ad hoc basis, while performance measurement is typically conducted
internally on an ongoing basis (e.g. Hatry, 1997; United States General Accounting
Office, 2011).
In this book we take the view that performance measurement is an integral part of
evaluation, used to inform the organization to better manage its performance
(McDavid & Hawthorn, 2006). The difference between program evaluation and per-
formance measurement is in the focus. The focus of performance measurement is on
regular monitoring and reporting of particular organizational operations that may or
may not be associated with a program. Program evaluation is focused on assessing
how well a clearly identifiable program is working and whether the program has
achieved its overall objectives.
The difference between program evaluation and performance measurement can
be ambiguous. Let’s see how Ty addresses this topic in his keynote speech at the con-
ference on “Program Evaluation and Performance Measurement: Practitioner as
Researcher.”

Ty and Mary at the Conference


Ty, a professor at the university in Rockwood, in his conference keynote speech
reviewed how program evaluation and performance measurement became an
integral part of performance management in the public and nonprofit sectors. He
then discussed briefly the key differences between program evaluation and per-
formance measurement, noting that program evaluation tends to be more epi-
sodic, issue specific, and focused on the outcomes of a particular program, while
performance measurement is an ongoing effort focused on some aspect of gen-
eral organizational performance.
Mary was in the audience. While listening to Ty, she had been trying to figure out if her
project was program evaluation or performance measurement. When Ty opened the floor to
questions, Mary raised her hand. She described her project briefly and pointed to where she
was confused.
“The purpose of my study is to improve volunteer recruitment and retention, so the
emphasis is on general organizational performance. But there is also a program manager
for the volunteers. It’s a program. We have not routinely collected feedback from volunteers
or analyzed information about the volunteers, so it looks like I’m doing a program evalua-
tion. Yet I’m measuring performance. So how would you categorize what I’m doing? Is it
program evaluation or performance measurement?”
Ty smiled and thanked Mary for the question. “Yes, as Mary said, sometimes in practice
you find a situation that does not neatly fall under one of these two categories. If, indeed,
this effort of yours—interviewing your volunteers and analyzing the data—ends up to be a
one-time study to provide your board members information on how the volunteer program
is doing, then you might call it a program evaluation. However, if your effort leads to a
regular process to monitor volunteer satisfaction, which in turn, informs your effort to con-
tinuously improve your volunteer program, then I would say you are conducting perfor-
mance measurement as part of your effort to evaluate and manage your volunteer program.
In other words, what you are doing now could be the beginning of a systematic perfor-
mance measurement process. As I mentioned at the outset, I see performance measurement
as an integral part of an organization’s management activities. Certain aspects of perfor-
mance are always measured and monitored. You may be simply drawing attention to an
area of concern where you have found a way to measure and monitor performance. Your
program evaluation could become performance measurement. It’s probably not worthwhile
to worry too much about whether you are doing one or the other. I do think, however, that
it is important that somebody like you, a research-minded practitioner, document what you
are doing to give the organization a record of how to measure performance. This is useful,
whether you call it a program evaluation or a performance measurement, episodic or ongo-
ing. I think a program manager in the future will be grateful for your effort.”

Key Issues in Program Evaluation

As noted earlier, program evaluation is a systematic assessment of the process and
outcome of a program. A program can be conceptualized as “a group of related
activities that is intended to achieve one or several related objectives” (McDavid &
Hawthorn, 2006, p. 446). The reason for conducting program evaluation is to inform
key stakeholders about whether the programs are working and producing the outcomes
they are intended to produce. It also provides information that assists in performance
improvement.
Program evaluation seeks to answer a question or set of questions and collects and
analyzes data to answer these questions. The most basic program evaluation question
is, “Does the program work?”

Types of Evaluation
There are different ways to classify evaluations. One common way is by role. Scriven (1967) first introduced the distinction between two roles of evaluation: formative and summative. The intended
role of formative evaluation is to improve the program implementation process. It
evaluates the program process, with the main goal of providing information to pro-
gram managers and other stakeholders to assist them in improving the process. A
formative evaluation can take place as a pilot study, where a proposed program is
tested for its feasibility and to obtain feedback from stakeholders prior to the official
implementation of the program. A formative evaluation can also take place during
program implementation to assess how the process is working and inquire how the
process can be improved. The basic question the evaluator asks in the formative evalu-
ation is, “Can the program be improved?”
On the other hand, the role of summative evaluation is to provide decision
makers with information on what to do with a program. The information obtained
in the summative evaluation usually determines the survival of a program, whether
or not it should be funded and continued. McDavid and Hawthorn (2006) charac-
terize the summative evaluation as having a focus on the bottom line. The key
questions asked in the summative evaluation include: “Should we be spending less
money on this program; should we be reallocating the money to other uses; or
should the program continue to operate?”(p. 21). The difference between formative
and summative evaluations has been summarized as “formative evaluation for per-
formance improvement, and summative evaluation for accountability, policy and
budget decision making, and other purposes beyond performance improvement”
(Wholey, 1996, p. 145).

Another way to classify evaluation is by the focus. Evaluation can be conducted by
focusing on processes or outcomes. Process evaluation focuses on “the internal
dynamics and actual operations of a program in an attempt to understand its strengths
and weaknesses” (Patton, 1997, p. 206). The questions asked in the process evaluation
include the following:

•• What’s happening and why?
•• How do the parts of the program fit together?
•• What do participants experience, and how do they perceive the program?
•• What are the strengths and weaknesses of the day-to-day operation?
•• How can the process be improved?

Outcome evaluation focuses on whether the program had a demonstrated effect
and produced intended outcomes. Some scholars use the term outcome evaluation to
refer to a form of evaluation that assesses the effect of the program by comparing the
impact on a target population to what would have happened if the program did not
exist (United States General Accounting Office, 2011). The questions asked in the
outcome evaluation include these examples:

•• To what extent was the desired outcome attained?
•• What were the effects of the program on the key stakeholders?
•• What were the unintended outcomes the program produced?

These two types of classification of evaluation—formative versus summative and
process versus outcome—are not mutually exclusive. Table 16.1 presents a matrix that
illustrates how formative or summative evaluation can overlap with process or out-
come evaluation.
Notice that both formative evaluation and process evaluation focus on program
process improvements. Some scholars use the term process evaluation synonymously
with formative evaluation. For example, Chen (1996) describes the formative evalua-
tion as “essentially a kind of process evaluation with an emphasis on improvement”
(p. 122). Similarly, some scholars equate summative evaluation with outcome evalua-
tion, because program bottom-line decisions may be based on information from the
outcome evaluation.
However, as Scriven (1996) points out, formative evaluation is not a type of process evaluation, and summative evaluation is not the same as outcome evaluation.
As illustrated in Table 16.1, a formative evaluation that intends to improve a program
can focus on collecting information on either the process or the outcome.
There are other approaches to conducting evaluations that may apply to either formative or summative evaluation, or to process or outcome evaluation. We
will describe two common approaches: needs assessment and cost-benefit or cost-
effectiveness analysis.
Needs assessment is typically conducted as part of a formative evaluation, before
a program is implemented. It focuses on identifying the need for a program among the

Table 16.1  Matrix of the Type of Evaluation and Examples

(A) Process focus, formative role: In the pilot phase of the alternative service delivery model, Jim asks the firefighters and paramedics how they think the process can be improved.

(C) Process focus, summative role: After the alternative service delivery model is fully implemented, Jim investigates citizen satisfaction. The city council decides whether or not to continue the alternative service model.

(B) Outcome focus, formative role: In the pilot phase of the alternative service delivery model, Jim evaluates the outcome of the alternative service delivery model, based on cost and mortality rate.

(D) Outcome focus, summative role: After the alternative service delivery model is fully implemented, Jim monitors the cost and mortality rate of the operation to provide to the city council. If cost and mortality increase, the city council may decide to discontinue the model.

target recipients. For example, a needs assessment of a community can determine what
programs should be offered that are currently not provided.
Mary’s case provides an example. Her efforts to identify what Health First can
do to recruit and retain volunteers could be considered a needs assessment.
Typically, the first step in the needs assessment process is to identify the stakehold-
ers for the program. In Mary’s case, we saw that her first concern was to identify the
existing volunteers and those who might become volunteers. The second step in a
needs assessment is to examine what services are currently available to the stake-
holders. Here, Mary obtained feedback from volunteers to examine the current
situation and assess how well Health First activities were working. The third step is
to collect data to identify needs among the stakeholders. Mary’s interviews also
helped her assess what attracted people to volunteer at Health First and what might
encourage them to stay. An analysis of collected data in a needs assessment should
identify needs among stakeholders and provide a basis for recommendations to
improve program performance.
Cost-benefit analysis and cost-effectiveness analysis are often grouped together. The two approaches have common features but also differ slightly. Both focus on cost and the value produced by a program. One difference is in the way the value
of the program is defined and measured. In cost-benefit analysis, all potential out-
comes of the program are specified in financial terms. In cost-effectiveness analysis,
the potential outcomes of the program are also captured in nonfinancial terms
(Bingham & Felbinger, 2002). Because both costs and outcomes are in dollars, cost-benefit analysis evaluates whether a single program is worth its cost (input to outcome). In contrast, the
outcomes in cost-effectiveness analysis are used to rank different options according to
their relative value (outcome to outcome).
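The two calculations can be sketched in a few lines of code. The following is our own minimal illustration, not drawn from the text; the program names, costs, and outcome counts are entirely hypothetical.

```python
# Illustrative cost-benefit vs. cost-effectiveness calculations.
# All figures below are invented for demonstration.

def benefit_cost_ratio(total_benefits, total_costs):
    """Cost-benefit analysis: both sides monetized (input to outcome)."""
    return total_benefits / total_costs

def cost_per_outcome(total_costs, outcome_units):
    """Cost-effectiveness analysis: dollars per nonfinancial outcome unit."""
    return total_costs / outcome_units

# Cost-benefit: is a single program worth its cost?
program_cost = 120_000       # dollars
program_benefits = 150_000   # all outcomes expressed in dollars
bcr = benefit_cost_ratio(program_benefits, program_cost)
print(f"Benefit-cost ratio: {bcr:.2f}")  # a ratio above 1.0 suggests benefits exceed costs

# Cost-effectiveness: rank alternative options by dollars per outcome
options = {
    "Option A": {"cost": 80_000, "clients_served": 400},
    "Option B": {"cost": 60_000, "clients_served": 250},
}
ranked = sorted(
    options.items(),
    key=lambda kv: cost_per_outcome(kv[1]["cost"], kv[1]["clients_served"]),
)
for name, data in ranked:
    cpo = cost_per_outcome(data["cost"], data["clients_served"])
    print(f"{name}: ${cpo:.0f} per client served")
```

Note how the two metrics answer different questions: the ratio judges one program against its own cost, while the dollars-per-outcome figure lets an evaluator rank competing options (outcome to outcome).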
Both cost-benefit and cost-effectiveness analysis can be conducted before, during,
or after a program is implemented. While these analyses are useful for program evalu-
ation, the process is complex. The major challenge is in the way the program costs and
outcomes are measured. In cost-benefit and cost-effectiveness analysis, both the costs
and the outcomes of the project need to be quantified, and in real-life projects, this can
be hard to accomplish.
Emily’s case provides an example of barriers that can be encountered in trying to
conduct a cost-benefit or cost-effectiveness analysis. If Emily had tried to conduct a
cost-benefit analysis for her diversity training, she would have had to identify the costs of
the training and the outcomes of the training. It would be fairly easy to calculate the
direct cost of the training; this would entail the cost of the trainer, room, lunch, and
materials. This does not, however, include the costs for planning by Emily’s team,
coordination by Mei-Lin, evaluation by Leo, or the opportunity costs involved by
accounting for the time spent in the training by all the participants who were taken
from their regular work. Maybe there were additional costs to get other employees to
cover for them. If their work was not covered, we might need to account for conse-
quences to citizens who came to City Hall and had to wait longer than usual or were
denied services at that time. These, too, are costs of the training program. On the
outcome side, even if Emily found that the training had an impact on cultural com-
petence and workplace conflict, how would she measure the benefits?
Many of the costs and benefits in program operations are not typically or readily
monetized. Cellini and Kee (2010) discuss these challenges in cost-benefit and
cost-effectiveness analysis and offer some suggestions that may help evaluators interested
in this approach to evaluation.

Key Issues in Performance Measurement

Performance measurement, defined broadly, refers to the process of designing and
measuring a specific outcome related to program performance. As noted earlier, the
focus is on monitoring ongoing operations in a routine process. With performance
measurement, an organization can analyze trends and assess progress toward a specific
goal. Performance measurement also allows the organization to compare performance
on a specific indicator with other organizations in the sector or with an industry
standard or benchmark, as in Jim’s case with service call response time.
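A benchmark comparison of the kind described here can be sketched briefly. This is our own hypothetical example, not taken from the case; the station names, times, and the 6.0-minute standard are invented for illustration.

```python
# Hypothetical performance measurement: compare average response times
# (in minutes) for each station against an illustrative benchmark.

BENCHMARK_MINUTES = 6.0  # invented standard for this sketch

response_times = {
    "Station 1": [5.2, 6.1, 5.8, 5.5],
    "Station 2": [7.0, 6.8, 6.4, 7.2],
}

for station, times in response_times.items():
    avg = sum(times) / len(times)
    status = "meets" if avg <= BENCHMARK_MINUTES else "exceeds"
    print(f"{station}: average {avg:.2f} min ({status} the benchmark)")
```

Run routinely (say, quarterly), the same calculation supports trend analysis: each period's averages can be appended and compared over time to assess progress toward the goal.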
The key question asked in performance measurement is “How are we doing?”
Alternative ways to phrase the question include the following:

•• Can the operation be improved?
•• Does the operation meet the standard?
•• Does the operation accomplish the intended outcome?

Types of Performance Measurement


Performance measurement can be divided into two broad categories, based on the
intended use. When the information from the performance measurement is used to
make adjustments and improvement in the process of operation, it is used for a forma-
tive purpose. When the information is used to assess the outcome of the operation, so
decision makers can make a bottom-line decision on its survival, the performance
measurement is used for a summative purpose.
In Jim’s case, when he evaluated if the response time of the eight stations changed
from year to year, and compared the stations against each other, he was using perfor-
mance measurement for a formative purpose to see where improvements might be
made. When he evaluated if the average response time met the industry standard for
accreditation, the purpose was basically summative. The information indicated
whether or not response time in the current system was adequate.

Who Conducts Program Evaluation and Performance Measurement?
In making the comparison between program evaluation and performance measurement,
some scholars (e.g. Hatry, 1997; McDavid & Hawthorn, 2006) note that program
evaluation tends to be conducted by experts outside the organization, while performance
measurement is conducted by people within the organization. The justification for an
external expert for program evaluation is to assure objectivity. The assumption is that
those who are internal to the organization have a vested interest in presenting the
program as effective, and therefore, will not be able to provide an objective assessment.
By contrast, performance measurement, which involves more routine, ongoing data collection and analysis, can be conducted more efficiently by those who are already
working in the organization.
Other scholars (e.g. Love, 1998; Patton, 2012), including the authors of this book,
do not agree with the simple classification that program evaluation is for external
experts, while performance measurement is for internal members of an organization.
We believe that program evaluation can be conducted by those internal to the organi-
zation, and it can be part of management and leadership functions. A long-standing
discussion among evaluators questions whether program evaluation is really better
when conducted by experts outside the organization, compared to internal evaluators
(Conley-Tyler, 2005). The classic view of evaluation favored external evaluators (e.g.
Posavac & Carey, 1992; Rossi, Freeman, & Lipsey, 1999), but a growing literature sug-
gests the effectiveness of internal evaluators (Volkov & Baron, 2011). Some views
advocate combining external and internal evaluators at different stages of a program
evaluation (Dahler-Larsen, 2009; Vanhoof & Petegem, 2007).
In this book, we have focused on the knowledge and skill sets necessary for
practitioners engaged in applied research. In the case examples of Emily, Jim, and
Mary, we saw practitioners in different settings apply their research knowledge and
skills, and gather the skills of others, to conduct program evaluation and performance
measurement. None of them commissioned an external research expert, though this
could have been an option.
External experts frequently lack knowledge of the organization and the context
of its operations, and hiring them involves extra cost. There is also no guarantee that
external experts will not be influenced by their contracts to soft-pedal an evaluation
to gain favor and future business. The charge of vested interest so often leveled
against internal evaluators may also apply to external evaluators. When internal
resources make it possible, an organization may well find that an evaluation adds
more value when its own practitioners participate in the research needed for pro-
gram evaluation and performance measurement. The research process can enhance
skills, teamwork, and knowledge within the organization to better manage and
improve organizational performance.

Ethical Considerations in Program Evaluation and Performance Measurement
Those who conduct program evaluation and performance measurement—whether
internal or external to the organization—need to be professional and ethical in their
practice. We have made it clear that evaluation is a research process. The ethical
principles of research apply. We discussed ethical principles in research in several
chapters as they related to different stages of the research process (notably in Chapters 4, 6,
and 15). The following summary reiterates these ethical considerations in relation to
program evaluation and performance measurement.
When an evaluation or measurement involves people, it is important to take their
rights into consideration in the study objectives, design, data collection approaches,
and in the way the results are disseminated. When data are collected directly from
individuals, an evaluator needs to obtain consent before the individual participates.
The person needs to be informed of potential physical or psychological harm that
could occur by participating and what measures the evaluator has taken to mitigate
the harm.
In data collection and analysis, evaluators need to use appropriate approaches that
do not attempt to prove or disprove a result based on the evaluator’s preference or bias.
In reporting results, evaluators need to take care to report accurate and unbiased infor-
mation. Falsification of data, of course, is certainly unethical. Evaluators also need to
maintain confidentiality of the people from whom they obtained information.
Those who practice program evaluation and performance measurement may want
to consult professional organizations in their field to see if a set of guiding principles
or code of ethics is available. The American Evaluation Association provides a refer-
ence for professional evaluators, Guiding Principles for Evaluators (2004), that identi-
fies five key principles for professional conduct. (See Table 16.2.) The American
Society of Public Administration provides a code of ethics for practitioners in the field
of public administration (American Society of Public Administration, 2012). Other
professional organizations may do the same.

Table 16.2  American Evaluation Association Guiding Principles for Evaluators (2004)

A. Systematic Inquiry: Evaluators conduct systematic, data-based inquiries about whatever is being
evaluated.
B. Competence: Evaluators provide competent performance to stakeholders.
C. Integrity or Honesty: Evaluators ensure the honesty and integrity of the entire evaluation process.
D. Respect for People: Evaluators respect the security, dignity, and self-worth of the respondents,
program participants, clients, and other stakeholders with whom they interact.
E. Responsibilities for General and Public Welfare: Evaluators articulate and take into account the
diversity of interests and values that may be related to the general and public welfare.

Being ethical in program evaluation and performance measurement is a core
aspect of professional practice. We have only provided a brief overview. For further
information, you may want to review Newman and Brown (1996), which provides an
extensive study of evaluation practice and identifies key ethical principles important
for evaluators.

Practitioners Becoming Researchers: Making Sense of It All

In this final section, let’s return to our gathered practitioners at the conference on
“Program Evaluation and Performance Measurement: Practitioner as Researcher.”
We can eavesdrop as they share their experiences and review their challenges and
accomplishments.

Round Table Discussion at the Conference


The round table discussion Yuki organized for the breakout session was the final event of
the conference. She made sure her friend Mary, program manager at the nonprofit organi-
zation Health First, was at the table. Yuki also recruited Emily, HR director at the city of
Westlawn, and her graduate student intern, Leo; and Jim, assistant chief at the Rockwood
Fire Department, and a graduate student who was helping him, Lavita. Unexpectedly, the
conference keynote speaker, Ty, asked to join. He was a friend of Jim’s, and Lavita worked
for him at the university.
Once everyone took their seats, Yuki started off, thanked everyone at the table for par-
ticipating, and then asked them to go around and introduce themselves. The room for the
breakout session was smaller and provided a cozy atmosphere for these seven people at the
table ready to share their experiences with each other.
Yuki got things rolling, “You all had a recent experience conducting some kind of pro-
gram evaluation or performance measurement that involved a research project. Can you tell
us some of the challenges you faced? I’m sure there were many, but if you were to name
just one, what was it?”
There was a moment of silence. Everyone at the table looked around to see if anyone
was going to say something. Emily broke the ice.
“Yes, I faced many challenges. I could go on and on.” She gave a short laugh. “If I were
to pick one, I think it was at the very beginning phase, trying to identify my research ques-
tion and align it with the data collection method.”
Emily told about her first meeting with Ahmed, when he pointed out that there was a
misalignment in what she wanted to find out in her research and the data she said she
wanted to collect. Jim was nodding his head.
“I had a very similar challenge at the initial stage of my research,” Jim cut in. “I had so
many things on my plate; I was all over the map. Without the help of this professor buddy
of mine,” he gestured toward Ty at the table, “I couldn’t have focused my research
questions.”
No one took up the thread, so Jim continued, “It was also a big challenge for me to
figure out the research design for my alternative service delivery model study. Eventually, I
was able to set up the study using an experimental design. But getting there was not easy.
I guess that’s two challenges, but both are about focusing in the beginning.” He looked
across the table, “Maybe Lavita can add something about the data analysis stage. She was
the stats wiz who got me out of trouble there.”
Lavita blushed and smiled. Emily jumped in before Lavita mustered the courage to say
anything.
“Oh, that’s the same for me, too. I am so grateful for Leo. He taught me so much about statistics; I would not have pulled through the analysis without him.”
“I had a different challenge,” Mary jumped in. “I knew quantitative research, so I started
my project thinking I could use my statistics skills. Then it became clear that I needed to do
interviews and use a qualitative approach. At first I was trying to bend the research to what
I knew how to do. I didn’t know anything about qualitative research. I had to learn it from
scratch. And I had help, too, from Yuki here.”
Yuki giggled and decided to move on.
“Congratulations on completing your projects, despite the challenges. Now, how about
the outcome of your projects? Can you share what happened in your organizations once
you had results?”

Emily started again. “I am glad to report that just yesterday the city council approved
the budget for next fiscal year’s diversity training.”
Others at the table applauded. Leo fidgeted. He had not heard about the budget deci-
sion. He pumped his fist in the air and exclaimed, “Yes!”
Emily smiled around the table. She felt a sense of camaraderie with these people. They understood what it takes to conduct research in a real-world setting and how rewarding it
is to see something come out of it. She decided to keep going.
“The council members were all very pleased to see that we took thoughtful steps to
measure the baseline before the training and compared the difference between those who
took the training versus those who did not. We also did some additional analyses using the
employee survey we conducted to evaluate the impact of the diversity training. That gave
us information on differences in workplace culture across different departments. The execu-
tive team is going to look into some of our findings.”
Emily saw Ahmed walk in from the foyer, either checking how things were going or curi-
ous about the applause. She raised her voice to get his attention.
“And Ahmed, there, has agreed to give us another round of funding from the Community
Foundation to develop and implement a long-term evaluation plan for our training
programs.”
Ahmed smiled at Emily and gave a little wave before retreating. Yuki broke in to keep
things moving.
“It sounds like you definitely used your research well to leverage more funding. Good
job.” She looked around the table, “How about you, Jim? What about your results?”
“Well, we also had a great success in our project,” Jim stated proudly. “First, we got
the accreditation we needed.” Again, applause. “The study we did on response time was
only a small part of it, of course, but I could tell the accreditation team was pretty
impressed with our analysis. Again, I praise the good work Lavita did in writing up the
analysis results.”
Jim paused to allow Lavita her due attention, and continued, “Second, it’s looking like
the city council is going to approve the proposal we made to adopt the alternative service
delivery model we studied and implement it at all fire stations in the city of Rockwood. Our
research results showed that the experimental group of stations with the alternative model
had lower costs, and the decrease was statistically significant. That result sold the idea to
the city council members.”
Ty jumped in. “I just want to comment there, if I may. Not that I want to test you on
what I talked about in the keynote speech, but I want you to think about how Jim’s projects
fit into program evaluation and performance measurement.”
An active discussion followed, mixing details about the research projects and the models
Ty had introduced in his talk, illustrated in a matrix that distinguished formative versus
summative evaluation and process versus outcome evaluation. Eventually, Yuki interjected to pull
the group forward.
“Mary, we didn’t hear what happened after your project was done. Any effects in your
organization?”
“Well, I presented my study results to the board members,” Mary replied hesitantly, “but
actually, the work is still going on. Based on the interviews and some data analysis on the
328  ❖  SECTION III  SUMMING UP: PUTTING THE PIECES TOGETHER

background of the volunteers, we identified a couple of targeted recruiting strategies, and
now we have some volunteers assigned to the recruitment task. One of the things we found out
was that personal connections are a key factor in why people decide to volunteer and also
why people stay. So we thought asking our own volunteers to go out and recruit other
volunteers would be a good way to strengthen personal connections and networking
among the volunteers. That might be a good effect, but we don’t really know yet.”
“That’s a great idea,” Emily commented. “I’m impressed.”
Mary continued, “When I started this project, I was conceptualizing it more or less like
a one-time program evaluation, the kind of thing Ty mentioned in the Q & A session this
morning. I have to say, I have learned some things today. I realize that what I need to do
is to start thinking about how I can establish an ongoing monitoring system to measure
our organization’s performance in recruitment and retention of the volunteers. That makes
it more of a performance measurement project, I guess.”
Yuki carried the thought. “It is definitely an ongoing process, isn’t it? Even after you
submit your research report, there is always something new to follow up on. My final ques-
tion, to all of you: What are the lessons you learned from your current experiences doing a
research project? What did you gain that will help you in your next project?”
Jim jumped in. “It’s important to have a good graduate intern. Ask for help right
away. I think, even if Ty or Lavita are not around for me next time, I will be sure to
brainstorm with people who may be able to help me focus and get a good start. That’s
what I learned.”
Laughter fluttered around the table, but it was clear Jim was serious.
Emily followed up, “That’s true for me, too. Leo was such a great help, and we had a
great team to work things through. I also have a whole lot of appreciation for the kind of
education the students are receiving at grad school. It’s actually useful.”
Yuki smiled and noted, “We all owe a lot to the well-educated students and staff who get
our work done.” She looked toward the two students at the table in an encouraging manner
and said, “Any comments on that? Leo? Lavita?”
Leo took the cue, “When I was taking my research methods and statistics class in
school, I liked the class, but I did not think it was relevant to what I wanted to do. I want
to work in the government sector as an administrator. Working with Emily on the evalu-
ation of the diversity training program changed my mind. Designing the survey and
analyzing the data, all that, made me realize these tools are real. Administrators use
analytic skills.”
Lavita was nodding in agreement. She took over, “I learned a lot by writing up the results
of the study in the report. That’s where it became real for me. I mean, I was really glad to
get the data Jim gave me, and knowing it was recorded right here in the community, but I
guess it was still just data—which I love, don’t get me wrong. The report, though, I knew
was going to the fire chief and the mayor, and all, and it hit me that I was not just writing
for my professor.” She glanced at Ty. “I really had to concentrate on what the data means,
and I understood that accuracy—and fluency—could make a difference in how real policy-
makers made their decisions. That was the wow moment for me.”
Lavita flushed, and stopped abruptly. Yuki picked up.
“That’s great feedback. Thanks Leo and Lavita. Anything else from anyone? We have
about fifteen minutes. Why don’t you take the time to talk about whatever you want?”
Emily looked deep in thought, apparently connected to Lavita’s remarks. Again, she was
first to comment, “I think I gained confidence through this experience.” She continued,
reflectively, “I have a better idea what information I need to develop a program, and now
I know how I should collect that information. I also see how I need information to inform
my decisions and the decisions of other managers in the organization. I think, like Leo and
Lavita, I see now much more clearly how this information is real. I was getting information
before, of course, but to be honest, I don’t think I was using it effectively or really taking
it seriously in my decisions. So, what Leo said is right. These analytic skills are really impor-
tant for managers. Plus, I also had a revelation today”—she smiled at Mary at the table—
“I learned some things today, too. I was on a panel this morning with folks from JPB
Research. You probably know they are the big research firm in the area. I was pretty
intimidated being on the same panel with them, but you know what? Now that I under-
stand all the research terms, I could follow their sampling strategy, what it meant when
they said they used one-way ANOVA for their analysis, and what it means when the result
is significant. It felt pretty good.”
Emily and Leo looked at each other and smiled. “What about you, Mary?” Yuki prompted.
“What’s your takeaway?”
Mary looked down and stared at her coffee cup, and then responded. “I agree with Emily
that this experience gave me confidence. My major takeaway is that now I feel I am a
better advocate for the volunteers. I understand their perspectives, and when I go to the
executive director or the board members, I can make a better argument advocating more
support for the volunteers, because I’m better informed.”
“I feel the same as Mary,” Jim followed. “I’m definitely better informed. And I can
articulate the need for the program.” He paused, looked at Ty, and continued, “This may
sound a little corny, but I now think having research skills makes me a better leader. In
our profession, especially at the level of assistant chief, we talk a lot about leadership.
Leaders need to solve problems, make judgments, secure and allocate resources, be a
good advocate for their subordinates and the organization, and make decisions. That’s
what we learned, and all of that is true. And I see now that good research skills help
me to do all those things leaders need to do. I learned that research is not just knowing
about the experimental design, or sampling methods, or statistical analysis. Doing this
research helped me observe my surroundings carefully, analyze them, and make decisions.
The logic and the thinking process involved in the research, I believe, helped me become
a better leader.”
Everyone at the table expressed agreement with Jim’s comment. Yuki spoke for all of them.
“That was great, Jim. What a powerful way to conclude our round table discussion.”
Yuki then thanked them for taking the time to get together for this special round table
discussion and concluded with enthusiasm, “Good luck with your next venture in program
evaluation, performance measurement, and any other research endeavors.”
Nobody left the room. Emily, Jim, and Mary exchanged business cards and agreed to
stay in touch, perhaps as a sounding board for each other in their next research projects.
Leo and Lavita exchanged their email addresses and decided to get together for coffee
soon. Yuki, Ty, and Ahmed stood in the corner of the room, looking at the intent exchanges
going on.
Ahmed said, “Well, this was a great conference. It looks like this round table discussion
was particularly good. I think we succeeded in creating a community of research-minded
practitioners, didn’t we?”
Yuki and Ty nodded and shook hands. Ahmed shook hands with each of them. They
knew they would see each other again.

Chapter Summary
This chapter introduced program evaluation and performance measurement and discussed
their relevance to the research process. It also discussed the differences and the relationship
between program evaluation and performance measurement and introduced different types of
program evaluation and performance measurement approaches. We discussed whether program
evaluation and performance measurement should be performed by external experts or by
internal members of the organization, and we covered ethical considerations in evaluation.
In this chapter, we also closed the stories of our three key characters and their research partners.
We want to thank these fictional practitioners for sharing their experiences and helping us get
through some complex issues in applied research. May they, and you, prosper in all future endeavors.

Review and Discussion Questions


1. Discuss how research approaches are relevant to program evaluation and performance mea-
surement. What are the specific research techniques you learned in this book that may be
applicable in conducting program evaluation and performance measurement?
2. Describe the differences and relationship between program evaluation and performance
measurement.
3. Describe how Emily, Jim, and Mary’s projects fit into program evaluation and performance
measurement.
4. Identify different examples that fit the four cells in Table 16.1: Matrix of the Type of Evaluation
and Examples.
5. Discuss pros and cons of using internal evaluators versus external evaluators.
6. List key ethical concerns that Emily, Jim, and Mary should be addressing.
7. Research a professional organization and identify its professional code of ethics.
8. Think of examples of performance measurement used for a formative purpose and for a
summative purpose.
References
American Evaluation Association. (2004). Guiding principles for evaluators. Retrieved from http://www.eval
.org/publications/GuidingPrinciplesPrintable.asp
American Society for Public Administration. (2012). ASPA code of ethics. Retrieved from http://www.aspanet
.org/public/ASPA/Resources/Code_of_Ethics/ASPA/Resources/Code%20of%20Ethics1
.aspx?hkey=acd40318-a945-4ffc-ba7b-18e037b1a858
Bingham, R. D., & Felbinger, C. L. (2002). Evaluation in practice: A methodological approach (2nd ed.). New
York, NY: Longman.
Cellini, S. R., & Kee, J. E. K. (2010). Cost-effectiveness and cost-benefit analysis. In J. S. Wholey, H. P. Hatry,
& K. E. Newcomer (Eds.), Handbook of practical program evaluation (3rd ed., pp. 493–530). San
Francisco, CA: Jossey-Bass.
Chen, H. T. (1990). Theory-driven evaluations. Newbury Park, CA: Sage.
Chen, H. T. (1996). A comprehensive typology for program evaluation. American Journal of Evaluation,
17(2), 121–130. doi: 10.1177/109821409601700204
Conley-Tyler, M. (2005). A fundamental choice: Internal or external evaluation? Evaluation Journal of
Australasia, 4(1 & 2), 3–11.
Dahler-Larsen, P. (2009). Learning oriented educational evaluation in contemporary society. In K. E. Ryan
& J. B. Cousins (Eds.), The Sage international handbook of educational evaluation. Thousand Oaks, CA:
Sage.
Denhardt, J. V. (2011). New public service: Serving, not steering. Armonk, NY: M. E. Sharpe.
Greene, J. C., & Caracelli, V. J. (1997). Advances in mixed-method evaluation: The challenges and benefits of
integrating diverse paradigms. San Francisco: Jossey-Bass.
Hatry, H. P. (1997). Where the rubber meets the road: Performance measurement for state and local public
agencies. New Directions for Evaluation, 75, 31–44.
Hatry, H. P. (2007). Performance measurement: Getting results (2nd ed.). Washington, DC: Urban Institute.
Hood, C. (1995). The “new public management” in the 1980s: Variations on a theme. Accounting,
Organizations and Society, 20(2–3), 93–109. doi: 10.1016/0361-3682(93)E0001-W
Hood, C. (2000). Paradoxes of public-sector managerialism, old public management, and public service
bargains. International Public Management Journal, 3(1), 1–22.
Julnes, P. D. L., & Holzer, M. (2008). Performance measurement: Building theory, improving practice. Armonk,
NY: M. E. Sharpe.
Love, A. J. (1998). Internal evaluation: Integrating evaluation and social work practice. Scandinavian Journal
of Social Welfare, 7(2), 145–151.
McDavid, J. C., & Hawthorn, L. R. L. (2006). Program evaluation and performance measurement: An
introduction to practice. Thousand Oaks, CA: Sage.
Newcomer, K. E. (1997). Using performance measurement to improve public and nonprofit programs. San
Francisco: Jossey-Bass.
Newman, D. L., & Brown, R. D. (1996). Applied ethics for program evaluation. Thousand Oaks, CA: Sage.
Osborne, D., Plastrik, P., & Miller, C. M. (1998). Banishing bureaucracy: The five strategies for reinventing
government. Political Science Quarterly, 113(1), 168.
Patton, M. Q. (1997). Utilization-focused evaluation: The new century text. Thousand Oaks, CA: Sage.
Patton, M. Q. (2002). Qualitative research and evaluation methods. Thousand Oaks, CA: Sage.
Patton, M. Q. (2011). Developmental evaluation: Applying complexity concepts to enhance innovation and use.
New York, NY: Guilford.
Patton, M. Q. (2012). Essentials of utilization-focused evaluation. Los Angeles, CA: Sage.
Posavac, E. J., & Carey, R. G. (1992). Program evaluation: Methods and case studies. Englewood Cliffs, NJ:
Prentice Hall.
Rossi, P. H., Freeman, H. E., & Lipsey, M. W. (1999). Evaluation: A systematic approach. Thousand Oaks, CA: Sage.
Scriven, M. (1967). The methodology of evaluation. In R. W. Tyler, R. M. Gagné, & M. Scriven (Eds.),
Perspectives of curriculum evaluation (pp. 39–83). Chicago, IL: Rand McNally.
Scriven, M. (1996). Types of evaluation and types of evaluator. Evaluation Practice, 17(2), 151–161.
United States General Accounting Office. (2011). Performance measurement and evaluation: Definition and
relationships (GAO-11-646SP). Retrieved from http://www.gao.gov/assets/80/77277.pdf
Vanhoof, J., & Petegem, P. V. (2007). Matching internal and external evaluation in an era of accountability
and school development: Lessons from a Flemish perspective. Studies in Educational Evaluation,
33(2), 101–119.
Volkov, B. B., & Baron, M. E. (Eds.). (2011). Internal evaluation in the 21st century [Special issue]. New
Directions for Evaluation, 132. Retrieved from http://onlinelibrary.wiley.com/doi/10.1002/ev.v2011.132/issuetoc
Wholey, J. S. (1996). Formative and summative evaluation: Related issues in performance measurement.
American Journal of Evaluation, 17(2), 145–149. doi: 10.1177/109821409601700206

Key Terms
Cost-Benefit Analysis  321
Cost-Effectiveness Analysis  321
Formative Evaluation  319
Needs Assessment  320
Outcome Evaluation  320
Performance Measurement  317
Process Evaluation  320
Program  319
Program Evaluation  318
Summative Evaluation  319

Student Study Site


Visit the Student Study Site at www.sagepub.com/nishishiba1e for these additional learning tools:

• Data sets to accompany the exercises in the chapter



Appendix A:
Additional SPSS and
Excel Instructions
Instructions for Creating a New Variable

There are situations where you want to convert the values of existing variables and
create new variables. You can do this in two ways: you can recode the variable, or you
can compute a new value.

Recoding Data in SPSS


Here’s a step-by-step guide to recoding and creating a dummy variable in SPSS, using
Emily survey.sav. In the current Emily survey.sav file, Q20 (What is your gender?)
codes male as 1 and female as 2. If you want to create a dummy variable, it must be
coded as 0 and 1.
To accomplish this, follow the steps below:

1. Click Transform → Recode into Different Variables.


2. Move Gender (Q20) into the Numeric Variable -> Output Variable box.
3. Name the new variable female and set the label to female.
4. Under old value enter 1.
5. Under new value enter 0.
6. Click Add.
7. Again, under old value enter 2.
8. Under new value enter 1.
9. Click Add.

334  ❖  RESEARCH METHODS AND STATISTICS FOR PUBLIC AND NONPROFIT ADMINISTRATORS

Figure Appendix .1  Menu Selection for Creating New Variables

Figure Appendix .2  SPSS Recode Dialogue Box

10. Click Continue.


11. Click Change.
12. Click OK.
Following the same steps, you can recode any variable into different variables.
Appendix A❖  335

Figure Appendix .3  Recoding Values of Variables in SPSS

Recoding Data in Excel


In Excel, you can recode the variables in each cell to reflect 1 = female and 0 = male,
either manually or using an IF statement. The variable Q20 appears in column W.
Insert a column directly to the right, and in cell X2 enter =IF(W2=2,1,0). Then copy
the content of this cell all the way down to the end of the data. This statement tells
Excel that the value of X2 should be a 1 if W2 contains 2. If W2 does not contain 2,
then the value should be 0. Once a dummy variable is created, follow the same proce-
dure used to run the multiple regression analysis.
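Outside SPSS and Excel, the same dummy-variable recode can be sketched in Python with pandas. This is not part of the book’s instructions; the column name Q20 and the 1 = male / 2 = female coding follow the example above, while the data values below are invented for illustration.

```python
import pandas as pd

# Invented responses standing in for Emily survey.sav:
# Q20 codes male as 1 and female as 2.
df = pd.DataFrame({"Q20": [1, 2, 2, 1, 2]})

# Recode into a 0/1 dummy (female = 1, male = 0),
# mirroring the Excel formula =IF(W2=2,1,0).
df["female"] = (df["Q20"] == 2).astype(int)

print(df["female"].tolist())  # [0, 1, 1, 0, 1]
```

Once created, the female column can be entered into a regression just like the dummy variable produced in SPSS or Excel.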

Computing Values in SPSS


In Emily’s example, she created a cultural competence score based on the 8 questions
she had in Emily survey.sav. She created a composite variable by computing the average
of the responses to questions 1 through 8.
To accomplish this, follow the steps below:

1. Click Transform → Compute Variable.


2. In the Target Variable box, create a name for the new variable (culturalcompetence
in this case).
3. Click the “()” button on the keypad to begin building an expression.
4. Using the keypad and the variable list, build the following expression: (Q1 + Q2 + Q3
+ Q4 + Q5 + Q6 + Q7 + Q8) / 8.
5. Click OK.
Figure Appendix .4  Computing Value in SPSS

Computing Values in Excel


In order to create the same new variable in Excel, follow the steps below.
In column AF, create a new variable titled culturalcompetence.

1. In cell AF2, enter the formula =SUM(B2:I2)/8.


2. Press the Enter key.
3. Click on cell AF2 and drag the fill handle all the way down to the last case (AF236).
This will insert the formula into each cell, using each row’s own cells as the
references (e.g., B3:I3 in row 3).
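The same composite score can also be sketched in Python with pandas. The Q1–Q8 column names follow Emily’s survey, but the sample responses below are made up for illustration.

```python
import pandas as pd

# Made-up responses to the eight cultural competence questions (1-5 scale).
df = pd.DataFrame({f"Q{i}": [3, 4, 5] for i in range(1, 9)})

# Row-wise average of Q1 through Q8, mirroring the Excel formula
# =SUM(B2:I2)/8 and the SPSS Compute Variable expression.
cols = [f"Q{i}" for i in range(1, 9)]
df["culturalcompetence"] = df[cols].sum(axis=1) / 8

print(df["culturalcompetence"].tolist())  # [3.0, 4.0, 5.0]
```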

Instruction for Creating a Boxplot

In Chapter 7, we introduced the boxplot as a way to visually examine the median and
percentiles of the data. You can also use a boxplot to visually compare the median and
the spread of two or more groups.
To create a boxplot for multiple groups using SPSS, follow the steps below:

1. Open Rockwood 2011 Response Time.sav


2. Click Graphs → Legacy Dialogs → Boxplot.
3. Click on Simple, then click Define.


4. Enter a variable such as response time into the variable box and a variable such
as station location into the category box.
5. Click OK.

Figure Appendix .5  Menu Selection for Boxplot

Figure Appendix .6  Menu Selection for Type of Boxplot


Figure Appendix .7  Defining the Variable and Groupings for Boxplot
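For readers working in Python rather than SPSS, a grouped boxplot can be sketched with pandas and matplotlib. This is a hypothetical stand-in, not the book’s procedure; the variable names echo the Rockwood example (response time by station), but the values below are invented.

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt

# Invented response times (minutes) for two station locations.
df = pd.DataFrame({
    "response_time": [4.2, 5.1, 6.3, 3.8, 5.5, 7.0, 4.9, 6.1],
    "station": ["A", "A", "A", "A", "B", "B", "B", "B"],
})

# One box per station: gather each group's values, then draw side-by-side
# boxplots to compare the groups' medians and spread.
groups = [g["response_time"].to_list() for _, g in df.groupby("station")]
fig, ax = plt.subplots()
ax.boxplot(groups)
ax.set_xticklabels(sorted(df["station"].unique()))
ax.set_xlabel("Station location")
ax.set_ylabel("Response time (minutes)")
fig.savefig("boxplot_by_station.png")
```

Each box shows the five-number summary for one station, so differences in median and spread between the groups can be read directly from the plot.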



Appendix B:
Emily’s Survey Form
Date _________________________ Survey ID _______________

 1. I can discuss my own ethnic/cultural heritage.

Very Rarely Rarely Sometimes Frequently Very Frequently


 2. I am aware of how my cultural background and experience have influenced my attitudes
about psychological processes.

Very Rarely Rarely Sometimes Frequently Very Frequently


 3. I am able to discuss how my culture has influenced the way I think.

Very Rarely Rarely Sometimes Frequently Very Frequently


 4. I can recognize when my attitudes, beliefs, and values are interfering with providing the best
services to my clients.

Very Rarely Rarely Sometimes Frequently Very Frequently


 5. I can discuss my family’s perspective regarding acceptable and nonacceptable codes of conduct.

Very Rarely Rarely Sometimes Frequently Very Frequently


 6. I verbally communicate my acceptance of culturally different persons.

Very Rarely Rarely Sometimes Frequently Very Frequently


 7. I nonverbally communicate my acceptance of culturally different persons.

Very Rarely Rarely Sometimes Frequently Very Frequently


 8. I can identify my reactions that are based on stereotypical beliefs about different ethnic
groups.

Very Rarely Rarely Sometimes Frequently Very Frequently

 9. Please rank the following activities in terms of their usefulness in improving your work unit’s
overall level of understanding on diversity and inclusiveness. (Most useful = 1, Least useful = 3)
Rank
A. Diversity training _____
B. Diversity award event _____
C. Newsletter _____
10. How often are there differences of opinion in your team?

Very Rarely Rarely Sometimes Frequently Very Frequently


11. How often do the members of your team disagree about how things should be done?

Very Rarely Rarely Sometimes Frequently Very Frequently


12. How often do the members of your team disagree about which procedure should be used to
do your work?

Very Rarely Rarely Sometimes Frequently Very Frequently


13. How often are the arguments in your team task-related?

Very Rarely Rarely Sometimes Frequently Very Frequently


14. How much are personality clashes evident on your team?

Very Rarely Rarely Sometimes Frequently Very Frequently


15. How much tension is there among the members of your team?

Very Rarely Rarely Sometimes Frequently Very Frequently


16. How often do people get angry while working in your team?

Very Rarely Rarely Sometimes Frequently Very Frequently


17. How much jealousy or rivalry is there among the members of your team?

Very Rarely Rarely Sometimes Frequently Very Frequently


18. How old are you?

Years Old

19. How many years have you worked at the City?

Year(s)
Appendix B  ❖  341

20. What is your gender?


Male
Female
21. Do you identify yourself as Gay, Lesbian, Bisexual, or Transsexual?
Yes
No
22. What is your ethnic background?
Caucasian/White
Non-Caucasian/Non-White
Rather Not Say
23. Total number of years of education?

Year(s)
24. Which department in the City do you work for?
City Hall Administration
City Hall Technical
Culture and Recreation
Field and Fleet
Public Safety
Transit
Rather not say
25. How many diversity trainings have you attended?

Training(s)
26. Should the diversity training be required?
Yes
No
27. Did you attend the diversity training offered?
Yes
No

Glossary
Abstract: A brief summary of the key points of the study in an academic report format,
usually one paragraph in length.
Academic Report: A format of writing study results which documents the research
flow, including a problem statement, a theoretical framework based on a review of the
literature, data collection and analysis methods, results, and discussion.
Adjusted R Square (R2): Adjusts the value of R2 when the sample size is small, because
an estimate of R2 obtained from a small sample tends to be higher than the actual R2
in the population.
Administrative Records and Management Information: One source for secondary
data, particularly useful for analyzing organizations.
After-Only Design With Comparison Group: A research design that measures the
dependent variable in an experimental group and a comparison group after the research
intervention has occurred.
Alpha (α) Inflation: The increased probability of making a type I error as a result of
conducting multiple t-tests, controlled by the ANOVA test.
Anonymity: Occurs when data is collected without identifying information that can
link the respondents to the response.
Assigned Grouping: A nonrandom technique in which the researcher determines
which participants will be placed in which groups.
Baseline Data: Data which is collected to represent the level or rate of some variable
before an experimental intervention.
Before-and-After Design: A research design that measures the dependent variable
before and after the research intervention has occurred in order to establish a baseline
(before) and compare against that baseline (after).
Bivariate/Simple Linear Regression Analysis: A regression analysis performed when
there is only one independent variable in the regression analysis.
Boxplot: A graphical representation of numerical data which includes a five number
summary, including (1) the lowest value, (2) the highest value, (3) the median value,
(4) the lower quartile, and (5) the upper quartile.
Cases: The individuals or the entities (such as organizations) from which the data are
collected.
Glossary  ❖  343

Causal Research Question: A research question that hypothesizes one factor, X, is a
cause of the effect, Y.
Central Limit Theorem: States that with a sufficiently large sample size, the sampling
distribution of the mean approximates a normal distribution, regardless of the shape of
the population distribution.
Central Tendency: A descriptive statistic that indicates the middle or central position
within a data set.
Chi-Square Analysis: A nonparametric test used to examine the relationship between
two categorical variables.
Closed-Ended Question: A survey question that limits the number of responses to
that particular question.
Cluster Sampling: A probability sampling technique which identifies independent
cases based on a cluster, which is typically a naturally occurring grouping of elements
of the population.
Codebook: A document which provides a guide to the layout and definitions of the data file.
Codes: Labels in a thematic analysis which attach a categorical meaning to bits of text
to represent a single concept.
Coefficient of Determination (R2): A measure of how much of the variance in one
variable is explained by variance in another.
Comparison Groups: In an experiment, a group or groups that receive either no treat-
ment or a different treatment than the experimental group.
Computer-Assisted Telephone Interview (CATI): A telephone-based system of surveying
where the interview script is provided by a computer.
Conceptualization: The act of refining and specifying the abstract concepts in the research.
Confidence Interval: The range obtained from the sample within which the researcher
believes it is likely (at the confidence level) the true population value lies.
Confidence Level: The degree to which the researcher wants to be confident about the
estimation made by the sample, usually 95% confident.
Confidentiality: The assurance that any information that will link the respondents’
identity with the information collected in the research is kept secret.
Confirm/Test the Hypothesized Relationship: A type of research performed with the
objective of testing group differences, the relationship between groups, or cause and
effect based on inferential statistics.
Confounding Factors: Variables that obscure the relationship between other variables
in the study.
Control Group: In an experiment, a group that does not undergo the experimental
treatment to act as a comparison to the group receiving the treatment.
Control Variables: Variables incorporated into the research design to control for the
effects of the extraneous variable.
Convenience Sampling: A non-probability sampling technique where the researchers
take the opportunity to sample from individuals and entities that are conveniently
available to them.
Correlation: The relationship between two concepts of interest.
Correlation Coefficient: A numerical index to represent the relationship between two
variables.
Correlational Research Question: A question which posits that a characteristic of one
individual, condition, object, or event is related to a characteristic of another individ-
ual, condition, object, or event.
Cost-Benefit Analysis: A type of evaluation where all potential outcomes of the pro-
gram are specified in financial terms.
Cost-Effectiveness Analysis: A type of evaluation where the potential outcomes of the
program are captured in financial and nonfinancial terms.
Covariation of the Cause and Effect: One condition necessary to establish a cause and
effect relationship where changes in the dependent variable are related in a systematic
way to changes in the independent variable.
Co-Vary: How the value of one variable changes when the value of another variable changes.
Cross Sectional Survey Design: A research design that utilizes a survey instrument to
collect data at a single point in time.
Curvilinear Relationship: A relationship represented by a curved line, rather than a
straight one.
Data Analysis: The evaluation of either quantitative or qualitative data with the goal of
answering a research question.
Data Archives: Data collected and made available to other researchers for secondary
data analysis.
Data Cleaning: The data preparation process where researchers check the data for
errors and screen it for accuracy.
Data Collection: The process of preparing and collecting data used to gain information
about a particular program or research project.
Data Saturation Point: The point where no new information is being obtained as more
individuals are interviewed; the variety of arguments is exhausted.
Deductive Approach: A type of analytic thinking that forms a hypothesis based on
a pattern of ideas that can be tested to see if it is true, or perhaps, in what specific
instances it is true.
Degrees of Freedom: The sample or group size minus 1 for each set of scores, used to
account for the variance in the population being greater than the variance observed in
the sample.
Dependent Variable (DV) / Outcome Variable / Criterion Variable: The variable that
is hypothesized to be affected by the independent variable(s).
Descriptive Research Question: A research question in which the answer is expected
to document the existence and status of a phenomenon.
Descriptive Statistics: Statistics used to summarize and describe characteristics of
quantitative data, such as central tendency.
Deviance: The difference between a particular data point within a variable and the
mean of that variable.
Directional Research Hypothesis: A hypothesis that identifies a particular direction of
change in the dependent variable.
Disproportional Stratified Sampling: A probability sampling technique where the
researcher intentionally varies the proportion of the subgroups (strata) from their
proportions in the population.
Double-Blind Studies: Studies in which the participant and researcher do not have
knowledge whether the participant is in the treatment or control group.
Dummy Variables: Variables that take a value of 0, representing the absence of an
attribute, or 1, indicating the presence of the attribute.
Effect Size: A measure that tells you the magnitude of difference between groups.
Empirical Phenomenology: A qualitative approach in which the researcher acknowl-
edges they are impacting the study by determining what is relevant.
Errors of Nonobservation: A survey error resulting from inadequate coverage of pop-
ulation, sampling error, or nonresponse.
Errors of Observation: A survey error resulting from the poor wording of a question
or inappropriate selections of the question.
Ethical Implications: A likely moral consequence of the way in which research is
conducted.
Exclusion Criteria: The predetermined set of standards or criteria used to determine
those participants who are ineligible for the study.
Executive Summary: A brief summary of the key points of the study in a nonacademic
report format, usually ranging from a few paragraphs to a couple of pages.
Experimental Design: A form of research design that is identified as having an exper-
imental group and a control group, random assignment to those groups, and in which
the variable of interest is measured both before and after the intervention.
Experimental Group: In an experiment, a group that receives the experimental treat-
ment or manipulation.
Expert Sampling: A non-probability sampling technique where the researcher samples expert
opinions rather than aggregating opinions that represent the population of interest.
Explore and Describe the Phenomenon: A type of research performed with the
objective of exploring and describing a phenomenon through the use of qualitative or
descriptive statistics.
346  ❖  RESEARCH METHODS AND STATISTICS FOR PUBLIC AND NONPROFIT ADMINISTRATORS

External Validity: The extent to which the result of a given research can be applied to
draw a conclusion about the population of interest.
Extraneous Variables / Control Variable: Variables that may influence change in the
dependent variable that were not considered in the hypothesized relationship of inde-
pendent and dependent variables.
Extreme Case Sampling: A non-probability sampling technique where the researcher
selects study participants who fall outside what is normally expected and examines the
elements that make the case extreme or different.
Face-to-Face Interview: A survey technique where the survey participant is inter-
viewed rather than filling out the survey themselves.
Factorial Design ANOVA: Used with multiple grouping variables to examine the
impact of each independent variable on the dependent variable as well as the interac-
tion of the independent variables together on the dependent variable.
Falsifiability: The principle that in scientific inquiry a hypothesis must be falsifiable,
because no matter how many confirming observations you have, you cannot verify that
your observation is universally generalizable.
Focus Group Interviews: A type of group interview with a collection of six to 12 indi-
viduals brought together for a period that ranges from an hour to three hours to discuss
a specific topic, guided by a trained moderator.
Formative Evaluation: A type of evaluation aimed at improving the program imple-
mentation process.
Frequency Distribution: A method of documenting the relative frequencies of a dis-
tribution by first making categories of numbers and then plotting how many values fit
in the different categories.
Frequency Polygon: A graph of the frequency distribution which uses a line to connect
the frequency count of each category.
Frequency Table: A table with categories of numbers and a corresponding count of
how many values fit in each category.
Generalizability: The ability to generalize the research result to the population of interest,
which requires an appropriate sample size, sampling frame, and sampling technique.
Group Assignment: The way in which individuals or entities in a research project are
assigned to groups, either randomly or nonrandomly.
Group Difference Research Question: A research question in which the answer is
expected to confirm the hypothesis that there is a difference between the groups.
Histogram: A visual representation of the frequency distribution where the count for
each group of values goes on the vertical axis and a series of bars show how many times
each range of values occurred in the data set.
History Threat: A potential threat to internal validity which occurs when an external
event may be a threat to the causal argument.

Homoscedasticity: An assumption in linear regression that the degree of random
noise in the dependent variable remains the same regardless of the values of the inde-
pendent variables.
Hypothesis: A tentative statement about the plausible relationship between two or
more variables that is subject to empirical verification.
Hypothesis Testing: The steps to confirm the hypotheses.
Inadequate Coverage of Population in the Sampling Frame: A phenomenon that
occurs when the sampling frame selected to identify the sample did not fully cover the
population of interest.
Inclusion Criteria: The set of predefined standards or criteria used to determine those
participants who are eligible for the study.
Independent Samples T-Test: Used when you have two groups in your sample that are
independent from each other, and you would like to compare their means to see if they
are significantly different.
Independent Variable (IV): The variable (or variables) that you hypothesize as causing
the change in the dependent variable.
Indicators: Items which are measurable phenomenon that substitute for a concept that
is not easily measured.
Inductive Approach: A type of analytic thinking that starts with specific observations
which develop into a specific hypothesis and tentative theory.
Inferential Statistics: Statistics used to draw conclusions about quantitative data.
Informed Consent: A way of informing participants in a study of the real and potential
risks and uses of the study as well as ensuring they acknowledge the information.
Informed Consent Form: The form that the study participants sign in order to indicate
that they are giving their consent to participate in the study.
Instrumentation Threat: A potential threat to internal validity which occurs when the
instrument itself could be influencing the result.
Intercept: The point at which the regression line crosses the Y-axis (where x = 0).
Internal Validity: The extent to which the research design accurately demonstrates the
causal relationship between the variables and is not a reflection of a fault in the research
design.
Interrupted Time Series Designs: A research design that includes several observa-
tions of the dependent variable prior to the intervention which is then compared to
several observations of the dependent variable after the intervention.
Interquartile Range: The difference between the 75th percentile point and the 25th
percentile point.
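A quick way to see this computation (scores are made up; Python's `statistics.quantiles` with the "inclusive" method is one of several quartile conventions):

```python
from statistics import quantiles

def interquartile_range(data):
    """75th percentile point minus the 25th percentile point."""
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    return q3 - q1

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]
iqr = interquartile_range(scores)  # 7 - 3 = 4
```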

Iterative Process: In research, the researcher may be required to revisit past steps of the
process in order to ensure the research is properly aligned.
Kurtosis: A measure of the shape of the distribution that indicates the degree of peakedness.
Leptokurtic: Describes a distribution that is more peaked than a normal curve, repre-
sented by a positive kurtosis value.
Level of Significance: The quantifiable risk of committing an error that the researcher
is willing to take.
Levels of measurement: The character of the variable which can be nominal (categor-
ical or dichotomous), ordinal, interval, or ratio.
Levene’s Test: A test used to see if the population variances for the two groups you are
comparing in your analysis are equal.
Linearity: An assumption in linear regression that the relationship between the depen-
dent variable and the independent variables is linear in nature.
Linear Regression Slope or X Coefficient: The slope of the regression line indicating
how much the Y value changes when there is a one-unit change in the value of X.
Literature Review: The process of gathering and critically reviewing information from
sources such as reports, books, and journal articles to gain background information
on a particular topic, assess past work, and assist in formulating the research question.
Logistic Regression: A regression analysis approach used to predict the outcome when
the dependent variable is dichotomous.
Mail Survey: The method of delivering a paper and pencil survey form through the mail.
Margin of Error: The degree of difference (or error) between the characteristics repre-
sented in the sample and the population.
Matched Subjects Design: A research design where you have a pair of people or sub-
jects assessed once on the same measure.
Matching: The case where the researcher deliberately matches certain characteristics of the
individuals or entities participating in the study to make the groups appear comparable.
Maturation Threat: A potential threat to internal validity which occurs as a result of the
study participants learning from their daily experiences as well as physically maturing.
Mean: A measure of central tendency represented by an arithmetic average.
Measurement Error: A survey error resulting from the poor wording of a question or
inappropriate selections of the question.
Measures of Variability/ Dispersion/ Spread: Measures which represent how much
the values in the data differ from each other.
Median: A measure of central tendency represented by the value found at the exact middle
of the range of values for a variable, when the values are listed in numerical order.
Mesokurtic: Describes a distribution that is a normal curve, represented by a zero kur-
tosis value.

Mixed Design ANOVA: Used when you want to explore the effect of one or more
grouping variables on one or more repeated measures.
Mode: A measure of central tendency represented by the value that occurs most fre-
quently in the data set.
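The three measures of central tendency defined above (mean, median, and mode) can be computed directly; a small sketch with invented commute times:

```python
from statistics import mean, median, mode

commute_minutes = [10, 15, 15, 20, 25, 30, 90]
avg = mean(commute_minutes)    # arithmetic average, pulled upward by the outlier 90
mid = median(commute_minutes)  # exact middle value of the ordered list: 20
top = mode(commute_minutes)    # most frequently occurring value: 15
```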
Mortality (Attrition) Threat: A potential threat to internal validity which could occur
if research participants drop out of the study.
Multicollinearity: Occurs when there is a strong relationship among the independent
variables.
Multiple Regression Analysis: A regression analysis performed when there is more
than one independent variable.
Naturally Occurring Grouping: A nonrandom technique in which the researcher
specifies the qualities of interest and studies the people found to fit the definition.
Needs Assessment: A type of research that attempts to analyze the needs of a particular
group and evaluates the appropriateness of a social program to address that need.
Negative Correlation: A relationship between two variables where an increase in one
variable is related to a decrease in the other variable.
No Plausible Alternative Explanation: One condition necessary to establish a cause-
and-effect relationship where there are no other factors responsible for the observed
changes in the dependent variable.
Nonacademic Report: A format of writing study results which is more flexible than the
academic report, but usually includes an executive summary.
Nondirectional Research Hypothesis: A hypothesis that does not identify a particular
direction of change in the dependent variable.
Nonequivalent Groups: Occurs when group assignment is done in a nonrandomized
manner.
Nonparametric Test: A family of tests which applies to categorical data and does not
require a normal distribution.
Nonparticipant Observation: A method of observation where the researcher is isolated
from the observed group and observing them from the outside while collecting data.
Non-Probability Sampling: A sampling technique where the probability of any case
selected to be part of the sample is not known.
Nonrandom Assignment (Nonequivalent Groups): The case when the assignment of
the study participants or entities is not randomized.
Nonresponse: When a selected individual or entity does not choose to participate in
the data collection.
Normal Curve: A theoretical distribution which has perfect symmetry as well as equal
mean, median and mode.
Normal Distribution: The theoretically ideal bell-curve distribution where the mean,
median, and mode are the same and located at the exact midpoint of the distribution.

Normality: An assumption in linear regression that the dependent variable is mea-
sured as a continuous variable and is normally distributed.
Null Hypothesis: A hypothesis which states that there is no relationship between the
variables, often the opposite of the research hypothesis.
Observation: The act of watching the phenomenon or the behavior you are interested in
researching and recording it so you can describe, analyze, and interpret what it means.
Observer Effect: The impact that the act of observing study participants has on the study results.
Omnibus Test: Used in reference to ANOVA because the test does not indicate between
which pairs there is a significant difference.
One-Sample T-Test: Used when you have only one sample, and you are comparing its
mean to some other set value.
One-Way ANOVA: Used when you want to compare the means of several independent
groups.
Open-Ended Question: A survey question that allows the respondents to answer in
any way they like and add additional commentary.
Operationalization: The process of developing research procedures (operation) that will
result in empirical observations which represents the research concepts in the real world.
Oral History: A method used to collect data about the past that stretches over a long
time period.
Oral Presentation: One mechanism for communicating study results in a conference
setting, public testimony, or other oral format.
Outcome Evaluation: A type of evaluation which focuses on the program’s intended
outcomes.
Outliers: An extreme score of a variable within the data set.
Oversample: The sampling of a subpopulation beyond the level at which they appear
in the population with the intent of obtaining a more reliable sample from that group.
Paired-Sample T-Test: Procedure used when you want to compare the means of two
groups that are closely related or matched, or when one group is measured twice, and
you would like to compare their means to see if they are significantly different.
Paper and Pencil Survey: A method of survey data collection where the respondents
are asked to fill out a hard copy of a survey.
Parametric Tests: A family of tests which is based on the assumption that the underlying
population is normally distributed.
Participant Observation: A method of observation where the researcher becomes a
member of the observed group while collecting data.
Pearson Product Moment Correlation Coefficient (r): One correlation coefficient
used to represent the relationship between two variables that are continuous in nature.
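A minimal sketch of the computation (the data and helper name are invented): r divides the covariation of the two variables by the product of the magnitudes of their deviations, so it always falls between -1 and +1.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product moment correlation between two continuous variables."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    ss_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cross / (ss_x * ss_y)

hours_trained = [1, 2, 3, 4, 5]
skill_score = [2, 4, 6, 8, 10]
r = pearson_r(hours_trained, skill_score)  # perfectly linear, so r = 1.0
```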

Percentile Points: The value identified to show what percentage of the data are less
than or equal to that particular value.
Performance Measurement: The process of designing and measuring a specific out-
come related to program performance.
Placebo Effect: The psychological effect that results from the mere fact of being a par-
ticipant in a study regardless of receiving the treatment.
Platykurtic: Describes a distribution that is less peaked than a normal curve, repre-
sented by a negative kurtosis value.
Population: The complete set of people or entities that the researcher is interested in
studying.
Positive Correlation: A relationship between two variables where an increase in one
variable is related to an increase in the other variable.
Post Hoc Test: A test conducted after the initial omnibus test to determine which of
the groups differ.
Power: The probability of correctly rejecting the null hypothesis.
Primary Data: Data that is collected by the researcher for the given study.
Probability Sampling: A sampling technique where each unit in your population has
an equal chance of being selected for the sample.
Process Evaluation: A type of evaluation which focuses on the processes occurring
within a program.
Program: Activities aimed at achieving a specific goal or set of goals.
Program Evaluation: The process of assessing how well a clearly identifiable program
is working, and whether the program has achieved its overall objectives.
Proportional Stratified Sampling: A probability sampling technique where the sam-
ple that the researcher selects reflects the actual proportion of the subgroups (strata) in
the population.
Purposive Sampling: A non-probability sampling technique where the researcher
selects samples based on particular predetermined criteria.
P-Value: The probability of observing a test statistic as extreme as, or more extreme
than, the one obtained, given that the null hypothesis is true.
Qualitative Data: Data which captures information as words in the form of narratives
or statements.
Quantitative Data: Data which captures information using some kind of measure-
ment, usually numbers.
Quasi-Experimental Design: A research design which compares groups before and
after a treatment or intervention, but group assignment is not random.

R (Multiple R): An indicator of how well the overall regression equation predicts the
observed data.
Random Assignment: The case where all study subjects are given an equal chance to
be assigned to one of the groups in the study.
Range: A measure of variability found by calculating the difference between the high-
est value and the lowest value in the data set.
Raw Data: The unsummarized and nontabulated form of data before it is analyzed.
Reference Group: The category designated as 0 in the dummy variable.
Regression Coefficient (b)/Slope: Indicates how much the Y value changes when
there is a one-unit change in the value of X.
Regression Line/ Line of Best Fit: The line that best represents the pattern of the rela-
tionship between the dependent variable and the independent variable.
Regression Threat/ Regression Artifact/Regression to the Mean: A potential threat
to internal validity that occurs as a result of the statistical phenomenon that mean
scores from a nonrandom sample of a population, when measured twice, move closer
to the population mean.
Repeated Measures ANOVA: Used when you want to compare the means of three or
more groups that are related.
Repeated Measures Design: A research design where data is collected from the same
group twice.
Reporting: Articulating the implications of the research results in relation to the
research questions and research objective.
Research Alignment: A research approach that integrates and aligns the components
of the research process. These components include the research question, research
design, data collection, data analysis, results, and interpretation.
Research Design: The overarching strategy for how the various components of research
are assembled to answer the research question.
Research Hypothesis / Alternative Hypothesis: A hypothesis which states that there is
a relationship between two or more variables of interest.
Research Objective: A statement that identifies which problem(s) will be addressed by
the research project.
Research Question: A question that addresses the research objective and can be
answered through the collection and analysis of data.
Research Topic: Broad descriptions or areas of interest in which there is an articulated
problem to be addressed.
Researcher Bias: The potential for researchers to find results that support their hypoth-
esis by reflecting their own values and beliefs in the data.
Residual Sum of Squares (RSS): How much the actual score in the dependent variable
differs from the value estimated by the regression equation.

Residuals/Error in Prediction: The difference between the observed values and the
regression line.
Robust: The degree to which a statistic is resistant to errors, so that even when its
assumptions are violated, the results are not unduly affected.
R-Square (R2): See Coefficient of Determination.
Sample: A group of individuals or entities selected for study from the population.
Sample Selection: Identifying from whom or what the data will be collected.
Sample Size: The number of observations in a sample.
Sampling: The process for identifying the subset of people or entities from which to
gather the data.
Sampling Distribution: A distribution which draws all possible samples of the same
size from a population and plots the mean value of the samples.
Sampling Error: The degree of difference (or error) between the characteristics repre-
sented in the sample and the population.
Sampling Error/ Margin of Error: The difference between the value from the samples
and the population value.
Sampling Frame: The list that contains information about each element of the popula-
tion from which the researchers can draw the sample.
Sampling Technique: The approach the researcher uses to select the sample from the
population.
Saturation Point: The point at which the researcher feels that no new or relevant infor-
mation is obtained from additional data collection.
Scatterplot: A graph which displays values of two variables as a collection of points on
the space determined by horizontal axis (X) and vertical axis (Y).
Secondary Data: Data that has already been collected for another purpose but is being
used by the researcher for the given study.
Secondary Data Analysis: Analysis performed on data gathered in the past rather than
collected by the researcher at the time of the study.
Selection Bias: Occurs when there is a systematic error in the group assignment which
influences the outcome of the study.
Selection Interaction Threats: Occurs when the selection threat to validity interacts
with other threats to internal validity.
Selection Threat: A potential threat to internal validity that occurs as a result of
selection bias.
Significance Test: A test to determine if the null hypothesis should be accepted or
rejected.
Simple Random Sampling: A probability sampling technique that ensures every
member of the population has an equal chance of being selected for the sample.
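For instance (hypothetical sampling frame; Python's `random.sample` draws without replacement, giving every member an equal chance of selection):

```python
import random

def simple_random_sample(frame, size, seed=None):
    """Draw `size` cases so every member of the frame is equally likely."""
    rng = random.Random(seed)  # seed used only to make the sketch reproducible
    return rng.sample(frame, size)

frame = list(range(1, 101))                        # sampling frame of 100 members
sample = simple_random_sample(frame, 10, seed=42)  # 10 distinct members
```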

Single-Blind Studies: Studies in which the participant does not have knowledge
regarding whether they are in the treatment or control group.
Skewness: A measure of the degree of lopsidedness (non-normality) of the frequency
distribution.
Snowball Sampling: A non-probability sampling technique where the researcher first
identifies one person (or entity) to contact and collect information from. Subsequent
participants are then selected by asking the first study participant to introduce others
whom he or she thinks would be useful to include in the research.
Socially Desirable: The tendency of research subjects, particularly in qualitative
research, to offer responses they believe others will view favorably rather than their
candid views.
Solomon Four-Group Design: A research design which utilizes four groups in a hybrid
experimental design: the first group (A) has a pretest and posttest with intervention;
the second group (B) is a control group to Group A, with a pretest and posttest, but no
intervention; the third group (C) receives an intervention like Group A, and a posttest,
but no pretest; and the fourth group (D) is a control group for Group C, with a posttest,
but no pretest and no intervention.
Sphericity: The variances of the differences between all combinations of related groups
(levels) are equal.
Standard Deviation: A measure of variability found by calculating the square root of
the variance.
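These two measures (the variance, then its square root) can be sketched as follows. The formulas are the population versions implied by the definitions here (a sample variance would divide by n - 1), and the data are invented:

```python
from math import sqrt

def variance(data):
    """Average of the squared deviances from the mean."""
    m = sum(data) / len(data)
    return sum((x - m) ** 2 for x in data) / len(data)

def standard_deviation(data):
    """Square root of the variance."""
    return sqrt(variance(data))

values = [2, 4, 4, 4, 5, 5, 7, 9]  # mean 5, variance 4
sd = standard_deviation(values)    # 2.0
```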
Standard Error: An estimate of the size of sampling error in any given sample.
Statistical Probability: The probability that the result obtained based on the character-
istics of the sampling distribution is due to chance.
Statistically Significant: When the null hypothesis is rejected as a result of the signif-
icance test.
Statistics: The study and set of tools and techniques used to quantitatively describe,
organize, analyze, interpret, and present data.
Stratified Random Sampling: A probability sampling technique where the population
is first divided into subgroups based on certain characteristics (strata), and the ran-
dom sample is then selected from each of the subgroups (strata).
Summative Evaluation: A type of evaluation concerned with the program as a whole,
with a strong focus on the financial viability of the program.
Survey: A data collection tool which asks questions of the sample in a standardized form.
Syllogism: A logical argument which describes the relationship between two variables
based on a major premise, minor premise, and a conclusion.
Systematic Random Sampling: A probability sampling technique where the first ele-
ment of the sample is randomly selected from the sampling frame, where the cases are
ordered sequentially, followed by the selection of every kth element.
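A sketch of the every-kth-element rule (hypothetical frame; the random start within the first interval supplies the required randomness):

```python
import random

def systematic_sample(frame, k, seed=None):
    """Random start within the first k cases, then every kth element after it."""
    rng = random.Random(seed)  # seed used only to make the sketch reproducible
    start = rng.randrange(k)
    return frame[start::k]

frame = list(range(1, 51))                     # ordered sampling frame of 50 cases
sample = systematic_sample(frame, 10, seed=7)  # 5 cases, one per interval
```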

Telephone Survey: A method of survey data collection where the participants are
asked a series of questions over the phone to complete the survey.
Temporal Precedence: One condition necessary to establish a cause and effect rela-
tionship which holds that the changes in the independent variable precede the changes
in the dependent variable.
Testing Threat: A potential threat to internal validity that occurs when measurement
takes place more than once resulting in learning among the participants.
Textual Analysis / Content Analysis: The act of collecting and analyzing texts as a data
collection method.
Thematic Analysis: An approach to qualitative research that focuses on identifying
themes that adequately represent the data.
Time Series Design: A research design which takes measures or observations of a sin-
gle variable at many consecutive periods in time.
Total Sum of Squares (SST): The sum of squared deviations from the mean.
Trend Analysis: Analyzing data over multiple time periods in order to draw conclu-
sions about general trends.
Two-Way Contingency Table Analysis: A 2×2 table that cross-tabulates the frequency
of the cases within each possible category when the two variables are combined.
Type I Error: The incorrect rejection of the null hypothesis (a false positive).
Type II Error: The incorrect acceptance of the null hypothesis (a false negative).
Unit of Analysis: The person or entities being studied in the research.
Variables: The documented information measured or observed during the course of
the data collection process.
Variance: A measure of variability found by calculating the average of the squared
deviances.
Variance Inflation Factor (VIF): A diagnosis for multicollinearity where a value of 10
indicates a multicollinearity problem.
Variation: The concept that when something is measured multiple times, there will be
a different result each time.
Web-Based Survey: A method of survey data collection where the participants are
asked to complete a survey online.
Weighting: An adjustment made to the sample in order to generalize to the population,
especially in cases where disproportionate sampling is utilized.
Z Score: The number of standard deviations the observation lies from the mean.
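For example (numbers invented): with a mean of 100 and a standard deviation of 15, a score of 130 lies two standard deviations above the mean.

```python
def z_score(x, mean, sd):
    """Number of standard deviations the observation lies from the mean."""
    return (x - mean) / sd

z = z_score(130, mean=100, sd=15)  # 2.0
```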

Index
Abstract, of report, 304, 342
Academic report, 302–303, 342
Adjusted R-square (R2), 267, 275–276, 342, 352
Administrative Records and Management Information, 109, 342
Advocacy skills, 10
After-only design with comparison group, 59, 60, 67, 342
Alignment. See Research alignment
Alpha (α) inflation, 342
Alpha (α) level, 160, 167, 200
Analysis of variance (ANOVA), 193–221
  comparing more than two groups, 195–196
  effect size and, 201
  factorial ANOVA, 217 (table)–218 (table)
  f-statistic and, 197–199, 198 (figure), 203
  introduction to, 196–201
  mixed design ANOVA, 217–218, 218 (table)
  null hypothesis in, 200, 202
  one-way ANOVA, 197, 201–210
  post hoc tests, 200–201, 206 (figure), 207 (figure), 351
  reasons to conduct, 197
  repeated measures ANOVA, 197, 210–216
  summary of different types of, 219 (table)
Anonymity, 111, 342
ANOVA. See Analysis of variance (ANOVA)
APA style, 303
Assigned grouping, 41, 342
ATLAS.ti, 289
Attrition (mortality) threat, 54
Availability sampling, 83

Baseline data, 51, 342
Before-and-after design, 15, 21, 59, 60–61, 68, 342
Before-and-after two group design, 59, 61, 88–89
Bias
  data collection and, 293
  researcher, 107, 111, 294, 353
  selection, 55, 56, 58, 61, 354
  survey questions and, 95
Bingham, R. D., 317
Bivariate correlation, 222–238
  causality and, 236–237
  examining relationships, 223–224
  hypothesis testing/statistical significance for, 230–231
  Pearson product moment correlation (r), 224–230, 231–233 (figure)
  running using Excel, 235–236 (figure), 269 (figure)–270 (figure)
  running using SPSS, 231–235, 232 (figure), 265 (figure)–268 (figure)
Bivariate/simple linear regression analysis, 258, 342
Bonferroni, Carlo, 200
Bonferroni correction, 200, 201, 215
Bottom-up approach, 23
Boxplots
  creating in SPSS, 336–338 (figure), 337 (figure)
  defining, 342
  example of, 129–130 (figure)
Brown, R. D., 324

Cases, 121, 343
Categorical level of measurement, 123, 124
Categorical variables, 78, 240–242
Causality, bivariate correlation and, 236–237
Causal research question, 41, 343
Cause and effect research design, 51–56, 59
  covariation of cause and effect, 51, 52–53, 344
  temporal precedence, 51, 52 (figure), 355
Cellini, S. R., 322
Central limit theorem, 164, 343
Central tendency
  defining, 343
  skewness and, 143 (figure)
  See also Measures of central tendency
Chavkin, A., 310
Chavkin, N. F., 310
Chen, H. T., 320
Chicago style, 303
Chi-square analysis, 242–252
  calculating statistics for, 243–245, 244 (table)
  definition of, 343
  relationships between two categorical variables, 240–242
  running using Excel, 249–251 (figure), 250 (figure)
  running using SPSS, 245–248, 246 (figure)–248 (figure)
  samples size, 245
  statistical significance and, 245
Ciulla, J. B., 10
Closed-ended question, 94, 96, 343
Cluster sampling, 77, 81–82, 83 (figure), 343
Codebook, 121, 343
Codes, defining, 286, 343
Coding, 286–287, 290 (table), 291 (table)–292 (table)
Coefficient of determination (R2), 262–265, 343
Cohen’s d, 201
Comparison groups, 16, 343
Computer-assisted telephone interview (CATI), 98–99, 343
Computing value in SPSS, 336 (figure)
Conceptualization, 96–97, 343
Confidence interval, 78, 343
Confidence level, 77, 78, 168, 215, 343
  significance level vs., 161
Confidentiality, 111, 343
Confirmatory approach, 34
Confirm/test hypothesized relationship, 32, 40–41, 120, 343
Confounding factors, 15, 344
Content analysis, 107, 355
Continuous level of measurement, 124, 126
Continuous variables, 78, 125
Control group, 55, 57, 344
Control variables, 156, 344
Convenience sampling, 83, 344
Correlation, 225, 344
Correlational research question, 41, 344
Correlation coefficients
  defining, 344
  guidelines for interpreting, 227 (table)
  types of, 225 (table)
Cost-benefit analysis, 321, 322, 344
Cost-effectiveness analysis, 321–322, 344
Covariation of cause and effect, 51, 52–53, 344
Co-vary, 225–226, 343
Cross sectional survey design, 50, 344
Cultural competence and workplace conflict case study, 3–4
  bivariate correlation, 223–224
  comparing means between two groups, 173–174
  comparing means of more than two groups, 195–196
  data analysis, 23, 282–283
  data collection, 22, 88–89, 99, 101, 282–283
  descriptive statistics analysis, 118–119
  hypothesis testing, 152–153, 156–158 (figure)
  nonalignment of research and, 339–341
  predicting relationships, 255–256
  program evaluation/performance measurement, 314–315, 325–330
  relationships between two categorical variables, 240–241
  reporting, 23
  report writing, 299–300
  research design, 48
  research objective, 18, 19–20
  research steps, summary of, 24 (table)
  sampling, 21, 73–74, 84–85
  survey for, 339–341
  types of research, 36
Curvilinear relationship, 229, 230 (figure), 344

Data analysis
  defining, 344
  types of, 35–36
  See also Data collection; Qualitative data analysis; Quantitative data analysis
Data archives, 109, 344
Data cleaning, 122, 344
Data collection, 21–22, 87–114
  bias in, 293
  defining, 344
  ethical considerations in, 110–111
  focus group, 102–106
  identifying methods for, 88–91
  interviews (see Interviews)
  nonresponse, 93, 350
  observations, 68, 106–107, 278, 355
  secondary data, 109–110
  surveys (see Surveys)
  types of data, 91
Data saturation point, 293, 345
Decision making, research skills as facilitating, 10
358  ❖  RESEARCH METHODS AND STATISTICS FOR PUBLIC AND NONPROFIT ADMINISTRATORS

Deductive approach Error
defining, 34, 345 inadequate coverage of a population, 93, 347
vs. inductive approach, 33–35 measurement, 92, 348
Degrees of freedom (df), 199, 345 nonobservation, 92–93, 345
Dependent variable (DV)/outcome variable/ nonresponse, 93, 350
criterion variable, 155, 156, 257–258, 345 observation, 92, 345
Descriptive research question, 40, 345 residuals/error in prediction, 263–264
Descriptive statistics sampling error/margin of error, 77, 93, 162,
defining, 35, 120, 345 164, 353
kurtosis, 139 (figure)–140 (figure), 348 standard, 164, 354
mean, 127 (table)–128, 348 survey, 92–93
measures of central tendency, 126–131 Type I error, 166–168, 200, 203, 210,
measures of shape of distribution, 137–144 215, 355
measures of variability, 131–137 Type II error, 167, 168, 200, 201, 215, 355
median, 128–130 (figure), 129 (table), 349 Errors of nonobservation, 92–93, 345
mode, 130–131 (table), 349 Errors of observation, 92, 345
normal distribution curve, 138–139, Eta squared (η²), 201
163 (figure) Ethical implications, 22
overview of, 126 data collection and, 110–111
range, 133–134 (figure), 352 defining, 345
running using Excel, experimental/quasi-experimental design
147 (figure)–148 (figure) and, 69
running using SPSS, 145–147, program evaluation/performance
145–147 (figure), 146 (figure) measurement and, 324–325
skewness, 142 (figure)–144, 143 (figure), 354 Ethical leadership, research skills as
standard deviation, 136–137, 141 (figure), supporting, 10–11
163 (figure), 354 Ethridge, M. E., 49
variance, 77, 78, 134–136 (table), Exclusion criteria, 76
135 (table), 355 Executive summary, 304–305, 345
Deviance, 134–135 (table), 263, 345 Exhaustive, 95, 96, 123
Df (degrees of freedom), 199, 345 Expected frequency scores, 244, 245
Dichotomous variable, 123, 125 Experimental design
Dillman, D. A., 94 definition of, 345
Directional research hypothesis, 159–160, 345 ethical considerations in, 69
Disproportional stratified sampling, 80, 81, group assignment and, 57–58
82 (figure), 345 groups and, 57
Double-barreled questions, 95 illustration of, 57 (figure), 58 (figure)
Double-blind studies, 107 interrupted time series design, 68
Double entry, 122 key elements of, 56–59
Double negative, 95 Minneapolis Domestic Violence Experiment,
Dummy variables, 258, 271–273, 345 66–67
coding, 272 (table) observations and, 56
Dunnett’s C post hoc test, 201 placebo design, 67
Duplicate elements, in sampling frame, 76–77 Solomon Four-Group Design, 67–68
DV (dependent/outcome/criterion variable), threats to, 55
155, 156, 257–258, 345 time issues in, 58–59
time series design, 68
Effect size, 78, 201, 345 treatment/intervention and, 57
Emily. See Cultural competence and workplace Experimental group, 55, 57, 345
conflict case study Expert sampling, 84, 346
Empirical phenomenology, 107 Exploratory approach, 33
Explore and describe the phenomenon, 32, 40, problems in, 76
120, 346 sampling and, 77, 78, 81, 83, 84, 292–293
External validity, 75, 346 statistical significance and, 161, 184
Extraneous variables/control variable, 156, 346 surveys and, 92
Extreme case sampling, 84, 346 Gosset, William, 175
Green, S. B., 245
Face-to-face interview, 99, 121, 346 Grounded approach, 34, 36
Factorial design ANOVA, Group assignment
217 (table)–218 (table), 346 defining, 61, 347
False negative, 167 experimental design and, 57–58
False positive, 166 Group difference research question, 41, 347
Falsifiability, 159, 346 Groups
Felbinger, C. L., 317 assigned grouping, 41, 342
Fire department operational efficiency case comparison, 16, 343
study naturally occurring, 41, 349
alternative service model, 5–6, 90 nonequivalent, 58, 349
comparing means between two groups, reference, 271, 272, 273, 276–277, 352
174, 179–180, 185–186 Groves, R. M., 94
comparing means of more than two groups, Guiding Principle for Evaluators (2004;
195–196 American Evaluation Association),
data collection, 90, 109–110 324, 325 (table)
data-driven approach, 4–5
descriptive statistics analysis, 119–120 Haphazard sampling, 83
focus of research, 27–29 Harding, F. D., 8
hypothesis testing, 153–154 Hawthorne effect, 107, 293
program evaluation/performance Histogram, 137, 139–144, 141 (figure),
measurement, 315–316, 325–330 142 (figure)–143 (figure), 347
report writing, 300–301 History threat, 53–54, 347
research design, 59–65, 63 (figure), Homogeneity of variance, 179, 202, 203,
65 (figure) 207 (figure)
research objectives, 31–32 Homoscedasticity, 258, 347
research questions, 37–40 (table), 39 (figure) Honestly significant difference (Tukey’s HSD)
Fisher, R. A., 159, 197 test, 200, 201, 208 (figure)
Fisher’s LSD (least significant difference) test, Hypothesis
200–201, 215 defining, 155
Fleishman, E. A., 8 developing, 155–158
Focus group, description of, 108 (table) directional, 159–160, 345
Focus group interviews, 102, 104–106, 346 nondirectional, 159–160, 349
Focus of research, identifying, 27–30 null (see Null hypothesis)
Foreign elements, in sampling frame, 76 Hypothesis testing, defining, 350
Formative evaluation, 319, 346 Hypothesis testing, with inferential statistics
Fowler, F. J., 94 defining, 34, 154, 347
Frequency distribution, 137–144, 346 developing hypotheses, 155–158
Frequency polygon, 138 (figure), 346 errors and risks in, 166–168
Frequency table, 137 (table)–138, 346 falsifiability and, 159, 346
F-statistic, 197–200, 198 (figure), 203, 267 four possible outcomes in, 167 (table)
hypothesis testing, summary of steps in, 166
Games-Howell post hoc test, 201 normal distribution, 162, 163 (figure)
Generalizability sampling distribution of mean, 162–165
defining, 75, 346 statistical probability, 162, 354
falsifiability and, 159 statistical significance, 160–162
statistical vs. practical significance, 168 Leadership skills, as research skills, 8–11
using inferential statistics, 152–154 Least significant difference (Fisher’s LSD) test,
variables in hypothesized relationship, 200–201, 215
155–156 Leptokurtic, 139 (figure)–140 (figure), 348
Level of significance, 160–162, 167, 348
Inadequate coverage of a population, 93, 347 Levels of measurement, 122–126
Inclusion criteria, 75 categorical, 123, 124
Independent samples t-test, 178–186 continuous, 124, 126
assumptions of, 178–179 defining, 348
overview of, 175 interval, 124
running using Excel, 184–185 (figure) key characteristics of, 125 (figure)
running using SPSS, 180–184, 181 (figure), nominal, 123
182 (figure), 183 (figure) ordinal, 123–124 (figure), 125–126
Independent variable (IV), 155–156, 258, 347 ratio, 124, 125
In-depth interview, 51 Levene’s Test, 179, 182–184, 348
Indicators, 97 Likert item, 96
Inductive approach Likert scale, 96, 98 (figure), 125–126
defining, 34, 347 Linearity, 258, 348
vs. deductive approach, 33–35 Linear regression analysis, 257–265
Inferential statistics, 36, 120, 126. See also assessing individual predictors, 265
Hypothesis testing, with inferential assessing prediction, 262–265,
statistics 263 (figure)–264 (figure)
Informed consent, 69, 347 basis for prediction, 258–262,
sample form for, 111 259 (figure)–261 (figure)
Informed voluntary participation, 110–111 Literature review, 9, 20, 42–44, 305, 348
Instrumentation threat, 54, 347 Logistic regression, 278, 348
Intercept (a), 261, 347
Internal validity Mail survey, 99, 348
defining, 53, 347 MANOVA (multivariate test), 215
threats to, 53–56 Margin of error, 77, 348
Interquartile range, 130 Mary. See Volunteer management case study
Interrupted time series design, 68, 278–279, 347 Matched subjects design, 186, 348
Interval level of measurement, 124 Matching, 62, 348
Interviewer effect, 293–294 Maturation threat, 54, 348
Interviews Mean, 127 (table)–128, 348. See also Analysis
computer-assisted telephone, 98–99, 343 of variance (ANOVA); T-tests
description of, 108 (table) Measurement error, 92, 348
face-to-face, 99, 346 Measures of central tendency, 126–131
focus group, 102, 104–106, 346 choosing appropriate, 131, 132 (table)
in-depth, 51 mean, 127 (table)–128, 348
interview guide, 101–102, median, 128–130 (figure), 129 (table), 349
103 (table)–104 (table) mode, 130–131 (table), 349
Iterative process, 18, 21, 294, 348 Measures of shape of distribution, 137–144
IV (Independent variable), 155–156, 258, 347 Measures of variability/dispersion/spread,
131–137
Jacobs, T. O., 8 defining, 348
Jim. See Fire department operational efficiency range, 133–134 (figure), 352
case study standard deviation, 136–137, 141 (figure),
163 (figure), 354
Kish, Leslie, 76 variance, 77, 78, 134–136 (table),
Kurtosis, 139 (figure)–140 (figure), 348 135 (table), 355
Median, 128–130 (figure), 129 (table), 349 Objective data collection, 111
Mesokurtic, 139 (figure)–140 (figure), 349 Observations, 108 (table)
Minneapolis Domestic Violence Experiment, defining, 350
66–67 experimental design and, 56
Missing elements, in sampling frame, 76 nonparticipant, 106, 349
Mixed design ANOVA, 217–218, 218 (table), 349 participant, 22, 106, 351
Mode, 130–131 (table), 349 structured/nonstructured, 106–107
Morgan, D. L., 105 Observed frequency scores, 244
Mortality (attrition) threat, 54, 349 Observer effect, 107
Multicollinearity, 271, 349 Omega squared (ω²), 201
Multiple R (R), 275, 352 Omnibus test, 200, 209–210, 350
Multiple regression analysis One-sample t-test, 175–178
defining, 258, 349 assumptions of, 176
multicollinearity and, 271 defining, 350
overview of, 270 software programs for, 176–178 (figure),
running using Excel, 277–278 177 (figure)
running using SPSS, 273 (figure)–277 (figure) One-way ANOVA, 197, 201–210
using dummy variables in, 271–273, defining, 350
272 (figure) post hoc tests, 206 (figure)
Multivariate test (MANOVA), 215 running using Excel, 209 (figure)
Mumford, M. D., 8 running using SPSS, 203–209,
Mutually exclusive, 95, 96, 123 204 (figure)–209 (figure)
sample sizes for, 202–203
Naturally occurring grouping, 41, 349 significance and, 209–210
Needs assessment, 320–321, 349 Tukey HSD post hoc comparisons for,
Negative correlation, 226, 349 208 (figure)
Newman, D. L., 324 with one grouping variable,
New Public Management (NPM), 316–317 217 (table)
Nominal level of measurement, 123 Open-ended question, 94, 350
Nonacademic report, 302–303, 349 Operationalization, 96–97, 154, 350
Nondirectional research hypothesis, Oral history, 51, 350
159–160, 349 Oral presentation, 309, 350
Nonequivalent groups, 58, 349 Ordinal level of measurement,
Nonparametric test, 242–243, 349. See also 123–124 (figure), 125–126
Chi-square analysis O’Sullivan, E., 155
Nonparticipant observation, 106, 349 Outcome evaluation, 320, 350
Non-probability sampling, 78–79, 82–85, 349 Outliers, 128, 130, 131, 143, 350
Nonrandom assignment, 58, 62, 350 Output range, 184
Nonrandom sampling, 55 Oversample, 81
Nonresponse, 93, 350
No plausible alternative explanation, 51, Paired samples t-test, 186–190
53–56, 349 assumptions of, 186
Normal distribution, 138–139, 162, defining, 350
163 (figure), 350 overview of, 175
Normality, 258, 350 running using Excel,
NPM (New Public Management), 316–317 189 (figure)–190 (figure)
Null hypothesis, 158–159, 160–161 running using SPSS,
ANOVA and, 200, 202 187 (figure)–189 (figure), 188 (figure)
chi-square analysis and, 245 Paper and pencil survey, 99, 350
defining, 350 Parametric tests, 242, 350
NVIVO, 289 Participant observation, 22, 106, 351
Pearson product moment correlation (r), 201, summative evaluation, 319, 320
224–230, 267 using research in, 314–316
defining, 351 Program evaluation and performance
direction of relationship, 225–226 measurement, 313–332
linear vs. curvilinear relationship and, Proportional stratified sampling, 80,
229–230 82 (figure), 351
running using SPSS, 231–235, 232 (figure) Psychological well-being, 111
scatterplot of, 226–229 Purposive sampling, 84, 351
strength of relationship, 226 P-value, 160–162, 351
Percentile points, 129, 351
Performance measurement Qualitative data, defining, 23, 91, 352
defining, 351 Qualitative data analysis, 281–296
ethical considerations in, 324–325 analyzing converting into numbers, 291–292
formative purpose of, 323 coding, 286–287, 290 (table)
internal vs. external evaluation, 323–324 collecting and analyzing, 282–285
key issues in, 322 data collection, 101–102, 108 (table)
program evaluation vs., 317 interviewer effect and, 293–294
summative purpose of, 323 participants selection and, 292–293
using research in, 314–316 preparing data for, 285–286
Physical well-being, 111 quantitative data analysis vs., 284–285
Placebo effect, 67, 351 software for, 289
Platykurtic, 139 (figure)–140 (figure), 351 subjective nature of analysis, 294
Popper, Karl, 159 thematic analysis, 286–290
Population, defining, 75, 351, 352 Quantitative data, defining, 23, 35–36, 91, 352
Positive correlation, 226, 351 Quantitative data analysis
Post hoc tests, 200–201, 209–210, 351 code book for, 121, 343
Posttest only design, 59 data cleaning, 122, 344
Power, 168, 351 levels of measurement, 122–126
Predictor variables, 258 preparing data for analysis, 121–122
Primary data, 91, 351 qualitative data analysis vs., 284–285
Probability sampling, 78, 79–82 starting data analysis, 120
cluster sampling, 81–82 See also Descriptive statistics
defining, 351 Quasi-experimental design
simple random sampling, 79, 80 (figure) defining, 352
stratified random sampling, 80–81 ethical considerations in, 69
systematic random sampling, 79–80 illustration of, 58 (figure)
Problem solving skills, 8 interrupted time series design, 68
Process evaluation, 351 Minneapolis Domestic Violence Experiment,
Program, 319, 351 66–67
Program evaluation placebo design, 67
cost-benefit analysis, 321, 322 Solomon Four-Group Design, 67–68
cost-effectiveness analysis, 321–322 time series design, 68
defining, 351 Questions, research, 37–41
ethical considerations in, 324–325 causal, 41, 343
examples of types of, 321 (table) closed-ended, 94, 96, 343
formative evaluation, 319, 320 confirm/test approach to, 32, 40–41
internal vs. external evaluation, 323–324 correlational, 41, 344
needs assessment and, 320–321 defining, 353
outcome evaluation, 320 double-barreled, 95
performance measurement vs., 317 double negative, 95
process evaluation, 320 focusing, 38–40
focusing, steps in, 40 (table) alternative forms of reporting and, 309–310
group difference, 41, 347 audience issues, 302
identifying, 20, 39 (figure), 40–41 See also Report, components of
open-ended, 94, 350 Research, types of, 32–33 (figure)
Research alignment, 13–25
R (Multiple R), 275, 352 definition of, 7, 352
Random assignment, 24 (table), 58, 78–79, 352 misalignment example, 14–18
Range, 133–134 (figure), 352 research flow and components, 18–24
Rassel, G. R., 155 Research design, 20, 47–71
Ratio level of measurement, 124, 125 causal argument based on design, 63
Raw data, 126, 352 cause and effect conditions, 51–56
Reference group, 271, 272, 273, 276–277, 352 cross sectional survey design, 50, 344
Regression analysis, 253–280 definition of, 352
bivariate linear regression, 258, 265–270 experimental design (see Experimental
interrupted time series analysis, 278–279 design; Quasi-experimental design)
linear regression, 257–265 identifying design, 48–50
logistic regression, 278 key factors in choosing, 49
multiple regression, 270–278 oral history, 51, 350
predicting relationships, 255–257 secondary data analysis, 51, 353
time series analysis, 278 types of design, 50 (figure)–51
Regression coefficient (b)/slope, 260, 265, 352 Researcher bias, 107, 111, 294, 353
Regression line/line of best fit, 259–262, Research flow overview, 18–24
260 (figure), 261 (figure), 352 data analysis, 22–23
Regression threat/regression artifact/regression data collection, 21–22
to the mean, 55, 352 overview of, 18, 19 (figure)
REGWQ (Ryan, Einot, Gabriel, and Welsch Q reporting, 23–24
procedure), 200, 201 research design, 20
Repeated measures ANOVA, 197, 210–216 research objective, 18–20
running using Excel, 215–216 (figure) research question, 20
running using SPSS, 211–215, sample selection, 21
212 (figure)–214 (figure), 216 (figure) Research hypothesis/alternative hypothesis,
Repeated measures design, defining, 186, 352 158, 352. See also Hypothesis; Hypothesis
Report, components of, 304 (table)–309 testing, with inferential statistics
abstract or executive summary, 304–305 Research objectives, 18–20, 30–32, 352
appendix, 309 Research skills, as leadership skills, 8–11
data analysis, 306–307 Research topic, 32, 353
data collection, 306 Residuals/error in prediction, 263–264, 353
discussions/conclusions/recommendations, 308 Residual sum of squares (RSS), 262, 353
introduction, 305 Resources
literature review or project background, 305 acquiring, 8–9
measurement, 306 allocating, 9–10
methods, 305–307 Response options, 95–96, 97 (figure)
notes, 308–309 Response rate, 93
references, 308 Robson, C., 101
results, 307–308 Robust, 176, 202, 353
study participants, 305 R-square (R²), 262–265, 267, 275, 343
table of contents, 305 adjusted, 267, 275–276, 342, 352
See also Report writing Rubin, H. J., 78
Reporting, 23–24, 352 Rubin, I. S., 78
Report writing, 298–312 Ryan, Einot, Gabriel, and Welsch Q procedure
academic vs. nonacademic style, 302–303 (REGWQ), 200, 201
Salkind, N. J., 245 Scope of work, 39
Sample, 21, 75, 353 Scriven, M., 320
Sample size Secondary data, 91, 109–110, 353
defining, 75, 353 Secondary data analysis, 51, 353
identifying, 77–78 Selection bias, 55, 56, 58, 61, 354
Sampling, 21, 72–86 Selection interaction threats, 55–56,
availability sampling, 83 345, 354
cluster sampling, 77, 81–82, 83 (figure), 343 Selection threat, 55, 354
convenience sampling, 83, 344 Sidak correction, 215
defining, 75, 353 Simple random sampling, 79, 80 (figure), 354
disproportional stratified sampling, 80, 81, Single-blind studies, 107
82 (figure), 345 Skewness, 142 (figure)–144, 143 (figure), 354
expert, 84, 346 Slope (X coefficient), 260, 265, 352
extreme case, 84, 346 Snowball sampling, 84, 354
generalizability and, 77, 78, 81, 83, 84, 292–293 Social judgment skills, 8
identifying samples, 73–74 Socially desirable, 293, 354
non-probability sampling, 78–79, 82–85, 349 Solomon Four-Group Design, 67–68, 354
probability sampling, 78, 79–82, 351 Solution construction skills, 8
purposive sampling, 84, 351 Sphericity, 210–214 (figure), 354
sample selection, 74–79 Squared deviance, 136 (table)
sample size, 75, 77–78, 353 SSR (sum of residuals), 264
sampling frame, 75–77, 353 SSR (sum of squares), 135–136, 198,
sampling technique, 75, 78–79, 353 264 (figure)
simple random sampling, 79, 80 (figure), 354 SST (total sum of squares), 262, 263 (figure),
snowball, 84, 354 264, 355
stratified random sampling, 80–81, 82 (figure) Standard deviation, 136–137, 141 (figure),
systematic random sampling, 79–80, 163 (figure), 354
81 (figure) Standard error, 164, 354
Sampling distribution, 164, 165 (figure), 353 Statistical significance
Sampling error/margin of error, 77, 93, 162, chi-square analysis and, 245
164, 353 defining, 175, 354
Sampling frame, 75–77, 353 generalizability and, 161, 184
Sampling technique level of, 160–162, 167, 348
defining, 75, 353 practical significance vs., 168
identifying, 78–79 Statistical probability, 162, 354
Saturation point, 78 Statistical significance test, 154, 160, 354
Scatterplot Statistics, defining, 126, 354. See also
correlation, 226–229 Descriptive statistics; T-tests
curvilinear relationship, 229–230 (figure) Stratified random sampling, 80–81,
defining, 353 82 (figure), 354
example of, 227 (figure) Student’s t-test, 175
menu selections for creating, 233 (figure) Summative evaluation, 319, 354
Pearson product moment correlation (R), Sum of residuals (SSR), 264
226–229
positive/negative correlation, 228 (figure) Sum of squares (SSR), 135–136, 198,
positive/negative perfect relationship, 264 (figure)
228 (figure) Survey questions, writing, 94–98
regression analysis, 259 (figure), 260 (figure) biased, leading phrasing, 95
strong/weak negative relationship, 229 (figure) double-barreled questions, 95
Scheffe’s test, 200, 201 double negative questions, 95
Schutt, R. K., 92 operationalizing concept, 96–98
response options, 95–96, 97 (figure) Unit of analysis, 21, 355
types of questions, 94
wording considerations, 94–95 Validity
Surveys, 91–92 external, 75, 346
administration modes, 98–101 internal, 53–56, 347
administration modes, advantages/ Variables
disadvantages of, 100 (table) categorical, 78, 240–242
advantages of, 92 continuous, 78, 125
defining, 355 control, 156, 346
survey errors, 92–93 creating new in Excel, 335, 336
See also Survey questions, writing creating new in SPSS, 333–335 (figure),
Syllogism, 52–53, 355 334 (figure), 336 (figure)
Systematic random sampling, 79–80, 355 defining, 121, 355
dependent, 155, 156, 257–258, 345
Taliaferro, J. D., 155 dichotomous, 123, 125
Telephone survey, 121, 355 extraneous, 156, 346
computer-assisted, 98–99, 343 grouping, 123
Temporal precedence, 51, 52 (figure), 355 independent, 155–156, 258, 347
Testing threat, 54, 355 Variance, 134–136 (table), 135 (table)
Textual analysis, 107, 108 (table), 355 assumption of homogeneity of, 179, 202,
Thematic analysis, 286–290, 203, 207 (figure)
291 (table)–292 (table), 355 defining, 355
Theory building approaches, 33–35 (figure) Variance inflation factor (VIF), 271, 355
Time issues, in experimental design, 58–59 Variation, 77, 78
Time series design, 68, 278, 355
Top-down approach, 23 Volunteer management case study, 6
Total sum of squares (SST), 262, 263 (figure), bivariate correlation, 224
264, 355 data analysis, 36, 287–289, 290 (table),
Treatment/intervention, 57 291 (table)–292 (table)
Trend analysis, 51, 355 data collection, 90–91, 107–108,
T-tests 283–284
background of, 175 interview guide, 103 (table)–104 (table)
comparing two groups, 173–174 predicting relationships, 256–257, 278
independent samples t-test, 178–186 program evaluation/performance
one-sample t-test, 175–178 measurement, 316, 318–319, 325–330
paired samples t-test, 186–190 relationships between two categorical
reasons to conduct, 175 variables, 240–241
Student’s t-test, 175 report writing, 301–302
summary of, 191 (table) research design, 49
Tukey’s HSD (honestly significant difference) research focus, 29–30
test, 200, 201, 208 (figure) sampling, 74
Two-samples t-test. See Independent samples survey, 339–341
t-test
Two-way contingency table analysis, Web-based survey, 98, 99, 355
242, 243 (table), 244 (table), 355 Websites, for results dissemination, 310
Type I error, 166–168 Weighting, 81
ANOVA and, 200, 203, 210, 215 Within-subjects ANOVA, 210. See also
defining, 355 Repeated measures ANOVA
Type II error, 167, 168
ANOVA and, 200, 201, 215 Zaccaro, S. J., 8
defining, 355 Z-score, 268, 277, 355

About the Authors
Masami Nishishiba, PhD Masami Nishishiba is an assistant professor of Public Administration and the associate director of the Center for Public Service at the Mark O. Hatfield School of Government, Portland State University. Her academic expertise covers research methods, cultural competence in the public sector, local government performance management, and civic engagement. She has served as a principal investigator and a consultant for a number of projects focusing on local government in regional, national, and international contexts.

Dr. Nishishiba's publications have appeared in State and Local Government Review, Journal of College and Character, Journal of Public Affairs Education, Journal of Applied Communication Research, Journal of Public Affairs, and other journals. She is also a lead author of a Japanese/English bilingual book, Project Management Toolkit: A Strategic Approach to New Local Governance.

Dr. Nishishiba has a BA degree in Linguistics from Osaka University, Japan, and an MS degree in Communication and a PhD degree in Public Administration and Policy from Portland State University.

Matthew Jones, PhD Matthew Jones has a BA in Criminal Justice from Norwich University, an MPA from Portland State University, and a PhD in Public Administration and Policy from Portland State University. Dr. Jones's interests and research focus on the following areas: the application of quantitative methods to public performance and evaluation, computer simulation modeling, leadership development, and the use of information technology in the public sector. He has served as a consultant for public organizations in Oregon, Washington, and New York, as well as consulting on national grant-funded projects. He has a strong dedication to the practitioners in the public administration community and strives to first and foremost accomplish research and provide services and training that are beneficial to the practicing professional.

His publications have appeared in Police Quarterly, Public Administration Review, Law Enforcement Executive Forum, Public Administration Quarterly, The International Journal of Electronic Government Research, and Policing: An International Journal of Police Strategies and Management. He also has coproduced an edited book on strategic website development for public organizations and Web 2.0 technologies for public service.

Mariah Kraner, MA (Doctoral candidate) Mariah is currently completing her doctorate in Public Affairs and Policy at Portland State University. She holds a BA in International Relations from Willamette University and an MA in Adult Education and Political Science from Oregon State University. In addition to her studies, Mariah has over 10 years' experience as a project manager, working on both nonprofit and government grants.

Mariah currently serves as a Research Associate at Portland State, where she manages a federal grant-funded randomized control trial to enhance employee health and performance. Mariah's research interests include political participation, social network analysis, employee engagement, and international non-governmental organizations (NGOs).