Introduction

In: Analyzing Complex Survey Data

By: Eun Sul Lee & Ronald N. Forthofer


Pub. Date: 2011
Access Date: March 10, 2018
Publishing Company: SAGE Publications, Inc.
City: Thousand Oaks
Print ISBN: 9780761930389
Online ISBN: 9781412983341
DOI: http://dx.doi.org/10.4135/9781412983341
Print pages: 2-3
©2006 SAGE Publications, Inc. All Rights Reserved.
This PDF has been generated from SAGE Research Methods. Please note that the
pagination of the online version will vary from the pagination of the print book.
SAGE SAGE Research Methods
©2006 SAGE Publications, Ltd.. All Rights Reserved.

Introduction

Survey analysis is often conducted as if all sample observations were independently selected
with equal probabilities. This analysis is correct if simple random sampling (SRS) is used in
data collection; however, in practice the sample selection is more complex than SRS. Some
sample observations may be selected with higher probabilities than others, and some are
included in the sample by virtue of their membership in a certain group (e.g., household) rather
than being selected independently. Can we simply ignore these departures from SRS in the
analysis of survey data? Is it appropriate to use the standard techniques in statistics books for
survey data analysis? Or are there special methods and computer programs available for a
more appropriate analysis of complex survey data? These questions are addressed in the
following chapters.

The typical social survey today reflects a combination of statistical theory and knowledge about
social phenomena, and its evolution has been shaped by experience gained from the conduct
of many different surveys during the last 70 years. Social surveys were conducted to meet the
need for information to address social, political, and public health issues. Survey agencies were
established within and outside the government in response to this need for information. In the
early attempts to provide the required information, however, the survey groups were mostly
concerned with the practical issues in the fieldwork—such as sampling frame construction, staff
training/supervision, and cost reduction—and theoretical sampling issues received only
secondary emphasis (Stephan, 1948). By the time these practical matters were resolved, modern
sampling practice had developed far beyond SRS. Complex sample designs had come to the
fore, and with them, a number of analytic problems.

Because the early surveys generally needed only descriptive statistics, there was little interest
in analytic problems. More recently, demands for analytic studies by social and policy scientists
have increased, and a variety of current issues are being examined, using available social
survey data, by researchers who were not involved with the data collection process. This
tradition is known as secondary analysis (Kendall & Lazarsfeld, 1950). Often, the researcher
fails to pay due attention to the development of complex sample designs and assumes that
these designs have little bearing on the analytic procedures to be used.

The increased use of statistical techniques in secondary analysis and the recent use of
log-linear models, logistic regression, and other multivariate techniques (Aldrich & Nelson, 1984;
Goodman, 1972; Swafford, 1980) have done little to bring design and analysis into closer
alignment. These techniques are predicated on the use of simple random sampling with
replacement (SRSWR); however, this assumption is rarely met in social surveys that employ
stratification and clustering of observational units along with unequal probabilities of selection.
As a result, the analysis of social surveys using the SRSWR assumption can lead to biased
and misleading results. Kiecolt and Nathan (1985), for example, acknowledged this problem in
their Sage book on secondary analysis, but they provided little guidance on how to incorporate
the sample weights and other design features into the analysis. A recent review of literature in
public health and epidemiology shows that the use of design-based survey analysis methods is
gradually increasing but remains at a low level (Levy & Stolte, 2000).
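
The bias from ignoring unequal selection probabilities can be seen in a small illustration. The following Python sketch is not from the book; the two-stratum population, its values, and the sample sizes are all hypothetical and chosen only to make the arithmetic easy to follow. It compares the unweighted sample mean with a mean weighted by the inverse of each unit's selection probability.

```python
# Hypothetical population with two strata (all numbers are illustrative assumptions).
# Stratum B is small but heavily oversampled, so its units have a higher
# selection probability than units in stratum A.
strata = {
    "A": {"size": 900, "value": 10.0, "n_sampled": 10},
    "B": {"size": 100, "value": 50.0, "n_sampled": 10},
}

# True population mean, known here only because the population is made up.
population_total = sum(s["size"] * s["value"] for s in strata.values())
population_size = sum(s["size"] for s in strata.values())
population_mean = population_total / population_size

# Build the sample as (value, weight) pairs.
# The weight is the inverse of the selection probability: size / n_sampled.
sample = []
for s in strata.values():
    weight = s["size"] / s["n_sampled"]
    sample.extend([(s["value"], weight)] * s["n_sampled"])

# Treating the sample as if it were a simple random sample.
unweighted_mean = sum(y for y, _ in sample) / len(sample)

# Weighted estimate that corrects for differential representation.
weighted_mean = sum(w * y for y, w in sample) / sum(w for _, w in sample)

print(f"population mean: {population_mean:.1f}")
print(f"unweighted mean: {unweighted_mean:.1f}")
print(f"weighted mean:   {weighted_mean:.1f}")
```

In this constructed case the unweighted mean overstates the population mean because the oversampled stratum dominates the sample, while the weighted mean recovers it. Real surveys add sampling variability and weight adjustments that this sketch leaves out.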

Any survey that puts restrictions on the sampling beyond those of SRSWR is complex in design
and requires special analytic considerations. This book reviews the analytic issues raised by the
complex sample survey, provides an introduction to analytic strategies, and presents
illustrations using some of the available software. Our discussion is centered on the use of the
sample weights to correct for differential representation and on the effect of sample designs on
the estimation of sampling variance, with some discussion of weight development and adjustment
procedures. Many other important issues, such as dealing with nonsampling errors and handling
missing data, are not fully addressed in this book.
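
To give a rough sense of how the sample design affects the estimation of sampling variance, the following Python sketch compares two variance estimates for a sample mean: one that treats the observations as a simple random sample, and one that treats the clusters as the sampling units. The data are hypothetical, and the cluster-level estimator shown (variance of the cluster means divided by the number of clusters) is a simplification that assumes equal-sized clusters and no finite population correction. It is only one of several estimators used in practice, not necessarily the method used by the software discussed in this book.

```python
from statistics import mean, variance

# Hypothetical sample: 4 clusters of 5 observations each.
# Values within a cluster resemble each other more than values across clusters.
clusters = [
    [10, 11, 10, 12, 12],
    [20, 21, 19, 20, 20],
    [30, 29, 31, 30, 30],
    [15, 16, 14, 15, 15],
]

all_values = [y for cluster in clusters for y in cluster]
n = len(all_values)   # number of observations
a = len(clusters)     # number of sampled clusters

# Variance of the sample mean if the n observations were an independent SRS.
srs_var = variance(all_values) / n

# Variance of the sample mean treating clusters as the sampling units
# (assumes equal cluster sizes and ignores the finite population correction).
cluster_means = [mean(c) for c in clusters]
cluster_var = variance(cluster_means) / a

# Ratio of the two variance estimates, often called the design effect.
deff = cluster_var / srs_var

print(f"SRS-based variance of the mean:     {srs_var:.3f}")
print(f"cluster-based variance of the mean: {cluster_var:.3f}")
print(f"ratio (design effect):              {deff:.2f}")
```

With these made-up data the cluster-based variance is several times larger than the SRS-based one, so an analysis that ignored the clustering would report standard errors that are too small.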

The basic approach presented in this book is the traditional way of analyzing complex survey
data. This approach is now known as design-based (or randomization-based) analysis. A
different approach to analyzing complex survey data is the so-called model-based analysis. As
in other areas of statistics, model-based statistical inference has gained more attention in
survey data analysis in recent years. Modeling approaches are introduced at various steps
of survey data analysis, in defining the parameters, defining estimators, and estimating
variances; however, there are no generally accepted rules for selecting a model or validating a
specified model. Nevertheless, some understanding of the model-based approach is essential
for survey data analysts to augment the design-based approach. In some cases, both
approaches produce the same results; in others, the results differ. The model-based
approach may not be useful in descriptive data analysis but can be useful in inferential
analysis. We will introduce the model-based perspective where appropriate and provide
references for further treatment of the topics. Proper conduct of model-based analysis would
require knowledge of general statistical models and perhaps some consultation from survey
statisticians. Sections of the book relevant to this alternative approach and related topics are
marked with asterisks (∗).

Since the publication of the first edition of this book, the software situation for the analysis of
complex survey data has improved considerably. User-friendly programs are now readily
available, and many commonly used statistical methods are now incorporated in the packages,
including logistic regression and survival analysis. These programs will be introduced with
illustrations in this edition. These programs are perhaps more open to misuse than other
standard software. The topics and issues discussed in this book will provide some guidelines
for avoiding pitfalls in survey data analysis.

In our presentation, we assume some familiarity with such sampling designs as simple random
sampling, systematic sampling, stratified random sampling, and simple two-stage cluster
sampling. A good presentation of these designs may be found in Kalton (1983) and Lohr
(1999). We also assume a general understanding of standard statistical methods and of one of the
standard statistical program packages, such as SAS or Stata.

http://dx.doi.org/10.4135/9781412983341.n1