JOURNAL OF OPERATIONS MANAGEMENT
Vol. 9, No. 2, 1990
EXECUTIVE SUMMARY
This paper discusses the need for more research in operations management which is based on data from the real world. Tying operations management theory in with practice has been called for over a long period of time; however, many P/OM researchers do not have a strong foundation in gathering and using empirical data. This paper provides a starting point that encourages operations management researchers to use empirical data and provides a systematic approach for conducting empirical studies.
Empirical research can be used to document the state of the art in operations management, as well as to
provide a baseline for longitudinal studies. It can also be invaluable in the development of parameters and
distributions for mathematical and simulation modeling studies. A very important use for empirical data
is in theory building and verification, topics which are virtually ignored in most P/OM research.
Operations management researchers may be reluctant to undertake empirical research, due to its cost, both in dollars and time, and the relative risk involved. Because empirical research may be considered "soft," compared with mathematical modeling, it may be perceived as risky. This paper attempts to provide a foundation of knowledge about empirical research, in order to minimize the risks to researchers. It also provides a discussion of analytical techniques and examples of extremely rigorous empirical P/OM research.
Although operations management researchers may not recognize it, all research is based on theory.
The initial step in conducting empirical research deals with articulating the theoretical foundation for the
study. It also includes determining whether the problem under investigation involves theory building or
theory verification.
In the second step, a research design should be selected. Although surveys are fairly common in
empirical P/OM research, a number of other designs, including single and multiple case studies, panel
studies and focus groups, may also be used, depending on the problem being studied. Third, a data
collection method should be selected. One method, or a combination of several data collection methods,
should be used in conjunction with the research design. These include historical archive analysis,
participant observation, outside observation, interviews, questionnaires and content analysis.
The implementation stage involves actually gathering the data. This section of the paper focuses on using questionnaires as the data collection method, although some of the concepts discussed may be applicable to other data collection methods, as well. A brief overview of data analysis methods is given, along with documentation of the types of data analysis which have been used in various types of
empirical research conducted by operations management researchers over the past ten years. Potential
outlets for publication of empirical P/OM research are discussed, and their history of publishing such
research is documented.
Underlying every step of the process are considerations of reliability and validity. Conducting empirical research without considering its reliability and validity is pointless, because the researcher will not be able to generalize from the results. Reliability and validity should be considered in each of the four stages of the approach described above.
A number of conclusions are discussed. These include the need for more empirical research and the need for P/OM researchers to become more critical readers of the empirical research done by others. Colleagues in the social sciences can be a valuable source of information about conducting empirical research. Industry contacts can be useful, as well, in pilot testing, finding industry sites and determining consensus on the definition of terms. Finally, researchers in operations management need to be more aware of the theory which underlies their work. Empirical research can be highly useful in both theory building and theory verification.
INTRODUCTION
This paper provides a foundation for P/OM researchers who seek to incorporate real world data into their research. The term "empirical," which means knowledge based on real world observations or experiment, is used here to describe field-based research which uses data gathered from naturally occurring situations or experiments, rather than via laboratory or simulation studies, where the researchers have more control over the events being studied.
A substantial amount of empirical operations management research has been published in the
academic and practitioner-oriented
journals in recent years. Although the proportion of
empirical P/OM research is increasing, relative to P/OM modeling research, empirical P/OM
research with a strong conceptual and methodological base is less common. This may be due to
the fact that P/OM researchers lack exposure to the variety of data collection and analysis
methods used in empirical studies. The purpose of this paper is to provide a justification for the
use of empirical research in operations management, an overview of empirical research designs
and data collection methods, and a tutorial in the mechanics of empirical data collection and
analysis. Where appropriate, selected examples from the operations management literature are
provided.
specific group, such as the Hyer (1984) survey of adopters of group technology. Systematic data
gathering efforts also provide a baseline for longitudinal studies of P/OM practices (see Harman
and Tuma (1979) and Martin (1983)) before some anticipated change in an industry or in general
practice. The Boston University manufacturing
futures study (Ferdows, Miller, Nakane and
Vollmann (1986)) is an example of the establishment
of a cross-industry
baseline, with
longitudinal follow-ups, designed to examine changing manufacturing technology and strategy.
Empirical data can also be used in conjunction with simulation and mathematical programming research. In 1980, Buffa called for mathematical modelers to attempt to develop objectives, criteria and parameters from real systems. However, mathematical model distributions and parameters are still frequently chosen to correspond with current modeling practice or are selected for computational convenience. Too often, it is forgotten that the results of mathematical modeling are only as valid as the assumptions upon which the model is based. It is vital that mathematical models be based on realistic, rather than simply convenient, assumptions. For example, simulation studies frequently assume that job arrivals follow a Poisson process and that processing times follow an exponential distribution, often without verifying this through examination of the actual values. Using empirically-based distributions as inputs to simulation models can yield findings with greater external validity.
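As a minimal sketch of this idea (all data and names here are hypothetical, not from the paper), a simulation's arrival process can be driven by resampling observed interarrival times instead of an assumed exponential distribution:

```python
import random

# Hypothetical interarrival times (minutes) recorded on a real shop floor.
observed_interarrivals = [3.2, 7.5, 4.1, 12.8, 5.0, 6.3, 9.9, 4.4, 8.1, 5.7]

def empirical_interarrival(rng):
    """Draw an interarrival time by resampling the observed data,
    rather than assuming a Poisson arrival process."""
    return rng.choice(observed_interarrivals)

def simulate_arrivals(n_jobs, seed=42):
    """Generate n_jobs arrival times using the empirical distribution."""
    rng = random.Random(seed)
    clock, arrivals = 0.0, []
    for _ in range(n_jobs):
        clock += empirical_interarrival(rng)
        arrivals.append(clock)
    return arrivals

arrivals = simulate_arrivals(5)
```

A more refined version would fit a distribution to the observed values, but even direct resampling gives the model the external validity the paragraph above calls for.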
A very important use for empirical data in P/OM is theory building and verification. Theory should be developed from a careful, consistent documentation of actual practice and the subsequent discovery of relationships between actual practice and plant performance. Anderson, Schroeder, Tupy and White's (1982) MRP study provides an example of this. Theory can also be verified through the collection of empirical data, as illustrated by Roth's (1987, 1989) manufacturing strategy studies. Causal relationships can be refined and explored through subsequent mathematical and simulation modeling research.
A SYSTEMATIC APPROACH FOR CONDUCTING EMPIRICAL RESEARCH
This paper describes a systematic approach for conducting empirical research which should help P/OM researchers avoid the potential problems described above. Its components are drawn primarily from the social sciences, where empirical research is the predominant research mode. Our colleagues in organizational behavior, psychology, marketing, anthropology and sociology can be very valuable resources on conducting empirical research (see Weinberg (1983)). Although the methods are not unique to P/OM, the approach described in this paper can be very useful in addressing P/OM issues.
Figure 1 provides an overview of the approach to conducting empirical research described herein. In the first stage, the theoretical foundation for the research is established. Based on the theory which underlies the problem being studied, either a theory-building or a theory-verification approach will be pursued. Next, a research design, which is appropriate to both the problem and the theoretical foundation, is selected. An overview of a number of research designs which may be appropriate for empirical P/OM research is provided. The third section describes several data collection methods. One of these methods, or a combination of them, is used in conjunction with the research design. Implementation includes selecting an appropriate sample and designing and administering the data collection instruments. The fifth step is processing and analyzing the data. The final step is preparing the research report for publication. Underlying each of these steps are considerations of reliability and validity. Following some simple reliability and validity procedures at every step of the process provides more assurance that the results of the study will be generalizable and will merit publication as a contribution to research.
ESTABLISHING THE THEORETICAL FOUNDATION
Theory provides the foundation for all scientific research. Although some may perceive
operations management as being virtually atheoretical, in actuality, P/OM theories are all too
often implicit or difficult for the researcher to articulate. Empirical studies can be used to either
build theory or to verify theory. It is important to determine, in advance, which of these is being
done, since both cannot be accomplished in the same study.
Theory verification is the most widely understood approach. It is based on the scientific
method, in which many OM researchers are grounded. Hypotheses are generated in advance of
the study, and they are tested by the data collected. The origin of the hypotheses has historically
been of little concern to the researcher. Hypotheses may have been generated from prior studies,
from the literature, or literally picked from thin air. Classical inferential statistics and
significance tests are used to either accept or reject the hypotheses. The focus of theory
verification is on testing the hypotheses to within specified confidence levels, not on the origin
of the hypotheses.
A theory-building study is based upon a different origin and uses data in a different way. However, even in theory building, a priori theory or constructs provide the foundation (Glaser and Strauss (1967)). Without a conscious attempt at theory building, research can degenerate into simply "data dredging," or the ad hoc collection and analysis of data for whatever conclusions can be found. Generally speaking, the origin for a theory-building study is not a hypothesis, but rather, some assumptions, frameworks, a perceived problem or, perhaps, very tentative hypotheses. Proponents of the theory-building methodology (see, for example, Glaser and Strauss (1967) and Yin (1989)) argue that a stronger theory will result if it has been grounded in
FIGURE 1
A SYSTEMATIC APPROACH FOR EMPIRICAL RESEARCH

[Figure 1 is a flow diagram of the stages described in this paper; a note indicates that reliability and validity considerations underlie all stages. The "Select a Data Collection Method" stage lists historical archive analysis, participant observation, outside observation, interviews, questionnaires and content analysis. The "Implementation" stage lists population selection, sample selection, scale development, questionnaire construction, pilot testing, mailing, analysis of nonrespondent characteristics and data entry.]
data, rather than if the origin of the theory is of little concern. They argue that data should also
be used to build theories, not only to verify them.
For example, assume that the researcher begins with an initial theory, called Theory A. This
initial theory includes some suppositions and variables of possible interest. The researcher then
proceeds to collect data which would elaborate Theory A or suggest modifications to Theory A,
but would not confirm it or deny it in any statistical sense. The notion of hypothesis testing is
inappropriate for theory building, since probability distributions and even random selection of
the sample points are not used. Rather, theory building is an interpretative exercise designed to
produce a theory for later testing. Theory building uses extensive questioning and strategic
positioning of the sample, in order to enrich the initial Theory A and to suggest modifications to
it. At the end of the data collection phase, a new Theory B, which is grounded in data, is
proposed. If appropriate, the new Theory B can be subjected to traditional theory verification
methods, or it can be further enriched to Theory C, before verification.
An example of theory building might be helpful to further illustrate this discussion. Suppose,
for example, that we start with Theory A, which suggests that there is a connection between the
product and the production process used. We further posit that a high volume product, with
limited variety, will be made in a highly automated, assembly line process, while a low volume
product with high variety will be made in a process with less automation and a process layout.
This Theory A is, of course, the familiar product-process matrix (Hayes and Wheelwright (1979)). This is all the theory which is needed initially; however, the theory could be somewhat more elaborate, or even somewhat less.
The next task is to think about what data could help to elaborate this theory. First, we might
speculate that the industry in which plants are located could affect this theory, so, a single
industry, containing both extremes of products and processes, different levels of volumes, and
different levels of automation, would be selected. The next step is to select a sample which
appears to contain characteristics which would permit the researchers to elaborate this theory.
The sample would comprise some companies with high volume products and automated assembly lines and others with the opposite situation. Access to these companies allows testing
and refinement of the definition and measurement of what constitutes high and low volumes,
what constitutes high and low automation, and other measurements. It also enables examination
of other variables that might affect the product and process match, such as intensity of
competition, company strategy, etc.
The next step is to select some companies which appear to violate the proposed theory. These
companies could include those who were operating on the margin of the industry or which had
gone bankrupt. They could also include some companies which were not aware of the
importance of product and process match. The use of companies which disprove the theory
permits enrichment of the theory, in order to explain when it does not apply. The goal of this
theory development is to gain an understanding
of the processes in the organization which
produce the observed effects, in terms of the proposed theory. The next step in theory building is
to extend the theory, for example, by moving to another industry to further develop the theory or
to provide support for the theory via hypothesis testing. This involves much larger samples and
much more structured data collection methods, for hypothesis testing.
One caution in theory building is that researchers must consider that their theories will be understood by their subjects and others; the problem of "reactivity" occurs during and after each study. This is especially true in business-related research, such as action research, where many academics may perform a consulting, as well as an investigative, role. This relationship
with subjects makes sound observation and careful theory development, based on empirical
research methods, of utmost importance.
Theories are never proven or disproven, but rather, are constantly evolving (Glaser and Strauss (1967), Strauss (1987)). Because of this, even theory verification is limited and must be
followed by further verification and elaboration. The value of theory building lies in its
permitting a wider range of observation and inquiry than the more traditional theory testing
does. Theory building is very useful in fields, such as P/OM, which lack an established base of
theory and measurement methods.
purpose of the multiple case study is theory building, confirmatory statistical analysis is not
expected or desired.
Multiple case studies can also be used for theory verification. Large samples are not
necessarily required; in theory verification, one case can falsify a hypothesis. If there are enough
cases, however, some forms of inferential statistical analysis are possible. For example, Pesch (1989) used 12 factory sites and 23 plant-within-a-plant units. This allowed a modest application of regression analysis and limited statistical testing. On the other hand, studying a population, rather than a sample, eliminates the need for inferential statistical analysis. In the case of Garvin's (1983) room air conditioner study, data was gathered at all of the major companies in the industry. Thus, there was no need for statistical analysis, because there was no need to make inferences about the population; Garvin already had complete information about the population.
Field Experiment
In a field experiment, the researcher manipulates some aspect (an independent variable) of the natural setting and systematically observes the resulting changes (Stone (1978)). For example, a researcher might administer an attitude survey to workers in a plant before and after a pay-for-skill program is implemented. Field experiments have much greater external validity than lab experiments, because they take place in the natural setting. Because of their richness, field experiments are useful in both building and verifying theory. However, the researcher's limited control of the natural setting may preclude accurate conclusions about causality.
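As a sketch of how such before-and-after attitude data might be analyzed (the scores below are hypothetical, and the paired t statistic is one conventional choice, not one the paper prescribes):

```python
import math

# Hypothetical attitude scores (1-7 scale) for the same ten workers,
# before and after a pay-for-skill program is implemented.
before = [3, 4, 2, 5, 3, 4, 3, 2, 4, 3]
after  = [5, 5, 4, 6, 4, 5, 4, 4, 5, 4]

def paired_t(before, after):
    """Paired t statistic for before/after scores on the same subjects."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n), n - 1  # (t, degrees of freedom)

t_stat, df = paired_t(before, after)
```

Because the same subjects are measured twice, the paired design removes between-worker variability that an independent-samples comparison would confound with the program's effect.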
Panel Study
A panel study obtains the consensus of experts. It can be very useful in defining terms and
making predictions. The Delphi method is probably the most frequently cited technique for
panel studies. Experts respond, in writing, to a series of questions. Their anonymous responses
are distributed to all members of the panel, who are permitted to revise their own responses in
subsequent rounds. The rounds continue until consensus is reached. Panel studies are occasionally used in operations management research, for example, Groff's (1989) study of group technology practices and problems, Ettlie's (1989) development of a P/OM research agenda and Pesch's study of the definition of the term "factory focus."
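A minimal sketch of how convergence across Delphi rounds might be tracked, using the interquartile spread of hypothetical expert responses as an assumed consensus measure (the data and stopping rule are illustrative only):

```python
def quartile_spread(responses):
    """Interquartile spread of a Delphi round, via simple index-based
    quartiles (a crude but adequate consensus measure for a sketch)."""
    xs = sorted(responses)
    n = len(xs)
    return xs[(3 * n) // 4] - xs[n // 4]

# Hypothetical expert forecasts (e.g., years until a technology is widely
# adopted) over three rounds; the spread narrows as experts revise answers.
rounds = [
    [2, 5, 8, 3, 10, 4, 7, 6],
    [3, 5, 6, 4, 7, 4, 6, 5],
    [4, 5, 5, 5, 6, 4, 5, 5],
]

spreads = [quartile_spread(r) for r in rounds]
consensus_reached = spreads[-1] <= 1  # stop once experts cluster tightly
```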
Focus Group
A focus group is similar to a panel study; however, the group is physically assembled and each response is given to the entire group orally, rather than in written form. Thus, the group is aware
of the origin of the response. The group is given a set of questions, often prior to its gathering. It
meets with a facilitator, who asks the questions, allowing every member a chance to express his
or her opinion. Discussion is permitted, with the goal of reaching consensus. Topics appropriate
for a focus group are similar to those which are appropriate for a panel study.
Surveys
The survey is undoubtedly the most commonly used research design in operations management. It relies on self-reports of factual data, as well as opinion. One approach is to administer a
survey to a group which is homogeneous with respect to at least one characteristic, such as
industry or use of a common technology. Hyer's (1984) group technology research provides an example. She sampled only users of group technology, since the goal of this study was to define
the state of the art in group technology. Sampling manufacturing firms at random did not make
sense, because the likelihood of selecting more than a few users of group technology was low. In
contrast, when the focus of the research is generalizability to an entire population of firms, administering a survey to a large sample is a more appropriate approach, as in the Boston University manufacturing futures research (Ferdows, et al. (1986)), which used samples in excess of 500 respondents.
SELECTING A DATA COLLECTION METHOD
The methods described in this section may be used alone, or in tandem, with most of the research designs described above, to document what is being observed. A combination of data collection methods to study the same issue, or triangulation (Jick (1979)), may be very useful. For example, a combination of questionnaires, structured interviews and archival financial data could be used to determine the impact of JIT implementation on plant performance. By providing several sources of verification, triangulation improves the researcher's judgmental accuracy. Useful references on data collection methods include Emory (1980), Best (1970), Converse and Presser (1986), Stone (1978) and McGuigan (1978).
Historical Archive Analysis
Historical archive analysis uses unobtrusive measures, including physical traces and archives (Bouchard (1976)), often in conjunction with a single or multiple case study design. Abernathy and Wayne (1974) used archival analysis in their single case study of how the production processes of the Ford Model T developed over time, demonstrating the limits to using the learning curve for cost reduction. Archival data is unbiased, because its providers have no awareness of being observed. However, since the researcher does not control the environment, it may be impossible to obtain the type of data desired. Therefore, the collection of archival data is sometimes used in conjunction with a survey or panel study, to gather historical factual data from respondents.
Some archival sources exist because firms are required to file reports, which become public
information. For example, airlines file a great deal of operating information with the federal
government. Other firms also make operations data available to the public, however, little use
has been made of it in P/OM research to date. University reference librarians can be excellent
guides to identifying untapped archival sources.
Participant Observation
typically used in the design of jobs, may be used for structuring the collection of research data.
For example, unbiased observers could develop flow diagrams of traditional and JIT shops, in
order to determine whether JIT shops have simplified flows. A variety of charts may be used to
document interactions between workers in different types of industrial settings. Stopwatch
studies can also be useful, for example, to compare the amount of time a particular task requires
when it is done repetitively, with the same task done as part of an enlarged job. Research designs
appropriate for outside observation include single and multiple case studies, as well as panels.
Interviews
Interviewing involves more than talking with members of an organization and, perhaps,
taking notes. There are several methods of interviewing which permit the use of an organized
approach, without compromising the richness of conversation (see Bradburn (1983)).
Structured Interviews. A structured interview involves the use of a script, which specifies questions to be used. Other questions may be asked, as well, based on the direction of the conversation; however, certain questions are standard. Structured interviews permit some comparison between interviewees, without sacrificing the depth of the personal interview.
Ethnographic Interviews. Ethnographic interviewing facilitates discovery of what is meant by specific concepts. A hierarchy of questions is asked, beginning with a general question. Further questions are framed based on the respondent's answers to previous questions. Used in conjunction with pilot testing of a survey, ethnographic interviews can be used to validate response categories in questionnaires, or to indicate where improvement is needed. They are very useful for discriminating between the myriad definitions of popular concepts, and for determining when such categories coincide with concepts used in the hypotheses. Spradley (1979) and others (Pelto (1978), Naroll and Cohen (1973), Gregory (1983), Schall (1983), Sanday (1979)) provide good references on the ethnographic interview process.
Both structured interviews and ethnographic interviews are greatly enhanced by transcriptions. Most respondents do not object to being taped, and the quality of the interview is raised
significantly if the researcher does not have to take meticulous notes. Transcriptions can be used
by the research team to improve interviewing techniques, to detect the presence of leading
questions on the part of the interviewer and to guard against selective memory. They may be
used in conjunction with content analysis. The interview is taped, and a transcript is prepared.
Content analysis then codifies the transcript, noting recurrent usage of a phrase or concept of
interest. Hypotheses may be developed or tested, based on content analysis of the transcript.
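The coding step can be sketched as follows, counting recurrent phrases in a hypothetical transcript fragment (the text and phrases are invented for illustration; real content analysis would use a validated coding scheme):

```python
import re
from collections import Counter

# Hypothetical fragment of an interview transcript.
transcript = """We cut setup time by half. Setup time was our biggest
problem, so reducing setup time let us run smaller batches. Smaller
batches improved quality."""

def code_transcript(text, phrases):
    """Count occurrences of each phrase of interest, a minimal form of
    content-analysis coding. Whitespace is normalized so phrases match
    across line breaks; matching is case-insensitive."""
    normalized = " ".join(text.lower().split())
    return Counter({p: len(re.findall(re.escape(p.lower()), normalized))
                    for p in phrases})

counts = code_transcript(transcript, ["setup time", "smaller batches", "quality"])
```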
Questionnaires
The questionnaire is most commonly used in survey research; however, it may also be used in single and multiple case studies, panels and focus groups. Although P/OM researchers have used questionnaires as a data collection method with some frequency, many P/OM questionnaires appear to have been thrown together hastily, with little thought of reliability, validity or generalizability. For example, many P/OM surveys use questionnaires which were constructed without first articulating the hypotheses of the study. This paper, in subsequent sections, provides a foundation for the construction and administration of questionnaires so as to maximize their reliability and validity.
IMPLEMENTATION
Since survey designs with questionnaires are the most commonly used approach in empirical
P/OM research, most of the remainder of this paper will be devoted to a discussion of
questionnaire construction and use. Some of this discussion may also be applicable to other
designs and data collection methods.
The foundation for questionnaire construction is the theory which underlies it. A questionnaire should not merely consist of a series of interesting questions, but should be designed to
develop or test a theory. The theory should be carefully defined by reference to the literature and
by logical thought. The resulting theory (a set of variables and relationships among those
variables) can be depicted by a flow chart or diagram which shows the key relationships in the
theory. Useful comprehensive guides to questionnaire design and construction include McIver and Carmines (1981) and Alreck and Settle (1985).
Population Selection
Much empirical research uses the corporation or the individual as its level of analysis. It may
be easier to obtain data at these levels, and many important problems deal with corporate or
individual issues. The plant level may also be appropriate for P/OM studies. For example, the
Minnesota-Iowa State research on World Class Manufacturing (see Flynn, Bates, Schroeder, Sakakibara and Flynn (1989)) used the plant as the level of analysis because, although World Class Manufacturing is a strategic approach, many of its measurable improvement initiatives have occurred at the plant level. Whether the individual, plant, division or corporate level is
selected as the unit of analysis depends upon the research questions and hypotheses. SIC codes
are commonly used to define industry classifications. However, SIC codes were not designed for
P/OM research, and may be somewhat misleading. For example, process technology can vary
considerably between two related SIC codes (e.g., computers are classified with machinery).
There are other justifiable ways of choosing a sample, such as based on the use of a common
process technology. SIC codes can provide a useful starting point, however, their classifications
may need to be modified, as appropriate to the needs of the P/OM researcher.
Dun's Metalworking Directory (1986) is one of only a few sources which gives plant, rather than corporate, level information. It can be invaluable in obtaining addresses of plants and other plant level information, such as products made and number of employees at that plant. Despite
its somewhat misleading title, this is a comprehensive reference which deals with most types of
manufacturing.
Sample Selection
The sample should be selected as randomly as possible, in order to help control against bias.
Convenience samples, for example, the sample of students in an executive MBA class, are highly
biased. Even when the sample is drawn from a specific group, such as a given industry or users
of a specific technology, the actual sample should be drawn randomly, once the master set of
names has been obtained. Using SIC or Dun's Metalworking Directory listings in conjunction with a random number table is a good way to ensure the randomness of the sample.
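A sketch of drawing such a sample, with a seeded random generator standing in for the random number table (the plant names and frame size are hypothetical):

```python
import random

# Hypothetical master list of plant names drawn from a directory
# (in practice, SIC listings or Dun's Metalworking Directory entries).
master_list = [f"Plant {i:03d}" for i in range(1, 201)]

def draw_sample(frame, n, seed):
    """Draw a simple random sample of n plants, without replacement.
    The fixed seed makes the draw reproducible and auditable, just as
    recording the starting point in a random number table would."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

sample = draw_sample(master_list, 20, seed=1990)
```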
Controlling for industry effects can compensate for variability between industries, in terms of
process, work force management, competitive forces, degree of unionization, etc. This can be
done through the use of an experimental design which includes several industries. Within an
industry, the type of production process can vary widely between job shop, batch production and
line production. Including the process as an independent variable will permit controlling for the
process during data analysis. If the level of analysis is the plant or division, corporation effects
may also be important. Company size is another critical variable. The number of employees and
total sales are widely available figures, which can be incorporated into the sample selection
process.
Scale Development
Appendix A contains a brief description of some potentially useful scales. Appendix B
contains P/OM examples of each scale described in Appendix A. These tables are based
primarily on information contained in Alreck and Settle (1985). The sophistication of data analysis is highly dependent upon the type of data obtained; thus, using Appendix A requires consideration of the type of data which each scale gathers.
Nominal data assigns observations to categories (Best (1970)). For example, respondents may
be asked to check the quality techniques they understand. Their choices cannot be placed in a
specific order. Ordinal data indicates relative rank, or order, among the categories. For example,
respondents may be asked to rank their strategic manufacturing goals. Ordinal measures have no
absolute values, and the differences between adjacent ranks may not be equal. Interval data can
be ranked, and the differences between the ranks are equal. The widely used Likert scale is an
example of an interval scale. Interval measures may also be added or subtracted. For example,
Likert scale responses are frequently added to form a summated scale. However, since a Likert scale has no true zero, responses cannot be related to each other as multiples or ratios. Finally, ratio data has all of the properties of the three types of data mentioned above, as well as a true zero and all of the qualities of real numbers. Thus, ratio data can be added, subtracted, multiplied or divided. It is mostly gathered from factual, archival sources; ratio scales designed to gather opinion data are not readily available. Because of their mathematical properties, obtaining interval or ratio data whenever possible opens up a host of analytical techniques to the researcher.
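The permissible operations at each measurement level can be sketched as follows (all data are hypothetical; the point is which summary each scale type supports):

```python
from statistics import median

# Illustrative responses at each measurement level (hypothetical data).
nominal  = ["SPC", "TQM", "SPC", "JIT", "SPC"]   # categories: no order
ordinal  = [1, 3, 2, 1, 2]                        # ranks of strategic goals
interval = [4, 5, 3, 4, 5]                        # Likert-scale items
ratio    = [120.0, 95.5, 103.2]                   # defects per million

# Each level supports progressively stronger operations.
modal_choice   = max(set(nominal), key=nominal.count)  # nominal: mode only
median_rank    = median(ordinal)                       # ordinal: order is meaningful
summated_score = sum(interval)                         # interval: add/subtract
defect_ratio   = ratio[0] / ratio[1]                   # ratio: true zero permits ratios
```

Taking a mean of the nominal codes, or a ratio of two Likert responses, would be meaningless; the scale type, not the software, determines which statistics are legitimate.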
Using items which are worded to assure comparability of responses greatly simplifies data
input and analysis. For example, rather than asking, "What is the defect rate of your primary
product?", more comparable responses will be obtained by asking, "What is the defect rate, in
parts per million, of your primary product?" Pilot testing, in conjunction with structured
interviews, panel studies or ethnographic interviews, can be extremely helpful in assuring
comparability of responses.
Summated Scales
A summated scale score serves as an index of attitudes toward the major issue or a general
construct. Summated scales are used because individual items tend to have a low statistical
relationship with the underlying attribute. Averaging across several related items strengthens
that relationship and allows more exact distinctions to be made between respondents. Their use
also enhances the reliability
of the responses.
In the initial development of a summated scale, the Thurstone approach (Alreck and Settle
(1985)) can be very useful. This method allows the inclusion of a small number of items which
have high discriminating power, combined with high reliability. A list of a large number of
statements which are related to the same construct or dimension is generated. They are then
rated by a group of respondents, known as judges, who are experts, or very well informed about
the construct. The judges are asked to rate each statement on an 11-point scale, ranging from
"Strongly favorable statement" to "Strongly unfavorable statement." Ten to 20 of the
statements are ultimately selected for inclusion in a summated scale, based on the criteria that
they should each have a relatively low standard deviation of responses from the judges and the
range of average responses for the ten to 20 items selected should be distributed evenly across all
11 choices.
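The Thurstone selection rule described above, keeping statements the judges rate consistently and then checking that the retained averages span the scale, can be sketched as follows; the statements, ratings and cutoff are all hypothetical:

```python
from statistics import mean, stdev

# Hypothetical judge ratings for candidate statements, on the 11-point scale
# (1 = strongly unfavorable ... 11 = strongly favorable).
judge_ratings = {
    "Quality is every worker's job":    [10, 11, 10, 11, 10],
    "Inspection alone assures quality": [2, 3, 2, 2, 3],
    "Quality goals are unclear here":   [5, 6, 6, 5, 6],
    "Quality programs rarely pay off":  [1, 9, 4, 11, 6],  # judges disagree
}

MAX_STDEV = 1.0  # illustrative cutoff for a "relatively low" standard deviation

def select_items(ratings, max_stdev=MAX_STDEV):
    """Keep statements the judges rated consistently; return them with their
    average ratings, sorted so coverage of the 11-point range can be inspected."""
    kept = {s: mean(r) for s, r in ratings.items() if stdev(r) <= max_stdev}
    return dict(sorted(kept.items(), key=lambda kv: kv[1]))

selected = select_items(judge_ratings)  # the inconsistent statement is dropped
```

The second criterion, an even spread of average ratings across the 11 choices, can then be checked by eye from the sorted averages.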
Questionnaire Construction
A portion (one-fourth to one-half) of the items in each summated scale should place the
preferred choice at one extreme, while the remainder places the preferred choice at the other
extreme (Alreck and Settle (1985)). This helps to keep the respondents interested in the items
and prevents them from being lulled into marking "Strongly agree" for every item. Intermixing
items from a given summated scale with items from other summated scales will prevent
respondents from guessing the construct which is being measured.
Record keeping is facilitated by the development of a master copy of the questionnaire, with
each summated scale identified. Of course, the questionnaires which are distributed to the
respondents should not identify items by summated scale. Development of a master list of all
summated scales is useful in both developing the summated scales and analyzing the data. This
lists each summated scale in its entirety, its source and a list of the questionnaires which contain
the summated scale.
Pilot Testing
Pilot testing is an integral part of questionnaire construction. It provides feedback on how
easy the questionnaire is to complete and which concepts are unclear or out of the respondents'
range of knowledge and/or responsibility. For example, quantitative accounting data can be very
useful to P/OM researchers, yet most plants exhibit a marked reluctance to divulge it. Through
pilot testing, the researcher may learn that respondents are more likely to provide accounting
data when they are instructed that giving rough estimates is preferable to leaving items blank.
Pilot testing consists of administering the preliminary questionnaire to a small group of
typical respondents. There is no need to select the respondents randomly; a convenience sample,
such as students in an executive MBA class or members of the local APICS chapter, is quite
acceptable. By administering the pilot test in person, the researcher can determine whether there
are systematic differences between the way the researcher views specific measures and the way
the respondents view them.
After pilot testing, questionnaires typically require revision, in order to help ensure the
validity and reliability of the measures, as well as to make the questionnaire more
user-friendly. If the pilot test
indicates that the questionnaire contains sensitive questions or that key variables are measured
differently by most respondents, it may be necessary to consider using site visits to administer
the surveys to all respondents, rather than using a mail survey.
Mail Surveys
Mail surveys are very effective for well-defined research topics with a fairly narrow scope.
Such topics permit the use of a short questionnaire, which is more likely to be completed and
returned. A typical approach is to send questionnaires to a relatively large, randomly selected
sample and hope that an acceptable number are returned. Researchers in the social sciences look
skeptically at any survey with less than a 40% to 60% response rate. While studies have been
published in the operations management literature with response rates as low as 10% to 20%,
such studies are highly unreliable, even if non-respondent bias has been checked. A higher
standard should be established, perhaps in the 50% response range, together with nonrespondent
checking, in order to ensure more reliable and, therefore, more generalizable results.
Researchers who send questionnaires to respondents who are to remain completely anonymous can enhance their response rate by sending a reminder letter to all recipients of the
questionnaire, instructing them to disregard it if the questionnaire has already been completed
and returned. However, because relatively high response rates are virtually required for
publication in high-quality journals, the researcher, despite his or her best efforts in sample
selection and questionnaire design, is left very vulnerable when a random, anonymous mail
survey is used.
One effective means for increasing the response rate is to contact potential respondents and
obtain their commitment to completing the questionnaire, prior to distribution. When the
respondents understand the purpose of a study, lack of anonymity may not be as bothersome.
This also facilitates provision of feedback to respondents, which may serve as an incentive for
participation. This approach is being used by the Minnesota-Iowa State study of World Class
Manufacturing (Flynn, et al. (1989)), which provides each plant with a profile of its responses,
relative to the sample of other plants in its industry. In the case of a battery of questionnaires, the
initial contact can assign a site research coordinator to oversee the distribution and return of the
questionnaires. For example, Saraph, Benson and Schroeder (1989) appointed a research
coordinator at each location to serve as the liaison between that location and the researchers,
assisting with questionnaire distribution and collection. This may also be useful in gathering
missing data when incomplete questionnaires are returned.
Nonrespondents
If those who chose not to respond are systematically different than those who did respond, the
generalizability of the results is compromised. For example, respondents from firms with quality
problems may exhibit a marked reluctance to return a questionnaire on quality. Generalizing
from those who chose to return this survey may not be truly reflective of the characteristics of
the entire population.
Analyzing important characteristics of nonrespondents can help to determine whether they are
systematically different from respondents. However, there is an obvious dilemma in attempting
to identify characteristics of nonrespondents. Since they have chosen not to respond, there is
little detailed data for determining their characteristics. Furthermore, when questionnaires are
distributed anonymously, it will be impossible to even identify the nonrespondents.
If nonrespondents can be identified, they may be willing to answer a few brief questions on
the telephone. Another useful approach is demographic matching, using archival data to
determine whether there is a difference between respondents and nonrespondents. However,
sampling the nonrespondents using a site visit, mail survey or telephone contact is the preferred
method of nonrespondent analysis, especially when there is a large non-response rate (greater
than 50%).
Data Entry
Careful examination of completed questionnaires, prior to data entry, can prevent subsequent
data analysis problems. Things to look for include incomplete or blank items, handwritten
comments about difficulty or interpretation of specific items and inappropriate responses, for
example, "Too many to count," as a response for "How many suppliers did your plant do
business with last year?"
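Once responses are transcribed, a screening pass of this kind can be partially automated. A small sketch, in which the field names and rules are hypothetical:

```python
# Hypothetical screening pass over transcribed answers, before data entry.
def screen_response(field, value):
    """Flag blank or non-numeric answers to count-type questions."""
    problems = []
    if value.strip() == "":
        problems.append(f"{field}: blank")
    elif not value.strip().isdigit():
        problems.append(f"{field}: inappropriate response ({value.strip()!r})")
    return problems

issues = screen_response("supplier_count", "Too many to count")  # flagged
clean = screen_response("supplier_count", "42")                  # passes
```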
It is unwise for members of the research team to perform the actual data entry task. The
integrity of the data is vital to the generalizability of the conclusions of the study. Experienced
data-entry personnel assure input accuracy through the use of such devices as a mask, or
template, which makes the screen emulate a page of the questionnaire. Responses are entered
from the questionnaire, and the software positions the responses in the appropriate records in the
database. For large and complex databases, it is recommended that the research team also seek
the help of experts in database design; a well-designed database can save countless hours of
agony during data analysis.
DATA ANALYSIS
It is difficult, if not impossible, to draw conclusions from empirical data and to generalize
them, without the assistance of statistical evidence. Perhaps part of the reason why empirical
research in operations management is not held in the same esteem as other types of operations
management research is that, with a few notable exceptions, its data analysis is relatively
unsophisticated. This is probably due to the predominance of nominal and ordinal data.
Table 1 summarizes the characteristics of several methods which are useful in empirical data
analysis. Descriptive statistics are appropriate for any type of data. Statistical analysis of
nominal data is limited to categorical methods of data analysis. A number of nonparametric
techniques are also useful with ordinal data. Many other techniques of statistical analysis, using
interval or ratio scales, are listed in Table 1. Since these methods are well known, no further
discussion is provided here.
TABLE 1
STATISTICAL TECHNIQUES FOR EMPIRICAL RESEARCH

Method                   Purpose                              Sample Size       Comments
Descriptive Statistics   Used to describe industry practice   Small to large
T-Test                   Compares two variables               Small to large
Chi-Square Test                                               Small to large
F-Test                                                        Small to large    Yields interactions, as well as
                                                                                main effects.
Regression/Correlation   Study relationships between          Small to large    Allows specification of dependent
                         variables                                              and independent variables.
Path Analysis            Establish causal inference           Medium to large   New to many OM researchers.
                                                                                Powerful for theory verification.
Cluster Analysis         Define groups                        Medium to large   New to many OM researchers.
                                                                                Powerful for theory building.
Factor Analysis          Classify data                        Medium to large   Useful in establishing reliability
                                                                                and validity.
It is tempting to statistically test for a number of differences, when analyzing survey results.
This is appropriate, as long as it was predetermined in the design of the study and the nature of
the Bonferroni problem of multiple comparisons is understood. As more comparisons are made,
the likelihood increases that some of them will be statistically significant, based solely on
chance. For example, if 100 comparisons are made at the 5% level of significance, an expected
value of five of them should be statistically significant, due to chance alone. This problem can be
dealt with by using 5% as the overall error probability, rather than the probability for individual
comparisons. The relationship between the overall and individual significance levels is as
follows:

    P = 1 - (1 - p)^n

where:
    P = overall significance level
    p = significance level of each test
    n = number of tests made

For example, if each test is made at the .05 level of significance and there are 50 tests made, then
the overall significance of the results is 0.923. Conversely, in order to arrive at an overall
significance level of .05, an individual test level of about .001 is needed for 50 tests.
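The two worked examples follow directly from the formula; a short sketch in the paper's notation (capital P for the overall level, lower-case p for each test):

```python
# Relationship between per-test and overall significance levels:
#     P = 1 - (1 - p)**n
def overall_significance(p, n):
    """Probability that at least one of n independent tests is 'significant' by chance."""
    return 1 - (1 - p) ** n

def per_test_level(p_overall, n):
    """Per-test level needed to hold a desired overall level (inverse of the above)."""
    return 1 - (1 - p_overall) ** (1 / n)

overall = overall_significance(0.05, 50)    # about 0.923, as in the text
individual = per_test_level(0.05, 50)       # about 0.001, as in the text
```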
RELIABILITY AND VALIDITY
The data collected by surveys and other empirical designs is of little use unless its reliability
and validity can be demonstrated. Measurement papers, which discuss strictly the reliability and
validity of survey instruments, have been very limited in P/OM research and are just beginning
to appear. Saraph, Benson and Schroeder's (1989) quality measurement paper provides a good
example. Measurement papers are important in disseminating reliable and valid instruments for
use by other researchers.
Useful references on measurement,
including concerns about
reliability and validity, include Nunnally (1978), Mehrens and Ebel (1967), and Carmines and
Zeller (1979).
Reliability
Reliability measures the extent to which a questionnaire, summated scale or item which is
repeatedly administered to the same people will yield the same results. Thus, it measures the
ability to replicate the study. A non-reliable measure is like an elastic tape measure; the same
thing can be measured a number of times, but it will yield a different length each time. One of
those measurements may, indeed, be the correct length, but it is impossible to determine which.
Reliability is a prerequisite to establishing validity, but it is not sufficient (Schwab (1980)). If a
measure yields inconsistent results, even highly valid results are meaningless.
There are a number of methods for measuring various aspects of reliability. When a measure
is administered to a group of individuals at two different points in time and their scores at the
two times are correlated, that correlation coefficient is a measure of test-retest reliability. While
appropriate for physical measures, such as machine speeds, there are obvious problems with
administering the same questionnaire items to the same group of people at two different points in
time. During the retest phase, the respondents will be answering questions which they have
previously seen. Reflection or discussion with coworkers after the first phase may alter the way in
which they answer questions during the retest phase. In addition, if a substantial period of time
has passed between test and retest, differences in scores may reflect actual changes which have
taken place, rather than unreliability. Parallel forms reliability can be established by constructing
two equivalent (in terms of means and variances) forms of the same measure and administering
them to a common set of subjects at different points in time. The correlation between the scores
is known as the parallel forms reliability estimate. This is particularly appropriate when both
forms of the measure will actually be used in data collection, such as in assessing job
satisfaction before and after a technological change in the plant.
Internal consistency is important when there is only one form of a measure available, such as
most P/OM questionnaires. There should be a high degree of intercorrelation among the items
that comprise the measure or summated scale. Using the split-half technique, the items within a
summated scale are divided into two subsets and the total scores for the two subsets are
correlated. To the extent that the measure is internally consistent, the correlation between total
scores for the two subsets will be high. The most widely accepted measure of a measure's
internal consistency is Cronbach's Alpha (Cronbach & Meehl (1955)). Alpha is the average of
the correlation coefficients of each item with each other item (Nunnally (1978)). It is popular
because it incorporates every possible split of the scale in its calculation, rather than one
arbitrary split, as with the split-half measure. Cronbach's alpha is part of the standard reliability
package in SPSSX and is quite easy to use. The minimum generally acceptable Alpha value is
.70; however, Nunnally suggests allowing a somewhat lower threshold, such as .60, for
exploratory work involving the use of newly developed scales.
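The text gives Alpha's interpretation; computationally, the usual formula is alpha = (k / (k - 1)) * (1 - (sum of item variances) / (variance of total scores)), for k items. A sketch with hypothetical item scores:

```python
from statistics import pvariance

def cronbach_alpha(items):
    """items: one list of scores per item, aligned by respondent."""
    k = len(items)
    item_vars = sum(pvariance(scores) for scores in items)
    totals = [sum(resp) for resp in zip(*items)]  # each respondent's summated score
    return (k / (k - 1)) * (1 - item_vars / pvariance(totals))

# Hypothetical 4-item Likert scale answered by 5 respondents.
items = [
    [4, 5, 3, 4, 2],
    [4, 4, 3, 5, 2],
    [5, 5, 2, 4, 1],
    [3, 4, 3, 4, 2],
]
alpha = cronbach_alpha(items)  # highly intercorrelated items give alpha near 1
```

With these made-up responses alpha is about .93, comfortably above the .70 threshold.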
Validity
In general, validity measures two things. First, does the item or scale truly measure what it is
supposed to measure? Second, does it measure nothing else? If either of these questions can be
answered "no," the item or scale should not be used. Using an invalid item or scale is like trying
to measure inches with a meter stick; precise quantitative data can be collected, but it is
meaningless.
Content validity is a judgement, by experts, of the extent to which a summated scale truly
measures the concept that it is intended to measure, based on the content of the items. Content
validity cannot be determined statistically. It can only be determined by experts and by reference
to the literature. The Delphi method is a very useful means for establishing the content validity
of items (for example, see Pesch (1989)). An extended literature search, particularly noting
recurring concepts, should also be used.
Content validity is, of course, very critical. If the content of a construct, or theory, is faulty,
no amount of reliability or construct validity will suffice. Researchers must carefully consider
content, in advance of data collection, by informed logical analysis, insight and theory
formulation. As noted above, this should be based on literature searches and expert opinion.
Content validity can also be improved, over time, by theory building and theory verification.
Well-done empirical studies should lead to evolving knowledge and a more sophisticated
understanding of content.
Criterion-related (predictive) validity investigates the empirical relationship between the
scores on a test instrument (predictor) and an objective outcome (the criterion). For instance, if
the researcher is developing a quality measurement instrument, the instrument should accurately
predict objective quality outcomes. The most commonly used measure of criterion-related
validity is a validity coefficient, which is the correlation between predictor and criterion scores.
A validity coefficient is an index of how well criterion scores can be predicted from the
instrument score. Therefore, computing the multiple correlation coefficient between the
instrument score and performance or outcome, and obtaining a high value, is an indication that
the measurement instrument has criterion-related validity. Two techniques are generally used.
These are simple correlation, for testing a summated scale with a single outcome, and canonical
correlation, for testing a summated scale, or a battery of summated scales, with multiple
outcomes.
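For the single-outcome case, the validity coefficient is simply the Pearson correlation between instrument scores and the criterion. A sketch with hypothetical scores and outcomes:

```python
from statistics import mean, pstdev

def validity_coefficient(predictor, criterion):
    """Pearson correlation between instrument (predictor) scores and the criterion."""
    mx, my = mean(predictor), mean(criterion)
    cov = mean((x - mx) * (y - my) for x, y in zip(predictor, criterion))
    return cov / (pstdev(predictor) * pstdev(criterion))

# Hypothetical data: summated quality-instrument scores for five plants, and an
# objective criterion (first-pass yield) measured at the same plants.
instrument_scores = [10, 14, 9, 16, 12]
first_pass_yield = [0.82, 0.90, 0.80, 0.95, 0.86]
coefficient = validity_coefficient(instrument_scores, first_pass_yield)
```

A coefficient near 1 would indicate that the instrument predicts the objective outcome well; canonical correlation generalizes this to multiple outcomes.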
Construct validity measures whether a scale is an appropriate operational definition of an
abstract variable, or a construct. For example, job satisfaction is a familiar construct which is
relevant in operations management research. It cannot be directly assessed, but instead must be
inferred from scores on summated scales which purport to measure job satisfaction. These
summated scales comprise the operational definition of the construct, job satisfaction.
Establishing construct validity is a difficult process (Schwab (1980)). Since the construct cannot
be directly assessed empirically, only indirect inference about construct validity can be made by
empirical investigations. Appropriate construct validity procedures depend on both the nature of
the construct specified and the hypothetical linkages which it is presumed to have with other
constructs.
In examining the hypothetical linkages which a construct has with other constructs, it is
helpful to construct a nomological network, or a framework, to illustrate the proposed
linkages. Based on the nomological network, hypothesized linkages to other valid constructs can
be empirically tested. This is also helpful in providing clarification of the definition of the
construct, itself. To demonstrate the linkages between the construct and other constructs, a
series of assessments should be made, examining the measure against criteria established from
multiple hypotheses, measures of alternative constructs and samples (Schwab (1980)). The
hypotheses should be a logical outgrowth of the proposed linkages illustrated by the nomological
network.
Factor analysis can also be useful in establishing construct validity. It may be used in two
ways (Schwab (1980)). First, factor analysis is helpful in identifying tentative dimensions, as
well as suggesting items for deletion and places where items should be added. Conducting a
factor analysis on a single summated scale will show whether all items within the summated
scale load on the same construct, or whether the summated scale actually measures more than
one construct. The results of such an analysis are likely to be sample specific; however, having a
large sample (a ratio of respondents to items of 10:1) can ameliorate this problem. Also, within-sample heterogeneity (age, sex, job level, etc.) can influence the resulting factor structure; thus,
generalizations should only be made to similar populations. A second use for factor analysis in
establishing construct validity is in testing hypotheses in an already-developed scale. In this
case, the factor analysis is conducted on the scales which comprise several summated scales
simultaneously. The researcher should specify both the number of dimensions in the construct
and the specific items or scales which are hypothesized to load on those dimensions a priori.
Comparing this with the dimensions and loadings from factor analysis will help in establishing
construct validity of a previously-developed summated scale.
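A full factor analysis requires a statistical package. As a rough preliminary signal of whether items cluster on their intended constructs, not a substitute for factor analysis, within-scale and cross-scale inter-item correlations can be compared; the items and scores below are hypothetical:

```python
from statistics import mean, pstdev

def pearson(x, y):
    """Product-moment correlation between two item-score lists."""
    mx, my = mean(x), mean(y)
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (pstdev(x) * pstdev(y))

# Hypothetical items, two per intended construct, scored by six respondents.
scale_a = [[5, 4, 2, 5, 1, 3], [4, 4, 1, 5, 2, 3]]  # intended construct A
scale_b = [[1, 3, 5, 2, 5, 4], [2, 3, 5, 1, 4, 4]]  # intended construct B

within_a = pearson(scale_a[0], scale_a[1])  # items on the same scale: high
within_b = pearson(scale_b[0], scale_b[1])  # items on the same scale: high
cross = pearson(scale_a[0], scale_b[0])     # items on different scales: lower
```

Items that correlate more strongly across scales than within their own scale are candidates for deletion or reassignment before the formal factor analysis.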
Because of the difficulty of establishing reliability and validity, it is very useful for
researchers to use summated scales whose reliability and validity have already been demonstrated. These are difficult to find for the technical aspects of operations management research;
however, there is a wealth of behavioral measures available in such handbooks as Cook,
Hepworth, Wall and Warr (1981) and Price and Mueller (1986). Many of these are relevant to
topics which are of interest to researchers in operations management.
PUBLICATION
Appendix C shows that there is a substantial amount of empirical research in operations
management which has been published. In addition, a surprisingly large number of doctoral
students have used empirical research for their dissertations, which, for the most part, are not
referenced here. Based on our search of recent volumes of relevant journals, Table 2 lists the
publication of empirical operations management articles, by journal. The largest amount of
empirical operations management research was published in Production and Inventory Management Journal and the International Journal of Operations and Production Management. These
journals have a predominantly practitioner audience, although they are cited by academics, as
well. It is not surprising that empirical work would be of interest to their readers. The empirical
work published in these journals tends to be highly descriptive, documenting current OM
practices. It is encouraging to note, however, that some more academic journals are contained on
this list, as well. The empirical research published by these journals has moved beyond
description into the inferential mode. There is more hypothesis testing, and some of these
articles use highly sophisticated and powerful statistical analysis. The fact that the more
academic journals have not published as much empirical operations management work in the
past may be reflective of the relative sophistication of the studies which they have received and
reviewed. It does not appear that academically-oriented journals are systematically biased
against empirical operations management research. This is very encouraging to researchers who
are committed to conducting empirical research and yet are reluctant to sacrifice the opportunity
to publish in high quality academic journals.
TABLE 2
RECENT EMPIRICAL RESEARCH IN POM BY JOURNAL (1980-1989)

Journal                                                          Number of Articles
Production and Inventory Management Journal                              19
International Journal of Operations and Production Management            18
Journal of Operations Management                                         14
Decision Sciences                                                         7
Interfaces                                                                6
International Journal of Production Research                              4
Harvard Business Review                                                   3
Other                                                                     5
Total                                                                    76
TABLE 3
TOPICS OF RECENT EMPIRICAL POM RESEARCH (1980-1989)

Topic                                                  Number of Articles
Manufacturing Strategy                                         10
Manufacturing Technology (FMS/CAM/MIS/Software)                 8
MRP                                                             7
Manufacturing Management                                        5
Quality/Quality Control/Quality Circles                         5
Just-in-Time                                                    5
Use of OR Techniques in POM                                     4
Productivity                                                    3
Production Supervision                                          3
Innovation                                                      2
Future Manufacturing Systems                                    2
Reliability and Validity                                        2
Teaching POM                                                    2
Other                                                          18
Total                                                          76
cannot be conducted in isolation from the real world. Industry contacts can be invaluable in
doing empirical research. Keeping in touch with former students, attending APICS or OMA
meetings, etc., can be vital in finding industry experts, pilot testing and determining consensus
on the meaning of terms. Having the support of an industry group, in writing, can greatly
enhance the response rate of a large sample questionnaire. Industry groups may also be willing
to provide some financial support to help defray the out-of-pocket costs of conducting empirical
research.
Finally, the distinction between the exploratory and confirmatory modes of research in
operations management is useful to operations management researchers. The focus of most OM
research to date has been on confirmation. We need to do much more exploratory research, in
order to lay a foundation for our confirmatory research. In the long run, the results of
confirmatory research will be greatly enhanced when effort has been initially put into theory
building.
It is hoped that this paper will generate more high-quality empirical studies in P/OM, as well
as measurement papers. The trends in this direction are already positive. We hope to help
accelerate these trends.
ACKNOWLEDGMENT
This research was supported, in part, by the Friendship Commission and the Foundation.
REFERENCES
Abernathy, W. J., and K. Wayne. The Limits of the Learning Curve. Harvard Business Review, September-October 1974, 109-119.
Alreck, P., and R. Settle. The Survey Research Handbook. Homewood, IL: Richard D. Irwin Co., 1985.
Amoako-Gyampah, K., and J. R. Meredith. A Journal Review of Operations Management Research. In Proceedings, Decision Sciences Institute, Vol. 2. Atlanta, GA: Decision Sciences Institute, 1989, 1021-1023.
Anderson, J. C., R. G. Schroeder, S. E. Tupy, and E. M. White. Material Requirement Planning Systems: The State of the Art. Production and Inventory Management Journal, vol. 24, no. 4, Fourth Quarter 1982, 51-66.
Ansari, A., and B. Modarress. The Potential Benefits of Just-in-Time Purchasing for U.S. Manufacturing. Production and Inventory Management Journal, vol. 28, no. 2, Second Quarter 1987, 30-35.
Bahl, H. C. Teaching Production and Operations Management at the MBA Level: A Survey. Production and Inventory Management Journal, vol. 30, no. 3, Third Quarter 1989, 5-E.
Benson, P. G., A. V. Hill, and T. R. Hoffmann. Manufacturing Systems of the Future: A Delphi Study. Production and Inventory Management Journal, vol. 24, no. 3, Third Quarter 1982, 87-105.
Bessant, J., and B. Haywood. Experiences with FMS in the U.K. International Journal of Operations and Production Management, vol. 6, no. 5, 1986, 44-56.
Best, J. W. Research in Education. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1970.
Blackstone, J. H., and J. F. Cox. Selecting MRP Software for Small Businesses. Production and Inventory Management Journal, vol. 26, no. 4, Fourth Quarter 1985, 42-49.
Blalock, H. M. Theory Construction: From Verbal to Mathematical Formulations. Englewood Cliffs, NJ: Prentice-Hall, Inc., 1969.
Bouchard, T. J., Jr. Unobtrusive Measures: An Inventory of Uses. Sociological Methods and Research, vol. 4, no. 3, 1976.
Bradbard, D. A., F. N. Ford, J. F. Cox, and W. N. Ledbetter. The Management Science/Operations Research Industrial-Academic Interface. Interfaces, vol. 17, no. 2, March-April 1987, 39-48.
Bradburn, N. M. Response Effects. In Handbook of Survey Research, P. H. Rossi, J. D. Wright, and A. B. Anderson (eds.). New York: Academic Press, 1983.
Buffa, F. P. An Empirical Study of Inbound Consolidation Opportunities. Decision Sciences, vol. 19, 1988, 635-653.
Buffa, E. S. Commentary on Production/Operations Management: Agenda for the '80s. Decision Sciences, vol. 12, no. 4, 1981, 572-573.
Gergen, K. J. Correspondence
versus Autonomy in the Language of Human Action. In Methodology in Social
Science, D. W. Fiske and R. Q. Schweeler (eds.) Chicago, IL: University of Chicago Press, 1986.
International Journal of Operations and
Gilgeous, V Aggregate
Planning in UK Manufacturing
Companies.
Production Management, vol. 7. no. 1, 1987. 50-61.
Glaser, B. Cl., and A. L. Strauss. The Discovery of Grounded Theory: Strategies for Qualitative Research. Chicago:
Aldine, 1967.
Gregory, K. Native View Paradigms: Multiple Cultures and Culture Conflict in Organizations.
Administrative Science
Quarterly, vol. 28, 1983, 359-376.
Groff, G. Personal communication,
1989.
Groff, G. K., and T. B. Clark. Commentary
on Production/Operations
Management: Agenda for the ~OS. Decision
Science. vol. 12, no. 4, 1981, 578-581.
Hamilton, S., and R. G. Schroeder. Computer-Based
Manufacturing and Accounting Systems for Smaller Manufacturing Firms. Production and Inventory Management Journal. vol. 25, no. 4, Fourth Quarter 1986, 92-105.
Hannan, M. T., and N. B. Tuma. Methods for Temporal Analysis. Annual Review of Sociology, vol. 5, 303-1983,
328.
Hax, A. Comment on Production/Operations
Management: Agenda for the 80s: Decision Sciences, Vol. 12, no. 4,
198 1, 574-577.
Hayes, R. H., and K. B. Clark. Explaining
Observed Productivity Differentials between Plants: Implications for
Operations.
Interfaces, vol. 15, no. 6, 1985, 3-14.
Hayes, R. H., and S. C. Wheelwright.
Line Manufacturing Process and Product Life Cycles. Harvard Business
Review, January-February,
1979.
Hitt, M. A., R. D. Ireland, B. W. Kats, and A. Vianna. Measuring Subunit Effectiveness. Decision Sciences, vol. 14,
no. 1, 1983, 87-102.
Horte, S. A., I! Lindberg, and C. Tunalv. Manufacturing
Strategies in Sweden. International Journal of Production
Research, vol. 25, no. 11, 1987, 1573-1586.
HSU, J. I. S. Equipment Replacement Policy-A
Survey. Production and Inventory Management Journal, vol. 29, no.
4, 1988, 23-26.
Huber. V L., and N. L. Hyer. The Human Factor in Cellular Manufacturing.
Journal of Operations Management, vol.
5, no. 2, February 1985, 213-228.
Huck, S. W., W. H. Cormier, and W. G. Bounds. Reading Statistics and Research. New York: Harper and Row,
Publishers, 1974.
Hyer, N. L. The Potential of Group Technology for U.S. Manufacturing.
Journal of Operations Management, vol. 4,
no. 3, May 1984, 183-202.
Im, J. H., and S. M. Lee. Implementation
of Just-in-Time Systems in U.S. Manufacturing Firms. International
Journal of Operations and Production Management. vol. 9, no. 1, 1989, 5-14.
Inman. R. A., and S. Mehra. The Transferability of Just-in-Time Concepts to American Small Business. Interfaces,
vol. 20, no. 2, 1990, 30-37.
Instone, F. J., and B. G. Dale. A Case Study of the Typical Issues Involved in Quality Improvement. International Journal of Operations and Production Management, vol. 9, no. 1, 1989, 15-26.
Jick, T. D. Mixing Qualitative and Quantitative Methods: Triangulation in Action. Administrative Science Quarterly,
vol. 24, no. 4, 1979, 602-611.
Kerlinger, F. N. Foundations of Behavioral Research. New York: Holt, Rinehart and Winston, Inc., 1973.
Khan, A., and V. Manopichetwattana.
Innovative and Noninnovative Small Firms: Type and Characteristics.
Management Science, vol. 35, no. 5, 1989, 597-606.
Kivijarvi, H., P. Korhonen, and J. Wallenius. Operations Research and its Practice in Finland. Interfaces, vol. 16, no. 4, July-August 1986, 53-59.
Klein, J. A. Why Supervisors Resist Employee Involvement. Harvard Business Review, September-October 1984, 87-95.
Kulonda, D. J., and W. H. Moates, Jr. Operations Supervisors in Manufacturing and Service Sectors in the United
States: Are They Different? International Journal of Operations and Production Management, vol. 6, no. 2, 1986,
21-35.
LaForge, R. L., D. R. Wood, Jr., and R. G. Sleeth. An Application of the Survey-Feedback
Method in a Service
Operation. Journal of Operations Management, vol. 5, no. 1, November 1984, 103-118.
LaForge, R. L., and V. L. Sturr. MRP Practices in a Random Sample of Manufacturing Firms. Production and
Inventory Management Journal, vol. 27, no. 3, 1986, 129-136.
Journal of Operations Management, Vol. 9, No. 2
Runcie, J. E. By Days I Make the Cars. Harvard Business Review, vol. 58, no. 3, 1980, 106-115.
Sanday, P. R. Ethnographic Paradigm(s). Administrative Science Quarterly, vol. 24, 1979, 527-538.
Saraph, J. V., P. G. Benson, and R. G. Schroeder. An Instrument for Measuring the Critical Factors of Quality
Management.
Decision Sciences, vol. 20, no. 4, Fall 1989, 810-829.
Schall, M. S. A Communication Rules Approach to Organizational Culture. Administrative Science Quarterly, vol.
28, 1983, 557-581.
Schmenner, R. W. Multiplant Manufacturing Strategies Among the Fortune 500. Journal of Operations Management, vol. 2, no. 2, February 1982, 77-86.
Schmenner, R. W., and R. L. Cook. Explaining Productivity Differences in North Carolina Factories. Journal of Operations Management, vol. 5, no. 3, May 1985, 273-289.
Schonberger, R. J. Japanese Manufacturing Techniques: Nine Hidden Lessons in Simplicity. New York: Free Press, 1982.
Schroeder, R. G., J. C. Anderson, and G. D. Scudder. Measurement of White Collar Productivity. International Journal of Operations and Production Management, vol. 5, no. 2, 1985, 25-34.
Schroeder, R. G., J. C. Anderson, and G. Cleveland. The Content of Manufacturing Strategy: An Empirical Study. Journal of Operations Management, vol. 6, no. 4, August 1986, 405-415.
Schroeder, R. G., G. D. Scudder, and D. R. Elm. Innovation in Manufacturing. Journal of Operations Management, vol. 8, no. 1, January 1989, 1-15.
Schroeder, R. G., J. C. Anderson, S. E. Tupy, and E. M. White. A Study of MRP Benefits and Costs. Journal of Operations Management, vol. 2, no. 1, 1981, 1-9.
Schwab, D. P. Construct Validity in Organizational Behavior. Research in Organizational Behavior, vol. 2, 1980,
Spradley, J. P. The Ethnographic Interview. New York: Holt, Rinehart and Winston, 1979.
Stahl, M. J., and T. W. Zimmerer. Using Decision Modeling to Examine Management Consensus: A Study of a
Maintenance Management Control System. Journal of Operations Management, vol. 3, no. 2, February 1983,
98-98.
Stone, E. Research Methods in Organizational Behavior. Santa Monica: Goodyear Publishing Company, Inc., 1978.
Strauss, A. L. Qualitative Analysis for Social Scientists. Cambridge: Cambridge University Press, 1987.
Swamidass, P. M. Manufacturing Strategy: Its Assessment and Practice. Journal of Operations Management, vol. 6, no. 4, August 1986, 471-484.
Swamidass, P. M., and W. T. Newell. Manufacturing Strategy, Environmental Uncertainty and Performance: A Path Analytic Model. Management Science, vol. 33, no. 4, 1987, 509-524.
Temple, A. I., and B. G. Dale. A Study of Quality Circles in White Collar Areas. International Journal of Operations
and Production Management, vol. 7, no. 6, 1987, 17-31.
Voss, C. A., and S. J. Robinson. Application of Just-in-Time Manufacturing Techniques in the United Kingdom. International Journal of Operations and Production Management, vol. 7, no. 4, 1987, 46-52.
Voss, C. A. Japanese Manufacturing
Management Practices in the UK. International Journal of Operations and
Production Management, vol. 4, no. 2, 1984, 31-38.
Wathen, S. A Contingency Approach to the Design of Service Transformation Processes. Ph.D. dissertation, School of
Management, University of Minnesota, Minneapolis, MN, 1988.
Weber, R. P. Basic Content Analysis. Beverly Hills: Sage Publications, 1985.
Weick, K. Systematic Observational Methods. In Handbook of Social Psychology, G. Lindzey and E. Aronson (eds.). Reading, MA: Addison-Wesley, 1984.
Weinberg, E. Data Collection: Planning and Management. In Handbook of Survey Research, P. H. Rossi, J. D.
Wright, and A. B. Anderson (eds.) New York: Academic Press, 1983.
Weiss, A. Simple Truths of Japanese Manufacturing.
Harvard Business Review, July-August 1984, 119-125.
White, C. R., J. Adams, K. Donehoo, and S. Hofacker. Educational and System Requirements for Production Control.
Production and Inventory Management, vol. 29, no. 2, 1988, 10-12.
White, E. M., J. C. Anderson, R. G. Schroeder, and S. E. Tupy. A Study of the MRP Implementation Process. Journal
of Operations Management, vol. 2, no. 3, 1982, 145-153.
Wild, R. Survey Report: The Responsibilities and Activities of UK Production Managers. International Journal of
Operations and Production Management, vol. 4, no. 1, 1984, 69-74.
Yin, R. K. Case Study Research. Beverly Hills: Sage Publications, 1989.
APPENDIX A
OVERVIEW OF SCALING TECHNIQUES

Technique: Multiple choice items
When to Use: The entire range of responses should be classifiable into a limited number of discrete, mutually exclusive categories.
Data Obtained: Nominal
Benefits: Simple, versatile. Can be used to obtain either a single response or several.
Limitations: Respondents may not always follow directions when a single response is desired. Should not be used for numeric (continuous) data, where a direct question is more appropriate. Limited statistical analysis is possible, due to the nominal data.

Technique: Likert scale
When to Use: Use when it is necessary to obtain people's position on certain issues or conclusions.
Data Obtained: Interval
Benefits: More readily analyzed and interpreted than open-ended attitude questions. Flexible, economical and easy to compose items. Can obtain a summated value in order to measure a more general construct.
Limitations: Respondents may be lulled into marking the same response for each item; therefore, care should be taken so that some of the items are inclined toward the pro side of the issue and the rest toward the con side.

Technique: Verbal frequency scale
When to Use: Similar to the Likert scale; used to indicate how often an action has been taken.
Data Obtained: An approximation of proportion
Benefits: Ease of assessment and response by those being surveyed. Useful when the respondent is expected to be unable to specify an absolute number. Ease of making comparisons among subsamples or among different types of actions for the same sample of respondents.
Limitations: Provides only a gross measure of proportion ("sometimes" can mean anywhere from about 30% to 70% of the time). Different groups may assign different breakpoints between categories.
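The "summated value" benefit of the Likert scale, together with the advice to mix pro- and con-worded items, can be sketched in code. A minimal illustration, assuming a hypothetical four-point response key and an invented reverse-coding pattern (none of the item wording or data comes from the paper):

```python
# Hypothetical 4-point response key, matching the agree/disagree
# categories shown in Appendix B (no neutral point).
KEY = {"Strongly agree": 4, "Agree": 3, "Disagree": 2, "Strongly disagree": 1}

# True marks an item worded toward the "con" side, which must be
# reverse-scored before summation. Pattern is invented for illustration.
REVERSED = [False, True, False, True]

def summated_score(responses, reversed_flags=REVERSED, key=KEY):
    """Sum item scores into one construct measure, reverse-coding
    the con-worded items so all items point the same direction."""
    total = 0
    for answer, rev in zip(responses, reversed_flags):
        score = key[answer]
        if rev:
            score = (max(key.values()) + 1) - score  # 4 -> 1, 3 -> 2, ...
        total += score
    return total

print(summated_score(["Agree", "Disagree", "Strongly agree", "Strongly disagree"]))
# 3 + 3 + 4 + 4 = 14
```

The summated total, rather than any single item, is then used as the measure of the more general construct.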
Appendix A Continued

Technique: Ordinal scale
When to Use: Use when the researcher seeks to include some benchmark to obtain a relative measure.
Data Obtained: Ordinal
Benefits: Obtains a measure relative to some other benchmark.
Limitations: Statistical limitations of ordinal data. Some information is lost concerning the distance or span between categories. Should not be used when an absolute, numeric value can be easily obtained.

Technique: Forced ranking scale
When to Use: Use when the researcher seeks to obtain the standing of the items relative to each other.
Data Obtained: Ordinal
Benefits: Obtains the most preferred item, as well as the sequence of the remaining items. Relativity between items is measured.
Limitations: The absolute standing of an item is not measured, nor is the interval between items. The response task rapidly becomes tedious for respondents if more than a few items are included.

Technique: Paired comparison scale
When to Use: Use when the researcher seeks to measure simple, dichotomous choices between alternatives. Focus is on evaluation of one entity, relative to one other.
Data Obtained: Ordinal data, limited by lack of transitivity.
Limitations: Lack of transitivity presents a major problem, particularly with less sophisticated and careful respondents. The number of items to be compared should be less than ten. No measure of the distance between items in each pair.

Technique: Comparative scale
When to Use: Use when the researcher is most interested in the comparison between one object and one or more others.
Data Obtained: Interval
Benefits: Statistical advantages of interval data. Flexibility. An easy task for the respondent, assuring cooperation and accuracy. Economies of space and time.
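The transitivity problem noted for the paired comparison scale can be screened for mechanically once responses are coded as (winner, loser) pairs. A minimal sketch, with invented item names and an invented respondent (nothing here is from the paper): it flags any cycle in which a beats b and b beats c, yet c also beats a.

```python
from itertools import permutations

def intransitive_triads(beats):
    """beats: set of (winner, loser) pairs from one respondent's
    paired-comparison answers. Returns each preference cycle
    a > b > c > a once, starting from its alphabetically first item."""
    items = {x for pair in beats for x in pair}
    return [
        (a, b, c)
        for a, b, c in permutations(sorted(items), 3)
        if a == min(a, b, c)
        and (a, b) in beats and (b, c) in beats and (c, a) in beats
    ]

# Invented respondent: prefers quality over cost and cost over speed,
# but also speed over quality -- an intransitive (inconsistent) answer set.
answers = {("quality", "cost"), ("cost", "speed"), ("speed", "quality")}
print(intransitive_triads(answers))  # [('cost', 'speed', 'quality')]
```

Respondents producing many such cycles are exactly the "less sophisticated and careful" cases the table warns about, and their data may need to be discarded or re-collected.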
Appendix A Continued

Technique: Semantic differential scale
When to Use: Used when the researcher would like to learn the image of an entity in the minds of the public.
Data Obtained: Interval
Benefits: Portrays images clearly and effectively. Can also be used to measure images or attribute levels. Entire image profiles of several topics or profiles can be compared with one another.
Limitations: Adjectives must be on the ultimate extremes of the spectrum and must define a single dimension; it may be difficult to identify antonyms, as perceived by potential respondents. No more than about 20 items should be used. Precisely what is to be rated must be clearly stated in the instructions.

Technique: Horizontal, numeric scale
When to Use: Items are to be judged on a single dimension, arrayed on a scale with equal intervals.
Data Obtained: Interval
Benefits: Very economical, applying a single question, set of instructions and rating scale to many items. Provides both absolute and relative measures of importance. Data analysis is relatively unrestrictive, due to the interval data. Respondents have little or no difficulty understanding the task. Overall, few limitations, compared with other scaling methods.
Limitations: Considerable controversy among researchers concerning whether intermediate points on the scale should be labeled with words or left as strictly numeric values.

Technique: Adjective checklist
Data Obtained: Nominal
Benefits: Avoids a number of the limitations of the semantic differential scale. Doesn't require specification of bipolar opposites, and is virtually unlimited in the number of adjectives which can be included.
Limitations: Yields dichotomous data; there is no indication of how much each item describes the topic.
Appendix A Continued

Technique: Fixed sum scale
When to Use: The researcher wants to learn what proportion of some resource or activity has been devoted to each of several possible choices or alternatives.
Data Obtained: Approximate proportions
Benefits: Clarity and simplicity. Useful when the respondent is likely to have great difficulty with a direct question about percentage of time.

Technique: Stapel scale
When to Use: Used to portray an image profile.
Data Obtained: Interval
Limitations: Greater complexity; the respondent task is more difficult to explain.
APPENDIX B
Scale Examples

Multiple Choice
Which of the following philosophies concerning quality are used by your plant? Please check all that apply.

Likert Scale
We emphasize good maintenance as a strategy for achieving schedule compliance.
Strongly agree        Agree        Disagree        Strongly disagree

Verbal Frequency Scale
Always        Often        Sometimes        Seldom        Never

Ordinal Scale
Which term best describes ____ and processes?
APPENDIX B Continued
Scale Examples

Paired Comparison Scale
. . . at your plant.
____ Consistent quality
____ Ability to rapidly introduce new products

Comparative Scale
Compared . . . in our industry on a global basis, the quality of product and service at our plant is
. . . average . . . Poor

Horizontal, Numeric Scale
How important is each of the following objectives or goals for manufacturing at your plant over the next five years? If you feel that the objective is extremely important, pick a number from the far left side of the scale and jot it in the space beside the item. If you feel it is extremely unimportant, pick a number from the far right, and if you feel the importance is between these extremes, pick a number from some place in the middle of the scale to show your opinion.

Scale:  Extremely important  . . .  Extremely unimportant

Semantic Differential
Helpful         _ _ _ _ _    Detrimental
Messy           _ _ _ _ _    Tidy
Cooperative     _ _ _ _ _    Uncooperative
Boring          _ _ _ _ _    Interesting
Well-organized  _ _ _ _ _    Disorganized
Competent       _ _ _ _ _    Incompetent
Encouraging     _ _ _ _ _    Discouraging
____            _ _ _ _ _    Unfriendly
APPENDIX B Continued
Scale Examples

Adjective Check List
Please put a check mark in the space in front of each word or phrase which describes your job.

____ Easy          ____ Changing      ____ Secure
____ Technical     ____ Important     ____ Slow-paced
____ Boring        ____ Demanding     ____ Enjoyable
____ Interesting   ____ Temporary     ____ Rigid
____ Low-paying    ____ Safe          ____ Pleasant
____ Strenuous     ____ Exhausting    ____ Satisfying
____ Routine       ____ Difficult     ____ Degrading
____ Dead-end      ____ Rewarding     ____ Risky

Stapel Scale
Please pick a number from the scale to show how well each word below describes your job, and jot it in the space in front of each item.

Scale:  Not at all  . . .  Perfectly

____ Easy          ____ Changing      ____ Secure
____ Technical     ____ Important     ____ Slow-paced
____ Boring        ____ Demanding     ____ Enjoyable
____ Interesting   ____ Temporary     ____ Rigid
____ Low-paying    ____ Safe          ____ Pleasant
____ Strenuous     ____ Exhausting    ____ Satisfying
____ Routine       ____ Difficult     ____ Degrading
____ Dead-end      ____ Rewarding     ____ Risky
APPENDIX C
ANALYTICAL METHODS USED IN EMPIRICAL P/OM RESEARCH
(Citation: topic; sample size in parentheses where it could be matched to a study.)

DESCRIPTIVE STATISTICS (MEANS, PERCENTAGES, TABLES, CHARTS)
Bahl (1989): MRP implementation
Im and Lee (1989): JIT implementation
Rao (1989): Manufacturing in Korea
Crawford, Blackstone and Cox (1988): JIT implementation
Hsu (1988): Equipment replacement policies
White, Adams, Donehoo and Hofacker (1988): Education/system requirements for production control
Darrow (1987): Flexible manufacturing systems
Ford, Bradbard, Ledbetter and Cox (1987): Use of OR techniques in P/OM (72)
Chervany
Raiszadeh
Kivijarvi, Korhonen and Wallenius (1986): Use of OR techniques in P/OM in Finland
Kulonda and Moates (1986): Operations supervisors
LaForge and Sturr (1986): MRP practices
Schroeder, Anderson and Cleveland (1986): Manufacturing strategy
Swamidass (1986): Manufacturing strategy (35)
Hamilton and Schroeder (1986): Computer-based manufacturing in small firms
Blackstone and Cox (1985)
Schroeder, Anderson and Scudder (1985): White collar productivity
Hyer (1984): Group technology
Klein (1984): Supervisor/employee involvement
Rosenthal (1984): (57, 38, 64)
Weiss (1984): Comparison of Japanese/U.S. management
Wild (1984): Activities of production managers in the United Kingdom
Garvin (1983): Quality comparison in U.S./Japan
Anderson, Schroeder, Tupy and White (1982): MRP systems (679)
Additional topics in this section: advanced manufacturing technology; JIT purchasing; flexible manufacturing systems in the United Kingdom; Indian production management; manufacturing strategy (3 surveys); information systems in manufacturing (560).

DELPHI METHOD
Benson, Hill and Hoffmann (1982): Future manufacturing systems (2 surveys: 95, 65)

CASE STUDIES
Schroeder and Anderson (1989): Production competence (research case)
Instone and Dale (1989): Quality improvement
Pesch (1989): Focused factories (12)
Schroeder, Scudder and Elm (1989): Innovation in manufacturing (65)
Voss (1984): Japanese manufacturing management in the United Kingdom

t-TEST
Ferdows and Lindberg (1987): Flexible manufacturing systems (163)
Schmenner and Cook (1985): Productivity differences (95)

CHI-SQUARE TEST
Lockyer and Oakland (1983): JIT implementation (2 samples: 150, 25)

F-TEST/ANOVA/MANOVA
Buffa (1988): Inbound consolidation

CLUSTER ANALYSIS
Khan and Manopichetwattana (1989): Innovative and noninnovative small firms (50)

FACTOR ANALYSIS
Meyer and Ferdows (1987): Manufacturing strategy (163)
Horte, Lindberg and Tunalv (1987): Manufacturing strategy (125)

RELIABILITY
Saraph, Benson and Schroeder (1989): Quality management (162)

DISCRIMINANT ANALYSIS
White, Anderson, Schroeder and Tupy (1982): MRP implementation process (679)

PROBIT ANALYSIS
Schmenner (1982): Multiplant manufacturing strategies

DECISION MODEL
Stahl and Zimmerer (1983): Maintenance management control

SMALL MATHEMATICAL MODEL
Clark (1989): Product development projects
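Several of the methods catalogued above reduce to short computations once the empirical data are in hand; the t-test rows are the simplest case. A minimal sketch of a pooled-variance two-sample t statistic, with invented plant data (the sample sizes and measures in the studies above are their own):

```python
from math import sqrt

def pooled_t(sample_a, sample_b):
    """Two-sample t statistic with pooled variance, of the kind used to
    compare a measure (e.g., productivity) across two groups of plants.
    Returns the statistic and its degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / na
    mean_b = sum(sample_b) / nb
    var_a = sum((x - mean_a) ** 2 for x in sample_a) / (na - 1)
    var_b = sum((x - mean_b) ** 2 for x in sample_b) / (nb - 1)
    pooled = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    t = (mean_a - mean_b) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2

# Invented productivity indices for two groups of plants
group_1 = [98.0, 102.0, 101.0, 99.0]
group_2 = [95.0, 96.0, 94.0, 97.0]
t_stat, df = pooled_t(group_1, group_2)
```

The resulting statistic is then compared against the t distribution with the returned degrees of freedom; in practice, a library routine such as scipy.stats.ttest_ind performs both steps at once.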