

ASSIGNMENT-I

Q.1 (a) Distinguish between Research methods and Research methodology

Research methods and research methodology are two terms that are often confused with one another. Strictly speaking, they are not the same, and there are real differences between them. One of the primary differences is that research methods are the methods by which we conduct research into a subject or a topic.
On the other hand, research methodology explains the methods by which we may proceed with the research.
Research Method and Research Methodology
Research Method and Research Methodology

Research methods are the various procedures, schemes, algorithms, etc. used in research. All the
methods used by a researcher during a research study are termed as research methods. They are
essentially planned, scientific and value-neutral. They include theoretical procedures,
experimental studies, numerical schemes, statistical approaches, etc. Research methods help us to
collect samples, data and find a solution to a problem. Particularly, scientific research methods call
for explanations based on collected facts, measurements and observations and not on reasoning
alone. They accept only those explanations which can be verified by experiments.

 Research methods involve conduct of experiments, tests, surveys and the like.
 In short it can be said that research methods aim at finding solutions to research problems.
 Research methodology is a systematic way to solve a problem. It is a science of studying
how research is to be carried out. Essentially, the procedures by which researchers go about
their work of describing, explaining and predicting phenomena are called research
methodology. It is also defined as the study of methods by which knowledge is gained. Its
aim is to give the work plan of research.
 On the one hand research methodology involves the learning of the various techniques that
can be used in the conduct of research and in the conduct of tests, experiments, surveys
and critical studies.
 On the other hand research methodology aims at the employment of the correct procedures
to find out solutions.
 It is thus interesting to note that research methodology paves the way scientific or non-
scientific research for research methods to be conducted properly. Research methodology
is the beginning
If the subject into which we conduct a research is a scientific subject or topic then the research
methods include experiments, tests, study of various other results of different experiments
performed earlier in relation to the topic or the subject and the like.

Research methodology pertaining to the scientific topic involves the techniques regarding how to
go about conducting the research, the tools of research, advanced techniques that can be used in
the conduct of the experiments and the like. Any student or research candidate is supposed to be
good at both research methods and research methodology if he or she is to succeed in his or her
attempt at conducting research into a subject.

(b) Describe the different types of research, clearly pointing out the difference between an

Experiment and a survey

Types of Research

Research can be conducted in a number of different ways for many different purposes. Much of the
research conducted today takes place in the corporate sector, where findings feed directly into business decisions.
Types of research can be classified in many different ways. Some major ways of classifying
research include the following.
 Descriptive versus Analytical Research
 Applied versus Fundamental Research
 Qualitative versus Quantitative Research
 Conceptual versus Empirical Research

Descriptive research concentrates on finding facts to ascertain the nature of something as it exists.
In contrast, analytical research is concerned with determining the validity of hypotheses based on
an analysis of the facts collected.

Applied research is carried out to find answers to practical problems to be solved and as an aid in
decision making in different areas including product design, process design and policy making.
Fundamental research is carried out more to satisfy intellectual curiosity than with the intention
of using the research findings for any immediate practical application.

Qualitative research studies those aspects of the research subject which are not quantifiable, and
hence not subject to measurement and quantitative analysis. In contrast, quantitative research makes
substantial use of measurements and quantitative analysis techniques.

Conceptual research involves the investigation of thoughts and ideas and the development of new ideas
or the interpretation of old ones based on logical reasoning. In contrast, empirical research is based on firm,
verifiable data collected either by observation of facts under natural conditions or through
experimentation.

Difference between Experiment and Survey:

An experiment is a systematic and scientific approach to research in which the researcher manipulates one or
more variables, and controls and measures any change in other variables.
Survey research seeks to identify what large numbers of people (mass) think or feel about certain
things. It is used extensively in politics and marketing (such as TV advertising).

In an experimental design, the researcher actively tries to change the situation, circumstances, or
experience of participants (manipulation), which may lead to a change in behavior or outcomes
for the participants of the study.
Research may be very broadly defined as the systematic gathering of data and information and its
analysis for the advancement of knowledge in any subject. Research attempts to find answers to
intellectual and practical questions through the application of systematic methods.

Experimental Research is often used where:

1. There is time priority in a causal relationship (cause precedes effect)


2. There is consistency in a causal relationship (a cause will always lead to the same effect)
3. The magnitude of the correlation is great.

Survey research is often used to collect:

 Public opinion polls
 Mail surveys
 Telephone surveys
 Consumer surveys (in the mall), on any specific topic

The participants are ideally randomly assigned to different conditions, and variables of interest are
measured. The researcher tries to control the other variables in order to avoid confounds to
causality.

Surveys are often considered biased because

 They ask leading questions


 The sample population is biased in a particular way
 The questions were not clear
 The respondents were influenced by the researcher
 They always carry some margin of error.
An amazing fact about survey research is that the amount of error (expressed as plus or minus a
certain percentage) is determined by the sample size (the number of people surveyed). Most
opinion polls use a sample size of around 1500, which has a margin of error of roughly 3%.
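A minimal sketch (in Python, using only the standard library) of how this margin of error relates to sample size, based on the usual formula for a proportion at 95% confidence with the worst case p = 0.5; the sample sizes and the simple-random-sampling assumption are illustrative, not taken from the text.

import math

def margin_of_error(n, p=0.5, z=1.96):
    # Worst-case margin of error for an estimated proportion at ~95% confidence,
    # assuming simple random sampling.
    return z * math.sqrt(p * (1 - p) / n)

for n in (600, 1000, 1500):
    print(n, round(margin_of_error(n) * 100, 1))  # roughly 4.0, 3.1 and 2.5 points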

Q.2 Write short notes on:

 Design of the research project;

Research design is considered as a "blueprint" for research, dealing with at least four problems:
which questions to study, which data are relevant, what data to collect, and how to analyze the
results. The best design depends on the research question as well as the orientation of the
researcher.

Execution of the project: Execution of the project is a very important step in the research process.
If the execution of the project proceeds on correct lines, the data to be collected would be adequate
and dependable.

The researcher should see that the project is executed in a systematic manner and in time. If the
survey is to be conducted by means of structured questionnaires, data can be readily machine-
processed. In such a situation, questions as well as the possible answers may be coded. If the data
are to be collected through interviewers, arrangements should be made for proper selection and
training of the interviewers. The training may be given with the help of instruction manuals which
explain clearly the job of the interviewers at each step. Occasional field checks should be made to
ensure that the interviewers are doing their assigned job sincerely and efficiently.

 Motivation in research;

What makes people undertake research? This is a question of fundamental importance. The
possible motives for doing research may be one or more of the following:

1. Desire to get a research degree along with its consequential benefits;


2. Desire to face the challenge in solving the unsolved problems, i.e., concern over practical
problems initiates research;

3. Desire to get intellectual joy of doing some creative work;

4. Desire to be of service to society;

5. Desire to get respectability.

However, this is not an exhaustive list of factors motivating people to undertake research studies.
Many more factors such as directives of government, employment conditions, curiosity about new
things, desire to understand causal relationships, social thinking and awakening, and the like may
as well motivate (or at times compel) people to perform research operations.

 Objectives of research;

The purpose of research is to discover answers to questions through the application of scientific
procedures. The main aim of research is to find out the truth which is hidden and which has not
been discovered as yet. Though each research study has its own specific purpose, we may think of
research objectives as falling into a number of following broad groupings:

1. To gain familiarity with a phenomenon or to achieve new insights into it (studies with this object
in view are termed exploratory or formulative research studies);

2. To portray accurately the characteristics of a particular individual, situation or a group (studies


with this object in view are known as descriptive research studies);

3. To determine the frequency with which something occurs or with which it is associated with
something else (studies with this object in view are known as diagnostic research studies);
4. To test a hypothesis of a causal relationship between variables (such studies are known as
hypothesis-testing research studies).

 Criteria of good research

Whatever may be the types of research works and studies, one thing that is important is that they
all meet on the common ground of scientific method employed by them. One expects scientific
research to satisfy the following criteria:

1. The purpose of the research should be clearly defined and common concepts be used.

2. The research procedure used should be described in sufficient detail to permit another researcher
to repeat the research for further advancement, keeping the continuity of what has already been
attained.

3. The procedural design of the research should be carefully planned to yield results that are as
objective as possible.

4. The researcher should report with complete frankness, flaws in procedural design and estimate
their effects upon the findings.

5. The analysis of data should be sufficiently adequate to reveal its significance and the methods
of analysis used should be appropriate. The validity and reliability of the data should be checked
carefully.

6. Conclusions should be confined to those justified by the data of the research and limited to those
for which the data provide an adequate basis.

7. Greater confidence in research is warranted if the researcher is experienced, has a good


reputation in research and is a person of integrity.
Q.3 (a) “Creative management, whether in public administration or private industry,
depends on methods of inquiry that maintain objectivity, clarity, accuracy and consistency”.
Discuss this statement and examine the significance of research”.

"All progress is born of inquiry. Doubt is often better than overconfidence, for it leads to inquiry,
and inquiry leads to invention" is a famous Hudson Maxim quote, in the context of which the significance of
research can well be understood. Increased amounts of research make progress possible. Research
inculcates scientific and inductive thinking and it promotes the development of logical habits of
thinking and organization.

The role of research in several fields of applied economics, whether related to business or to the
economy as a whole, has greatly increased in modern times. The increasingly complex nature of
business and government has focused attention on the use of research in solving operational
problems. Research, as an aid to economic policy, has gained added importance, both for
government and business.

Research provides the basis for nearly all government policies in our economic system. For
instance, government's budgets rest in part on an analysis of the needs and desires of the people
and on the availability of revenues to meet these needs. The cost of needs has to be equated to
probable revenues and this is a field where research is most needed. Through research we can
devise alternative policies and can as well examine the consequences of each of these alternatives.
Research is equally important for social scientists in studying social relationships and in seeking
answers to various social problems. It provides the intellectual satisfaction of knowing a few things
just for the sake of knowledge and also has practical utility for the social scientist to know for the
sake of being able to do something better or in a more efficient manner. Research in social sciences
is concerned both with knowledge for its own sake and with knowledge for what it can contribute
to practical concerns. This double emphasis is perhaps especially appropriate in the case of social
science. On the one hand, its responsibility as a science is to develop a body of principles that
make possible the understanding and prediction of the whole range of human interactions. On the
other hand, because of its social orientation, it is increasingly being looked to for practical guidance
in solving immediate problems of human relations.
The significance of research can also be understood keeping in view the following points:

(a) To those students who are to write a master's or Ph.D. thesis, research may mean careerism or
a way to attain a high position in the social structure;

(b) To professionals in research methodology, research may mean a source of livelihood;

(c) To philosophers and thinkers, research may mean the outlet for new ideas and insights;

(d) To literary men and women, research may mean the development of new styles and creative
work;

(e) To analysts and intellectuals, research may mean the generalization of new theories.

Thus, research is the fountain of knowledge for the sake of knowledge and an important source of
providing guidelines for solving different business, governmental and social problems. It is a sort
of formal training which enables one to understand the new developments in one's field in a better
way.
Q. 3(b) Based on the objectives, how can the research plan be presented? Draw the research
plan table with a time frame.

Work Plan: 2 YEARS


(Based on IUKL Credit transfer from MU)

The plan spans Year 1 (Semester 1 and Semester 2) and Year 2 (Semester 3 and Semester 4), with each semester divided into quarters Q1-Q4.

Phase I: Coursework (12 credit hours): Postgraduate Research Methodology, Qualitative Analysis,
Quantitative Analysis, Current Issue based on specialization. Covered by study and assignments
carried out to date through exemption/credit transfer.

Phase II: Proposal Defense
- Preliminary data gathering & topic development
- Chapter 1: Introduction
- Chapter 2: Literature Review
- Chapter 3: Research Method
- Proposal defense
- Questionnaire design
- Pilot test
- Survey
- Result analysis
- Chapter 4: Result Analysis and Discussion
- Chapter 5: Conclusion and Recommendations
- Journal article

Phase III: Thesis: Viva

Milestones and Dates

Particulars                                          Expected Date of Completion

Preliminary Data Gathering: Proposal Development     Sem 1, Year 1
Chapter 1: Introduction                              Sem 1, Year 1
Chapter 2: Literature Review                         Sem 2, Year 1
Chapter 3: Research Method                           Sem 2, Year 1
Proposal Defense                                     Sem 2, Year 1
Questionnaire Design                                 Sem 3, Year 2
Pilot Test                                           Sem 3, Year 2
Survey                                               Sem 3, Year 2
Result Analysis                                      Sem 3, Year 2
Chapter 4: Result Analysis and Discussion            Sem 3, Year 2
Chapter 5: Conclusion and Recommendations            Sem 4, Year 2
Q.4 (a) What is the necessity of defining a research problem? How do we define a research
problem?

Give two examples to illustrate your answer.

A research problem, in general, refers to some difficulty which a researcher experiences in the
context of either a theoretical or practical situation and wants to obtain a solution for the same. We
can, thus, state the components of a research problem as under:

(i) There must be an individual or a group which faces some difficulty or problem.

(ii) There must be some objective(s) to be attained. If one wants nothing, one cannot have a
problem.

(iii) There must be alternative means (or the courses of action) for obtaining the objective(s) one
wishes to attain. This means that there must be at least two means available to a researcher for if
he has no choice of means, he cannot have a problem.

(iv) There must remain some doubt in the mind of a researcher with regard to the selection of
alternatives. This means that research must answer the question concerning the relative efficiency
of the possible alternatives.

(v) There must be some environment(s) to which the difficulty pertains.

Thus, a research problem is one which requires a researcher to find out the best solution for the
given problem, i.e., to find out by which course of action the objective can be attained optimally
in the context of a given environment. There are several factors which may result in making the
problem complicated. For instance, the environment may change affecting the efficiencies of the
courses of action or the values of the outcomes; the number of alternative courses of action may
be very large; persons not involved in making the decision may be affected by it and react to it
favorably or unfavorably, and similar other factors. All such elements (or at least the important
ones) may be thought of in the context of a research problem. For example, a company wanting to know
which of two advertising campaigns will raise its sales the most, or a state government wanting to determine
which of several teaching methods will improve literacy at the least cost, each faces a research problem:
there is an objective, there are alternative courses of action, and there is doubt about which alternative is best.
(b) “The task of defining the research problem often follows a sequential pattern”. Explain.

Defining a research problem follows a sequential pattern:

1. Formulating a general idea (the broad area) that will guide the research work
2. Dissecting the general idea into sub-topics, so that alternative lines of inquiry emerge
3. Choosing the sub-topic of greatest interest and feasibility on which to base the research
4. Developing specific research questions from the chosen sub-topic

Besides that, the following points may be observed by a researcher in selecting a research problem
or a subject for research:

(i) Subject which is overdone should not be normally chosen, for it will be a difficult task to throw
any new light in such a case.

(ii) Controversial subject should not become the choice of an average researcher.

(iii) Too narrow or too vague problems should be avoided.

(iv) The subject selected for research should be familiar and feasible so that the related research
material or sources of research are within one's reach.

(v) The importance of the subject, the qualifications and training of the researcher, the costs
involved and the time factor are a few other criteria that must also be considered in selecting a problem.

(vi) The selection of a problem must be preceded by a preliminary study.

Q.5 (a) Explain the meaning and significance of a Research design?

The term "research design" refers to how a researcher puts a research study together to answer a
question or a set of questions. Research design works as a systematic plan outlining the study, the
researchers' methods of compilation, details on how the study will arrive at its conclusions and the
limitations of the research. Research design is not limited to a particular type of research and may
incorporate both quantitative and qualitative analysis. When defining research design to an
audience, there are a few things we will need to make clear, while avoiding the use of scientific
terms that may lose the audience.

The research design is a comprehensive master plan of the research study to be undertaken, giving
a general statement of the methods to be used. The function of a research design is to ensure that
requisite data in accordance with the problem at hand is collected accurately and economically.
Simply stated, it is the framework, a blueprint for the research study which guides the collection
and analysis of data. The research design, depending upon the needs of the researcher may be a
very detailed statement or only furnish the minimum information required for planning the
research project.

(b) Describe some of the important research designs used in experimental hypothesis-testing
research study.

There are various types of research designs, e.g. descriptive, exploratory, experimental,
qualitative and quantitative designs, used in research studies for hypothesis testing.
Of these, the experimental research design is the one used in experimental hypothesis-testing
research studies.

These are the designs where the researcher tests the hypotheses of causal relationships between
variables. Such studies require procedures that will not only reduce bias and increase reliability,
but also permit drawing inferences about causality. Usually, experiments meet these requirements.
Hence, these are better known as experimental research designs. An experiment deliberately
imposes a treatment on a group of objects or subjects in the interest of observing the response.
This differs from an observational study, which involves collecting and analyzing data without
changing existing conditions. Because the validity of an experiment is directly affected by its
construction and execution, attention to experimental design is extremely important.
Professor R.A. Fisher (Centre for Agricultural Research in England) enumerated three principles
of experimental designs.

i) The Principle of Replication: The term replication has been derived from the fusion of two words,
namely repetition and duplication. Replication refers to the deliberate repetition of an experiment,
using nearly identical procedures, which may sometimes be with a different set of subjects in a
different setting, and, at different time periods. It helps to revalidate a previous study, or to raise
some questions about the previous studies.

ii) The Principle of Randomization: Randomization refers to a technique in which each member
of the population, or universe, has an equal and independent chance of being selected. This
provides for the random distribution of the effects of unknown or unspecified
extraneous variables over different groups. This is a method of controlling the extraneous variables
and reducing experimental error. Thus randomization makes the test valid.

iii) The Principle of Local Control: Local Control refers to the amount of balancing, blocking and
grouping of the subjects or the experimental units employed in the research design. The term,
grouping, refers to the assignment of homogeneous subjects, or experimental units, into a group
so that different groups of homogeneous subjects may be available for differential experimental
treatments. The term, blocking, refers to the assignment of experimental units to different blocks
in such a way that the assigned experimental units within a block may be homogeneous. The term,
balancing in a research design refers to the grouping, blocking and assignment of experimental
units to the different treatments in such a way that the resulting design appears to be a balanced
one. A design, to be statistically and experimentally sound must possess the property of local
control.

Q. 6 Explain and illustrate the following research designs:

 Two group simple randomized design;


The simplest of all experimental designs is the two-group posttest-only randomized experiment.
In design notation, it has two lines -- one for each group -- with an R at the beginning of each line
to indicate that the groups were randomly assigned. One group gets the treatment or program (the
X) and the other group is the comparison group and doesn't get the program (note that we
could alternatively have the comparison group receive the standard or typical treatment, in which
case this study would be a relative comparison).

Notice that a pretest is not required for this design. Usually we include a pretest in order to
determine whether groups are comparable prior to the program, but because we are using random
assignment we can assume that the two groups are probabilistically equivalent to begin with and
the pretest is not required (although we'll see with covariance designs that a pretest may still be
desirable in this context).

In this design, we are most interested in determining whether the two groups are different after the
program. Typically we measure the groups on one or more measures (the Os in notation) and we
compare them by testing for the differences between the means using a t-test or one way Analysis
of Variance (ANOVA).
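A minimal sketch of this design (assuming Python with NumPy and SciPy available): participants are randomly assigned, one group receives the program, and the posttest means are compared with an independent-samples t-test. The group sizes, score distributions and treatment effect are hypothetical.

import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# R: randomly assign 60 hypothetical participants to two groups of 30.
ids = rng.permutation(60)
treatment_ids, control_ids = ids[:30], ids[30:]

# O: hypothetical posttest scores; the program is assumed to add about 5 points.
control_scores = rng.normal(loc=50, scale=10, size=30)
treatment_scores = rng.normal(loc=55, scale=10, size=30)

# Compare the two groups after the program with an independent-samples t-test.
t_stat, p_value = stats.ttest_ind(treatment_scores, control_scores)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")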

 Latin square design;

In combinatorics and in experimental design, a Latin square is an n × n array filled with n different
symbols, each occurring exactly once in each row and exactly once in each column.

The name "Latin square" is motivated by mathematical papers by Leonhard Euler, who used Latin
characters as symbols. Of course, other symbols can be used instead of Latin letters: in the above
example, the alphabetic sequence A, B, C can be replaced by the integer sequence 1, 2, 3.
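A minimal construction sketch (plain Python, standard library only): a cyclic shift guarantees that each treatment appears exactly once in every row and every column, and randomly permuting rows and columns preserves that property. The five treatments A-E are hypothetical.

import random

def latin_square(treatments):
    n = len(treatments)
    # Cyclic shift: cell (i, j) gets treatment (i + j) mod n.
    square = [[treatments[(i + j) % n] for j in range(n)] for i in range(n)]
    random.shuffle(square)               # randomize the row order
    cols = list(range(n))
    random.shuffle(cols)                 # randomize the column order
    return [[row[c] for c in cols] for row in square]

for row in latin_square(["A", "B", "C", "D", "E"]):
    print(" ".join(row))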

Facts about the LS Design


-With the Latin Square design we are able to control variation in two directions.
-Treatments are arranged in rows and columns
-Each row contains every treatment.
-Each column contains every treatment.
-The most common sizes of LS are 5x5 to 8x8
 Random replications design;

The limitation of the two-group randomized design is usually eliminated within the random
replications design. In a random replications design, the effects of extraneous variables are
minimized (or reduced) by providing a number of repetitions for each treatment. Each repetition
is technically called a 'replication'. The random replications design serves two purposes: it
provides controls for the differential effects of the extraneous independent variables and, secondly,
it randomizes any individual differences among those conducting the treatments.

 Simple factorial design

An experiment may be designed to focus attention on a single independent variable or factor. An
alternative approach is to study the influence of one independent variable in conjunction with
variations in one or more additional independent variables; such a design is called a factorial
design, and the simplest case, with two factors at two levels each, is a simple (2 x 2) factorial
design. We can study not only the effects of the two independent variables separately but also how
they combine to influence the dependent variable.

A factorial design actually consists of a set of single-factor experiments. A factorial design
produces three important pieces of information: the simple effects, the interaction effects, and the
main effects.
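A minimal sketch (plain Python) of those three pieces of information, computed from hypothetical cell means of a 2 x 2 design; the factors, levels and numbers are invented for illustration.

# Factor A: teaching method (a1, a2); Factor B: class size (b1, b2).
cell_means = {
    ("a1", "b1"): 60, ("a1", "b2"): 70,
    ("a2", "b1"): 65, ("a2", "b2"): 85,
}

mean_a1 = (cell_means[("a1", "b1")] + cell_means[("a1", "b2")]) / 2
mean_a2 = (cell_means[("a2", "b1")] + cell_means[("a2", "b2")]) / 2
mean_b1 = (cell_means[("a1", "b1")] + cell_means[("a2", "b1")]) / 2
mean_b2 = (cell_means[("a1", "b2")] + cell_means[("a2", "b2")]) / 2

main_effect_a = mean_a2 - mean_a1          # effect of A averaged over B
main_effect_b = mean_b2 - mean_b1          # effect of B averaged over A
simple_effect_a_at_b1 = cell_means[("a2", "b1")] - cell_means[("a1", "b1")]
simple_effect_a_at_b2 = cell_means[("a2", "b2")] - cell_means[("a1", "b2")]
interaction = simple_effect_a_at_b2 - simple_effect_a_at_b1   # does A's effect depend on B?

print(main_effect_a, main_effect_b, interaction)   # 10.0 15.0 10.0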

Q.7 Explain the meaning of the following in context of Research design.

Extraneous variables

Extraneous variables are variables other than the independent variable that may have an effect on
the behavior of the subject being studied. The examples below relate to the participants themselves
rather than to the place in which the experiment takes place: gender, ethnicity, social class, genetics,
intelligence, age.
A variable is extraneous only when it can be assumed to influence the dependent variable. It
introduces noise but doesn't systematically bias the results.

Extraneous Variables are undesirable variables that influence the relationship between the
variables that an experimenter is examining. Another way to think of this is that these are variables
that influence the outcome of an experiment, though they are not the variables that are actually of
interest. These variables are undesirable because they add error to an experiment. A major goal in
research design is to decrease or control the influence of extraneous variables as much as possible.

Confounded relationship;

In statistics, a confounding variable (also confounding factor, hidden variable, lurking variable,
confound, or confounder) is an extraneous variable in a statistical model that correlates
(positively or negatively) with both the dependent variable and the independent variable. Such a
relation between two observed variables is termed a spurious relationship. In risk assessments
evaluating the magnitude and nature of risk to human health, it is important to control for
confounding to isolate the effect of a particular hazard such as a food additive, pesticide, or new
drug. For prospective studies, it is difficult to recruit and screen for volunteers with the same
background (age, diet, education, geography, etc.), and in historical studies, there can be similar
variability.

Research hypothesis;

A research hypothesis is the statement created by researchers when they speculate upon the
outcome of a research or experiment. Every true experimental design must have this statement at
the core of its structure, as the ultimate aim of any experiment. The hypothesis is generated via a
number of means, but is usually the result of a process of inductive reasoning where observations
lead to the formation of a theory. Scientists then use a large battery of deductive methods to arrive
at a hypothesis that is testable, falsifiable and realistic.
The research hypothesis is a paring down of the problem into something testable and falsifiable.
For example, a researcher studying declining fish stocks might speculate that the decline is
due to prolonged over-fishing. Scientists must generate a realistic and testable hypothesis around
which they can build the experiment.

A hypothesis must be testable, but must also be falsifiable for its acceptance as true science.

A scientist who becomes fixated on proving a research hypothesis loses their impartiality and
credibility. Statistical tests often uncover trends, but rarely give a clear-cut answer, with other
factors often affecting the outcome and influencing the results. A hypothesis must be testable,
taking into account current knowledge and techniques, and be realistic.

Experimental and Control groups

In a controlled experiment, there are two groups. The control group is a group that nothing happens
to. The experimental group is the group that we subject to the variable with which we are
experimenting. At the end of the experiment, we test the differences between the control group,
for whom nothing happened, and the experimental group, which received the variable. The
difference (or similarities) between the two groups is how our results are measured.

The difference between a control group and an experimental group is one group is exposed to the
conditions of the experiment and the other is not. An experimental group is the group in a scientific
experiment where the experimental procedure is performed. This group is exposed to the
independent variable being tested and the changes observed and recorded. A control group is a
group separated from the rest of the experiment where the independent variable being tested cannot
influence the results. This isolates the independent variable's effects on the experiment and can
help rule out alternate explanations of the experimental results.

While all experiments have an experimental group, not all experiments require a control group.
Controls are extremely useful where the experimental conditions are complex and difficult to
isolate. Experiments that use control groups are called controlled experiments.
There are two other types of control groups, in which the conditions the group is subjected to will
produce a predetermined result.
Positive control groups are control groups where the conditions guarantee a positive result.
Positive control groups are effective to show the experiment is functioning as planned.
Negative control groups are control groups where conditions produce a negative outcome.
Negative control groups help identify outside influences which may be present that were not
accounted for, such as contaminants.

Q.9 (a) What do we mean by 'sample design'? What points should be taken into
consideration by a researcher in developing a sample design for a research project?

A sample design is a definite plan, determined before any data are actually collected, for obtaining
a sample from a given population. Researchers rarely survey the entire population because the cost
of a census is too high. The three main advantages of sampling are that the cost is lower, data
collection is faster, and since the data set is smaller it is possible to ensure homogeneity and to
improve the accuracy and quality of the data.

Each observation measures one or more properties (such as weight, location, color) of observable
bodies distinguished as independent objects or individuals. In survey sampling, weights can be
applied to the data to adjust for the sample design, particularly stratified sampling (blocking).
Results from probability theory and statistical theory are employed to guide practice. In business
and medical research, sampling is widely used for gathering information about a population.

(b) How would we differentiate between simple random sampling and complex random
sampling designs? Explain clearly giving examples.

In a simple random sample ('SRS') of a given size, all subsets of the frame of that size are given an equal
probability. Each element of the frame thus has an equal probability of selection: the frame is not
subdivided or partitioned. Furthermore, any given pair of elements has the same chance of
selection as any other such pair (and similarly for triples, and so on). This minimizes bias and
simplifies analysis of results. In particular, the variance between individual results within the
sample is a good indicator of variance in the overall population, which makes it relatively easy to
estimate the accuracy of results.

However, SRS can be vulnerable to sampling error because the randomness of the selection may
result in a sample that doesn't reflect the makeup of the population. For instance, a simple random
sample of ten people from a given country will on average produce five men and five women, but
any given trial is likely to over represent one sex and underrepresent the other. Systematic and
stratified techniques, discussed below, attempt to overcome this problem by using information
about the population to choose a more representative sample.

SRS may also be cumbersome and tedious when sampling from an unusually large target
population. In some cases, investigators are interested in research questions specific to subgroups
of the population. For example, researchers might be interested in examining whether cognitive
ability as a predictor of job performance is equally applicable across racial groups. SRS cannot
accommodate the needs of researchers in this situation because it does not provide subsamples of
the population. Stratified sampling, which is discussed below, addresses this weakness of SRS.
Simple random sampling is always an EPS design (equal probability of selection), but not all EPS
designs are simple random sampling.
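The contrast can be made concrete with a minimal sketch (plain Python, standard library only): a simple random sample of a hypothetical employee frame versus a stratified random sample drawn from the same frame, one of the complex designs discussed below. The frame of 1,000 employees in two departments is invented for illustration.

import random
from collections import Counter

random.seed(1)
frame = [("emp%03d" % i, "sales" if i < 700 else "engineering") for i in range(1000)]

# Simple random sampling: every subset of size 100 is equally likely,
# so the departmental mix of the sample varies from draw to draw.
srs = random.sample(frame, 100)
print("SRS composition:", Counter(dept for _, dept in srs))

# Stratified random sampling: partition the frame by department, then draw a
# proportional simple random sample independently within each stratum.
strata = {}
for unit, dept in frame:
    strata.setdefault(dept, []).append((unit, dept))

stratified = []
for dept, units in strata.items():
    k = round(100 * len(units) / len(frame))      # proportional allocation
    stratified.extend(random.sample(units, k))
print("Stratified composition:", Counter(dept for _, dept in stratified))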

Q.10 (a) under what circumstances stratified random sampling design is considered
appropriate? How would we select such sample? Explain by means of an example.

Where the population embraces a number of distinct categories, the frame can be organized by
these categories into separate "strata." Each stratum is then sampled as an independent sub-
population, out of which individual elements can be randomly selected. There are several potential
benefits to stratified sampling.

First, dividing the population into distinct, independent strata can enable researchers to draw
inferences about specific subgroups that may be lost in a more generalized random sample.
Second, utilizing a stratified sampling method can lead to more efficient statistical estimates
(provided that strata are selected based upon relevance to the criterion in question, instead of
availability of the samples). Even if a stratified sampling approach does not lead to increased
statistical efficiency, such a tactic will not result in less efficiency than would simple random
sampling, provided that each stratum is proportional to the group's size in the population.

Third, it is sometimes the case that data are more readily available for individual, pre-existing
strata within a population than for the overall population; in such cases, using a stratified sampling
approach may be more convenient than aggregating data across groups (though this may
potentially be at odds with the previously noted importance of utilizing criterion-relevant strata).
Finally, since each stratum is treated as an independent population, different sampling approaches
can be applied to different strata, potentially enabling researchers to use the approach best suited
(or most cost-effective) for each identified subgroup within the population.

There are, however, some potential drawbacks to using stratified sampling. First, identifying strata
and implementing such an approach can increase the cost and complexity of sample selection, as
well as leading to increased complexity of population estimates. Second, when examining multiple
criteria, stratifying variables may be related to some, but not to others, further complicating the
design, and potentially reducing the utility of the strata. Finally, in some cases (such as designs
with a large number of strata, or those with a specified minimum sample size per group), stratified
sampling can potentially require a larger sample than would other methods (although in most cases,
the required sample size would be no larger than would be required for simple random sampling).

A stratified sampling approach is most effective when three conditions are met:

1. Variability within strata is minimized
2. Variability between strata is maximized
3. The variables upon which the population is stratified are strongly correlated with the desired
dependent variable.

A small sketch of how a fixed sample can be allocated across strata follows.
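This allocation step can be sketched in plain Python (all population sizes and standard deviations are invented). Proportional allocation uses only the stratum sizes, while Neyman (optimum) allocation, a standard refinement not described in the text above, also uses the within-stratum variability and therefore assigns more of the sample to the more variable strata.

strata = {
    # stratum: (population size N_h, within-stratum standard deviation S_h)
    "small firms":  (8000, 5.0),
    "medium firms": (1500, 20.0),
    "large firms":  (500, 80.0),
}
n = 400  # total sample size to allocate

total_N = sum(N for N, _ in strata.values())
total_NS = sum(N * S for N, S in strata.values())

for name, (N, S) in strata.items():
    proportional = n * N / total_N            # n_h proportional to N_h
    neyman = n * (N * S) / total_NS           # n_h proportional to N_h * S_h
    print(f"{name:13s} proportional={proportional:6.1f}  neyman={neyman:6.1f}")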
(b) “A systematic bias results from errors in the sampling procedures”. What do we mean
by such a systematic bias? Describe the important causes responsible for such a bias.

In measurement theory, "bias" (or "systematic error") is a difference between the expectation of a
measurement and the true underlying value. Bias can result from calibration errors or instrumental
drift, for example. Contrast this usage with the previous: here, a bias is a property of a
measurement, which is a physical process, whereas before it was a property of a statistical
estimator (which is a mathematically defined procedure to make guesses from data).

"Systematic bias" appears to be used only when distinguishing bias from random "error": the term
"error" tends to be used primarily for random terms with zero expectation.

In many cases, bias in the first sense decreases as the amount of data increases: many biased
estimators in practice become less and less biased with more data (although this is not theoretically
guaranteed, because the concept of bias is so broad). A good example is the maximum likelihood
estimator of the variance of a distribution when n independent draws x_i from that distribution are
available. The ML estimator is

    v̂ = (1/n) Σ_{i=1}^{n} (x_i − x̄)²,  where  x̄ = (1/n) Σ_{i=1}^{n} x_i.

It is well known that this is biased; the estimator (n/(n−1)) v̂ is unbiased. Since the factor
n/(n−1) tends to 1 as n → ∞, v̂ becomes asymptotically unbiased.
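A minimal simulation sketch of this point (assuming NumPy is available): the ML variance estimator is biased downward by the factor (n − 1)/n, and the bias shrinks as n grows. The normal distribution, the true variance of 4 and the sample sizes are arbitrary choices for illustration.

import numpy as np

rng = np.random.default_rng(0)
true_var = 4.0

for n in (5, 20, 100):
    draws = rng.normal(0.0, np.sqrt(true_var), size=(50_000, n))
    v_ml = draws.var(axis=1, ddof=0)        # ML estimator: divides by n (biased)
    v_unb = draws.var(axis=1, ddof=1)       # divides by n - 1 (unbiased)
    print(n, round(v_ml.mean(), 2), round(v_unb.mean(), 2))
# The ML column comes out close to (n - 1)/n * 4 and approaches 4 as n grows,
# while the unbiased column stays near 4 for every n.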

Bias in the measurement context (the second sense), however, is usually not reducible by taking
more measurements: the bias is inherent in the measurement procedure itself. One has to estimate
and reduce the bias by calibrating the measurement procedure or comparing it to other procedures
known to have no (or less) bias, estimating the bias, and compensating for that.
ASSIGNMENT-II

Q. 1 (a) What is the meaning of measurement in research? What difference does it make
whether we measure in terms of a nominal, ordinal, interval or ratio scale? Explain giving
examples.

Measurement is at the core of doing research. Measurement is the assignment of numbers to things.
In almost all research, everything has to be reduced to numbers eventually. Precision and exactness
in measurement are vitally important. The measures are what are actually used to test the
hypotheses. A researcher needs good measures for both independent and dependent variables.

Measurement consists of two basic processes called conceptualization and operationalization, then
an advanced process called determining the levels of measurement, and then even more advanced
methods of measuring reliability and validity.

A level of measurement is the precision by which a variable is measured. For 50 years, with few
detractors, science has used the Stevens (1951) typology of measurement levels. There are three
things to remember about this typology: (1) anything that can be measured falls into one of the
four types; (2) the higher the type, the more precision in measurement; and (3) every level up
contains all the properties of the previous level. The four levels of measurement, from lowest to
highest, are:

 Nominal
 Ordinal
 Interval
 Ratio

The nominal level of measurement describes variables that are categorical in nature. The
characteristics of the data we're collecting fall into distinct categories. If there are a limited number
of distinct categories (usually only two), then we're dealing with a discrete variable. If there are an
unlimited or infinite number of distinct categories, then we're dealing with a continuous variable.
Nominal variables include demographic characteristics like sex, race, and religion.
The ordinal level of measurement describes variables that can be ordered or ranked in some order
of importance. It describes most judgments about things, such as big or little, strong or weak. Most
opinion and attitude scales or indexes in the social sciences are ordinal in nature.

The interval level of measurement describes variables that have more or less equal intervals, or
meaningful distances between their ranks. For example, if we were to ask somebody if they were
first, second, or third generation immigrant, the assumption is that the distance, or number of years,
between each generation is the same. All crime rates in criminal justice are interval level measures,
as is any kind of rate.

The ratio level of measurement describes variables that have equal intervals and a fixed zero (or
reference) point. It is possible to have zero income, zero education, and no involvement in crime,
but rarely do we see ratio level variables in social science since it's almost impossible to have zero
attitudes on things, although "not at all", "often", and "twice as often" might qualify as ratio level
measurement.
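A minimal sketch (plain Python, standard library only) of which summary statistics are meaningful at each of the four levels; the data are invented and purely illustrative.

from statistics import mode, median, mean

religion = ["Hindu", "Muslim", "Christian", "Hindu"]   # nominal: only counts/mode make sense
satisfaction = [1, 3, 2, 3]        # ordinal codes (1=low, 2=medium, 3=high): median is meaningful
temperature_c = [20.0, 25.0, 30.0] # interval: differences and means are meaningful, ratios are not
income = [0, 25_000, 50_000]       # ratio: a true zero exists, so ratios are meaningful

print(mode(religion))              # most common category
print(median(satisfaction))        # middle rank
print(mean(temperature_c))         # mean and differences are interpretable
print(income[2] / income[1])       # 2.0: "twice as much" makes sense only at ratio level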

(b) Point out the possible sources of error in measurement. Describe the tests of sound
measurement.

In principle, every operation of a survey is a potential source of measurement error. Some


examples of causes of measurement error are non-response, badly designed questionnaires,
respondent bias and processing errors. The sections that follow discuss the different causes of
measurement errors.

Measurement errors can be grouped into two main causes, systematic errors and random errors.
Systematic error (called bias) makes survey results unrepresentative of the target population by
distorting the survey estimates in one direction. For example, if the target population is the entire
population in a country but the sampling frame is just the urban population, then the survey results
will not be representative of the target population due to systematic bias in the sampling frame.
On the other hand, random error can distort the results on any given occasion but tends to balance
out on average.

Measurements may display good precision but yet be very inaccurate. Often, we may find that
measurements are usually consistently too high or too low from the accurate value. The
measurements are systematically wrong. For example, weighing ourselves with a scale that has a
bad spring inside of it will yield weight measurements that are inaccurate, either too high or too
low, even though the weight measurements may seem reasonable. We might, for example, notice
that we seem to gain weight after Thanksgiving holiday and lose weight after being ill. These
observations may in fact be true but the actual value for weight will be inaccurate.
This is an example of a systematic error. Measurements are systematically wrong. It is often very
difficult to catch systematic errors and usually requires that we understand carefully the workings
of measuring device(s).

Some of the types of measurement error are outlined below:

 Failure to identify the target population


 Non-response bias
 Questionnaire design
 Interviewer bias
 Respondent bias
 Processing errors
 Misinterpretation of results

Q. 2 Discuss the relative merits and demerits of: Rating vs Ranking scales?

Sometimes we use the words ranking and rating interchangeably, even though there is a distinction.
The difference is simple: a rating question asks us to compare different items using a common
scale (e.g., "Please rate each of the following items on a scale of 1-10, where 1 is 'not at all
important' and 10 is 'very important'"), while a ranking question asks us to compare different
items directly to one another (e.g., "Please rank each of the following items in order of importance,
from the #1 most important item through the #10 least important item"). Both types of questions
have their strengths and weaknesses.

Rating Questions

 Pros
   o Commonly used and easily understood by respondents
   o Allow respondents to assign the same rating to more than one item
 Cons
   o Often have a narrow distribution of ratings, which typically fall into an upper band (for instance, most items are considered important when using importance scales)
   o Lead to less differentiation among items, with the possibility that a respondent rates every item identically
   o Accept great personal variations in response styles (e.g., respondents who never assign the highest rating)
   o Produce possibly spurious positive correlations due to individuals' personal variations
   o Matrix questions are tedious and lead to satisficing

Ranking Questions

 Pros
   o Guarantee that each item ranked has a unique value
 Cons
   o Force respondents to differentiate between items that they may regard as equivalent
   o Emphasize items earlier in the list, which are more likely to be ranked highest
   o Return different results depending on the completeness of the list of items being ranked
   o Limit the range of statistical analysis available: rankings should not be analyzed as averages, as ranking questions do not measure the distance between two subsequent choices (which might be nearer or farther from each other than from other choices)
   o Can confuse respondents if numeric rating scales with 1 being the lowest rating are used elsewhere in the questionnaire (so fully labelled scales should be used on the rating questions instead)
   o Take on average three times longer to answer than rating questions (Munson and McIntyre, 1979)
   o Mentally tax respondents, requiring them to compare multiple items against one another
   o Increase the difficulty of answering disproportionately as choices are added
 Summated vs. Cumulative scales;
There are 3 principal types of attitude scales: interval scales, cumulative (or Guttman), and
summated rating scales. Most commonly employed in education research are summated rating
scales. A summated rating scale is comprised of a set of attitude items, all of which are considered
of approximately equal 'attitude value,' and to each of which participants respond with varying
degrees of intensity (e.g., 5 to 7 points) on ordinal measures.

The scores of the items on such a scale are summed, or summed and averaged, to yield an
individual‘s attitude score. The purpose of the summated rating scale is to place an individual
somewhere on a continuum of the attitude in question.
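A minimal sketch (plain Python) of this scoring rule, applied to invented responses from three respondents on a hypothetical four-item, 5-point summated rating scale.

respondents = {
    # hypothetical responses to a 4-item attitude scale, each coded 1-5
    "R1": [4, 5, 4, 3],
    "R2": [2, 1, 2, 2],
    "R3": [3, 3, 4, 3],
}

for person, answers in respondents.items():
    total = sum(answers)                 # summed attitude score
    average = total / len(answers)       # summed-and-averaged attitude score
    print(f"{person}: summed score = {total}, averaged score = {average:.2f}")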

There are 2 types of summated ratings scales: the Likert-type scale and the semantic differential
scale.
A cumulative or Guttman scale consists of a relatively small set of homogenous items that are
measured with dichotomous response sets. It is used more frequently to assess beliefs or
knowledge. The scale is created from items in some natural progression, such as level of difficulty
for an examination, wherein a difficult item or problem would be placed first, followed by a less
difficult problem, followed by an even easier one. One would suspect that few students who missed
the first problem would miss either of the latter 2. The responses to a Guttman scale are used to
rank individuals. The Guttman scale has been highly criticized by psychologists, particularly in
the measurement of attitudes.

 Scalogram analysis vs Factor analysis

Cumulative scales, or Louis Guttman's scalogram analysis, like other scales, consist of a series of
statements to which a respondent expresses agreement or disagreement. The technique
developed by Louis Guttman is known as scalogram analysis, or at times simply scale analysis.
Scalogram analysis refers to the procedure for determining whether a set of items forms a one-
dimensional scale. Under this technique, the respondents are asked to indicate in respect of each
item whether they agree or disagree with it; if these items form a one-dimensional scale, the
responses fall into a cumulative pattern in which a respondent who agrees with a more extreme
item also agrees with all less extreme items.
Factor analysis is used to uncover the latent structure (dimensions) of a set of variables. Factor
analysis originated a century ago with Charles Spearman's attempts to show that a wide variety of
mental tests could be explained by a single underlying intelligence factor. Factor analysis is used
to:

 reduce attribute space from a larger number of variables to a smaller number of factors;
 reduce a large number of variables to a smaller number of factors for data modeling;
 validate a scale or index by demonstrating that its constituent items load on the same factor, and to drop proposed scale items which cross-load on more than one factor;
 select a subset of variables from a larger set, based on which original variables have the highest correlations with the principal component factors;
 create a set of factors to be treated as uncorrelated variables as one approach to handling multicollinearity in such procedures as multiple regression.

A small sketch of the scale-validation use is given below.
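A minimal sketch of that use (assuming NumPy and scikit-learn are installed): six invented test items are generated from two latent abilities and then reduced to two factors, so items built from the same ability load on the same factor. All data and loadings here are simulated, not real test results.

import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
n = 500
verbal = rng.normal(size=n)     # hypothetical latent verbal ability
numeric = rng.normal(size=n)    # hypothetical latent numerical ability

# Three items load on the verbal factor, three on the numeric factor.
items = np.column_stack([
    verbal + 0.4 * rng.normal(size=n),
    verbal + 0.4 * rng.normal(size=n),
    verbal + 0.4 * rng.normal(size=n),
    numeric + 0.4 * rng.normal(size=n),
    numeric + 0.4 * rng.normal(size=n),
    numeric + 0.4 * rng.normal(size=n),
])

fa = FactorAnalysis(n_components=2, random_state=0).fit(items)
print(np.round(fa.components_, 2))   # loadings: each row is a factor, each column an item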

Q. 3 (a) Describe the different methods of scale construction, pointing out the merits and
demerits of each.

In the social sciences, scaling is the process of measuring or ordering entities with respect to
quantitative attributes or traits. For example, a scaling technique might involve estimating
individuals' levels of extraversion, or the perceived quality of products. Certain methods of scaling
permit estimation of magnitudes on a continuum, while other methods provide only for relative
ordering of the entities.

Methods of Scale Construction

 Empirical: Ask items that discriminate known groups
   – People in general versus a specific group
   – Choose items that are maximally independent and that have the highest validities
 Rational: Ask items with direct content relevance
 Theoretical: Ask items with theoretical relevance
 Homogeneous: Select items to represent a single domain

(b) “Scaling describes the procedures by which numbers are assigned to various degrees of
opinion, attitude and other concepts.” Discuss. Also point out the bases for scale
classification.

The term scaling is applied to the attempts to measure the attitude objectively. Attitude is a
resultant of number of external and internal factors. Depending upon the attitude to be measured,
appropriate scales are designed.

Scaling is a technique used for measuring qualitative responses of respondents such as those
related to their feelings, perception, likes, dislikes, interests and preferences. The number of
assigning procedures or the scaling procedures may be broadly classified on the following bases:

 Subject orientation

Under it a scale may be designed to measure characteristics of the respondent who completes it or
to judge the stimulus object which is presented to the respondent. We presume that the stimuli
presented are sufficiently homogeneous so that the between stimuli variation is small as compared
to the variation among respondents. In the latter approach, we ask the respondent to judge some
specific object in terms of one or more dimensions and we presume that the between respondent
variation will be small as compared to the variation among the different stimuli presented to
respondents for judging.

 Response form

Under this we may classify the scales as categorical and comparative.

Categorical scales are also known as rating scales. These scales are used when a respondent scores
some object without direct reference to other objects.
Under comparative scales, which are also known as ranking scales, the respondent is asked to
compare two or more objects.

 Degree of subjectivity

With this basis the scale data may be classified on the basis of whether we measure subjective personal preferences
or simply make non-preference judgments. In the former case the respondent is asked to choose which person he
favours or which solution he would like to see employed, whereas in the latter case he is simply
asked to judge which person is more effective in some aspect or which solution will take fewer
resources without reflecting any personal preference.

 Scale properties
 Nominal Scales
 Ordinal Scales
 Interval Scales and
 Ratio Scales
 Number of dimensions

In respect of this basis, scales can be classified as a. one-dimensional and b. multidimensional


scales.

Under the former we measure only one attribute of the respondent or object, whereas
multidimensional scaling recognizes that an object might be described better by using the concept
of an attribute space of 'n' dimensions, rather than a single-dimension continuum.

 Scale construction techniques

Following are the five main techniques by which scales can be developed.

 Arbitrary approach
 Consensus approach
 Item analysis approach
 Cumulative scales
 Factor scales

Q. 4 (a) Examine the merits and limitations of the observation method in collecting material.
Illustrate your answer with suitable examples.

Advantages of Observational Methods:

Observation forms the basis of any scientific enquiry. It is the primary mode of acquiring
knowledge about the environment. Through systematic observation, and a process of induction,
the investigator forms hypotheses, which are tested later by using the experimental method. The
experimental and other laboratory-based methods study behaviors under artificially controlled
conditions. But through observational method, the investigator gets a real picture of the behaviors
and the events as they manifest in natural settings. Systematic and unbiased observation can yield
a true picture of individual's natural set of behaviors. Certain phenomena can be accessed and
properly understood only through observation. Crowd behavior, social behaviors of the animals,
and mother-child interaction at home are some exemplary situations, which can be meaningfully
assessed, and understood only through observation.

Disadvantages of Observational Method:

The major problem with observational methods is that the investigator has little control over the
situation he is interested to observe. In the natural setting, too many extraneous factors influence
the phenomenon. As a result, it is difficult to assess what causes or determines the behaviors of
researcher's interest. It is extremely difficult, and sometimes impossible to establish cause-and-
effect relationships in our understanding of the behaviors. The observational report in most cases
turns out to be descriptions of events rather than explanations for the event that can be used for
prediction and control.
In many cases the observer has to wait until the appropriate event takes place. To study crowd
behavior, the investigator would have to wait until a crowd is formed in a natural setting.
Therefore, some types of observations are time-consuming, and labor-intensive. Observer-bias is
one of the important problems in observational research. The personal philosophy, attitudes,
beliefs, convictions, and sometimes the personal interests of the observer are most likely to color
his perceptions of the event. His observational report may in part reflect his biases in describing
and interpreting the event. Thus, the description may not reflect the true features of an event.

(b) How does the case study method differ from the survey method? Analyze the merits and
limitations of case study method in sociological research.

Difference between the case study method and the survey method

 Case study method


 Survey Method

A case study examines one case, probably a person, in detail and follows it through some period
of time. A survey involves many different individual things or people, not studied in as much detail
or over as much time. The case study method is qualitative in nature.

Survey research seeks to identify what large numbers of people (mass) think or feel about certain
things. It is used extensively in politics and marketing (such as TV advertising). It is quantitative
nature

The case method is used to collect:

 Detailed past history of the study unit, its current situation and further planning

Survey research is often used to collect:

 Public opinion on any specific topic through polls, mail surveys, telephone surveys and consumer surveys (in the mall)
A case study has chances of being biased:

 When the researcher purposively selects the respondents,
 When respondents give false information.

Surveys are often considered biased because

 They ask leading questions


 The sample population is biased in a particular way
 The questions were not clear
 The respondents were influenced by the researcher

It is difficult to quantify the margin of error in case study because it is based on qualitative
techniques.

An amazing fact about survey research is that the amount of error (expressed as plus and minus a
certain percentage) is determined by the sample size (the number of people surveyed). Most
opinion polls use a sample size of around 1500, which has a margin of error of 3%.
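As a rough illustration, the often-quoted margin for a poll of about 1,500 people can be checked with the usual large-sample formula for the margin of error of a proportion. The sketch below is only illustrative; the 95% z-value of 1.96 and the worst-case proportion p = 0.5 are conventional assumptions, not figures taken from the text above.

import math

def margin_of_error(n, p=0.5, z=1.96):
    # Half-width of a 95% confidence interval for a proportion,
    # using the worst case p = 0.5 unless stated otherwise.
    return z * math.sqrt(p * (1 - p) / n)

print(round(margin_of_error(1500) * 100, 1))  # about 2.5%, commonly rounded and quoted as roughly 3%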

Merits of Case study

 Develops analytic and problem-solving skills
 Allows for exploration of solutions to complex issues
 Allows the student to apply new knowledge and skills

Limitations of case study

 May not see relevance to one's own situation
 Insufficient information can lead to inappropriate results
 Not appropriate for the elementary level
 With a researcher observing the subject closely, the subject is likely to change its behavior
Q. 5 Clearly explain the difference between collection of data through questionnaires and
schedules. What are the guiding considerations in the construction of questionnaire?
Explain.

The main tool that is used in survey research is questionnaire. A questionnaire is a formal list of
questions designed to gather responses from respondents on a given topic.

Questionnaires versus Schedules

1. Questionnaires are generally sent through mail to informants to be answered; the schedule is generally filled out by the research worker or the enumerator.

2. Questionnaires are relatively cheap for collecting data; schedules are more expensive.

3. Non-response is usually high in the case of questionnaires; non-response is generally very low in the case of schedules.

4. With questionnaires, personal contact with respondents is not possible; with schedules, personal contact is established with respondents.

5. Questionnaires can be used only for literate respondents; schedules can be used for illiterate respondents also.

6. Questionnaires can cover wider areas; schedules cannot cover wider areas as easily.

7. With questionnaires, physical observation of field areas or respondents is not possible; with schedules, such observation is possible.

Q. 6 (a) How does the case study method differ from the survey method? Analyse the merits
and limitations of case study method in sociological research.

The difference between case study and survey method is included in Q. 4 (b) of Research
Methodology Assignment – I
(b) Distinguish between an experiment and survey. Explain fully the survey method of
research.

Survey vs Experiment

Survey and experiment may appear to be one and the same thing when seen superficially, but an in-depth study of these two terms reveals a very different story. When a businessman wants to market his products, it is a survey he needs and not an experiment; similarly, a scientist who has discovered a new element or a new drug needs an experiment to prove its usefulness, not a survey. A survey gathers the opinions of different people about a particular product or a particular issue, whereas an experiment is a comprehensive study of something intended to prove it scientifically.

A survey is often conducted by volunteers or by the employees of a company so that the usefulness of a product to the consumer can be established, but an experiment on the same product is conducted by a qualified person, such as a scientist, so that the effectiveness of the product and its safety for the consumer can be ensured. A survey involves analysis of data amassed by the volunteers regarding the product, or regarding opinion in the case of an issue, but an experiment zeroes in on the figures that are obtained when the product is put to different tests.

Both survey and experiment can at times be mistaken as the same by a layman, but they are definitely poles apart. A survey is conducted on a mass scale with lots of data, but an experiment does not require mass data, relying instead on a smaller, carefully controlled set of observations. The results of a survey are less dependable, as they are simply opinions and may show a certain bias, whereas the results of an experiment are confirmed results that reflect the true nature of the product. Hence it can be said that the survey is a mere shadow whereas the experiment is the true reflection.

Q. 7 (a) “Processing of data implies editing, coding, classification and tabulation”. Describe
in brief these four operations pointing out the significance of each in context of research
study.
Processing of data--editing, coding, classification and tabulation

After collecting data, the process of converting raw data into meaningful statements includes data processing, data analysis, and data interpretation and presentation. Data reduction or processing mainly involves the various manipulations necessary for preparing the data for analysis. The process (of manipulation) could be manual or electronic. It involves editing, categorizing the open-ended questions, coding, computerization and preparation of tables and diagrams.
Editing data:

Information gathered during data collection may lack uniformity. Example: Data collected through
questionnaire and schedules may have answers which may not be ticked at proper places, or some
questions may be left unanswered. Sometimes information may be given in a form which needs
reconstruction in a category designed for analysis, e.g., converting daily/monthly income in annual
income and so on. The researcher has to take a decision as to how to edit it.

Coding of data:

Coding is translating answers into numerical values or assigning numbers to the various categories
of a variable to be used in data analysis. Coding is done by using a code book, code sheet, and a
computer card. Coding is done on the basis of the instructions given in the codebook. The code
book gives a numerical code for each variable.

Data classification/distribution:

Sarantakos (1998: 343) defines distribution of data as a form of classification of the scores obtained for the various categories of a particular variable. There are four types of distributions:
1. Frequency distribution
2. Percentage distribution
3. Cumulative distribution
4. Statistical distributions
Tabulation of data:

After editing, which ensures that the information on the schedule is accurate and categorized in a
suitable form, the data are put together in some kinds of tables and may also undergo some other
forms of statistical analysis.
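A minimal sketch of coding and tabulation; the variable name, categories and numeric codes below are purely hypothetical, chosen only to illustrate the idea of a code book and a simple frequency table.

# Hypothetical code book: each category of the variable "marital_status"
# is assigned a numerical code for analysis.
code_book = {"single": 1, "married": 2, "widowed": 3, "divorced": 4}

# Raw (edited) responses collected on the schedule/questionnaire.
responses = ["married", "single", "married", "divorced", "married", "single"]

# Coding: translate answers into numerical values.
coded = [code_book[r] for r in responses]

# Classification/tabulation: build a simple frequency distribution.
frequency = {}
for value in coded:
    frequency[value] = frequency.get(value, 0) + 1

for code, count in sorted(frequency.items()):
    print(code, count)   # e.g. code 2 (married) occurs 3 times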

(b) Write a brief note on different types of analysis of data pointing out the significance of
each.

There are two types of analysis of data, as follows (a brief illustrative sketch is given after the lists below):

(1) To describe (summarize) the population of interest by describing what was observed in the
(study) sample.
Employs descriptive statistics, which involves:

 Summarizing continuous variables using the mean, standard deviation, range, and
percentiles (including the median)
 Summarizing categorical variables using raw and relative frequencies.

(2) To use patterns in the (study) sample data to draw inferences about the population represented.
Employs inferential statistics, which involves:
 Confidence intervals.
 Hypothesis tests & p-values.
 Correlation and determining associations.
 Determining relationships, estimating, and making predictions using regression analysis.
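The sketch below illustrates both kinds of analysis on a small set of made-up scores (the data and the 95% confidence level are assumptions for illustration only): descriptive summaries first, then a simple confidence interval for the population mean.

import numpy as np
from scipy import stats

scores = np.array([12, 15, 9, 14, 10, 13, 11, 16, 12, 14])  # hypothetical sample

# Descriptive statistics: summarize the sample itself.
mean = scores.mean()
sd = scores.std(ddof=1)
q1, median, q3 = np.percentile(scores, [25, 50, 75])
print(mean, sd, q1, median, q3)

# Inferential statistics: a 95% confidence interval for the population mean,
# using the t-distribution because the population variance is unknown.
n = len(scores)
t_crit = stats.t.ppf(0.975, df=n - 1)
half_width = t_crit * sd / np.sqrt(n)
print(mean - half_width, mean + half_width)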

Q. 8 (a) What do we mean by multivariate analysis? Explain how it differs from bivariate
analysis?
Multivariate analysis is a statistical procedure for the analysis of data involving more than one type of measurement or observation. It may also mean solving problems where more than one dependent variable is analyzed simultaneously with other variables. By contrast, bivariate analysis examines the relationship between just two variables at a time.

Multivariate analysis (MVA) is based on the statistical principle of multivariate statistics, which
involves observation and analysis of more than one statistical variable at a time. In design and
analysis, the technique is used to perform trade studies across multiple dimensions while taking
into account the effects of all variables on the responses of interest.

Uses for multivariate analysis include:

 Design for capability (also known as capability-based design)


 Inverse design, where any variable can be treated as an independent variable
 Analysis of Alternatives (AoA), the selection of concepts to fulfill a customer need
 Analysis of concepts with respect to changing scenarios
 Identification of critical design drivers and correlations across hierarchical levels.

Multivariate analysis can be complicated by the desire to include physics-based analysis to calculate the effects of variables for a hierarchical "system-of-systems." Often, studies that wish to use multivariate analysis are stalled by the dimensionality of the problem. These concerns are often eased through the use of surrogate models, highly accurate approximations of the physics-based code. Since surrogate models take the form of an equation, they can be evaluated very quickly. This becomes an enabler for large-scale MVA studies: while a Monte Carlo simulation across the design space is difficult with physics-based codes, it becomes trivial when evaluating surrogate models, which often take the form of response surface equations.

(b) How will we differentiate between descriptive statistics and inferential statistics?
Describe the important statistical measures often used to summarize the survey/research
data.
Both descriptive and inferential statistics look at a sample from some population. The difference
between descriptive and inferential statistics is in what they do with that sample:

 Descriptive statistics aims to summarize the sample using statistical measures, such as average,
median, standard deviation etc. For example, if we look at a basketball team's game scores
over a year, we can calculate the average score, variance etc. and get a description (a statistical
profile) for that team.
 Inferential statistics aims to draw conclusions about the population from the sample at hand.
For example, it may try to infer the success rate of a drug in treating high temperature, by
taking a sample of patients, giving them the drug, and estimating the rate of effectiveness in
the population using the rate of effectiveness in the sample.

Descriptive Statistics

Descriptive statistics comprises the kind of analyses we use when we want to describe the
population we are studying, and when we have a population that is small enough to permit our
including every case. Descriptive statistics are for describing data on the group we study. Example:
Babbie and Halley's survey for describing own class.

Inferential Statistics

The key to the difference between descriptive and inferential statistics lies in what we can describe, what we can conclude, and whether the sample is representative.

Descriptive statistics can describe the actual sample we study. But to extend our conclusions to a broader population, like all such classes, all workers, all women, we must use inferential statistics, which means we have to be sure the sample we study is representative of the group we want to generalize to.
Inferential statistics are for generalizing our findings to a broader population group. Example:
Babbie and Halley's analysis of SPSS data that can be generalized to the population at large.

Q. 9 What does a measure of central tendency indicate? Describe the important measures
of central tendency pointing out the situation when one measure is considered relatively
appropriate in comparison to other measures.

Measures of central tendency, or "location", attempt to quantify what we mean by the "typical" or "average" score in a data set. The concept is extremely important and we encounter it frequently in daily life. For example, before purchasing a car we often want to know its average distance per litre of petrol.
Or before accepting a job, we might want to know what a typical salary is for people in that position
so we will know whether or not we are going to be paid what we are worth. Or, if we are a smoker,
we might often think about how many cigarettes we smoke "on average" per day. Statistics geared
toward measuring central tendency all focus on this concept of "typical" or "average." As we will
see, we often ask questions in psychological science revolving around how groups differ from each
other "on average". Answers to such a question tell us a lot about the phenomenon or process we
are studying.

The mean, or "average", is the most widely used measure of central tendency. The mean is defined
technically as the sum of all the data scores divided by n (the number of scores in the distribution).
In a sample, we often

http://www.une.edu.au/WebStat/unit_materials/c4_descriptive_statistics/image9.gif
http://www.une.edu.au/WebStat/unit_materials/c4_descriptive_statistics/image10.gif
symbolise the mean with a letter with a line over it. If the letter is "X", then the mean is symbolised
as , pronounced "X-bar." If we use the letter X to represent the variable being measured, then
symbolically, the mean is defined as For example, using the data from above, where the n = 5
values of X were 5, 7, 6, 1, and 8, the mean is (5 + 7 + 6 + 1 + 8) / 5 = 5.4. The mean number of
sexual partners reported by UNE students who responded to the question is, from Figure 4.1, (1 +
0 + 2 + 4 + . . . + 0 + 6 + 2 + 2)/ 177 = 1.864. Note that this is higher than both the mode and the
median. In a positively skewed distribution, the mean will be higher than the median because its
value will be dragged in the direction of the tail. Similarly in a negatively skewed distribution, the
mean will be dragged lower than the median because of the extra large values in the left-hand tail.
Distributions of qualitative data do not have a mean.
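A short sketch showing how the mean is dragged toward the tail of a positively skewed distribution while the median is not; the data are invented for illustration.

import statistics

# A positively skewed set of values: most are small, a few are very large.
values = [1, 1, 2, 2, 2, 3, 3, 4, 20, 35]

print(statistics.mean(values))    # 7.3  - pulled up by the long right tail
print(statistics.median(values))  # 2.5  - unaffected by the extreme values
print(statistics.mode(values))    # 2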

Q. 10 Discuss multiple correlation and regression for data analysis. Apply the method for a
given data.

As we develop Cause & Effect diagrams based on data, we may wish to examine the degree of
correlation between variables. A statistical measurement of correlation can be calculated using the
least squares method to quantify the strength of the relationship between two variables. The output
of that calculation is the Correlation Coefficient, or (r), which ranges between -1 and 1. A value of
1 indicates perfect positive correlation - as one variable increases, the second increases in a linear
fashion. Likewise, a value of -1 indicates perfect negative correlation - as one variable increases,
the second decreases. A value of zero indicates zero correlation.
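A minimal sketch of computing the correlation coefficient r for two variables; the paired data below are hypothetical.

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # e.g. a process setting
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])   # e.g. the measured response

r = np.corrcoef(x, y)[0, 1]
print(round(r, 3))   # close to +1: a strong positive linear relationship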

Multiple Regression Analysis

Multiple Regression Analysis uses a similar methodology as Simple Regression, but includes more
than one independent variable. Econometric models are a good example, where the dependent
variable of GNP may be analyzed in terms of multiple independent variables, such as interest rates,
productivity growth, government spending, savings rates, consumer confidence, etc.

Many times historical data is used in multiple regression in an attempt to identify the most
significant inputs to a process. The benefit of this type of analysis is that it can be done very quickly
and relatively simply. However, there are several potential pitfalls:

 The data may be inconsistent due to different measurement systems, calibration drift,
different operators, or recording errors.
 The range of the variables may be very limited, and can give a false indication of low
correlation. For example, a process may have temperature controls because temperature
has been found in the past to have an impact on the output. Using historical temperature
data may therefore indicate low significance because the range of temperature is already
controlled in tight tolerance.
 There may be a time lag that influences the relationship - for example, temperature may be
much more critical at an early point in the process than at a later point, or vice-versa. There
also may be inventory effects that must be taken into account to make sure that all
measurements are taken at a consistent point in the process.
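As a hedged illustration of the "apply the method" part of the question, the sketch below fits a multiple regression with two independent variables by ordinary least squares; the data are invented and the variable names are hypothetical.

import numpy as np

# Hypothetical data: y depends on two predictors x1 and x2.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y  = np.array([4.1, 4.9, 8.2, 8.8, 12.1, 12.9])

# Design matrix with an intercept column.
X = np.column_stack([np.ones_like(x1), x1, x2])

# Least-squares estimates of the regression coefficients.
coef, residuals, rank, sv = np.linalg.lstsq(X, y, rcond=None)
intercept, b1, b2 = coef
print(round(intercept, 2), round(b1, 2), round(b2, 2))

# Predicted values and R-squared as a rough check of fit.
y_hat = X @ coef
r_squared = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
print(round(r_squared, 3))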
ASSIGNMENT-III

Q. 1 (a) Explain the meaning and significance of the concept of “Standard Error in sampling
analysis?

The standard error is the standard deviation of the sampling distribution of a statistic. The term
may also be used to refer to an estimate of that standard deviation, derived from a particular sample
used to compute the estimate. For example, the sample mean is the usual estimator of a population
mean. However, different samples drawn from that same population would in general have
different values of the sample mean. The standard error of the mean (i.e., of using the sample mean
as a method of estimating the population mean) is the standard deviation of those sample means
over all possible samples (of a given size) drawn from the population.
Secondly, the standard error of the mean can refer to an estimate of that standard deviation,
computed from the sample of data being analyzed at the time.

A way for remembering the term standard error is that, as long as the estimator is unbiased, the
standard deviation of the error (the difference between the estimate and the true value) is the same
as the standard deviation of the estimates themselves; this is true since the standard deviation of
the difference between a random variable and its expected value is equal to the standard deviation
of the random variable itself.

In practical applications, the true value of the standard deviation (of the error) is usually unknown.
As a result, the term standard error is often used to refer to an estimate of this unknown quantity.
In such cases it is important to be clear about what has been done and to attempt to take proper
account of the fact that the standard error is only an estimate. Unfortunately, this is not often
possible and it may then be better to use an approach that avoids using a standard error, for example
by using maximum likelihood or a more formal approach to deriving confidence intervals. One
well-known case where a proper allowance can be made arises where Student's t-distribution is
used to provide a confidence interval for an estimated mean or difference of means. In other cases,
the standard error may be used to provide an indication of the size of the uncertainty, but its formal
or semi-formal use to provide confidence intervals or tests should be avoided unless the sample
size is at least moderately large. Here "large enough" would depend on the particular quantities
being analyzed.

(b) Describe briefly the commonly used sampling distributions.

Suppose that we draw all possible samples of size n from a given population. Suppose further that
we compute
a statistic (e.g., a mean, proportion, standard deviation) for each sample. The probability
distribution of this statistic is called a sampling distribution.

Variability of a Sampling Distribution

The variability of a sampling distribution is measured by its variance or its standard deviation. The
variability of
a sampling distribution depends on three factors:

 N: The number of observations in the population.


 n: The number of observations in the sample.
 The way that the random sample is chosen.

If the population size is much larger than the sample size, then the sampling distribution has
roughly the same sampling error, whether we sample with or without replacement. On the other
hand, if the sample represents a significant fraction (say, 1/10) of the population size, the sampling
error will be noticeably smaller, when we sample without replacement.
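A small simulation sketch tying the two ideas above together: it draws repeated samples from an artificial population and compares the standard deviation of the resulting sample means (the empirical standard error) with the usual formula sigma / sqrt(n). The population, sample size and number of repetitions are arbitrary choices made only for illustration.

import numpy as np

rng = np.random.default_rng(0)
population = rng.normal(loc=50, scale=10, size=100_000)  # artificial population
n = 25                                                   # sample size
n_samples = 5_000                                        # number of repeated samples

# Draw many samples (without replacement) and record each sample mean.
sample_means = [rng.choice(population, size=n, replace=False).mean()
                for _ in range(n_samples)]

print(np.std(sample_means))            # empirical standard error of the mean
print(population.std() / np.sqrt(n))   # theoretical value: sigma / sqrt(n)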

Q. 2 Distinguish between the following: Statistic and Parameter;

A parameter is a number describing something about a whole population, e.g., the population mean or mode.
A statistic is something that describes a sample (e.g., the sample mean) and is used as an estimator for a population parameter.

Parameter is any characteristic of the population. Statistic on the other hand is a characteristic of
the sample.
Statistic is used to estimate the value of the parameter. Note that the value of a statistic changes from one sample to the next, which leads to a study of the sampling distribution of the statistic. When we draw a sample from a population, it is just one of many samples that might have been drawn and, therefore, observations made on any one sample are likely to be different from the 'true value' in the population (although some will be the same).

Imagine we were to draw an infinite (or very large) number of samples of individuals and calculate
a statistic, say the arithmetic mean, on each one of these samples and that we then plotted the mean
value obtained from each sample on a histogram (a chart using bars to represent the number of
times a particular value occurred).
This would represent the sampling distribution of the arithmetic mean. Don't worry about the practicalities of doing this, as we are only talking about a hypothetical set of possible samples that could, in theory, be drawn.

Confidence level and significance level

In statistics, a confidence interval (CI) is a kind of interval estimate of a population parameter and
is used to indicate the reliability of an estimate. It is an observed interval (i.e. it is calculated from
the observations), in principle different from sample to sample, that frequently includes the
parameter of interest, if the experiment is repeated. How frequently the observed interval contains
the parameter is determined by the confidence level or confidence coefficient. More specifically,
the meaning of the term "confidence level" is that, if confidence intervals are constructed across
many separate data analyses of repeated (and possibly different) experiments, the proportion of
such intervals that contain the true value of the parameter will match the confidence level; this is
guaranteed by the reasoning underlying the construction of confidence intervals. The level of
confidence of the confidence interval would indicate the probability that the confidence range
captures this true population parameter given a distribution of samples. It does not describe any
single sample. This value is represented by a percentage, so when we say, "we are 99% confident
that the true value of the parameter is in our confidence interval", we express that 99% of the
observed confidence intervals will hold the true value of the parameter.

A confidence interval does not predict that the true value of the parameter has a particular
probability of being in the confidence interval given the data actually obtained. (An interval
intended to have such a property, called a credible interval, can be estimated using Bayesian
methods; but such methods bring with them their own distinct strengths and weaknesses).

Statistical significance is a statistical assessment of whether observations reflect a pattern rather than just chance, the fundamental challenge being that any partial picture is subject to observational error. In statistical testing, a result is deemed statistically significant if it is unlikely to have occurred by chance, and hence provides enough evidence to reject the hypothesis of 'no effect'. As used in statistics, significant does not mean important or meaningful, as it does in everyday speech.

The amount of evidence required to accept that an event is unlikely to have arisen by chance is known as the significance level or critical p-value: in traditional Fisherian statistical hypothesis testing, the p-value is the probability of observing data at least as extreme as that observed, given that the null hypothesis is true. If the obtained p-value is small, then it can be said that either the null hypothesis is false or an unusual event has occurred. p-values do not have any repeat-sampling interpretation in themselves.

Random sampling and non-random sampling;

In a simple random sample ('SRS') of a given size, all such subsets of the frame are given an equal
probability. Each element of the frame thus has an equal probability of selection: the frame is not
subdivided or partitioned. Furthermore, any given pair of elements has the same chance of
selection as any other such pair (and similarly for triples, and so on). This minimizes bias and
simplifies analysis of results. In particular, the variance between individual results within the
sample is a good indicator of variance in the overall population, which makes it relatively easy to
estimate the accuracy of results.

However, SRS can be vulnerable to sampling error because the randomness of the selection may result in a sample that doesn't reflect the makeup of the population. For instance, a simple random sample of ten people from a given country will on average produce five men and five women, but any given trial is likely to over-represent one sex and under-represent the other. Systematic and stratified techniques attempt to overcome this problem by using information about the population to choose a more representative sample.

Non-random sampling: The purpose of this method is to make an explicit choice based on our own
judgment about exactly whom to include in sample. When random sampling is not possible, then
we can choose this sampling method for studying how primary stakeholders are affected by a
project intervention. We might, alternatively, want a very specific perspective so we purposefully
seek certain people or groups.

Sampling of attributes and sampling of variables;

There are two types of data/measurements, 'variable' (also called 'continuous') and 'attribute' (also called 'discrete'). Discrete or attribute data can only be measured by categories (like yes/no, true/false, pass/fail etc.) or intervals (like absolute rank, educational level, types etc.). Attribute data is always about counting of measurements falling in different categories. Attribute data cannot be further divided; for example, if we say we have 10 students who are either taking a math class or a science class, there will be no student who would be taking 50% of the math class and 50% of the science class.

On the other hand, variable or continuous data can be further divided into more classifications and
that will still have meaning. For example if we measure temperature for two rooms, 22F and 23F
respectively, this does not mean that a temperature of 22.2F or 22.5F cannot be recorded.
Attribute data is the simplest kind. We sample some number of items and we classify each item as
either having some attribute, like being defective, or not. We will have some standard for what a
defective item is -- like broken, scratched, non-functional, etc.

Variables data contains more information than attribute data per data-point. This is because it
allows assessing "how much" or "how bad" or "how good" rather than just "yes its defective" or
"no it's not defective". Because variables data contains more information per data point than
attribute data, variables sampling plans require fewer samples than attribute sampling plans -- for
the same level of protection. This translates into lower material cost, less work, and less time for
variables sampling plans.

In general, most of the information that we receive (about anything) is in the form of either attribute
data or variables data.

Point estimate and interval estimation

In statistics, point estimation involves the use of sample data to calculate a single value (known as
a statistic) which is to serve as a "best guess" or "best estimate" of an unknown (fixed or random)
population parameter.

More formally, it is the application of a point estimator to the data.

In general, point estimation should be contrasted with interval estimation: such interval estimates are typically either confidence intervals, in the case of frequentist inference, or credible intervals, in the case of Bayesian inference.

In statistics, interval estimation is the use of sample data to calculate an interval of possible (or
probable) values of an unknown population parameter, in contrast to point estimation, which is a
single number. Neyman (1937) identified interval estimation ("estimation by interval") as distinct
from point estimation ("estimation by unique estimate"). In doing so, he recognized that then-
recent work quoting results in the form of an estimate plus-or-minus a standard deviation indicated
that interval estimation was actually the problem statisticians really had in mind.

The most prevalent forms of interval estimation are:

 confidence intervals (a frequentist method); and


 Credible intervals (a Bayesian method).

Other common approaches to interval estimation, which are encompassed by statistical theory, are:

 Tolerance intervals
 Prediction intervals - used mainly in Regression Analysis

The value of z should be (a short worked sketch follows the list below):


 2.58 for 99% confidence,
 1.96 for 95% confidence,
 1.64 for 90% confidence and
 1.28 For 80% confidence.
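A minimal sketch of an interval estimate using the z-values listed above; the sample statistics are assumed purely for illustration.

import math

mean = 52.0       # hypothetical sample mean
sd = 8.0          # hypothetical sample standard deviation
n = 100           # hypothetical sample size

z_values = {"99%": 2.58, "95%": 1.96, "90%": 1.64, "80%": 1.28}

for level, z in z_values.items():
    half_width = z * sd / math.sqrt(n)
    print(level, round(mean - half_width, 2), round(mean + half_width, 2))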

Likelihood intervals

There is a third approach to statistical inference, namely fiducial inference, that also considers interval estimation. Non-statistical methods that can lead to interval estimates include fuzzy logic.

An interval estimate is one type of outcome of a statistical analysis. Some other types of outcome
are point estimates and decisions.

Q. 3 (a) What are the different approaches of determining a sample size? Explain.

Formula for Calculating a Sample Size for Proportions

For populations that are large, Cochran (1963:75) developed Equation 1 to yield a representative sample for proportions:

Equation 1:   n0 = z² p (1 − p) / e²

Where,
n0 (or s) = the sample size,
z = the standard normal value corresponding to the desired confidence level,
e = the desired level of precision, or the proportion of error we are prepared to accept,
p = the estimated proportion of an attribute that is present in the population (prevalence/variability), and
q = 1 − p.
Simplified Formula for Proportions for Finite Population

Yamane (1967:886) provides a simplified formula to calculate sample sizes, assuming a 95% confidence level and P = .5:

n = N / (1 + N e²)
Where,
n = the sample size,
N = the population size, and
e = the level of precision.

(b) If we want to draw a simple random sample from a population of 4000 items, how large a
sample do we need to draw if we desire to estimate the per cent defective within 2 % of the true
value with 95.45% probability.
With z = 2 (for 95.45% confidence), p = q = 0.5 (the conservative choice) and e = 0.02: n0 = z²pq/e² = (4 × 0.25)/0.0004 = 2500. Applying the finite-population correction for N = 4000: n = n0 / (1 + (n0 − 1)/N) = 2500 / (1 + 2499/4000) ≈ 1539 items, approximately.

Q. 4 Suppose a certain hotel management is interested in determining the percentage of the hotel's guests who stay for more than 3 days. The reservation manager wants to be 95 per cent confident that the percentage has been estimated to be within ± 3% of the true value. What is the most conservative sample size needed for this problem?
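Both sample-size questions above can be worked with the Cochran formula from Q.3(a). The sketch below assumes the conventional conservative choice p = 0.5, with z = 2 for 95.45% confidence (Q.3(b), with the finite-population correction for N = 4000) and z = 1.96 for 95% confidence (Q.4, treating the population as effectively infinite); under these assumptions the answers come to roughly 1,539 and 1,068 respondents respectively.

import math

def cochran_n0(z, p, e):
    # Initial sample size for a proportion: n0 = z^2 * p * (1 - p) / e^2
    return (z ** 2) * p * (1 - p) / (e ** 2)

def finite_population_correction(n0, N):
    # Adjust n0 when the population N is not much larger than the sample.
    return n0 / (1 + (n0 - 1) / N)

# Q.3(b): N = 4000, precision e = 2%, 95.45% confidence (z = 2), p = 0.5.
n0 = cochran_n0(z=2.0, p=0.5, e=0.02)
print(math.ceil(finite_population_correction(n0, N=4000)))   # about 1539

# Q.4: most conservative size for e = 3% at 95% confidence (z = 1.96), p = 0.5.
print(math.ceil(cochran_n0(z=1.96, p=0.5, e=0.03)))           # about 1068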

Q. 5 Distinguish between the following:

Simple hypothesis and composite hypothesis;


A simple hypothesis is one in which all parameters of the distribution are specified. For example, if the heights of college students are normally distributed with a known variance σ², then the hypothesis that the mean equals a particular value, say H0: μ = μ0, is a simple hypothesis, as the mean and variance together specify a normal distribution completely. A simple hypothesis, in general, states that θ = θ0, where θ0 is the specified value of a parameter θ (θ may represent μ, p, μ1 − μ2, etc.).

A hypothesis which is not simple (i.e., in which not all of the parameters are specified) is called a composite hypothesis. For instance, if we hypothesize that μ > μ0, or that μ = μ0 while σ² is left unspecified, the hypothesis becomes a composite hypothesis because we cannot know the exact distribution of the population in either case. Obviously, such parameters have more than one possible value and no single specified value is being assigned. The general form of a composite hypothesis is θ ≤ θ0 or θ ≥ θ0, that is, the parameter does not exceed or does not fall short of a specified value. The concept of simple and composite hypotheses applies to both the null hypothesis and the alternative hypothesis.

Null hypothesis and alternative hypothesis;

The logic of traditional hypothesis testing requires that we set up two competing statements or
hypotheses referred to as the null hypothesis and the alternative hypothesis. These hypotheses are
mutually exclusive and exhaustive.

H0: The finding occurred by chance
H1: The finding did not occur by chance

The practice of science involves formulating and testing hypotheses, assertions that are capable of
being proven false using a test of observed data. The null hypothesis typically corresponds to a
general or default position.
For example, the null hypothesis might be that there is no relationship between two measured
phenomena or that a potential treatment has no effect.

The term was originally coined by English geneticist and statistician Ronald Fisher in 1935. It is
typically paired with a second hypothesis, the alternative hypothesis, which asserts a particular
relationship between the phenomena. Jerzy Neyman and Egon Pearson formalized the notion of
the alternative. The alternative need not be the logical negation of the null hypothesis; it predicts
the results from the experiment if the alternative hypothesis is true. The use of alternative
hypotheses was not part of Fisher's formulation, but became standard.

It is important to understand that the null hypothesis can never be proven. A set of data can only
reject a null hypothesis or fail to reject it. For example, if comparison of two groups (e.g.:
treatment, no treatment) reveals no statistically significant difference between the two, it does not
mean that there is no difference in reality. It only means that there is not enough evidence to reject
the null hypothesis (in other words, the experiment fails to reject the null hypothesis).

One-tailed test and two-tailed test;

There are two different types of tests that can be performed. A one-tailed test looks for an increase
or decrease in the parameter whereas a two-tailed test looks for any change in the parameter (which
can be any change- increase or decrease).

We can perform the test at any level (usually 1%, 5% or 10%). For example, performing the test
at a 5% level means that there is a 5% chance of wrongly rejecting H0.

If we perform the test at the 5% level and decide to reject the null hypothesis, we say "there is
significant evidence at the 5% level to suggest the hypothesis is false".

One-Tailed Test

We choose a critical region. In a one-tailed test, the critical region has just one part. If our sample value lies in this region, we reject the null hypothesis in favor of the alternative.

Suppose we are looking for a definite decrease. Then the critical region will be to the left. Note,
however, that in the one-tailed test the value of the parameter can be as high as we like.

Example
Suppose we are given that X has a Poisson distribution and we want to carry out a hypothesis test on the mean, λ, based upon a sample observation of 3.

Suppose the hypotheses are:

H0: λ = 9
H1: λ < 9

We want to test if it is "reasonable" for the observed value of 3 to have come from a Poisson distribution with parameter 9. So what is the probability that a value as low as 3 has come from a Po(9) distribution?

P(X ≤ 3) = 0.0212 (this has come from a cumulative Poisson table)

The probability is less than 0.05, so there is less than a 5% chance that the value has come from a
Poisson distribution. We therefore reject the null hypothesis in favour of the alternative at the 5%
level.

However, the probability is greater than 0.01, so we would not reject the null hypothesis in favour
of the alternative at the 1% level.

Two-Tailed Test

In a two-tailed test, we are looking for either an increase or a decrease. So, for example, H0 might be that the mean is equal to 9 (as before). This time, however, H1 would be that the mean is not equal to 9. In this case, therefore, the critical region has two parts, one in each tail.

Example

Let's test the parameter p of a Binomial distribution at the 10% level.


Suppose a coin is tossed 10 times and we get 7 heads. We want to test whether or not the coin is
fair. If the coin is fair, p = 0.5. Put this as the null hypothesis:

H0: p = 0.5
H1: p ≠ 0.5

Now, because the test is 2-tailed, the critical region has two parts. Half of the critical region is to
the right and half is to the left. So the critical region contains both the top 5% of the distribution
and the bottom 5% of the distribution (since we are testing at the 10% level).

If H0 is true, X ~ Bin (10, 0.5).

If the null hypothesis is true, what is the probability that X is 7 or above?

P(X ≥ 7) = 1 − P(X ≤ 6) = 1 − 0.8281 = 0.1719

Is this in the critical region? No- because the probability that X is at least 7 is not less than 0.05
(5%), which is what we need it to be.

So there is not significant evidence at the 10% level to reject the null hypothesis.
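The two tail probabilities used above can be checked numerically; a minimal sketch, assuming scipy is available:

from scipy import stats

# One-tailed example: P(X <= 3) when X ~ Poisson(9).
print(round(stats.poisson.cdf(3, 9), 4))        # 0.0212

# Two-tailed example: P(X >= 7) when X ~ Binomial(10, 0.5).
print(round(stats.binom.sf(6, 10, 0.5), 4))     # 0.1719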

Type I error and Type II error

Rejecting the null hypothesis when it is in fact true is called a Type I error.

Not rejecting the null hypothesis when in fact the alternate hypothesis is true is called a Type II
error.

Q. 6 (a) What do we mean by the power of a hypothesis test? How can it be measured?
Describe and illustrate by an example.
The probability of not committing a Type II error is called the power of a hypothesis test.

Effect Size

To compute the power of the test, one offers an alternative view about the "true" value of the
population parameter, assuming that the null hypothesis is false. The effect size is the difference
between the true value and the value specified in the null hypothesis.

Effect size = True value - Hypothesized value

For example, suppose the null hypothesis states that a population mean is equal to 100. A
researcher might ask:
What is the probability of rejecting the null hypothesis if the true population mean is equal to 90?
In this example, the effect size would be 90 - 100, which equals -10.

Factors That Affect Power

The power of a hypothesis test is affected by three factors.

1. Sample size (n). Other things being equal, the greater the sample size, the greater the power of the test.
2. Significance level (α). The higher the significance level, the higher the power of the test. If we increase the significance level, we reduce the region of acceptance. As a result, we are more likely to reject the null hypothesis. This means we are less likely to accept the null hypothesis when it is false, i.e., less likely to make a Type II error. Hence, the power of the test is increased.
3. The "true" value of the parameter being tested. The greater the difference between the "true" value of a parameter and the value specified in the null hypothesis, the greater the power of the test. That is, the greater the effect size, the greater the power of the test.
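A small sketch computing the power of a one-sample two-sided z-test for the effect-size example above (true mean 90 versus hypothesized 100); the population standard deviation, sample size and significance level are assumptions chosen only for illustration.

import math
from scipy import stats

mu0 = 100.0      # value under the null hypothesis
mu_true = 90.0   # assumed "true" population mean
sigma = 20.0     # assumed (known) population standard deviation
n = 25           # assumed sample size
alpha = 0.05

se = sigma / math.sqrt(n)
z_crit = stats.norm.ppf(1 - alpha / 2)

# Rejection region for the sample mean under H0: mu = mu0.
lower = mu0 - z_crit * se
upper = mu0 + z_crit * se

# Power = probability that the sample mean falls in the rejection region
# when the true mean is mu_true.
power = stats.norm.cdf(lower, loc=mu_true, scale=se) + stats.norm.sf(upper, loc=mu_true, scale=se)
print(round(power, 3))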

(b) Clearly explain how will we test the equality of variances of two normal populations?
To test for equality of variances given two independent random samples from univariate normal populations, popular choices would be the two-sample F test and Levene's test. The latter is a nonparametric test while the former is parametric: it is the likelihood ratio test, and also a Wald test. Another Wald test of interest is based on the difference in the sample variances. We give a nonparametric analogue of this test and call it the R test. The R, F and Levene tests are compared in an indicative empirical study.

For moderate sample sizes when assuming normality, the R test is nearly as powerful as the F test and nearly as robust as Levene's test. It is also an appropriate test for testing equality of variances without the assumption of normality, and so it can be strongly recommended.
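A minimal sketch of the two-sample F test described above; the two samples are invented and scipy is assumed to be available.

import numpy as np
from scipy import stats

sample1 = np.array([23.1, 25.4, 22.8, 26.0, 24.3, 25.1, 23.9, 24.8])
sample2 = np.array([22.5, 29.0, 20.1, 27.4, 31.2, 19.8, 26.7, 24.0])

s1_sq = np.var(sample1, ddof=1)
s2_sq = np.var(sample2, ddof=1)

# Put the larger variance in the numerator, as is conventional.
if s1_sq >= s2_sq:
    F, dfn, dfd = s1_sq / s2_sq, len(sample1) - 1, len(sample2) - 1
else:
    F, dfn, dfd = s2_sq / s1_sq, len(sample2) - 1, len(sample1) - 1

# Two-sided p-value for H0: the population variances are equal.
p_value = min(2 * stats.f.sf(F, dfn, dfd), 1.0)
print(round(F, 3), round(p_value, 4))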

Q. 7 Briefly describe the important parametric tests used in context of testing hypotheses.
How such tests differ from non-parametric tests? Explain.

The important parametric tests are: (1) z-test; (2) t-test; and (3) F-test. All these tests are based on
the assumption of normality i.e., the source of data is considered to be normally distributed.

In some cases the population may not be normally distributed, yet the tests will be applicable on
account of the fact that we mostly deal with samples and the sampling distributions closely
approach normal distributions.
z-test is based on the normal probability distribution and is used for judging the significance of
several statistical measures, particularly the mean. The relevant test statistic, z, is worked out and
compared with its probable value (to be read from table showing area under normal curve) at a
specified level of significance for judging the significance of the measure concerned. This is a
most frequently used test in research studies. This test is used even when the binomial distribution or t-distribution is applicable, on the presumption that such a distribution tends to approximate the normal distribution as 'n' becomes larger. The z-test is generally used for comparing the mean of a sample to some hypothesized mean for the population in the case of a large sample, or when the population variance is known.
t-test is based on t-distribution and is considered an appropriate test for judging the significance of
a sample mean or for judging the significance of difference between the means of two samples in
case of small sample(s) when population variance is not known (in which case we use variance of
the sample as an estimate of the population variance). In case two samples are related, we use
paired t-test (or what is known as difference test) for judging the significance of the mean of
difference between the two related samples.

F-test is based on F-distribution and is used to compare the variance of the two-independent
samples. This test is also used in the context of analysis of variance (ANOVA) for judging the
significance of more than two sample means at one and the same time. It is also used for judging
the significance of multiple correlation coefficients. Test statistic, F, is calculated and compared
with its probable value (to be seen in the F-ratio tables for different degrees of freedom for greater
and smaller variances at specified level of significance) for accepting or rejecting the null
hypothesis.

Q. 8 (a) Point out the important limitations of tests of hypotheses. What precaution the
researcher must take while drawing inferences as per the results of the said tests?

We have described above some important tests often used for testing hypotheses, on the basis of which important decisions may be based. But there are several limitations of the said tests which should always be borne in mind by a researcher.

Important limitations are as follows:

(i) The tests should not be used in a mechanical fashion. It should be kept in view that testing is not decision-making itself; the tests are only useful aids for decision-making. Hence "proper interpretation of statistical evidence is important to intelligent decisions".

(ii) Tests do not explain the reasons as to why the difference exists, say between the means of the two samples. They simply indicate whether the difference is due to fluctuations of sampling or because of other reasons, but the tests do not tell us which other reason(s) is/are causing the difference.

(iii) Results of significance tests are based on probabilities and as such cannot be expressed with
full certainty. When a test shows that a difference is statistically significant, then it simply suggests
that the difference is probably not due to chance.

(iv) Statistical inferences based on the significance tests cannot be said to be entirely correct
evidences concerning the truth of the hypotheses. This is specially so in case of small samples
where the probability of drawing erring inferences happens to be generally higher.

For greater reliability, the size of samples should be sufficiently enlarged.

All these limitations suggest that in problems of statistical significance, the inference techniques
(or the tests) must be combined with adequate knowledge of the subject-matter along with the
ability of good judgment.

(b) What is a t-test? When it is used and for what purpose(s)? Explain by means of examples.

t-test is based on t-distribution and is considered an appropriate test for judging the significance of
a sample mean or for judging the significance of difference between the means of two samples in
case of small sample(s) when population variance is not known (in which case we use variance of
the sample as an estimate of the population variance). In case two samples are related, we use
paired t-test (or what is known as difference test) for judging the significance of the mean of
difference between the two related samples. It can also be used for judging the significance of the
coefficients of simple and partial correlations. The relevant test statistic, t, is calculated from the
sample data and then compared with its probable value based on t-distribution (to be read from the
table that gives probable values of t for different levels of significance for different degrees of
freedom) at a specified level of significance for concerning degrees of freedom for accepting or
rejecting the null hypothesis. It may be noted that t-test applies only in case of small sample(s)
when population variance is unknown.
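Two small worked sketches of the uses described above, with invented scores (scipy assumed available): a one-sample t-test against a hypothesized mean, and a paired (difference) t-test for two related samples.

import numpy as np
from scipy import stats

# One-sample t-test: is the mean of this small sample different from 50?
scores = np.array([52, 48, 55, 51, 47, 53, 54, 49])
t_stat, p_value = stats.ttest_1samp(scores, popmean=50)
print(round(t_stat, 3), round(p_value, 3))

# Paired t-test: scores of the same subjects before and after a treatment.
before = np.array([61, 58, 64, 59, 62, 60])
after  = np.array([65, 60, 66, 61, 66, 63])
t_stat, p_value = stats.ttest_rel(after, before)
print(round(t_stat, 3), round(p_value, 3))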
Q. 9 (a) Write a brief note on "Sandler's A-test" explaining its superiority over the t-test.

Sandler's A-test

Joseph Sandler has developed an alternate approach based on a simplification of the t-test. His approach is described as Sandler's A-test, which serves the same purpose as is accomplished by the t-test relating to paired data. Researchers can as well use the A-test when correlated samples are employed and the hypothesized mean difference is taken as zero, i.e., H0: the mean difference D = 0. Psychologists generally use this test in the case of two groups that are matched with respect to some extraneous variable(s). While using the A-test, we work out the A-statistic, which yields exactly the same results as Student's t-test.

The number of degrees of freedom (d.f.) in the A-test is the same as with Student's t-test, i.e., d.f. = n – 1, n being equal to the number of pairs. The critical value of A, at a given level of significance for given d.f., can be obtained from the table of the A-statistic (given in the appendix at the end of the book).

One has to compare the computed value of A with its corresponding table value for drawing an inference concerning acceptance or rejection of the null hypothesis. If the calculated value of A is equal to or less than the table value, the A-statistic is considered significant, whereupon we reject H0 and accept Ha. But if the calculated value of A is more than its table value, then the A-statistic is taken as insignificant and accordingly we accept H0. This is so because the two test statistics, viz., t and A, are inversely related.

Computational work concerning the A-statistic is relatively simple. As such, the use of the A-statistic results in considerable saving of time and labour, especially when matched groups are to be compared with respect to a large number of variables. Accordingly, researchers may replace Student's t-test by Sandler's A-test whenever correlated sets of scores are employed.

Sandler's A-statistic can as well be used in the one-sample case as a direct substitute for the Student t-ratio. This is so because Sandler's A is algebraically equivalent to Student's t. When we use the A-test in the one-sample case, the following steps are involved:
(i) Subtract the hypothesized mean of the population (μH) from each individual score (Xi) to obtain Di, and then work out ΣDi.
(ii) Square each Di and then obtain the sum of such squares, i.e., ΣDi².
(iii) Find the A-statistic: A = ΣDi² / (ΣDi)².
(iv) Read the table of the A-statistic for (n – 1) degrees of freedom at a given level of significance (using one-tailed or two-tailed values depending upon Ha) to find the critical value of A.
(v) Finally, draw the inference as under:

When the calculated value of A is equal to or less than the table value, reject H0 (or accept Ha); but when the computed A is greater than its table value, accept H0.
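A small sketch of the A-statistic computed, as described above, as the ratio of the sum of squared differences to the square of the summed differences, A = ΣD² / (ΣD)², on invented paired data; the final line checks the algebraic link with the paired t-statistic, A = (n − 1)/(n·t²) + 1/n.

import numpy as np
from scipy import stats

before = np.array([12.0, 14.0, 11.0, 15.0, 13.0, 16.0])
after  = np.array([14.0, 15.0, 13.0, 18.0, 14.0, 19.0])

D = after - before                      # differences for the paired scores
n = len(D)

A = np.sum(D ** 2) / np.sum(D) ** 2     # Sandler's A = sum(D^2) / (sum(D))^2
t_stat, _ = stats.ttest_rel(after, before)

print(round(A, 4))
print(round((n - 1) / (n * t_stat ** 2) + 1 / n, 4))   # equals A, showing the t-A link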

(b) What is the Chi-square test? Explain its significance in statistical analysis.

The chi-square test is an important test amongst the several tests of significance developed by statisticians. Chi-square, symbolically written as χ² (pronounced as Ki-square), is a statistical measure used in the context of sampling analysis for comparing a variance to a theoretical variance.

As a non-parametric test, it "can be used to determine if categorical data shows dependency or the two classifications are independent. It can also be used to make comparisons between theoretical populations and actual data when categories are used." Thus, the chi-square test is applicable to a large number of problems. The test is, in fact, a technique through the use of which it is possible for researchers to (i) test the goodness of fit; (ii) test the significance of association between two attributes, and (iii) test the homogeneity or the significance of population variance.

Q. 10 Write short notes on the following:

Chi-square as a test of goodness of fit.

Chi-square goodness of fit test is applied when we have one categorical variable from a single
population. It is used to determine whether sample data are consistent with a hypothesized
distribution.
For example, suppose a company printed baseball cards. It claimed that 30% of its cards were rookies; 60%, veterans; and 10%, All-Stars. We could gather a random sample of baseball cards and use a chi-square goodness of fit test to see whether our sample distribution differed significantly from the distribution claimed by the company; a small sketch of this test is given below.
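A minimal sketch of that goodness-of-fit test; the observed counts for a sample of 100 cards are invented for illustration, while the expected counts follow the claimed 30/60/10 split.

from scipy import stats

observed = [38, 52, 10]                            # hypothetical sample counts: rookies, veterans, All-Stars
expected = [0.30 * 100, 0.60 * 100, 0.10 * 100]    # counts implied by the company's claim

chi2, p_value = stats.chisquare(f_obs=observed, f_exp=expected)
print(round(chi2, 3), round(p_value, 4))

# A small p-value would suggest the sample distribution differs significantly
# from the distribution claimed by the company.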

 Precautions in applying Chi-square test


 Conditions for applying Chi-square test

The chi-square goodness of fit test is appropriate when the following conditions are met:
 The sampling method is simple random sampling.
 The population is at least 10 times as large as the sample.
 The variable under study is categorical.
 The expected value of the number of sample observations in each level of the variable is at
least 5.
ASSIGNMENT-IV

Q. 1(a) Explain the meaning of analysis of variance. Describe briefly the technique of analysis
of variance for one-way and two-way classifications.

Analysis of variance (abbreviated as ANOVA) is an extremely useful technique concerning researches in the fields of economics, biology, education, psychology, sociology, business/industry and in researches of several other disciplines. This technique is used when multiple sample cases are involved. While the significance of the difference between the means of two samples can be judged through the z-test or the t-test, those tests become unwieldy in all those situations where we want to compare more than two populations, such as in comparing the yield of crop from several varieties of seeds, the gasoline mileage of four automobiles, the smoking habits of five groups of university students and so on. In such circumstances one generally does not want to test all possible pairs of populations separately. Professor R.A. Fisher was the first to use the term 'Variance' and, in fact, it was he who developed a very elaborate theory concerning ANOVA, explaining its usefulness in practical field work.

Later on Professor Snedecor and many others contributed to the development of this technique.

ANOVA is essentially a procedure for testing the difference among different groups of data for homogeneity. "The essence of ANOVA is that the total amount of variation in a set of data is broken down into two types, that amount which can be attributed to chance and that amount which can be attributed to specified causes." There may be variation between samples and also within sample items. ANOVA consists in splitting the variance for analytical purposes. Hence, it is a method of analysing the variance to which a response is subject into its various components corresponding to various sources of variation.
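A minimal one-way ANOVA sketch comparing three invented groups (for instance, yields from three varieties of seed); scipy is assumed to be available.

from scipy import stats

variety_a = [20.1, 22.4, 19.8, 21.5, 20.9]
variety_b = [23.0, 24.1, 22.7, 23.8, 24.5]
variety_c = [19.2, 18.7, 20.3, 19.9, 18.5]

f_stat, p_value = stats.f_oneway(variety_a, variety_b, variety_c)
print(round(f_stat, 2), round(p_value, 4))

# A small p-value indicates that at least one variety's mean yield differs
# from the others (the between-sample variation is too large to be
# attributed to chance alone).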

(b) What do we mean by the additive property of the technique of the analysis of variance?
Explain how this technique is superior in comparison to sampling.

Q. 2 Write short notes on the following:


Latin-square design;

Facts about the LS Design

 With the Latin Square design we are able to control variation in two directions.

 Treatments are arranged in rows and columns

 Each row contains every treatment.

 Each column contains every treatment.

 The most common sizes of LS are 5x5 to 8x8

Advantages of the LS Design

1. We can control variation in two directions.

2. Hopefully we increase efficiency as compared to the RCBD.

Disadvantages of the LS Design

1. The number of treatments must equal the number of replicates.

2. The experimental error is likely to increase with the size of the square.
3. Small squares have very few degrees of freedom for experimental error.

4. We can't evaluate interactions between:

a. Rows and columns

b. Rows and treatments

c. Columns and treatments.
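A tiny sketch that generates a cyclic Latin square layout of the kind described above, so that each treatment appears exactly once in every row and every column; the 5x5 size and the treatment labels are arbitrary choices for illustration, and in practice the rows, columns and treatment labels would also be randomized.

def latin_square(treatments):
    # Cyclic construction: each row is the previous row shifted by one place,
    # so every treatment occurs once per row and once per column.
    n = len(treatments)
    return [[treatments[(row + col) % n] for col in range(n)] for row in range(n)]

for row in latin_square(["A", "B", "C", "D", "E"]):
    print(" ".join(row))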

Coding in context of analysis of variance;

F-ratio and its interpretation;

The F-ratio is used to determine whether the variances in two independent samples are equal. If
the F-ratio is not statistically significant, we may assume there is homogeneity of variance and
employ the standard t-test for the difference of means. If the F-ratio is statistically significant, use
an alternative t-test computation such as the Cochran and Cox method.

Set the rejection criteria: determine the "degrees of freedom" for each sample:
df = n1 - 1 (numerator: n for the sample with the larger variance)
df = n2 - 1 (denominator: n for the sample with the smaller variance)
Determine the level of confidence (alpha).
Compute the test statistic F = S1^2 / S2^2, where
S1^2 = the larger variance
S2^2 = the smaller variance


Compare the test statistic with the f critical value (Fcv) listed in the F distribution. If the f-ratio
equals or exceeds the critical value, the null hypothesis (Ho) (there is no difference between the
sample variances) is rejected. If there is a difference in the sample variances, the comparison of
two independent means should involve the use of the Cochran and Cox method. The F-distribution
is formed by the ratio of two independent chi-square variables divided by their respective degrees
of freedom.
Since F is formed by chi-square, many of the chi-square properties carry over to the F distribution.

Significance of the analysis of variance

Analysis of variance, often abbreviated as ANOVA, is a statistical technique used to test for the significance of differences among more than two sample means. Using this technique it is possible to draw inferences about whether the different samples drawn have the same mean. For example, this method may be used in studies such as comparing the intelligence of students from different schools.

When using analysis of variance it is assumed that each of the samples is drawn from a population having a normal distribution with the same variance. The assumption of normality is not required when the sample size is large.

The analysis of variance is carried out in following three steps:

1. The population variance is estimated from the variance among the sample means.

2. A second estimate of the variance is made from the variance within the samples.

3. The two estimates are compared; if they are approximately equal in value, it is inferred that
the means are not significantly different (see the sketch below).
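The sketch below (Python with numpy; the group scores are made up) follows these three steps directly, computing the between-sample and within-sample estimates of variance and their ratio:

import numpy as np

# Made-up scores for three sample groups
groups = [np.array([62, 65, 60, 63]),
          np.array([70, 68, 72, 69]),
          np.array([64, 66, 63, 65])]
k = len(groups)                        # number of samples
n = sum(len(g) for g in groups)        # total number of observations
grand_mean = np.mean(np.concatenate(groups))

# Step 1: estimate of variance from the variation among sample means (between)
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ms_between = ss_between / (k - 1)

# Step 2: estimate of variance from the variation within the samples
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
ms_within = ss_within / (n - k)

# Step 3: compare the two estimates; their ratio is the F statistic
print(f"Between estimate = {ms_between:.2f}, within estimate = {ms_within:.2f}")
print(f"F = {ms_between / ms_within:.2f}")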
Q. 3 (a) What is logical form? Discuss the western and oriental conceptions of logic.

As cited by Ludwig (2001, p. 1), Bertrand Russell, in the second of his 1914 Lowell lectures, Our
Knowledge of the External World, asserted famously that 'every philosophical problem, when it
is subjected to the necessary analysis and purification, is found either to be not really philosophical
at all, or else to be, in the sense in which we are using the word, logical' (Russell 1993, p. 42). He
went on to characterize that portion of logic that concerned the study of forms of propositions, or,
as he called them, 'logical forms'. This portion of logic he called 'philosophical logic'. Russell
asserted that '... some kind of knowledge of logical forms, though with most people it is not explicit,
is involved in all understanding of discourse. It is the business of philosophical logic to extract this
knowledge from its concrete integuments, and to render it explicit and pure' (p. 53).

In logic the logical form of a sentence (or proposition or statement or truth bearer) or set of
sentences is the form obtained by abstracting from the subject matter of its content terms or by
regarding the content terms as mere placeholders or blanks on a form. In an ideal logical language,
the logical form can be determined from syntax alone; formal languages used in formal sciences
are examples of such languages. Logical form however should not be confused with the mere
syntax used to represent it; there may be more than one string that represents the same logical form
in a given language.

Logic was studied in several ancient civilizations, including India, China, and Greece. In the West,
logic was established as a formal discipline by Aristotle, who gave it a fundamental place in
philosophy. The study of logic was part of the classical trivium, which also included grammar and
rhetoric.

Western and oriental conceptions of logic

Western conceptions of logic


The principal brick in the foundation of Western logic is the 'Principle of Contradiction'. The way
it works is that we cannot say 'A is B' and 'A is not B' at the same time, not even if we are a
ventriloquist. We can only say one thing at a time.

It is believed in the West that logic began with Aristotle, and that logic is universal. The core of
Western thought—including present-day formal mathematics and the philosophy of science—is
premised on the belief that logical truths are universal, that they are necessary truths, and that
logical deduction is certain and infallible. These beliefs about logic however are untenable, both
historically and philosophically, in a larger picture which takes the non-West into account.

Oriental conceptions of logic:

With regard to the Indian and Chinese logics, here are some of the findings of research mentioned
in the article under review:

Indian logic dates from as early as the 5th century BCE, with grammatical investigations. Later
on, it also evolved in the framework of religious studies. Thinkers were 'interested in methods of
philosophical discussion', although 'logical topics were not always separated from metaphysical
and epistemological topics'.

In a Hindu text of the 1st century CE, we encounter sophisticated philosophical concepts, among
which some of a very logical character, like 'separateness, conjunction and disjunction, priority
and posteriority, motion, genus, ultimate difference, inherence, absence'. In the 2nd century,
examples of arguments akin to syllogism appear, which enjoin that generalities be applied to
specific cases. In the 7th-8th century, the various ways statements can be negated are explored.

A Buddhist text of the 5th century teaches that a mark found exclusively in a certain kind of
subject may be used to infer that subject. Another appears to describe some properties of
implication and logical apodosis: an if-then statement is presented, and it is pointed out that
admission of the antecedent coupled with rejection of the consequent is wrong; although the if-
then statement has a specific content, its elucidation uses logical terminology.
(b) Define deductive and inductive reasoning.

Deductive reasoning concerns what follows necessarily from given premises (if a, then b).
However, inductive
reasoning—the process of deriving a reliable generalization from observations—has sometimes
been included in the study of logic. Similarly, it is important to distinguish deductive validity and
inductive validity (called "cogency"). An inference is deductively valid if and only if there is no
possible situation in which all the premises are true but the conclusion false. An inductive argument
can be neither valid nor invalid; its premises give only some degree of probability, but not
certainty, to its conclusion.

The notion of deductive validity can be rigorously stated for systems of formal logic in terms of
the well-understood notions of semantics. Inductive validity on the other hand requires us to define
a reliable generalization of some set of observations. The task of providing this definition may be
approached in various ways, some less formal than others; some of these definitions may use
mathematical models of probability. For the most part this discussion of logic deals only with
deductive logic.

Q. 4 Define the terms: consistency; validity; soundness; completeness.

Among the important properties that logical systems can have:

 Consistency, which means that no theorem of the system contradicts another.


 Validity, which means that the system's rules of proof will never allow a false inference from
true premises.
 A logical system has the property of soundness when the logical system has the property of
validity and only uses premises that prove true (or, in the case of axioms, are true by definition).
 Completeness, which means that if a formula is true, it can be proven (if it is true, it is a
theorem of the system).
 Soundness, the term soundness has multiple separate meanings, which creates a bit of
confusion throughout the literature. Most commonly, soundness refers to logical systems,
which means that if some formula can be proven in a system, then it is true in the relevant
model/structure (if A is a theorem, it is true). This is the converse of completeness. A distinct,
peripheral use of soundness refers to arguments, which means that the premises of a valid
argument are true in the actual world.

Some logical systems do not have all four properties. As an example, Kurt Gödel's incompleteness
theorems show that sufficiently complex formal systems of arithmetic cannot be both consistent and
complete; however, first-order predicate logics not extended by specific axioms to be arithmetic
formal systems with equality can be both complete and consistent.

Q. 5 (a) Give your understanding of non-parametric or distribution free methods explaining


their important characteristics.

Nonparametric, or distribution-free, tests are so called because the assumptions underlying their
use are 'fewer and weaker than those associated with parametric tests' (Siegel & Castellan, 1988,
p. 34). To put it another way, nonparametric tests require few if any assumptions about the shapes
of the underlying population distributions. For this reason, they are often used in place of
parametric tests if/when one feels that the assumptions of the parametric test have been too grossly
violated (e.g., if the distributions are too severely skewed).

Basically, there is at least one nonparametric equivalent for each parametric general type of test.
In general, these tests fall into the following categories:

 Tests of differences between groups (independent samples);


 Tests of differences between variables (dependent samples);
 Tests of relationships between variables.

(b) Narrate the various advantages of using non-parametric tests. Also point out their
limitations.
Advantages of nonparametric tests
Siegel and Castellan (1988, p. 35) list the following advantages of nonparametric tests:

1. If the sample size is very small, there may be no alternative to using a nonparametric statistical
test unless the nature of the population distribution is known exactly.

2. Nonparametric tests typically make fewer assumptions about the data and may be more relevant
to a particular situation. In addition, the hypothesis tested by the nonparametric test may be more
appropriate for the research investigation.

3. Nonparametric tests are available to analyze data which are inherently in ranks as well as data
whose seemingly numerical scores have the strength of ranks.

That is, the researcher may only be able to say of his or her subjects that one has more or less of
the characteristic than another, without being able to say how much more or less. For example, in
studying such a variable as anxiety, we may be able to state that subject A is more anxious than
subject B without knowing at all exactly how much more anxious A is. If data are inherently in
ranks, or even if they can be categorized only as plus or minus (more or less, better or worse), they
can be treated by nonparametric methods, whereas they cannot be treated by parametric methods
unless precarious and, perhaps, unrealistic assumptions are made about the underlying
distributions.

4. Nonparametric methods are available to treat data which are simply classificatory or categorical,
i.e., are measured in a nominal scale. No parametric technique applies to such data.

5. There are suitable nonparametric statistical tests for treating samples made up of observations
from several different populations. Parametric tests often cannot handle such data without
requiring us to make seemingly unrealistic assumptions or requiring cumbersome computations.

6. Nonparametric statistical tests are typically much easier to learn and to apply than are parametric
tests. In addition, their interpretation often is more direct than the interpretation of parametric tests.
Q. 6 Briefly describe the different non-parametric tests explaining the significance of each
such test.

The following are three types of commonly used nonparametric correlation coefficients (Spearman
R, Kendall Tau, and Gamma coefficients). Note that the chi-square statistic computed for two-way
frequency tables, also provides a careful measure of a relation between the two (tabulated)
variables, and unlike the correlation measures listed below, it can be used for variables that are
measured on a simple nominal scale.

Spearman R. Spearman R (Siegel & Castellan, 1988) assumes that the variables under
consideration were measured on at least an ordinal (rank order) scale, that is, that the individual
observations can be ranked into two ordered series. Spearman R can be thought of as the regular
Pearson product moment correlation coefficient, that is, in terms of proportion of variability
accounted for, except that Spearman R is computed from ranks.

Kendall tau is equivalent to Spearman R with regard to the underlying assumptions. It is also
comparable in terms of its statistical power. However, Spearman R and Kendall tau are usually not
identical in magnitude because their underlying logic as well as their computational formulas are
very different. Siegel and Castellan

http://www.aiaccess.net/Symboles_Maths/c_different.gif(1988) express the relationship of the


two measures in terms of the inequality: More importantly, Kendall tau and Spearman R imply
different interpretations: Spearman R can be thought of as the regular Pearson product moment
correlation coefficient, that is, in terms of proportion of variability accounted for, except that
Spearman R is computed from ranks. Kendall tau, on the other hand, represents a probability, that
is, it is the difference between the probabilities that in the observed data the two variables are in
the same order versus the probability that the two variables are in different orders.

The Gamma statistic (Siegel & Castellan, 1988) is preferable to Spearman R or Kendall tau when
the data contain many tied observations. In terms of the underlying assumptions, Gamma is
equivalent to Spearman R or Kendall tau; in terms of its interpretation and computation it is more
similar to Kendall tau than Spearman R. In short, Gamma is also a probability; specifically, it is
computed as the difference between the probability that the rank ordering of the two variables
agree minus the probability that they disagree, divided by 1 minus the probability of ties. Thus,
Gamma is basically equivalent to Kendall tau, except that ties are explicitly taken into account.
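For illustration, the short Python sketch below (scipy; the paired rankings are invented) computes Spearman R and Kendall tau on the same data, showing that the two coefficients generally differ in magnitude:

from scipy import stats

# Invented paired observations, e.g. two judges ranking ten items
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]

rho, p_rho = stats.spearmanr(x, y)
tau, p_tau = stats.kendalltau(x, y)
print(f"Spearman R  = {rho:.3f} (p = {p_rho:.3f})")
print(f"Kendall tau = {tau:.3f} (p = {p_tau:.3f})")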

Q. 7 Under what circumstances is the Fisher-Irwin test used? Explain. What is the main
limitation of this test?

Fisher's exact test is a statistical significance test used in the analysis of contingency tables.
Although in practice it is employed when sample sizes are small, it is valid for all sample sizes. It
is named after its inventor, R. A. Fisher, and is one of a class of exact tests, so called because the
significance of the deviation from a null hypothesis can be calculated exactly, rather than relying
on an approximation that becomes exact in the limit as the sample size grows to infinity, as with
many statistical tests. Fisher's exact test is a statistical test used to determine if there are nonrandom
associations between two categorical variables.

We have two Bernoulli populations with respectively parameters p1 and p2. The Fisher-Irwin test
is about the hypothesis according to which these two parameters have the same value. More
specifically, it tests:

 The null hypothesis H0: p1 = p2,


 Against the alternative hypothesis H1: p1 ≠ p2.

This can be illustrated by the following example. We have two coins :

 C1 that generates "Heads" with probability p1.


 C2 that generates "Heads" with probability p2.

The null hypothesis H0 states that both coins are in fact identical. To test this hypothesis:
 C1 is tossed n1 times, thus generating x1 "Heads",
 C2 is tossed n2 times, thus generating x2 "Heads",

The question is: "Are the observed values x1 and x2 inconsistent with the hypothesis that the two
coins are identical ?". Note that it is not asked to estimate the probabilities p1 and p2, but only to
assess whether it is likely that these two probabilities are equal.

The cornerstone of the Fisher-Irwin test is the following result :

Let X and Y be two independent binomial random variables, with the same p but possibly different
sizes m and n. Choose an integer k, and consider the distribution of X under the condition X + Y
= k. This distribution is hypergeometric, and does not depend on p.

An example of use of the Fisher-Irwin test is in quality control. C1 and C2 are then machines that
manufacture the same part, and that are supposed to be identical. The proportions p1 and p2 of
defective parts made by these two machines should then be equal. But if one of the machines
drifted off optimal tuning, these proportions will become different. To test whether both machines
have the same (hopefully optimal) tuning, one control sample is collected on each of the machines,
and all the parts in these control samples are then tested individually. The Fisher-Irwin test then
answers the question "Are the numbers x1 and x2 of defective parts in the control samples
consistent with the hypothesis that both machines are equally well tuned ?"
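A small Python sketch of such a comparison (scipy's fisher_exact; the defect counts are hypothetical) might look as follows:

from scipy import stats

# Hypothetical control samples: machine 1 has 3 defective out of 50 parts,
# machine 2 has 9 defective out of 50 parts
table = [[3, 47],
         [9, 41]]
odds_ratio, p_value = stats.fisher_exact(table)
print(f"p = {p_value:.4f}")  # a small p suggests p1 and p2 are not equal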

LIMITATIONS:

The %RUN_FISHERS macro can only be used for a single table request. Specifying multiple rows
and/or multiple columns (corresponding to a multiple table request such as (a b)*(c d) in the
TABLE statement of
PROC FREQ) is not supported.

No error checking is done. The macro assumes that the specified data set and variables exist and
that their names are correctly specified.
Q. 8 (a) What is the purpose of rank sum tests? Discuss the Wilcoxon-Mann-Whitney test and

the Kruskal-Wallis test.

The t-test is the standard test for testing that the population means of two non-paired samples are
equal. If the populations are non-normal, particularly for small samples, then the t-test may not be
valid. The rank sum test is an alternative that can be applied when distributional assumptions are
suspect.
However, it is not as powerful as the t-test when the distributional assumptions are in fact valid.

Purpose:

Perform a two sample rank sum test.

In statistics, the Mann–Whitney U test (also called the Mann–Whitney–Wilcoxon (MWW) or


Wilcoxon rank-sum test) is a non-parametric statistical hypothesis test for assessing whether one
of two samples of independent observations tends to have larger values than the other. It is one of
the most well-known non-parametric significance tests. It was proposed initially by the German
Gustav Deuchler in 1914 (with a missing term in the variance) and later independently by Frank
Wilcoxon in 1945, for equal sample sizes, and extended to arbitrary sample sizes and in other
ways by Henry Mann and his student Donald Ransom Whitney in 1947.

The rank sum test is also commonly called the Mann-Whitney rank sum test or simply the Mann-
Whitney test.
Note that even though this test is commonly called the Mann-Whitney test, it was in fact developed
by Wilcoxon.

Although Mann and Whitney developed the MWW test under the assumption of continuous
responses with the alternative hypothesis being that one distribution is stochastically greater than
the other, there are many other ways to formulate the null and alternative hypotheses such that the
MWW test will give a valid test.
A very general formulation is to assume that:

1. All the observations from both groups are independent of each other,
2. The responses are ordinal (i.e. one can at least say, of any two observations, which is the greater),
3. Under the null hypothesis the distributions of both groups are equal, so that the probability of
an observation from one population (X) exceeding an observation from the second population (Y)
equals the probability of an observation from Y exceeding an observation from X, that is, there is
a symmetry between populations with respect to probability of random drawing of a larger
observation.
4. Under the alternative hypothesis the probability of an observation from one population (X)
exceeding an observation from the second population (Y) (after exclusion of ties) is not equal to
0.5. The alternative may also be stated in terms of a one-sided test, for example: P(X > Y) + 0.5
P(X = Y) > 0.5.

Under more strict assumptions than those above, e.g., if the responses are assumed to be continuous
and the alternative is restricted to a shift in location (i.e. F1(x) = F2(x + d)), we can interpret a
significant MWW test as showing a difference in medians. Under this location shift assumption,
we can also interpret the MWW as assessing whether the Hodges–Lehmann estimate of the
difference in central tendency between the two populations differs from zero. The Hodges–
Lehmann estimate for this two-sample problem is the median of all possible differences between
an observation in the first sample and an observation in the second sample.
The Kruskal-Wallis test (H-test) is an extension of the Wilcoxon test and can be used to test the
hypothesis that a number of unpaired samples originate from the same population. In MedCalc,
Factor codes are used to break-up the (ordinal) data in one variable into different sample
subgroups. If the null-hypothesis, being the hypothesis that the samples originate from the same
population, is rejected (P<0.05), then the conclusion is that there is a statistically significant
difference between at least two of the subgroups.
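A brief Python sketch (scipy; invented measurements) running both tests:

from scipy import stats

# Invented measurements for three unpaired samples
group1 = [14, 17, 19, 21, 16, 18]
group2 = [22, 25, 20, 27, 24, 23]
group3 = [15, 16, 18, 17, 19, 20]

# Wilcoxon-Mann-Whitney rank sum test for two samples
u_stat, p_u = stats.mannwhitneyu(group1, group2, alternative="two-sided")
# Kruskal-Wallis H-test for more than two samples
h_stat, p_h = stats.kruskal(group1, group2, group3)
print(f"Mann-Whitney U = {u_stat}, p = {p_u:.4f}")
print(f"Kruskal-Wallis H = {h_stat:.3f}, p = {p_h:.4f}")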

(b) Explain briefly the Spearman's rank correlation and Kendall's coefficient of
concordance.

In statistics, Spearman's rank correlation coefficient or Spearman's rho, named after Charles
Spearman and often denoted by the Greek letter ρ (rho) or as r_s, is a non-parametric measure of
statistical dependence between two variables. It assesses how well the relationship between two
variables can be described using a monotonic function. If there are no repeated data values, a
perfect Spearman correlation of +1 or -1 occurs when each of the variables is a perfect monotone
function of the other.

Spearman's coefficient can be used when both the dependent (outcome; response) variable and the
independent (predictor) variable are ordinal numeric, or when one variable is ordinal numeric and
the other is a continuous variable. However, it can also be appropriate to use Spearman's
correlation when both variables are continuous.

Kendall's W (also known as Kendall's coefficient of concordance) is a non-parametric statistic. It


is a normalization of the statistic of the Friedman test, and can be used for assessing agreement
among raters. Kendall's W ranges from 0 (no agreement) to 1 (complete agreement).

Suppose, for instance, that a number of people have been asked to rank a list of political concerns,
from most important to least important. Kendall's W can be calculated from these data. If the test
statistic W is 1, then all the survey respondents have been unanimous, and each respondent has
assigned the same order to the list of concerns. If W is 0, then there is no overall trend of agreement
among the respondents, and their responses may be regarded as essentially random. Intermediate
values of W indicate a greater or lesser degree of unanimity among the various responses.
While tests using the standard Pearson correlation coefficient assume normally distributed values
and compare two sequences of outcomes at a time, Kendall's W makes no assumptions regarding
the nature of the probability distribution and can handle any number of distinct outcomes.
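As a small worked sketch (Python with numpy; the rankings are invented), Kendall's W for untied ranks can be computed as W = 12S / (m^2(n^3 - n)), where m is the number of raters, n the number of items ranked, and S the sum of squared deviations of the rank totals from their mean:

import numpy as np

# Invented rankings: 3 raters each rank the same 5 items (1 = most important)
ranks = np.array([[1, 2, 3, 4, 5],
                  [2, 1, 3, 5, 4],
                  [1, 3, 2, 4, 5]])
m, n = ranks.shape                       # m raters, n items
rank_sums = ranks.sum(axis=0)            # total rank given to each item
s = ((rank_sums - rank_sums.mean()) ** 2).sum()
w = 12 * s / (m ** 2 * (n ** 3 - n))     # Kendall's W for untied ranks
print(f"Kendall's W = {w:.3f}")          # 0 = no agreement, 1 = complete agreement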

Q. 9 (a) What do we mean by multivariate techniques? Name the important multivariate


techniques and explain the important characteristic of each one of such techniques.

Multivariate analysis (MVA) is based on the statistical principle of multivariate statistics, which
involves observation and analysis of more than one statistical variable at a time. In design and
analysis, the technique is used to perform trade studies across multiple dimensions while taking
into account the effects of all variables on the responses of interest.

Uses for multivariate analysis include:

 Design for capability (also known as capability-based design)


 Inverse design, where any variable can be treated as an independent variable
 Analysis of Alternatives (AoA), the selection of concepts to fulfill a customer need
 Analysis of concepts with respect to changing scenarios
 Identification of critical design drivers and correlations across hierarchical levels.

Multivariate analysis can be complicated by the desire to include physics-based analysis to


calculate the effects of variables for a hierarchical "system-of-systems." Often, studies that wish
to use multivariate analysis are stalled by the dimensionality of the problem. These concerns are
often eased through the use of surrogate models, highly accurate approximations of the physics-
based code. Since surrogate models take the form of an equation, they can be evaluated very
quickly. This becomes an enabler for large-scale MVA studies: while a Monte Carlo simulation
across the design space is difficult with physics-based codes, it becomes trivial when evaluating
surrogate models, which often take the form of response surface equations.
There are many different models, each with its own type of analysis:

1. Multivariate analysis of variance (MANOVA) extends the analysis of variance to cover cases
where there is more than one dependent variable to be analyzed simultaneously: see also
MANCOVA.
2. Multivariate regression analysis attempts to determine a formula that can describe how elements
in a vector of variables respond simultaneously to changes in others. For linear relations, regression
analyses here are based on forms of the general linear model.
3. Principal components analysis (PCA) creates a new set of orthogonal variables that contain the
same information as the original set. It rotates the axes of variation to give a new set of orthogonal
axes, ordered so that they summarize decreasing proportions of the variation.
4. Factor analysis is similar to PCA but allows the user to extract a specified number of synthetic
variables, fewer than the original set, leaving the remaining unexplained variation as error. The
extracted variables are known as latent variables or factors; each one may be supposed to account
for covariation in a group of observed variables.
5. Canonical correlation analysis finds linear relationships among two sets of variables; it is the
generalized (i.e. canonical) version of bivariate correlation.
6. Redundancy analysis is similar to canonical correlation analysis but allows the user to derive a
specified number of synthetic variables from one set of (independent) variables that explain as
much variance as possible in another (dependent) set. It is a multivariate analogue of regression.
7. Correspondence analysis (CA), or reciprocal averaging, finds (like PCA) a set of synthetic
variables that summarize the original set. The underlying model assumes chi-squared
dissimilarities among records (cases). There is also canonical (or "constrained") correspondence
analysis (CCA) for summarizing the joint variation in two sets of variables (like canonical
correlation analysis).
8. Multidimensional scaling comprises various algorithms to determine a set of synthetic variables
that best represent the pairwise distances between records. The original method is principal
coordinates analysis (based on PCA).
9. Discriminant analysis, or canonical variate analysis, attempts to establish whether a set of
variables can be used to distinguish between two or more groups of cases.
10. Linear discriminant analysis (LDA) computes a linear predictor from two sets of normally
distributed data to allow for classification of new observations.
11. Clustering systems assign objects into groups (called clusters) so that objects (cases) from the
same cluster are more similar to each other than objects from different clusters.
12. Recursive partitioning creates a decision tree that attempts to correctly classify members of the
population based on a dichotomous dependent variable.
13. Artificial neural networks extend regression and clustering methods to non-linear multivariate
models.
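As one concrete instance from this list, the sketch below (Python with numpy; the data matrix is made up) carries out a principal components analysis by eigen-decomposition of the covariance matrix, producing orthogonal components ordered by the proportion of variation they summarize:

import numpy as np

# Made-up data: 6 cases measured on 3 correlated variables
X = np.array([[2.5, 2.4, 1.2],
              [0.5, 0.7, 0.3],
              [2.2, 2.9, 1.5],
              [1.9, 2.2, 1.1],
              [3.1, 3.0, 1.6],
              [2.3, 2.7, 1.3]])

Xc = X - X.mean(axis=0)                      # centre each variable
cov = np.cov(Xc, rowvar=False)               # covariance matrix of the variables
eigvals, eigvecs = np.linalg.eigh(cov)       # eigen-decomposition (ascending order)
order = np.argsort(eigvals)[::-1]            # sort components by variance explained
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = eigvals / eigvals.sum()
scores = Xc @ eigvecs                        # new orthogonal variables (component scores)
print("Proportion of variance explained:", np.round(explained, 3))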

(b) Enumerate the steps involved in Thurstone's centroid method of factor analysis.

This method of factor analysis, developed by L.L. Thurstone, was quite frequently used until about
1950 before the advent of large capacity high speed computers. The centroid method tends to
maximize the sum of loadings, disregarding signs; it is the method which extracts the largest sum
of absolute loadings for each factor in turn. It is defined by linear combinations in which all weights
are either + 1.0 or – 1.0. The main merit of this method is that it is relatively simple, can be easily
understood and involves simpler computations. If one understands this method, it becomes easy to
understand the mechanics involved in other methods of factor analysis.

Various steps involved in this method are as follows:

(i) This method starts with the computation of a matrix of correlations, R, wherein unities are placed
in the diagonal spaces. The product moment formula is used for working out the correlation
coefficients.

(ii) If the correlation matrix so obtained happens to be a positive manifold (i.e., disregarding the
diagonal elements, each variable has a larger sum of positive correlations than of negative
correlations), the centroid method requires that the weights for all variables be +1.0. In other
words, the variables are not weighted; they are simply summed. But in case the correlation matrix
is not a positive manifold, then reflections must be made before the first centroid factor is obtained.
(iii) The first centroid factor is determined as under:

(a) The sum of the coefficients (including the diagonal unity) in each column of the correlation
matrix is worked out.

(b) Then the sum of these column sums (T) is obtained.

(c) The sum of each column obtained as per (a) above is divided by the square root of T obtained
in (b) above, resulting in what are called centroid loadings. This way each centroid loading (one
loading for one variable) is computed. The full set of loadings so obtained constitute the first
centroid factor (say A).

(iv) To obtain the second centroid factor (say B), one must first obtain a matrix of residual coefficients.
For this purpose, the loadings for the two variables on the first centroid factor are multiplied. This
is done for all possible pairs of variables (in each diagonal space is the square of the particular
factor loading). The resulting matrix of factor cross products may be named Q1. Then Q1 is
subtracted element by element from the original matrix of correlation, R, and the result is the first
matrix of residual coefficients, R1. After obtaining R1, one must reflect some of the variables in
it, meaning thereby that some of the variables are given negative signs in the sum. This is usually
done by inspection.

(v) For subsequent factors (C, D, etc.) the same process outlined above is repeated. After the
second centroid factor is obtained, cross products are computed forming, matrix, Q2. This is then
subtracted from R1 (and not from R'1) resulting in R2. To obtain a third factor (C), one should
operate on R2 in the same way as on R1.
First, some of the variables would have to be reflected to maximize the sum of loadings, which
would produce R'2. Loadings would be computed from R'2 as they were from R'1. Again, it would
be necessary to give negative signs to the loadings of variables which were reflected which would
result in third centroid factor (C).
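A minimal numeric sketch of steps (iii) and (iv) is given below (Python with numpy; the 3 x 3 correlation matrix is invented and is assumed to be a positive manifold, so no reflection is needed):

import numpy as np

# Invented positive-manifold correlation matrix R (unities in the diagonal)
R = np.array([[1.00, 0.60, 0.50],
              [0.60, 1.00, 0.40],
              [0.50, 0.40, 1.00]])

col_sums = R.sum(axis=0)            # (a) sum of each column, diagonal included
T = col_sums.sum()                  # (b) sum of the column sums
loadings_A = col_sums / np.sqrt(T)  # (c) first centroid factor loadings

# Step (iv): first matrix of residual coefficients R1 = R - Q1
Q1 = np.outer(loadings_A, loadings_A)
R1 = R - Q1
print("First centroid factor:", np.round(loadings_A, 3))
print("Residual matrix R1:\n", np.round(R1, 3))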
Q. 10 Write short notes on:

Cluster analysis;

The term cluster analysis (first used by Tryon, 1939) encompasses a number of different algorithms
and methods for grouping objects of similar kind into respective categories. A general question
facing researchers in many areas of inquiry is how to organize observed data into meaningful
structures, that is, to develop taxonomies. In other words cluster analysis is an exploratory data
analysis tool which aims at sorting different objects into groups in a way that the degree of
association between two objects is maximal if they belong to the same group and minimal
otherwise. Given the above, cluster analysis can be used to discover structures in data without
providing an explanation/interpretation. In other words, cluster analysis simply discovers
structures in data without explaining why they exist.
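As a small illustration (not part of the original note), the sketch below groups a few made-up two-dimensional observations by hierarchical (agglomerative) clustering with scipy; the choice of two clusters is arbitrary:

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Made-up two-dimensional observations forming two loose groups
points = np.array([[1.0, 1.1], [1.2, 0.9], [0.8, 1.0],
                   [5.0, 5.2], [5.1, 4.9], [4.8, 5.1]])

# Agglomerative clustering on Euclidean distances
tree = linkage(points, method="average")
labels = fcluster(tree, t=2, criterion="maxclust")  # cut the tree into 2 clusters
print(labels)   # objects in the same cluster share a label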

Reflections in context of factor analysis;

Overview: Factor analysis is used to uncover the latent structure (dimensions) of a set of variables.
It reduces attribute space from a larger number of variables to a smaller number of factors. Factor
analysis originated a century ago with Charles Spearman's attempts to show that a wide variety of
mental tests could be explained by a single underlying intelligence factor.

Applications:

 To reduce a large number of variables to a smaller number of factors for data modeling
 To validate a scale or index by demonstrating that its constituent items load on the same factor,
and to drop proposed scale items which cross-load on more than one factor.
 To select a subset of variables from a larger set, based on which original variables have the
highest correlations with the principal component factors.

 To create a set of factors to be treated as uncorrelated variables as one approach to handling


multi-collinearity in such procedures as multiple regression
 Factor analysis is part of the general linear model (GLM) family of procedures and makes
many of the same assumptions as multiple regression

Maximum likelihood method of factor analysis;

The ML method consists in obtaining sets of factor loadings successively in such a way that each,
in turn, explains as much as possible of the population correlation matrix as estimated from the
sample correlation matrix. If Rs stands for the correlation matrix actually obtained from the data
in a sample, Rp stands for the correlation matrix that would be obtained if the entire population
were tested, then the ML method seeks to extrapolate what is known from Rs in the best possible
way to estimate Rp (but the PC method only maximizes the variance explained in Rs). Thus, the
ML method is a statistical approach in which one maximizes some relationship between the sample
of data and the population from which the sample was drawn.

The arithmetic underlying the ML method is relatively difficult in comparison to that involved in
the PC method and as such is understandable when one has adequate grounding in calculus, higher
algebra and matrix algebra in particular. An iterative approach is employed in the ML method as
well to find each factor, but the iterative procedures have proved much more difficult than those
in the case of the PC method. Hence the ML method is generally not used for factor analysis in practice.

The loadings obtained on the first factor are employed in the usual way to obtain a matrix of the
residual coefficients. A significance test is then applied to indicate whether it would be reasonable
to extract a second factor. This goes on repeatedly in search of one factor after another. One stops
factoring after the significance test fails to reject the null hypothesis for the residual matrix. The
final product is a matrix of factor loadings.
The ML factor loadings can be interpreted in a similar fashion as we have explained in case of the
centroid or the PC method.
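In practice one would rely on library routines rather than hand computation; for example, scikit-learn's FactorAnalysis estimator fits the common factor model by maximum likelihood. A minimal sketch on simulated data (all values invented) is shown below:

import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
# Simulated data: two latent factors generating six observed variables plus noise
factors = rng.normal(size=(200, 2))
loadings = rng.normal(size=(2, 6))
X = factors @ loadings + 0.3 * rng.normal(size=(200, 6))

fa = FactorAnalysis(n_components=2)   # maximum likelihood fit of a 2-factor model
fa.fit(X)
print(np.round(fa.components_, 2))    # estimated factor loadings (factors x variables)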

Path analysis
The term 'path analysis' was first introduced by the biologist Sewall Wright in 1934 in connection
with decomposing the total correlation between any two variables in a causal system. The
technique of path analysis is based on a series of multiple regression analyses with the added
assumption of causal relationship between independent and dependent variables. This technique
lays relatively heavier emphasis on the heuristic use of a visual diagram, technically described as a
path diagram.

Path analysis makes use of standardized partial regression coefficients (known as beta weights) as
effect coefficients. Linear additive effects are assumed, and then through path analysis a simple
set of equations can be built up showing how each variable depends on preceding variables. The
main principle of path analysis is that any correlation coefficient between two variables, or a gross
or overall measure of empirical relationship, can be decomposed into a series of parts: separate
paths of influence leading through chronologically intermediate variables to which both the
correlated variables have links.
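A rough sketch of this principle (not the original author's worked example) is given below in Python: data are simulated for an assumed causal chain X -> M -> Y, the variables are standardized, and beta weights are estimated by least squares so that the total correlation between X and Y can be decomposed into a direct path and an indirect path through M:

import numpy as np

rng = np.random.default_rng(1)
# Simulated causal system: X influences M, and both X and M influence Y
x = rng.normal(size=300)
m = 0.6 * x + rng.normal(scale=0.8, size=300)
y = 0.4 * x + 0.5 * m + rng.normal(scale=0.7, size=300)

def standardize(v):
    return (v - v.mean()) / v.std()

xs, ms, ys = standardize(x), standardize(m), standardize(y)

# Path coefficient X -> M: simple standardized regression slope (equals the correlation)
p_mx = np.polyfit(xs, ms, 1)[0]
# Path coefficients X -> Y and M -> Y: standardized partial regression (beta) weights
A = np.column_stack([xs, ms])
p_yx, p_ym = np.linalg.lstsq(A, ys, rcond=None)[0]

print(f"X -> M: {p_mx:.2f},  X -> Y: {p_yx:.2f},  M -> Y: {p_ym:.2f}")
# Decomposition: total correlation r(X, Y) = direct path + indirect path via M
r_xy = np.corrcoef(xs, ys)[0, 1]
print(f"r(X,Y) = {r_xy:.2f}  vs  direct + indirect = {p_yx + p_mx * p_ym:.2f}")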
