ASSIGNMENTS
Set 2
SUBJECT NAME: RESEARCH METHODOLOGY
For instance, one might want to test the claim that a certain drug reduces the chance of
having a heart attack. One would choose the null hypothesis "this drug does not reduce
the chances of having a heart attack" (or perhaps "this drug has no effect on the chances
of having a heart attack"). One should then collect data by observing people both taking
the drug and not taking the drug in some sort of controlled experiment. If the data is very
unlikely under the null hypothesis one would reject the null hypothesis, and conclude that
its negation is true. That is, one would conclude that the drug does reduce the chances
of having a heart attack. Here "unlikely data" would mean data where the percentage of
people taking the drug who had heart attacks was much less than the percentage of
people not taking the drug who had heart attacks. Of course one should use a known
statistical test to decide how unlikely the data was and hence whether or not to reject the
null hypothesis.
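As a rough illustration of the "known statistical test" step, the sketch below runs a one-sided two-proportion z-test. The choice of test, the function name and the counts (15 heart attacks among 1000 people on the drug, 30 among 1000 controls) are all assumptions invented for this example; they are not data from any study.

```python
from math import sqrt
from statistics import NormalDist

def one_sided_two_proportion_test(events_drug, n_drug, events_control, n_control):
    """Return (z, p_value) for H0: the drug does not reduce the event rate."""
    p_drug = events_drug / n_drug
    p_control = events_control / n_control
    # Pooled proportion, used because H0 assumes both groups share one rate.
    pooled = (events_drug + events_control) / (n_drug + n_control)
    se = sqrt(pooled * (1 - pooled) * (1 / n_drug + 1 / n_control))
    z = (p_drug - p_control) / se
    # Lower-tail probability: small when the drug group has far fewer events.
    p_value = NormalDist().cdf(z)
    return z, p_value

# Hypothetical counts, for illustration only.
z, p_value = one_sided_two_proportion_test(15, 1000, 30, 1000)
reject_null = p_value < 0.05
print(f"z = {z:.2f}, p = {p_value:.4f}, reject Ho: {reject_null}")
```

With real data the significance level and the choice of test would have to be fixed before the experiment is run.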
Null hypothesis = Ho
Suppose we want to test the hypothesis that the population mean (µ) is equal to the
hypothesized mean (µ Ho) = 100. Then we would say that the null hypothesis is
that the population mean is equal to the hypothesized mean of 100, and symbolically
we can express this as Ho : µ = µ Ho = 100.
The results of exploratory research are not usually useful for decision-making by
themselves, but they can provide significant insight into a given situation. Although the
results of qualitative research can give some indication as to the "why", "how" and
"when" something occurs, it cannot tell us "how often" or "how many." Exploratory
research is not typically generalizable to the population at large.
Suppose, for example, that two experts, X and Y, were asked to rank N=8 items
with respect to some dimension germane to their field of expertise
(rank#1=highest, rank#8=lowest). To make it specific, you can imagine two
physicians ranking 8 patients with respect to the severity of their disease; two
psychotherapists ranking 8 patients with respect to the likelihood of improvement;
two wine experts ranking 8 wines from best to worst; two statisticians ranking 8
statistical concepts with respect to their fundamental importance; or whatever
else it might be that strikes your fancy.
wine X Y
a 1 2
b 2 1
c 3 5
d 4 3
e 5 4
f 6 7
g 7 8
h 8 6
As you can see from the table above, there is a substantial degree of
agreement between the rankings of the two experts. Plug the bivariate values of
X and Y into the formulaic structure given in the main body of Chapter 3,
and you will find r = +.83 and r² = .69.
As it happens, these are exactly the same values you will get when you calculate
the Spearman coefficient, rs. The simple reason for this is that r and rs are
algebraically equivalent in the case where the values of X and Y consist of two
sets of N rankings. The only advantage of rs is that the calculations are easier if
you are doing them by hand. [Note, however, that rs is precisely equal to r only
when the rankings within X and Y are the consecutive integer values: 1, 2, 3, and
so on, with no ties. With tied ranks there will tend to be discrepancies between rs
and r. If the proportion of tied ranks is fairly large, you would be better advised to
plug your rankings for X and Y into the standard formula for r.]
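The equivalence described above can be checked numerically. The sketch below computes the ordinary product-moment r directly on the two sets of wine rankings; the function name `pearson_r` is an illustrative choice, not something taken from the text.

```python
from math import sqrt

# Rankings of the eight wines by experts X and Y (from the table above).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 1, 5, 3, 4, 7, 8, 6]

def pearson_r(xs, ys):
    """Product-moment correlation computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)

r = pearson_r(x, y)
print(round(r, 2), round(r * r, 2))  # 0.83 0.69
```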
The Simple Formula for rs, for Rankings without Ties
Here is the same table you saw above, except now we also take the difference
between each pair of ranks (D = X - Y), and then the square of each difference.

wine  X  Y   D  D²
a     1  2  -1  1
b     2  1   1  1
c     3  5  -2  4
d     4  3   1  1
e     5  4   1  1
f     6  7  -1  1
g     7  8  -1  1
h     8  6   2  4
N = 8, ∑D² = 14

All that is required for the calculation of the Spearman coefficient are the
values of N and ∑D², according to the formula

\[ r_s = 1 - \frac{6\sum D^2}{N(N^2-1)} \]
If this formula seems a bit odd to you, you are in good company. Generations of
statistics students have been presented with it, and generations have puzzled
over such mind-bending questions as: why do you start out with "1" and subtract
something from it? where does that N(N²-1) in the denominator come from?
and, above all, how does that peculiar "6" get into the numerator?
• For any set of N paired bivariate ranks, the minimum possible value of ∑D²
occurs in the case of perfect positive correlation. In this case, rank 1 for X
is paired with rank 1 for Y, rank 2 for X with rank 2 for Y, and so on. Each
value of D will accordingly be equal to zero, and so too will be the sum of
the squared values of D.
• Conversely, the maximum possible value of ∑D² occurs in the case of
perfect negative correlation. This maximum possible value is in every
instance equal to

\[ \text{maximum } \sum D^2 = \frac{N(N^2-1)}{3} \]
Thus, for N=8 with perfect negative correlation:

item  X  Y   D  D²
a     1  8  -7  49
b     2  7  -5  25
c     3  6  -3  9
d     4  5  -1  1
e     5  4   1  1
f     6  3   3  9
g     7  2   5  25
h     8  1   7  49
∑D² = 168
8(8²-1)/3 = 168
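The claim about the maximum can be checked by brute force for small N. The sketch below compares the formula N(N²-1)/3 with the sum of squared differences for fully reversed ranks, and with the largest value found over every possible pairing of ranks. The function names are illustrative, and the exhaustive search is only practical for small N.

```python
from itertools import permutations

def sum_d2(x, y):
    """Sum of squared rank differences for paired rankings x and y."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def reversed_sum_d2(n):
    """Sum of D^2 when the second ranking is the exact reverse of the first."""
    ranks = list(range(1, n + 1))
    return sum_d2(ranks, ranks[::-1])

def brute_force_max_sum_d2(n):
    """Largest sum of D^2 over every permutation of the ranks 1..n."""
    ranks = list(range(1, n + 1))
    return max(sum_d2(ranks, perm) for perm in permutations(ranks))

for n in range(2, 8):
    formula = n * (n * n - 1) // 3
    assert reversed_sum_d2(n) == formula
    assert brute_force_max_sum_d2(n) == formula

print(reversed_sum_d2(8))  # 168, matching the table above
```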
• The ratio of the observed ∑D² to its maximum possible value will therefore
be equal to zero in the case of perfect positive correlation, to +1.0 in the case
of perfect negative correlation, and to +.50 in the case of zero correlation.

\[ \frac{\sum D^2}{N(N^2-1)/3} = \frac{3\sum D^2}{N(N^2-1)} \]

Double this ratio, subtract it from 1, and voila! you have a quantity that will
be equal to +1.0 in the case of perfect positive correlation, to -1.0 in the case
of perfect negative correlation, and to zero in the case of zero correlation.

\[ r_s = 1 - \frac{6\sum D^2}{N(N^2-1)} \]
• And here, finally, is the calculation of rs for the example with which we
began:
wine  X  Y   D  D²
a     1  2  -1  1
b     2  1   1  1
c     3  5  -2  4
d     4  3   1  1
e     5  4   1  1
f     6  7  -1  1
g     7  8  -1  1
h     8  6   2  4
N = 8, ∑D² = 14

\[ r_s = 1 - \frac{6\sum D^2}{N(N^2-1)} = 1 - \frac{6 \times 14}{8(8^2-1)} = +.83 \]

\[ r_s^2 = .69 \]
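The hand calculation can be reproduced in a few lines. This is a minimal sketch of the simple formula for rankings without ties; the function name `spearman_rs` is illustrative, and the function should not be used when ranks are tied.

```python
def spearman_rs(x, y):
    """Spearman coefficient via 1 - 6*sum(D^2) / (N(N^2 - 1)); assumes no ties."""
    n = len(x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))

# Rankings of the eight wines by experts X and Y.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 1, 5, 3, 4, 7, 8, 6]

rs = spearman_rs(x, y)
print(round(rs, 2), round(rs ** 2, 2))  # 0.83 0.69
```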
The meanings of rs and rs² in a rank-order correlation are essentially the same as
those of r and r² in a correlation based on equal-interval data. For the present
example, rs²=.69 means that the covariance between the X and Y rankings is
69% as strong as it possibly could be, and the positive sign of rs=+.83 signals
that this covariation occurs along the upward slant, with higher values of X
tending to be associated with higher values of Y, and vice versa. However,
I would not recommend taking the parallels much farther than this. In particular,
I think it would not make much sense to subject bivariate rankings to the
predictive apparatus of linear regression.
3. Reference Material
• Bibliography
• Appendix
• Copies of data collection instruments
• Technical details on sampling plan
• Complex tables
• Glossary of new terms used.
Mechanics of Writing:
It is better to divide the body of the presentation into two to five parts. The
audience will be able to absorb only so much information. If that information can
be aggregated into chunks, it will be easier to assimilate. Sometimes the points to
be made cannot be combined easily or naturally. In that case, it is necessary to use
a longer list. One way to structure the presentation is by the research questions.
Another method that is often useful when presenting the research proposal is to
base it on the research process. The most useful presentations will include a
statement of implications and recommendations relevant to the research
purpose. However, when research lacks information about the total situation
because the research study addresses only a limited aspect of it, the ability to
generate recommendations may be limited.
The research purpose and objective are good vehicles to provide motivation. The
research purpose should specify decisions to be made and should relate to the
research questions. A presentation that focuses on those research questions and
their associated hypotheses will naturally be tied to relevant decisions and hold
audience interest. In contrast, a presentation that attempts to report on all the
questions that were included in the survey and in the cross-tabulations often will
be long, uninteresting and of little value.
The presentation should include some indication of the reliability of the results. At
the minimum, it always should be clear what sample size was involved. The key
results should be supported by more precise information in the form of interval
estimates or a hypothesis test. The hypothesis test basically indicates, given the
sample size, what probability exists that the results were merely an accident of
sampling. If the probability of the latter is not low, then the results probably would
not be repeated. Do not imply more precision than is warranted.
Ans: Case study is a method of exploring and analyzing the life of a social unit or
entity, be it a person, a family, an institution or a community. A case study
depends upon the wit, common sense and imagination of the person doing it;
the investigator makes up his procedure as he goes along. Efforts should
be made to ascertain the reliability of life history data by examining the
internal consistency of the material. A judicious combination of techniques of
data collection is a prerequisite for securing data that are culturally meaningful
and scientifically significant. Case documents hardly fulfill the criteria
of reliability, adequacy and representativeness, but to exclude them from any
scientific study of human life would be a blunder, inasmuch as these documents are
necessary and significant both for theory building and practice. In-depth analysis
of selected cases is of particular value to business research when a complex set
of variables may be at work in generating observed results and intensive study is
needed to unravel the complexities.
Let us discuss the criteria for evaluating the adequacy of the case history
or life history which is of central importance for case study.
John Dollard has proposed seven criteria for evaluating such adequacy as
follows:
I. The subject must be viewed as a specimen in a cultural series. That is, the
case drawn out from its total context for the purposes of study must be
considered a member of the particular culture group or community. The
scrutiny of the life histories of persons must be done with a view to identifying
the community values, standards and their shared way of life.
II. The organic motors of action must be socially relevant. That is, the action of
the individual cases must be viewed as a series of reactions to social stimuli.
III. The strategic role of the family group in transmitting the culture must be
recognized. That is, in case of an individual being the member of a family,
the role of family in shaping his behaviour must never be overlooked.
VII. The life history material itself must be organized according to some
conceptual framework; this in turn would facilitate generalizations at a
higher level.
Q.No.4 - Give the importance of frequency tables and discuss the principles
of table construction, frequency distribution and class intervals
determination.
1. Every table should have a title. The title should represent a succinct
description of the contents of the table. It should be clear and concise. It
should be placed above the body of the table.
2. A number facilitating easy reference should identify every table. The
number can be centred above the title. The table numbers should run in
consecutive serial order. Alternatively, tables in chapter 1 may be numbered as
1.1, 1.2, 1.3…, in chapter 2 as 2.1, 2.2, 2.3… and so on.
3. The captions (or column headings) should be clear and brief.
In practice, all variables are treated as discrete units, the continuous variables
being stated in some discrete unit size according to the needs of a particular
situation. For example, length is described in units of millimeters or tenths of an
inch.
Class Intervals:
Ordinarily, the number of class intervals should be not less than 5 nor more than 15,
depending on the nature of the data and the number of cases being studied. After
noting the highest and lowest values and the features of the data, the number of
intervals can be easily determined.
For many types of data, it is desirable to have class intervals of uniform size. The
intervals should neither be too small nor too large. Whenever possible, the
intervals should represent common and convenient numerical divisions such as 5
or 10 rather than odd divisions such as 3 or 7. Class intervals must be clearly
designated in a frequency table in such a way as to obviate any possibility of
misinterpretation or confusion. For example, to present the age groups of a
population, the use of intervals of 1-20, 21-50 and 50 and above would be
confusing, because the value 50 falls in two classes. This may be presented as
1-20, 21-50, and above 50.
Every class interval has a mid point. For example, the midpoint of an interval 1-20
is 10.50 and the midpoint of class interval 1-25 would be 13. Once class intervals
are determined, it is routine work to count the number of cases that fall in each
interval.
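The counting step can be sketched as follows for whole-number data with uniform class intervals. The ages below are hypothetical values invented for illustration, and the helper name `frequency_table` is an assumption, not a standard routine.

```python
# Hypothetical ages, invented purely for illustration.
ages = [3, 12, 18, 21, 25, 27, 34, 38, 41, 45, 49, 52, 58, 63, 67]

def frequency_table(values, lower, width, n_classes):
    """Return (low, high, midpoint, count) rows for uniform class intervals."""
    rows = []
    for k in range(n_classes):
        low = lower + k * width
        high = low + width - 1  # inclusive upper limit for whole-number data
        count = sum(1 for v in values if low <= v <= high)
        rows.append((low, high, (low + high) / 2, count))
    return rows

# Seven classes of width 10 keep the number of intervals between 5 and 15.
table = frequency_table(ages, lower=1, width=10, n_classes=7)
for low, high, midpoint, count in table:
    print(f"{low:>2}-{high:<2}  midpoint {midpoint:>4.1f}  frequency {count}")
```

Because each class runs from `low` to `high` inclusive and the next class starts at `high + 1`, no value can fall in two classes.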
             Decision
             Accept Ho                 Reject Ho
Ho (true)    Correct decision          Type I error (α error)
Ho (false)   Type II error (β error)   Correct decision
But with a fixed sample size n, when we try to reduce type I error, the probability
of committing type II error increases. Both types of errors cannot be reduced
simultaneously. There is a trade-off: in business situations, decision-makers
decide the appropriate level of type I error by examining the costs or penalties
attached to both types of errors. If type I error involves the time and trouble of
reworking a batch of chemicals that should have been accepted, whereas type II
error means taking a chance that an entire group of users of this chemical
compound will be poisoned, then in such a situation one should prefer a type I
error to a type II error. As a result one must set a very high level for type I error
in one's testing technique for a given hypothesis. Hence, in testing of hypotheses,
one must make all possible effort to strike an adequate balance between Type I and
Type II errors.
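The trade-off can be made concrete with a small calculation. The sketch below assumes an upper-tailed z-test of Ho: µ = 100 with a known standard deviation, which is a simplification chosen for illustration; the true mean of 103, σ = 10 and n = 25 are hypothetical values, and `type_ii_error` is an illustrative name.

```python
from statistics import NormalDist

def type_ii_error(alpha, mu0, mu1, sigma, n):
    """Beta for an upper-tailed z-test of Ho: mu = mu0 when the true mean is mu1."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)   # rejection cut-off in z units
    shift = (mu1 - mu0) * n ** 0.5 / sigma   # distance of the true mean in z units
    return std_normal.cdf(z_crit - shift)    # P(fail to reject | mu = mu1)

# Hypothetical settings: true mean 103, sigma 10, fixed sample size 25.
betas = [type_ii_error(a, mu0=100, mu1=103, sigma=10, n=25)
         for a in (0.10, 0.05, 0.01)]
for alpha, beta in zip((0.10, 0.05, 0.01), betas):
    print(f"alpha = {alpha:.2f}  beta = {beta:.3f}")
```

With n held fixed, each reduction in α pushes the cut-off further out and so raises β, which is the pattern described above.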
TWO-TAILED t-TESTS
A two-tailed t-test divides α in half, placing half in each tail. The null
hypothesis in this case is a particular value, and there are two alternative
hypotheses, one positive and one negative. The critical value of t, tcrit, is written
with both a plus and minus sign (±). For example, the critical value of t when
there are ten degrees of freedom (df=10) and α is set to .05, is tcrit = ±2.228.
[Figure: sampling distribution model used in a two-tailed t-test, with α/2 in each tail]
ONE-TAILED t-TESTS
There are really two different one-tailed t-tests, one for each tail. In a one-tailed
t-test, all the area associated with α is placed in either one tail or the other.
Selection of the tail depends upon which direction tobs would be (+ or -) if the
results of the experiment came out as expected. The selection of the tail must be
made before the experiment is conducted and analyzed.
If the results are expected in the positive direction, the value tcrit would be
positive. For example, when α is set to .05 with ten degrees of freedom (df=10),
tcrit would be equal to +1.812.
If the results are expected in the negative direction, the value tcrit would be
negative. For example, when α is set to .05 with ten degrees of freedom (df=10),
tcrit would be equal to -1.812.
1. If tOBS = 3.37, then significance would be found in the two-tailed and the
positive one-tailed t-tests. The one-tailed t-test in the negative direction would not
be significant, because α was placed in the wrong tail. This is the danger of a
one-tailed t-test.
2. If tOBS = -1.92, then significance would only be found in the negative one-tailed
t-test. If the correct direction is selected, it can be seen that one is more likely to
reject the null hypothesis. The significance test is said to have greater power in
this case.
The selection of a one or two-tailed t-test must be made before the experiment is
performed. It is not "cricket" to find that tOBS = -1.92, and then say "I really
meant to do a one-tailed t-test." Because reviewers of articles submitted for
publication are sometimes suspicious when a one-tailed t-test is done, the
recommendation is that if there is any doubt, a two-tailed test should be done.
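The two worked cases can be summarized in a short sketch that compares an observed t with the critical values quoted above for df = 10 and α = .05. The function name and the dictionary keys are illustrative choices.

```python
# Critical values for df = 10 and alpha = .05, as quoted in the text.
T_CRIT_TWO_TAILED = 2.228
T_CRIT_ONE_TAILED = 1.812

def significance(t_obs):
    """Report which of the three tests would reject the null hypothesis."""
    return {
        "two-tailed": abs(t_obs) > T_CRIT_TWO_TAILED,
        "one-tailed positive": t_obs > T_CRIT_ONE_TAILED,
        "one-tailed negative": t_obs < -T_CRIT_ONE_TAILED,
    }

for t_obs in (3.37, -1.92):
    print(t_obs, significance(t_obs))
```

The sketch evaluates all three tests side by side only to show the comparison; in practice a single test is chosen before the data are collected.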
X(height-cm) 174 175 176 177 178 182 183 186 189 193
Y(weight-kg) 61 65 67 68 72 74 80 87 92 95
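\[ r = \frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{X_i-\bar{X}}{s_X}\right)\left(\frac{Y_i-\bar{Y}}{s_Y}\right) \]

where \( (X_i-\bar{X})/s_X \), \( \bar{X} \) and \( s_X \)
are the standard score, sample mean, and sample standard deviation.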
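Taking assumed means of 182 for X and 76 for Y, the deviations dx = X - 182 and
dy = Y - 76 are:

X      dx   dx²   Y    dy   dy²    dx·dy
174    -8   64    61   -15  225    120
175    -7   49    65   -11  121    77
176    -6   36    67   -9   81     54
177    -5   25    68   -8   64     40
178    -4   16    72   -4   16     16
182    0    0     74   -2   4      0
183    1    1     80   4    16     4
186    4    16    87   11   121    44
189    7    49    92   16   256    112
193    11   121   95   19   361    209
Total  -7   377        1    1265   676

\[ r = \frac{676/10 - (-7/10)(1/10)}{\sqrt{377/10 - (-7/10)^2}\,\sqrt{1265/10 - (1/10)^2}} \]

\[ = \frac{67.60 + 0.07}{\sqrt{37.70 - 0.49}\,\sqrt{126.50 - 0.01}} = \frac{67.67}{6.10 \times 11.247} = \frac{67.67}{68.6067} \approx 0.9863 \]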
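The same calculation can be sketched in code using deviations from assumed means, as in the hand computation. The assumed means 182 and 76 are read off the deviation columns, and the function name is illustrative. Because the code carries full precision, its result can differ from the hand calculation in the fourth decimal place, where the intermediate square roots were rounded.

```python
from math import sqrt

height = [174, 175, 176, 177, 178, 182, 183, 186, 189, 193]  # X, cm
weight = [61, 65, 67, 68, 72, 74, 80, 87, 92, 95]            # Y, kg

def pearson_assumed_mean(xs, ys, a, b):
    """Correlation from deviations about assumed means a (for X) and b (for Y)."""
    n = len(xs)
    dx = [x - a for x in xs]
    dy = [y - b for y in ys]
    mean_dx = sum(dx) / n
    mean_dy = sum(dy) / n
    cov = sum(p * q for p, q in zip(dx, dy)) / n - mean_dx * mean_dy
    var_x = sum(p * p for p in dx) / n - mean_dx ** 2
    var_y = sum(q * q for q in dy) / n - mean_dy ** 2
    return cov / sqrt(var_x * var_y)

r = pearson_assumed_mean(height, weight, a=182, b=76)
print(round(r, 3))  # 0.986
```

The choice of assumed means only simplifies the arithmetic; any other pair of values gives the same r.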
End