
Phil 380: Scientific Methodology Overview

1. Conditional arguments

Modus ponens (valid argument form):
1. If A then B
2. A
Therefore, B

Modus tollens (valid argument form):
1. If A then B
2. not-B
Therefore, not-A

Affirming the consequent (invalid argument form):
1. If A then B
2. B
Therefore, A

Denying the antecedent (invalid argument form):
1. If A then B
2. not-A
Therefore, not-B

2. H-D confirmation vs. falsification

Let H = a hypothesis statement, and O = an observation statement.

1. If H then O
2. O
Therefore, H

If this hypothesis H is true, then we should expect to observe O.
O is in fact observed.
Therefore, H is true (or, H is confirmed).

This is the basic form of what is called hypothetico-deductive confirmation: you confirm a hypothesis H by observing one of its observational consequences O. (Note that, as a deduction, this has the invalid form of affirming the consequent.)

1. If H then O
2. not-O
Therefore, not-H

If this hypothesis H is true, then we should expect to observe O.
O is in fact NOT observed.
Therefore, H is NOT true (or, H is falsified).

This is called hypothetico-deductive falsification; you falsify a hypothesis H by failing to observe one of its observational consequences.
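The validity of these forms can be checked mechanically by enumerating truth values. Here is a minimal sketch in Python (illustrative, not part of the original notes; the helper names implies and valid are mine): a form is valid just in case no assignment of truth values makes all the premises true and the conclusion false.

from itertools import product

# "If A then B" is false only when A is true and B is false.
def implies(a: bool, b: bool) -> bool:
    return (not a) or b

# An argument form is valid iff no truth assignment makes every premise
# true while the conclusion is false.
def valid(premises, conclusion) -> bool:
    return all(conclusion(a, b)
               for a, b in product([True, False], repeat=2)
               if all(p(a, b) for p in premises))

# Modus ponens: If A then B; A; therefore B.
print(valid([implies, lambda a, b: a], lambda a, b: b))          # True (valid)
# Modus tollens (the form of H-D falsification): If A then B; not-B; therefore not-A.
print(valid([implies, lambda a, b: not b], lambda a, b: not a))  # True (valid)
# Affirming the consequent (the form of H-D confirmation): If A then B; B; therefore A.
print(valid([implies, lambda a, b: b], lambda a, b: a))          # False (invalid)
# Denying the antecedent: If A then B; not-A; therefore not-B.
print(valid([implies, lambda a, b: not a], lambda a, b: not b))  # False (invalid)

The check makes the asymmetry explicit: H-D falsification is deductively valid (modus tollens), while H-D confirmation is not (affirming the consequent).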

3. When H-D confirmation is valid

1. Either H1 or H2
2. If H1 then O
3. If H2 then not-O
4. O
Therefore, H1

The evidence O confirms H1 and falsifies H2, but since we assume that H1 and H2 are the only possible alternatives, H1 must be true.
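This form can be machine-checked the same way as the two-variable forms above; a small sketch (again illustrative, not from the notes), enumerating the three atoms H1, H2, O:

from itertools import product

# Check: (H1 or H2), (H1 -> O), (H2 -> not-O), O  |-  H1
# The form is valid iff no assignment makes all four premises true and H1 false.
counterexamples = [
    (h1, h2, o)
    for h1, h2, o in product([True, False], repeat=3)
    if (h1 or h2)                # premise 1
    and ((not h1) or o)          # premise 2: H1 -> O
    and ((not h2) or (not o))    # premise 3: H2 -> not-O
    and o                        # premise 4
    and not h1                   # conclusion fails
]
print(counterexamples)  # [] -- no counterexample, so the argument form is valid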

4. When evidence strongly confirms a hypothesis

Our question is: when does evidence O strongly confirm hypothesis H? The general answer is that evidence NEVER strongly confirms any hypothesis in isolation. All we can say is that evidence O can strongly confirm hypothesis H over some alternative hypothesis. In short, hypothesis testing is always COMPARATIVE.

O strongly confirms H1 over H2 if one can construct an argument of the following form:

1. If H1 then O
2. If H2 then not-O
3. O
Therefore, evidence O strongly confirms H1 over H2.

The first two premises capture the relevant conditions. Note what happens if we don't have these conditions:

1. If H1 then O
2. If H2 then O
3. O
Therefore, ?

That is, if the evidence O would be expected regardless of whether H1 or H2 is true, then the evidence cannot strongly support one over the other. The aim, then, is to find observations O that will effectively distinguish between all the reasonable alternative hypotheses.

Example:

1. If (Johnny is an Olympic weight lifter) then (Johnny can lift my cap)
2. If (Johnny is NOT an Olympic weight lifter) then (Johnny can lift my cap)
3. Johnny can lift my cap.
Therefore, ?

These two hypotheses both entail the same observation; hence, the observation does not distinguish between the two. Now consider:

1. If (Johnny is an Olympic weight lifter) then (Johnny can lift 800 lbs in the clean-and-jerk)
2. If (Johnny is NOT an Olympic weight lifter) then (Johnny CANNOT lift 800 lbs in the clean-and-jerk)
3. Johnny can lift 800 lbs in the clean-and-jerk.
Therefore, ?

Now our conditions are satisfied: observation O does strongly support the hypothesis that Johnny is an Olympic weight lifter over the hypothesis that he is NOT an Olympic weight lifter. Note also that in this case, by phrasing the alternative hypothesis as the NEGATION of the test hypothesis, we've built in the assumption that these hypotheses are mutually exclusive and exhaustive. In this case, confirmation is valid (because it employs falsification, plus the assumption that either H is true or not-H is true).

NOTE: the concept of a null model or null hypothesis in science is precisely the concept of constructing an alternative hypothesis to the test hypothesis, usually phrased as the negation of the test hypothesis. In the standard statistical hypothesis testing that you learn in stats and research methods classes, you are deciding whether or not the evidence warrants rejecting the null hypothesis (leaving the test hypothesis as the only remaining alternative). Thus, the comparative nature of hypothesis testing is quite explicit in standard scientific methodology.
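One standard way to make "O strongly confirms H1 over H2" quantitative (not spelled out in these notes, though consistent with them) is the likelihood ratio P(O given H1) / P(O given H2); the deterministic schema above is the limiting case where P(O given H1) = 1 and P(O given H2) = 0. A sketch with made-up numbers for the two Johnny arguments:

# Likelihood-ratio gloss on comparative confirmation (illustrative numbers only).
def likelihood_ratio(p_o_given_h1: float, p_o_given_h2: float) -> float:
    return p_o_given_h1 / p_o_given_h2

# Lifting the cap: near-certain under both hypotheses -- no discrimination.
print(likelihood_ratio(1.0, 1.0))      # 1.0: O does not favor H1 over H2

# Lifting 800 lbs: expected under H1, virtually impossible under H2.
print(likelihood_ratio(0.95, 0.0001))  # ~9500: O strongly favors H1 over H2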

Note, however, that alternative scientific hypotheses are not always framed in terms of the negation of another hypothesis.

Example:

1. If (the Copernican model is true) then (the planets will demonstrate retrograde motion)
2. If (the Ptolemaic model is true) then (the planets will demonstrate retrograde motion)
3. The planets demonstrate retrograde motion.
Therefore, ?

Here, the evidence doesn't strongly confirm the Copernican over the Ptolemaic model. However, the phases of Venus might:

1. If (the Copernican model is true) then (Venus will show phases).
2. If (the Ptolemaic model is true) then (Venus will NOT show phases).
3. Venus shows phases.
Therefore, the evidence strongly confirms the Copernican model over the Ptolemaic model.

But is the truth of H1 guaranteed in this case, as it was in the case of the weight lifter? NO. Why? Because here H2 (the Ptolemaic model) is NOT the negation of H1 (the Copernican model). The negation of H1 is (the Copernican model is NOT true), and there are infinitely many models that are not Copernican; the Ptolemaic model is only one of them. Hence, in cases like this, we can't even assign a likelihood to the truth of H1. All we can say is that H1 is much more probable than H2, given the evidence. But there may be other alternative hypotheses that are not falsified by the evidence (for example, the Keplerian model of the solar system, with elliptical orbits).

5. Relation to Statistical Hypothesis Testing (what you learn in those stats classes)

The logic of hypothesis testing is somewhat obscured in statistics classes by the formal machinery needed to deal adequately with the random variation, imprecision, and error that always accompany the measurement process. In fact, I'd say that most students in these classes don't have a good grasp of the logic of scientific testing at all (they're so immersed in calculating test statistics and variances that they forget what these are supposed to mean).

Typically, the evidence O is some data that exhibit a correlation of some kind among the measured variables (for example, the frequency distribution of the results of successive rolls of a die). We want to test different hypotheses that might explain the data against the evidence. The way it's taught in stats classes, hypothesis testing is a four-step process:

Step 1: Formulate all hypotheses

This amounts to distinguishing your test hypothesis H from the null hypothesis H0.

H: As it is usually phrased, the test hypothesis is a claim to the effect that there is a genuine statistical correlation between the actual, real-world variables, and the observations are the result of this real correlation (plus chance variation).
e.g. Claim: the die is loaded on the six side; hence, the six comes up more often than one would expect by chance.

H0: As it is usually phrased, the null hypothesis is the claim that the observations are the result of pure chance; hence, there is no genuine correlation among the real-world variables.
e.g. Claim: the die is evenly balanced; the high number of sixes is the result of random variation.

Step 2: Pick a Test Statistic

A test statistic is a formal tool for evaluating the evidence against the null hypothesis. It involves the construction of a probabilistic model for the null hypothesis (i.e., a model that will generate random data to be compared with the actual data), and an evaluation of the distance between the null data and the observed data.

Step 3: Calculate the P-Value

The p-value of a test statistic is an answer to the following question: if the null hypothesis were true, what is the probability of observing a test statistic at least as extreme as the one we observed? The smaller the p-value, the stronger the evidence against the null hypothesis.

Step 4: Compare the P-Value to a Fixed Significance Level

Significance levels are stipulated cut-off points below which we agree that an effect is statistically significant. If the p-value is below this threshold, then we rule out the null hypothesis (in our language, we treat the null hypothesis as falsified). What remains, then, is the test hypothesis (we say that the test hypothesis is substantiated, or in our language, confirmed).

NOTE: In scientific work, a fixed significance level of .05 or .01 is often used. A significance level of .05 means we accept that, when the null hypothesis is true, we will wrongly reject it 5 times out of 100. A significance level of .01 means we think this will happen only one time in a hundred. [This kind of error (rejecting the null hypothesis when it is true) is called a type I error in statistics. The other kind of error (failing to reject the null hypothesis when it is false) is called a type II error.] It's worth noting that the use of fixed significance levels is a holdover from the pre-computer era, when scientists had to refer to tables, which were printed only for selected critical values. Still, many scientific journals continue to publish results only when the p-value is less than or equal to .05.

Moral: The key point of this discussion is that, logically speaking, all this statistical mumbo jumbo is just a careful way of implementing the argument form described in section 3. (The die example is worked through in the sketch below.)
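For concreteness, here is a minimal sketch of the four steps for the loaded-die example, using only Python's standard library. The counts are hypothetical, chosen purely for illustration:

from math import comb

# Step 1: H -- the die is loaded on six; H0 -- the die is fair, P(six) = 1/6.
# Hypothetical data: 30 sixes observed in 100 rolls.
n, k, p0 = 100, 30, 1 / 6

# Step 2: test statistic -- the number of sixes, modeled under H0 as a
# Binomial(n, 1/6) random variable.

# Step 3: p-value -- the probability, under H0, of observing at least k sixes.
p_value = sum(comb(n, i) * p0**i * (1 - p0)**(n - i) for i in range(k, n + 1))
print(f"p-value = {p_value:.5f}")

# Step 4: compare to a fixed significance level.
alpha = 0.05
if p_value <= alpha:
    print("Reject H0: treat 'the die is fair' as falsified.")
else:
    print("Fail to reject H0: the evidence does not discriminate.")

With these made-up counts the p-value comes out far below .05, so the null hypothesis is treated as falsified, leaving the test hypothesis (the die is loaded) as the remaining alternative -- exactly the argument form of section 3.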
6. Auxiliary hypotheses

The picture we have given so far is still much too simple. We have assumed that we can infer or deduce an observational consequence O from a single test hypothesis H (if H then O). But in reality, scientific hypotheses, by themselves, often do not imply any observational consequences. We also need to assume some set of supporting or auxiliary hypotheses in order to derive a specific observation.

Example: Newton's law of gravity. Let H be the hypothesis that

Fg = G*m1*m2/d^2

This states that between any two massive bodies there is a force of attraction proportional to the product of their masses and inversely proportional to the square of the distance between them. This is our test hypothesis.

Question: does this hypothesis imply ANY specific observations, by itself? Answer: NO. To figure out, say, what gravitational force will be felt by a cannonball on my desk, you need to know a number of things: the mass of the Earth, the mass of the cannonball, and the distance between the center of mass of the Earth and the center of mass of the cannonball. These are the variables in the equation. Also, if the measurement is to support the hypothesis, you need to assume that there are no additional forces at work that would interfere with the measurement of the gravitational force (say, the presence of a giant magnet in the ceiling above the cannonball).

So the inference to a specific prediction for the measured force of gravity on the cannonball will look like this (a numerical version is sketched at the end of this section):

IF [(Newton's law of gravity is true) and (the mass of the Earth is __) and (the mass of the cannonball is __) and (the distance between their centers of mass is __) and (there are no confounding forces present)] THEN (the measured force on the cannonball should be __)

or, in short:

If (H and A1 and A2 and A3 and ... and An) then O

Now, note that confirmation and falsification act on this whole set of hypotheses. Thus, for H-D confirmation, we get:

1. If (H and A1 and A2 and A3 and ... and An) then O
2. O
Therefore, H and A1 and A2 and A3 and ... and An (are confirmed)

And for H-D falsification, we get:

1. If (H and A1 and A2 and A3 and ... and An) then O
2. not-O
Therefore, not-(H and A1 and A2 and A3 and ... and An)

What does this last line mean? It says that it is not the case that H is true, and A1 is true, and so on. But this just means that they can't all be true at the same time. It doesn't tell us that they're all false, only that at least one of them is false. Thus, it means

(not-H) or (not-A1) or (not-A2) or ... or (not-An)

but it doesn't tell us WHICH one is false. Hence, falsifying evidence won't allow us to validly reject H by itself. In principle, one can always argue that the fault lies with one of the auxiliary hypotheses, and so protect the test hypothesis from falsification. What scientists look for, then, are independent means of testing the auxiliary hypotheses. If you CAN'T do this, then falsification is WEAK.
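To make the role of the auxiliary hypotheses concrete, here is a sketch of the cannonball prediction, with illustrative stand-in values filled in for the blanks above (the cannonball's mass in particular is made up; the no-confounding-forces assumption is what licenses comparing the computed value to a measurement):

# H: Newton's law of gravity, F = G*m1*m2/d^2.
# The auxiliary hypotheses A1-A4 must be assumed before H yields any number.

G = 6.674e-11        # gravitational constant, N*m^2/kg^2
m_earth = 5.972e24   # A1: mass of the Earth, kg
m_ball = 5.0         # A2: mass of the cannonball, kg (illustrative)
d = 6.371e6          # A3: distance between centers of mass, m
                     #     (roughly the Earth's radius, for a ball on a desk)
# A4: no confounding forces (e.g., no giant magnet in the ceiling)

F = G * m_earth * m_ball / d**2
print(f"Predicted force on the cannonball: {F:.1f} N")  # about 49 N

If the measured force differs from the predicted ~49 N, modus tollens tells us only that the conjunction (H and A1 and A2 and A3 and A4) is false -- not which conjunct to blame.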
