In probability theory and statistics, Bayes' theorem (alternatively Bayes' law or Bayes' rule) is
a theorem with two distinct interpretations. In the Bayesian interpretation, it expresses how a
subjective degree of belief should rationally change to account for evidence. In the frequentist
interpretation, it relates inverse representations of the probabilities concerning two events. In the
Bayesian interpretation, Bayes' theorem is fundamental to Bayesian statistics, and has
applications in fields including science, engineering, economics (particularly microeconomics),
game theory, medicine and law. The application of Bayes' theorem to update beliefs is called
Bayesian inference.
Bayes' theorem is named for Thomas Bayes (/beɪz/; 1701–1761), who first suggested using the
theorem to update beliefs. His work was significantly edited and updated by Richard Price before
it was posthumously read at the Royal Society. The ideas gained limited exposure until they were
independently rediscovered and further developed by Laplace, who first published the modern
formulation in his 1812 Théorie analytique des probabilités. Until the second half of the 20th
century, the Bayesian interpretation was largely rejected by the mathematics community as
unscientific[citation needed]. However, it is now widely accepted.
Introductory example

Suppose that half of the people one might converse with are women, that 75% of women have
long hair, and that 30% of men have long hair (so, for example, the probability of the long-hair
event given the woman event is 0.75).

Our goal is to calculate the probability that the conversation was held with a woman, given the
fact that the person had long hair, or, in our notation, P(W|L). Using Bayes'
theorem, we have:

P(W|L) = P(L|W) P(W) / P(L) = P(L|W) P(W) / (P(L|W) P(W) + P(L|M) P(M)),

where we have used the law of total probability. The numeric answer can be obtained by
substituting the above values into this formula. This yields

P(W|L) = 0.75 × 0.5 / (0.75 × 0.5 + 0.30 × 0.5) = 0.375 / 0.525 ≈ 0.714,

i.e., the probability that the conversation was held with a woman, given that the person had long
hair, is about 71%.
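The calculation above can be checked directly. The sketch below assumes equal prior probabilities for woman and man and the long-hair rates from the example; the variable names are illustrative.

```python
# Long-hair example: P(W|L) via Bayes' theorem and the law of total probability.
p_woman = 0.5              # prior P(W): equal chance of woman or man
p_man = 0.5                # prior P(M)
p_long_given_woman = 0.75  # P(L | W)
p_long_given_man = 0.30    # P(L | M)

# Law of total probability: P(L) = P(L|W) P(W) + P(L|M) P(M)
p_long = p_long_given_woman * p_woman + p_long_given_man * p_man

# Bayes' theorem: P(W|L) = P(L|W) P(W) / P(L)
p_woman_given_long = p_long_given_woman * p_woman / p_long
print(round(p_woman_given_long, 3))  # 0.714
```

The result reproduces the roughly 71% figure from the example.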
Statement and interpretation

Mathematically, Bayes' theorem relates the probabilities of two events A and B, P(A) and P(B),
to the conditional probabilities of A given B and of B given A, P(A|B) and P(B|A). In its most
common form, it states:

P(A|B) = P(B|A) P(A) / P(B).

The meaning of this statement depends on the interpretation of probability ascribed to the terms.

Bayesian interpretation

In the Bayesian interpretation, probability measures a degree of belief, and Bayes' theorem links
the degree of belief in a proposition before and after accounting for evidence. For proposition A
and evidence B, P(A) is the prior, the initial degree of belief in A; P(A|B) is the posterior, the
degree of belief having accounted for B; and the quotient P(B|A)/P(B) represents the support B
provides for A.
For more on the application of Bayes' theorem under the Bayesian interpretation of probability,
see Bayesian inference.
Frequentist interpretation

Illustration of the frequentist interpretation with tree diagrams. Bayes' theorem connects
conditional probabilities to their inverses.
In the frequentist interpretation, probability measures a proportion of outcomes. Bayes' theorem
under this interpretation is most easily visualized using tree diagrams, as shown to the right. The
two diagrams partition the same population in different ways. The upper diagram partitions the
population into two groups, with and without property A, and then partitions each group into
two further groups, with and without property B. The lower diagram proceeds in reverse, first
partitioning according to B and then according to A. (For example, A might be having a risk
factor for a medical condition, and B might be having the condition.) Bayes' theorem links the
two partitions: given the proportion of the population with property A and the proportions with
property B among those with and without A, it yields the proportion with A among those with B.
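The equivalence of the two partitions can be verified numerically. This is a minimal sketch with hypothetical proportions (0.3, 0.8, and 0.2 are illustrative, not from the text); it checks that both tree orderings describe the same joint proportion of the population.

```python
# Hypothetical population proportions (illustrative only).
p_A = 0.3              # proportion with property A
p_B_given_A = 0.8      # proportion with B among those with A
p_B_given_notA = 0.2   # proportion with B among those without A

# Upper diagram: partition by A first, then by B.
p_A_and_B = p_A * p_B_given_A

# Lower diagram: partition by B first, then by A.
p_B = p_A * p_B_given_A + (1 - p_A) * p_B_given_notA  # law of total probability
p_A_given_B = p_A * p_B_given_A / p_B                 # Bayes' theorem
p_B_and_A = p_B * p_A_given_B

# Both routes describe the same subgroup of the population.
assert abs(p_A_and_B - p_B_and_A) < 1e-12
print(round(p_A_given_B, 3))  # 0.632
```

With these numbers, 63.2% of the members with property B also have property A, even though only 30% of the whole population has A.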
Forms

For events

Simple form
For events A and B, provided that P(B) ≠ 0,

P(A|B) = P(B|A) P(A) / P(B).
In a Bayesian inference step, the probability of evidence P(B) is constant for all models being
considered. The posterior may then be expressed as proportional to the numerator:

P(A|B) ∝ P(B|A) P(A).

Extended form

Often, for some partition {Aj} of the event space, the event space is given or conceptualized in
terms of P(Aj) and P(B|Aj). It is then useful to eliminate P(B) using the law of total
probability:

P(A|B) = P(B|A) P(A) / Σj P(B|Aj) P(Aj).
As previously, the law of total probability may be used to substitute for unknown marginal
probabilities.
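The extended form over a partition can be written as a small helper. This is a sketch; the names `priors` and `likelihoods` are illustrative stand-ins for P(Aj) and P(B|Aj).

```python
def posterior(priors, likelihoods):
    """Extended form of Bayes' theorem over a partition {Aj}.

    priors[j]      = P(Aj), assumed to sum to 1
    likelihoods[j] = P(B | Aj)
    Returns the list of posteriors P(Aj | B).
    """
    # Law of total probability: P(B) = sum_j P(B|Aj) P(Aj)
    p_b = sum(p * l for p, l in zip(priors, likelihoods))
    return [p * l / p_b for p, l in zip(priors, likelihoods)]

# Two-event sanity check: recovers the simple form (long-hair example figures).
post = posterior([0.5, 0.5], [0.75, 0.30])
print([round(p, 3) for p in post])  # [0.714, 0.286]
```

Because P(B) only normalizes the result, the same function also illustrates why, in a Bayesian inference step, the posterior is proportional to the numerator alone.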
For random variables

Diagram illustrating the meaning of Bayes' theorem as applied to an event space generated by
continuous random variables X and Y. Note that there exists an instance of Bayes' theorem for
each point in the domain. In practice, these instances might be parametrized by writing the
specified probability densities as functions of x and y.

Consider a sample space Ω generated by two random variables X and Y. In principle, Bayes'
theorem applies to the events A = {X = x} and B = {Y = y}. However, terms become 0
at points where either variable has finite probability density. To remain useful, Bayes' theorem
may be formulated in terms of the relevant densities (see Derivation).
Simple form

If X is continuous and Y is discrete,

f_X(x | Y = y) = P(Y = y | X = x) f_X(x) / P(Y = y).

If X is discrete and Y is continuous,

P(X = x | Y = y) = f_Y(y | X = x) P(X = x) / f_Y(y).

If both X and Y are continuous,

f_X(x | Y = y) = f_Y(y | X = x) f_X(x) / f_Y(y).
Extended form

Diagram illustrating how an event space generated by continuous random variables X and Y is
often conceptualized.

A continuous event space is often conceptualized in terms of the numerator terms. It is then
useful to eliminate the denominator using the law of total probability. For f_Y(y), this becomes
an integral:

f_Y(y) = ∫ f_Y(y | X = ξ) f_X(ξ) dξ.

Bayes' rule

Bayes' rule is Bayes' theorem in odds form:

O(A1 : A2 | B) = O(A1 : A2) · Λ(A1 : A2 | B),

where

Λ(A1 : A2 | B) = P(B | A1) / P(B | A2)

is called the Bayes factor or likelihood ratio.
So the rule says that the posterior odds are the prior odds times the Bayes factor.
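The odds-form update is a single multiplication. The numbers below are illustrative, not from the text: prior odds of 1:4 for A1 against A2, and evidence three times as likely under A1 as under A2.

```python
def bayes_rule_odds(prior_odds, bayes_factor):
    """Bayes' rule in odds form: posterior odds = prior odds x Bayes factor."""
    return prior_odds * bayes_factor

prior_odds = 0.25      # O(A1:A2) = 1:4
bayes_factor = 3.0     # Lambda = P(B|A1) / P(B|A2)
posterior_odds = bayes_rule_odds(prior_odds, bayes_factor)
print(posterior_odds)  # 0.75
```

After seeing the evidence, the odds for A1 against A2 rise from 1:4 to 3:4.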
Derivation

For general events

Bayes' theorem may be derived from the definition of conditional probability:

P(A|B) = P(A ∩ B) / P(B), if P(B) ≠ 0,

and likewise

P(B|A) = P(A ∩ B) / P(A), if P(A) ≠ 0.

Equating the two expressions for P(A ∩ B) and dividing by P(B) gives the theorem:

P(A|B) = P(B|A) P(A) / P(B).
Examples

Frequentist example
Tree diagram illustrating frequentist example. R, C, P and P bar are the events representing rare,
common, pattern and no pattern. Percentages in parentheses are calculated. Note that three
independent values are given, so it is possible to calculate the inverse tree (see figure above).
An entomologist spots what might be a rare subspecies of beetle, due to the pattern on its back.
In the rare subspecies, 98% have the pattern. In the common subspecies, 5% have the pattern.
The rare subspecies accounts for only 0.1% of the population. How likely is the beetle to be rare?
From the extended form of Bayes' theorem,

P(Rare | Pattern) = P(Pattern | Rare) P(Rare) / [P(Pattern | Rare) P(Rare) + P(Pattern | Common) P(Common)]
= 0.98 × 0.001 / (0.98 × 0.001 + 0.05 × 0.999)
≈ 1.9%.
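A quick numerical check of the beetle example, using the figures given above:

```python
# Beetle example: extended form with the partition {rare, common}.
p_rare = 0.001                   # P(Rare): rare subspecies is 0.1% of population
p_common = 1 - p_rare            # P(Common)
p_pattern_given_rare = 0.98      # P(Pattern | Rare)
p_pattern_given_common = 0.05    # P(Pattern | Common)

# Law of total probability for P(Pattern).
p_pattern = (p_pattern_given_rare * p_rare
             + p_pattern_given_common * p_common)

# Bayes' theorem: P(Rare | Pattern).
p_rare_given_pattern = p_pattern_given_rare * p_rare / p_pattern
print(round(p_rare_given_pattern, 4))  # 0.0192, i.e. about 1.9%
```

Even though 98% of rare beetles carry the pattern, the rarity of the subspecies keeps the posterior probability low.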
Tree diagram illustrating the drug testing example. U, U bar, "+" and "−" are the events representing
user, non-user, positive result and negative result. Percentages in parentheses are calculated.
Suppose a drug test is 99% sensitive and 99% specific. That is, the test will produce 99% true
positive results for drug users and 99% true negative results for non-drug users. Suppose that
0.5% of people are users of the drug. If a randomly selected individual tests positive, what is the
probability he or she is a user? From the extended form of Bayes' theorem,

P(User | +) = P(+ | User) P(User) / [P(+ | User) P(User) + P(+ | Non-user) P(Non-user)]
= 0.99 × 0.005 / (0.99 × 0.005 + 0.01 × 0.995)
≈ 33.2%.

Despite the apparent accuracy of the test, if an individual tests positive, it is more likely that they
do not use the drug than that they do.
This surprising result arises because the number of non-users is very large compared to the
number of users, such that the number of false positives (0.995%) outweighs the number of true
positives (0.495%). To use concrete numbers, if 1000 individuals are tested, there are expected to
be 995 non-users and 5 users. From the 995 non-users, 0.01 × 995 ≈ 10 false positives are
expected. From the 5 users, 0.99 × 5 ≈ 5 true positives are expected. Out of 15 positive
results, only 5, about 33%, are genuine.
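The drug-test figures can be reproduced both ways: by Bayes' theorem directly and by the concrete 1000-person count.

```python
# Drug-test example: 99% sensitive, 99% specific, 0.5% prevalence.
p_user = 0.005       # P(User)
sensitivity = 0.99   # P(+ | User)
specificity = 0.99   # P(- | Non-user), so P(+ | Non-user) = 0.01

# Bayes' theorem with the law of total probability in the denominator.
p_positive = sensitivity * p_user + (1 - specificity) * (1 - p_user)
p_user_given_positive = sensitivity * p_user / p_positive
print(round(p_user_given_positive, 3))  # 0.332, i.e. about 33%

# Concrete numbers, as in the text: out of 1000 people tested,
users, non_users = 5, 995
true_pos = sensitivity * users              # about 5 true positives
false_pos = (1 - specificity) * non_users   # about 10 false positives
print(round(true_pos, 2), round(false_pos, 2))  # 4.95 9.95
```

The false positives from the large non-user group outnumber the true positives two to one, which is why the posterior probability is only about a third.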
History
Bayes' theorem was named after the Reverend Thomas Bayes (1702–61), who studied how to
compute a distribution for the probability parameter of a binomial distribution (in modern
terminology). His friend Richard Price edited and presented this work in 1763, after Bayes'
death, as An Essay towards solving a Problem in the Doctrine of Chances.[2] The French
mathematician Pierre-Simon Laplace reproduced and extended Bayes' results in 1774, apparently
quite unaware of Bayes' work.[3] Stephen Stigler suggested in 1983 that Bayes' theorem was
discovered by Nicholas Saunderson some time before Bayes.[4] However, this interpretation has
been disputed.[5]
Stephen Fienberg describes the evolution from "inverse probability" at the time of Bayes and
Laplace, a term still used by Harold Jeffreys (1939), to "Bayesian" in the 1950s.[6] Ironically,
Ronald A. Fisher introduced the "Bayesian" label in a derogatory sense[citation needed].
Notes

1. ^ McGrayne, Sharon Bertsch (2011). The Theory That Would Not Die: How Bayes' Rule
Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant from
Two Centuries of Controversy. New Haven: Yale University Press, p. 10. ISBN-13
9780300169690; ISBN-10 0300169698; OCLC 670481486.
2. ^ Bayes, Thomas, and Price, Richard (1763). "An Essay towards solving a Problem in the
Doctrine of Chances. By the late Rev. Mr. Bayes, communicated by Mr. Price, in a letter to John
Canton, M. A. and F. R. S.". Philosophical Transactions of the Royal Society of London 53:
370–418. doi:10.1098/rstl.1763.0053. http://www.stat.ucla.edu/history/essay.pdf.
3.
4.
5.
6.
External links

Earliest Known Uses of Some of the Words of Mathematics (B). Contains origins of
"Bayesian", "Bayes' Theorem", "Bayes Estimate/Risk/Solution", "Empirical Bayes", and
"Bayes Factor".

A tutorial on probability and Bayes' theorem devised for Oxford University psychology
students.