[summary: When an observation is more likely given one state of the world than another, we should increase our credence that we're in the world that was more likely to have produced that observation.
Suppose that Professor Plum and Miss Scarlet are two suspects in a murder, and that we start out thinking that Professor Plum is twice as likely to have committed the murder as Miss Scarlet. We then discover that the victim was poisoned. We think that, on occasions where they do commit murders, Professor Plum is around one-fourth as likely to use poison as Miss Scarlet. Then after observing the victim was poisoned, we should think Professor Plum is around half as likely to have committed the murder as Miss Scarlet: $~$2 \times \dfrac{1}{4} = \dfrac{1}{2}.$~$
The quantitative rule at work here, in its various forms, is known as Bayes' rule.]
[summary(Technical): Bayes' rule (aka Bayes' theorem) is the quantitative law governing how to revise probabilistic beliefs in response to observing new evidence. Suppose we previously thought that the probability of $~$h_1$~$, denoted $~$\mathbb {P}(h_1)$~$, was twice as great as $~$\mathbb {P}(h_2)$~$. Now we see a new piece of evidence, $~$e_0$~$, such that the probability of our seeing $~$e_0$~$ if $~$h_1$~$ is true (denoted by $~$\mathbb {P}(e_0\mid h_1)$~$) is one-fourth as great as $~$\mathbb {P}(e_0\mid h_2)$~$ (the probability of seeing $~$e_0$~$ if $~$h_2$~$ is true). After observing $~$e_0$~$, we should think that $~$h_1$~$ is now half as likely as $~$h_2$~$:
$$~$\frac{\mathbb {P}(h_1\mid e_0)}{\mathbb {P}(h_2\mid e_0)} = \frac{\mathbb {P}(h_1)}{\mathbb {P}(h_2)} \cdot \frac{\mathbb {P}(e_0\mid h_1)}{\mathbb {P}(e_0\mid h_2)}$~$$
More generally, Bayes' rule states: $~$\mathbb P(\mathbf{H}\mid e) \propto \operatorname{\mathbb {P}}(e\mid \mathbf{H}) \cdot \operatorname{\mathbb {P}}(\mathbf{H}).$~$]
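The Professor Plum and Miss Scarlet update can be checked with a few lines of Python. This is a minimal sketch: the function names are illustrative, and the conversion to a probability assumes that Plum and Scarlet are the only two suspects, which the example itself does not state.

```python
from fractions import Fraction


def posterior_odds(prior_odds, likelihood_ratio):
    """Odds form of Bayes' rule: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * likelihood_ratio


def odds_to_probability(odds):
    """Probability of h1, given odds of h1 against h2.

    Assumes h1 and h2 are the only hypotheses under consideration.
    """
    return odds / (1 + odds)


# Plum starts out twice as likely as Scarlet to be the murderer.
prior = Fraction(2, 1)
# Plum is one-fourth as likely as Scarlet to use poison.
likelihood_ratio = Fraction(1, 4)

posterior = posterior_odds(prior, likelihood_ratio)
print(posterior)                       # 1/2
print(odds_to_probability(posterior))  # 1/3
```

Under the two-suspect assumption, odds of 1 : 2 correspond to a probability of 1/3 that Professor Plum is the murderer.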
Bayes' rule (aka Bayes' theorem) is the quantitative law of probability theory governing how to revise probabilistic beliefs in response to observing new evidence.
You may want to start at the Guide or the Fast Intro.
The laws of reasoning
Imagine that, as part of a clinical study, you're being tested for a rare form of cancer, which affects 1 in 10,000 people. You have no reason to believe that you are more or less likely than average to have this form of cancer. You're administered a test which is 99% accurate, both in terms of specificity and sensitivity: It correctly detects the cancer (in patients who have it) 99% of the time, and it incorrectly detects cancer (in patients who don't have it) only 1% of the time. The test results come back positive. What's the chance that you have cancer?
Bayes' rule says that the answer is precisely a 1 in 102 chance, which is a probability a little below 1%. The remarkable thing about this is that there is only one answer: the probability that you have that type of cancer, given the above information, is exactly 1 in 102; no more, no less.
%comment: (999,900 * 0.01 + 100 * 0.99) / (100 * 0.99) = (10098 / 99) = 102. Please leave this comment here so the above paragraph is not edited to be wrong.%
This is one of the key insights of Bayes' rule: Given what you knew, and what you saw, the maximally accurate state of belief for you to be in is completely pinned down. While that belief state is quite difficult to find in practice, we know how to find it in principle. If you want your beliefs to become more accurate as you observe the world, Bayes' rule gives some hints about what you need to do.
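The 1 in 102 figure can be reproduced directly from the numbers in the cancer example. The sketch below uses exact fractions to avoid rounding; the function and parameter names are illustrative choices, not part of the original text.

```python
from fractions import Fraction


def posterior_probability(prior, sensitivity, false_positive_rate):
    """Probability of having the condition, given a positive test result."""
    # Fraction of the population that is sick AND tests positive.
    true_positives = prior * sensitivity
    # Fraction of the population that is well AND tests positive.
    false_positives = (1 - prior) * false_positive_rate
    return true_positives / (true_positives + false_positives)


p = posterior_probability(
    prior=Fraction(1, 10_000),             # 1 in 10,000 people have the cancer
    sensitivity=Fraction(99, 100),         # positive 99% of the time when sick
    false_positive_rate=Fraction(1, 100),  # positive 1% of the time when well
)
print(p)  # 1/102
```

Note that 1 in 102 is a probability. The corresponding odds of sick to well among positive tests are 99 : 9999, or 1 : 101.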
Learn Bayes' rule
- Bayes' rule: Odds form. Bayes' rule is simple, if you think in terms of relative odds.
- Bayes' rule: Proportional form. The fastest way to say something both convincing and true about belief-updating.
- Bayes' rule: Log-odds form. A simple transformation of Bayes' rule reveals tools for measuring degree of belief, and strength of evidence.
- Bayes' rule: Probabilistic form. The original formulation of Bayes' rule.
- Bayes' rule: Functional form. Bayes' rule for continuous variables.
- Bayes' rule: Vector form. For when you want to apply Bayes' rule to lots of evidence and lots of variables, all in one go.
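As a rough illustration of the log-odds form listed above, the Professor Plum and Miss Scarlet update from the summary can be written as an addition instead of a multiplication. Measuring in bits (base-2 logarithms) is one common choice of unit and is an assumption of this sketch; the variable names are illustrative.

```python
import math

# Prior odds of Plum against Scarlet are 2 : 1, i.e. +1 bit in favor of Plum.
prior_bits = math.log2(2)

# Poison is one-fourth as likely from Plum as from Scarlet: -2 bits of evidence.
evidence_bits = math.log2(1 / 4)

# In log-odds form, updating is addition.
posterior_bits = prior_bits + evidence_bits

# Convert back to ordinary odds.
posterior_odds = 2 ** posterior_bits

print(posterior_bits)  # -1.0
print(posterior_odds)  # 0.5
```

The result, odds of 1 : 2 for Plum against Scarlet, matches the odds-form calculation of 2 × 1/4 = 1/2.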
Implications of Bayes' rule
- A Bayesian view of scientific virtues. Why is it that science relies on bold, precise, and falsifiable predictions? Because of Bayes' rule, of course.
- Update by inches. It's virtuous to change your mind in response to overwhelming evidence. It's even more virtuous to shift your beliefs a little bit at a time, in response to all evidence (no matter how small).
- Belief revision as probability elimination. Update your beliefs by throwing away large chunks of probability mass.
- Shift towards the hypothesis of least surprise. When you see new evidence, ask: which hypothesis is least surprised?
- Extraordinary claims require extraordinary evidence. The people who adamantly claim they were abducted by aliens do provide some evidence for aliens. They just don't provide quantitatively enough evidence.
- Ideal reasoning via Bayes' rule. Bayes' rule is to reasoning as the Carnot cycle is to engines: Nobody can be a perfect Bayesian, but Bayesian reasoning is still the theoretical ideal.
Related content
- Subjective probability. Probability is in the mind, not the world. If you don't know whether a tossed coin came up heads or tails, that's a fact about you, not a fact about the coin.
- Probability theory. The quantification and study of objects that represent uncertainty about the world, and methods for making those representations more accurate.
- Information theory. The quantification and study of information, communication, and what it means for one object to tell us about another.
Comments
Eric Rogstad
The user already knows they're on Arbital. Why not just call it "Guide" and "introductions"?
Dan Davies
I'm confused, and surely wrong, about the cancer example.
1 in 10000 people are sick.
1 sick person : 9999 well persons
Multiply by 100: 100 sick people : 999900 well persons
99% of the sick people have positive tests: (0.99 * 100 =) 99 positive tests
1% of the well people have false positive tests: (0.01 * 999900 = 9999)
Using the odds view: number of sick persons with positive tests / total number of persons with positive tests: 99 / (99 + 9999) = 99 / 10098. Multiply top and bottom by (1/99) => (99/99) / (10098/99) = 1 / 102. The text says the answer is 1 / 101.010101… which is 99/10000.
So, try the waterfall method.
Prior odds of being sick: 1 in 10000. Being sick: 1. Being well: 9999.
Chance of having positive test while sick: 99. Chance of having positive test while well: 1.
Odds of being sick given positive test: (1 / 9999) * (99 / 1) = 99 / 9999 = 0.00990099.
Probability of being sick given positive test: 99 / (99 + 9999) = 1 / 102 from above.
Where did I go wrong? Thanks in advance for any time you have!