Tuesday, July 15, 2008

A quick intro to Bayesian inference

This, oh readers of my blog, is Bayes' Theorem, which you should learn well:

P(A|B) = P(B|A) * P(A) / P(B)

In English, it says, "The probability of A given B is equal to the probability of B given A times the probability of A, all divided by the probability of B."

But that's still not quite English, so let me put it another way. P(A|B) and P(B|A) are called "conditional probabilities." P(A) and P(B) are what we call "prior probabilities"--they exist independently of any information about each other. So, when you're solving for P(A|B), you're asking, "What's the probability of something, given that some other, related thing has happened?"

People incorrectly apply probabilities all the time without considering Bayes' Theorem. You might even catch yourself doing it. Try out a probability problem and see: 10% of the population uses marijuana regularly, and a given drug test is 90% accurate--it returns the right answer 90% of the time, for users and non-users alike. What's the probability that a randomly selected person who tests positive for marijuana use actually uses marijuana?

In general, people tend to ignore the ever-important prior information that 90% of the population doesn't use marijuana. They tend to think that the probability that our randomly selected person does pot is 90%. It's actually 50%. If you want the math (with A = "uses marijuana" and B = "tests positive"), it works out like this:

P(B|A) = 0.9
P(A) = 0.1
P(B) = P(B|A)*P(A) + P(B|A')*P(A') = 0.9*0.1 + 0.1*0.9 = 0.18

In this case, P(B|A)*P(A) is the probability of a true positive and P(B|A')*P(A') is the probability of a false positive. Adding these two cases gives us the total probability of B--the odds that the test comes back positive regardless of whether the person uses marijuana. Now we have

(0.9*0.1)/0.18 = 0.5

So, as you can see, prior information is essential when calculating conditional probabilities. We can shift the probability that this given person smokes pot from 10% to 50%, and no more. This is counterintuitive to a lot of people, but it's strictly true. If this test comes back positive, it can only give you 50% confidence that a person smokes pot.
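If you'd rather see that arithmetic as code, here's a minimal Python sketch of the same calculation (the function and variable names are just mine, made up for this example):

def posterior(prior, true_positive_rate, false_positive_rate):
    # Bayes' Theorem: P(A|B) = P(B|A) * P(A) / P(B)
    # P(B) is the total probability of a positive test, users and non-users combined
    p_positive = true_positive_rate * prior + false_positive_rate * (1 - prior)
    return true_positive_rate * prior / p_positive

# 10% of the population uses marijuana; the test is right 90% of the time either way
print(posterior(prior=0.1, true_positive_rate=0.9, false_positive_rate=0.1))  # 0.5

Feed in a different prior and the same 90% accurate test gives a very different answer.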

Now for something a little more controversial--and I'll write a post about the controversy later. Bayes' Theorem can be used to update our beliefs.

Well, maybe you don't think it's that controversial, when you think about the above example. You shift your belief from P=.1 to P=.5, no problem, right? Ah, but what if you don't have a nice, handy prior given to you from the annals of science? What if you don't know that 10% of the population smokes pot? What if you think it's significantly higher? Lower? What if you believe 0% of the population smokes marijuana? Then, according to Bayes' Theorem, even a 100% accurate marijuana test can't convince you otherwise--you simply won't believe the test can measure marijuana use.
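To see why, plug a prior of zero straight into the theorem:

P(A|B) = P(B|A) * 0 / P(B) = 0

No test result, however accurate, can lift a posterior off a prior of exactly zero.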

But we'll come back to that. For now, a quick example, so you know what I'm talking about in the next entry. I'm going to steal this bit from Against the Modern World:

A stubborn but rational man, Smith, thinks it is extremely unlikely that cigarette smoking causes lung cancer. For Smith, say, P(cigs cause cancer) = 0.2. Instead, he licenses only one alternative hypothesis: that severe allergies cause cancer. Since these hypotheses are exhaustive, on pain of inconsistency, Smith must believe P(allergies cause cancer) = 0.8.

Now, suppose Smith's Aunt Liz dies of lung cancer. Furthermore, suppose Aunt Liz has been a heavy smoker her entire life; then P(Liz gets cancer | cigs cause cancer) = 1 (certainty). Suppose, also, that Liz has had minor allergies for most of her life; since these allergies are only minor, let's say the probability she gets cancer under the hypothesis that severe allergies cause cancer is only 0.5.

Briefly, how should we calculate P(E)--the probability of the evidence, namely that Liz gets cancer? We sum over the weighted possibilities:

P(E) = P(H1)P(E|H1) + P(H2)P(E|H2) = 0.2(1) + 0.8(0.5) = 0.6

So, now we can use Bayes' Rule to calculate Smith's (only consistent) subjective degree of belief in the hypothesis that cigarettes cause cancer given the evidence that Aunt Liz has died of cancer.

P(H=cigs cause cancer) = 0.2
P(E=Liz gets cancer | H=cigs cause cancer) = 1
P(E=Liz gets cancer) = 0.6

Plugging these values into Bayes' Rule we get:

P(H|E) = P(H)P(E|H)/P(E) = 0.2(1) / 0.6 = 1/3


So, in light of this evidence, Smith's belief in the hypothesis that cigarettes cause cancer has increased from 1/5 to 1/3.
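If it helps, here's the same update as a short Python sketch (the variable names are just labels I've made up for this example):

p_cigs = 0.2               # P(H1): cigarettes cause cancer
p_allergies = 0.8          # P(H2): severe allergies cause cancer
p_e_given_cigs = 1.0       # P(E|H1): heavy-smoking Liz gets cancer
p_e_given_allergies = 0.5  # P(E|H2): mild-allergy Liz gets cancer

p_e = p_cigs * p_e_given_cigs + p_allergies * p_e_given_allergies  # 0.6
print(p_cigs * p_e_given_cigs / p_e)  # 0.333..., up from the prior of 0.2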

Here's something to note: what if Smith hadn't assigned the remainder of his belief to the allergy hypothesis? What if he'd assigned it to absolutely anything else--a catch-all hypothesis that predicts Aunt Liz's cancer just as surely as the smoking hypothesis does?

We'd have gotten the denominator to be

P(E) = 0.2*1 + 0.8*1 = 1

and his inference would become (0.2*1)/1 = 0.2. He wouldn't have changed his belief at all.
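As a sketch of that catch-all case (same kind of made-up labels as above), the only thing that changes is the likelihood of the evidence under the alternative hypothesis:

p_cigs = 0.2                   # P(H1): cigarettes cause cancer
p_anything_else = 0.8          # P(H2): "absolutely anything else" causes cancer
p_e_given_cigs = 1.0           # P(E|H1)
p_e_given_anything_else = 1.0  # P(E|H2): the catch-all predicts Liz's cancer just as surely

p_e = p_cigs * p_e_given_cigs + p_anything_else * p_e_given_anything_else  # 1.0
print(p_cigs * p_e_given_cigs / p_e)  # 0.2 -- no update at all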

This, my friends of blogland, is the secret to science.

4 comments:

Peepseo said...

Swimmi,

How does this jibe with Frank Tipler's assertion in this paper that a theory's validity is greatly strengthened by the occurrence of very low probability predicted outcomes? The math was kind of over my head.

Peepseo

P.S. I don't play the odds when it comes to drug tests.

Peepseo said...

Swimmy,

Sorry about the misspelling...


Peepseo

Swimmy Lionni said...

He is correct. If you want a less mathematical explanation, see this entry on Overcoming Bias. In short, the key is this:

"If you expect a strong probability of seeing weak evidence in one direction, it must be balanced by a weak expectation of seeing strong evidence in the other direction."

Swimmy Lionni said...

Another useful entry: Absence of Evidence Is Evidence of Absence.