In a recent discussion group at work, the topic was
predictive modeling and the “false positives vs. false negatives” trade-off. As
I’ve described in previous posts, your predictive algorithm (be it a
computer-run statistical model or some fuzzy logic inside your brain) outputs
a probability. Your model tells you “There is a 26% chance that this person has
cancer.” If sheer prediction on a portfolio of data points is all you want, you’re
done. But usually some kind of action is required given the model output. Your
model doesn’t neatly cleave the population into “cancerous vs. cancer free”.
You have to set a cut-off probability, such that everyone above the threshold
is treated as if they have cancer and everyone below is treated as if they don’t.
Maybe I want to aggressively treat everyone who might have cancer, so I set the
cutoff low at 10%. Or maybe I don’t want to put someone through chemotherapy
unless I’m really sure, so I set the cutoff higher at 50%. Or maybe this is
just a screening for additional tests, so I want to set the cutoff low, say 5%
or even 1% lest I miss an opportunity to treat a cancer early. A lower
threshold means you’re more confident you’ve caught most of the cancerous
patients, but you’re also wrongly classifying more healthy people as cancerous. A
higher threshold means you misidentify fewer healthy people as cancerous (saving
them the indignities of additional tests and chemo), but you also miss a few
cancers that you otherwise could have treated. The right threshold depends on the
relative costs of the two kinds of errors, and errors in both directions can be costly.
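For the concretely minded, here’s a minimal sketch of what “pick the cutoff by weighing the costs” might look like in code. Everything here is made up for illustration: the little validation set of predicted probabilities and the two cost figures are invented, and a real model would of course be calibrated on far more data.

```python
# Illustrative only: pick the cutoff that minimizes total cost of errors
# on a small labeled validation set. All numbers are invented.

# (predicted probability of cancer, actually has cancer?)
validation = [(0.02, False), (0.07, False), (0.12, True), (0.26, False),
              (0.31, True), (0.55, True), (0.64, False), (0.88, True)]

COST_FALSE_POSITIVE = 1    # needless follow-up tests and worry
COST_FALSE_NEGATIVE = 20   # a missed, treatable cancer

def total_cost(cutoff):
    cost = 0
    for prob, has_cancer in validation:
        flagged = prob >= cutoff
        if flagged and not has_cancer:
            cost += COST_FALSE_POSITIVE
        elif not flagged and has_cancer:
            cost += COST_FALSE_NEGATIVE
    return cost

# Sweep candidate cutoffs and keep the cheapest one.
candidates = [i / 100 for i in range(1, 100)]
best = min(candidates, key=total_cost)
print(f"Best cutoff: {best:.2f} (total cost {total_cost(best)})")
```

With a missed cancer priced at twenty times a false alarm, the sweep lands on a low cutoff, which is exactly the “aggressive screening” intuition above; flip the cost ratio and the best cutoff climbs.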
The example I actually used in our discussion group was the
following. It's World War II in London. You are reading the output of radar imaging. Those blips could be a
flock of birds. Or they could be the Luftwaffe flying another bombing raid on
London. Do you scramble the British fleet of Spitfires to go meet them? Or do
you save yourself the fuel and effort? It obviously depends on how certain you are about
your reading of the radar blips, but it also depends on what you consider a
reasonable cutoff. Are you 90% sure these are Nazis? Then you should probably
scramble the fighters. Only 10% sure? Well… maybe, maybe not. Depends on the
cost of fuel and other resources spent scrambling the fighters. 1%? 0.01%?
Surely you can’t be sending them up every time a flock of birds creates
funny-looking blips on your radar, but then again failing to repel a bombing
raid is very costly indeed.
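To make the “depends on the costs” point concrete: under a simple expected-cost model, the break-even cutoff is just the cost of scrambling divided by the cost of an unopposed raid. The numbers below are invented purely to show the arithmetic.

```python
# Back-of-the-envelope version of the radar decision, with invented costs.
# Under a simple expected-cost model you scramble whenever
#   p * COST_MISSED_RAID > COST_SCRAMBLE,
# so the break-even cutoff is COST_SCRAMBLE / COST_MISSED_RAID.

COST_SCRAMBLE = 1        # fuel, wear, pilot fatigue (arbitrary units)
COST_MISSED_RAID = 500   # an unopposed bombing raid

cutoff = COST_SCRAMBLE / COST_MISSED_RAID
print(f"Scramble whenever P(raid) exceeds {cutoff:.3f}")  # 0.002

for p in (0.0001, 0.01, 0.10, 0.90):
    action = "scramble" if p > cutoff else "stand down"
    print(f"P(raid) = {p:>6}: {action}")
```

If a missed raid really is hundreds of times costlier than a wasted sortie, the rational cutoff is tiny, and you do end up chasing a lot of birds. That’s the whole tension.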
I didn’t even think about it during the meeting, but later
it struck me that a lot of people these days are misidentifying things as
Nazis. The threshold they set is far too low (in addition to their underlying
predictive algorithm being inaccurate and extremely biased). When actual Nazis
hold a parade, they say, “See! We told you this was a very big deal and now
there they are!” And most of us just shrug and say, “Stopped clocks. Yes, we all knew that there was a small segment of self-identified Neo Nazis. You don’t get
credit for ‘calling it’ when you’ve been calling everyone you don’t like a Nazi
for the last several years.” My example was meant to be completely apolitical
and simply illustrate the “relative costs of false positives and false
negatives” problem, but it generalizes well. This is another reason to set your threshold high. Too many false positives, and people start questioning your credibility. They may stop believing you when you most need them to.