I said in my previous post: “I think the ‘false positive/false negative’ result described in the above paragraph is just a statistical artifact of the fact that black defendants, for whatever reason, are more likely to recidivate (51.4% vs 39.4%, according to Propublica’s data).” I’ve confirmed my suspicions. The false positive/false negative disparity arises from the different underlying rates of recidivism for the two races. I am not making any general claims about crime rates by race; these statements are specific to the sample of criminals used in Propublica’s analysis. You could just as well compare males to females, young to old, or multiple priors to no priors. Any comparison of a high-recidivism population to a low-recidivism population will show this false positive/false negative disparity, even if the model is completely unbiased.
Assume you can divide the world into two identifiable classes: Blues and Greens. Suppose we live in a world with 1000 Greens and 1000 Blues. There are 600 high-risk Greens and 400 low-risk Greens. Blues are flipped: 400 high-risk and 600 low-risk Blues. A high-risk person has a 60% chance of recidivating and a low-risk person has a 30% chance of recidivating, regardless of class. Here is the breakdown of high- and low-risk individuals who subsequently offended or didn’t offend, broken out by class, taking those probabilities as exact proportions. Among Greens, 360 of the 600 labeled high-risk re-offend and 240 don’t, while 120 of the 400 labeled low-risk re-offend and 280 don’t. Among Blues, 240 of the 400 labeled high-risk re-offend and 160 don’t, while 180 of the 600 labeled low-risk re-offend and 420 don’t. That gives the Greens a false positive rate of 46.2% (240 of 520 non-offenders) and a false negative rate of 25.0% (120 of 480 re-offenders). The Blues have a false positive rate of 27.6% (160 of 580) and a false negative rate of 42.9% (180 of 420). Notice that the Greens have a higher false-positive rate and the Blues have a higher false-negative rate. The model is fair: it is accurately predicting recidivism rates for each grouping. The “unfairness” of the false-positive/false-negative proportions is driven by the relative propensity of Greens and Blues to recidivate, not by the model. (The numbers and proportions chosen for this example match fairly closely to those in the Propublica study.)
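For anyone who wants to check the arithmetic, here is a minimal Python sketch of the Blues and Greens world. The function and variable names are my own inventions for illustration; the only inputs are the group sizes and the 60%/30% re-offense chances from the example, treated as exact proportions rather than simulated draws.

```python
# Chance of re-offending given the model's label (from the example above).
P_REOFFEND = {"high": 0.6, "low": 0.3}


def confusion_counts(n_high, n_low):
    """Expected counts, treating a 'high-risk' label as the positive prediction."""
    tp = n_high * P_REOFFEND["high"]  # labeled high-risk, re-offended
    fp = n_high - tp                  # labeled high-risk, did not re-offend
    fn = n_low * P_REOFFEND["low"]    # labeled low-risk, re-offended
    tn = n_low - fn                   # labeled low-risk, did not re-offend
    return tp, fp, fn, tn


def error_rates(tp, fp, fn, tn):
    """False positive rate and false negative rate from the four counts."""
    fpr = fp / (fp + tn)  # share of non-offenders labeled high-risk
    fnr = fn / (fn + tp)  # share of re-offenders labeled low-risk
    return fpr, fnr


# (high-risk count, low-risk count) for each class.
groups = {"Greens": (600, 400), "Blues": (400, 600)}

results = {}
for name, (n_high, n_low) in groups.items():
    results[name] = error_rates(*confusion_counts(n_high, n_low))
    fpr, fnr = results[name]
    print(f"{name}: FPR = {fpr:.1%}, FNR = {fnr:.1%}")

# Greens: FPR = 46.2%, FNR = 25.0%
# Blues: FPR = 27.6%, FNR = 42.9%
```

The same labeling rule and the same re-offense chances are applied to both classes; only the mix of high- and low-risk people differs.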
Trivially, if we set the proportions of high- and low-risk individuals equal (500/500 for both classes), the false positive/false negative disparity disappears. If we exacerbate the difference (say 900 high- and 100 low-risk Greens, flipped for Blues), we also exacerbate the false positive/false negative disparity. You end up with a false positive rate of 83.7% and a false negative rate of 5.3% for the Greens, and a false positive rate of 6.0% and a false negative rate of 81.8% for the Blues. Amazingly, you’re still treating everyone fairly. 60% of people labeled high-risk re-offend, Green or Blue. 30% of people labeled low-risk re-offend, Green or Blue. Your model is as accurate as it can be, and it’s not showing a racial bias in terms of recidivism rates. It’s just that there “really” are more high-risk Greens.
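Here is a rough sketch of that comparison in Python, trying the 500/500, 600/400, and 900/100 splits side by side. Again, the names are hypothetical and the 60%/30% chances are treated as exact proportions. Alongside the two error rates it also reports what fraction of each label actually re-offends, which is the sense in which the model treats both classes the same.

```python
def rates_for_mix(n_high, n_low, p_high=0.6, p_low=0.3):
    """Error rates and per-label re-offense rates for one class."""
    tp = n_high * p_high  # labeled high-risk, re-offended
    fp = n_high - tp      # labeled high-risk, did not re-offend
    fn = n_low * p_low    # labeled low-risk, re-offended
    tn = n_low - fn       # labeled low-risk, did not re-offend
    return {
        "fpr": fp / (fp + tn),
        "fnr": fn / (fn + tp),
        "high_reoffend": tp / (tp + fp),  # re-offense rate among "high-risk"
        "low_reoffend": fn / (fn + tn),   # re-offense rate among "low-risk"
    }


# Each tuple is (high-risk Greens, low-risk Greens); Blues get the flipped mix.
mixes = [(500, 500), (600, 400), (900, 100)]

table = {}
for n_high, n_low in mixes:
    green = rates_for_mix(n_high, n_low)
    blue = rates_for_mix(n_low, n_high)
    table[(n_high, n_low)] = (green, blue)
    print(
        f"{n_high}/{n_low}: "
        f"Greens FPR {green['fpr']:.1%}, FNR {green['fnr']:.1%} | "
        f"Blues FPR {blue['fpr']:.1%}, FNR {blue['fnr']:.1%} | "
        f"high-risk re-offend {green['high_reoffend']:.0%} vs {blue['high_reoffend']:.0%}"
    )
```

The error rates swing wildly as the mix changes, while the share of high-risk and low-risk people who re-offend stays pinned at 60% and 30% for both classes in every row.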
I don’t know why the original Propublica piece fixated on the false positive and false negative rates, other than that it gave them the answer they wanted. The false positive rate is the number of false positives divided by the sum of false positives and true negatives. In other words, it is the fraction of people who did not re-offend who were wrongly labeled “high-risk.” The false negative rate is the number of false negatives divided by the sum of false negatives and true positives. In other words, it is the fraction of people who did re-offend who were wrongly labeled “low-risk.” The false positive rate will be higher for a high-risk group, even for an unbiased model. Ditto for the false negative rate for a low-risk group. On their own, these statistics simply don’t tell you anything about whether the model is biased or not.
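To see why, it helps to write the two rates as a function of the group’s mix. The notation here is mine, not Propublica’s: let h be the share of a group labeled high-risk, a the chance that a high-risk person re-offends, and b the chance that a low-risk person re-offends. Assuming, as in the Blues and Greens example, that a and b are the same for every group, the group size cancels out and you get:

```latex
\mathrm{FPR}(h) = \frac{FP}{FP + TN} = \frac{h\,(1-a)}{h\,(1-a) + (1-h)\,(1-b)},
\qquad
\mathrm{FNR}(h) = \frac{FN}{FN + TP} = \frac{(1-h)\,b}{(1-h)\,b + h\,a}.
```

For any a and b strictly between 0 and 1, the first expression rises from 0 to 1 as h goes from 0 to 1, and the second falls from 1 to 0. Plugging in a = 0.6 and b = 0.3 gives FPR(0.6) = 0.24/0.52 ≈ 46.2% for the Greens and FPR(0.4) = 0.16/0.58 ≈ 27.6% for the Blues. The model’s treatment of an individual never enters these formulas except through a and b, which are identical for both groups.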
You will see this racial disparity arise whenever there is 1) some kind of system for targeting individuals and 2) some resolution as to whether the targeting was correct or not. You will see this so long as there are average demographic differences between the races, even if race itself isn’t a factor (as described in the previous paragraph). Suppose prosecutors use some kind of criteria or decision-making process for deciding who to prosecute (step 1) and the resolution is a guilty/not-guilty verdict (step 2). Well, you’re going to see more black people prosecuted and then found “not guilty”, and more guilty white people let off the hook (although you won’t ultimately know how many of these are guilty). Or suppose that cops decide who to stop-and-frisk based on demographic characteristics (step 1), and the resolution is an arrest for possession of contraband (step 2). Once again, you’re going to have a lot of unnecessary police stops for black people, and a lot of guilty white people will be let off the hook. Even if the police really are colorblind.
I’m not trying to argue that the apparent racial disparity in our justice system is all attributable to factors other than race. I’m sure that race itself is a factor in many decisions to stop, arrest, prosecute, convict, beat, or shoot a person. I’m just issuing a word of caution that these disparities will continue to exist even if we achieve a color-blind society. A process will wrongly be labeled as racist even when it isn’t, as the Propublica article demonstrates clearly.
As terrible as the original Propublica article was, I’m sort of glad they wrote it, because I never would have worked out this result otherwise. It’s a good thing to keep in mind. A higher overall rate of something means more false positives and fewer false negatives; a lower overall rate of something means the opposite. You will get this result even from a fair, unbiased statistical model.