Comments on GrokInFullness: Propublica’s “Machine Bias” Article is Incredibly Biased

Jubal Harshaw, 2017-04-27:

@ JH.
Thanks for your comment.
Did you read the technical write-up (the second link in my post above)? It explains how the analysis was done, and it was clear from this that the model wasn't biased. A bias would mean that for some identifiable subgroup, the model overstates the chance of recidivism (and likewise understates it for those not in the subgroup). It doesn't have to be race, either. Suppose the model accurately predicted that, say, younger offenders are more likely to recidivate than older offenders (or males vs. females, or people with multiple priors vs. people with one or no priors). Even if the model were unbiased (the predicted recidivism rate approximately equal to the actual observed rate, as is the case for the dataset Propublica was using), you'd see the same kind of "bias" that Propublica found. The high-recidivism population will *always* have more "false positives" and the low-recidivism population will always have more "false negatives" as calculated in the Propublica piece. See my discussion in the paragraphs above and below the data table. They could have written an equally compelling piece arguing that the model too easily lets females off the hook but identifies too many non-recidivating males as high-risk. It's not fair to call this a "bias," though, because this pattern will emerge from a perfectly unbiased model. Indeed, you'd have to bias your model to avoid this issue of disparities in false positives and false negatives.

Open up an Excel workbook and create some completely fictitious sample data (as in the data table above) to convince yourself of this; there is also a quick simulation sketch after these comments. You'll see that the high-recidivism group has more false positives, etc. I had scratched out a blog post expanding on this point but never shared it. Maybe I'll have to dust it off and consider posting it.

Dan Henri, 2017-04-26:

I think the main issue they pointed out here was that the algorithm was more likely to have an error in favor of a white defendant, and more likely to have an error not in favor of a black defendant. I think the error was something like 15 percent. So if a black and a white defendant were each to have a 50 percent chance of reoffending, the algorithm would give the white defendant a 35 percent chance and the black defendant a 65 percent chance on average. That's bad. I also don't think the article claimed the algorithm was less accurate or biased than a judge; they didn't attempt to measure that.
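A quick way to see the point in the first comment, along the lines of the Excel exercise it suggests, is a short simulation. The sketch below is only an illustration, not anything from the original post or from the COMPAS data: the 30% and 60% base rates, the 0.5 "high risk" cutoff, and the spread of individual risks are all invented. The model is unbiased by construction, since it predicts each person's true risk exactly, yet the two groups still end up with very different error rates.

# Simulation sketch (illustrative only): a perfectly calibrated, group-blind
# risk model still produces unequal false-positive and false-negative rates
# when two groups have different underlying recidivism base rates.
import random

random.seed(0)

def simulate_group(base_rate, n=100_000, cutoff=0.5):
    """Each person gets a true risk p spread around the group's base rate.
    The model knows p exactly (so it is unbiased by construction) and labels
    anyone with p >= cutoff as 'high risk'. Returns (FPR, FNR) computed the
    way the Propublica piece computed them."""
    fp = fn = recid = nonrecid = 0
    for _ in range(n):
        # True individual risk, clamped to [0, 1]; the 0.2 spread is made up.
        p = min(max(random.gauss(base_rate, 0.2), 0.0), 1.0)
        predicted_high = p >= cutoff
        reoffended = random.random() < p
        if reoffended:
            recid += 1
            if not predicted_high:
                fn += 1  # recidivist labeled low risk: a "false negative"
        else:
            nonrecid += 1
            if predicted_high:
                fp += 1  # non-recidivist labeled high risk: a "false positive"
    return fp / nonrecid, fn / recid

for name, rate in [("low-recidivism group", 0.3), ("high-recidivism group", 0.6)]:
    fpr, fnr = simulate_group(rate)
    print(f"{name}: false-positive rate {fpr:.2f}, false-negative rate {fnr:.2f}")

On a typical run, the high-base-rate group shows a substantially higher false-positive rate and a lower false-negative rate than the low-base-rate group, even though the model is equally well calibrated for both, which is the pattern the first comment argues Propublica mislabeled as bias.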