Saturday, April 28, 2018

"Don't use my data" or "Stop talking about me!"

People are freaking out over Facebook and privacy issues on social media. I have a contrarian take on this issue. When people say things like “Don’t use my data” and “Don’t violate my privacy” I hear something more like “Don’t talk about me, even if you’re discussing relevant information.”

I totally understand the creep-out factor. The idea that somebody is collecting information about me and potentially finding ways to use it against me is unsettling. I’m also sympathetic to “big brother” concerns, the notion that government agencies are using network analysis on, say, Facebook friend lists to identify networks of drug dealers. Obviously if this data is being used to help government do things that government should not be doing, that’s a problem. And of course companies should hold themselves accountable to their user agreements. If they promise to keep your data private to some standard they’d darn well better do it or they’re in violation of contract. And finally, there are obviously some reasonable expectations about privacy. If someone is snooping through your private e-mails looking for compromising information or setting up video cameras around (or inside!) your house, that’s a privacy violation by any reasonable standard. (On the other hand, what about a single e-mail leaked deliberately because the compromising information in it is relevant? What about incidental camera phone video that happened to catch something incriminating through your window?) [This paragraph is my big hedge. If you get to the end and you're mad at me and thinking "Yeah, but...", please re-read and see if I acknowledge/admitted your point already.]

That said, a lot of the information that exists about you is fair game. Is Jimmy friends with Johnny? Does Jimmy “own” that information? Does Johnny? If third parties are discussing whether Jimmy and Johnny are friends, are they “taking” information that belongs to one of the two parties? Not really. They’re merely talking about them. Whether third parties are talking about Jimmy and Johnny because they want to direct targeted ads more effectively (“If they are friends, I should send this ad to both because it increases the odds that a purchase will be made.”) or because a link between them compromises them (“We have a witness’s description roughly matching Jimmy and one matching Johnny; if they are friends it dramatically increases the odds that they were both accomplices in a crime.”), it seems that the knowledge of this connection is just information and doesn’t particularly “belong” to anyone.
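The "increases the odds" reasoning in both examples is just a Bayesian update. Here's a minimal sketch with made-up numbers (the prior and the likelihood ratio below are pure assumptions for illustration, not real figures):

```python
# Odds-form Bayes' rule: posterior_odds = prior_odds * likelihood_ratio.
# All numbers are invented for illustration.

def update(prior, likelihood_ratio):
    """Return the posterior probability after seeing evidence that is
    likelihood_ratio times more likely if the hypothesis is true."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

# Hypothesis: Jimmy and Johnny were both accomplices.
prior = 0.05  # assumed: modest prior from the witness descriptions alone
lr = 20.0     # assumed: accomplices are ~20x more likely to be friends
print(round(update(prior, lr), 3))  # the friendship link pushes 5% past 50%
```

The point of the sketch: a single piece of "public" relational information can move a small prior dramatically, which is exactly why both the advertiser and the investigator want it.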

Let’s face it: You form negative opinions of people based on their social media posts. The friend with an itchy trigger finger for the “share” button. The friend with despicable political posts. The friend who airs dirty laundry about family members, right there for all the world to see. You probably get fairly accurate impressions of people’s intelligence, conscientiousness, impulsiveness, manners, and so on, and certainly some of these opinions are negative. Is it unfair for you to judge people based on the information you know about them? (That last sentence strikes me as "Is it unfair of you to be a human being?") Are you using their data “against” them? What if you use these opinions to decide not to socialize with someone? What if one of them asks you for a loan or proposes a business venture? Are you supposed to ignore your impressions of this person’s character?

Now suppose you’re hiring someone to do some kind of work inside your home. A cleaner, a contractor, maybe a babysitter or nanny. You’d snoop, wouldn’t you? You’d Google and Facebook search their names. If you found something compromising, you’d consider it relevant. And here I'm not just saying, "You'd be tempted to cheat on a well-justified rule when it benefits you personally." We can all agree that norms and laws against theft should exist, even if I can concoct some silly hypothetical where you are tempted to steal. No, my point here is that it's good to snoop. It's good to keep people with questionable ethics from entering your home. It's good to keep people with impulse control problems and criminal histories away from your children. For some of these kinds of decisions, it's probably better if we as a society have "pro-snooping" rules and norms. We want to identify risk factors for various social problems so we can prevent them. This can mean unfairly flagging someone as high-risk just because on paper they look similar to other high-risk individuals, even though that particular person will not cause any problems. It's unfair to some individuals. But it means that society gets less child molestation, because only low-risk individuals are put in charge of small children. Or it means that society as a whole has functioning financial markets, because people who are likely to repay their debts can borrow money easily, and people who aren't likely to repay either don't get to borrow money or can only borrow at high interest rates that make the loan profitable.
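One concrete way to see the "unfair to some individuals" trade-off: when the underlying problem is rare, even a fairly accurate screen flags mostly harmless people. A back-of-the-envelope sketch, where every rate is an assumption chosen just to make the arithmetic visible:

```python
# Toy base-rate arithmetic: of the people a risk screen flags,
# how many are actually risky? All rates below are assumed.
base_rate = 0.01    # assumed: 1% of candidates would actually cause harm
sensitivity = 0.90  # assumed: the screen catches 90% of the truly risky
false_alarm = 0.05  # assumed: it also flags 5% of the harmless

truly_risky_flagged = base_rate * sensitivity        # 0.009
harmless_flagged = (1 - base_rate) * false_alarm     # 0.0495
precision = truly_risky_flagged / (truly_risky_flagged + harmless_flagged)
print(round(precision, 2))  # ~0.15: roughly 85% of flagged people are harmless
```

Society still captures the benefit (90% of the truly risky are screened out of sensitive roles), but the cost lands on the wrongly flagged harmless majority of the flagged group.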

I fully understand why this makes people seethe with indignation. "How dare someone use my information against me? How dare they snoop and use their ill-gotten findings in a way that compromises me?" But it's not really your information. The entities you are dealing with (potential employers, creditors, insurers) haven't taken anything from you. They are merely talking about you, and mostly they are discussing relevant information. 

I don't have a fully worked out theory of what should be private/protected information and what should not be. I like the idea that people with compromising histories can get a second chance at life. Then again I also think the right kind of data mining can identify those who will benefit most from a second chance. The solution here could be more snooping and more data crunching. We as a society would get more second chances, but at the expense of permanently marking others as irredeemable. I also like the idea that certain kinds of information are below a threshold and get ignored. (Low-level, infrequent swearing and occasional references to alcohol on social media don't register at all, but if they happen above some threshold they start to become relevant.) But, once again, a good enough dataset will allow one to determine this threshold empirically. The learning algorithm will itself tell you that the information isn't relevant...or perhaps that it is.
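The threshold idea really can be checked empirically: compare the outcome rate below and above a candidate cutoff and see whether the signal registers at all. Here's a self-contained sketch on fabricated data; the five-swears-per-week cutoff and both problem rates are invented, not drawn from any real dataset:

```python
import random
random.seed(0)

def had_problem(swears_per_week):
    # Fabricated ground truth: light swearing is noise; heavy swearing
    # comes with a real jump in the problem rate.
    p = 0.02 if swears_per_week < 5 else 0.20
    return random.random() < p

# Simulate 1,000 people at each signal level 0..9.
data = [(s, had_problem(s)) for s in range(10) for _ in range(1000)]

def rate(rows):
    return sum(outcome for _, outcome in rows) / len(rows)

below = rate([r for r in data if r[0] < 5])
above = rate([r for r in data if r[0] >= 5])
print(round(below, 3), round(above, 3))  # below the cutoff the signal barely registers
```

In a real dataset you wouldn't know the cutoff in advance; you'd scan candidate cutoffs (or fit a model) and let the data say where, if anywhere, the information becomes relevant.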

______________________

I wanted to say something about big, faceless corporations being different from individuals. We might sympathize with the snooping parent who is selecting a safe nanny for their children, but not with the big insurance company with millions of customers, or the mega-bank that holds millions of mortgages. Is it a big deal to tell them "no snooping"? Isn't there a different standard here because they are so big and faceless? I say no. As bureaucratic and automated and mathematical as the risk-pricing is for banks and insurers, these institutions are on the hook for you. Buying an insurance policy is saying, "Will you be on the hook for hundreds of thousands of dollars if I screw up and mangle someone with my car?" Taking out a mortgage is a similar deal. Imagine pleading with an individual to write you such a contract. They'd say something like, "Sure maybe, but I'm going to ask around about you so's I can vet you first." They'll ask former creditors if you'd repaid your loans, or ask former insurers if you'd made a ton of claims (or perhaps never made one). I think most people would think this kind of "snooping" is basically fair, and I don't really think changing the intermediary from an individual to a big company changes the example in any relevant way. If anything, the big company will make more accurate judgments based on a much larger data set; the individual will make bad choices based on a small sample of former dealings with customers/borrowers.
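The "larger data set means more accurate judgments" claim is just the law of large numbers. A quick simulation, with an arbitrary assumed 10% claim rate, comparing an individual who has seen 20 customers with an institution that has seen 2,000:

```python
import random
random.seed(1)

TRUE_RATE = 0.10  # assumed underlying claim/default rate

def estimated_rate(n_customers):
    """Estimate the rate from a sample of n_customers records."""
    claims = sum(random.random() < TRUE_RATE for _ in range(n_customers))
    return claims / n_customers

# Average absolute estimation error over repeated trials.
trials = 200
small_err = sum(abs(estimated_rate(20) - TRUE_RATE) for _ in range(trials)) / trials
large_err = sum(abs(estimated_rate(2000) - TRUE_RATE) for _ in range(trials)) / trials
print(round(small_err, 3), round(large_err, 3))  # big-sample error is roughly 10x smaller
```

The individual lender's 20-customer sample routinely misjudges the risk by several percentage points either way; the institution's estimate hugs the true rate, which is exactly why its pricing can be both more accurate and fairer to low-risk borrowers.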
