I totally understand the creep-out factor. The idea that
somebody is collecting information about me and potentially finding ways to use
it against me is unsettling. I’m also sympathetic to “big brother” concerns,
the notion that government agencies are using network analysis on, say,
Facebook friend lists to identify networks of drug dealers. Obviously if this
data is being used to help government do things that government should not be
doing, that’s a problem. And of course companies should hold themselves accountable
to their user agreements. If they promise to keep your data private to some
standard they’d darn well better do it or they’re in violation of contract. And
finally, there are obviously some reasonable expectations about privacy. If
someone is snooping through your private e-mails looking for compromising
information or setting up video cameras around (or inside!) your house, that’s a
privacy violation by any reasonable standard. (On the other hand, what about a
single e-mail leaked deliberately because the compromising information in it is
relevant? What about incidental camera phone video that happened to catch
something incriminating through your window?) [This paragraph is my big hedge. If you get to the end and you're mad at me and thinking "Yeah, but...", please re-read and see whether I've already acknowledged your point.]
That said, a lot of
the information that exists about you is fair game. Is Jimmy friends with
Johnny? Does Jimmy “own” that information? Does Johnny? If third parties are
discussing whether Jimmy and Johnny are friends, are they “taking” information
that belongs to one of the two parties? Not really. They’re merely talking about them.
Whether third parties are talking about Jimmy and Johnny because they want to
direct targeted ads more effectively (“If they are friends, I should send this
ad to both because it increases the odds that a purchase will be made.”) or
because a link between them compromises them (“We have a witness’s description roughly
matching Jimmy and one matching Johnny; if they are friends it dramatically
increases the odds that they were both accomplices in a crime.”), it seems that
the knowledge of this connection is just information and doesn’t particularly “belong”
to anyone.
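(If you want the "increases the odds" logic spelled out, here's a toy sketch in Python. Every number in it is invented purely for illustration; it's just Bayes' rule in odds form.)

    # Hypothetical numbers, purely for illustration.
    prior = 0.01                    # P(both involved) before knowing they're friends
    p_friends_if_involved = 0.90    # accomplices very likely know each other
    p_friends_if_not = 0.05         # two random people are rarely friends

    # Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio
    prior_odds = prior / (1 - prior)
    likelihood_ratio = p_friends_if_involved / p_friends_if_not
    posterior_odds = prior_odds * likelihood_ratio
    posterior = posterior_odds / (1 + posterior_odds)

    print(f"P(both involved | friends) = {posterior:.3f}")   # ~0.154, up from 0.010

Learning one mundane fact about a relationship moves the estimate from one percent to about fifteen percent. That's the entire business model of targeted advertising, and the entire appeal of network analysis to investigators.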
Let’s face it: You form negative opinions of people based on
their social media posts. The friend with an itchy trigger finger for the “share”
button. The friend with despicable political posts. The friend who airs dirty
laundry about family members, right there for all the world to see. You probably
get fairly accurate impressions of people’s intelligence, conscientiousness,
impulsiveness, manners, and so on, and certainly some of these opinions are
negative. Is it unfair for you to judge people based on the information you know about them? (That question strikes me as asking, "Is it unfair of you to be a human being?") Are you using their data “against” them? What if you use these opinions to decide not to socialize with someone? Or what if one of them asks you for a loan, or proposes a business venture? Are you supposed to ignore your impressions of this person's character?
Now suppose you’re hiring someone to do some kind of work
inside your home. A cleaner, a contractor, maybe a babysitter or nanny. You’d
snoop, wouldn’t you? You’d Google and Facebook search their names. If you found something compromising,
you’d consider it relevant. And here I'm not just saying, "You'd be tempted to cheat on a well-justified rule when it benefits you personally." We can all agree that norms and laws against theft should exist, even if I can concoct some silly hypothetical where you are tempted to steal. No, my point here is that it's good to snoop. It's good to keep people with questionable ethics from entering your home. It's good to keep people with impulse-control problems and criminal histories away from your children.
For some of these kinds of decisions, it's probably better if we as a society have "pro-snooping" rules and norms. We want to identify risk factors for various social problems so we can prevent them. This can mean unfairly flagging someone as high-risk just because on paper they look similar to other high-risk individuals, even though that particular person would never have caused any problems. It's unfair to some individuals. But it means that society gets less child molestation, because only low-risk individuals are put in charge of small children. Or it means that society as a whole has functioning financial markets, because people who are likely to repay their debts can borrow money easily, while people who aren't likely to repay either don't get to borrow at all or can only borrow at high interest rates that make the loan profitable.
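To make that last bit of lending arithmetic concrete, here's a back-of-the-envelope sketch with made-up numbers: a one-period loan, zero recovery if the borrower defaults. (A real lender's pricing is far more involved, but the direction of the effect is the same.)

    # Simplified one-period loan, zero recovery on default; numbers invented.
    def breakeven_rate(p_default, cost_of_funds=0.03):
        """Smallest rate r with (1 - p_default) * (1 + r) >= 1 + cost_of_funds."""
        return (1 + cost_of_funds) / (1 - p_default) - 1

    for p in (0.01, 0.05, 0.15, 0.30):
        print(f"P(default) = {p:.0%} -> break-even rate = {breakeven_rate(p):.1%}")
    # 1% -> 4.0%, 5% -> 8.4%, 15% -> 21.2%, 30% -> 47.1%

The lender who can't tell a 1% risk from a 30% risk has to charge everyone something in between, which drives the good risks out of the market. Snooping is what makes the cheap loan possible.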
I fully understand why this makes people seethe with indignation. "How dare someone use my information against me? How dare they snoop and use their ill-gotten findings in a way that compromises me?" But it's not really your information. The entities you are dealing with (potential employers, creditors, insurers) haven't taken anything from you. They are merely talking about you, and mostly they are discussing relevant information.
I don't have a fully worked-out theory of what should be private/protected information and what should not be. I like the idea that people with compromising histories can get a second chance at life. Then again, I also think the right kind of data mining can identify those who will benefit most from a second chance. The solution here could be more snooping and more data crunching: we as a society would get more second chances, but at the expense of permanently marking others as irredeemable. I also like the idea that certain kinds of information fall below a threshold and get ignored. (Low-level, infrequent swearing and occasional references to alcohol on social media don't register at all, but above some threshold they start to become relevant.) But, once again, a good enough dataset will allow one to determine this threshold empirically. The learning algorithm will itself tell you that the information isn't relevant...or perhaps that it is.
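Here's a toy illustration of what "determining the threshold empirically" might look like. The data below are synthetic and the threshold is baked in by construction; the point is only that plain frequency tables (never mind a fancy learning algorithm) would surface it, or reveal its absence.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000
    swear_freq = rng.poisson(3, n)               # swear-posts per month
    risk = 0.05 + 0.40 * (swear_freq > 5)        # threshold baked in at 5/month
    incident = rng.random(n) < risk              # True = the bad outcome occurred

    # "Let the data decide": empirical incident rate at each frequency level.
    for f in range(10):
        mask = swear_freq == f
        if mask.sum() > 100:
            print(f"{f} swear-posts/month: incident rate = {incident[mask].mean():.1%}")
    # Rates sit near 5% through 5 posts/month, then jump to ~45%: the data
    # reveal the cutoff (or its absence) without anyone assuming it up front.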
______________________
I wanted to say something about big, faceless corporations being different from individuals. We might sympathize with the snooping parent who is selecting a safe nanny for their children, but not with the big insurance company with millions of customers, or the mega-bank that holds millions of mortgages. Is it a big deal to tell them "no snooping"? Isn't there a different standard here because they are so big and faceless? I say no. As bureaucratic and automated and mathematical as the risk-pricing is for banks and insurers, these institutions are on the hook for you. Buying an insurance policy is saying, "Will you be on the hook for hundreds of thousands of dollars if I screw up and mangle someone with my car?" Taking out a mortgage is a similar deal.
Imagine pleading with an individual to write you such a contract. They'd say something like, "Sure, maybe, but I'm going to ask around about you so's I can vet you first." They'll ask former creditors whether you repaid your loans, or ask former insurers whether you made a ton of claims (or perhaps never made one). I think most people would consider this kind of "snooping" basically fair, and I don't think changing the intermediary from an individual to a big company changes the example in any relevant way. If anything, the big company will make more accurate judgments, because it draws on a much larger data set; the individual will make worse choices based on a small sample of former dealings with customers and borrowers.
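A quick simulation of that last point, with invented numbers: an individual vetter working from ten past cases gets a wildly noisy estimate of the "bad risk" rate, while an institution with a million records nails it.

    import numpy as np

    rng = np.random.default_rng(0)
    true_rate = 0.10                 # true share of bad risks (invented)
    for n in (10, 1_000_000):
        # 2,000 simulated vetters, each estimating the rate from n past cases
        estimates = rng.binomial(n, true_rate, size=2000) / n
        print(f"n = {n:>9,}: estimates range from "
              f"{estimates.min():.1%} to {estimates.max():.1%}")
    # With 10 past cases the estimate swings wildly (often 0% to 40% or more);
    # with a million records it lands within a tenth of a point of the true 10%.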