AI Bias: Good intentions can lead to nasty results

Why fairness through unawareness is a pretty idea with ugly consequences


AI isn’t magic. Whatever “good judgment” it appears to have is either pattern recognition or safety nets built into it by the people who programmed it, because AI is not a person, it’s a pattern-finding thing-labeler. When you build an AI solution, always remember that if it passes your launch tests, you’ll get what you asked for, not what you hoped you were asking for. AI systems are made entirely out of patterns in the examples we feed them and they optimize for the behaviors we tell them to optimize for.

AI is not a person, it’s a pattern-finding thing-labeler.

So don’t be surprised when the system uses the patterns sitting right there in your data. Even if you tried your best to hide them…


Policy layers and reliability

If you care about AI safety, you’ll insist that every AI-based system should have policy layers built on top of it. Think of policy layers as the AI version of human etiquette.

Policy layers are the AI equivalent of human etiquette.

I happen to be aware of some very pungent words across several languages, but you don’t hear me uttering them in public. That’s not because they fail to occur to me. It’s because I’m filtering myself. Society has taught me good(ish) manners. Luckily for all of us there’s an equivalent fix for AI… that’s exactly what policy layers are. A policy layer is a separate layer of logic that sits on top of the ML/AI system. It’s a must-have AI safety net that checks the output, filters it, and determines what to do with it.

After all, if it turns out that your AI system emits some egregious edge case behavior that starts rolling in the direction of a viral PR nightmare as the rest of the internet retweets it, with more and more people prodding your system into embarrassing itself (or worse), wouldn’t it be nice to be able to disable those outputs instantly? Well, without a policy layer, the best your AI engineering team can offer is a plan to train a better version of the model… by next quarter, maybe? Thank goodness for policy layers!
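
To make that concrete, here is a minimal sketch of what a policy layer might look like in code. Everything in it (the blocklist, the fallback message, the model interface) is a hypothetical placeholder; the point is only that the rules live outside the model, so you can change them in minutes without retraining anything.

```python
# A minimal policy layer sketch (all names here are hypothetical placeholders).
# The rules live in plain config outside the model, so changing them
# never requires retraining.

BLOCKED_PHRASES = {"example banned phrase", "another banned phrase"}  # editable in minutes
FALLBACK_RESPONSE = "Sorry, I can't help with that."

def apply_policy(model_output: str) -> str:
    """Check the raw model output against policy rules before anyone sees it."""
    lowered = model_output.lower()
    if any(phrase in lowered for phrase in BLOCKED_PHRASES):
        return FALLBACK_RESPONSE  # filter: swap in a safe response instead
    return model_output           # pass through: the output is allowed

def respond(user_input: str, model) -> str:
    raw_output = model.predict(user_input)  # hypothetical model interface
    return apply_policy(raw_output)         # the safety net sits on top
```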

Policy layers and AI bias

But there’s another side to policy layers which isn’t only about reliability and safety. They’re also very useful for preventing an AI system from emitting output that harmfully perpetuates nasty biases. When the world you live in is different from the world you’re trying to build, there’s something to be said for not blurting out whatever is in front of your nose.


Policy layers are the most ironclad implementation of good manners for a machine.

The trouble is that there might be all kinds of ugly things in data generated from human activity, and if that dataset is a textbook for your AI to learn from, it shouldn’t surprise you that the AI system will learn to perpetuate (and perhaps even accentuate) attitudes we’ve collectively decided we’d prefer to leave in the past. The past, you see, is exactly where your data comes from. (Even “real-time” data is from the past, albeit a millisecond-scale past.) Luckily, policy layers can be used to disallow a whole host of behaviors, including those that keep us mired in a history that has no place in our tomorrows.

Fixing the symptoms versus the disease

I bet that if you appreciate what a policy layer really is — a very simple piece of filtering logic that does not need to be retrained when you update your AI system — you can’t help but notice that using a policy layer to tackle AI bias only mitigates the symptoms, not the underlying disease.

What if you crave something that feels like a real solution? Good for you! True solutions include fundamentally improving your training approach, algorithm, data preparation, objective function, logging, or performance metric. Or, if you’re ambitious, making the world a better place so that reflecting its unfiltered reality is less awful. The trouble with these real fixes is that they take a long time to implement. In the meantime, maybe you’ll reach for a quick fix. Your obvious options are “fairness through unawareness” and policy layers.

True fixes take a long time to implement.

Fairness through unawareness is the knee-jerk reaction of selectively deleting information that offends you in order to prevent your system from using it. This is a pretty idea… that rarely works in practice.

And yet, people keep trying to lean on it. Let’s see why the approach of sheltering your system from undesirable data is as naïve as thinking your kid will never learn naughty words if you don’t utter them at home.

What is fairness through unawareness?

Fairness through unawareness is the attempt to mitigate AI bias by selectively removing information from a training dataset (and then perhaps adding synthetic data to replace what you’ve removed). In the broadest strokes, there are two ways you might go about doing this:

  • Remove features.
  • Remove instances.

If you’re unfamiliar with this lingo, I’ve got a detailed explainer (with cats!) to help you brush up on your data words here. Roughly, an instance is an individual data point (e.g. in your airline’s dataset, the row with your data — your ticket price, your confirmation code, your seat number, etc. all together would be an example of an instance) while a feature is something that varies from data point to data point (e.g. the column of all the frequent flyer numbers in the dataset).
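
If it helps to see the distinction in code, here’s a toy sketch using pandas and a made-up airline table (the column names and values are invented purely for illustration):

```python
import pandas as pd

# A made-up airline dataset, for illustration only.
df = pd.DataFrame({
    "ticket_price": [120.0, 340.0, 89.0],
    "confirmation_code": ["A1B2", "C3D4", "E5F6"],
    "frequent_flyer_number": ["FF001", None, "FF002"],
})

# Removing a feature: drop an entire column for every passenger.
without_feature = df.drop(columns=["frequent_flyer_number"])

# Removing instances: drop entire rows (individual passengers).
without_instances = df[df["ticket_price"] < 300]
```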


As a strategy, fairness through unawareness is a lot like trying to avoid being rude by preventing yourself from learning profanity in the first place. The first problem is that when you’re absorbing information in the real world — as a human or as a machine — it’s hard to control what you’re exposed to. Secondly, if you try to solve the problem of blurting rude words at strangers by making yourself clueless about what those words mean, you’d be hampering your ability to deal effectively with reality. Thirdly, when we’re trying to do this for a machine system, who knows what stimuli we forgot to add to the censorship list? If we make any mistakes in our censorship of the input data, we could create a woefully incompetent system. The much more reliable solution is manners: even if you know it, don’t %$&#ing say it out loud.

Let’s take a closer look at the two unawareness strategies.

Removing features

“I don’t want my model to know about the existence of this demographic attribute.” If your strategy relies on removing features, uh-oh. Just uh-oh.

The feature deletion strategy is a well-intentioned policymaker’s request that you, for example, “don’t use any personal demographic information for training an AI system.” I love how it sounds. Don’t discriminate based on personal characteristics like race, age, and gender. For sure! Something we can all get behind, right?

Well, here’s the thing. I dislike this approach precisely because I dislike discrimination and the harm it causes. You heard me.

If you truly want the world to be a better place and you hate the idea of harming groups of people through discrimination, you insist on more than a nice-sounding strategy. Good intentions aren’t enough; effectiveness is everything. The strategy must work and it must actually protect the people it purports to serve.

So when regulators tell you to remove offending features, do it. I have no problem with that. What I have a problem with is that you’ll stop there. Don’t. Assume that removing the feature solves nothing, then hold yourself accountable for coming up with a real solution.

However big-hearted your good intentions, what often happens when you remove an offending feature from a complex dataset is that the system simply recovers the signal you deleted by combining some other features. For example, if you’re building a hiring system and you snip out the candidate gender column, an AI algorithm can find (and exploit!) the fingerprint of gender by blending other inputs. Maybe not perfectly, but enough to perpetuate insidious discrimination. You’d be stunned by the information it’s possible to extract from complex datasets.
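
One way to convince yourself is a quick leakage audit: hold the deleted column aside and see whether a simple model can reconstruct it from the features you kept. The sketch below assumes scikit-learn and hypothetical arrays (X_remaining, sensitive_attribute); it’s a rough diagnostic, not a complete fairness test.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def proxy_leakage_score(X_remaining, sensitive_attribute):
    """Rough check: can the 'deleted' attribute be predicted from what's left?

    X_remaining:          the features you kept after deleting the sensitive column
    sensitive_attribute:  the column you deleted, held aside just for this audit

    A cross-validated accuracy well above the base rate means the signal is
    still in your data, just smeared across proxy features.
    """
    probe = LogisticRegression(max_iter=1000)
    return cross_val_score(probe, X_remaining, sensitive_attribute, cv=5).mean()
```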

So, while you’re tempted to pat yourself on the back for being a hero, deleting the feature rarely solves the problem. The negative effects of the gender bias you tried to avert will still scream through your dataset. Getting at the true root cause is usually hard (research-level hard) and it’s a long journey to the fix; the only effect of your impatient pruning is to make the issue harder to spot, tempting you to think you solved a problem that you didn’t. But when your system is launched, we’ll see the same ugly behavior.

Creating the illusion that you did something helpful is even more toxic than doing nothing.

If anything, creating the illusion that you did something helpful is even more toxic than doing nothing. You’ll relax, you’ll go on vacation, you’ll think you’re so kind and so lovely… meanwhile your system is continuing to perpetuate all the biases of the past. The past is where data comes from, after all, and the further back it goes the more it drags you back to times you’d rather leave behind.

Wanting things to be better is not good enough. Let’s take effective actions.

To summarize, it’s possible for the information you want your system to be unaware of to be subtly spread across a blend of features. The more complex the dataset, the more you should be afraid of this; your model will cheerfully learn whatever you wanted it to be unaware of in a subtler (and harder to debug) way. Call me grumpy, but I care not a jot for pretty intentions that aren’t accompanied by results. I far prefer building a better world to sitting around wishing for one. Let’s take effective actions.

Deleting features usually doesn’t solve the bias problem.

So, while you’re working hard (and long!) to implement a true fix, use a policy layer as a more reliable temporary measure than unawareness.

Removing instances

Another way you might be tempted to deal with bias is to remove certain individual data points from your dataset or to augment your data by adding synthetic ones.

This strategy is tricky to implement well in practice and the logic is similar to how you’d think about outliers (see the video below).

When you remove data that is truly incorrect and non-informative from a dataset, the results are typically worth it (on balance). Things get better when useless garbage is tossed in the bin. But should you tamper with data that effectively represents reality? That’s a dangerous game.

A sentence that I encourage all AI practitioners to tattoo on their foreheads is this one:

“The world represented by your training data is the only world you can expect to succeed in.”

In other words, if you tamper with the training data so that it no longer looks like it came from the real world, your system might cease to be performant when you launch it. Foisting ineffective AI systems on unsuspecting users isn’t exactly my idea of good manners.

Of course, it’s possible to intentionally make tradeoffs between the system’s performance and other objectives (such as cheerleading our way towards the world we want to live in by giving up some profit from the one we’re in right now). Giving up a bit of raw classification power to protect people from harm might be exactly the kind of trade you should be making.
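
As a concrete (and heavily simplified) illustration of that kind of trade, here is one way you might reweight training instances so that each group contributes equally, rather than deleting anything. The column name and the pandas-based approach are my own assumptions for the sketch, and whether this is the right trade for your system is exactly the judgment call at stake.

```python
import pandas as pd

def balanced_sample_weights(df: pd.DataFrame, group_col: str) -> pd.Series:
    """One illustrative rebalancing scheme: weight each row inversely to the
    size of its group, so every group contributes equally during training.
    (Many training APIs accept per-row weights via a sample_weight argument.)"""
    counts = df[group_col].value_counts()
    n_groups = len(counts)
    return df[group_col].map(lambda g: len(df) / (n_groups * counts[g]))

# Hypothetical usage with a made-up column name:
# weights = balanced_sample_weights(training_df, group_col="applicant_group")
# model.fit(X, y, sample_weight=weights)
```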

The idea here goes beyond AI to a choice we all make about how to live. There’s plenty of nastiness in our world. You can decide to profit by reflecting it back at everyone around you, or you can be a pillar of goodness in the muck. Maybe, by effortfully holding up a vision of a better future, you’ll inspire others to be kinder too. Maybe tomorrow’s world will be a little brighter because of you. You can make that choice with your software too.

In summary, just because something is true doesn’t mean your AI system should be acting according to it. You could opt to lead by example in your software design choices, rebalancing your system’s performance towards the world you want to live in rather than the one you’re in right now. Just don’t expect this to be cheap or straightforward to pull off in practice. Safe deletion and reweighting of the training data is a research problem that is currently giving AI fairness researchers a collective headache. Maybe they’ll figure it out, but in the meantime, no matter what your balance between aspiration and reality, always, always, always build safety nets.

Hello again, policy layers.

Thanks for reading! How about a YouTube course?
If you had fun here and you’re looking for an entire applied AI course designed to be fun for beginners and experts alike, here’s the one I made for your amusement:

