How to Reduce Bias in AI with a Focus on Training Data

Top Eight Ways to Overcome and Prevent AI Bias

Algorithmic bias in AI is a pervasive problem. You can likely recall examples of biased algorithms in the news, such as speech recognition that identifies the pronoun “his” but not “hers,” or face recognition software that is less likely to recognize people of color. While entirely eliminating bias in AI is not possible, it’s essential to know not only how to reduce bias in AI, but also how to actively work to prevent it. Knowing how to mitigate bias in AI systems starts with understanding the training data sets used to generate and evolve models.

In our 2020 State of AI and Machine Learning Report, only 15% of companies reported data diversity, bias reduction, and global scale for their AI as “not important.” While that’s encouraging, only 24% reported unbiased, diverse, global AI as mission-critical. This means that many companies still need to make a true commitment to overcoming bias in AI, which is not only indicative of success, but critical in today’s context.

Because AI algorithms are meant to intervene where human biases exist, they’re often thought to be unbiased. It’s important to remember that these machine learning models are written by people and trained on socially generated data. This creates the risk of introducing existing human biases into models and amplifying them, preventing AI from truly working for everyone.

Examples of Bias in AI

Even major companies can run into challenging situations when it comes to bias, often with consequences for their reputation and for their end users.

Facial Recognition

Take the facial recognition example: researchers reviewed software from several top companies and discovered that these popular algorithms produced as much as a 34% higher error rate when identifying darker-skinned women versus lighter-skinned men. The implications of this type of bias are vast, depending on where and when facial recognition is applied.
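A gap like the one described above only becomes visible when error rates are reported per group rather than as a single overall number. The following is a minimal sketch of that idea, not the method used by the researchers. The group names, the labels, and the helper functions `error_rate_by_group` and `largest_gap` are all hypothetical, and the sample records are invented for illustration.

```python
from collections import defaultdict


def error_rate_by_group(records):
    """Return {group: error_rate} for (group, true_label, predicted_label) records."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, truth, predicted in records:
        totals[group] += 1
        if truth != predicted:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


def largest_gap(rates):
    """Difference between the highest and lowest per-group error rates."""
    return max(rates.values()) - min(rates.values())


# Hypothetical evaluation results, invented for illustration only.
sample = [
    ("group_a", "match", "match"),
    ("group_a", "match", "match"),
    ("group_a", "match", "no_match"),
    ("group_a", "match", "match"),
    ("group_b", "match", "no_match"),
    ("group_b", "match", "no_match"),
    ("group_b", "match", "match"),
    ("group_b", "match", "match"),
]

rates = error_rate_by_group(sample)
print(rates)
print(largest_gap(rates))
```

In practice, the records would come from a held-out evaluation set that has been annotated with the relevant demographic attribute, and each group would need enough examples for the rates to be meaningful.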

Speech Recognition

If you’ve used a voice-to-text service or a voice-commanded virtual assistant, you’ve interacted with speech recognition AI technology. Unfortunately, these algorithms still have more trouble understanding women than men. The data used to train these models tends to under-represent women and people of color, resulting in poorer accuracy rates for those groups. From a purely financial perspective, this impacts purchasing decisions because, after all, who would want to purchase technology that doesn’t understand them? This is why using training data that represents all of your end users is important.

Bank Loans

Another example of bias in AI is in banking. Some banks use lending algorithms to evaluate a potential borrower’s financials and determine their creditworthiness. But imagine the algorithm is trained on historical data without attention to bias: that algorithm may learn that men are more creditworthy than women because historically, more men were given loans than women due to societal biases. (In fact, a real-life story illustrates this: a man and his wife sent in two identical applications to their banking institution; the algorithm approved the man’s loan and rejected the woman’s.) Without paying attention to who’s represented in your data, you’re at risk of producing an AI solution that doesn’t work fairly for everyone.

It’s important to note that companies don’t normally set out with the intention to produce biased models. Bias can introduce itself accidentally at many stages of the model build and post-deployment process. For that reason, it’s crucial to stay vigilant in mitigating bias throughout your project.
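The lending scenario above can be checked before any model is trained, by comparing historical approval rates across groups. The sketch below is one simple way to do that, under stated assumptions: the application history is invented, the helper names `approval_rates` and `approval_ratio` are hypothetical, and the 0.8 review threshold is an illustrative choice rather than a figure from this article.

```python
def approval_rates(applications):
    """Return {group: share approved} for (group, approved) pairs."""
    counts = {}
    for group, approved in applications:
        total, yes = counts.get(group, (0, 0))
        counts[group] = (total + 1, yes + (1 if approved else 0))
    return {group: yes / total for group, (total, yes) in counts.items()}


def approval_ratio(rates):
    """Lowest group approval rate divided by the highest (1.0 means parity)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was approved, so no group is favored
    return min(rates.values()) / highest


# Hypothetical loan history, invented for illustration only.
history = (
    [("men", True)] * 8 + [("men", False)] * 2
    + [("women", True)] * 4 + [("women", False)] * 6
)

rates = approval_rates(history)
ratio = approval_ratio(rates)

REVIEW_THRESHOLD = 0.8  # assumption: an illustrative cut-off for manual review
if ratio < REVIEW_THRESHOLD:
    print("Approval rates differ noticeably across groups:", rates)
```

A low ratio does not prove the historical decisions were unfair, but it signals that a model trained on this data could learn the disparity and that the data deserves a closer look.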

Eight Steps on How to Reduce Bias in AI

Responsible and successful companies must know how to reduce bias in AI, and proactively turn to their training data to do it. To minimize bias, monitor for outliers by applying statistics and data exploration. At a basic level, AI bias is reduced and prevented by comparing and validating different samples of training data for representativeness. Without this bias management, any AI initiative will ultimately fall apart. Here are eight ways you can prevent AI bias from creeping into your models.

1. Define and narrow the business problem you’re solving. Trying to solve for too many scenarios often means you’ll need a ton of labels across an unmanageable number of classes. Narrowly defining a problem, to start, will help you make sure your model is performing well for the exact reason you’ve built it.
2. Structure data gathering that allows for different opinions. There are often multiple valid opinions or labels for a single data point. Gathering those opinions and accounting for legitimate, often subjective, disagreements will make your model more flexible.
3. Understand your training data. Both academic and commercial datasets can have classes and labels that introduce bias into your algorithms. The more you understand and own your data, the less likely you are to be surprised by objectionable labels. Check also that your data represents the full diversity of your end users. Are all of your potential use cases covered in the data you’ve collected? If not, you may need to find additional data sources.
4. Gather a diverse ML team that asks diverse questions. We all bring different experiences and ideas to the workplace. People from diverse backgrounds (race, gender, age, experience, culture, etc.) will inherently ask different questions and interact with your model in different ways. That can help you catch problems before your model is in production.
5. Think about all of your end users. Likewise, understand that your end users won’t simply be like you or your team. Be empathetic. Acknowledge the different backgrounds, experiences, and demographics of your end users. Avoid AI bias by learning to anticipate how people who aren’t like you will interact with your technology and what problems might arise in their doing so.
6. Annotate with diversity. The more spread out the pool of human annotators, the more diverse your viewpoints. That can really help reduce bias both at the initial launch and as you continue to retrain your models. One option is to source from a global crowd of annotators, who can not only provide a range of perspectives, but also support a variety of languages, dialects, and geographically specific content.
7. Test and deploy with feedback in mind. Models are rarely static for their entire lifetime. A common, but major, mistake is deploying your model without a way for end users to give you feedback on how the model is performing in the real world. Opening up a discussion and forum for feedback will help ensure your model maintains optimal performance levels for everyone.
8. Have a concrete plan to improve your model with that feedback. You’ll want to continually review your model using not just customer feedback, but also independent people auditing for changes, edge cases, instances of bias you might’ve missed, and more. Make sure you get feedback from your model and give it feedback of your own to improve its performance, constantly iterating toward higher accuracy.
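The advice above to check that your data represents the full diversity of your end users can be turned into a simple, repeatable check. The sketch below compares each group’s share of the training data against the share you expect among end users and reports the shortfalls. It is an illustration under stated assumptions: the dialect tags, the expected shares, the 0.05 tolerance, and the helper names `group_shares` and `underrepresented` are all hypothetical.

```python
from collections import Counter


def group_shares(groups):
    """Return {group: fraction of examples} for a list of group tags."""
    counts = Counter(groups)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {group: count / total for group, count in counts.items()}


def underrepresented(train_groups, expected_shares, tolerance=0.05):
    """Return {group: shortfall} where training share trails the expected share.

    A group is reported only when its shortfall exceeds `tolerance`.
    """
    actual = group_shares(train_groups)
    gaps = {}
    for group, expected in expected_shares.items():
        shortfall = expected - actual.get(group, 0.0)
        if shortfall > tolerance:
            gaps[group] = round(shortfall, 3)
    return gaps


# Hypothetical training set tags and expected end-user mix, for illustration.
train = ["en-US"] * 70 + ["en-IN"] * 20 + ["en-NG"] * 10
expected = {"en-US": 0.4, "en-IN": 0.3, "en-NG": 0.3}

gaps = underrepresented(train, expected)
print(gaps)
```

Groups that appear in the result are candidates for additional data collection or annotation. Running the same check each time the model is retrained helps keep the data aligned with the user base as it changes.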

A Future Outlook on the AI Bias Problem

Our State of AI and Machine Learning Report found that about 85% of companies surveyed think reducing bias in their AI efforts is important to at least some degree. While the hope is that more organizations will come to understand bias mitigation as not just important, but mission-critical, the dialogue on responsible AI has certainly gained prominence among AI discussions. Technology companies are working toward hiring more women and people of color (although they remain vastly underrepresented), which should help produce higher-performing, more inclusive AI models. To complement these efforts, organizations should develop an AI governance framework that includes approaches and policies with regard to bias and responsible AI. A comprehensive framework that incorporates elements of the eight steps described above will drive a greater commitment to diversity and, ultimately, result in less-biased final products.

How to Reduce Bias in AI With Appen

At Appen, we have spent the last 20+ years annotating data, leveraging our diverse crowd to help ensure you can confidently deploy your AI models. We can help you avoid the kind of AI bias that lands you on the list of biased algorithm examples, not only by supplying you with a platform with over one million crowd members from 130 countries, but also by setting you up with our managed service team of experts to produce the best training data for your AI models.
