Inclusive AI – Search Results: Search Engines Rely on Data Annotators to Improve Content Relevance

As part of our Inclusive AI series, today we’re focusing on how annotators contribute to relevant search results. Companies of all types use search to assist users with accessing products or services quickly. A banking organization may have a search bar for you to ask questions like “how do I get a new credit card?”, a retailer will have a search function for you to find specific products, and even a government website may let you search for topics you’re interested in.

When you type a query into a search bar, you expect to get highly relevant results in return. With more internet users than ever across the globe, companies are relying on AI to personalize search results to each user’s geography, demographics, and other applicable characteristics. Without accurately labeled data from annotators who represent these characteristics, AI-powered search would struggle to return appropriate results for each individual query.

Search, Data Annotation, and AI Explained

Let’s look at a search example to understand this further. Imagine you’re browsing the site of a major retailer and you’re looking to purchase a long-sleeved shirt that buttons down the front—one you could wear to an interview or to an office. You might type into the search bar “button-down shirt”, but someone else may type in a different term even though they’re looking for the exact same thing. In our shirt example, there are many different terms you could use:

  • dress shirt
  • formal shirt
  • long-sleeved shirt
  • shirt for work

And so on. Terms for the same item can vary widely by the searcher’s location and experiences. If I’m in India, for instance, I might search for capsicum, but if I’m in the U.S. I’d use the term bell pepper. An AI-powered search engine needs to know all of the potential terms a user may input, and deliver accurate and relevant results in response.
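To make the idea concrete, here is a minimal sketch of how a search function might map different terms to one canonical product name. It is an illustration under stated assumptions, not how any particular search engine works: the `SYNONYMS` table and the `normalize_query` function are hypothetical names invented for this example, and a production system would typically learn these relationships from labeled data rather than hard-code them.

```python
# Hypothetical synonym table (an assumption for illustration): each regional
# or stylistic variant maps to one canonical term used by the catalog.
SYNONYMS = {
    "dress shirt": "button-down shirt",
    "formal shirt": "button-down shirt",
    "long-sleeved shirt": "button-down shirt",
    "shirt for work": "button-down shirt",
    "capsicum": "bell pepper",
}


def normalize_query(query: str) -> str:
    """Return the canonical term for a query, or the cleaned query if no synonym is known."""
    # Lowercase and collapse repeated whitespace so "  Dress   Shirt " matches "dress shirt".
    cleaned = " ".join(query.lower().split())
    return SYNONYMS.get(cleaned, cleaned)


if __name__ == "__main__":
    for term in ["Dress Shirt", "capsicum", "sneakers"]:
        print(term, "->", normalize_query(term))
```

In this sketch, a term that is missing from the table simply passes through unchanged, which is exactly the gap that a broader set of annotators helps close.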

This is why it’s so important to have a diverse group of data annotators supporting a search engine. Each annotator brings their own perspective to search terms, offering insights that their counterparts around the globe may not have. For a general search engine such as Microsoft’s Bing or Google, this diversity is especially critical to cover all of the permutations, or versions, of each item. Otherwise, your search for “dress shirt” may lead to the dreaded “no results” page, and you may no longer want to use that search engine.

In our shirt example, annotators can label images of shirts with whatever term is most familiar to them. A large number of annotators spread across different geographic locations have a much better chance of capturing all of the possible shirt terms than a small group based in one locale. Companies, then, must rely on annotators to help them deliver relevant, inclusive search results for all of their end users.
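The labeling step described above can be sketched in a few lines. This is a simplified illustration, not Appen’s or any client’s actual pipeline: the `ANNOTATIONS` records, the market codes, the item IDs, and the `build_term_index` and `search` functions are all hypothetical. The sketch assumes each annotator submits a free-text label for an item, and the search index simply remembers every label that was used.

```python
from collections import defaultdict

# Hypothetical annotation records (assumed for illustration):
# (item id, annotator's market, label the annotator chose)
ANNOTATIONS = [
    ("shirt-001", "US", "button-down shirt"),
    ("shirt-001", "UK", "formal shirt"),
    ("shirt-001", "IN", "Formal Shirt"),
    ("shirt-001", "AU", "dress shirt"),
    ("veg-042", "IN", "capsicum"),
    ("veg-042", "US", "bell pepper"),
]


def _clean(text: str) -> str:
    """Lowercase and collapse whitespace so equivalent labels compare equal."""
    return " ".join(text.lower().split())


def build_term_index(annotations):
    """Map each cleaned label to the set of item ids that annotators attached it to."""
    index = defaultdict(set)
    for item_id, _market, label in annotations:
        index[_clean(label)].add(item_id)
    return dict(index)


def search(index, query):
    """Return matching item ids in sorted order; an empty list is the 'no results' case."""
    return sorted(index.get(_clean(query), set()))


if __name__ == "__main__":
    index = build_term_index(ANNOTATIONS)
    print(search(index, "dress shirt"))
    print(search(index, "necktie"))
```

If only the U.S. annotator’s labels were available in this sketch, a search for “capsicum” or “formal shirt” would return nothing, which is why a wider spread of annotators matters.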


Search Projects Worth Highlighting

Many clients come to Appen for help improving their search engine algorithms and making them more useful to end users. The projects highlighted here are just a couple of examples where a diverse, geographically distributed crowd of annotators made a significant difference in the outcome:

Microsoft’s Bing Improves Search Quality

Microsoft’s Bing search engine relies on search results that are accurate, comprehensive, and relevant to the user’s query. Microsoft continually works on improving the quality of Bing’s search results, and has relied on Appen for help in doing so. At the start of the partnership, our annotators assisted in a trial project just for the U.S. market that involved rating search results as high or low quality and scoring them on other key metrics. This data helped train the search engine to identify which results were most important to display at the top of the list.

Microsoft wanted Bing’s search results to be culturally relevant as well. To make that happen, they needed annotators from over a dozen markets worldwide. These annotators rated the search results based on how culturally relevant they were to their location. Today, our annotators process millions of pieces of data each month to keep improving the quality of Bing.

Leading Search Engine Updates Local Content

A leading multilingual search engine wanted Appen’s help in improving the quality of its local business listings, which included addresses, phone numbers, hours of operation, maps, and directions. As more businesses came online, it became difficult for this search engine to keep this information accurate and up to date using just an in-house team. Maintaining accuracy was crucial to retaining user confidence in the search engine.

The client leveraged Appen’s in-market annotators to review and verify the local listing data. With just ten annotators at the start, the project quickly expanded to include hundreds of annotators across 31 markets. It was essential to have annotators in each market, as sometimes verifying the business information would require in-person visits. In addition, the use of our global crowd overcame the time-zone, language, and cultural differences that the in-house team was struggling with. Ultimately, our annotators updated hundreds of thousands of business listings for the search engine.

For more details, read the full case study.

The Future of Search and Data Annotators

Search engines will need to appeal to more markets as internet usage grows. As an annotator, your local expertise is vital to maximizing the relevance of search results. This is true across industries: many websites feature search capabilities, whether they sell products, run a social media platform, or offer any number of other services. Companies will increasingly rely on annotators around the world to represent their end users, and this is a great thing: more representation leads to AI that’s more inclusive and works better for everyone.

If you’re interested in learning more about search from a technical perspective, read this article on AI-powered search relevance.
