What Is Image Annotation and How Is It Used To Build AI Models?

How Companies Use Image Annotation to Produce High-Quality Training Data

Image annotation is the foundation behind many Artificial Intelligence (AI) products you interact with and is one of the most essential processes in Computer Vision (CV). In image annotation, data labelers use tags, or metadata, to identify the characteristics of the data that you want your AI model to learn to recognize. These tagged images are then used to train the computer to identify those characteristics when it is presented with fresh, unlabeled data.

Think about when you were young. At some point, you learned what a dog was. Eventually, after seeing many dogs, you started to understand the different breeds of dogs and how a dog was different from a cat or a pig. Like us, computers need many examples to learn how to categorize things. Image annotation provides these examples in a way that’s understandable for the computer.

With the increased availability of image data for companies pursuing AI, the number of projects relying on image annotation has grown rapidly. Creating a comprehensive, efficient image annotation process has become increasingly important for organizations working within this area of machine learning (ML).

Applications of Image Annotation

A complete list of the current applications that leverage image annotation would run to thousands of pages. For now, we’ll highlight some of the most compelling use cases across major industries.


Agriculture

Using drones and satellite imagery, farmers leverage AI for many purposes, including estimating crop yield and evaluating soil. An exciting example of image annotation in practice comes from John Deere. The company annotates camera images to differentiate between weeds and crops at the pixel level. It then uses this data to apply pesticides only to the areas where weeds are growing rather than to the entire field, saving tremendous amounts of money in pesticide use each year.


Healthcare

Doctors are supplementing their diagnoses with AI-powered solutions. For instance, AI can examine radiology images to estimate the likelihood that certain cancers are present. In one example, teams train a model using thousands of scans labeled with cancerous and non-cancerous spots until the machine can learn to differentiate on its own. While AI isn’t intended to replace doctors, it can serve as a second check that adds accuracy to crucial health decisions.


Manufacturing

Manufacturers are discovering that image annotation can help them capture information on inventory in their warehouses. They’re training computers to evaluate image data from sensors to determine when a product is about to go out of stock and needs additional units. Certain manufacturers are also using image annotation projects to monitor infrastructure within the plant. Their teams label image data of equipment, which is then used to train computers to recognize specific faults or failures, driving faster fixes and better maintenance overall.


Finance

While the finance industry is far from fully harnessing the power of image annotation projects, several companies are making waves in this space. CaixaBank, for example, uses face recognition technology to verify the identity of customers withdrawing money from ATMs. This is done through an image annotation process known as pose-point, which maps facial features like the eyes and mouth. Facial recognition offers a faster, more precise way of determining identity, reducing the potential for fraud. Image annotation is also critical for processing receipts submitted for reimbursement and checks deposited via a mobile device.


Retail

Image annotation is critical for many different AI use cases in retail. Suppose you want AI to deliver the right results when someone searches for a specific item, such as jeans. Image annotation is required to build a model that can look through a product catalog and serve the results the user wants. Several retailers are also piloting robots in their stores. These robots collect images of shelves to determine whether a product is low or out of stock, indicating that it needs reordering. They can also scan barcode images to gather product information using a process known as image transcription, one of the methods of image annotation described below.

Types of Image Annotation

There are three popular types of image annotation, and the one to select for your use case will depend on the complexity of the project. With each type, the more high-quality image data used, the more accurate the resulting AI predictions will be.


Image Classification

The easiest and fastest method of image annotation, classification applies only one tag to an image. For example, you might want to look through a series of images of grocery store shelves and classify which ones contain soda and which do not. This method is perfect for capturing abstract information, such as the example above, the time of day, or whether cars are in a picture, and for filtering out images that don’t meet your criteria from the start. While classification is the fastest way to give an image a single, high-level label, it’s also the vaguest of the three types we highlight, as it doesn’t indicate where the object is within the image.
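To make the idea of one tag per image concrete, here is a minimal Python sketch of what classification labels might look like for the grocery shelf example. The file names, the tag names, and the helper functions are all hypothetical and chosen for illustration only; real annotation tools export their own formats.

```python
from collections import Counter

# Hypothetical classification labels: each image carries exactly one tag.
labels = {
    "shelf_001.jpg": "soda",
    "shelf_002.jpg": "no_soda",
    "shelf_003.jpg": "soda",
    "shelf_004.jpg": "no_soda",
    "shelf_005.jpg": "soda",
}


def images_with_tag(labels, tag):
    """Return the sorted file names whose single label equals `tag`."""
    return sorted(name for name, label in labels.items() if label == tag)


def tag_counts(labels):
    """Count how many images carry each tag."""
    return Counter(labels.values())


soda_images = images_with_tag(labels, "soda")
counts = tag_counts(labels)
print(soda_images)
print(dict(counts))
```

Notice that nothing in this structure says where the soda sits in the picture. The label answers only a yes-or-no question about the whole image.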

Object Detection

With object detection, annotators are given specific objects that they need to label in an image. If an image is classified as containing soda, object detection goes one step further by showing where the soda is within the image, or where a specific item such as the orange soda is. Several techniques are used for object detection, including:
  • 2D Bounding Boxes: Annotators apply rectangles and squares to define the location of the target objects. This is one of the most popular techniques in the image annotation field.
  • Cuboids, or 3D Bounding Boxes: Annotators apply cuboids to the target object to define both its location and its depth.
  • Polygonal Segmentation: When target objects are asymmetrical and don’t easily fit into a box, annotators use complex polygons to define their location.
  • Lines and Splines: Annotators identify key boundary lines and curves in an image to separate regions. For example, annotators may label the various lanes of a highway for a self-driving car image annotation project.
Because object detection allows boxes and lines to overlap, this method is still not the most precise. What it does provide is the object’s general location while still being a relatively fast annotation process.
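As a rough illustration of 2D bounding boxes and of how they can overlap, the Python sketch below stores each box as corner coordinates and measures overlap with intersection over union (IoU), a common overlap score. The `Box` class, its field names, and the coordinates are assumptions made for this example and do not come from any particular annotation tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """A hypothetical 2D bounding box in pixel coordinates."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        width = max(0.0, self.x_max - self.x_min)
        height = max(0.0, self.y_max - self.y_min)
        return width * height


def intersection_area(a: Box, b: Box) -> float:
    """Area shared by two boxes; zero when they do not overlap."""
    width = min(a.x_max, b.x_max) - max(a.x_min, b.x_min)
    height = min(a.y_max, b.y_max) - max(a.y_min, b.y_min)
    return max(0.0, width) * max(0.0, height)


def iou(a: Box, b: Box) -> float:
    """Intersection over union: 0.0 means disjoint, 1.0 means identical."""
    inter = intersection_area(a, b)
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


# Two neighboring bottles whose boxes share a strip of pixels.
orange_soda = Box("orange_soda", 10, 10, 50, 90)
cola = Box("cola", 40, 10, 80, 90)
far_away = Box("cola", 200, 10, 240, 90)

print(round(iou(orange_soda, cola), 3))
print(iou(orange_soda, far_away))
```

The shared strip between the two neighboring boxes belongs to both labels at once, which is the imprecision described above.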

Semantic Segmentation

Semantic segmentation solves object detection’s overlap problem by ensuring every component of an image belongs to only one class. Usually done at the pixel level, this method requires annotators to assign a category (such as pedestrian, car, or sign) to each pixel. This helps teach an AI model to recognize and classify specific objects, even when they are partially obstructed. For example, if a shopping cart obstructs part of the image, semantic segmentation can be used to identify what orange soda looks like down to the pixel level, so that the model can recognize that it is still, in fact, orange soda.

It’s worth noting that the three image annotation methods outlined above are by no means the only ones. Other types you may have heard about include those used specifically for facial recognition, such as landmark annotation (where the annotator plots characteristics like the eyes, nose, and mouth using pose-point annotation). Image transcription is another standard method, used when the data contains multimodal information, that is, when there is text in the image that needs to be extracted.
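Returning to semantic segmentation, the Python sketch below shows one simple way a pixel-level annotation could be stored: a grid with one class ID per pixel, so no pixel can belong to two classes. The tiny 4x4 mask, the class IDs, and the helper names are illustrative assumptions; production masks are usually image-sized arrays saved in a tool-specific format.

```python
from collections import Counter

# Hypothetical class IDs for the shelf example.
CLASSES = {0: "background", 1: "orange_soda", 2: "shopping_cart"}

# A toy 4x4 mask: one class ID per pixel, so classes cannot overlap.
mask = [
    [0, 1, 1, 0],
    [0, 1, 2, 2],
    [0, 1, 2, 2],
    [0, 0, 2, 2],
]


def is_valid_mask(mask, classes):
    """Check that the mask is rectangular and uses only known class IDs."""
    width = len(mask[0])
    return all(
        len(row) == width and all(class_id in classes for class_id in row)
        for row in mask
    )


def pixel_counts(mask, classes):
    """Count how many pixels were assigned to each class name."""
    counts = Counter(class_id for row in mask for class_id in row)
    return {classes[class_id]: n for class_id, n in counts.items()}


counts = pixel_counts(mask, CLASSES)
print(is_valid_mask(mask, CLASSES))
print(counts)
```

Because every cell holds a single ID, the pixel counts always add up to the image size, which is a quick sanity check to run on exported masks.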

How to Make Image Annotation Easier

Broadly, image annotation is difficult for many of the same reasons that building any AI model is challenging. AI requires large amounts of high-quality data to work properly (the more examples a computer can learn from, the better it will perform), a diverse team to annotate that data, and comprehensive data pipelines for execution. For many organizations, the time, money, and effort required may not be feasible. For those that don’t have the internal resources to complete an end-to-end image annotation project, turning to third-party vendors for assistance is a valid option. These vendors can provide the image data, annotators, tooling, and expertise to assist in such a massive endeavor.

With image annotation specifically, the images often come with a whole host of problems. The image may have poor lighting, the target object may be occluded, or parts of the image may be unrecognizable even to the human eye. Teams must decide how to represent these aspects before beginning an image annotation project. Teams will also need to be careful about naming their labels and differentiating classes, as these factors can confuse the annotator and, ultimately, the machine. Classes that are too similar, for instance, will create unnecessary confusion.

By solving these problems, you can expect to create an AI solution with greater accuracy and speed. When done correctly and with precision, image annotation yields high-quality training data, an essential component of any effective AI model.

Insight from Appen Image Annotation Expert, Liz Otto Hamel

At Appen, we rely on our team of experts to help with image annotation projects for our customers’ machine learning tools. Liz Otto Hamel, one of our product managers, helps ensure the Appen Data Annotation Platform exceeds industry standards in providing high-quality image annotation capabilities and tooling. Liz has a background in academic research and holds a Ph.D. from Stanford University. Her best advice for evaluating and fulfilling image annotation needs includes:
  • Define the scope. Begin with a clear and narrow definition of the business goals of your project. Requirements of your labeled data including annotation geometries, metadata, ontologies, and formats will stem from the business goals of the project. Using the business value to guide your image annotation project will keep things on a clear path.
  • Plan to iterate. Define an initial set of requirements for your labeled data and then run a pilot. Label a small subset of the data yourself. In iterating, you will discover edge cases that may need to be accounted for in the project requirements. It can help to work with a data labeling partner that offers tooling and expertise that covers a wide variety of annotation use cases and can adapt to fit your needs.
  • Plan to integrate. To combat data drift (changes in the types of data your model sees in the wild), you will want to build a scalable, automated training data pipeline so that you can continuously train your model with new data. It can help to work with a data labeling partner that can scale rapidly as the volume of training data you need increases. The bigger the audience interacting with your model, the faster the amount of image annotation needed to keep the model fresh will grow. It’s critical to plan for this from the outset.
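One simple way to watch for the data drift mentioned in the last point is to compare the mix of labels in your training set with the mix seen in a recent sample of production data. The Python sketch below does this with total variation distance. That measure, the 0.2 threshold, the label names, and the function names are all assumptions chosen for illustration, not a method prescribed here, and a real pipeline would likely track more signals than label frequency.

```python
from collections import Counter


def label_distribution(labels):
    """Convert a list of labels into {label: fraction of the list}."""
    if not labels:
        raise ValueError("labels must not be empty")
    total = len(labels)
    return {label: n / total for label, n in Counter(labels).items()}


def total_variation_distance(p, q):
    """Half the summed absolute difference between two distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def needs_new_annotation(train_labels, recent_labels, threshold=0.2):
    """Flag drift when the label mix has shifted past an assumed threshold."""
    distance = total_variation_distance(
        label_distribution(train_labels),
        label_distribution(recent_labels),
    )
    return distance > threshold


# Hypothetical data: a new "juice" class shows up in production images.
train_labels = ["soda"] * 8 + ["no_soda"] * 2
recent_labels = ["soda"] * 5 + ["no_soda"] * 3 + ["juice"] * 2

print(needs_new_annotation(train_labels, recent_labels))
```

A check like this could run on a schedule, with a positive result triggering a new batch of images to be sent for annotation.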

What Appen Can Do For You

At Appen, our data annotation experience spans over 20 years, during which we have acquired advanced resources and expertise on the best formula for successful annotation projects. By combining our intelligent annotation platform, a team of annotators tailored for your projects, and meticulous human supervision by our AI crowd-sourcing specialists, we give you the high-quality training data you need to deploy world-class models at scale. Our text annotation, image annotation, audio annotation, and video annotation capabilities will cover the short-term and long-term demands of your team and your organization. Whatever your data annotation needs may be, our platform, our crowd, and managed service team are standing by to assist you in deploying and maintaining your AI and ML projects. Learn more about what annotation capabilities we have available to help you with your image annotation projects, or contact us today to speak with someone directly.
