Categories: FAANG

Announcing Rekogniton Custom Moderation: Enhance accuracy of pre-trained Rekognition moderation models with your data

ml 15641 1

Companies increasingly rely on user-generated images and videos for engagement. From ecommerce platforms encouraging customers to share product images to social media companies promoting user-generated videos and images, using user content for engagement is a powerful strategy. However, it can be challenging to ensure that this user-generated content is consistent with your policies and fosters a safe online community for your users.

Many companies currently depend on human moderators or respond reactively to user complaints to manage inappropriate user-generated content. These approaches don’t scale to effectively moderate millions of images and videos at sufficient quality or speed, which leads to a poor user experience, high costs to achieve scale, or even potential harm to brand reputation.

In this post, we discuss how to use the Custom Moderation feature in Amazon Rekognition to enhance the accuracy of your pre-trained content moderation API.

Content moderation in Amazon Rekognition

Amazon Rekognition is a managed artificial intelligence (AI) service that offers pre-trained and customizable computer vision capabilities to extract information and insights from images and videos. One such capability is Amazon Rekognition Content Moderation, which detects inappropriate or unwanted content in images and videos. Amazon Rekognition uses a hierarchical taxonomy to label inappropriate or unwanted content with 10 top-level moderation categories (such as violence, explicit, alcohol, or drugs) and 35 second-level categories. Customers across industries such as ecommerce, social media, and gaming can use content moderation in Amazon Rekognition to protect their brand reputation and foster safe user communities.

By using Amazon Rekognition for image and video moderation, human moderators have to review a much smaller set of content, typically 1–5% of the total volume, already flagged by the content moderation model. This enables companies to focus on more valuable activities and still achieve comprehensive moderation coverage at a fraction of their existing cost.

Introducing Amazon Rekognition Custom Moderation

You can now enhance the accuracy of the Rekognition moderation model for your business-specific data with the Custom Moderation feature. You can train a custom adapter with as few as 20 annotated images in less than 1 hour. These adapters extend the capabilities of the moderation model to detect images used for training with higher accuracy. For this post, we use a sample dataset containing both safe images and images with alcoholic beverages (considered unsafe) to enhance the accuracy of the alcohol moderation label.

The unique ID of the trained adapter can be provided to the existing DetectModerationLabels API operation to process images using this adapter. Each adapter can only be used by the AWS account that was used for training the adapter, ensuring that the data used for training remains safe and secure in that AWS account. With the Custom Moderation feature, you can tailor the Rekognition pre-trained moderation model for improved performance on your specific moderation use case, without any machine learning (ML) expertise. You can continue to enjoy the benefits of a fully managed moderation service with a pay-per-use pricing model for Custom Moderation.

Solution overview

Training a custom moderation adapter involves five steps that you can complete using the AWS Management Console or the API interface:

Create a project
Upload the training data
Assign ground truth labels to images
Train the adapter
Use the adapter

Let’s walk through these steps in more detail using the console.

Create a project

A project is a container to store your adapters. You can train multiple adapters within a project with different training datasets to assess which adapter performs best for your specific use case. To create your project, complete the following steps:

On the Amazon Rekognition console, choose Custom Moderation in the navigation pane.
Choose Create project.

For Project name, enter a name for your project.
For Adapter name, enter a name for your adapter.
Optionally, enter a description for your adapter.

Upload training data

You can begin with as few as 20 sample images to adapt the moderation model to detect fewer false positives (images that are appropriate for your business but are flagged by the model with a moderation label). To reduce false negatives (images that are inappropriate for your business but don’t get flagged with a moderation label), you are required to start with 50 sample images.

You can select from the following options to provide the image datasets for adapter training:

Import a manifest file with labeled images as per the Amazon Rekognition content moderation taxonomy.
Import images from an Amazon Simple Storage Service (Amazon S3) bucket and provide labels. Make sure that the AWS Identity and Access Management (IAM) user or role has the appropriate access permissions to the specified S3 bucket folder.
Upload images from your computer and provide labels.

Complete the following steps:

For this post, select Import images from S3 bucket and enter your S3 URI.

Like any ML training process, training a Custom Moderation adapter in Amazon Rekognition requires two separate datasets: one for training the adapter and another for evaluating the adapter. You can either upload a separate test dataset or choose to automatically split your training dataset for training and testing.

For this post, select Autosplit.
Select Enable auto-update to ensure that the system automatically retrains the adapter when a new version of the content moderation model is launched.
Choose Create project.

Assign ground truth labels to images

If you uploaded unannotated images, you can use the Amazon Rekognition console to provide image labels as per the moderation taxonomy. In the following example, we train an adapter to detect hidden alcohol with higher accuracy, and label all such images with the label alcohol. Images not considered inappropriate can be labeled as Safe.

Train the adapter

After you label all the images, choose Start training to initiate the training process. Amazon Rekognition will use the uploaded image datasets to train an adapter model for enhanced accuracy on the specific type of images provided for training.

After the custom moderation adapter is trained, you can view all the adapter details (adapterID, test and training manifest files) in the Adapter performance section.

The Adapter performance section displays improvements in false positives and false negatives when compared to the pre-trained moderation model. The adapter we trained to enhance the detection of the alcohol label reduces the false negative rate in test images by 73%. In other words, the adapter now accurately predicts the alcohol moderation label for 73% more images compared to the pre-trained moderation model. However, no improvement is observed in false positives, as no false positive samples were used for training.

Use the adapter

You can perform inference using the newly trained adapter to achieve enhanced accuracy. To do this, call the Amazon Rekognition DetectModerationLabel API with an additional parameter, ProjectVersion, which is the unique AdapterID of the adapter. The following is a sample command using the AWS Command Line Interface (AWS CLI):

aws rekognition detect-moderation-labels 
--image 'S3Object={Bucket="<bucket>",Name="<key>"}' 
--project-version <ARN of the Adapter> 
--region us-east-1

The following is a sample code snippet using the Python Boto3 library:

import boto3
client = boto3.client('rekognition')
response = client.detect_moderation_labels(
    Image={
        "S3Object":{
            "Bucket":"<bucket>",
            "Name":"<key>"
        }
    }, 
    ProjectVersion="<ARN of the Adapter>"
)

Best practices for training

To maximize the performance of your adapter, the following best practices are recommended for training the adapter:

The sample image data should capture the representative errors that you want to improve the moderation model accuracy for
Instead of only bringing in error images for false positives and false negatives, you can also provide true positives and true negatives for improved performance
Supply as many annotated images as possible for training

Conclusion

In this post, we presented an in-depth overview of the new Amazon Rekognition Custom Moderation feature. Furthermore, we detailed the steps for performing training using the console, including best practices for optimal results. For additional information, visit the Amazon Rekognition console and explore the Custom Moderation feature.

Amazon Rekognition Custom Moderation is now generally available in all AWS Regions where Amazon Rekognition is available.

Learn more about content moderation on AWS. Take the first step towards streamlining your content moderation operations with AWS.

About the Authors

Shipra Kanoria is a Principal Product Manager at AWS. She is passionate about helping customers solve their most complex problems with the power of machine learning and artificial intelligence. Before joining AWS, Shipra spent over 4 years at Amazon Alexa, where she launched many productivity-related features on the Alexa voice assistant.

Aakash Deep is a Software Development Engineering Manager based in Seattle. He enjoys working on computer vision, AI, and distributed systems. His mission is to enable customers to address complex problems and create value with AWS Rekognition. Outside of work, he enjoys hiking and traveling.

Lana Zhang is a Senior Solutions Architect at AWS WWSO AI Services team, specializing in AI and ML for Content Moderation, Computer Vision, Natural Language Processing and Generative AI. With her expertise, she is dedicated to promoting AWS AI/ML solutions and assisting customers in transforming their business solutions across diverse industries, including social media, gaming, e-commerce, media, advertising & marketing.

How Amazon Shopping uses Amazon Rekognition Content Moderation to review harmful images in product reviews

August 16, 2023

In "FAANG"

Generate videos in Gemini and Whisk with Veo 2

Transform text-based prompts into high-resolution eight-second videos in Gemini Advanced and use Whisk Animate to turn images into eight-second animated clips.

April 17, 2025

In "FAANG"

Video generation models as world simulators

We explore large-scale training of generative models on video data. Specifically, we train text-conditional diffusion models jointly on videos and images of variable durations, resolutions and aspect ratios. We leverage a transformer architecture that operates on spacetime patches of video and image latent codes. Our largest model, Sora, is capable…

February 16, 2024

In "FAANG"

AI Generated Robotic Content