Improving Trust in AI and Online Communities with PaLM-based Moderation

To empower developers to identify sensitive content in a rapidly changing media environment, we are excited to announce Text Moderation powered by PaLM 2, available through the Cloud Natural Language API. Built in collaboration with Jigsaw and Google Research, Text Moderation helps organizations scan for sensitive or harmful content. Here are some examples of how the Text Moderation service can be used:

  • Brand safety: Protect against user-generated content and publisher content that is not considered “brand safe” for the advertiser

  • User protection: Scan for potentially offensive or harmful content

  • Generative AI risk mitigation: Help safeguard against the generation of inappropriate content in outputs from generative models

Promote brand safety

Brand safety is a set of procedures that aim to protect the reputation and trustworthiness of a brand in the digital age. One of the biggest risks to brand safety is the content that ads are associated with. If an ad appears on a website whose content does not conform with the sponsoring brand’s values, it can reflect poorly on the brand and the organization. It’s therefore important for companies to identify and remove content that isn’t aligned with their brand guidelines.

Text Moderation can be used by our customers to identify content that they determine is offensive or harmful, sensitive in context, or otherwise inappropriate for their brand. Once an organization has identified this content, teams can take steps to remove it from advertising campaigns or prevent it from being associated with the brand in the future, helping ensure that advertising campaigns are effective and that the brand is associated with positive and trustworthy content.

Protect users from harmful content

Digital media platforms, gaming publishers, and online marketplaces all have a vested interest in mitigating the risks of user-generated content. They want to provide a safe and welcoming environment for their users while also maintaining an open and free exchange of ideas. Text Moderation can help them achieve this goal, using artificial neural networks to detect harmful content, such as harassment or abuse, so that it can be removed. These efforts can help reduce harm, improve customer experience, and increase customer retention.

Mitigate risks of generative models

Over the last year, progress in AI has enabled software to more reliably generate text, images, and video, leading to new products and services that use machine learning, including text generators, to create content. However, with any AI content generation, there is a risk of producing offensive material, even inadvertently. 

To address this risk, we have trained and evaluated the Text Moderation service on real prompts and responses from large generative models. Text Moderation is versatile and covers a broad range of content types, making it a powerful tool for protecting users from harmful content.
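One way to apply this in practice is to screen both the prompt and the generated response before anything reaches the user. The sketch below is a minimal illustration under assumptions: `moderated_generate`, `fake_generate`, `fake_moderate`, the 0.5 threshold, and the fallback message are all hypothetical and not part of the Text Moderation service. In a real system, `moderate` would wrap a call to the service and return its per-category confidence scores.

```python
from typing import Callable, Dict

Scores = Dict[str, float]  # category name -> confidence in [0, 1]

# Hypothetical message shown when content is blocked.
FALLBACK = "Sorry, I can't help with that request."


def moderated_generate(
    prompt: str,
    generate: Callable[[str], str],
    moderate: Callable[[str], Scores],
    threshold: float = 0.5,  # assumed cut-off; tune per use case
) -> str:
    """Return the model output only if prompt and output both score below threshold."""
    # Screen the prompt first so the model is never called on flagged input.
    if max(moderate(prompt).values(), default=0.0) >= threshold:
        return FALLBACK
    output = generate(prompt)
    # Screen the generated text before returning it to the user.
    if max(moderate(output).values(), default=0.0) >= threshold:
        return FALLBACK
    return output


# Hypothetical stand-ins so the sketch runs without any external service.
def fake_generate(prompt: str) -> str:
    return "echo: " + prompt


def fake_moderate(text: str) -> Scores:
    return {"Toxic": 0.9 if "badword" in text else 0.05}


print(moderated_generate("hello", fake_generate, fake_moderate))
print(moderated_generate("badword", fake_generate, fake_moderate))
```

Passing the generator and the moderation function in as callables keeps the gate independent of any particular model or client library.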

Getting started with Text Moderation using the Natural Language API

Text Moderation is powered by Google’s latest PaLM 2 foundation model to identify a wide range of harmful content, including hate speech, bullying, and sexual harassment. Easy to use and integrate with existing systems, the API can be accessed from almost any programming language to return confidence scores across 16 different “safety attributes.”
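As a rough sketch of what working with those confidence scores might look like, the example below builds a request body and filters a response by threshold. It makes several assumptions that should be checked against the official API reference: the endpoint URL, the request shape, and the `moderationCategories` response field are taken from the public REST documentation as best understood here, and the sample response, its category names, its scores, and the helper names `build_request` and `flag_categories` are illustrative only. No network call is made.

```python
import json

# Assumed REST endpoint; verify against the Cloud Natural Language API reference.
MODERATE_TEXT_URL = "https://language.googleapis.com/v1/documents:moderateText"


def build_request(text: str) -> dict:
    """Build the JSON body for a moderation request (assumed shape)."""
    return {"document": {"type": "PLAIN_TEXT", "content": text}}


def flag_categories(response: dict, threshold: float = 0.5) -> dict:
    """Return the categories whose confidence meets or exceeds the threshold."""
    flagged = {}
    for category in response.get("moderationCategories", []):
        if category["confidence"] >= threshold:
            flagged[category["name"]] = category["confidence"]
    return flagged


# Illustrative response only; real responses cover more safety attributes.
sample_response = {
    "moderationCategories": [
        {"name": "Toxic", "confidence": 0.91},
        {"name": "Insult", "confidence": 0.72},
        {"name": "Profanity", "confidence": 0.12},
    ]
}

print(json.dumps(build_request("Some user comment"), indent=2))
print(flag_categories(sample_response, threshold=0.5))
```

The threshold is a policy decision rather than part of the API: a brand-safety check might flag content at a lower score than a general community filter would.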

Visit the Natural Language AI website to give it a try and refer to the “Text Moderation” page for details. You may also try out the Text Moderation codelab here.
