
New-and-Improved Content Moderation Tooling

We are introducing a new and improved content moderation tool. The Moderation endpoint improves upon our previous content filter and is available for free today to OpenAI API developers.

To help developers protect their applications against possible misuse, we are introducing the faster and more accurate Moderation endpoint. This endpoint provides OpenAI API developers with free access to GPT-based classifiers that detect undesired content — an instance of using AI systems to assist with human supervision of these systems. We have also released both a technical paper describing our methodology and the dataset used for evaluation.

When given a text input, the Moderation endpoint assesses whether the content is sexual, hateful, violent, or promotes self-harm, all of which are prohibited by our content policy. The endpoint has been trained to be quick, accurate, and robust across a range of applications. Importantly, this reduces the chances of products “saying” the wrong thing, even when deployed to users at scale. As a consequence, AI can unlock benefits in sensitive settings, like education, where it could not otherwise be used with confidence.
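To make the per-category assessment concrete, here is a minimal sketch of reading a moderation result. It assumes the response carries a `results` list whose entries have a `flagged` boolean and a `categories` mapping, which is our understanding of the public API reference and should be checked against the current documentation. The sample response, its scores, and the helper names `is_flagged` and `flagged_categories` are illustrative only.

```python
from typing import Any, Dict, List

# Illustrative response only. The field names are assumed from the public
# API reference; the category list is abbreviated and the scores are made up.
SAMPLE_RESPONSE: Dict[str, Any] = {
    "results": [
        {
            "flagged": True,
            "categories": {
                "hate": False,
                "self-harm": False,
                "sexual": False,
                "violence": True,
            },
            "category_scores": {
                "hate": 0.01,
                "self-harm": 0.02,
                "sexual": 0.01,
                "violence": 0.93,
            },
        }
    ]
}


def is_flagged(response: Dict[str, Any]) -> bool:
    """Return the overall flag of the first result."""
    return bool(response["results"][0]["flagged"])


def flagged_categories(response: Dict[str, Any]) -> List[str]:
    """Return the sorted names of the categories marked true in the first result."""
    categories = response["results"][0]["categories"]
    return sorted(name for name, hit in categories.items() if hit)


if __name__ == "__main__":
    print(is_flagged(SAMPLE_RESPONSE), flagged_categories(SAMPLE_RESPONSE))
```

An application would typically branch on the overall flag first and consult the per-category entries only when it needs to explain or log why a piece of text was rejected.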

[Figure: an input text is sent to the Moderation endpoint, which checks it against four categories (violence, self-harm, hate, sexual) and returns whether the text is flagged.]

The Moderation endpoint helps developers to benefit from our infrastructure investments. Rather than build and maintain their own classifiers—an extensive process, as we document in our paper—they can instead access accurate classifiers through a single API call.
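The single API call might look roughly like the sketch below, which uses only the Python standard library. The URL `https://api.openai.com/v1/moderations`, the `{"input": ...}` request body, and bearer-token authentication are assumptions based on the public API reference and should be confirmed there. The helper names `build_request` and `moderate` are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict

# Assumed endpoint URL; confirm against the current API documentation.
MODERATION_URL = "https://api.openai.com/v1/moderations"


def build_request(text: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the text to classify."""
    body = json.dumps({"input": text}).encode("utf-8")
    return urllib.request.Request(
        MODERATION_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def moderate(text: str, api_key: str) -> Dict[str, Any]:
    """Send the request and return the decoded JSON response.

    This performs a real network call and needs a valid API key.
    """
    with urllib.request.urlopen(build_request(text, api_key)) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from the network call makes the payload easy to inspect and test without credentials. An official client library, where available, would normally replace this hand-rolled request.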

As part of OpenAI’s commitment to making the AI ecosystem safer, we are providing this endpoint to allow free moderation of all OpenAI API-generated content. For instance, Inworld, an OpenAI API customer, uses the Moderation endpoint to help their AI-based virtual characters remain appropriate for their audiences. By leveraging OpenAI’s technology, Inworld can focus on their core product – creating memorable characters.
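One way to moderate generated content is to check each reply before it reaches the user. The sketch below is an assumption about how an application could wire this up, not a description of how Inworld or any other customer does it. `generate` and `check` are hypothetical callables standing in for a text-generation call and a Moderation endpoint call, and the fallback message is a placeholder.

```python
from typing import Callable


def moderated_reply(
    generate: Callable[[str], str],
    check: Callable[[str], bool],
    prompt: str,
    fallback: str = "Sorry, I can't help with that.",
) -> str:
    """Generate a reply and return it only if the moderation check passes.

    `generate` maps a prompt to generated text. `check` returns True when the
    text is flagged. Both are supplied by the caller.
    """
    reply = generate(prompt)
    return fallback if check(reply) else reply


if __name__ == "__main__":
    # Stand-ins for demonstration only; real code would call the API here.
    def fake_generate(prompt: str) -> str:
        return f"Echo: {prompt}"

    def fake_check(text: str) -> bool:
        return "forbidden" in text

    print(moderated_reply(fake_generate, fake_check, "hello"))
    print(moderated_reply(fake_generate, fake_check, "forbidden topic"))
```

Passing the generator and the checker as parameters keeps the gate independent of any particular client library and lets it be exercised with stubs.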

Additionally, we welcome the use of the endpoint to moderate content not generated with the OpenAI API. In one case, NGL, an anonymous messaging platform with a focus on safety, uses the Moderation endpoint to detect hateful language and bullying in their application. NGL finds that these classifiers are capable of generalizing to the latest slang, allowing them to remain more confident over time. Use of the Moderation endpoint to monitor non-API traffic is in private beta and will be subject to a fee. If you are interested, please reach out to us at support@openai.com.


Get started with the Moderation endpoint by checking out the documentation. More details of the training process and model performance are available in our paper. We have also released an evaluation dataset, featuring Common Crawl data labeled within these categories, which we hope will spur further research in this area.


Acknowledgments
Many people reviewed or contributed to this work, and we extend our thanks to them, including: Sam Altman, Miles Brundage, Derek Chen, Karl Cobbe, Thomas Degry, Steve Dowling, Elie Georges, Jacob Hilton, Raf Jakubanis, Fraser Kelton, Matt Knight, Gretchen Krueger, Jason Kwon, Jan Leike, Mira Murati, Tinnei Pang, Girish Sastry, Pranav Shyam, Maddie Simens, Natalie Summers, Justin Wang, Peter Welinder, Dave Willner, Hannah Wong, Jeff Wu, and Summer Yue.

