
New-and-Improved Content Moderation Tooling

We are introducing a new-and-improved content moderation tool: The Moderation endpoint improves upon our previous content filter, and is available for free today to OpenAI API developers.

To help developers protect their applications against possible misuse, we are introducing the faster and more accurate Moderation endpoint. This endpoint provides OpenAI API developers with free access to GPT-based classifiers that detect undesired content — an instance of using AI systems to assist with human supervision of these systems. We have also released both a technical paper describing our methodology and the dataset used for evaluation.

When given a text input, the Moderation endpoint assesses whether the content is sexual, hateful, violent, or promotes self-harm, all of which our content policy prohibits. The endpoint has been trained to be quick, accurate, and robust across a range of applications. Importantly, this reduces the chances of products “saying” the wrong thing, even when deployed to users at scale. As a consequence, AI can unlock benefits in sensitive settings, like education, where it could not otherwise be used with confidence.

[Figure: input text is sent to the Moderation endpoint, which checks four categories (Violence, Self-harm, Hate, Sexual) and labels the result, for example "Flagged".]

The Moderation endpoint helps developers to benefit from our infrastructure investments. Rather than build and maintain their own classifiers—an extensive process, as we document in our paper—they can instead access accurate classifiers through a single API call.
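As a rough illustration of what "a single API call" might look like, here is a minimal Python sketch using only the standard library. The URL, the `input` field, and the `results` / `flagged` / `categories` response shape are assumptions based on the public API documentation rather than anything stated in this post, and `build_request`, `summarize`, and `moderate` are hypothetical helper names; check the documentation before relying on any of it.

```python
import json
import urllib.request
from typing import List, Tuple

# Assumed endpoint URL; confirm against the official documentation.
MODERATION_URL = "https://api.openai.com/v1/moderations"


def build_request(text: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the text to classify."""
    body = json.dumps({"input": text}).encode("utf-8")  # assumed payload shape
    return urllib.request.Request(
        MODERATION_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def summarize(response: dict) -> Tuple[bool, List[str]]:
    """Return (flagged, sorted names of categories marked True) for the first result."""
    result = response["results"][0]  # assumed response shape
    hits = sorted(name for name, on in result["categories"].items() if on)
    return result["flagged"], hits


def moderate(text: str, api_key: str) -> Tuple[bool, List[str]]:
    """Send the request and summarize the reply (needs network access and a valid key)."""
    with urllib.request.urlopen(build_request(text, api_key)) as reply:
        return summarize(json.load(reply))
```

Keeping request construction and response parsing separate from the network call means both can be checked offline; only `moderate` needs a real API key.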

As part of OpenAI’s commitment to making the AI ecosystem safer, we are providing this endpoint to allow free moderation of all OpenAI API-generated content. For instance, Inworld, an OpenAI API customer, uses the Moderation endpoint to help their AI-based virtual characters remain appropriate for their audiences. By leveraging OpenAI’s technology, Inworld can focus on their core product – creating memorable characters.
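One way an application might screen generated text before showing it to users is sketched below. This is an illustrative pattern, not how Inworld or any other customer does it: `screen`, `FALLBACK`, and the `is_flagged` callable are hypothetical names, and `is_flagged` stands in for whatever code actually queries the Moderation endpoint.

```python
from typing import Callable

# Hypothetical placeholder shown instead of flagged text.
FALLBACK = "[response withheld]"


def screen(text: str, is_flagged: Callable[[str], bool], fallback: str = FALLBACK) -> str:
    """Return the text unchanged unless the supplied classifier flags it."""
    return fallback if is_flagged(text) else text


if __name__ == "__main__":
    # Stub classifier for demonstration only; a real one would call the endpoint.
    def stub(text: str) -> bool:
        return "forbidden" in text

    print(screen("a friendly greeting", stub))
    print(screen("something forbidden", stub))
```

Passing the classifier in as a parameter keeps the gating logic testable without network access and lets the application decide separately how to handle flagged output.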

Additionally, we welcome the use of the endpoint to moderate content not generated with the OpenAI API. In one case, the company NGL, an anonymous messaging platform with a focus on safety, uses the Moderation endpoint to detect hateful language and bullying in their application. NGL finds that these classifiers generalize to the latest slang, which lets them remain more confident over time. Use of the Moderation endpoint to monitor non-API traffic is in private beta and will be subject to a fee. If you are interested, please reach out to us at support@openai.com.


Get started with the Moderation endpoint by checking out the documentation. More details of the training process and model performance are available in our paper. We have also released an evaluation dataset, featuring Common Crawl data labeled within these categories, which we hope will spur further research in this area.


Acknowledgments
We thank the many people who reviewed or contributed to this work, including: Sam Altman, Miles Brundage, Derek Chen, Karl Cobbe, Thomas Degry, Steve Dowling, Elie Georges, Jacob Hilton, Raf Jakubanis, Fraser Kelton, Matt Knight, Gretchen Krueger, Jason Kwon, Jan Leike, Mira Murati, Tinnei Pang, Girish Sastry, Pranav Shyam, Maddie Simens, Natalie Summers, Justin Wang, Peter Welinder, Dave Willner, Hannah Wong, Jeff Wu, and Summer Yue.

