Today, we’re announcing a new API with Amazon Bedrock Guardrails. With this API, you can apply individual safeguards, also referred to as safety checks, at any point in your agentic AI applications without creating guardrail resources. The new InvokeGuardrailChecks API gives you the flexibility to invoke supported safeguards at any turn in the agentic loop and take the required action in your application logic. The API operates in detect-only mode and returns numeric scores for each safeguard. You can define custom thresholds and actions in your applications to block, bypass, retry, or log results for auditing purposes based on your specific requirements.
Amazon Bedrock Guardrails provides configurable safeguards to help you build safe generative AI applications. With comprehensive safety controls across foundation models, Amazon Bedrock Guardrails helps you detect and filter undesirable content and protect sensitive information in both user inputs and model responses.
The new InvokeGuardrailChecks API extends these capabilities for agentic AI applications with multi-turn workflows. AI agents plan tasks, invoke tools, process outputs, and iterate through loops, often without direct user interaction. Each step in this loop carries a different risk profile and requires different safeguards. With the InvokeGuardrailChecks API, you can apply the checks you need, where you need them, without the operational overhead of provisioning separate guardrail resources for each stage. The API returns a numeric score that helps you define your own threshold and action for your application. In this post, we walk through how the InvokeGuardrailChecks API works and how to use it to build safe, multi-turn agentic AI applications.
Generative AI applications typically follow a familiar pattern: a user sends a prompt, the model responds, and a guardrail evaluates both. You create one guardrail resource, configure your policies, and apply it uniformly.
AI agents work differently. They operate in loops: receiving input, generating a response, and repeating across multiple turns in a conversation. A single user session might involve 10, 20, or more turns. Each turn has two stages where safety checks matter: before the content goes to the model (input), and before the model response goes back to the user (output).
Consider a multi-turn customer support agent that handles varied requests across a conversation:
Each step has a distinct risk profile. Creating and applying separate guardrail resources for each step creates operational overhead that scales poorly as you deploy hundreds of agents.
The InvokeGuardrailChecks API gives you granular, per-request control over which safeguards to run at each step of the agent loop. It returns numeric scores so you can define the appropriate thresholds and actions in your application logic, such as retry, block, or bypass, based on what suits your use case.
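One way to picture per-step control is a small lookup from agent stage to the safeguards requested at that stage. The following is a minimal sketch, not the API's request format: the stage names are hypothetical, `contentFilter` and `sensitiveInformation` are the safeguard keys mentioned in this post, and `promptAttack` is an assumed key name for prompt attack detection.

```python
from typing import Dict, List

# Hypothetical stage names and safeguard keys, for illustration only.
# "promptAttack" is an assumed key; check the API reference for the real one.
CHECKS_BY_STAGE: Dict[str, List[str]] = {
    "user_input": ["contentFilter", "promptAttack"],
    "tool_output": ["promptAttack", "sensitiveInformation"],
    "model_output": ["contentFilter", "sensitiveInformation"],
}


def checks_for_stage(stage: str) -> List[str]:
    """Return the safeguards to request for a stage.

    Unknown stages fall back to every known check, which is the
    conservative choice when a new step has not been classified yet.
    """
    all_checks = sorted({c for checks in CHECKS_BY_STAGE.values() for c in checks})
    return CHECKS_BY_STAGE.get(stage, all_checks)
```

Keeping this mapping in application code, rather than in separate guardrail resources, means that changing the checks for one stage is a one-line edit.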
The InvokeGuardrailChecks API uses a structured messages schema in which each content block has a required role, such as system, user, or assistant. This mirrors how agent interactions are structured within the loop. The roles give each safeguard the context it needs to evaluate the content precisely, which is critical for multi-turn agentic workflows.
The InvokeGuardrailChecks API offers the following capabilities:
Resourceless: You don’t need to create guardrail resources upfront. There’s no CreateGuardrail step, no guardrail IDs to track, and no versions to manage. You specify which safeguards to run directly in each API request. This makes it straightforward to add, remove, or adjust checks as your workflows evolve.
Consider the following scenario. Without a resourceless API, applying a safeguard at an ephemeral step in an agentic loop requires multiple lifecycle calls. For example, suppose you want to validate a tool’s output before passing it to the next iteration. You first create a guardrail resource, invoke it, and then delete it after the invocation to avoid resource sprawl. When a single agentic user query triggers dozens of loop iterations, each with different safety requirements, this create-invoke-delete lifecycle becomes untenable. The InvokeGuardrailChecks API avoids this. You call the API with the safeguard you need.
Detect-only: The API doesn’t block, mask, or rewrite content. It returns findings with numeric scores for each safeguard, and you decide what action your application should take. With your custom threshold, you have full control to implement context-aware logic. For example, you can block high-confidence threats, route ambiguous findings to human review, or log low-confidence results for audits.
Symmetric request-response: The safeguards you configure in your request are the same keys returned in the response. If you request contentFilter and sensitiveInformation, only those two appear in results. This makes it straightforward to map findings back to the safeguards that produced them.
Independent prompt attack detection: Unlike the ApplyGuardrail API, where prompt attack detection is bundled inside content filters, the InvokeGuardrailChecks API separates prompt attack detection as its own standalone check. You can invoke prompt attack detection independently without running content filters. Additionally, you can specify individual categories such as jailbreak, prompt injection, or prompt leakage to get fine-grained control.
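Because the response mirrors the request, a client can verify that every safeguard it asked for came back, and nothing else. The following is a minimal sketch that assumes the results have already been parsed into a dictionary keyed by safeguard name; the function name and the dictionary shape are hypothetical.

```python
from typing import Any, Dict, Set


def compare_safeguard_keys(
    requested: Set[str], results: Dict[str, Any]
) -> Dict[str, Set[str]]:
    """Compare requested safeguard names against the keys of a parsed response.

    `results` is assumed to be a dict keyed by safeguard name; the real
    response shape may differ.
    """
    returned = set(results)
    return {
        "missing": requested - returned,      # asked for, not returned
        "unexpected": returned - requested,   # returned, never asked for
    }
```

With a symmetric response both sets are expected to be empty, so a non-empty set is a useful signal to log before acting on the scores.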
The InvokeGuardrailChecks API supports the following safeguards:
| Safeguard | What it detects | Score type |
| --- | --- | --- |
| Content filters | Harmful content across categories: HATE, VIOLENCE, SEXUAL, INSULTS, MISCONDUCT | Severity score (0–1) with discrete scores |
| Prompt attack detection | Jailbreaks, prompt injection, and prompt leakage attempts | Severity score (0–1) with discrete scores |
| Sensitive information filters | PII entities including email, phone, SSN, credit card numbers (31 entity types) | Confidence score (0–1) with discrete scores |
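The API returns two types of scores depending on the check:
Severity score (0–1): returned by content filters and prompt attack detection.
Confidence score (0–1): returned by sensitive information filters. These results also include messageIndex, contentIndex, and character offsets (beginOffset, endOffset) for precise location within the content.
In this section, we walk through how to use the InvokeGuardrailChecks API in your application.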
Calling the API requires the bedrock:InvokeGuardrailChecks permission.
Because the InvokeGuardrailChecks API is resourceless, there’s no guardrail ARN to scope. Attach the following identity-based policy to your IAM role or user:
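The following is a minimal sketch of such a policy, built as a Python dictionary and printed as JSON. The action name and the wildcard resource come from this post; the statement ID and the Region value are placeholder assumptions to replace with your own.

```python
import json

# Placeholder assumption: replace with the Region(s) you actually use.
ALLOWED_REGION = "us-east-1"

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowInvokeGuardrailChecks",  # hypothetical statement ID
            "Effect": "Allow",
            "Action": "bedrock:InvokeGuardrailChecks",
            # The API is resourceless, so there is no ARN to scope to.
            "Resource": "*",
            # Optional: constrain calls to a specific AWS Region.
            "Condition": {
                "StringEquals": {"aws:RequestedRegion": ALLOWED_REGION}
            },
        }
    ],
}

policy_json = json.dumps(policy, indent=2)
print(policy_json)
```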
Why use Resource: "*"? The InvokeGuardrailChecks API is resourceless by design. There’s no guardrail ARN associated with any call. The wildcard is the only valid value for this field. This doesn’t grant access to other Amazon Bedrock resources. It applies solely to the bedrock:InvokeGuardrailChecks action.
To further restrict access, combine with condition keys such as the following:
aws:SourceIp or aws:SourceVpc to limit calls to specific networks.
aws:PrincipalTag to restrict to specific teams or roles (for example, "aws:PrincipalTag/team": "agent-safety").
aws:RequestedRegion to constrain to specific AWS Regions (as shown in the preceding policy).
When your agent receives a user’s message, check for harmful content before sending it to a model. The following example evaluates content for violence and misconduct:
The following is the example output:
The high severity scores indicate that the content strongly matches harmful categories. Your application decides the action, such as block, log, or escalate.
AI agents often have system instructions that bad actors might try to override. You can evaluate a system-user message pair for jailbreaks and prompt leakage attempts:
The following is the example output:
When a tool returns results from a web search or database query, you can apply multiple checks in a single call. The API executes checks in parallel:
The following is the example output:
The sensitive information results include character offsets, giving you precise location data for client-side masking or redaction.
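Client-side redaction from those offsets can be sketched as follows. This is an illustration under stated assumptions, not the documented response shape: each finding is assumed to be a dictionary with `beginOffset` and `endOffset`, `endOffset` is assumed to be exclusive, and findings are assumed not to overlap. The `redact` helper is hypothetical.

```python
from typing import Dict, List


def redact(text: str, findings: List[Dict[str, int]], mask: str = "[REDACTED]") -> str:
    """Replace each [beginOffset, endOffset) span in `text` with `mask`.

    Assumes endOffset is exclusive and spans do not overlap.
    """
    # Work right to left so that replacing one span does not shift
    # the offsets of the spans that come before it.
    for finding in sorted(findings, key=lambda f: f["beginOffset"], reverse=True):
        begin, end = finding["beginOffset"], finding["endOffset"]
        text = text[:begin] + mask + text[end:]
    return text
```

If the API turns out to report inclusive end offsets, the slice on the right-hand side would need `end + 1` instead.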
The InvokeGuardrailChecks API uses scores to drive context-aware decisions. The following pattern shows adaptive response logic:
With this pattern, you can implement thresholds that match your business context. A financial services application might block at 0.4, whereas a creative writing tool might only block at 0.8.
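A minimal sketch of such context-specific thresholds is shown below. The block thresholds of 0.4 and 0.8 are the ones mentioned above; the review thresholds, the `Thresholds` class, and the `decide` function are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    block: float   # at or above this score, block the content
    review: float  # at or above this score, route to human review


def decide(score: float, thresholds: Thresholds) -> str:
    """Map a numeric safeguard score to an application action."""
    if score >= thresholds.block:
        return "block"
    if score >= thresholds.review:
        return "review"
    return "log"


# Block values come from the text; review values are illustrative assumptions.
FINANCIAL_SERVICES = Thresholds(block=0.4, review=0.2)
CREATIVE_WRITING = Thresholds(block=0.8, review=0.5)
```

The same score can therefore lead to different actions: a score of 0.6 is blocked under the financial services profile but only routed to review under the creative writing profile.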
The InvokeGuardrailChecks API integrates naturally with agent frameworks that expose lifecycle hooks. The following example uses Strands Agents, which provides hooks at key stages of the agent loop:
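Independent of any particular framework, the hook idea can be sketched in plain Python. This sketch does not use the Strands Agents API: the class, the hook method names, and the `checker` callable (a stand-in for whatever function wraps the InvokeGuardrailChecks call and returns one score per safeguard) are all hypothetical, as is the `promptAttack` key.

```python
from typing import Callable, Dict, List

# Stand-in for a function that calls the safety-check API and returns
# a score per requested safeguard. Its signature is an assumption.
Checker = Callable[[str, List[str]], Dict[str, float]]


class GuardedAgentLoop:
    """Runs different safeguards before and after the model call."""

    def __init__(self, checker: Checker, block_at: float = 0.5) -> None:
        self.checker = checker
        self.block_at = block_at
        self.audit_log: List[Dict[str, float]] = []

    def _is_allowed(self, text: str, checks: List[str]) -> bool:
        scores = self.checker(text, checks)
        self.audit_log.append(scores)  # keep every result for auditing
        return all(score < self.block_at for score in scores.values())

    def before_model(self, user_text: str) -> bool:
        """Input-side hook: screen the prompt before it reaches the model."""
        return self._is_allowed(user_text, ["contentFilter", "promptAttack"])

    def after_model(self, model_text: str) -> bool:
        """Output-side hook: screen the response before it reaches the user."""
        return self._is_allowed(model_text, ["contentFilter", "sensitiveInformation"])
```

In a real framework, these two methods would be registered against the equivalent lifecycle events, and the caller would decide what to do when a hook returns `False`.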
You can use either the InvokeGuardrailChecks or ApplyGuardrail API offered by Amazon Bedrock Guardrails, depending on your use case and application. The following table provides details and pointers on when to use which API.
| | InvokeGuardrailChecks | ApplyGuardrail |
| --- | --- | --- |
| Use case | Targeted checks at specific points or turns in workflows | Uniform enforcement across your application |
| Resource model | Resourceless. Checks specified inline per request using your own control plane | Create, version, and manage guardrail resources upfront |
| Decision logic | Detect only. Returns numeric scores so you decide the action for your application logic | Automatic block, mask, or bypass based on pre-configured thresholds |
| Targeted toward | Agentic AI workflows requiring per-step safety requirements | Traditional request-response AI applications |
The InvokeGuardrailChecks API is resourceless, so no persistent resources are created. To clean up after testing, complete the following steps:
The InvokeGuardrailChecks API complements current Amazon Bedrock Guardrails capabilities with composable safety building blocks for agentic AI. Here are some additional takeaways:
To get started, see the InvokeGuardrailChecks API reference and apply individual safety checks across your agentic AI applications.