Generative AI has created unprecedented opportunities for Canadian organizations to transform their operations and customer experiences. We are excited to announce that customers in Canada can now access advanced foundation models including Anthropic’s Claude Sonnet 4.5 and Claude Haiku 4.5 on Amazon Bedrock through cross-Region inference (CRIS).
This post explores how Canadian organizations can use cross-Region inference profiles from the Canada (Central) Region to access the latest foundation models and accelerate their AI initiatives. We will demonstrate how to get started with these new capabilities, provide guidance for migrating from older models, and share recommended practices for quota management.
To help customers scale their generative AI applications, Amazon Bedrock offers cross-Region inference profiles, a feature that enables organizations to seamlessly distribute inference processing across multiple AWS Regions. This capability delivers higher throughput while building at scale, helping to ensure your generative AI applications remain responsive and reliable even under heavy load.
Amazon Bedrock provides two types of cross-Region inference profiles: US and Global.
Cross-Region inference operates through the secure AWS network with end-to-end encryption for both data in transit and at rest. When a customer submits an inference request from the Canada (Central) Region, CRIS intelligently routes the request to one of the destination Regions configured for the inference profile (US or Global profiles).
The key distinction is that while inference processing (the transient computation) may occur in another Region, all data at rest—including logs, knowledge bases, and any stored configurations—remains exclusively within the Canada (Central) Region. The inference request travels over the AWS Global Network, never traversing the public internet, and responses are returned encrypted to your application in Canada.
With CRIS, Canadian organizations gain earlier access to foundation models, including cutting-edge models like Claude Sonnet 4.5 with enhanced reasoning capabilities, providing a faster path to innovation. CRIS also delivers enhanced capacity and performance by providing access to capacity across multiple Regions. This enables higher throughput during peak periods such as tax season, Black Friday, and holiday shopping, automatic burst handling without manual intervention, and greater resiliency by serving requests from a larger pool of resources.
Canadian customers can choose between two inference profile types based on their requirements:
| CRIS profile | Source Region | Destination Regions | Description |
| --- | --- | --- | --- |
| US cross-Region inference | ca-central-1 | Multiple US Regions | Requests from Canada (Central) can be routed to supported US Regions with capacity. |
| Global inference | ca-central-1 | Global AWS Regions | Requests from Canada (Central) can be routed to a Region in the AWS global CRIS profile. |
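You can confirm which system-defined inference profiles are available to your account with the Bedrock control-plane API. The following is a minimal sketch, assuming boto3 is installed and your credentials can call Bedrock in ca-central-1; the `filter_profiles` helper and its field names are based on the `ListInferenceProfiles` response shape.

```python
def filter_profiles(summaries, scope_prefix):
    """Return inference profile IDs whose ID starts with the given
    routing prefix (for example 'us.' or 'global.')."""
    return [
        s["inferenceProfileId"]
        for s in summaries
        if s["inferenceProfileId"].startswith(scope_prefix)
    ]

if __name__ == "__main__":
    import boto3

    # The control-plane client is created in the source Region.
    bedrock = boto3.client("bedrock", region_name="ca-central-1")
    # SYSTEM_DEFINED returns the AWS-managed cross-Region profiles.
    resp = bedrock.list_inference_profiles(typeEquals="SYSTEM_DEFINED")
    for profile_id in filter_profiles(resp["inferenceProfileSummaries"], "us."):
        print(profile_id)
```

Running this in your own account lists the US-scoped profile IDs you can pass as `modelId` in inference calls.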
To begin using cross-Region inference from Canada, follow these steps:
First, verify your IAM role or user has the necessary permissions to invoke Amazon Bedrock models using cross-Region inference profiles.
Here’s an example of a policy for US cross-Region inference:
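A policy along the following lines grants invoke permissions on both the US inference profile in the source Region and the underlying foundation model in the destination Regions. The account ID (`111122223333`) and the specific destination Regions shown are placeholders; adjust them to match your account and the Regions covered by the US profile.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": [
        "arn:aws:bedrock:ca-central-1:111122223333:inference-profile/us.anthropic.claude-sonnet-4-5-20250929-v1:0",
        "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-sonnet-4-5-20250929-v1:0",
        "arn:aws:bedrock:us-west-2::foundation-model/anthropic.claude-sonnet-4-5-20250929-v1:0"
      ]
    }
  ]
}
```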
For global CRIS, refer to the blog post, Unlock global AI inference scalability using new global cross-Region inference on Amazon Bedrock with Anthropic’s Claude Sonnet 4.5.
Configure your application to use the relevant inference profile ID. The profiles use prefixes to indicate their routing scope:
| Model | Routing scope | Inference profile ID |
| --- | --- | --- |
| Claude Sonnet 4.5 | US Regions | us.anthropic.claude-sonnet-4-5-20250929-v1:0 |
| Claude Sonnet 4.5 | Global | global.anthropic.claude-sonnet-4-5-20250929-v1:0 |
| Claude Haiku 4.5 | US Regions | us.anthropic.claude-haiku-4-5-20251001-v1:0 |
| Claude Haiku 4.5 | Global | global.anthropic.claude-haiku-4-5-20251001-v1:0 |
Here’s how to use the Amazon Bedrock Converse API with a US CRIS inference profile from Canada:
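The following is a minimal sketch using boto3, assuming your credentials are configured for ca-central-1. The `build_request` helper and the example prompt are illustrative; the profile ID comes from the table above.

```python
# US CRIS profile ID for Claude Sonnet 4.5 (from the table above).
US_CRIS_PROFILE_ID = "us.anthropic.claude-sonnet-4-5-20250929-v1:0"

def build_request(prompt, max_tokens=512):
    """Build the Converse API arguments for a single-turn user message."""
    return {
        "modelId": US_CRIS_PROFILE_ID,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {"maxTokens": max_tokens},
    }

if __name__ == "__main__":
    import boto3

    # The runtime client is created in the source Region; CRIS routes
    # the request to a destination Region with available capacity.
    client = boto3.client("bedrock-runtime", region_name="ca-central-1")
    response = client.converse(**build_request("Summarize PIPEDA in one sentence."))
    print(response["output"]["message"]["content"][0]["text"])
```

Because the `modelId` is an inference profile rather than a plain model ID, the same call works unchanged if you later switch to the `global.` profile.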
When using CRIS from Canada, quota management is performed at the source Region level (ca-central-1). This means quota increases requested for the Canada (Central) Region apply to all inference requests originating from Canada, regardless of where they’re processed.
Important: When calculating your required quota increases, you need to take into account the burndown rate, defined as the rate at which input and output tokens are converted into token quota usage for the throttling system. The following models have a 5x burndown rate for output tokens (1 output token consumes 5 tokens from your quotas):
For other models, the burndown rate is 1:1 (1 output token consumes 1 token from your quota). For input tokens, the token to quota ratio is 1:1. The calculation for the total number of tokens per request is as follows:
Input token count + Cache write input tokens + (Output token count x Burndown rate)
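The formula above is simple arithmetic; the sketch below works through one hypothetical request against a model with a 5x output burndown rate (the input counts are made up for illustration).

```python
def quota_tokens(input_tokens, cache_write_tokens, output_tokens, burndown_rate):
    """Total tokens counted against the Region-level quota for one request:
    input + cache-write input + (output x burndown rate)."""
    return input_tokens + cache_write_tokens + output_tokens * burndown_rate

# Example: 1,000 input tokens, 200 cache-write tokens, and 500 output
# tokens on a model with a 5x output burndown rate:
# 1000 + 200 + (500 x 5) = 3700 tokens consumed from the quota.
print(quota_tokens(1000, 200, 500, 5))  # -> 3700
```

For a model with a 1:1 burndown rate, the same request would consume only 1,700 tokens, which is why the burndown rate matters when sizing quota increase requests.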
To request quota increases for CRIS in Canada:
Organizations currently using older Claude models should plan their migration to Claude 4.5 to leverage the latest model capabilities.
To plan your migration strategy, incorporate the following elements:
When implementing CRIS from Canada, organizations can choose between US and Global inference profiles based on their specific requirements.
US cross-Region inference is recommended for organizations with existing US data processing agreements, high throughput and resilience requirements, and development and testing environments.
Cross-Region inference for Amazon Bedrock represents an opportunity for Canadian organizations that want to use AI while maintaining data governance. By distinguishing between transient inference processing and persistent data storage, CRIS provides faster access to the latest foundation models without compromising compliance requirements.
With CRIS, Canadian organizations get access to new models within days instead of months. The system scales automatically during peak business periods while maintaining complete audit trails within Canada. This helps you meet compliance requirements and use the same advanced AI capabilities as organizations worldwide. To get started, review your data governance requirements and configure IAM permissions. Then test with the inference profile that matches your needs—US for lower latency to US Regions, or Global for maximum capacity.