claudio
In the drive to remain competitive, businesses today are turning to AI to help them minimize cost and maximize efficiency. It’s incumbent on them to find the most suitable AI model—the one that will help them achieve more while spending less. For many businesses, the migration from OpenAI’s model family to Amazon Nova represents not only a shift in model but a strategic move toward scalability, efficiency, and broader multimodal capabilities.
In this blog, we discuss how to optimize prompting in Amazon Nova for the best price-performance.
OpenAI’s models remain powerful, but their operational costs can be prohibitive when scaled. Consider these figures from Artificial Analysis:
Model | Input Token Cost (per Million Tokens) | Output Token Cost (per Million Tokens) | Context Window | Output Speed (Tokens per Second) | Latency (Seconds per first token) |
GPT-4o | ~$2.50 | ~$10.00 | Up to 128K tokens | ~63 | ~0.49 |
GPT-4o Mini | ~$0.15 | ~$0.60 | Up to 128K tokens | ~90 | ~0.43 |
Nova Micro | ~$0.035 | ~$0.14 | Up to 128K tokens | ~195 | ~0.29 |
Nova Lite | ~$0.06 | ~$0.24 | Up to 300K tokens | ~146 | ~0.29 |
Nova Pro | ~$0.80 | ~$3.20 | Up to 300K tokens | ~90 | ~0.34 |
For high-volume applications—like global customer support or large-scale document analysis—these cost differences are disruptive. Not only does Amazon Nova Pro offer over three times the cost-efficiency, its longer context window also enables it to handle more extensive and complex inputs.
Amazon Nova isn’t a single model—it’s a suite designed for various needs:
The lower per-token costs and higher output per second of Amazon Nova give you the flexibility to simplify prompts for real-time applications so you can balance quality, speed, and cost for your use case.
To make the best decision about which model family fits your needs, it’s important to understand the differences in prompt engineering best practices in both OpenAI and Amazon Nova. Each model family has its own set of strengths, but there are some things that apply to both families. Across both model families, quality accuracy is achieved through clarity of instructions, structured prompts, and iterative refinement. Whether you’re using strong output directives or clearly defined use cases, the goal is to reduce ambiguity and improve response quality.
OpenAI uses a layered messaging system for prompt engineering, where system, developer, and user prompts work in harmony to control tone, safety, and output format. Their approach emphasizes:
Transitioning to Amazon Nova isn’t merely a change in API endpoints—it requires retooling your prompt engineering to align with the strengths of Amazon Nova. You need to reframe your use case definition. Begin by breaking down your current GPT-4o or GPT-4o Mini prompt into its core elements of task, role, response style, and instructions and success criteria. Make sure to structure these elements clearly to provide a blueprint for the model.
To understand how to migrate an existing OpenAI prompt to work optimally for Amazon Nova Pro, consider the following example using the meeting notes summarizer. Here is the GPT-4o system prompt:
The user prompt is the meeting notes that need to be summarized:
GPT produces this helpful response:
To meet or exceed the quality of the response from GPT-4o, here is what an Amazon Nova Pro prompt might look like. The prompt uses the same best practices discussed in this post, starting with the system prompt. We used a temperature of .2 and a topP of .9 here:
Here’s the user prompt, using prefilled responses:
The following example shows that the Amazon Nova response meets and exceeds the accuracy of the OpenAI example, formats the output in Markdown, and has found clear owners for each action item:
A few updates to the prompt can achieve comparable or better results from Amazon Nova Pro while enjoying a much less expensive cost of inference.
Amazon Nova Lite and Amazon Nova Pro can support up to 300,000 input tokens, which means that you can include more context in your prompt if needed. Expand your background data and detailed instructions accordingly—if your original OpenAI prompt was optimized for 128,000 tokens, adjust it to use the Amazon Nova extended window.
If your GPT prompt required strict formatting (for example, “Respond in JSON only”), make sure that your Amazon Nova prompt includes these directives. Additionally, if your task involves multimodal inputs, specify when to include images or video references.
The rise of generative AI agents has made function calling, or tool calling, one of the most important abilities of a given large language model (LLM). A model’s ability to correctly pick the right tool for the job, in a low-latency manner, is often the difference between success and failure of an agentic system.
Both OpenAI and Amazon Nova models share similarities in function calling, in particular their support for structured API calls. Both model families support tool selection through defined tool schemas, which we discuss later in this post. They also both provide a mechanism to decide when to invoke these tools or not.
OpenAI’s function calling uses flexible JSON schemas to define and structure API interactions. The models support a wide range of schema configurations, which give developers the ability to quickly implement external function calls through straightforward JSON definitions tied to their API endpoints.
Here is an example of a function:
tools = [{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get current temperature for a given location.",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "City and country e.g. Montevideo, Uruguay"
}
},
"required": [
"location"
],
"additionalProperties": False
},
"strict": True
}
}]
completion = client.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "What is the weather like in Punta del Este today?"}],
tools=tools
Similar to OpenAI’s approach, Amazon Nova can call tools when passed a configuration schema as shown in the following code example. Amazon Nova has made heavy use of Greedy Decoding when calling tools, and it’s advised to set temperature, topP, and topK to 1. This makes sure that the model has the highest accuracy in tool selection. These Greedy Decoding parameters and other great examples of tool use are covered in great detail in Tool use (function calling) with Amazon Nova.
The following is an example of function calling without using additionalModelRequestFields:
tool_config = {
"tools": [{
"toolSpec": {
"name": "get_recipe",
"description": "Structured recipe generation system",
"inputSchema": {
"json": {
"type": "object",
"properties": {
"recipe": {
"type": "object",
"properties": {
"name": {"type": "string"},
"ingredients": {
"type": "array",
"items": {
"type": "object",
"properties": {
"item": {"type": "string"},
"amount": {"type": "number"},
"unit": {"type": "string"}
}
}
},
"instructions": {
"type": "array",
"items": {"type": "string"}
}
},
"required": ["name", "ingredients", "instructions"]
}
}
}
}
}
}]
}
# Base configuration without topK=1
input_text = "I need a recipe for chocolate lava cake"
messages = [{
"role": "user",
"content": [{"text": input_text}]
}]
# Inference parameters
inf_params = {"topP": 1, "temperature": 1}
response = client.converse(
modelId="us.amazon.nova-lite-v1:0",
messages=messages,
toolConfig=tool_config,
inferenceConfig=inf_params
)
# Typically produces less structured or incomplete output
The following example shows how function calling accuracy can be improved by using
additionalModelRequestFields:
# Enhanced configuration with topK=1
response = client.converse(
modelId="us.amazon.nova-lite-v1:0",
messages=messages,
toolConfig=tool_config,
inferenceConfig=inf_params,
additionalModelRequestFields={"inferenceConfig": {"topK": 1}}
)
# Produces more accurate and structured function call
To maximize Amazon Nova function calling potential and improve accuracy, always use additionalModelRequestFields with topk=1. This forces the model to select the single most probable token and prevents random token selection. This increases deterministic output generation and improves function call precision by about 30–40%.
The following code examples further explain how to conduct tool calling successfully. The first scenario shows recipe generation without an explicit tool. The example doesn’t use topK, which typically results in responses that are less structured:
input_text = """
I'm looking for a decadent chocolate dessert that's quick to prepare.
Something that looks fancy but isn't complicated to make.
"""
messages = [{
"role": "user",
"content": [{"text": input_text}]
}]
response = client.converse(
modelId="us.amazon.nova-lite-v1:0",
messages=messages,
inferenceConfig={"topP": 1, "temperature": 1}
)
# Generates a conversational recipe description
# Less structured, more narrative-driven response
In this example, the scenario shows recipe generation with a structured tool. We add topK set to 1, which produces a more structured output:
response = client.converse(
modelId="us.amazon.nova-lite-v1:0",
messages=messages,
toolConfig=tool_config,
inferenceConfig={"topP": 1, "temperature": 1},
additionalModelRequestFields={"inferenceConfig": {"topK": 1}}
)
# Generates a highly structured, JSON-compliant recipe
# Includes precise ingredient measurements
# Provides step-by-step instructions
Overall, OpenAI offers more flexible, broader schema support. Amazon Nova provides more precise, controlled output generation and is the best choice when working with high-stakes, structured data scenarios, as demonstrated in Amazon Nova’s performance on the IFEval benchmark discussed in section 2.1.1 of the technical report and model card. We recommend using Amazon Nova for applications requiring predictable, structured responses because its function calling methodology provides superior control and accuracy.
The evolution from OpenAI’s models to Amazon Nova represents a significant shift in using AI. It shows a transition toward models that deliver similar or superior performance at a fraction of the cost, with expanded capabilities in multimodal processing and extended context handling.
Whether you’re using the robust, enterprise-ready Amazon Nova Pro, the agile and economical Amazon Nova Lite, or the versatile Amazon Nova Micro, the benefits are clear:
By evolving your prompt strategy—redefining use cases, exploiting the extended context, and iteratively refining instructions—you can smoothly migrate your existing workflows from OpenAI’s o4 and o4-mini models to the innovative world of Amazon Nova.
https://preview.redd.it/j6qshjdiao7f1.jpg?width=1182&format=pjpg&auto=webp&s=9f5da751e086c7c3a8cd882f5b7648211daae50c https://reddit.com/link/1leexi9/video/bs096nikao7f1/player Link to the post: https://x.com/viccpoes/status/1934983545233277428 submitted by /u/LatentSpacer [link] [comments]
Editor’s Note: This post provides a detailed rebuttal of the multitude of misguided assertions presented…
Meetings play a crucial role in decision-making, project coordination, and collaboration, and remote meetings are…
The momentum of the Gemini 2.5 era continues to build. Following our recent announcements, we're…
By offering transparent tooling and clear implementation examples, OpenAI is pushing agentic systems out of…