With Knowledge Bases for Amazon Bedrock, you can securely connect foundation models (FMs) in Amazon Bedrock to your company data for Retrieval Augmented Generation (RAG). Access to additional data helps the model generate more relevant, context-specific, and accurate responses without retraining the FMs.
In this post, we discuss two new features of Knowledge Bases for Amazon Bedrock specific to the RetrieveAndGenerate
API: configuring the maximum number of results and creating custom prompts with a knowledge base prompt template. You can now choose these as query options alongside the search type.
The maximum number of results option gives you control over the number of search results to be retrieved from the vector store and passed to the FM for generating the answer. This allows you to customize the amount of background information provided for generation, thereby giving more context for complex questions or less for simpler questions. It allows you to fetch up to 100 results. This option helps improve the likelihood of relevant context, thereby improving the accuracy and reducing the hallucination of the generated response.
The custom knowledge base prompt template allows you to replace the default prompt template with your own to customize the prompt that’s sent to the model for response generation. This allows you to customize the tone, output format, and behavior of the FM when it responds to a user’s question. With this option, you can fine-tune terminology to better match your industry or domain (such as healthcare or legal). Additionally, you can add custom instructions and examples tailored to your specific workflows.
In the following sections, we explain how you can use these features with either the AWS Management Console or SDK.
To follow along with these examples, you need to have an existing knowledge base. For instructions to create one, see Create a knowledge base.
To use the maximum number of results option using the console, complete the following steps:
For this post, we use hybrid search because it combines semantic and text search to provider greater accuracy. To learn more about hybrid search, see Knowledge Bases for Amazon Bedrock now supports hybrid search.
To demonstrate the value of the new feature, we show examples of how you can increase the accuracy of the generated response. We used Amazon 10K document for 2023 as the source data for creating the knowledge base. We use the following query for experimentation: “In what year did Amazon’s annual revenue increase from $245B to $434B?”
The correct response for this query is “Amazon’s annual revenue increased from $245B in 2019 to $434B in 2022,” based on the documents in the knowledge base. We used Claude v2 as the FM to generate the final response based on the contextual information retrieved from the knowledge base. Claude 3 Sonnet and Claude 3 Haiku are also supported as the generation FMs.
We ran another query to demonstrate the comparison of retrieval with different configurations. We used the same input query (“In what year did Amazon’s annual revenue increase from $245B to $434B?”) and set the maximum number of results to 5.
As shown in the following screenshot, the generated response was “Sorry, I am unable to assist you with this request.”
Next, we set the maximum results to 12 and ask the same question. The generated response is “Amazon’s annual revenue increase from $245B in 2019 to $434B in 2022.”
As shown in this example, we are able to retrieve the correct answer based on the number of retrieved results. If you want to learn more about the source attribution that constitutes the final output, choose Show source details to validate the generated answer based on the knowledge base.
You can also customize the default prompt with your own prompt based on the use case. To do so on the console, complete the following steps:
We use the Claude v2 model as an example in this post. The Claude 3 Sonnet and Haiku model is also available for generation.
After you choose the model, a new section called Knowledge base prompt template appears under Configurations.
For this post, we gave a few examples for creating a “Financial Advisor AI system” using Amazon financial reports with custom prompts. For best practices on prompt engineering, refer to Prompt engineering guidelines.
We now customize the default prompt template in several different ways, and observe the responses.
Let’s first try a query with the default prompt. We ask “What was the Amazon’s revenue in 2019 and 2021?” The following shows our results.
From the output, we find that it’s generating the free-form response based on the retrieved knowledge. The citations are also listed for reference.
Let’s say we want to give extra instructions on how to format the generated response, like standardizing it as JSON. We can add these instructions as a separate step after retrieving the information, as part of the prompt template:
The final response has the required structure.
By customizing the prompt, you can also change the language of the generated response. In the following example, we instruct the model to provide an answer in Spanish.
After removing $output_format_instructions$
from the default prompt, the citation from the generated response is removed.
In the following sections, we explain how you can use these features with the SDK.
To change the maximum number of results with the SDK, use the following syntax. For this example, the query is “In what year did Amazon’s annual revenue increase from $245B to $434B?” The correct response is “Amazon’s annual revenue increase from $245B in 2019 to $434B in 2022.”
The ‘numberOfResults
’ option under ‘retrievalConfiguration
’ allows you to select the number of results you want to retrieve. The output of the RetrieveAndGenerate
API includes the generated response, source attribution, and the retrieved text chunks.
The following are the results for different values of ‘numberOfResults
’ parameters. First, we set numberOfResults = 5
.
Then we set numberOfResults = 12
.
To customize the prompt using the SDK, we use the following query with different prompt templates. For this example, the query is “What was the Amazon’s revenue in 2019 and 2021?”
The following is the default prompt template:
The following is the customized prompt template:
With the default prompt template, we get the following response:
If you want to provide additional instructions around the output format of the response generation, like standardizing the response in a specific format (like JSON), you can customize the existing prompt by providing more guidance. With our custom prompt template, we get the following response.
The ‘promptTemplate
‘ option in ‘generationConfiguration
‘ allows you to customize the prompt for better control over answer generation.
In this post, we introduced two new features in Knowledge Bases for Amazon Bedrock: adjusting the maximum number of search results and customizing the default prompt template for the RetrieveAndGenerate
API. We demonstrated how to configure these features on the console and via SDK to improve performance and accuracy of the generated response. Increasing the maximum results provides more comprehensive information, whereas customizing the prompt template allows you to fine-tune instructions for the foundation model to better align with specific use cases. These enhancements offer greater flexibility and control, enabling you to deliver tailored experiences for RAG-based applications.
For additional resources to start implementing in your AWS environment, refer to the following:
TL;DR A conversation with 4o about the potential demise of companies like Anthropic. As artificial…
Whether a company begins with a proof-of-concept or live deployment, they should start small, test…
Digital tools are not always superior. Here are some WIRED-tested agendas and notebooks to keep…
Machine learning (ML) models are built upon data.
Editor’s note: This is the second post in a series that explores a range of…
David J. Berg*, David Casler^, Romain Cledat*, Qian Huang*, Rui Lin*, Nissan Pow*, Nurcan Sonmez*,…