The launch of ChatGPT and rise in popularity of generative AI have captured the imagination of customers who are curious about how they can use this technology to create new products and services on AWS, such as enterprise chatbots, which are more conversational. This post shows you how you can create a web UI, which we call Chat Studio, to start a conversation and interact with foundation models available in Amazon SageMaker JumpStart such as Llama 2, Stable Diffusion, and other models available on Amazon SageMaker. After you deploy this solution, users can get started quickly and experience the capabilities of multiple foundation models in conversational AI though a web interface.
Chat Studio can also optionally invoke the Stable Diffusion model endpoint to return a collage of relevant images and videos if the user requests for media to be displayed. This feature can help enhance the user experience with the use of media as accompanying assets to the response. This is just one example of how you can enrich Chat Studio with additional integrations to meet your goals.
The following screenshots show examples of what a user query and response look like.
Generative AI chatbots such as ChatGPT are powered by large language models (LLMs), which are based on a deep learning neural network that can be trained on large quantities of unlabeled text. The use of LLMs allows for a better conversational experience that closely resembles interactions with real humans, fostering a sense of connection and improved user satisfaction.
In 2021, the Stanford Institute for Human-Centered Artificial Intelligence termed some LLMs as foundation models. Foundation models are pre-trained on a large and broad set of general data and are meant to serve as the foundation for further optimizations in a wide range of use cases, from generating digital art to multilingual text classification. These foundation models are popular with customers because training a new model from scratch takes time and can be expensive. SageMaker JumpStart provides access to hundreds of foundation models maintained from third-party open source and proprietary providers.
This post walks through a low-code workflow for deploying pre-trained and custom LLMs through SageMaker, and creating a web UI to interface with the models deployed. We cover the following steps:
Refer to the following diagram for an overview of the solution architecture.
To walk through the solution, you must have the following prerequisites:
npm
installed in your local environment. For instructions on how to install npm
, refer to Downloading and installing Node.js and npm.To request a service quota increase, on the AWS Service Quotas console, navigate to AWS services, SageMaker, and request for a service quota raise to a value of 1 for ml.g5.48xlarge for endpoint usage and ml.p3.2xlarge for endpoint usage.
The service quota request may take a few hours to be approved, depending on the instance type availability.
SageMaker is a fully managed machine learning (ML) service for developers to quickly build and train ML models with ease. Complete the following steps to deploy the Llama 2 13b Chat and Stable Diffusion 2.1 foundation models using Amazon SageMaker Studio:
A domain sets up all the storage and allows you to add users to access SageMaker.
meta-textgeneration-llama-2-13b-f
.After the deployment succeeds, you should be able to see the In Service
status.
jumpstart-dft-stable-diffusion-v2-1-base
.After the deployment succeeds, you should be able to see the In Service
status.
This section describes how you can launch a CloudFormation stack that deploys a Lambda function that processes your user request and calls the SageMaker endpoint that you deployed, and deploys all the necessary IAM permissions. Complete the following steps:
lambda.cfn.yaml
) to your local machine.lambda.cfn.yaml
file that you downloaded, then choose Next.This section describes the steps to run the web UI (created using Cloudscape Design System) on your local machine:
functionUrl
.react-llm-chat-studio
code.src/configs/aws.json
and input the access key and secret access key you obtained.To use Chat Studio, choose a foundational model on the drop-down menu and enter your query in the text box. To get AI-generated images along with the response, add the phrase “with images” to the end of your query.
You can further extend the capability of this solution to include additional SageMaker foundation models. Because every model expects different input and output formats when invoking its SageMaker endpoint, you will need to write some transformation code in the callSageMakerEndpoints Lambda function to interface with the model.
This section describes the general steps and code changes required to implement an additional model of your choice. Note that basic knowledge of Python language is required for Steps 6–8.
These are the fields that the new model expects when invoking its SageMaker endpoint. The following screenshot shows an example.
callSageMakerEndpoints
.In the following screenshot, we transformed the input for Falcon 40B Instruct BF16 and GPT NeoXT Chat Base 20B FP16. You can insert your custom parameter logic as indicated to add the input transformation logic with reference to the payload parameters that you copied.
query_endpoint
.This function gives you an idea how to transform the output of the models to extract the final text response.
query_endpoint
, add a custom output handler for your new model.react-llm-chat-studio
code, and navigate to src/configs/models.json
.payload
using the following format: Amplify is a complete solution that allows you to quickly and efficiently deploy your application. This section describes the steps to deploy Chat Studio to an Amazon CloudFront distribution using Amplify if you wish to share your application with other users.
react-llm-chat-studio
code folder you created earlier.To avoid incurring future charges, complete the following steps:
In this post, we explained how to create a web UI for interfacing with LLMs deployed on AWS.
With this solution, you can interact with your LLM and hold a conversation in a user-friendly manner to test or ask the LLM questions, and get a collage of images and videos if required.
You can extend this solution in various ways, such as to integrate additional foundation models, integrate with Amazon Kendra to enable ML-powered intelligent search for understanding enterprise content, and more!
We invite you to experiment with different pre-trained LLMs available on AWS, or build on top of or even create your own LLMs in SageMaker. Let us know your questions and findings in the comments, and have fun!
Whether a company begins with a proof-of-concept or live deployment, they should start small, test…
Digital tools are not always superior. Here are some WIRED-tested agendas and notebooks to keep…
Machine learning (ML) models are built upon data.
Editor’s note: This is the second post in a series that explores a range of…
David J. Berg*, David Casler^, Romain Cledat*, Qian Huang*, Rui Lin*, Nissan Pow*, Nurcan Sonmez*,…
Qualcomm did not violate a license with Arm when it acquired Nuvia for $1.4 billion,…