In December 2020, AWS announced the general availability of Amazon SageMaker JumpStart, a capability of Amazon SageMaker that helps you quickly and easily get started with machine learning (ML). JumpStart provides one-click fine-tuning and deployment of a wide variety of pre-trained models across popular ML tasks, as well as a selection of end-to-end solutions that solve common business problems. These features remove the heavy lifting from each step of the ML process, making it easier to develop high-quality models and reducing time to deployment.
This post is the fifth in a series on using JumpStart for specific ML tasks. In the first post, we showed how you can run image classification use cases on JumpStart. In the second post, we demonstrated how to run text classification use cases. In the third post, we discussed image segmentation use cases. In the fourth post, we showed how you can run text generation use cases.
In this post, we provide a step-by-step walkthrough on how to deploy pre-trained stable diffusion models for generating images from text. We explore two ways of obtaining the same result: via JumpStart’s graphical interface on Amazon SageMaker Studio, and programmatically through JumpStart APIs.
If you want to jump straight into the JumpStart API code we go through in this post, you can refer to the following sample Jupyter notebook: Introduction to JumpStart – Text to Image.
JumpStart helps you get started with ML models for a variety of tasks without writing a single line of code. Currently, JumpStart enables you to do the following:
JumpStart accepts custom VPC settings and AWS Key Management Service (AWS KMS) encryption keys, so you can use the available models and solutions securely within your enterprise environment. You can pass your security settings to JumpStart within Studio or through the SageMaker Python SDK.
Text-to-image is the task of generating realistic images describing a raw input text. Stable diffusion is a popular model for this task. It’s a latent diffusion model where input text is first embedded into a latent space via a language model. Then, a series of noise addition and removal operations are performed in the latent space with U-Net architecture. Finally, denoised output is decoded into the pixel space.
For example, with the input “Garden painted in impressionist style,” we get the following output image.
Although primarily used to generate images conditioned on text, stable diffusion models can also be used for other tasks such as inpainting, outpainting, and generating image-to-image translations guided by text.
The following sections provide a step-by-step demo to perform inference, both via the Studio UI and via JumpStart APIs. We walk through the following steps:
In this section, we demonstrate how to train and deploy JumpStart models through the Studio UI.
The following video shows you how to find a pre-trained text-to-image model on JumpStart and deploy it. The model page contains valuable information about the model and how to use it. You can deploy any of the pre-trained models available in JumpStart. For inference, we pick the ml.p3.2xlarge instance type, because it provides the GPU acceleration needed for low inference latency at a low price point. After you configure the SageMaker hosting instance, choose Deploy. It may take 5–10 minutes until your persistent endpoint is up and running.
After a few minutes, your endpoint is operational and ready to respond to inference requests!
To accelerate your time to inference, JumpStart provides a sample notebook that shows you how to run inference on your freshly deployed endpoint. Choose Open Notebook under Use Endpoint from Studio.
In the preceding section, we showed how you can use the JumpStart UI to deploy a pre-trained model interactively, in a matter of a few clicks. However, you can also use JumpStart’s models programmatically by using APIs that are integrated into the SageMaker SDK.
In this section, we go over a quick example of how you can replicate the preceding process with the SageMaker SDK. We choose an appropriate pre-trained model in JumpStart, deploy this model to a SageMaker endpoint, and run inference on the deployed endpoint. All the steps in this demo are available in the accompanying notebook Introduction to JumpStart – Text to Image.
SageMaker is a platform that makes extensive use of Docker containers for build and runtime tasks. JumpStart uses the available framework-specific SageMaker Deep Learning Containers (DLCs). We first fetch any additional packages, as well as scripts to handle training and inference for the selected task. Finally, the pre-trained model artifacts are separately fetched with model_uris
, which provides flexibility to the platform. You can use any number of models pre-trained on the same task with a single inference script. See the following code:
Next, we feed the resources into a SageMaker model instance and deploy an endpoint:
After our model is deployed, we can get predictions from it in real time!
The following code snippet gives you a glimpse of what the outputs look like. To send requests to a deployed model, input text needs to be supplied in a utf-8 encoded format.
The endpoint response is a JSON object containing the generated image and the prompt:
We get the following image as our output.
In this post, we showed how to deploy a pre-trained text generation model using JumpStart. You can accomplish this without needing to write code. Try out the solution on your own and send us your comments. To learn more about JumpStart and how you can use open-source pre-trained models for a variety of other ML tasks, check out the following AWS re:Invent 2020 video.
Our next iteration of the FSF sets out stronger security protocols on the path to…
Large neural networks pretrained on web-scale corpora are central to modern machine learning. In this…
Generative AI has revolutionized technology through generating content and solving complex problems. To fully take…
At Google Cloud, we're deeply invested in making AI helpful to organizations everywhere — not…
Advanced Micro Devices reported revenue of $7.658 billion for the fourth quarter, up 24% from…