We are excited to announce two new capabilities in Amazon SageMaker Studio that will accelerate iterative development for machine learning (ML) practitioners: Local Mode and Docker support. ML model development often involves slow iteration cycles as developers switch between coding, training, and deployment. Each step requires waiting for remote compute resources to start up, which delays validating implementations and getting feedback on changes.
With Local Mode, developers can now train and test models, debug code, and validate end-to-end pipelines directly on their SageMaker Studio notebook instance without the need for spinning up remote compute resources. This reduces the iteration cycle from minutes down to seconds, boosting developer productivity. Docker support in SageMaker Studio notebooks enables developers to effortlessly build Docker containers and access pre-built containers, providing a consistent development environment across the team and avoiding time-consuming setup and dependency management.
Local Mode and Docker support offer a streamlined workflow for validating code changes and prototyping models using local containers running on a SageMaker Studio notebook instance. In this post, we guide you through setting up Local Mode in SageMaker Studio, running a sample training job, and deploying the model on an Amazon SageMaker endpoint from a SageMaker Studio notebook.
SageMaker Studio introduces Local Mode, enabling you to run SageMaker training, inference, batch transform, and processing jobs directly on your JupyterLab, Code Editor, or SageMaker Studio Classic notebook instances without requiring remote compute resources. Benefits of using Local Mode include:
The following figure illustrates the workflow using Local Mode on SageMaker.
To use Local Mode, set instance_type='local' when running SageMaker Python SDK jobs such as training and inference. This runs them on the instances used by your SageMaker Studio IDEs instead of provisioning cloud resources.
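As a minimal sketch of the instance_type switch described above, the following Python shows one way to toggle a job between Local Mode and a cloud instance. The names RunTarget, pick_target, and build_estimator, the train.py entry point, the ml.m5.xlarge default, and the framework version are illustrative assumptions, not part of the original sample; only instance_type='local' comes from the post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunTarget:
    """Where a SageMaker job should run (illustrative helper, not an SDK type)."""
    instance_type: str
    instance_count: int


def pick_target(local: bool, cloud_instance_type: str = "ml.m5.xlarge") -> RunTarget:
    """Return 'local' for quick iterations, a cloud instance type otherwise."""
    if local:
        # Local Mode runs the job in a container on the Studio app instance.
        return RunTarget(instance_type="local", instance_count=1)
    return RunTarget(instance_type=cloud_instance_type, instance_count=1)


def build_estimator(role_arn: str, local: bool = True):
    """Create a scikit-learn estimator; requires the SageMaker Python SDK."""
    # Imported lazily so pick_target() can be used without the SDK installed.
    from sagemaker.sklearn import SKLearn

    target = pick_target(local)
    return SKLearn(
        entry_point="train.py",      # hypothetical training script
        framework_version="1.2-1",   # assumed version; use one your SDK supports
        role=role_arn,
        instance_type=target.instance_type,
        instance_count=target.instance_count,
    )
```

Keeping the choice of target in one small function means the same script can move from local iteration to a cloud run by flipping a single flag.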
Although certain capabilities such as distributed training are only available in the cloud, Local Mode removes the need to switch contexts for quick iterations. When you’re ready to take advantage of the full power and scale of SageMaker, you can seamlessly run your workflow in the cloud.
SageMaker Studio now also enables building and running Docker containers locally on your SageMaker Studio notebook instance. This new feature allows you to build and validate Docker images in SageMaker Studio before using them for SageMaker training and inference.
The following diagram illustrates the high-level Docker orchestration architecture within SageMaker Studio.
With Docker support in SageMaker Studio, you can:
Although some advanced Docker capabilities like multi-container and custom networks are not supported as of this writing, the core build and run functionality is available to accelerate developing containers for bring your own container (BYOC) workflows.
To use Local Mode in SageMaker Studio applications, you must complete the following prerequisites:
Set the EnableDockerAccess parameter to true for the domain’s DockerSettings using the AWS Command Line Interface (AWS CLI). This allows users in the domain to use Local Mode and Docker features. By default, Local Mode and Docker are disabled in SageMaker Studio. Any existing SageMaker Studio apps must be restarted for the Docker service update to take effect. The following is an example AWS CLI command for updating a SageMaker Studio domain:
SageMaker Studio JupyterLab and Code Editor (based on Code-OSS, Visual Studio Code – Open Source) extend SageMaker Studio so you can write, test, debug, and run your analytics and ML code using a popular lightweight IDE. For more details on how to get started with SageMaker Studio IDEs, refer to Boost productivity on Amazon SageMaker Studio: Introducing JupyterLab Spaces and generative AI tools and New – Code Editor, based on Code-OSS VS Code Open Source now available in Amazon SageMaker Studio. Complete the following steps:
Name the space my-sm-code-editor-space or my-sm-jupyterlab-space, respectively.
Choose the ml.m5.large instance and set storage to 32 GB.
Use /home/sagemaker-user/ as the target folder.
Run pip install sagemaker -Uq in the terminal.
For Code Editor only, you need to set the Python environment to run in the current terminal.
Open the scikit_learn_script_mode_local_training_and_serving folder and run the scikit_learn_script_mode_local_training_and_serving.py file.
You can run the script by choosing Run in Code Editor or using the CLI in a JupyterLab terminal. The script output includes the root mean squared error (RMSE).
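For readers checking the RMSE mentioned above, the following is a minimal sketch of how root mean squared error is computed in plain Python. The function name rmse and its input validation are illustrative; the sample script may compute the metric differently.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error between true values and predictions."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Square each residual, average the squares, then take the square root.
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

A lower value means the predictions are closer to the targets, and the result is in the same unit as the target variable.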
You can also use a notebook in SageMaker Studio Classic to run a small-scale training job on CIFAR10 using Local Mode, deploy the model locally, and perform inference.
To set up the notebook, complete the following steps:
Open the pytorch_local_mode_cifar10.ipynb notebook in blog/pytorch_cnn_cifar10.
Choose the PyTorch 2.1.0 Python 3.10 CPU Optimized image.
Because you’re using Docker from SageMaker Studio Classic, remove sudo when running commands, because the terminal already runs as superuser. For SageMaker Studio Classic, the installation commands depend on the SageMaker Studio app image OS. For example, DLC-based framework images are Ubuntu based, so the following instructions work for them. However, for a Debian-based image like DataScience Images, you must follow the instructions in the following GitHub repo. If chained commands fail, run the commands one at a time. You should see the Docker version displayed.
Make sure to run the cell with pip install -U sagemaker so you’re using the latest version of the SageMaker Python SDK.
When you start running the local SageMaker training job, you will see the following log lines:
This indicates that the training is running locally using Docker.
Be patient while the pytorch-training:2.1-cpu-py310 Docker image is pulled. Due to its large size (5.2 GB), it could take a few minutes.
Docker images will be stored in the SageMaker Studio app instance’s root volume, which is not accessible to end-users. The only way to access and interact with Docker images is via the exposed Docker API operations.
From a user confidentiality standpoint, the SageMaker Studio platform never accesses or stores user-specific images.
When the training is complete, you’ll be able to see the following success log lines:
Complete the following steps:
Be patient while the pytorch-inference:2.1-cpu-py310 Docker image is pulled. Due to its large size (4.32 GB), it could take a few minutes.
You will be able to see the predicted classes: frog, ship, car, and plane:
Run docker ps. You’ll be able to see the running pytorch-inference:2.1-cpu-py310 container backing the SageMaker endpoint.
Run docker stop <CONTAINER_ID> to stop it.
If you’re using SageMaker for the first time, refer to Train machine learning models. To learn more about deploying models for inference with SageMaker, refer to Deploy models for inference.
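The deploy, predict, and stop cycle described above can also be sketched with the SageMaker Python SDK. The following is a hedged illustration, not the notebook's code: local_endpoint_config and deploy_predict_cleanup are hypothetical helper names, and estimator is assumed to be an already fitted SDK estimator, for example one trained with instance_type='local'.

```python
def local_endpoint_config(instance_count: int = 1) -> dict:
    """Keyword arguments for estimator.deploy() that select Local Mode."""
    return {"initial_instance_count": instance_count, "instance_type": "local"}


def deploy_predict_cleanup(estimator, payload):
    """Deploy a fitted estimator locally, run one prediction, then tear down.

    `estimator` is assumed to expose the SDK's deploy() method, and the
    returned predictor is assumed to expose predict() and delete_endpoint().
    """
    predictor = estimator.deploy(**local_endpoint_config())
    try:
        return predictor.predict(payload)
    finally:
        # In Local Mode this is expected to stop the serving container,
        # which is what `docker stop <CONTAINER_ID>` does by hand.
        predictor.delete_endpoint()
```

Wrapping the prediction in try/finally keeps a failed request from leaving the inference container running on the Studio instance.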
Keep in mind the following recommendations:
You can define the Docker install process as a Lifecycle Configuration (LCC) script to simplify setup each time a new SageMaker Studio space starts. LCCs are scripts that SageMaker runs during events like space creation. Refer to the JupyterLab, Code Editor, or SageMaker Studio Classic LCC setup (using docker install cli as reference) to learn more.
In this step, you install Docker inside the JupyterLab (or Code Editor) app space and use Docker to build, test, and publish custom Docker images with SageMaker Studio spaces. Spaces are used to manage the storage and resource needs of some SageMaker Studio applications. Each space has a 1:1 relationship with an instance of an application. Every supported application that is created gets its own space. To learn more about SageMaker spaces, refer to Boost productivity on Amazon SageMaker Studio: Introducing JupyterLab Spaces and generative AI tools. Make sure you provision a new space with at least 30 GB of storage to allow sufficient storage for Docker images and artifacts.
To install the Docker CLI and Docker Compose plugin inside a JupyterLab space, run the commands in the following GitHub repo. SageMaker Studio only supports Docker version 20.10.X.
To confirm that Docker is installed and working inside your JupyterLab space, run the following code:
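One way to script this check is sketched below in Python, assuming the Docker CLI is on PATH. The helper names and the SUPPORTED_PREFIX constant are illustrative; the 20.10.X constraint is the one stated in this post, and the --format template is a standard Docker CLI option.

```python
import subprocess
from typing import Optional

SUPPORTED_PREFIX = "20.10."  # the post states only Docker 20.10.X is supported


def is_supported_docker_version(version: str) -> bool:
    """True if the version string belongs to the 20.10.X series."""
    return version.strip().startswith(SUPPORTED_PREFIX)


def installed_docker_client_version() -> Optional[str]:
    """Return the Docker CLI version, or None if the CLI is not available."""
    try:
        result = subprocess.run(
            ["docker", "version", "--format", "{{.Client.Version}}"],
            capture_output=True,
            text=True,
            check=False,  # a non-zero exit can still print the client version
        )
    except FileNotFoundError:
        return None
    version = result.stdout.strip()
    return version or None
```

Calling installed_docker_client_version() and passing the result to is_supported_docker_version() gives a quick pass/fail signal that could be reused in a setup script.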
To build a custom Docker image inside a JupyterLab (or Code Editor) space, complete the following steps:
Create an empty Dockerfile:
touch Dockerfile
The following code shows the contents of an example Flask application file, app.py:
Additionally, you can update the reference Dockerfile commands to include packages and artifacts of your choice.
Build the image:
docker build --network sagemaker --tag myflaskapp:v1 --file ./Dockerfile .
Include --network sagemaker in your docker build command; otherwise, the build will fail. Containers can’t be run in the Docker default bridge or custom Docker networks. Containers are run in the same network as the SageMaker Studio application container. Users can only use sagemaker for the network name.
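Because the network name is easy to forget, the build and run commands can be assembled by a small helper that always includes it. The following Python is a sketch under that assumption; docker_build_command, docker_run_command, and run are hypothetical names, and the flags are the ones shown in this post.

```python
import subprocess
from typing import List

STUDIO_NETWORK = "sagemaker"  # the only network name the post says is allowed


def docker_build_command(
    tag: str, dockerfile: str = "./Dockerfile", context: str = "."
) -> List[str]:
    """Argument list for `docker build` pinned to the sagemaker network."""
    return [
        "docker", "build",
        "--network", STUDIO_NETWORK,
        "--tag", tag,
        "--file", dockerfile,
        context,
    ]


def docker_run_command(image: str) -> List[str]:
    """Argument list for `docker run` pinned to the sagemaker network."""
    return ["docker", "run", "--network", STUDIO_NETWORK, image]


def run(command: List[str]) -> int:
    """Execute a command and return its exit code (requires the Docker CLI)."""
    return subprocess.run(command, check=False).returncode
```

For example, run(docker_build_command("myflaskapp:v1")) would issue the same build command as the one shown above.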
Having Docker installed inside a JupyterLab (or Code Editor) SageMaker Studio space allows you to test pre-built or custom Docker images as containers (or containerized applications). In this section, we use the docker run command to provision Docker containers inside a SageMaker Studio space to test containerized workloads like REST web services and Python scripts. Complete the following steps:
Pull the image:
docker pull 123456789012.dkr.ecr.us-east-2.amazonaws.com/myflaskapp:v1
If the pull requires authentication to Amazon ECR, log in first, substituting your own values for region and aws_account_id:
aws ecr get-login-password --region region | docker login --username AWS --password-stdin aws_account_id.dkr.ecr.region.amazonaws.com
Run the container:
docker run --network sagemaker 123456789012.dkr.ecr.us-east-2.amazonaws.com/myflaskapp:v1
This spins up a new container instance and runs the application defined using Docker’s ENTRYPOINT.
To check that the web endpoint is active, navigate to https://<sagemaker-space-id>.studio.us-east-2.sagemaker.aws/jupyterlab/default/proxy/6006/. You should see a JSON response similar to the following screenshot.
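A containerized web service that returns a JSON response through the proxy could be as small as the following Flask sketch. This is an illustration, not the post's app.py: the route, the payload contents, and the function names are hypothetical, and port 6006 is assumed only because it appears in the proxy URL above. Flask must be installed in the image.

```python
def health_payload() -> dict:
    """JSON body returned by the root route (contents are illustrative)."""
    return {"status": "ok", "service": "myflaskapp"}


def create_app():
    """Build the Flask app; Flask is imported lazily and must be installed."""
    from flask import Flask, jsonify

    app = Flask(__name__)

    @app.route("/")
    def index():
        # Serialize the payload as an application/json response.
        return jsonify(health_payload())

    return app


def main() -> None:
    """Start the server on all interfaces so the Studio proxy can reach it."""
    # 6006 is assumed to match the port segment of the proxy URL.
    create_app().run(host="0.0.0.0", port=6006)
```

Pointing the image's ENTRYPOINT at a script that calls main() would start the server when the container runs.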
To avoid incurring unnecessary charges, delete the resources that you created while running the examples in this post:
SageMaker Studio Local Mode and Docker support empower developers to build, test, and iterate on ML implementations faster without leaving their workspace. By providing instant access to test environments and outputs, these capabilities optimize workflows and improve productivity. Try out SageMaker Studio Local Mode and Docker support using our quick onboard feature, which allows you to spin up a new domain for single users within minutes. Share your thoughts in the comments section!