Operating a self-managed MLflow tracking server comes with administrative overhead, including server maintenance and resource scaling. As teams scale their ML experimentation, efficiently managing resources during peak usage and idle periods is a challenge. Organizations running MLflow on Amazon EC2 or on-premises can optimize costs and engineering resources by using Amazon SageMaker AI with serverless MLflow.
This post shows you how to migrate your self-managed MLflow tracking server to an MLflow App, a serverless tracking server on SageMaker AI that automatically scales resources based on demand and removes server patching and storage management tasks, at no cost. You will learn how to use the MLflow Export Import tool to transfer your experiments, runs, models, and other MLflow resources, and how to validate that the migration succeeded.
While this post focuses on migrating from self-managed MLflow tracking servers to SageMaker with MLflow, the MLflow Export Import tool offers broader utility. You can apply the same approach to migrate existing SageMaker managed MLflow tracking servers to the new serverless MLflow capability on SageMaker. The tool also helps with version upgrades and establishing backup routines for disaster recovery.
The following guide provides step-by-step instructions for migrating an existing MLflow tracking server to SageMaker with MLflow. The migration process consists of three main phases: exporting your MLflow artifacts to intermediate storage, configuring an MLflow App, and importing your artifacts. You can run the migration from an EC2 instance, your personal computer, or a SageMaker notebook; whichever environment you select must have connectivity to both your source and target tracking servers. MLflow Export Import supports exports from both self-managed tracking servers and Amazon SageMaker managed MLflow tracking servers (from MLflow v2.16 onwards) to Amazon SageMaker Serverless MLflow.
Figure 1: Migration process with MLflow Export Import tool
To follow along with this post, make sure you have the following prerequisites:
Before starting the migration, keep in mind that some MLflow features might not be supported by the migration process. The MLflow Export Import tool supports different objects depending on your MLflow version. To prepare for a successful migration:
To prepare your target environment, you first need to create a new SageMaker Serverless MLflow App.
Figure 2: MLflow App in SageMaker Studio Console
Alternatively, you can view it by executing the following AWS Command Line Interface (CLI) command:
Figure 3: MLflow user interface, landing page
To prepare your execution environment for the migration, you need to establish connectivity to your existing MLflow servers (see prerequisites) and install and configure the necessary MLflow packages and plugins.
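Before installing anything, a quick way to confirm that your chosen environment can reach the source tracking server is sketched below; the URL is a placeholder for your own self-managed server, and the /health endpoint is exposed by a standard MLflow tracking server.

```bash
# Placeholder URL - replace with the address of your self-managed tracking server
export SOURCE_MLFLOW_URI="http://your-mlflow-server:5000"

# A standard MLflow tracking server responds to /health with "OK" when reachable
curl "${SOURCE_MLFLOW_URI}/health"
```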
Before you can export your MLflow resources, you need to install the MLflow Export Import tool.
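A minimal installation sketch, assuming a Python environment with pip available: mlflow-export-import provides the export and import command line tools, and sagemaker-mlflow is the plugin that lets the MLflow client authenticate against SageMaker using the tracking server ARN.

```bash
# Install MLflow, the MLflow Export Import tool, and the SageMaker MLflow plugin
pip install mlflow mlflow-export-import sagemaker-mlflow

# Alternatively, install the latest MLflow Export Import code directly from GitHub
# pip install git+https://github.com/mlflow/mlflow-export-import
```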
The Collection tools (export-all and import-all) allow you to create a copy of your tracking server with its experiments and related artifacts. This approach maintains the referential integrity between objects. If you want to migrate only selected experiments or change the name of existing experiments, you can use the Single tools instead. Review the MLflow Export Import documentation for more information on supported objects and limitations. Now that your environment is configured, you can begin the migration process by exporting your MLflow resources from your source environment.
Figure 4: Experiments stored in MLflow
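The following is a sketch of the export step, assuming your source server is reachable at a placeholder URL and that you export everything into a local directory named mlflow-export; adjust the URI and directory for your environment.

```bash
# Point the MLflow client at the source (self-managed) tracking server
export MLFLOW_TRACKING_URI="http://your-mlflow-server:5000"

# Export all experiments, runs, registered models, and related objects
# into an intermediate output directory
export-all --output-dir mlflow-export
```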
The export writes your MLflow objects into the output directory you specify (for example, mlflow-export). During import, user-defined attributes are retained, but system-generated tags (for example, creation_date) are not preserved by MLflow Export Import. To preserve the original system attributes, use the --import-source-tags option as shown in the import example in the next step; this saves them as tags with the mlflow_exim prefix. For more information, see MLflow Export Import – Governance and Lineage. Be aware of the additional limitations detailed under Import Limitations.
The following procedure transfers your exported MLflow resources into your new MLflow App. Start the import by configuring the tracking URI for your MLflow App. You can use the ARN that you saved in Step 1 for this. The previously installed SageMaker MLflow plugin automatically translates the ARN into a valid URI and creates authenticated requests to AWS (remember to configure your AWS credentials as environment variables so the plugin can pick them up).
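A hedged sketch of the import step follows. The ARN and credential values are placeholders, and --import-source-tags is optional; check import-all --help in your tool version for the exact option syntax.

```bash
# Provide AWS credentials as environment variables so the SageMaker MLflow
# plugin can authenticate requests to the MLflow App
export AWS_ACCESS_KEY_ID="<your-access-key-id>"
export AWS_SECRET_ACCESS_KEY="<your-secret-access-key>"
export AWS_DEFAULT_REGION="<your-region>"

# The SageMaker MLflow plugin accepts the MLflow App ARN as the tracking URI
export MLFLOW_TRACKING_URI="<your-mlflow-app-arn>"

# Import everything exported in the previous step; --import-source-tags True
# keeps the original system attributes as tags with the mlflow_exim prefix
import-all --input-dir mlflow-export --import-source-tags True
```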
To confirm your migration was successful, verify that your MLflow resources were transferred correctly:
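One simple check, sketched below, is to list the experiments on the new MLflow App and compare the result with the same listing from the source server; the ARN and URL are placeholders.

```bash
# Target: the new MLflow App (use the ARN you saved in Step 1)
export MLFLOW_TRACKING_URI="<your-mlflow-app-arn>"
mlflow experiments search --view all

# Source: repeat against the self-managed server and compare the two lists
export MLFLOW_TRACKING_URI="http://your-mlflow-server:5000"
mlflow experiments search --view all
```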
Figure 5: MLflow user interface, landing page after migration
When planning your MLflow migration, verify that your execution environment (whether an EC2 instance, a local machine, or a SageMaker notebook) has sufficient storage and compute resources to handle your source tracking server's data volume. While the migration can run in various environments, performance may vary based on network connectivity and available resources. For large-scale migrations, consider breaking the process into smaller batches (for example, individual experiments), as sketched below.
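As a rough sketch of batching with the Single tools, you can export and import one experiment at a time; the experiment name my-experiment and the URIs below are placeholders.

```bash
# Export a single experiment from the source server
export MLFLOW_TRACKING_URI="http://your-mlflow-server:5000"
export-experiment --experiment my-experiment --output-dir mlflow-export/my-experiment

# Import it into the MLflow App, optionally under a new destination name
export MLFLOW_TRACKING_URI="<your-mlflow-app-arn>"
import-experiment --experiment-name my-experiment --input-dir mlflow-export/my-experiment
```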
A SageMaker managed MLflow tracking server incurs costs until you delete or stop it. Billing for tracking servers is based on how long the servers have been running, the size selected, and the amount of data logged to them. You can stop tracking servers when they're not in use to save costs, or you can delete them using the API or the SageMaker Studio UI. For more details on pricing, refer to Amazon SageMaker pricing.
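For an instance-based, SageMaker managed tracking server, stopping or deleting it from the AWS CLI looks roughly like the following; the server name is a placeholder.

```bash
# Stop a managed MLflow tracking server while it is not in use
aws sagemaker stop-mlflow-tracking-server --tracking-server-name <your-tracking-server-name>

# Or delete it entirely once the migration has been verified
aws sagemaker delete-mlflow-tracking-server --tracking-server-name <your-tracking-server-name>
```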
In this post, we demonstrated how to migrate a self-managed MLflow tracking server to SageMaker with MLflow using the open source MLflow Export Import tool. Migrating to a serverless MLflow App on Amazon SageMaker AI reduces the operational overhead of maintaining MLflow infrastructure while providing seamless integration with the comprehensive AI/ML services in SageMaker AI.
To get started with your own migration, follow the preceding step-by-step guide and consult the referenced documentation for additional details. You can find code samples and examples in our AWS Samples GitHub repository. For more information about Amazon SageMaker AI capabilities and other MLOps features, visit the Amazon SageMaker AI documentation.