Artificial Intelligence (AI) and large language models (LLMs) are experiencing explosive growth, powering applications from machine translation to artistic creation. These technologies rely on intensive computations that require specialized hardware resources, like GPUs. But access to GPUs can be challenging, both in terms of availability and cost.
For Google Cloud users, the introduction of Dynamic Workload Scheduler (DWS) transformed how you can access and use GPU resources, particularly within a Google Kubernetes Engine (GKE) cluster. Dynamic Workload Scheduler optimizes AI/ML resource access and spending by simultaneously scheduling necessary accelerators like TPUs and GPUs across various Google Cloud services, improving the performance of training and fine-tuning jobs.
Further, Dynamic Workload Scheduler offers a straightforward integration between GKE and Kueue, a Kubernetes-native job queueing system, making it easier to access GPUs as quickly as possible in a given region, for a given GKE cluster.
But what if you want to deploy your workload in any available region, as soon as Dynamic Workload Scheduler can provide the resources your workload needs?
This is where MultiKueue, a Kueue feature, comes into play. With MultiKueue, GKE, and Dynamic Workload Scheduler, you can wait for accelerators in multiple regions; Dynamic Workload Scheduler automatically provisions resources in the best GKE cluster as soon as they are available. By submitting workloads to a global queue, MultiKueue executes them in the region with available GPU resources, helping to optimize global resource usage, lower costs, and speed up processing.
MultiKueue
MultiKueue enables workload distribution across multiple GKE clusters in different regions. By identifying clusters with available resources, MultiKueue simplifies the process of dispatching jobs to the optimal location.
Dynamic Workload Scheduler is supported on GKE Autopilot, our managed Kubernetes service that automatically handles the provisioning, scaling, security, and maintenance of your container infrastructure, starting with version 1.30.3. Let’s take a deeper look at how to set up and manage MultiKueue with Dynamic Workload Scheduler, so you can obtain GPU resources faster.
MultiKueue cluster roles
MultiKueue provides two distinct cluster roles:
Manager cluster – Establishes and maintains the connection with the worker clusters, and creates and monitors remote objects (workloads or jobs) while keeping the local ones in sync.
Worker cluster – A simple standalone Kueue cluster that executes the jobs submitted by the manager cluster.
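Under the hood, the manager talks to each worker through a kubeconfig stored in a Kubernetes Secret, referenced by a MultiKueueCluster object on the manager. Here's a minimal sketch of that wiring; the Secret name is illustrative, and the deployment script used later in this post creates these objects for you:

```yaml
# Lives on the manager cluster: one MultiKueueCluster per worker.
# The referenced Secret holds the worker's kubeconfig, which the
# manager uses to create and watch remote Jobs. Names are illustrative.
apiVersion: kueue.x-k8s.io/v1alpha1
kind: MultiKueueCluster
metadata:
  name: multikueue-dws-worker-eu
spec:
  kubeConfig:
    locationType: Secret
    location: worker-eu-kubeconfig   # Secret in the kueue-system namespace
```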
Creating a MultiKueue cluster
In this example we create four GKE Autopilot clusters:
One manager cluster in europe-west4
Three worker clusters in
europe-west4
us-east4
asia-southeast1
Let’s take a look at how this works in the following step-by-step example. You can access the files for this example in this GitHub repository.
1. Clone the GitHub repository
```bash
git clone https://github.com/GoogleCloudPlatform/ai-on-gke.git
cd ai-on-gke/tutorials-and-examples/workflow-orchestration/dws-multiclusters-example
```
2. Create GKE clusters
```bash
terraform -chdir=tf init
terraform -chdir=tf plan
terraform -chdir=tf apply -var project_id=<YOUR_PROJECT_ID>
```
This Terraform script creates the required GKE clusters and adds four entries to your kubeconfig file:
manager-europe-west4
worker-us-east4
worker-europe-west4
worker-asia-southeast1
Then you can switch between contexts easily with:
```bash
kubectl config use-context <context name>
```
3. Install and configure MultiKueue
```bash
./deploy-multikueue.sh
```
This script:
Installs Kueue in all four clusters
Enables and configures MultiKueue in the manager cluster
Creates a PodMonitoring resource in each cluster so that Kueue metrics are sent to Google Cloud Managed Service for Prometheus (see the sketch after this list)
Configures the connection between the manager cluster and the worker clusters
Configures Kueue in the worker clusters
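For reference, here is a minimal sketch of such a PodMonitoring resource. The selector labels and metrics port are assumptions based on a default Kueue installation; check the repository's manifests for the exact values:

```yaml
# Scrapes the Kueue controller's metrics endpoint so that Google Cloud
# Managed Service for Prometheus collects them. Labels and port are
# assumptions for a default Kueue install.
apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  name: kueue-metrics
  namespace: kueue-system
spec:
  selector:
    matchLabels:
      control-plane: controller-manager
  endpoints:
  - port: 8080
    interval: 30s
```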
GKE clusters, Kueue with MultiKueue, and DWS are now configured and ready to use. Once you submit your jobs, the Kueue manager distributes them across the three worker clusters.
In the dws-multi-worker.yaml file, you’ll find the Kueue configuration for the worker clusters, including the manager configuration.
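On the worker side, the key pattern is a ProvisioningRequest-based AdmissionCheck that asks GKE for capacity through Dynamic Workload Scheduler's queued-provisioning class. Here is a minimal sketch of that pattern, with illustrative object names (see dws-multi-worker.yaml for the actual configuration):

```yaml
# Worker-cluster pattern: Kueue requests nodes through a GKE
# ProvisioningRequest using the DWS queued-provisioning class.
# Object names are illustrative.
apiVersion: kueue.x-k8s.io/v1beta1
kind: AdmissionCheck
metadata:
  name: dws-prov
spec:
  controllerName: kueue.x-k8s.io/provisioning-request
  parameters:
    apiGroup: kueue.x-k8s.io
    kind: ProvisioningRequestConfig
    name: dws-config
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: ProvisioningRequestConfig
metadata:
  name: dws-config
spec:
  provisioningClassName: queued-provisioning.gke.io
  managedResources:
  - nvidia.com/gpu
```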
On the manager side, the following manifest provides a basic example of how to set up the MultiKueue AdmissionCheck with three worker clusters.
```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: AdmissionCheck
metadata:
  name: sample-dws-multikueue
spec:
  controllerName: kueue.x-k8s.io/multikueue
  parameters:
    apiGroup: kueue.x-k8s.io
    kind: MultiKueueConfig
    name: multikueue-dws
---
apiVersion: kueue.x-k8s.io/v1alpha1
kind: MultiKueueConfig
metadata:
  name: multikueue-dws
spec:
  clusters:
  - multikueue-dws-worker-asia
  - multikueue-dws-worker-us
  - multikueue-dws-worker-eu
```
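For the AdmissionCheck to take effect, a ClusterQueue on the manager must reference it, and jobs are submitted through a LocalQueue. The sketch below shows that wiring with illustrative queue names and quotas; the repository's manifests define the actual ones:

```yaml
# Manager-cluster queues: every workload admitted by this ClusterQueue
# goes through the MultiKueue AdmissionCheck above and is dispatched to
# a worker cluster. Names and quotas are illustrative.
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: default-flavor
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: dws-cluster-queue
spec:
  namespaceSelector: {}   # admit workloads from all namespaces
  resourceGroups:
  - coveredResources: ["cpu", "memory", "nvidia.com/gpu"]
    flavors:
    - name: default-flavor
      resources:
      - name: "cpu"
        nominalQuota: 10000
      - name: "memory"
        nominalQuota: 10000Gi
      - name: "nvidia.com/gpu"
        nominalQuota: 10000
  admissionChecks:
  - sample-dws-multikueue
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  namespace: default
  name: dws-local-queue
spec:
  clusterQueue: dws-cluster-queue
```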
4. Submit jobs
Ensure you’re using the manager kubecontext when submitting jobs.
```bash
kubectl config use-context manager-europe-west4
kubectl create -f job-multi-dws-autopilot.yaml
```
To observe how the MultiKueue admission check distributes jobs among worker clusters, you can submit the job creation request multiple times.
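For reference, a job manifest such as job-multi-dws-autopilot.yaml follows this general shape: the kueue.x-k8s.io/queue-name label routes the job to a LocalQueue, and suspend: true hands control of when (and, via MultiKueue, where) it starts over to Kueue. The queue name, image, and accelerator type below are illustrative assumptions, not the repository's exact values:

```yaml
# Illustrative GPU job: the queue-name label submits it to Kueue, and
# 'suspend: true' lets Kueue resume it once DWS provisions capacity.
apiVersion: batch/v1
kind: Job
metadata:
  generateName: sample-dws-job-
  namespace: default
  labels:
    kueue.x-k8s.io/queue-name: dws-local-queue   # assumed LocalQueue name
spec:
  suspend: true
  parallelism: 1
  completions: 1
  template:
    spec:
      nodeSelector:
        cloud.google.com/gke-accelerator: nvidia-l4   # assumed GPU type
      containers:
      - name: gpu-job
        image: nvidia/cuda:12.4.1-base-ubuntu22.04
        command: ["nvidia-smi"]
        resources:
          limits:
            nvidia.com/gpu: 1
      restartPolicy: Never
```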
5. Get job status
To check a job’s status and determine the region where it was scheduled, execute the following command:
```bash
kubectl get workloads.kueue.x-k8s.io -o jsonpath='{range .items[*]}{.status.admissionChecks}{"\n"}{end}'
```
The message field of each admission check entry indicates which worker cluster admitted the workload, and therefore which region picked up the job.
6. Delete resources
Finally, be sure to delete the four GKE clusters you created to try out this functionality:
```bash
terraform -chdir=tf destroy -var project_id=<YOUR_PROJECT_ID>
```
What’s next
So that’s how you can leverage MultiKueue, GKE, and DWS to streamline global job execution, speed up access to scarce GPU capacity, and eliminate the need for manual node management!
This setup also addresses the needs of those with data residency requirements, allowing you to dedicate subsets of clusters for different workloads and ensure compliance.
To further enhance your setup, you can leverage advanced Kueue features like team management with local queues or workload priority classes. Additionally, you can gain valuable insights by creating a Grafana or Cloud Monitoring dashboard that utilizes Kueue metrics, which are automatically collected by Google Cloud Managed Service for Prometheus via the PodMonitoring resources.
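As one example, a workload priority class is a small standalone object that jobs reference with the kueue.x-k8s.io/priority-class label. A minimal sketch, with an illustrative name and value:

```yaml
# Workloads labeled kueue.x-k8s.io/priority-class: high-priority are
# queued ahead of lower-priority ones. Name and value are illustrative.
apiVersion: kueue.x-k8s.io/v1beta1
kind: WorkloadPriorityClass
metadata:
  name: high-priority
value: 1000
description: "For latency-sensitive training jobs"
```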