
Priority-based scheduling between node pools


Google Kubernetes Engine (GKE) is a managed Kubernetes service used by many organizations on Google Cloud. As costs rise, customers are shifting their focus to cloud cost optimization and looking for more efficient, cost-effective ways to run their workloads.

Among the many ways to optimize cost on GKE, running workloads on low-cost Spot VMs is a cost-effective option. Spot VMs are spare Compute Engine capacity offered at a significant discount; in exchange, they can be reclaimed at any time. Node pools let you spread workloads across multiple machine types, including Spot VMs, to help reduce costs.

In this blog we’ll discuss how to use four node pools (E2 standard, N2 standard, N2D standard, and N2 standard on Spot VMs) to deploy workloads and help reduce the cost of running a production workload.

Adding node pools

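Create a standard GKE cluster with these four node pools. The deployments below select nodes by a nodetype label with the values pool1 through pool4, so label each node pool accordingly. You can also swap in GPU or other machine shapes as required.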
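As a rough sketch of this step, the script below prints one gcloud command per node pool so they can be reviewed before being run. The cluster name, zone, machine sizes, and node counts are assumptions made for illustration, as is the mapping of pool1 through pool4 to E2, N2, N2D, and N2 Spot. The nodetype label matches the key used by the deployment manifests in this post, and GKE adds the cloud.google.com/gke-spot label to Spot nodes on its own.

```shell
# Sketch only: CLUSTER, ZONE, machine sizes, and node counts are hypothetical.
CLUSTER="${CLUSTER:-demo-cluster}"
ZONE="${ZONE:-us-central1-a}"

# Print (not run) the gcloud command for one node pool.
# Extra arguments such as --spot are appended as-is.
pool_cmd() {
  name="$1"
  machine="$2"
  shift 2
  echo gcloud container node-pools create "$name" \
    --cluster="$CLUSTER" --zone="$ZONE" \
    --machine-type="$machine" --num-nodes=2 \
    --node-labels="nodetype=$name" "$@"
}

pool_cmd pool1 e2-standard-4
pool_cmd pool2 n2-standard-4
pool_cmd pool3 n2d-standard-4
pool_cmd pool4 n2-standard-4 --spot   # Spot VM pool
```

Once the printed commands look right for your project, pipe the script's output to a shell to execute them.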

Define priority class

Pods can be assigned a priority class, which indicates how important they are relative to other pods. If a pod cannot be scheduled, the scheduler tries to preempt (evict) lower-priority pods to make room for it.

Here, priority classes let higher-priority pods evict lower-priority pods on the node pools they share.

Our example defines two classes: low-priority with a value of 10000 and high-priority with a value of 1000000.

priorityclass.yaml

    apiVersion: scheduling.k8s.io/v1
    kind: PriorityClass
    metadata:
      name: low-priority
    value: 10000
    description: "Low priority workloads"
    ---
    apiVersion: scheduling.k8s.io/v1
    kind: PriorityClass
    metadata:
      name: high-priority
    value: 1000000
    preemptionPolicy: PreemptLowerPriority

Node affinity and priority class for deployments

Node affinity constrains which nodes a Pod can be scheduled on, based on node labels. There are two types of node affinity:

requiredDuringSchedulingIgnoredDuringExecution: Unless the rule is met the scheduler can’t schedule the Pod.
preferredDuringSchedulingIgnoredDuringExecution: Scheduler tries to find a node that meets this rule. If a matching node is not available, the scheduler still schedules the Pod.

Create two separate deployments. Here deployment-1 and deployment-2 run on different but overlapping sets of node pools.

Deployment files required for setup

deployment-app1.yaml

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-deployment-app1
      labels:
        app: nginx
    spec:
      replicas: 500
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: nodetype
                    operator: In
                    values:
                    - pool1
                    - pool2
                    - pool4
              preferredDuringSchedulingIgnoredDuringExecution:
              - weight: 80
                preference:
                  matchExpressions:
                  - key: cloud.google.com/gke-spot
                    operator: DoesNotExist
          containers:
          - name: nginx
            image: nginx:1.14.2
            ports:
            - containerPort: 80
          priorityClassName: low-priority

The above deployment schedules pods only on node pools 1, 2, and 4, and prefers (with a weight of 80) nodes that do not carry the cloud.google.com/gke-spot label, that is, non-Spot nodes. Note that deployment 1 uses the low-priority class.
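To make the two affinity rules in deployment-app1.yaml concrete, here is a simplified model of how they treat a set of nodes: the required rule filters nodes out entirely, while the preferred rule only adds to a node's score. The node names and inventory are made up for illustration, and the real scheduler combines this affinity score with its other scoring criteria, which is why Spot nodes can still receive pods.

```shell
# Hypothetical node inventory: name, nodetype label, gke-spot label present (yes/no).
nodes='e2-a pool1 no
n2-a pool2 no
n2d-a pool3 no
spot-a pool4 yes'

# Simplified model of the affinity rules, not the actual scheduler algorithm.
score_nodes() {
  awk '
    # Required rule: nodetype In (pool1, pool2, pool4); other nodes are dropped.
    $2 != "pool1" && $2 != "pool2" && $2 != "pool4" { next }
    # Preferred rule: add weight 80 when the gke-spot label does not exist.
    { score = ($3 == "no") ? 80 : 0; print $1, score }
  '
}

printf '%s\n' "$nodes" | score_nodes
```

In this model the pool3 node never appears, and the Spot node stays eligible but scores lower than the non-Spot nodes.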

After deployment of deployment-app1.yaml

    $ for i in $(kubectl get nodes | awk 'NR!=1 {print $1}'); do
        echo "HostName : $i"
        kubectl get pods -A --field-selector spec.nodeName=$i \
          | egrep -v "kube-system|gmp-system|NAMESPACE|NAME" | grep Running \
          | awk '{print $2}' | awk -F "-" '{print $3}' | sort -n | uniq -c
      done

    HostName : gke-test-multiple-n-pool-n2d-30d0116c-6loy
    HostName : gke-test-multiple-n-pool-n2d-30d0116c-pv1t
    HostName : gke-test-multiple-n-pool1-e2-4a68c136-q07a
        102 app1
    HostName : gke-test-multiple-n-pool1-e2-4a68c136-uyte
        102 app1
    HostName : gke-test-multiple-no-pool-n2-301254db-k2mg
        100 app1
    HostName : gke-test-multiple-no-pool-n2-301254db-mhd9
        101 app1
    HostName : gke-test-multiple-nod-pool-4-54dfb5af-4k28
         48 app1
    HostName : gke-test-multiple-nod-pool-4-54dfb5af-q0xj
         47 app1
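The per-node loop above issues one kubectl call per node. As an alternative sketch, the same counts can be produced from a single listing by grouping in awk. The count_by_node helper and the sample pod names are hypothetical; the kubectl invocation is shown only as a comment because it needs a live cluster.

```shell
# Reads "NAMESPACE NODE POD" lines and prints "count node app", where app is the
# third dash-separated field of the pod name (nginx-deployment-app1-... -> app1).
count_by_node() {
  awk '
    # Skip system namespaces, as the loop above does.
    $1 == "kube-system" || $1 == "gmp-system" { next }
    { split($3, parts, "-"); counts[$2 " " parts[3]]++ }
    END { for (k in counts) print counts[k], k }
  ' | sort -k2,2 -k3,3
}

# Against a live cluster (not run here):
#   kubectl get pods -A --field-selector=status.phase=Running --no-headers \
#     -o custom-columns=NS:.metadata.namespace,NODE:.spec.nodeName,POD:.metadata.name \
#     | count_by_node

# Made-up sample input to show the output shape.
printf '%s\n' \
  'default node-a nginx-deployment-app1-abc-1' \
  'default node-a nginx-deployment-app1-abc-2' \
  'kube-system node-a kube-proxy-xyz' \
  'default node-b nginx-deployment-app2-def-1' | count_by_node
```

Running the same helper before and after the second deployment gives two listings that are easy to diff.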

deployment-app2.yaml

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-deployment-app2
      labels:
        app: nginx
    spec:
      replicas: 600
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: nodetype
                    operator: In
                    values:
                    - pool1
                    - pool3
                    - pool4
              preferredDuringSchedulingIgnoredDuringExecution:
              - weight: 80
                preference:
                  matchExpressions:
                  - key: cloud.google.com/gke-spot
                    operator: DoesNotExist
          containers:
          - name: nginx
            image: nginx:1.14.2
            ports:
            - containerPort: 80
          priorityClassName: high-priority

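Deployment-2 targets node pools 1, 3, and 4, again with a weight-80 preference for non-Spot nodes. Because it uses the high-priority class, the scheduler can preempt low-priority app1 pods on pools 1 and 4, the two pools both deployments share, when app2 pods would otherwise not fit.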

After deployment of deployment-app2.yaml

    $ for i in $(kubectl get nodes | awk 'NR!=1 {print $1}'); do
        echo "HostName : $i"
        kubectl get pods -A --field-selector spec.nodeName=$i \
          | egrep -v "kube-system|gmp-system|NAMESPACE|NAME" | grep Running \
          | awk '{print $2}' | awk -F "-" '{print $3}' | sort -n | uniq -c
      done

    HostName : gke-test-multiple-n-pool-n2d-30d0116c-6loy
    HostName : gke-test-multiple-n-pool-n2d-30d0116c-pv1t
    HostName : gke-test-multiple-n-pool1-e2-4a68c136-q07a
        102 app1
    HostName : gke-test-multiple-n-pool1-e2-4a68c136-uyte
        102 app1
    HostName : gke-test-multiple-no-pool-n2-301254db-k2mg
        100 app1
    HostName : gke-test-multiple-no-pool-n2-301254db-mhd9
        101 app1
    HostName : gke-test-multiple-nod-pool-4-54dfb5af-4k28
         48 app1
    HostName : gke-test-multiple-nod-pool-4-54dfb5af-q0xj
         47 app1

After running the second deployment, compare the per-node pod counts with those from the first run. Pods from the first deployment get evicted from the shared node pools to make room for the higher-priority second deployment. This shows how features like node affinity, priority classes, and node pools can be combined to run complex production workloads and optimize cost with Spot VMs.

You can also create node pools with GPU machine types to run your ML workloads; the example shown here should work the same way with GPU machines. Priority-based scheduling also helps active ML workloads on GKE claim GPU machines ahead of lower-priority work when they need them.

Conclusion

GKE can help you run optimized AI workloads with platform orchestration.

Learn more about the recently launched GKE Editions, which can help organizations with configuration and policy management, fleet-wide networking, identity management, and observability, and which support microservice-based architectures.
