Managing your cloud ecosystems: Maintaining workload continuity during worker node upgrades

Planning and managing your cloud ecosystem and environments is critical for reducing production downtime and maintaining a functioning workload. In the “Managing your cloud ecosystems” blog series, we cover different strategies for ensuring that your setup functions smoothly with minimal downtime.

To start things off, the first topic in this blog series is ensuring workload continuity during worker node upgrades.

What are worker node upgrades?

Worker node upgrades apply important security updates and patches and should be completed regularly. For more information on types of worker node upgrades, see Updating VPC worker nodes and Updating Classic worker nodes in the IBM Cloud Kubernetes Service documentation.

During an upgrade, some of your worker nodes may become unavailable. It’s important to make sure your cluster has enough capacity to continue running your workload throughout the upgrade process. Building a pipeline to update your worker nodes without causing application downtime will allow you to easily apply worker node upgrades regularly.

For classic worker nodes

Create a Kubernetes configmap that defines the maximum number of worker nodes that can be unavailable at a time, including during an upgrade. The maximum value is specified as a percentage. You can also use labels to apply different rules to different worker nodes. For complete instructions, see Updating Classic worker nodes in the CLI with a configmap in the Kubernetes service documentation. If you choose not to create a configmap, the default is that up to 20% of your worker nodes can be unavailable at a time.
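To get a feel for what a percentage cap means for a given cluster, the arithmetic can be sketched in a few lines of Python. This is only a planning aid, not part of the service: the function names are hypothetical, and rounding up is a conservative assumption made here, not documented service behavior. The 20% default comes from the paragraph above.

```python
def worst_case_unavailable(total_nodes: int, max_unavailable_pct: int = 20) -> int:
    """Estimate how many worker nodes may be unavailable at once.

    Assumption: fractional results are rounded UP so the estimate is
    conservative; the service itself may round differently.
    """
    if total_nodes < 0:
        raise ValueError("total_nodes must be non-negative")
    if not 0 <= max_unavailable_pct <= 100:
        raise ValueError("max_unavailable_pct must be between 0 and 100")
    # Integer ceiling of total_nodes * pct / 100, avoiding float rounding.
    return (total_nodes * max_unavailable_pct + 99) // 100


def guaranteed_available(total_nodes: int, max_unavailable_pct: int = 20) -> int:
    """Worker nodes that stay available under the worst-case estimate."""
    return total_nodes - worst_case_unavailable(total_nodes, max_unavailable_pct)


if __name__ == "__main__":
    # A 10-node cluster with the default 20% cap.
    print(worst_case_unavailable(10))  # 2
    print(guaranteed_available(10))    # 8
```

If the remaining number of nodes is too low to carry your workload, that is a signal to lower the percentage in the configmap or to add temporary capacity before upgrading.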

If you need your usual number of worker nodes to stay up and running throughout the upgrade, use the ibmcloud ks worker-pool resize command to temporarily add extra worker nodes to your cluster for the duration of the upgrade process. When the upgrade is complete, use the same command to remove the additional worker nodes and return your worker pool to its previous size.
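How many extra worker nodes to add depends on how many can be unavailable at once. As a rough sketch, the following Python function searches for the smallest temporary pool size that still leaves a required number of nodes available under a percentage cap. The function name is hypothetical, and the rounding-up rule is a conservative assumption rather than documented behavior; treat the result as a starting point for planning, not as a value the service guarantees.

```python
def temporary_pool_size(required_available: int, max_unavailable_pct: int = 20) -> int:
    """Smallest pool size that keeps `required_available` nodes running.

    Assumption: up to ceil(size * pct / 100) nodes may be unavailable
    at the same time (rounded up to stay conservative).
    """
    if required_available < 0:
        raise ValueError("required_available must be non-negative")
    if not 0 <= max_unavailable_pct < 100:
        raise ValueError("max_unavailable_pct must be between 0 and 99")

    size = required_available
    while True:
        unavailable = (size * max_unavailable_pct + 99) // 100  # integer ceiling
        if size - unavailable >= required_available:
            return size
        size += 1


if __name__ == "__main__":
    needed = 10
    target = temporary_pool_size(needed)
    print(target)           # 13
    print(target - needed)  # 3 extra nodes to add, then remove after the upgrade
```

The difference between the returned size and your normal size is the number of worker nodes to add before the upgrade and remove afterward.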

For VPC worker nodes

VPC worker nodes are replaced by removing the old worker node and provisioning a new worker node that runs at the new version. You can upgrade one or more worker nodes at the same time, but if you upgrade multiple at once, they become unavailable at the same time. To make sure you have enough capacity to run your workload during the upgrade, you can plan to either resize your worker pools to temporarily add extra worker nodes (similar to the process described for classic worker nodes) or plan to upgrade your worker nodes one by one.
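The trade-off between the two options can be sketched as a small planning helper in Python. Given a list of worker node IDs and the minimum number that must stay available, it groups the nodes into replacement batches, and it reports when there is no spare capacity so that a temporary resize is needed first. The function name and the worker IDs are hypothetical, and the sketch assumes every node in a batch is unavailable until its replacement is ready.

```python
from typing import List


def plan_replacements(worker_ids: List[str], min_available: int) -> List[List[str]]:
    """Group worker nodes into batches that can be replaced together.

    Assumption: all nodes in a batch are unavailable at the same time,
    so a batch can be no larger than the pool's spare capacity.
    """
    batch_size = len(worker_ids) - min_available
    if batch_size < 1:
        raise ValueError(
            "no spare capacity: add temporary worker nodes before upgrading"
        )
    return [
        worker_ids[start:start + batch_size]
        for start in range(0, len(worker_ids), batch_size)
    ]


if __name__ == "__main__":
    workers = ["w1", "w2", "w3", "w4"]  # placeholder IDs
    # Keeping 3 of 4 nodes available means upgrading one by one.
    print(plan_replacements(workers, min_available=3))
    # [['w1'], ['w2'], ['w3'], ['w4']]
```

With more spare capacity, for example after a temporary resize, the same helper returns larger batches and the upgrade takes fewer rounds.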

Wrap up

Whether you choose to implement a configmap, resize your worker pool or upgrade worker nodes one by one, creating a workload continuity plan before you upgrade your worker nodes can help you create a more streamlined, efficient setup with limited downtime.

Now that you have a plan to prevent disruptions during worker node upgrades, keep an eye out for the next blog in our series, which will discuss how, when and why to implement major, minor or patch upgrades to your clusters and worker nodes.

Learn more about IBM Cloud Kubernetes Service clusters

The post Managing your cloud ecosystems: Maintaining workload continuity during worker node upgrades appeared first on IBM Blog.
