Managing your cloud ecosystems: Maintaining workload continuity during worker node upgrades

Planning and managing your cloud ecosystem and environments is critical for reducing production downtime and maintaining a functioning workload. In the “Managing your cloud ecosystems” blog series, we cover different strategies for ensuring that your setup functions smoothly with minimal downtime.

The first topic in this series is ensuring workload continuity during worker node upgrades.

What are worker node upgrades?

Worker node upgrades apply important security updates and patches and should be completed regularly. For more information on types of worker node upgrades, see Updating VPC worker nodes and Updating Classic worker nodes in the IBM Cloud Kubernetes Service documentation.

During an upgrade, some of your worker nodes may become unavailable. It’s important to make sure your cluster has enough capacity to continue running your workload throughout the upgrade process. Building a pipeline to update your worker nodes without causing application downtime will allow you to easily apply worker node upgrades regularly.

For classic worker nodes

Create a Kubernetes configmap that defines the maximum number of worker nodes that can be unavailable at a time, including during an upgrade. The maximum value is specified as a percentage. You can also use labels to apply different rules to different worker nodes. For complete instructions, see Updating Classic worker nodes in the CLI with a configmap in the Kubernetes service documentation. If you choose not to create a configmap, at most 20% of your worker nodes can be unavailable at a time by default.
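
To get a feel for what a percentage limit means for a cluster of a given size, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and rounding fractional results down is an assumption, so check the service documentation for how the percentage is actually applied.

```python
DEFAULT_MAX_UNAVAILABLE_PCT = 20  # default that applies when no configmap is created


def max_unavailable_workers(total_workers: int,
                            max_unavailable_pct: int = DEFAULT_MAX_UNAVAILABLE_PCT) -> int:
    """Return how many worker nodes may be unavailable at the same time.

    Assumption: a fractional result is rounded down. The service itself
    may round differently.
    """
    if total_workers < 0:
        raise ValueError("total_workers must not be negative")
    if not 0 <= max_unavailable_pct <= 100:
        raise ValueError("max_unavailable_pct must be between 0 and 100")
    return (total_workers * max_unavailable_pct) // 100


def guaranteed_available_workers(total_workers: int,
                                 max_unavailable_pct: int = DEFAULT_MAX_UNAVAILABLE_PCT) -> int:
    """Return how many worker nodes stay available under the limit."""
    return total_workers - max_unavailable_workers(total_workers, max_unavailable_pct)


if __name__ == "__main__":
    for size in (5, 10, 15):
        print(size, "workers ->", guaranteed_available_workers(size), "stay available")
```

Comparing the result with the number of worker nodes your workload needs at peak is a quick way to judge whether the default is acceptable or whether a stricter percentage is worth configuring.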

If you need your full number of worker nodes to stay up and running during the upgrade, use the ibmcloud ks worker-pool resize command to temporarily add extra worker nodes to your cluster. When the upgrade is complete, use the same command to remove the additional worker nodes and return your worker pool to its previous size.
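
The resize-before, resize-after pattern could be scripted along these lines. This is a minimal sketch, not an official tool: the helper names are hypothetical, and the --cluster, --worker-pool and --size-per-zone flags reflect the CLI as I understand it, so confirm them with ibmcloud ks worker-pool resize --help before relying on them. The sketch only builds the commands; it does not run them.

```python
from typing import List, Tuple


def resize_command(cluster: str, worker_pool: str, size_per_zone: int) -> List[str]:
    """Build the argument list for a worker pool resize.

    Assumption: the flag names below match your installed CLI version.
    """
    if size_per_zone < 0:
        raise ValueError("size_per_zone must not be negative")
    return [
        "ibmcloud", "ks", "worker-pool", "resize",
        "--cluster", cluster,
        "--worker-pool", worker_pool,
        "--size-per-zone", str(size_per_zone),
    ]


def upgrade_resize_plan(cluster: str, worker_pool: str,
                        current_size_per_zone: int,
                        extra_per_zone: int) -> Tuple[List[str], List[str]]:
    """Return (scale_up, scale_down) commands that bracket an upgrade."""
    if extra_per_zone < 0:
        raise ValueError("extra_per_zone must not be negative")
    scale_up = resize_command(cluster, worker_pool,
                              current_size_per_zone + extra_per_zone)
    scale_down = resize_command(cluster, worker_pool, current_size_per_zone)
    return scale_up, scale_down


if __name__ == "__main__":
    # "my-cluster" and "default" are placeholder names.
    up, down = upgrade_resize_plan("my-cluster", "default", 3, 1)
    print("before upgrade:", " ".join(up))
    print("after upgrade: ", " ".join(down))
```

Once reviewed, each argument list could be passed to subprocess.run, with the upgrade itself running between the two calls.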

For VPC worker nodes

VPC worker nodes are replaced by removing the old worker node and provisioning a new worker node that runs at the new version. You can upgrade one or more worker nodes at the same time, but if you upgrade multiple at once, they become unavailable at the same time. To make sure you have enough capacity to run your workload during the upgrade, you can either resize your worker pools to temporarily add extra worker nodes (similar to the process described for classic worker nodes) or upgrade your worker nodes one by one.
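
The trade-off between upgrading several worker nodes at once and upgrading them one by one can be sketched as follows. The function names and worker IDs are hypothetical, and the capacity calculation assumes that any temporarily added worker nodes stay up while a batch is being replaced.

```python
from typing import Iterator, List, Sequence


def upgrade_batches(worker_ids: Sequence[str], batch_size: int = 1) -> Iterator[List[str]]:
    """Yield worker IDs in batches; a batch_size of 1 means one by one."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(worker_ids), batch_size):
        yield list(worker_ids[start:start + batch_size])


def min_available_during_upgrade(total_workers: int, batch_size: int = 1,
                                 extra_workers: int = 0) -> int:
    """Return the lowest number of workers available while a batch is replaced.

    Assumption: every worker in the current batch is unavailable at the
    same time, and the extra workers remain available throughout.
    """
    unavailable = min(batch_size, total_workers)
    return total_workers + extra_workers - unavailable


if __name__ == "__main__":
    workers = ["worker-1", "worker-2", "worker-3", "worker-4", "worker-5"]
    for batch in upgrade_batches(workers, batch_size=2):
        print("upgrade together:", batch)
    print("minimum available:", min_available_during_upgrade(len(workers), batch_size=2))
```

A batch size of 1 keeps the most capacity online but takes the longest. Larger batches finish sooner and may need extra worker nodes to cover the gap.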

Wrap up

Whether you choose to implement a configmap, resize your worker pool or upgrade your worker nodes one by one, creating a workload continuity plan before you upgrade can help you build a more streamlined, efficient setup with limited downtime.

Now that you have a plan to prevent disruptions during worker node upgrades, keep an eye out for the next blog in our series, which will discuss how, when and why to implement major, minor or patch upgrades to your clusters and worker nodes.

Learn more about IBM Cloud Kubernetes Service clusters

The post Managing your cloud ecosystems: Maintaining workload continuity during worker node upgrades appeared first on IBM Blog.
