Summary: This wiki page shows how I configure my AKS nodepools and migrate pods between nodepools if needed.
Date: 2 January 2026
I would like to start by explaining what nodepools are, specifically in Azure Kubernetes Service (AKS). Sometimes, however, the official documentation already says it best:
In Azure Kubernetes Service (AKS), nodes of the same configuration are grouped together into node pools. Node pools contain the underlying VMs that run your applications. System node pools and user node pools are two different node pool modes for your AKS clusters. System node pools serve the primary purpose of hosting critical system pods such as CoreDNS and metrics-server. User node pools serve the primary purpose of hosting your application pods.
Pod scheduling in Kubernetes is managed using (among other mechanisms) taints and tolerations. Taints are applied to nodes and allow a node to repel a set of pods unless those pods have a matching toleration. Tolerations are applied to pods and allow (but do not require) the pods to schedule onto nodes with matching taints. On AKS, labels are also an important part of the scheduling process.
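As a minimal illustration of the mechanism (the taint key and names are made up for the example), a node taint and a matching pod toleration look like this:

```yaml
# Node carrying a taint: pods without a matching toleration are repelled
apiVersion: v1
kind: Node
metadata:
  name: example-node
spec:
  taints:
    - key: workload        # made-up taint key, for illustration only
      value: special
      effect: NoSchedule
---
# Pod tolerating that taint: it may (but is not forced to) land on the node above
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  tolerations:
    - key: "workload"
      operator: "Equal"
      value: "special"
      effect: "NoSchedule"
  containers:
    - name: app
      image: nginx:1.27
```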
I usually try to keep it simple, by using these directives for nodepools:
Additionally, I use the following application (pod) scheduling directives:
Note: The number of nodepools should be kept low, because each nodepool will typically have at least one node that is not used to its maximum capacity, adding cost (and complexity).
The setup below shows an example of this approach:
Note that the name of a node pool can only contain lowercase alphanumeric characters and must begin with a lowercase letter. For Linux node pools, the length must be between 1-12 characters. For Windows node pools, the length must be between 1-6 characters.
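To make the labels and taints concrete, this is roughly what abridged node objects from the system nodepool and the npmobileapp nodepool could look like. The node and system nodepool names are just examples, and it assumes the system pool is tainted with CriticalAddonsOnly=true:NoSchedule and npmobileapp with pool=mobile:NoSchedule, matching the tolerations used below:

```yaml
# Abridged node from the system nodepool (names are made up)
apiVersion: v1
kind: Node
metadata:
  name: aks-npsystem-12345678-vmss000000
  labels:
    kubernetes.azure.com/agentpool: npsystem
    kubernetes.azure.com/mode: system
spec:
  taints:
    - key: CriticalAddonsOnly
      value: "true"
      effect: NoSchedule
---
# Abridged node from the npmobileapp nodepool
apiVersion: v1
kind: Node
metadata:
  name: aks-npmobileapp-12345678-vmss000000
  labels:
    kubernetes.azure.com/agentpool: npmobileapp
    kubernetes.azure.com/mode: user
spec:
  taints:
    - key: pool
      value: mobile
      effect: NoSchedule
```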
With the nodepools above, the setup is that system pods (like CoreDNS) get scheduled on the system nodepool, and all other pods get scheduled on the npusrdefault nodepool unless they have a toleration for either the npmobileapp or nprisk nodepool.
So, to make sure a pod gets scheduled on the system nodepool, we set the following nodeAffinity rule:
```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.azure.com/mode
              operator: In
              values:
                - system
```
This rule makes sure the pod only gets scheduled on nodes that have the label `kubernetes.azure.com/mode=system`, which is only true for the system nodepool. But we also need to set a toleration, because the system nodepool has a taint:
```yaml
tolerations:
  - key: "CriticalAddonsOnly"
    operator: Exists
```
Combined, these settings make sure the pod gets scheduled on the system nodepool.
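As an illustration, a minimal Deployment sketch that combines both settings could look like this (the name and image are made up for the example):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-system-addon          # made-up name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-system-addon
  template:
    metadata:
      labels:
        app: example-system-addon
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.azure.com/mode
                    operator: In
                    values:
                      - system
      tolerations:
        - key: "CriticalAddonsOnly"
          operator: Exists
      containers:
        - name: app
          image: example.azurecr.io/system-addon:1.0.0   # made-up image
```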
The user nodepools require the same setup, but obviously with different values. First we need a nodeAffinity rule that makes sure the pod only gets scheduled on user nodepools. Depending on your preference, you can use a 'NotIn' or an 'In' operator:
```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.azure.com/mode
              operator: NotIn
              values:
                - system
```
```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.azure.com/mode
              operator: In
              values:
                - user
```
Either of these rules will work to make sure the pods are only scheduled on user nodepools. However, depending on the nodepool you want the pod to be scheduled on, you also need to set a toleration:
```yaml
tolerations:
  - key: "pool"
    operator: "Equal"
    value: "mobile"
    effect: "NoSchedule"
```
Note: Change the value to `risk` for the nprisk nodepool.
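Putting these pieces together for the npmobileapp nodepool, a pod spec fragment could look like the sketch below; it combines the 'NotIn' affinity with the pool toleration (the image is made up for the example):

```yaml
# Sketch: pod spec fragment for a workload targeting the npmobileapp nodepool
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: kubernetes.azure.com/mode
                operator: NotIn
                values:
                  - system
  tolerations:
    - key: "pool"
      operator: "Equal"
      value: "mobile"
      effect: "NoSchedule"
  containers:
    - name: app
      image: example.azurecr.io/mobile-app:1.0.0   # made-up image
```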
The default user nodepool does not have a taint, so that any pod can always be scheduled somewhere. I prefer this because I favor uptime over control. This is, however, a personal preference, and different use cases might require different setups. Note that this means that even when setting a nodeAffinity to one of the additional user nodepools, the pod can still be scheduled on the default user nodepool, for example when the additional user nodepool is full or not available.
If you need to prevent pods from being scheduled on the default user nodepool, additional affinity rules are required. In AKS, the agentpool label can be used for this purpose:
```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.azure.com/agentpool
              operator: In
              values:
                - npmobileapp
```
This, however, reduces flexibility in case of migrations, upgrades, or new naming conventions. This can be dealt with by adding more values, like this:
```yaml
# Migrating from the npmobileapp1 nodepool to the npmobileapp2 nodepool
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.azure.com/agentpool
              operator: In
              values:
                - npmobileapp1
                - npmobileapp2
```
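For completeness, here is a sketch of a fully pinned pod spec fragment during such a migration. It assumes both nodepools carry the same pool=mobile taint, because the toleration is still required on top of the agentpool affinity:

```yaml
# Sketch: pinned to the mobile-app nodepools while migrating
# (assumes both pools carry the pool=mobile:NoSchedule taint)
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: kubernetes.azure.com/agentpool
                operator: In
                values:
                  - npmobileapp1
                  - npmobileapp2
  tolerations:
    - key: "pool"
      operator: "Equal"
      value: "mobile"
      effect: "NoSchedule"
```

Once the new nodepool is up, rolling the deployment and draining the old nodepool moves the pods over; afterwards the old value can be removed from the list again.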
This wiki has been made possible by: