
Labels and Taints

To make scheduling more efficient and compatible with Kubernetes, Ocean supports the following Kubernetes constraint mechanisms for scheduling pods:

  • Node Selector: Constrains pods to nodes with particular labels.
  • Node Affinity: Constrains the nodes a pod is eligible to be scheduled on, based on node labels. Spot supports hard and soft affinity (requiredDuringSchedulingIgnoredDuringExecution, preferredDuringSchedulingIgnoredDuringExecution).
  • Pod Affinity and Pod Anti-Affinity: Schedules a pod based on which other pods are already running on a node.
  • Pod Port Restrictions: Validates that each pod has its required ports available on the machine.
  • Well-Known Labels.
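Of these mechanisms, pod anti-affinity is the only one not demonstrated in the examples below. As a minimal sketch (the deployment name, app label, and replica count are illustrative), the standard Kubernetes form spreads replicas across nodes:

```yaml
# Sketch: spread replicas across nodes with pod anti-affinity.
# kubernetes.io/hostname is the standard per-node topology key.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: web
              topologyKey: kubernetes.io/hostname
      containers:
        - name: web
          image: nginx:1.14.2
```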

Spot Labels

Spot labels allow you to adjust Ocean's default scaling behavior. Add them to your pods to control the node termination process or lifecycle.

spotinst.io/azure-premium-storage

The AKS scheduler does not guarantee that pods requiring premium storage will be scheduled on nodes that support premium storage disks. Ocean injects the spotinst.io/azure-premium-storage label into every node in a node pool that supports premium storage. We recommend using spotinst.io/azure-premium-storage on pods that require premium storage disks, so that those pods are provisioned on the most appropriate nodes for their workloads. For more information, see Azure premium storage.
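As a sketch, the label can be used as a nodeSelector on the pod (the pod name is illustrative, and the "true" value is an assumption; match whatever value Ocean applies on your nodes):

```yaml
# Sketch: pin a pod to premium-storage-capable AKS nodes.
apiVersion: v1
kind: Pod
metadata:
  name: premium-storage-pod
spec:
  containers:
    - name: app
      image: registry.k8s.io/pause:2.0
  nodeSelector:
    # Assumed value; verify against the label Ocean injects on your nodes.
    spotinst.io/azure-premium-storage: "true"
```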

spotinst.io/restrict-scale-down

Some workloads are not as resilient to spot instance replacements as others, so you may want to lower the frequency of replacing the nodes they are running on as much as possible while still benefiting from spot instance pricing. For these workloads, use the spotinst.io/restrict-scale-down label (set to true) to block the proactive scaling down of the instance for the purposes of more efficient bin packing. This will leave the instance running as long as possible. The instance will be replaced only if it goes into an unhealthy state or if forced by a cloud provider interruption.

spotinst.io/node-lifecycle

Ocean uses the spotinst.io/node-lifecycle label key to indicate a node's lifecycle. It is applied to all Ocean-managed nodes and has a value of od (on-demand).

This label is useful for workloads that are not resilient to spot instance interruptions and must run on on-demand instances at all times.

By applying node affinity to the spotinst.io/node-lifecycle label with the value od, you can ensure that these workloads are scheduled only on on-demand instances.

note

spotinst.io/node-lifecycle:spot affinity is not supported, and unless spotinst.io/node-lifecycle:od affinity is applied, Ocean will continue to try to provide excess compute capacity (spot instances) for all workloads in the cluster.

spotinst.io/gpu-type

This label helps create direct affinity to specific types of GPU hardware, freeing the user from the need to explicitly set and manage a list of VMs that contain the required hardware. Ocean automatically matches the relevant VMs (currently with AWS and GCP) for workloads having affinity rules using this label. Valid label values are:

  • nvidia-tesla-v100
  • nvidia-tesla-p100
  • nvidia-tesla-k80
  • nvidia-tesla-p4
  • nvidia-tesla-t4
  • nvidia-tesla-a100 (Only for AWS)
  • nvidia-tesla-m60
  • amd-radeon-v520
  • nvidia-tesla-t4g
  • nvidia-tesla-a10

note

Don't add Spot labels under the virtual node group (launch specification) node labels section. Add these labels to the pod configuration only.
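As an illustration, a workload can target a specific GPU type with node affinity on this label (the pod name and GPU choice are illustrative; nvidia.com/gpu is the standard device-plugin resource name):

```yaml
# Sketch: require a node with NVIDIA T4 GPUs via spotinst.io/gpu-type.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-workload
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: spotinst.io/gpu-type
                operator: In
                values:
                  - nvidia-tesla-t4
  containers:
    - name: gpu-workload
      image: registry.k8s.io/pause:2.0
      resources:
        limits:
          nvidia.com/gpu: 1
```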

Instance Types Labels

Format: aws.spot.io/instance-<object>, for example, aws.spot.io/instance-category

Apply these labels to a workload's constraints (nodeSelector, node affinity, etc.) to reflect instance type properties. For example, constrain workloads to run on any M6, M7, or R7 family. This avoids manually listing all instance types per family.

The instance labels are as follows:

  • aws.spot.io/instance-category: Reflects the category of the instance (for example, c).
  • aws.spot.io/instance-family: Reflects the family of the instance (for example, c5a).
  • aws.spot.io/instance-generation: Reflects the generation of the instance (for example, 5).
  • aws.spot.io/instance-hypervisor: Reflects the hypervisor the instance uses (for example, nitro).
  • aws.spot.io/instance-cpu: Reflects the number of CPUs the instance has (for example, 2).
  • aws.spot.io/instance-memory: Reflects the instance's memory (for example, 4096).

When these labels are applied as pod constraints, Ocean launches only nodes whose instance types match them.
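For instance, the M6/M7/R7 constraint mentioned above can be sketched with node affinity on the family label (the specific family names m6i, m7i, and r7i are illustrative; use any families valid in your region):

```yaml
# Sketch: restrict a pod to selected AWS instance families.
apiVersion: v1
kind: Pod
metadata:
  name: family-constrained
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: aws.spot.io/instance-family
                operator: In
                values:
                  - m6i
                  - m7i
                  - r7i
  containers:
    - name: family-constrained
      image: registry.k8s.io/pause:2.0
```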

Examples

Using restrict scale-down label:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
        spotinst.io/restrict-scale-down: "true"
    spec:
      containers:
        - name: nginx
          image: nginx:1.14.2
          ports:
            - containerPort: 80
          resources:
            requests:
              memory: "2Gi"
              cpu: "2"
            limits:
              memory: "4Gi"
              cpu: "4"

Using od nodeSelector:

apiVersion: v1
kind: Pod
metadata:
  name: with-node-selector
spec:
  containers:
    - name: with-node-selector
      image: registry.k8s.io/pause:2.0
      imagePullPolicy: IfNotPresent
  nodeSelector:
    spotinst.io/node-lifecycle: od

Using od nodeAffinity:

apiVersion: v1
kind: Pod
metadata:
  name: with-node-affinity
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: spotinst.io/node-lifecycle
                operator: In
                values:
                  - od
  containers:
    - name: with-node-affinity
      image: registry.k8s.io/pause:2.0

Startup Taints

Cloud service provider relevance: AWS Kubernetes

Startup taints are temporary taints applied to a node during its initialization phase. During this phase, the autoscaler does not scale up nodes for additional pending pods that match this node, because it knows the startup taint will soon be removed. Once the taint is removed, any pod that matches the node, even without a toleration, can be scheduled there without launching additional nodes.

When to Use Startup Taints

You may want to deploy a specific pod to a node before deploying other pods to the same node. When that pod is ready or has completed a defined procedure, such as networking, scheduling of other pods will be allowed.

Example: Cilium

Cilium recommends applying a taint such as node.cilium.io/agent-not-ready=true:NoExecute to prevent other pods from starting before Cilium has finished configuring the necessary networking on the node.

Only the pod used for initialization has a toleration for this taint. Once the node is ready, the application running in that pod removes the taint from the node.
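A hedged sketch of such a toleration on the initialization pod (the DaemonSet name, labels, and image are illustrative; in practice Cilium's own manifests or Helm charts manage this):

```yaml
# Sketch: only this DaemonSet tolerates the startup taint, so its pods
# schedule on new nodes first; everything else waits until the taint is
# removed.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: cilium-agent
spec:
  selector:
    matchLabels:
      app: cilium-agent
  template:
    metadata:
      labels:
        app: cilium-agent
    spec:
      tolerations:
        - key: node.cilium.io/agent-not-ready
          operator: Equal
          value: "true"
          effect: NoExecute
      containers:
        - name: agent
          image: quay.io/cilium/cilium:v1.14.0  # illustrative tag
```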

note

If a startup taint has not been removed from a specific node by the end of the cluster's grace period, a new node is launched for any pending pods. The grace period starts when a node is created; its default is 5 minutes, and you can configure it in the cluster under cluster.strategy.gracePeriod.

Configure Startup Taints in the Spot API

AWS Kubernetes only

Prerequisite: Ocean controller version at least v2.0.68

Configure Ocean to consider your startup taints using the startupTaints attribute at the Ocean cluster and virtual node group levels.

important

You must also set the startupTaint as a regular taint in the userData for the cluster or virtual node group. This is because Ocean does not add or remove configured startup taints.
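As a sketch only, the startupTaints attribute presumably takes the standard Kubernetes taint shape (key, value, effect); the surrounding path is abbreviated here, so consult the Spot API reference for its exact placement in the cluster or virtual node group configuration:

```yaml
# Fragment (assumed shape): a startup taint declared to Ocean so the
# autoscaler does not launch extra nodes while the taint is present.
startupTaints:
  - key: node.cilium.io/agent-not-ready
    value: "true"
    effect: NoExecute
```

Remember that the same taint must also appear as a regular taint in the userData, since Ocean itself does not add or remove it.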