Mastering Mutable Pod Resources in Suspended Kubernetes Jobs
Kubernetes v1.36 promotes to beta the ability to modify container resource requests and limits in the pod template of a suspended Job. This addresses a common pain point: a Job's pod template is otherwise largely immutable, so changing resource specifications has meant deleting and recreating the Job. By allowing adjustments to CPU, memory, and extended resources such as GPUs while a Job is suspended, the feature lets you size a Job to current cluster conditions or workload demands before its Pods are created.
The mechanism behind this feature is straightforward. The Kubernetes API server relaxes the immutability constraint on pod template resource fields specifically for suspended Jobs. You can modify `spec.template.spec.containers[*].resources.requests` and `spec.template.spec.containers[*].resources.limits`. To create a suspended Job, set `spec.suspend: true` in your Job manifest. After creating the Job, you can edit the resources with `kubectl edit` and then resume the Job by patching `spec.suspend` to `false`. Pods created after the Job resumes use the updated template.
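The edit step can also be done non-interactively. The following is a minimal sketch, not an official recipe: it assumes the Job and container names used in the example manifest on this page (`training-job-example-abcd123`, `trainer`), and the helper name `build_resources_patch` is hypothetical. It builds a strategic merge patch, which `kubectl patch` uses by default for built-in types and which matches containers by `name`, and sets requests equal to limits.

```shell
#!/bin/sh
# Hypothetical helper (not part of kubectl): prints a strategic merge patch
# that sets CPU and memory requests and limits to the same values for one
# named container in a Job's pod template.
build_resources_patch() {
  container="$1"; cpu="$2"; memory="$3"
  resources=$(printf '{"cpu":"%s","memory":"%s"}' "$cpu" "$memory")
  printf '{"spec":{"template":{"spec":{"containers":[{"name":"%s","resources":{"requests":%s,"limits":%s}}]}}}}\n' \
    "$container" "$resources" "$resources"
}

# Print the patch so it can be reviewed before it is applied.
build_resources_patch trainer 4 16Gi
```

To apply it while the Job is suspended, one option is `kubectl patch job training-job-example-abcd123 -p "$(build_resources_patch trainer 4 16Gi)"`. Because a strategic merge patch merges maps, resource keys the patch does not mention (such as the GPU resource) keep their current values.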
In production, be aware of a key limitation: if you suspend a Job that was already running, you must wait for all of its active Pods to terminate before modifying resources. The API server rejects the change while `status.active` is greater than zero, which prevents inconsistencies between running Pods and the updated pod template. The feature was first introduced as alpha in v1.35; its promotion to beta in v1.36 indicates that it is considered ready for wider use.
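The wait described above can be scripted. This is a rough sketch under stated assumptions: the helper names `active_pods` and `wait_for_drain`, the `KUBECTL` override variable, and the retry defaults are all hypothetical choices made for illustration. It polls `status.active` through `kubectl get -o jsonpath` and treats an empty result as zero, since the API omits the field when no Pods are active.

```shell
#!/bin/sh
# KUBECTL can be overridden (for example with a stub function) when testing.
KUBECTL="${KUBECTL:-kubectl}"

# Prints the number of active Pods for a Job. status.active is omitted from
# the API object when it is zero, so an empty result is reported as 0.
active_pods() {
  count=$("$KUBECTL" get job "$1" -o jsonpath='{.status.active}') || return 1
  echo "${count:-0}"
}

# Hypothetical helper: polls until the Job has no active Pods.
# Returns 0 when drained, 1 when the retry budget runs out,
# and 2 when the kubectl call itself fails.
wait_for_drain() {
  job="$1"; attempts="${2:-30}"; delay="${3:-2}"
  while [ "$attempts" -gt 0 ]; do
    n=$(active_pods "$job") || return 2
    [ "$n" -eq 0 ] && return 0
    attempts=$((attempts - 1))
    sleep "$delay"
  done
  return 1
}
```

A possible sequence for a running Job is to suspend it with `kubectl patch job training-job-example-abcd123 -p '{"spec":{"suspend":true}}'`, call `wait_for_drain training-job-example-abcd123`, and only then edit the resources and resume.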
Key takeaways
- Use the mutable pod resources feature to adjust resource requests and limits for suspended Jobs.
- Set `spec.suspend: true` in your Job manifest to create a suspended Job.
- If the Job was already running, wait for all active Pods to terminate before modifying its resources.
- Use `kubectl edit` to change resource specifications while the Job is suspended.
- Patch the Job to resume it after making resource adjustments.
Why it matters
This feature lets you adapt a Job's resource requests to changing workload demands or available capacity without deleting and recreating the Job. That makes it easier to keep resource utilization in line with what the cluster can actually provide.
Code examples
Suspended Job with its initial resource requests:

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: training-job-example-abcd123
      labels:
        app.kubernetes.io/name: trainer
    spec:
      suspend: true
      template:
        metadata:
          annotations:
            kubernetes.io/description: "ML training, ID abcd123"
        spec:
          containers:
          - name: trainer
            image: example-registry.example.com/training:2026-04-23T150405.678
            resources:
              requests:
                cpu: "8"
                memory: "32Gi"
                example-hardware-vendor.com/gpu: "4"
              limits:
                cpu: "8"
                memory: "32Gi"
                example-hardware-vendor.com/gpu: "4"
          restartPolicy: Never

The same Job with lower requests and limits:

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: training-job-example-abcd123
      labels:
        app.kubernetes.io/name: trainer
    spec:
      suspend: true
      template:
        metadata:
          annotations:
            kubernetes.io/description: "ML training, ID abcd123"
        spec:
          containers:
          - name: trainer
            image: example-registry.example.com/training:2026-04-23T150405.678
            resources:
              requests:
                cpu: "4"
                memory: "16Gi"
                example-hardware-vendor.com/gpu: "2"
              limits:
                cpu: "4"
                memory: "16Gi"
                example-hardware-vendor.com/gpu: "2"
          restartPolicy: Never

Commands to create, edit, and resume the Job:

    # Create a suspended Job
    kubectl apply -f my-job.yaml --server-side
    # Edit the resource requests
    kubectl edit job training-job-example-abcd123
    # Resume the Job
    kubectl patch job training-job-example-abcd123 -p '{"spec":{"suspend":false}}'

When NOT to use this
The official docs don't call out specific anti-patterns here. Use your judgment based on your scale and requirements.