Unlocking Kubernetes v1.36: PSI Metrics for Proactive Resource Management
Kubernetes v1.36 promotes Pressure Stall Information (PSI) metrics to general availability, addressing a critical need for proactive resource management. PSI provides high-fidelity signals that help you identify resource saturation before it leads to an outage. This capability is essential for maintaining application performance and reliability in production environments.
The kubelet now detects OS-level PSI support through cgroup configuration, so pressure metrics are collected only on nodes that support them. This means cleaner data for your monitoring and alerting systems. To take advantage of these metrics, ensure your nodes run Linux kernel 4.20 or later with cgroup v2. The kernel must also be compiled with CONFIG_PSI=y, and the system must not be booted with the psi=0 parameter.
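You can verify these prerequisites directly on a node (for example, from a node debug shell). The following is a minimal sketch that relies only on standard Linux kernel interfaces, not Kubernetes APIs:

```shell
# psi_line: print the first ("some") line of a PSI interface file,
# or "unavailable" if the kernel does not expose it.
psi_line() {
  if [ -r "$1" ]; then head -n 1 "$1"; else echo "unavailable"; fi
}

# A kernel built with CONFIG_PSI=y (4.20+) exposes these files:
psi_line /proc/pressure/cpu
psi_line /proc/pressure/memory
psi_line /proc/pressure/io

# The kubelet additionally requires cgroup v2 (the unified hierarchy);
# on a conforming node this prints "cgroup2fs".
stat -fc %T /sys/fs/cgroup 2>/dev/null || echo "cgroup v2 not mounted"

# Finally, confirm the kernel was not booted with psi=0.
grep -q 'psi=0' /proc/cmdline 2>/dev/null && echo "WARNING: PSI disabled via kernel cmdline" || true
```

If any of the pressure files print "unavailable", the kubelet on that node will not report PSI data regardless of the Kubernetes version.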
In production, you can query PSI metrics using a simple command. For example, use kubectl get --raw to access the stats summary for a specific container. Be cautious, as proxying to the kubelet is a privileged operation that requires appropriate administrative permissions. As of v1.36, you no longer need to opt in to any feature gate, making it easier to leverage these metrics in your workflows.
Key takeaways
- Leverage PSI metrics to identify resource saturation before outages occur.
- Ensure your kernel is compiled with CONFIG_PSI=y to collect accurate PSI data.
- Use moving averages (10s, 60s, 300s) to differentiate between transient spikes and sustained resource tension.
- Query PSI metrics with `kubectl get --raw` for real-time insights into container performance.
- Be aware of security risks when proxying to the kubelet; use appropriate permissions.
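To illustrate the moving-average point above: comparing the three windows lets you tell a brief spike from sustained pressure. This is a rough sketch; the 10% stall threshold is an arbitrary illustrative choice, not an official recommendation:

```shell
# psi_classify: given a PSI "some" line, report "sustained" when the
# 300s average is high, "transient-spike" when only the 10s average is
# high, and "ok" otherwise. The 10% threshold is illustrative only.
psi_classify() {
  avg10=$(echo "$1" | sed -n 's/.*avg10=\([0-9.]*\).*/\1/p')
  avg300=$(echo "$1" | sed -n 's/.*avg300=\([0-9.]*\).*/\1/p')
  if awk "BEGIN{exit !($avg300 > 10)}"; then
    echo "sustained"
  elif awk "BEGIN{exit !($avg10 > 10)}"; then
    echo "transient-spike"
  else
    echo "ok"
  fi
}

# A short spike: high avg10, but avg300 shows it has not persisted.
psi_classify 'some avg10=45.00 avg60=20.00 avg300=2.00 total=100'   # → transient-spike
```

The same classification works for any of the cpu, memory, and io pressure lines, since they all share the avg10/avg60/avg300 format.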
Why it matters
In production, PSI metrics can significantly reduce downtime by allowing you to anticipate and address resource issues before they impact users. This proactive approach enhances overall system reliability.
Code examples
CONTAINER_NAME="example-container"
kubectl get --raw "/api/v1/nodes/$(kubectl get nodes -o jsonpath='{.items[0].metadata.name}')/proxy/stats/summary" | jq --arg name "$CONTAINER_NAME" '.pods[].containers[] | select(.name==$name) | {name, cpu: .cpu.psi, memory: .memory.psi, io: .io.psi}'
When NOT to use this
The official docs don't call out specific anti-patterns here. Use your judgment based on your scale and requirements.
Want the complete reference?
Read the official docs.
Mastering Workload-Aware Scheduling in Kubernetes v1.36
Kubernetes v1.36 introduces powerful workload-aware scheduling features that can transform how you deploy applications. With the new Workload and PodGroup APIs, you can prevent resource wastage and deadlocks through gang scheduling. This is a game changer for managing complex workloads effectively.
Unlocking Kubernetes v1.36: Dynamic Resource Allocation and Its Game-Changing Features
Kubernetes v1.36 introduces Dynamic Resource Allocation (DRA), revolutionizing how you manage hardware accelerators. With features like prioritized lists and device taints, you can optimize resource utilization and improve system reliability.
Unlocking Performance with Kubernetes Pod-Level Resource Managers
Kubernetes v1.36 introduces Pod-Level Resource Managers, a game changer for performance-sensitive workloads. This feature allows for hybrid resource allocation models, enhancing efficiency without compromising NUMA alignment.