OpsCanary

Mastering Canary Deployments: Strategies for Safe Releases

5 min read · Argo Rollouts Docs · Apr 23, 2026
Practitioner: hands-on experience recommended

Canary deployments exist to mitigate the risks associated with releasing new application versions. By directing a small percentage of production traffic to a new version, you can monitor its performance and catch issues before a full rollout. This strategy allows for a controlled and gradual exposure, reducing the chances of widespread failures.

A canary rollout is defined as a list of steps that the rollout controller works through while it manipulates ReplicaSets. A setWeight step specifies the percentage of traffic that should go to the new version, and a pause step holds the rollout either for a fixed duration or indefinitely until it is promoted. A setCanaryScale step sets the canary's scale explicitly, as a fixed replica count or as a percentage of spec.replicas, instead of letting it follow the traffic weight. Two strategy fields bound the update: maxSurge limits how many replicas can be created above the desired count, and maxUnavailable limits how many pods can be unavailable during the update.

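In production, the details matter. Without traffic management, the rollout controller can only approximate the weight from a setWeight step by adjusting the ratio of canary to stable replicas, so the achievable split is limited by the replica count. The setCanaryScale step is only relevant when a traffic router is in use, because a basic canary has to control canary scale itself in order to approximate the weight. Finally, if dynamicStableScale is enabled and you abort the rollout, the canary ReplicaSet scales down dynamically as traffic shifts back to the stable version. The trade-off is that the stable ReplicaSet, which was scaled down during the update, has to scale back up before it can take all the traffic again, so watch an abort until it completes.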
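
As a rough sketch of the abort behavior described above, the fragment below enables dynamicStableScale and keeps the canary ReplicaSet up for a while after an abort. It assumes traffic routing is already configured on the Rollout (not shown). The abortScaleDownDelaySeconds field is not covered in this summary; it is the field the Argo Rollouts docs describe for this purpose, so verify it against the version you run. The 600-second value is an arbitrary illustration, not a recommendation.

```yaml
spec:
  strategy:
    canary:
      # Assumption: a trafficRouting block is configured elsewhere on this Rollout;
      # dynamicStableScale only applies when a traffic router is in use.
      dynamicStableScale: true
      # Illustrative value: leave the canary ReplicaSet scaled up for 10 minutes
      # after an abort instead of scaling it down as traffic shifts back to stable.
      abortScaleDownDelaySeconds: 600
```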

Key takeaways

  • Use setWeight to control the percentage of traffic sent to the canary.
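  • Set maxSurge and maxUnavailable to bound how many extra replicas are created and how many pods can be unavailable during the update.
  • Use setCanaryScale, together with a traffic router, to set the canary's replica count independently of its traffic weight.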
  • Monitor the rollout closely, especially if dynamicStableScale is enabled.
  • Be aware that without traffic management, achieving the desired traffic split may be challenging.
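
To make the last point concrete, here is a minimal sketch of a basic canary with no traffic router. The replica counts in the comments are the targets the controller aims for when it approximates each weight by replica ratio. They assume spec.replicas is 10 and that traffic is spread evenly across pods by the Service, so treat them as an approximation rather than a guaranteed split.

```yaml
spec:
  replicas: 10
  strategy:
    canary:
      # No trafficRouting block: weights are approximated by replica ratio.
      steps:
      # Target: 1 canary pod and 9 stable pods (about 10% of requests).
      - setWeight: 10
      - pause: {}
      # Target: 2 canary pods and 8 stable pods (about 20% of requests).
      - setWeight: 20
      - pause: {}
```

With only 10 replicas, the split can move only in 10% increments. Finer-grained percentages need either more replicas or a traffic router.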

Why it matters

In production, effective canary deployments can significantly reduce downtime and user impact during updates. By catching issues early, you can maintain a stable user experience while iterating on your application.

Code examples

YAML
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: example-rollout
spec:
  replicas: 10
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.15.4
        ports:
        - containerPort: 80
  minReadySeconds: 30
  revisionHistoryLimit: 3
  strategy:
    canary: # Indicates that the rollout should use the Canary strategy
      maxSurge: '25%'
      maxUnavailable: 0
      steps:
      - setWeight: 10
      - pause:
          duration: 1h # 1 hour
      - setWeight: 20
      - pause: {} # pause indefinitely
YAML
spec:
  strategy:
    canary:
      steps:
      # explicit count
      - setCanaryScale:
          replicas: 3
      # a percentage of spec.replicas
      - setCanaryScale:
          weight: 25
      # matchTrafficWeight returns to the default behavior of matching the canary traffic weight
      - setCanaryScale:
          matchTrafficWeight: true
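
One caution when combining an explicit canary scale with setWeight: the two are independent, so it is possible to send most of the traffic to very few pods. The sketch below shows the kind of imbalance to avoid. It assumes a traffic router is configured (not shown) and spec.replicas is 10. The numbers are illustrative.

```yaml
spec:
  replicas: 10
  strategy:
    canary:
      # Assumption: a trafficRouting block is configured elsewhere on this Rollout.
      steps:
      # Canary scaled to 10% of spec.replicas, i.e. 1 pod.
      - setCanaryScale:
          weight: 10
      # 90% of traffic is now routed to that single canary pod.
      - setWeight: 90
      - pause: {}
```

If the goal is simply for canary capacity to follow traffic, leaving setCanaryScale out (or returning to matchTrafficWeight: true) avoids this mismatch.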
Bash
# promote to the next step
kubectl argo rollouts promote <rollout>

When NOT to use this

The official docs don't call out specific anti-patterns here. Use your judgment based on your scale and requirements.
