OpsCanary
kubernetessecurityPractitioner

Building a Multi-Agent Security Platform on Kubernetes: Why Cloud Native is Key

5 min read CNCF BlogJun 17, 2026Reviewed for accuracy
Share
PractitionerHands-on experience recommended

In the evolving landscape of AI, deploying a multi-agent security platform requires a robust and scalable architecture. Cloud-native solutions, particularly Kubernetes, provide the flexibility and resilience needed to manage complex agent interactions. This architecture not only facilitates inter-agent coordination but also ensures that security and observability are baked into the system from the ground up.

The system employs a Coordinator Agent that leverages LangGraph and the A2A protocol to orchestrate four specialized agents: Detect, Analyse, Remediate, and Notify. Each agent runs as a separate Kubernetes Deployment, complete with defined resource limits, identity, and restart policies. Security is paramount; inter-agent traffic is protected using mutual TLS (mTLS), with cert-manager issuing unique identities for each agent. Observability is enhanced through the inclusion of an A2A trace_id in every task, allowing structured JSON logs to be generated. The reviewer agent utilizes Open Policy Agent (OPA) for policy decisions, while Kyverno manages admission rules. An Isolation Forest anomaly model acts as a gatekeeper for the LLM, controlling costs and latency effectively.

In production, it's crucial to recognize that while the cloud-native approach offers scalability and security, the opposite pattern—running all agents in a single process—might seem faster for demos but is unsuitable for real-world applications. The system is open-sourced and governed under the Linux Foundation, ensuring community support and ongoing development.

Key takeaways

  • Utilize the A2A protocol for effective inter-agent coordination.
  • Secure inter-agent traffic with mTLS to enhance communication security.
  • Leverage the Isolation Forest model to manage costs and latency in LLM interactions.
  • Implement OPA for policy decisions and Kyverno for managing admission rules.
  • Deploy each agent as a separate Kubernetes workload for better resource management.

Why it matters

In production, a cloud-native architecture allows for scalable, secure, and efficient management of multiple agents, which is critical for maintaining robust security protocols in an increasingly complex digital landscape.

When NOT to use this

The opposite pattern (all agents in one process) is faster to demo on a laptop and would be wrong in production.

Want the complete reference?

Read official docs

Test what you just learned

Quiz questions written from this article

Take the quiz →
Better StackSponsor

Unified observability — logs, uptime monitoring, and on-call in one place. Used by 50,000+ engineering teams to ship faster and sleep better.

Try Better Stack free →

Get the daily digest

One email. 5 articles. Every morning.

No spam. Unsubscribe anytime.