How Linkerd brings simplicity to service mesh and AI security

Service meshes have a reputation for complexity, but what if operational simplicity was the core design principle?

In this Techzine TV interview from KubeCon and CloudNativeCon, William Morgan, founder and CEO of Buoyant, explains how Linkerd differentiates itself in the crowded service mesh market. As the first service mesh to achieve CNCF graduated status, Linkerd powers critical infrastructure worldwide while maintaining a focus on making systems inspectable, debuggable, and understandable.

Morgan discusses the challenges of managing service mesh complexity, Linkerd’s position versus competitors like Istio, and how Buoyant is already adapting Linkerd to parse MCP protocol traffic and secure AI agents operating in production Kubernetes environments. The conversation also reveals Buoyant’s transparent approach to open source sustainability.

The simplicity philosophy

Question: You position Linkerd as the simpler service mesh option. Can you explain that approach?

William Morgan: Our view is that the beauty of Kubernetes is that it’s like a LEGO base plate; it’s designed for you to add things to it. And you build a platform by putting together 3, 4, 10, 20 projects, but you quickly get into complex situations unless every project is very, very focused on simplicity. The service mesh, unfortunately, has a reputation for being very complex. So our vision with Linkerd, and our goal, is: how can we make this as simple as possible for the human beings? At the end of the day, someone has to operate it. Ultimately there is a person who has to wake up at three in the morning when something breaks. So how do we make sure that Linkerd is inspectable, it’s debuggable, it’s understandable, it’s small, it’s lightweight, and it plays nicely with the rest of the ecosystem?

Question: What does operational simplicity really mean?

William Morgan: Simplicity is: is it composable? Is it self-contained? Can I understand it and inspect it? It doesn’t mean I used to run three commands over here but now I run one command over here, and therefore the one command is simple. That’s not simplicity. Simplicity is operational simplicity.

Market position and competition

Question: Aren’t most Kubernetes engines using Istio instead of Linkerd nowadays?

William Morgan: No, I don’t think so. I mean, I don’t know. I don’t run the poll. Everyone I talk to is running Linkerd. So from my perspective, way more Linkerd than Istio. And we have a lot of people coming from Istio to Linkerd. But of course, it’s always fun to have a nerd fight between the two projects. When we started out, there was Kubernetes versus Mesos and everyone was like, oh, which one’s going to win? With the service mesh, I don’t think you’ve really seen that. There hasn’t really been one big winner.

Managing complexity challenges

Question: What are the main pitfalls for making a service mesh overly complex?

William Morgan: I think it’s where the service mesh sits. It’s like a pacemaker. It doesn’t sit off to the side. It’s not an Apple Watch that sits off to the side, and if you don’t like it, you take it off. It goes into the beating heart of your application, and it sits at the intersection of the network, the Kubernetes cluster, and the application. And if any of those things go wrong, then the service mesh starts behaving in a strange way. So you have to be really, really careful to make sure the behavior is predictable and consistent, that everything gets emitted as metrics, and that you have alerting and monitoring and dashboards that are easily accessible.

Question: How do you resist the temptation to add complexity through new features?

William Morgan: It’s very easy to fall into the trap. This is why software is kind of notorious. You look at Windows, right? It’s very easy: you add more features, you do stuff without really thinking things through, and at the end of the day you have this very complex system. The world needs software to be reliable, and the people who use Linkerd are using it for incredibly sensitive and incredibly important infrastructure. The data that passes through it is financial data, it’s medical health data, it’s PII, it’s HIPAA data, and, scariest of them all, GDPR data. Security and simplicity have to be there, because the more complex you make this, the more likely it is for things to fall apart and for bad actors to get access to data you don’t want them to have.

AI integration and MCP protocol

Question: How do service meshes interact with new communication standards like MCP and A2A that weren’t designed with security as a main principle?

William Morgan: Linkerd is really well-positioned for that because, at its core, it has a user-space proxy that understands the network traffic going through it. We’re able to take an HTTP stream or a TCP stream and break it down into: okay, this is a protocol, here’s what’s in the body, here’s what’s in the headers. When it comes to a protocol like MCP, which is JSON-RPC going over HTTP, Linkerd can sit there, it can actually understand that traffic, and it can start doing things with it. We have an early prototype of Linkerd that parses MCP, that emits metrics around which tools and resources and prompts are being used and what their latency is, and that actually ties into the existing policy mechanism. So you can say, oh, service A is allowed to call the delete-all-users tool, but service B is not.
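To make the idea concrete, here is a minimal, hypothetical sketch of what such a proxy-side check could look like: parse an MCP `tools/call` request (plain JSON-RPC) and consult a per-workload allowlist before letting it through. The policy table, workload names, and `authorize` function are illustrative assumptions, not Linkerd’s actual prototype or API.

```python
import json

# Hypothetical policy: workload identity -> MCP tools it may call.
# (Illustrative only; not Linkerd's real policy mechanism.)
POLICY = {
    "service-a": {"delete-all-users", "list-users"},
    "service-b": {"list-users"},
}

def authorize(client_id: str, raw_body: bytes) -> bool:
    """Decide whether a client may send this MCP JSON-RPC message.

    Only tools/call requests are policy-checked here; other MCP
    methods (initialize, tools/list, ...) pass through unmodified.
    """
    msg = json.loads(raw_body)
    if msg.get("method") != "tools/call":
        return True
    tool = msg.get("params", {}).get("name")
    return tool in POLICY.get(client_id, set())

# A tools/call request as it would appear in the HTTP body.
request = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "delete-all-users", "arguments": {}},
}).encode()

print(authorize("service-a", request))  # True: service A is allowed
print(authorize("service-b", request))  # False: service B is denied
```

Because MCP rides on ordinary HTTP and JSON, a proxy that already terminates the connection can extract the tool name for metrics (which tools are called, with what latency) and for authorization in the same pass, which is the point Morgan is making about where Linkerd sits.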

Question: What are the unique security challenges with AI agents?

William Morgan: All of a sudden, you’ve got this non-deterministic actor in your system. You need to treat that very carefully, much more carefully than you’ve treated human-generated software in the past. With something like MCP, the usage that we see in practice is a lot of developers spinning up MCP clients on their laptops and then just adding in whichever MCP services are out there. And pretty soon, you have a situation where it’s totally unconstrained, and you have your private IP being sent out across a wire, and you’ve got this vector for prompt injection attacks.

Security versus developer productivity

Question: How do you balance security requirements with developer productivity?

William Morgan: This is a challenge throughout history: the tension between trying to make things really fast and easy and reduce friction, and also trying to keep them secure. The surface area is different with AI, but the challenge is still the same. Of course, you want developers to be as productive as possible, and you want them to have full access to everything, but you also need to put in controls so they’re not doing that in an unsafe way. How do you do that in a way that is not a huge burden to the developers? Because once it becomes a huge burden, they’ll find ways to circumvent it. I don’t think anyone has a silver bullet for that. I think where Linkerd sits, it’s very nicely positioned for when that traffic enters the production cluster.

Question: Are protocols like MCP too limited for production infrastructure needs?

William Morgan: The MCP protocol, that’s an easy one to make fun of because it was just slapped together by someone who didn’t appreciate how important that stuff is. So you have a very limited OAuth mechanism, and you have a very limited ability to do anything around identity. I think that protocol has to evolve. We as infrastructure engineers have to recognize that a lot of the code that’s going to be written and the network calls that are coming into the system are going to be unconstrained. They could come from agents, agents on behalf of humans, or the humans themselves. The system has to support that regardless.

Platform engineering perspective

Question: Does Linkerd’s purpose change when dealing with AI agents instead of human developers?

William Morgan: Ever since the beginning, the point of Linkerd has been to actually be invisible to the developers. We are a tool to make the platform better. If you are building a platform on top of Kubernetes, your internal customers are the developers. But you actually don’t want them to have to care too much about the environment. You want them to be really efficient. You want them to be able to deploy code to this platform. You want the platform to always be reliable, and you need the platform to be secure. That’s still the case with or without AI. Those always have been and always will be the goals of the platform. From that perspective, everything is the same. It’s just more interesting.