When I first started working with Kubernetes, I immediately gravitated toward managed offerings like EKS, GKE, and AKS. The promise was compelling: let AWS/Google/Azure handle the control plane while you focus on your applications. Fast forward a few years, and I've come to a somewhat contrarian position—for many teams, especially those with some ops capability, running K3s on virtual machines often makes more sense than using managed Kubernetes.
Let me explain why, and the important caveats to make this approach work.
The Managed Kubernetes Tax
Managed Kubernetes services aren't free—and I'm not just talking about the literal cost (though that's significant). They come with several forms of "tax":
Financial cost: You pay for the control plane, typically per cluster (on EKS, for example, about $0.10/hour, roughly $73/month, before you've scheduled a single pod). For small to medium workloads, this can be disproportionately expensive.
Complexity tax: Managed K8s integrates deeply with cloud provider infrastructure—IAM, networking, storage—adding layers of abstraction and potential failure points.
Upgrade friction: Managed K8s upgrades are often more complex than they need to be, involving node group rotations and potential downtime.
Cognitive overhead: You still need to understand Kubernetes, plus the cloud provider's implementation quirks and limitations.
Take EKS, for example. What starts as "just let AWS manage the control plane" quickly spirals into wrestling with IAM roles for service accounts (IRSA), custom CNIs, the AWS Load Balancer Controller, and Cluster Autoscaler configurations that mysteriously stop working after upgrades. I've spent entire days debugging issues that stemmed from the interaction between EKS and AWS's underlying services—time that could have been spent improving our actual applications.
Enter K3s: Kubernetes Without the Bloat
K3s is a certified Kubernetes distribution designed for resource-constrained environments. It's packaged as a single binary under 100MB and uses significantly fewer resources than standard K8s. But don't let the "lightweight" label fool you—K3s is a production-grade distribution that powers everything from IoT devices to large-scale production systems.
When deployed on standard VMs (whether AWS EC2, DigitalOcean Droplets, or your own infrastructure), K3s offers several advantages:
Simplicity: A K3s cluster can be bootstrapped with a single command (see the sketch just after this list). No complex cloud provider integration required.
Cost efficiency: Run your entire control plane and worker nodes on standard VMs, often at a fraction of the cost of managed offerings.
Portability: Your setup works the same way regardless of where your VMs are hosted, making multi-cloud and hybrid deployments straightforward.
Easier upgrades: K3s upgrades can be as simple as replacing a binary and restarting a service.
Full control: No mysterious behavior or limitations imposed by the cloud provider's implementation.
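To make "single command" concrete, here's roughly what a bootstrap looks like, assuming the embedded etcd datastore. The server IP and token placeholder are illustrative; the installer URL and environment variables are standard K3s.

```bash
# On the first server VM: install K3s and start the control plane.
# --cluster-init enables embedded etcd so more servers can join later.
curl -sfL https://get.k3s.io | sh -s - server --cluster-init

# Read the join token the server generated.
sudo cat /var/lib/rancher/k3s/server/node-token

# On each worker VM: install the agent and join (IP and token are examples).
curl -sfL https://get.k3s.io | K3S_URL=https://10.0.0.10:6443 \
  K3S_TOKEN=<node-token> sh -
```

The server also writes a ready-to-use kubeconfig to /etc/rancher/k3s/k3s.yaml, so you can be running kubectl against the new cluster within minutes.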
The Critical Caveat: You Need Automation
Here's where I need to be clear: this approach only makes sense if you invest in automation. You're essentially building your own management layer, which requires:
Infrastructure as Code: Your entire VM fleet and K3s deployment should be defined in Terraform, Pulumi, or similar.
Automated scaling: Scripts or tools that can add/remove nodes based on cluster metrics (a naive sketch follows below).
Upgrade playbooks: Well-tested procedures for upgrading K3s versions with minimal disruption (see the example right after this list).
Monitoring and alerting: Comprehensive visibility into both VM and Kubernetes-level metrics.
Backup and disaster recovery: Regular etcd snapshots and documented recovery procedures.
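To give a flavor of what an upgrade playbook can look like: because a K3s upgrade really is "replace the binary, restart the service," the per-node procedure can be a short, testable script. This is a sketch, not a definitive playbook: the node name and version pin are placeholders, and in practice you'd roll through nodes one at a time with Ansible or similar.

```bash
# Drain the node so workloads reschedule elsewhere (node name is an example).
kubectl drain k3s-node-1 --ignore-daemonsets --delete-emptydir-data

# On the node itself: re-run the installer pinned to the target version.
# This replaces the k3s binary and restarts the systemd service in place.
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION="v1.30.4+k3s1" sh -

# Confirm the node reports the new version, then readmit workloads.
kubectl get node k3s-node-1
kubectl uncordon k3s-node-1
```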
Without these elements, you're likely better off with managed Kubernetes. The goal isn't to recreate every feature of EKS/GKE/AKS, but to build a simpler, more focused system that meets your specific needs.
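For the automated-scaling item above, small clusters don't need the full cluster autoscaler; a cron-driven check can be enough. The script below is a deliberately naive, hypothetical sketch: the threshold, the worker_count variable, and the Terraform layout are stand-ins for whatever your provisioning layer actually exposes.

```bash
#!/usr/bin/env bash
# Hypothetical scale-out check: if average node CPU utilization (as reported
# by metrics-server, which K3s bundles by default) crosses a threshold, add
# a node by bumping a Terraform variable. Names and numbers are illustrative.
set -euo pipefail

THRESHOLD=80  # percent CPU at which we add a node

# Average the CPU% column of `kubectl top nodes`.
avg_cpu=$(kubectl top nodes --no-headers \
  | awk '{gsub(/%/, "", $3); sum += $3; n++} END {print int(sum / n)}')

current=$(kubectl get nodes --no-headers | wc -l)

if [ "$avg_cpu" -gt "$THRESHOLD" ]; then
  echo "avg CPU ${avg_cpu}% > ${THRESHOLD}%: scaling ${current} -> $((current + 1))"
  terraform -chdir=infra apply -auto-approve -var "worker_count=$((current + 1))"
fi
```

Scale-down is the same idea in reverse, with a kubectl drain before the node is destroyed.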
Real-World Example
For one of my recent projects, we replaced an EKS cluster costing roughly $250/month (control plane + required minimum nodes) with a K3s setup on three small VMs totaling $60/month. The migration took three days, and we've had fewer operational issues since.
Our automation includes:
Terraform for VM provisioning
Ansible for K3s installation and configuration
Custom scripts for horizontal scaling based on node resource utilization
Prometheus + Grafana for monitoring
Weekly etcd snapshots stored in S3 (configuration sketch below)
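The snapshot piece is mostly built into K3s itself, assuming the embedded etcd datastore: the server can take scheduled snapshots and push them to S3 natively. The config below is a sketch; the bucket, region, and schedule are ours, the option names are worth double-checking against the K3s docs for your version, and S3 credentials (etcd-s3-access-key / etcd-s3-secret-key, or instance credentials) are omitted.

```bash
# On each server node; K3s reads this file on startup. In practice,
# merge with any existing config rather than overwriting it.
sudo tee /etc/rancher/k3s/config.yaml <<'EOF'
etcd-snapshot-schedule-cron: "0 3 * * 0"  # weekly, Sunday 03:00
etcd-snapshot-retention: 8                # keep two months of weeklies
etcd-s3: true
etcd-s3-bucket: my-k3s-backups            # example bucket name
etcd-s3-region: us-east-1                 # example region
EOF
sudo systemctl restart k3s
```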
The entire setup is documented in a Git repository, and new team members can spin up a local replica for testing using Vagrant.
The maintenance complexity with EKS was what ultimately pushed us over the edge. Every few months, AWS would deprecate something or introduce a new "recommended" way to handle networking, storage, or access control. We'd spend days reading through documentation changes and testing upgrades in staging environments. With K3s, upgrades are predictable and focused on Kubernetes itself, not the surrounding ecosystem of AWS-specific components.
When to Stick with Managed Kubernetes
This approach isn't for everyone. You should probably stick with managed Kubernetes if:
You have large, complex clusters with hundreds of nodes
Your team has limited operations expertise
You need advanced features like managed node auto-scaling groups
You're heavily invested in cloud-provider specific features
Conclusion
The beauty of the K3s-on-VMs approach is that it strips Kubernetes down to what it does best—orchestrating containers—without the added complexity that comes from deep cloud provider integration.
By building your own lightweight management layer through automation, you get the benefits of Kubernetes with more control, often at a lower cost. The key is being honest about your team's capabilities and needs.
For startups, indie hackers, and teams that value simplicity and cost-efficiency, this approach is worth considering. You might find that a little investment in automation pays significant dividends in both cost savings and reduced operational complexity.
Of course, if you enjoy spending your weekends debugging why your EKS cluster suddenly can't talk to your RDS instances despite no apparent changes, then by all means, stick with managed Kubernetes. Some people also enjoy jigsaw puzzles with missing pieces.
How do you handle storage and load balancers when running K3s yourself on cloud VMs? Do you install the cloud provider's controllers? If so, how is this better than the EKS model?
Do you have any automation for day-2 operations, such as certificate rotation for the control plane and kubelets?
How does this approach solve the mysterious connection disruptions with RDS? Both EKS managed nodes and plain EC2 instances share the same networking model, right?
Not meaning to be cynical, but I'd love to understand how you handle these with K3s.