Monthly Archives: September 2019

Kubernetes: Why I won’t use StatefulSets

StatefulSet is the workload API object used to manage stateful applications.

Manages the deployment and scaling of a set of Pods, and provides guarantees about the ordering and uniqueness of these Pods.

A StatefulSet operates under the same pattern as any other Controller. You define your desired state in a StatefulSet object, and the StatefulSet controller makes any necessary updates to get there from the current state.

https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/

Sounds perfect, right? Well, I’ve found that the limitations and behaviours of StatefulSets are so far away from what I want that I’ll steer clear of them completely. And in any case, they aren’t necessary.


How do you manage your Terraform code and state?

Terraform is a fantastic tool for managing infrastructure. However, I find that there are three interrelated challenges that you need to overcome somehow. Firstly, how do you manage the transition between an initial state and a target state with live workloads? Secondly, how do you manage complexity? Thirdly, how do you manage working in a team? I don’t think there are any complete answers to these, but here are some of my thoughts on how I tackle them and why I do it the way I do.


AWS EKS: Tunneling a private kube-apiserver

AWS EKS provides two options for network accessibility of the Kubernetes API server: public or private. In both cases, it is operated by AWS. However, if your security posture is such that you cannot run the public option, the private option has some challenges. How will you access the cluster from other locations? Prometheus? Spinnaker? CLI? That all depends on your model, but here’s the way I did it.


Launching M3, Prometheus and Grafana in AWS: I did it my way

I am opinionated. Just ask my wife. When it comes to infrastructure, I really like to be in control. I’m a fan of Kubernetes, for example, but I’m not a fan of prometheus-operator and similar options. Why? Because I think things are complicated enough, and there’s a lot going on beneath these that isn’t obvious. Because these systems typically require the resources of a full VM (indeed, often big VMs). Because these systems are significant in your daily operations. But also because these designs often introduce quite big limitations if they aren’t one-to-one with your target architecture and tooling.

So here are a few starting opinions:

  1. The essence of Prometheus is great
  2. Infrastructure as code is phenomenal
  3. Non-updating, replaceable systems are preferred to in-place updates
  4. Spot instances are preferable where appropriate
  5. High availability of core systems is non-negotiable

So how did I and my team deploy Prometheus, and what’s M3 got to do with it?
