
Kubernetes HA Cluster Setup

Running a single master node in production poses a serious risk: if it fails, the API server goes down with it and you can no longer manage the cluster. In this article, I'll explain step-by-step how to set up a High Availability (HA) Kubernetes cluster.

HA Cluster Architecture

Components we'll use:

  • 3 Master Nodes (Control Plane)
  • 4 Worker Nodes
  • HAProxy Load Balancer
  • etcd Cluster (3 members, stacked on the masters)

We use three masters rather than two because kubeadm's default topology stacks an etcd member on each control-plane node, and etcd needs a majority to accept writes: three members tolerate the loss of one, while a two-member cluster loses quorum as soon as either node goes down.

Prerequisites

Minimum requirements for each node:

  • 2 CPUs
  • 4GB RAM
  • 20GB Disk
  • Ubuntu 22.04 LTS
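
Before installing anything, every node needs the standard kubeadm prerequisites: swap disabled, the overlay and br_netfilter kernel modules loaded, and the bridge/forwarding sysctls set. A minimal prep script covering those documented steps:

# Run on every node (masters and workers)
# kubelet refuses to start while swap is enabled
sudo swapoff -a
sudo sed -i '/ swap / s/^/#/' /etc/fstab

# Load the kernel modules containerd and kube-proxy rely on
cat << 'EOF' | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter

# Networking sysctls checked by kubeadm preflight
cat << 'EOF' | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF
sudo sysctl --system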

HAProxy Installation

First, we'll place HAProxy in front of the master nodes. Note that a single HAProxy instance is itself a single point of failure; in a full production setup you would typically run two HAProxy nodes with keepalived and a floating virtual IP, but one instance keeps this walkthrough simple.

# Install HAProxy
sudo apt-get update
sudo apt-get install -y haproxy

# HAProxy configuration (sudo tee is used because "sudo cat > file"
# would run the redirection as the unprivileged user and fail)
sudo tee /etc/haproxy/haproxy.cfg > /dev/null << 'HAPROXY'
global
    log /dev/log local0
    log /dev/log local1 notice
    daemon

defaults
    log global
    mode tcp
    option tcplog
    timeout connect 5000
    timeout client 50000
    timeout server 50000

frontend kubernetes-apiserver
    bind *:6443
    mode tcp
    option tcplog
    default_backend kubernetes-master

backend kubernetes-master
    mode tcp
    balance roundrobin
    option tcp-check
    server master1 192.168.1.10:6443 check
    server master2 192.168.1.11:6443 check
    server master3 192.168.1.12:6443 check

listen stats
    bind *:9000
    mode http
    stats enable
    stats uri /stats
HAPROXY

# Restart HAProxy
sudo systemctl restart haproxy
sudo systemctl enable haproxy
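
It's worth validating the configuration and confirming the listener before pointing anything at it:

# Validate the configuration file
sudo haproxy -c -f /etc/haproxy/haproxy.cfg

# Confirm HAProxy is listening on the API port
sudo ss -tlnp | grep 6443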

Master Node Setup

Let's set up the first master node:

# On every node: install the container runtime (containerd)
sudo apt-get install -y containerd

# Configure containerd with the systemd cgroup driver
# (the default config ships SystemdCgroup = false, which
# causes kubelet instability on Ubuntu 22.04)
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sudo systemctl restart containerd
sudo systemctl enable containerd

# Add the Kubernetes apt repository (the packages are not in
# Ubuntu's default repos; adjust v1.30 to the minor version you want)
sudo apt-get install -y apt-transport-https ca-certificates curl gpg
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | \
  sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /' | \
  sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update

# Install Kubernetes packages and pin their versions
sudo apt-get install -y kubeadm kubelet kubectl
sudo apt-mark hold kubeadm kubelet kubectl

# Initialize the first master node
# (replace haproxy-ip with your load balancer's address or DNS name)
sudo kubeadm init \
  --control-plane-endpoint="haproxy-ip:6443" \
  --upload-certs \
  --pod-network-cidr=10.244.0.0/16
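
When kubeadm init finishes, it prints the join commands used in the next two sections. To run kubectl as a regular user, copy the admin kubeconfig exactly as kubeadm's own output suggests:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config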

Adding the Remaining Master Nodes

# Run on master2 and master3, using the values printed by kubeadm init
sudo kubeadm join haproxy-ip:6443 \
  --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash> \
  --control-plane \
  --certificate-key <cert-key>
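
The bootstrap token expires after 24 hours and the uploaded certificates after two, so if you join a node later, regenerate them on an existing master:

# Print a fresh worker join command (creates a new token)
kubeadm token create --print-join-command

# Re-upload the control-plane certs and print a new certificate key
sudo kubeadm init phase upload-certs --upload-certs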

Adding Worker Nodes

# Run on each worker node
sudo kubeadm join haproxy-ip:6443 \
  --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash>
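
After the workers have joined, verify them from a master. The nodes will report NotReady until the network plugin is installed in the next step:

kubectl get nodes -o wide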

Network Plugin (Calico)

# Install Calico; the manifest now lives on GitHub, so pin a release version.
# If your pod CIDR differs from Calico's default, uncomment and set
# CALICO_IPV4POOL_CIDR in the manifest to match 10.244.0.0/16.
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.27.0/manifests/calico.yaml
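
Once the Calico DaemonSet finishes rolling out, the nodes should flip to Ready:

kubectl -n kube-system rollout status daemonset/calico-node
kubectl get nodes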

etcd Cluster Health

# Check etcd cluster status
kubectl exec -it etcd-master1 -n kube-system -- \
  etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  member list

# etcd health check (the same TLS flags are required here too)
kubectl exec -it etcd-master1 -n kube-system -- \
  etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  endpoint health
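
To see the leader, database size, and raft term for every member in one table, use endpoint status with the --cluster flag, which queries all members discovered from the given endpoint:

kubectl exec -it etcd-master1 -n kube-system -- \
  etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  endpoint status --cluster -w table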

Load Balancer Test

# HAProxy stats page
curl http://haproxy-ip:9000/stats

# API server access test
kubectl get nodes

# Shut down one master node and test again
kubectl get nodes

Failover Test

Simulate a master node failure. Run the systemctl commands on master1; stopping kubelet alone is not enough, because the static API-server pod keeps running under containerd:

# Stop Master1's kubelet and container runtime
sudo systemctl stop kubelet containerd

# From a machine pointing at HAProxy, the API server should still
# respond via the surviving masters
kubectl get pods -A

# Bring Master1 back
sudo systemctl start containerd kubelet
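
You can watch the cluster notice the failure: after the node-monitor grace period (40 seconds by default) master1 flips to NotReady, while the API keeps answering through the other masters:

kubectl get nodes -w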

Monitoring

# Install Prometheus and Grafana
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/kube-prometheus-stack -n monitoring --create-namespace
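
With a release named prometheus, the kube-prometheus-stack chart creates a Grafana service called prometheus-grafana; a port-forward is the quickest way to reach the dashboards (the chart ships a default admin login unless you override it in values):

kubectl -n monitoring port-forward svc/prometheus-grafana 3000:80
# Then browse to http://localhost:3000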

Backup and Restore

# etcd backup
ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

# etcd restore (disaster recovery) - restores into a fresh data
# directory; to use it, point etcd at the new --data-dir (for a
# kubeadm stacked etcd, edit the hostPath in
# /etc/kubernetes/manifests/etcd.yaml)
ETCDCTL_API=3 etcdctl snapshot restore /backup/etcd-snapshot.db \
  --data-dir=/var/lib/etcd-restore
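
Backups only help if they run on a schedule. Here is a minimal sketch of a cron-driven snapshot script, assuming /backup exists on a master node that has the etcd client certs (the script path and the 7-day retention are arbitrary choices):

#!/bin/bash
# /usr/local/bin/etcd-backup.sh - snapshot etcd and prune old copies
set -euo pipefail
SNAP="/backup/etcd-snapshot-$(date +%Y%m%d-%H%M%S).db"
ETCDCTL_API=3 etcdctl snapshot save "$SNAP" \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key
# Keep the last 7 days of snapshots
find /backup -name 'etcd-snapshot-*.db' -mtime +7 -delete

# Schedule it nightly at 02:00 in root's crontab:
# 0 2 * * * /usr/local/bin/etcd-backup.sh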

Conclusion

With HA Kubernetes cluster:

  • ✅ No single point of failure in the control plane
  • ✅ Automatic failover
  • ✅ Scalable infrastructure
  • ✅ Production-ready

"High Availability is not optional for production systems - it's a requirement."
