Production-Grade Kubernetes on a Single Server: The Complete Guide
Why do this
One server. The full production stack. No excuses.
The idea is simple: build a Kubernetes cluster running the same stack you’d use on AKS or EKS — service mesh, automatic TLS, observability, layered security — but on your own hardware. This isn’t minikube. This isn’t a test lab. It’s an environment serving real workloads to real users.
Why a single node? Because for personal projects, side projects, and serious experimentation, a 15-30€/month server gives you more than enough. And the experience of operating this stack transfers directly to multi-node clusters in the cloud.
The end result: a cluster where deploying a new app means creating a namespace, labeling it for the service mesh, creating an HTTPRoute, and running kubectl apply. Automatic TLS, automatic DNS, metrics, logs, and security — it’s all already there.
The server and prerequisites
What you need:
- Server with Ubuntu 22.04+ — minimum 4 vCPUs and 8GB RAM. With the full stack running, base consumption is around 4-5GB RAM. If you’re running workloads on top, 16GB is more comfortable.
- cgroup v2 — Ubuntu 22.04+ has it enabled by default. Verify with stat -fc %T /sys/fs/cgroup (should return cgroup2fs).
- Domains on Cloudflare — needed for DNS-01 challenge (wildcard certs) and External DNS (automatic A record creation).
- Cloudflare API Token with Zone:DNS:Edit + Zone:Zone:Read permissions for all zones you’ll use.
- Tailscale installed — for secure API Server access.
The Kubernetes API Server (port 6443) must never be exposed to the internet. We use Tailscale as a VPN for external access, and nftables to filter by network interface. If someone reaches your 6443, they have the keys to the kingdom.
Base: kubeadm + containerd
Prepare the system
First, the kernel modules and network parameters that Kubernetes needs:
# Kernel modules
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
# Network parameters
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system
Install containerd
sudo apt-get update
sudo apt-get install -y containerd
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml
# Enable SystemdCgroup — critical for cgroup v2
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sudo systemctl restart containerd
sudo systemctl enable containerd
If you don’t enable SystemdCgroup = true, kubelet and containerd will use different cgroup drivers. The result: pods dying randomly, kubelet restarting, and hours of debugging. This is the most common mistake in kubeadm installations.
Install kubeadm, kubelet, kubectl
sudo apt-get install -y apt-transport-https ca-certificates curl gpg
sudo mkdir -p -m 755 /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.32/deb/Release.key | \
sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.32/deb/ /' | \
sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
Initialize the cluster
sudo kubeadm init \
--pod-network-cidr=10.244.0.0/16 \
--service-cidr=10.96.0.0/16 \
--skip-phases=addon/kube-proxy
We skip kube-proxy with --skip-phases=addon/kube-proxy because Calico in eBPF mode replaces it entirely. eBPF handles service load balancing directly in the kernel, without the iptables chains that kube-proxy generates. Better performance, less complexity.
After init, configure access and remove the master node taint:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Allow the master node to run workloads
kubectl taint nodes --all node-role.kubernetes.io/control-plane-
On a single-node cluster, this last step is mandatory. Without it, nothing gets scheduled on your only node.
CNI: Calico with eBPF
Why Calico eBPF
Calico with eBPF gives you three things that iptables mode doesn’t:
- Replaces kube-proxy — service load balancing in eBPF, no iptables
- Better performance — especially with many services, where iptables chains become linear
- Native network visibility — Calico can use eBPF data for policies and metrics
Install the Tigera Operator
helm repo add projectcalico https://docs.tigera.io/calico/charts
helm repo update
kubectl create namespace tigera-operator
helm install calico projectcalico/tigera-operator \
--version v3.29.2 \
--namespace tigera-operator
Here’s a real gotcha: you need to wait for the operator to register the CRDs before applying the Installation resource. If you apply the CR immediately, it fails silently or the resource gets stuck in an inconsistent state.
# Wait for CRDs to be registered
kubectl wait --for=condition=Established \
crd/installations.operator.tigera.io \
--timeout=120s
Apply the Installation CR with eBPF
# calico-installation.yaml
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
name: default
spec:
calicoNetwork:
ipPools:
- name: default-ipv4-ippool
blockSize: 26
cidr: 10.244.0.0/16
encapsulation: VXLANCrossSubnet
natOutgoing: Enabled
nodeSelector: all()
linuxDataplane: BPF
hostPorts: Enabled
cni:
type: Calico
controlPlaneReplicas: 1
---
apiVersion: operator.tigera.io/v1
kind: APIServer
metadata:
name: default
spec: {}
kubectl apply -f calico-installation.yaml
With linuxDataplane: BPF, Calico operates entirely in eBPF. If you had kube-proxy installed (which isn’t the case if you followed the steps above), you should patch its DaemonSet so it doesn’t get scheduled:
# Only if kube-proxy is installed
kubectl patch ds -n kube-system kube-proxy \
-p '{"spec":{"template":{"spec":{"nodeSelector":{"non-calico": "true"}}}}}'
Service Mesh: Istio Ambient Mode
Why ambient and not sidecar
Sidecar mode injects an Envoy proxy into every pod. It works, but has overhead: each pod consumes more memory, startup is slower, and there’s a whole category of bugs related to sidecar vs. application init order.
Ambient mode changes the model: a per-node ztunnel daemon handles mTLS and L4 automatically for all pods in enrolled namespaces. If you need L7 capabilities (retries, timeouts, circuit breaking, HTTP observability), you deploy a waypoint proxy shared per namespace. You only pay the L7 cost where you actually need it.
Install Istio
istioctl install --set profile=ambient -y
To enable ambient on an application namespace:
kubectl label namespace my-app istio.io/dataplane-mode=ambient
To add a shared waypoint proxy (L7):
istioctl waypoint apply --namespace my-app --enroll-namespace
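Once a waypoint is enrolled, L7 policy can attach to it. As a sketch (the namespace my-app, the waypoint name, and the gateway service account below are assumptions based on the examples in this guide, so verify the identities in your own cluster), an AuthorizationPolicy that only admits traffic arriving through the main gateway might look like:

```yaml
# Hypothetical example: restrict my-app's waypoint to traffic coming
# from the ingress gateway's identity. Names are assumptions.
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: allow-from-gateway
  namespace: my-app
spec:
  # Attach the policy to the namespace's waypoint proxy
  targetRefs:
    - kind: Gateway
      group: gateway.networking.k8s.io
      name: waypoint
  action: ALLOW
  rules:
    - from:
        - source:
            principals:
              # SPIFFE identity of the gateway deployment; check the
              # actual service account name with kubectl -n istio-system get sa
              - cluster.local/ns/istio-system/sa/main-gateway-istio
```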
Don’t apply ambient to infrastructure namespaces like kube-system, calico-system, tigera-operator, istio-system, or monitoring. The service mesh is for your application workloads. Infrastructure components have their own security mechanisms, and putting them in the mesh only introduces problems.
Ingress: Gateway API + cert-manager + MetalLB
This is the layer that makes your services accessible from the internet with automatic TLS. Four components working together.
MetalLB
In the cloud, a LoadBalancer type Service gets an external IP from the provider. On bare metal, you need MetalLB for that.
helm repo add metallb https://metallb.github.io/metallb
helm repo update
helm install metallb metallb/metallb \
--namespace metallb-system \
--create-namespace
Wait for pods to be ready, then configure the IP pool. On a single server, the pool is the server’s public IP:
# metallb-config.yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
name: default-pool
namespace: metallb-system
spec:
addresses:
- 203.0.113.10/32 # Your public IP
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
name: default
namespace: metallb-system
spec:
ipAddressPools:
- default-pool
cert-manager + ClusterIssuer
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install cert-manager jetstack/cert-manager \
--namespace cert-manager \
--create-namespace \
--set crds.enabled=true
The ClusterIssuer uses DNS-01 with Cloudflare to issue wildcard certificates:
# cluster-issuer.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt-prod
spec:
acme:
server: https://acme-v02.api.letsencrypt.org/directory
email: your-email@example.com
privateKeySecretRef:
name: letsencrypt-prod
solvers:
- dns01:
cloudflare:
apiTokenSecretRef:
name: cloudflare-api-token
key: api-token
Create the Secret with the Cloudflare token:
kubectl create secret generic cloudflare-api-token \
--namespace cert-manager \
--from-literal=api-token=YOUR_CLOUDFLARE_TOKEN
And the wildcard Certificate:
# wildcard-cert.yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: wildcard-darkden-net
namespace: istio-system
spec:
secretName: wildcard-darkden-net-tls
issuerRef:
name: letsencrypt-prod
kind: ClusterIssuer
dnsNames:
- "darkden.net"
- "*.darkden.net"
Istio Gateway
# gateway.yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
name: main-gateway
namespace: istio-system
annotations:
cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
gatewayClassName: istio
listeners:
- name: http
protocol: HTTP
port: 80
allowedRoutes:
namespaces:
from: All
- name: https
protocol: HTTPS
port: 443
tls:
mode: Terminate
certificateRefs:
- name: wildcard-darkden-net-tls
allowedRoutes:
namespaces:
from: All
HTTPRoute example
# httproute-example.yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
name: my-app
namespace: my-app
spec:
parentRefs:
- name: main-gateway
namespace: istio-system
sectionName: https
hostnames:
- "app.darkden.net"
rules:
- backendRefs:
- name: my-app-svc
port: 8080
External DNS
So DNS records get created automatically when you create an HTTPRoute:
helm repo add external-dns https://kubernetes-sigs.github.io/external-dns
helm repo update
helm install external-dns external-dns/external-dns \
--namespace external-dns \
--create-namespace \
--set provider.name=cloudflare \
--set "env[0].name=CF_API_TOKEN" \
--set "env[0].valueFrom.secretKeyRef.name=cloudflare-api-token" \
--set "env[0].valueFrom.secretKeyRef.key=api-token" \
--set "sources={gateway-httproute}" \
--set policy=sync \
--set registry=txt \
--set txtOwnerId=k8s-cluster
With this, every time you create an HTTPRoute with a hostname, External DNS creates the A record in Cloudflare automatically, and cleans it up when you delete the route. Note that the cloudflare-api-token Secret must also exist in the external-dns namespace — the one created earlier lives in cert-manager.
Data: Database operators
On a single server you can have multiple databases managed by Kubernetes operators. I won’t go into the configuration of each one — every operator deserves its own post — but the point is that the pattern works:
- CloudNativePG — Kubernetes-native PostgreSQL. Automated backups, failover (on multi-node), WAL archiving. For new applications, it’s the default choice.
- MariaDB Operator — for applications that need MySQL/MariaDB.
- Redis Operator (Spotahome) — for caching, sessions, and queues.
- Strimzi — Kafka for event-driven architectures. Resource-heavy, only if you really need it.
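To illustrate the pattern, a minimal CloudNativePG cluster is a single custom resource — the operator handles the rest. The names, namespace, and sizes below are placeholders, not part of the original setup:

```yaml
# Sketch: minimal single-instance PostgreSQL cluster via CloudNativePG.
# Namespace, database name, and storage size are illustrative assumptions.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: my-app-pg
  namespace: databases
spec:
  instances: 1          # one instance is enough on a single node
  storage:
    size: 5Gi
  bootstrap:
    initdb:
      database: myapp   # created on first boot
      owner: myapp
```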
$ kubectl get pods -A | grep -E 'cnpg|mariadb|redis'
cnpg-system   cnpg-controller-manager-5d7f9b4c8-x2k9l   1/1   Running
databases     my-app-pg-1                               1/1   Running
databases     my-app-mariadb-0                          1/1   Running
databases     redis-node-0                              1/1   Running
Observability: Grafana + Loki + Prometheus
kube-prometheus-stack
A single Helm chart that gives you Prometheus, Grafana, Alertmanager, node-exporter, and a bunch of preconfigured dashboards:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install kube-prometheus prometheus-community/kube-prometheus-stack \
--namespace monitoring \
--create-namespace \
--set grafana.adminPassword=YOUR_SECURE_PASSWORD \
--set prometheus.prometheusSpec.retention=30d \
--set prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage=50Gi
Grafana is exposed via HTTPRoute with authentication. Configure Google SSO (or your preferred provider) in Grafana’s values so you don’t rely on username/password.
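As a sketch of that SSO configuration (the client ID, secret, and domains below are placeholders; consult the Grafana documentation for your provider), the relevant Helm values look roughly like:

```yaml
# Hypothetical kube-prometheus-stack values fragment enabling Google SSO
# in Grafana. All credentials and domains are placeholders.
grafana:
  grafana.ini:
    server:
      root_url: https://grafana.darkden.net
    auth.google:
      enabled: true
      client_id: YOUR_CLIENT_ID
      client_secret: YOUR_CLIENT_SECRET
      scopes: openid email profile
      auth_url: https://accounts.google.com/o/oauth2/v2/auth
      token_url: https://oauth2.googleapis.com/token
      allowed_domains: example.com   # restrict sign-in to your domain
      allow_sign_up: true
```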
Loki + Promtail
For centralized logs:
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm install loki grafana/loki-stack \
--namespace monitoring \
--set promtail.enabled=true \
--set loki.persistence.enabled=true \
--set loki.persistence.size=20Gi
Recommended dashboards
- Node Exporter Full (ID 1860) — server metrics: CPU, RAM, disk, network
- Kubernetes / Pod Resources — consumption per pod and namespace
- Trivy Operator Vulnerabilities (ID 17813) — detected vulnerabilities in images
- Istio Mesh Dashboard — service mesh traffic, latencies, error rates
Security: Defense in depth
Security isn’t a component — it’s a stack of layers. Each one covers a different angle.
Trivy Operator
Automatic vulnerability scanning across all cluster images:
helm repo add aqua https://aquasecurity.github.io/helm-charts
helm repo update
helm install trivy-operator aqua/trivy-operator \
--namespace trivy-system \
--create-namespace \
--set trivy.ignoreUnfixed=true
Trivy creates VulnerabilityReport CRDs for each workload. You can query them with kubectl and visualize them in the Grafana dashboard.
OPA Gatekeeper
Policy enforcement to prevent dangerous configurations:
helm repo add gatekeeper https://open-policy-agent.github.io/gatekeeper/charts
helm repo update
helm install gatekeeper gatekeeper/gatekeeper \
--namespace gatekeeper-system \
--create-namespace
Constraint templates you should have from day one:
- Require resource limits — no pod without CPU and memory requests/limits
- Block latest tag — force immutable image tags
- Require labels — app.kubernetes.io/name and app.kubernetes.io/version mandatory
- Block privileged containers — nobody runs as root unless explicitly needed
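As one concrete example from that list, the required-labels rule can be expressed with Gatekeeper's canonical k8srequiredlabels template — this is a sketch adapted from the Gatekeeper documentation, scoped here to Deployments:

```yaml
# ConstraintTemplate + Constraint requiring standard labels on Deployments.
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8srequiredlabels
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredLabels
      validation:
        openAPIV3Schema:
          type: object
          properties:
            labels:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredlabels
        violation[{"msg": msg}] {
          provided := {l | input.review.object.metadata.labels[l]}
          required := {l | l := input.parameters.labels[_]}
          missing := required - provided
          count(missing) > 0
          msg := sprintf("missing required labels: %v", [missing])
        }
---
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: require-app-labels
spec:
  match:
    kinds:
      - apiGroups: ["apps"]
        kinds: ["Deployment"]
  parameters:
    labels:
      - app.kubernetes.io/name
      - app.kubernetes.io/version
```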
Falco
Runtime threat detection. Falco monitors syscalls and generates alerts when it detects suspicious behavior (shell in container, secret reads, writes to sensitive directories):
helm repo add falcosecurity https://falcosecurity.github.io/charts
helm repo update
helm install falco falcosecurity/falco \
--namespace falco \
--create-namespace \
--set falcosidekick.enabled=true \
--set falcosidekick.webui.enabled=true
The Falco Sidekick UI should never be exposed to the internet. Access it only through Tailscale or port-forward. It contains sensitive information about your cluster’s internal behavior.
Network Policies
Default-deny on each application namespace, with explicit allows:
# default-deny.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: default-deny-all
namespace: my-app
spec:
podSelector: {}
policyTypes:
- Ingress
- Egress
---
# allow-dns.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-dns
namespace: my-app
spec:
podSelector: {}
policyTypes:
- Egress
egress:
- to: []
ports:
- protocol: UDP
port: 53
- protocol: TCP
port: 53
---
# allow-istio.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-istio
namespace: my-app
spec:
podSelector: {}
policyTypes:
- Ingress
- Egress
ingress:
- from:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: istio-system
egress:
- to:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: istio-system
nftables to protect the API Server
Don’t use iptables with source IP filtering if you have Calico eBPF. eBPF processing modifies source IPs via DNAT before they reach iptables rules, so your source IP filter rules will never match correctly. Filter by network interface with nftables.
# /etc/nftables.conf
table inet filter {
chain input {
type filter hook input priority filter; policy accept;
# API Server: only accessible from localhost and Tailscale
tcp dport 6443 iifname "lo" accept
tcp dport 6443 iifname "tailscale0" accept
tcp dport 6443 drop
}
}
sudo systemctl enable nftables
sudo systemctl start nftables
With this, the API Server only accepts connections from the server itself (localhost) and from Tailscale VPN. Everything else is dropped.
Management: KEDA, Reloader, ESO
Three tools that simplify day-to-day operations:
KEDA (Kubernetes Event-Driven Autoscaling) — scales pods based on custom metrics, queue length, request rate, or any data source. More flexible than native HPA.
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda \
--namespace keda \
--create-namespace
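A minimal example of what KEDA looks like in practice — the Prometheus service address and the metric query below are assumptions (the service name depends on your Helm release; check with kubectl get svc -n monitoring):

```yaml
# Hypothetical ScaledObject scaling my-app on request rate from Prometheus.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-app
  namespace: my-app
spec:
  scaleTargetRef:
    name: my-app        # Deployment to scale
  minReplicaCount: 1
  maxReplicaCount: 5
  triggers:
    - type: prometheus
      metadata:
        # Placeholder service address; verify the name created by
        # your kube-prometheus-stack release
        serverAddress: http://kube-prometheus-kube-prome-prometheus.monitoring.svc:9090
        query: sum(rate(http_requests_total{namespace="my-app"}[2m]))
        threshold: "100"
```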
Reloader — monitors ConfigMaps and Secrets. When they change, it automatically triggers rolling restarts of the Deployments that reference them. Without Reloader, changing a ConfigMap requires manually restarting pods.
helm repo add stakater https://stakater.github.io/stakater-charts
helm repo update
helm install reloader stakater/reloader \
--namespace reloader \
--create-namespace
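Opting a workload in is a single annotation on the Deployment (the Deployment name here is illustrative):

```yaml
# Reloader watches this Deployment's referenced ConfigMaps and Secrets
# and triggers a rolling restart when any of them change.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: my-app
  annotations:
    reloader.stakater.com/auto: "true"
```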
External Secrets Operator — syncs secrets from external backends (Vault, Infisical, AWS Secrets Manager) into Kubernetes Secrets. Ready for when you want to centralize secret management outside the cluster.
helm repo add external-secrets https://charts.external-secrets.io
helm repo update
helm install external-secrets external-secrets/external-secrets \
--namespace external-secrets \
--create-namespace
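Once a backend is wired up through a SecretStore, consuming a secret looks roughly like this sketch — the store name, remote key, and property are placeholders you would define for your own backend:

```yaml
# Hypothetical ExternalSecret pulling one value from an external backend
# into a native Kubernetes Secret. Store and key names are placeholders.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: my-app-db-credentials
  namespace: my-app
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: my-secret-store       # a ClusterSecretStore you define separately
    kind: ClusterSecretStore
  target:
    name: my-app-db-credentials # resulting Kubernetes Secret
  data:
    - secretKey: password
      remoteRef:
        key: my-app/db          # path in the external backend
        property: password
```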
The final result
With everything installed, here’s what you have running:
| Layer | Components |
|---|---|
| Runtime | containerd, kubelet, kubeadm |
| Networking | Calico eBPF (CNI + kube-proxy replacement) |
| Service Mesh | Istio ambient (ztunnel + waypoints) |
| Ingress | Gateway API + MetalLB + External DNS |
| TLS | cert-manager + Let’s Encrypt + Cloudflare DNS-01 |
| Data | CloudNativePG, MariaDB Operator, Redis Operator |
| Observability | Prometheus, Grafana, Alertmanager, Loki, Promtail |
| Security | Trivy, Gatekeeper, Falco, Network Policies, nftables |
| Management | KEDA, Reloader, External Secrets Operator |
$ kubectl get pods -A --field-selector status.phase=Running -o json | jq '.items | length'
47
47 pods running on a single server. The infrastructure stack’s base consumption is around 4-5GB RAM and 1.5-2 vCPUs. The rest is available for your workloads.
Deploying a new app is:
# Create namespace and enable it for the service mesh
kubectl create namespace my-new-app
kubectl label namespace my-new-app istio.io/dataplane-mode=ambient
# Deploy waypoint if you need L7
istioctl waypoint apply --namespace my-new-app --enroll-namespace
# Apply manifests (Deployment, Service, HTTPRoute)
kubectl apply -f my-new-app/
DNS, TLS, metrics, logs, network policies, vulnerability scanning — it’s all already there waiting.
Conclusion
This setup isn’t a toy homelab. It’s the same architectural pattern you’d use in a production cloud environment, adapted to a single node. The cost difference is massive: 15-30€/month for a dedicated server vs. hundreds of euros for a managed cluster running the same stack.
The gotchas are in the details that don’t appear in the official documentation:
- CRDs that don’t register in time and cause silent failures in operators
- eBPF breaking source IP filtering in iptables
- Istio init containers interfering with database operators that need to initialize the filesystem before the network is available
- containerd without SystemdCgroup causing random pod restarts
Each of these can cost you hours of debugging if you don’t know about them beforehand. Now you do.