Blog

Field notes from production. Deep dives into Kubernetes, service mesh debugging, cloud-native infrastructure, and the pursuit of fully autonomous development workflows.

review.lens(correctness | security | perf | concurrency)
verify.independent(n=3).vote() >= majority
severity.classify(CVSS): Critical | High | Med | Low
assume_hostile(input) && hunt(silent_failures)

security code-review adversarial audit reliability

Audit Your Own Code As If You Were the Attacker

Reviewing your own code to confirm that it's fine finds typos, not design flaws. Adversarial review flips the goal: it assumes the code is broken and tries to prove it. Multiple lenses, independent verification, and why 'zero criticals' isn't security.

2026-06-04 12 min

browser --httpOnly cookie--> BFF proxy
BFF --Bearer (server-side)--> API
refresh.onError(401).singleFlight()
oidc.callback: code -> cookies (no token in URL)

authentication security bff nextjs jwt web

The JWT Should Never Touch the Browser: the BFF Pattern

Stashing the access token in localStorage hands it to the first XSS that comes along. The Backend-for-Frontend (BFF) pattern keeps the JWT on the server and leaves only an httpOnly cookie in the browser, which JavaScript can't read. How it works, step by step.

2026-06-04 12 min

loop.detect(identical | alternating | target-repeat)
monitor.heartbeat(turn) => stale ? kill : continue
gate.require(toolCalls > 0) else FAILED
retry.classify(error) => retry | escalate | abort

ai-agents llm reliability production observability

The 4 Failure Modes of AI Agents in Production (and How to Mitigate Them)

Leave an agent loop running unsupervised and the runaway token bill and corrupted state show up on their own. The four failures that recur in every agentic system (loops, stuck turns, no-op turns, and misclassified errors) and the engineering patterns that contain them.

2026-06-04 14 min

task.state: ToDo -> InProgress -> InReview -> Done
review.gate(QA) && review.gate(Security)
transition.allow(Done) iff gates.passed && merged
else escalate(lead) -> escalate(human)

ai-agents orchestration governance workflow reliability

Governing AI Agents: Why 'Done' Has to Be Earned

An autonomous agent that approves its own work isn't autonomy, it's a time bomb. How to govern teams of agents with separation of duties, independent review gates, and a 'done' invariant that nobody gets to skip.

2026-06-04 12 min

kubeadm init --pod-network-cidr=10.244.0.0/16
kubectl apply -f calico-ebpf.yaml
istioctl install --set profile=ambient
helm install cert-manager jetstack/cert-manager

kubernetes k8s calico istio devops homelab

Production-Grade Kubernetes on a Single Server: The Complete Guide

How to build a full Kubernetes cluster on a single server with kubeadm, Calico eBPF, Istio ambient, cert-manager, Grafana, Loki, and the full security stack. No shortcuts.

2026-03-14 18 min

pveceph install
pveceph osd create /dev/sda
ceph osd pool create ssd-pool 64
ceph osd crush rule create-replicated ssd-rule default host ssd

proxmox ceph homelab storage backup infrastructure

Proxmox + Ceph + PBS: Complete Guide for a Production-Ready Homelab

How to set up Proxmox with Ceph using separate SSD and HDD pools, CRUSH rules by device class, and Proxmox Backup Server in a VM. From installation to a setup that actually works.

2026-03-14 20 min