Homelab

17-node Kubernetes cluster on five bare-metal Proxmox hosts, provisioned with Terraform and Ansible, managed through ArgoCD GitOps. Runs my home automation, media stack, photo backup, documents, and a few side projects.

k3s nodes ArgoCD Ansible


Architecture

graph TB
    subgraph ext[" External"]
        CF["Cloudflare CDN"]
        Admin["Remote Admin"]
    end

    subgraph vps["Edge VPS"]
        WG["WireGuard VPN Gateway"]
        TraefikVPS["Traefik"]
        Pangolin["Pangolin Tunnel Server"]
    end

    subgraph proxmox["Proxmox Cluster — 5 physical nodes"]
        subgraph cp["Control Plane x3  —  HA etcd + kube-vip"]
            S["k3s-server"]
        end
        subgraph workers["Worker Nodes x14"]
            W["k3s-agent"]
        end
        DH["docker-host — Intel QuickSync GPU"]
        NFS["NFS Server — dedicated storage node"]
    end

    subgraph k8s["Kubernetes"]
        subgraph platform["Platform"]
            direction LR
            MetalLB
            Traefik
            Longhorn
            ArgoCD
            Prometheus
            ECK["Elastic Stack"]
            Istio["Istio Ambient"]
        end
        subgraph apps["Applications"]
            direction LR
            Immich
            VW["Vaultwarden"]
            HA["Home Assistant"]
            Media["Arr Stack + Jellyfin"]
            Other["Paperless, N8n, Ntfy ..."]
        end
    end

    Admin -->|WireGuard VPN| WG
    WG -->|tunnel| k8s
    CF -->|Cloudflare tunnel| k8s
    TraefikVPS --> Pangolin
    Pangolin -->|Newt client| k8s
    cp --- workers
    workers --- Longhorn
    NFS -->|NFS mount| Media
    DH -->|Docker| Media

Hardware

Layer Host Role Resources
Physical aya01 Proxmox node + NFS server Dedicated storage — no VMs
Physical lulu Proxmox node k3s agents
Physical inko01 Proxmox node k3s server + agents + docker host
Physical naruto01 Proxmox node k3s server + agents
Physical mii01 Proxmox node k3s server + agents
VM k3s-server-{10,11,12} K3s control plane (HA etcd + kube-vip VIP) 2 vCPU · 4 GB RAM · 64 GB
VM k3s-agent-{10…23} K3s worker nodes ×14 2 vCPU · 4 GB RAM · 128 GB
VM docker-host11 Docker host w/ GPU passthrough 2 vCPU · 4 GB RAM · 192 GB · Intel QuickSync
VM docker-lb Caddy reverse proxy (LAN only) 1 vCPU · 2 GB RAM
VPS mii Edge node (Netcup) WireGuard · Traefik · Pangolin

All VMs run Debian 12 on virtio network bridges, provisioned from cloud-init templates via Terraform + Ansible.


Platform Stack

Component How deployed Purpose
ArgoCD Helm (App-of-Apps) GitOps CD — all cluster state driven from Git
ArgoCD Image Updater Helm Watches registries, commits updated image tags back to Git
kube-vip DaemonSet on control plane HA VIP for the K8s API server
Traefik k3s built-in Ingress controller, fronted by MetalLB
MetalLB Helm (ArgoCD) Bare-metal load balancer, assigns IPs from reserved pool
Cert-Manager Helm (ArgoCD) Automated TLS via Let's Encrypt DNS-01 (Cloudflare API)
Sealed Secrets Helm (ArgoCD) Encrypts secrets for safe storage in Git
Longhorn Helm (ArgoCD) Distributed block storage (RWO + RWX) across all 14 agents
CloudNativePG Operator (ArgoCD) HA PostgreSQL — used by Immich
Elastic Stack (ECK) Operator (ArgoCD) Elasticsearch + Kibana + Fleet + Elastic Agents for observability
Kube-Prometheus-Stack Helm (ArgoCD) Prometheus + Grafana monitoring
Goldilocks + VPA Helm (ArgoCD) Resource usage analysis and request/limit rightsizing
Istio (Ambient) Helm (ArgoCD) Service mesh — ztunnel DaemonSet on all nodes (L4); no Waypoint proxies yet
K3s Upgrade Controller Operator (ArgoCD) Automated rolling K3s version upgrades
mii-wireguard Manifest (ArgoCD) WireGuard pod — connects cluster to edge VPS, masquerades service CIDR
Newt Deployment (ArgoCD) Pangolin tunnel client for VPS-proxied services
Cloudflared Deployment ×2 (ArgoCD) Cloudflare tunnel — exposes selected services to the internet

Applications

Service Description Notable tech
Immich Photo & video backup (self-hosted Google Photos) CloudNativePG · Redis · ML pod
Vaultwarden Bitwarden-compatible password manager
Paperless-ngx Document management + OCR
Home Assistant Home automation hub
N8n Workflow automation
Ntfy Self-hosted push notifications
Stirling PDF PDF tools
Karakeep Bookmark manager
Gitea Self-hosted Git (source of truth for ArgoCD) Docker on docker-host11
Gitea Runner CI/CD runner
Zeroclaw Per-user instances (×3) via Kustomize overlays
Arr Stack Media automation suite Prowlarr · Sonarr · Radarr · Unpackarr
Download clients VPN-isolated download clients (×2) Gluetun sidecar
Jellyfin Media server with hardware transcoding Docker · Intel QuickSync

Design notes

Everything goes through Git. ArgoCD owns the cluster state; nothing gets kubectl apply'd directly. ArgoCD Image Updater handles the image update loop: when a new tag appears in the registry, it commits the change back to Git and ArgoCD picks it up from there.

Secrets are committed to Git too, encrypted via Sealed Secrets. Only the in-cluster controller holds the decryption key.

No ports are open on the home router. Internal load balancing goes through MetalLB + Traefik. External access uses Cloudflare tunnels or a WireGuard VPN routed through the edge VPS.

Longhorn handles block storage by replicating volumes across all 14 agent nodes. The media library lives on a dedicated NFS host instead — latency matters when Jellyfin is reading large video files, and NFS is simpler for that.

Metrics go to Prometheus + Grafana. Logs and fleet management go to Elastic Stack via the ECK operator, with Elastic Agents running as a DaemonSet so every node is covered.

All VMs are provisioned with Terraform and configured by Ansible. Rebuilding from scratch doesn't require remembering anything.


Repo layout

ansible-homelab/
├── roles/
│   ├── common/           # base OS config, SSH hardening, node-exporter
│   ├── k3s_server/       # control plane install + NoSchedule taint
│   ├── k3s_agent/        # worker node install
│   ├── kube_vip/         # kube-vip DaemonSet + TLS SAN config
│   ├── docker_host/      # Docker + Intel QuickSync GPU passthrough
│   ├── proxmox/          # Proxmox node setup
│   └── edge_vps/         # VPS: WireGuard, Traefik, Pangolin, Elastic Agent
└── playbooks/

argocd-homelab/
├── infrastructure/       # MetalLB, Longhorn, Cert-Manager, ECK, Istio, ...
├── services/             # Immich, Vaultwarden, arr-stack, Home Assistant, ...
└── cluster-apps/         # ArgoCD App-of-Apps root + ApplicationSets
Description
No description provided
Readme 47 KiB