# Homelab
A production-grade homelab running on bare-metal Proxmox, with a 17-node Kubernetes cluster managed entirely through GitOps.
## Architecture

```mermaid
graph TB
    subgraph ext[" External"]
        CF["Cloudflare\nCDN + DNS"]
        Admin["Remote Admin"]
    end
    subgraph vps["Edge VPS"]
        WG["WireGuard\nVPN Gateway"]
        TraefikVPS["Traefik\nReverse Proxy"]
        Pangolin["Pangolin\nTunnel Server"]
    end
    subgraph proxmox["Proxmox Cluster — 5 physical nodes"]
        subgraph cp["Control Plane ×3 (HA etcd)"]
            S["k3s-server"]
        end
        LB["nginx\nLoad Balancer"]
        subgraph workers["Worker Nodes ×14"]
            W["k3s-agent"]
        end
        DH["docker-host\nIntel QuickSync GPU"]
        NFS["NFS Server\nDedicated storage node"]
    end
    subgraph k8s["Kubernetes"]
        subgraph platform["Platform layer"]
            direction LR
            MetalLB["MetalLB"]
            Traefik["Traefik"]
            Longhorn["Longhorn"]
            ArgoCD["ArgoCD"]
            Prometheus["Prometheus\n+ Grafana"]
            ECK["Elastic Stack\n(ECK)"]
            Istio["Istio\nAmbient"]
        end
        subgraph apps["Applications"]
            direction LR
            Immich["Immich"]
            VW["Vaultwarden"]
            HA["Home Assistant"]
            Media["Arr Stack\n+ Jellyfin"]
            Other["Paperless · N8n\nNtfy · Gitea · …"]
        end
    end
    Admin -->|WireGuard VPN| WG
    WG -->|tunnel| k8s
    CF -->|Cloudflare tunnel| k8s
    TraefikVPS --> Pangolin
    Pangolin -->|Newt client| k8s
    LB --> cp
    cp --- workers
    workers --- Longhorn
    NFS -->|NFS mount| Media
    DH -->|Jellyfin\nDocker| Media
```
## Hardware

| Layer | Host | Role | Resources |
|---|---|---|---|
| Physical | aya01 | Proxmox node + NFS server | Dedicated storage — no VMs |
| Physical | lulu | Proxmox node | k3s agents |
| Physical | inko01 | Proxmox node | k3s server + agents + docker host |
| Physical | naruto01 | Proxmox node | k3s server + agents + LB |
| Physical | mii01 | Proxmox node | k3s server + agents |
| VM | k3s-server-{10,11,12} | K3s control plane (HA etcd) | 2 vCPU · 4 GB RAM · 64 GB |
| VM | k3s-agent-{10…23} | K3s worker nodes ×14 | 2 vCPU · 4 GB RAM · 128 GB |
| VM | docker-host11 | Docker host w/ GPU passthrough | 2 vCPU · 4 GB RAM · 192 GB · Intel QuickSync |
| VM | k3s-loadbalancer | nginx LB fronting control plane | 1 vCPU · 2 GB RAM |
| VM | docker-lb | Caddy reverse proxy (LAN only) | 1 vCPU · 2 GB RAM |
| VPS | mii | Edge node (Netcup) | WireGuard · Traefik · Pangolin |
All VMs run Debian 12 on virtio network bridges, provisioned from cloud-init templates via Terraform + Ansible.
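As an illustration of the Terraform side, cloning a worker VM from a cloud-init template might look like the sketch below. This assumes the community `telmate/proxmox` provider; the resource name, target node, and template name are hypothetical, and the actual repo may use a different provider or values.

```hcl
# Hypothetical sketch — provider, node, and template names are illustrative.
resource "proxmox_vm_qemu" "k3s_agent" {
  count       = 14
  name        = "k3s-agent-${count.index + 10}"
  target_node = "lulu"                  # Proxmox node to place the VM on
  clone       = "debian-12-cloudinit"   # cloud-init template to clone from
  cores       = 2
  memory      = 4096
  os_type     = "cloud-init"
  ipconfig0   = "ip=dhcp"
}
```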
## Platform Stack
| Component | How deployed | Purpose |
|---|---|---|
| ArgoCD | Helm (App-of-Apps) | GitOps CD — all cluster state driven from Git |
| ArgoCD Image Updater | Helm | Watches registries, commits updated image tags back to Git |
| Traefik | k3s built-in | Ingress controller, fronted by MetalLB |
| MetalLB | Helm (ArgoCD) | Bare-metal load balancer, assigns IPs from reserved pool |
| Cert-Manager | Helm (ArgoCD) | Automated TLS via Let's Encrypt DNS-01 (Cloudflare API) |
| Sealed Secrets | Helm (ArgoCD) | Encrypts secrets for safe storage in Git |
| Longhorn | Helm (ArgoCD) | Distributed block storage (RWO + RWX) across all 14 agents |
| CloudNativePG | Operator (ArgoCD) | HA PostgreSQL — used by Immich |
| Elastic Stack (ECK) | Operator (ArgoCD) | Elasticsearch + Kibana + Fleet + Elastic Agents for observability |
| Kube-Prometheus-Stack | Helm (ArgoCD) | Prometheus + Grafana monitoring |
| Goldilocks + VPA | Helm (ArgoCD) | Resource usage analysis and request/limit rightsizing |
| Istio (Ambient) | Helm (ArgoCD) | Service mesh — ztunnel DaemonSet on all nodes (L4); no Waypoint proxies yet |
| K3s Upgrade Controller | Operator (ArgoCD) | Automated rolling K3s version upgrades |
| mii-wireguard | Manifest (ArgoCD) | WireGuard pod — connects cluster to edge VPS, masquerades service CIDR |
| Newt | Deployment (ArgoCD) | Pangolin tunnel client for VPS-proxied services |
| Cloudflared | Deployment ×2 (ArgoCD) | Cloudflare tunnel — exposes selected services to the internet |
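The Image Updater row above works through annotations on each ArgoCD Application; a minimal sketch, where the Application name, image alias, and registry path are placeholders (the annotation keys themselves are the upstream ones):

```yaml
# Illustrative only — app name and image are placeholders.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: example-app
  annotations:
    # Watch this image and allow semver-compatible tag updates
    argocd-image-updater.argoproj.io/image-list: app=ghcr.io/example/app:~1.0
    # Commit the new tag back to Git instead of patching live state
    argocd-image-updater.argoproj.io/write-back-method: git
```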
## Applications
| Service | Description | Notable tech |
|---|---|---|
| Immich | Photo & video backup (self-hosted Google Photos) | CloudNativePG · Redis · ML pod |
| Vaultwarden | Bitwarden-compatible password manager | – |
| Paperless-ngx | Document management + OCR | – |
| Home Assistant | Home automation hub | – |
| N8n | Workflow automation | – |
| Ntfy | Self-hosted push notifications | – |
| Stirling PDF | PDF tools | – |
| Karakeep | Bookmark manager | – |
| Gitea | Self-hosted Git (source of truth for ArgoCD) | Docker on docker-host11 |
| Gitea Runner | CI/CD runner | – |
| Zeroclaw | Per-user instances (×3) via Kustomize overlays | – |
| Arr Stack | Media automation suite | Prowlarr · Sonarr · Radarr · Unpackarr |
| qBittorrent | Torrent clients (×2) | Gluetun VPN sidecar · ProtonVPN |
| Jellyfin | Media server with hardware transcoding | Docker · Intel QuickSync |
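The qBittorrent row pairs the client with a Gluetun sidecar so all torrent traffic exits through the VPN. A minimal sketch of that shape, with placeholder names (the ProtonVPN credentials and any kill-switch settings are omitted):

```yaml
# Sketch of a Gluetun VPN sidecar — names and values are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: qbittorrent
spec:
  containers:
    - name: gluetun
      image: qmcgaw/gluetun
      securityContext:
        capabilities:
          add: ["NET_ADMIN"]   # required to create the tunnel interface
      env:
        - name: VPN_SERVICE_PROVIDER
          value: protonvpn
    - name: qbittorrent
      image: linuxserver/qbittorrent
      # Shares the pod network namespace, so its traffic rides the tunnel
```

Because containers in a pod share one network namespace, qBittorrent needs no VPN awareness of its own; if the tunnel drops, its traffic has nowhere else to go.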
## Key Design Decisions
**GitOps end-to-end.** Every cluster resource is declared in Git and applied by ArgoCD; nothing is `kubectl apply`'d by hand. ArgoCD Image Updater closes the loop by writing image-tag updates back to Git automatically.
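The usual App-of-Apps shape is a single root Application pointing at a directory of child Applications. A sketch with a hypothetical repo URL (the real layout lives under `argocd-homelab/cluster-apps/`):

```yaml
# App-of-Apps root — repoURL is illustrative.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://gitea.example.lan/homelab/argocd-homelab.git
    targetRevision: main
    path: cluster-apps
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: true      # delete resources removed from Git
      selfHeal: true   # revert manual drift back to the Git state
```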
**Secrets in Git, safely.** Sealed Secrets lets encrypted `SealedSecret` manifests live in the same repo as everything else. Only the in-cluster controller can decrypt them.
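The workflow is: write a normal Secret locally, pipe it through `kubeseal`, and commit only the output. A sketch of the committed result, with placeholder names and a truncated ciphertext:

```yaml
# Produced by: kubeseal --format yaml < secret.yaml > sealed-secret.yaml
# Safe to commit — only the in-cluster controller holds the private key.
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: vaultwarden-admin      # placeholder name
  namespace: vaultwarden
spec:
  encryptedData:
    ADMIN_TOKEN: AgBy3i4OJSWK...   # asymmetrically encrypted blob (truncated)
  template:
    metadata:
      name: vaultwarden-admin      # the plain Secret the controller creates
```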
**No cloud dependency for ingress.** MetalLB + Traefik handle all internal load balancing. External access goes through Cloudflare tunnels or a WireGuard VPN — no ports open on the home router.
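MetalLB's side of this is an address pool plus an L2 advertisement; Traefik's LoadBalancer Service then receives an IP from the pool. A sketch where the CIDR stands in for the reserved LAN range:

```yaml
# Address pool sketch — the range is a placeholder for the reserved LAN pool.
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: default-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.1.240-192.168.1.250
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: default
  namespace: metallb-system
spec:
  ipAddressPools:
    - default-pool   # answer ARP for IPs from this pool on the LAN
```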
**Distributed storage without a SAN.** Longhorn replicates volumes across all 14 agent nodes. NFS on a dedicated bare-metal host serves the media library to Jellyfin with low latency.
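A Longhorn StorageClass is where the replication factor is set; the sketch below is illustrative, and the replica count is an assumption rather than the repo's actual value:

```yaml
# Illustrative StorageClass — the replica count is an assumption.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-replicated
provisioner: driver.longhorn.io
parameters:
  numberOfReplicas: "3"        # copies spread across agent nodes
  staleReplicaTimeout: "30"    # minutes before a failed replica is rebuilt
allowVolumeExpansion: true
```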
**Observability from day one.** Prometheus + Grafana for metrics, Elastic Stack (via the ECK operator) for logs and fleet management. Elastic Agents run as a DaemonSet across the whole cluster.
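With kube-prometheus-stack, apps typically opt into scraping via a ServiceMonitor; a hypothetical example (app name, namespace, and port name are placeholders):

```yaml
# Hypothetical ServiceMonitor — Prometheus scrapes Services matching the selector.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
    - port: metrics    # named port on the target Service
      interval: 30s
```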
**Provisioning is reproducible.** Proxmox VMs are created via Terraform (Proxmox provider), then configured by Ansible roles — from base OS hardening to k3s installation and kubeconfig management.
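On the Ansible side, a per-group playbook composing the roles from the repository layout might look like this minimal sketch (the group name and task details are illustrative; the role names match the `ansible-homelab/roles/` directory):

```yaml
# Sketch of a per-group playbook — group name is illustrative.
- name: Provision k3s worker nodes
  hosts: k3s_agents
  become: true
  roles:
    - common       # base OS config, SSH hardening, node-exporter
    - k3s_agent    # install k3s and join the node to the cluster
```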
## Repository Layout

```
ansible-homelab/          # Ansible roles + playbooks for all VM provisioning
├── roles/
│   ├── common/           # Base OS config, SSH hardening, node-exporter
│   ├── k3s_server/       # HA control plane install + taint config
│   ├── k3s_agent/        # Worker node install
│   ├── k3s_loadbalancer/ # nginx LB config
│   ├── kube_vip/         # VIP setup
│   ├── docker_host/      # Docker + GPU passthrough
│   ├── proxmox/          # Proxmox node config
│   └── edge_vps/         # VPS services (WireGuard, Traefik, Pangolin)
└── playbooks/            # Top-level playbooks per host group

argocd-homelab/           # All Kubernetes manifests (ArgoCD App-of-Apps)
├── infrastructure/       # Platform: MetalLB, Longhorn, Cert-Manager, ECK, …
├── services/             # Applications: Immich, Vaultwarden, arr-stack, …
└── cluster-apps/         # ArgoCD ApplicationSets + root app
```