add networking, storage, and observability docs
59
docs/storage.md
Normal file
@@ -0,0 +1,59 @@
# Storage

## Overview

Three storage tiers serve different workloads:

| Tier | System | Access | Used by |
|------|--------|--------|---------|
| Distributed block | Longhorn | RWO + RWX | All stateful K8s workloads |
| Relational | CloudNativePG | In-cluster Postgres | Immich |
| Network file | NFS (bare-metal) | NFS mount | Jellyfin media library |

---

## Longhorn

Longhorn provides distributed block storage across all 14 agent nodes. Each volume is replicated across different nodes (default: 3 replicas).

- **RWO** (ReadWriteOnce): used for most services (Vaultwarden, Paperless, etc.)
- **RWX** (ReadWriteMany): used where multiple pods need shared access
- Volumes are backed by the local disk on each agent node (128 GB each)
- Longhorn manager runs as a DaemonSet; the CSI plugin integrates with the K8s storage layer
- Snapshots and backups are supported via the Longhorn UI

Control plane nodes (`k3s-server-*`) are tainted `NoSchedule`. Longhorn manager tolerates this taint and runs everywhere, but user workloads are scheduled on agent nodes only.

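The node count, disk size, and replica count above imply a rough ceiling on how much unique data the cluster can hold. The sketch below works through that arithmetic. It assumes, purely for illustration, that each node's full 128 GB disk is available to Longhorn; in practice the OS and Longhorn's reserved space reduce this, so the result (roughly 597 GB) is an upper bound rather than a measured figure. The function names are hypothetical.

```python
# Back-of-the-envelope Longhorn capacity from the figures in this section.
AGENT_NODES = 14        # agent nodes providing Longhorn storage
DISK_GB_PER_NODE = 128  # local disk per agent node
REPLICAS = 3            # default replica count per volume


def raw_capacity_gb(nodes: int, disk_gb: int) -> int:
    """Total disk across all nodes, before replication."""
    return nodes * disk_gb


def usable_capacity_gb(nodes: int, disk_gb: int, replicas: int) -> float:
    """Upper bound on unique data: every byte is stored `replicas` times."""
    return raw_capacity_gb(nodes, disk_gb) / replicas


if __name__ == "__main__":
    raw = raw_capacity_gb(AGENT_NODES, DISK_GB_PER_NODE)
    usable = usable_capacity_gb(AGENT_NODES, DISK_GB_PER_NODE, REPLICAS)
    print(f"raw: {raw} GB, usable at {REPLICAS} replicas: about {usable:.0f} GB")
```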
---

## CloudNativePG

The CloudNativePG (CNPG) operator manages highly available PostgreSQL clusters as first-class Kubernetes resources. Currently used by:

- **Immich**: primary database (photos, albums, users, ML embeddings)

CNPG handles streaming replication, failover, and scheduled backups. Data is stored on Longhorn PVCs.

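To make "first-class Kubernetes resource" concrete, here is a minimal sketch that builds a CNPG `Cluster` manifest as a Python dict and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The cluster name `immich-db`, the instance count, the `20Gi` size, and the `longhorn` storage class name are illustrative assumptions, not values taken from this repository.

```python
import json


def cnpg_cluster(name: str, instances: int, size: str, storage_class: str) -> dict:
    """Build a minimal CloudNativePG Cluster manifest as a plain dict."""
    return {
        "apiVersion": "postgresql.cnpg.io/v1",
        "kind": "Cluster",
        "metadata": {"name": name},
        "spec": {
            # Total Postgres instances: one primary, the rest streaming replicas.
            "instances": instances,
            # PVC settings; the storage class routes the volume to Longhorn.
            "storage": {"size": size, "storageClass": storage_class},
        },
    }


if __name__ == "__main__":
    # All argument values here are hypothetical placeholders.
    manifest = cnpg_cluster("immich-db", 3, "20Gi", "longhorn")
    print(json.dumps(manifest, indent=2))
```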
---

## NFS

A dedicated physical node (`aya01`) runs a bare-metal NFS server that serves the media library to Jellyfin.

- Movies, TV shows, and music live on `aya01`
- `docker-host11` (where Jellyfin runs) mounts the NFS share
- Separating media storage from the compute host means the Jellyfin VM can be rebuilt without touching the library
- NFS is not used for K8s workloads; Longhorn handles all PVC-backed storage

---

## Secret Storage

Kubernetes secrets are managed with **Sealed Secrets** (Bitnami). The workflow:

1. Create a regular K8s `Secret`
2. Encrypt it with `kubeseal` using the cluster's public key, which produces a `SealedSecret`
3. Commit the `SealedSecret` to Git; it is safe to store publicly
4. The in-cluster Sealed Secrets controller decrypts it into a regular `Secret` at apply time

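Step 1 of the workflow above can be sketched in a few lines of Python. This only builds the ordinary, unsealed `Secret` (values base64-encoded, as the Kubernetes API expects); sealing would then be a separate `kubeseal` invocation, for example `kubeseal --format yaml < secret.json > sealed-secret.yaml`. The secret name, namespace, key, and file names are hypothetical.

```python
import base64
import json


def k8s_secret(name: str, namespace: str, data: dict[str, str]) -> dict:
    """Build an ordinary (unsealed) Opaque Secret; values are base64-encoded."""
    encoded = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in data.items()
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        "data": encoded,
    }


if __name__ == "__main__":
    # Placeholder values; base64 is an encoding, not encryption.
    secret = k8s_secret("example-app", "default", {"ADMIN_TOKEN": "change-me"})
    # Pipe this output to kubeseal; never commit the unsealed manifest.
    print(json.dumps(secret, indent=2))
```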
Ansible secrets (VM credentials, API tokens) are encrypted with **Ansible Vault** and stored in `vars/group_vars/*/secrets_*.yaml`.