Tuan-Dat Tran
e10e449333
feat(proxmox): per-node CPU type based on hardware capabilities
Add proxmox_node_cpu map — aya01 (Celeron N5105, no AVX2) stays at
x86-64-v2-AES; inko01/lulu/mii01/naruto01 (all AVX2-capable) use x86-64-v3.
Task looks up cpu type by vm.node with x86-64-v2-AES as fallback.
2026-06-04 23:32:18 +02:00
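The lookup described in e10e449333 could look roughly like the sketch below. Only the map contents and the fallback value come from the commit message; the standalone play, the shape of the sample `vm` entry, and the `debug` task are assumptions for illustration, not the repository's actual task.

```yaml
# Hypothetical standalone play. Map values and fallback are taken from the
# commit message; everything else is illustrative.
- name: Resolve a per-node CPU type
  hosts: localhost
  gather_facts: false
  vars:
    proxmox_node_cpu:
      aya01: x86-64-v2-AES   # Celeron N5105, no AVX2
      inko01: x86-64-v3
      lulu: x86-64-v3
      mii01: x86-64-v3
      naruto01: x86-64-v3
    vm:                      # sample VM entry (assumed shape)
      name: docker-host11
      node: aya01
  tasks:
    - name: Show the CPU type chosen for this VM
      ansible.builtin.debug:
        # Nodes missing from the map fall back to x86-64-v2-AES
        msg: "{{ proxmox_node_cpu[vm.node] | default('x86-64-v2-AES') }}"
```

Keeping the fallback at the older baseline means a node that is missing from the map still gets a CPU type that every listed host can run.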
Tuan-Dat Tran
f57ca9ac44
fix(proxmox): correct VM node assignments and upgrade CPU to x86-64-v3
- docker-host11, k3s-server11, k3s-agent21 moved from inko01 → aya01
- CPU type x86-64-v2-AES → x86-64-v3 to enable AVX2 (required by vLLM CPU image)
2026-06-04 23:19:08 +02:00
Tuan-Dat Tran
6325941078
docs: add raspberry-pi ansible management plan and spec
2026-06-04 01:45:16 +02:00
Tuan-Dat Tran
36f944d1c4
feat(edge_vps): add vps playbook
2026-06-04 01:45:16 +02:00
Tuan-Dat Tran
cce6aba4cd
fix(edge_vps): fix wireguard route template and update elastic/vps vars
2026-06-04 01:45:16 +02:00
Tuan-Dat Tran
f873256f65
feat(edge_vps): add traefik dynamic config template
2026-06-04 01:45:01 +02:00
Tuan-Dat Tran
a331265bde
feat(edge_vps): add pangolin/gerbil/traefik stack with versioned images
2026-06-04 01:44:55 +02:00
Tuan-Dat Tran
a905b25190
fix(raspberry_pi): switch zigbee2mqtt adapter from ezsp to ember
2026-06-03 20:06:21 +02:00
Tuan-Dat Tran
25cc5ac271
fix(inventory): remove undefined k3s_storage group
2026-06-03 19:53:43 +02:00
Tuan-Dat Tran
2b857903a7
fix(raspberry_pi): use /dev/ttyUSB0 and set ezsp adapter for SONOFF MG21
2026-06-03 19:50:30 +02:00
Tuan-Dat Tran
eb4e8445fc
fix(raspberry_pi): isolate z2m to own compose dir, fix port conflict
2026-06-03 19:43:35 +02:00
Tuan-Dat Tran
3799dc16d9
fix(raspberry_pi): install docker-compose-plugin before starting stack
2026-06-03 08:31:21 +02:00
Tuan-Dat Tran
585c01ca62
feat(raspberry_pi): wire up role tasks
2026-06-03 08:27:16 +02:00
Tuan-Dat Tran
14b93bf4f5
feat(raspberry_pi): add zigbee2mqtt deploy task
2026-06-03 08:26:04 +02:00
Tuan-Dat Tran
42e790656d
feat(raspberry_pi): add zigbee2mqtt and mosquitto templates
2026-06-03 03:12:20 +02:00
Tuan-Dat Tran
da92fb0ccc
feat(raspberry_pi): add directory setup task
2026-06-03 03:11:17 +02:00
Tuan-Dat Tran
d655cc54e2
fix(raspberry_pi): remove host condition from handler
2026-06-03 03:03:20 +02:00
Tuan-Dat Tran
9115d30c59
feat(raspberry_pi): add defaults, handlers, and secrets placeholder
2026-06-03 03:01:20 +02:00
Tuan-Dat Tran
8dcb429573
docs: add zigbee2mqtt implementation plan for naruto
2026-06-03 02:57:22 +02:00
Tuan-Dat Tran
29cc38872c
docs: add zigbee2mqtt design spec for naruto
2026-06-03 02:54:18 +02:00
Tuan-Dat Tran
f6e2ce8c1a
fix(common): replace deprecated apt_repository with deb822_repository
2026-06-03 02:31:33 +02:00
Tuan-Dat Tran
956836dc67
fix(common): replace deprecated ansible_ fact references with ansible_facts[]
2026-06-03 02:17:08 +02:00
Tuan-Dat Tran
aa8b591afd
feat(raspberry_pi): add playbook
2026-06-03 01:23:48 +02:00
Tuan-Dat Tran
935389dc6d
feat(raspberry_pi): add empty role scaffold
2026-06-03 01:23:48 +02:00
Tuan-Dat Tran
c4327a7596
fix(common): support aarch64 in extra_packages
2026-05-31 23:41:39 +02:00
Tuan-Dat Tran
b190022ff0
feat(raspberry_pi): add inventory and group vars
2026-05-31 23:29:07 +02:00
Tuan-Dat Tran
8da0ab98f8
fix(k3s_server): skip installation if k3s binary already exists
Primary and secondary install tasks now check k3s_status.stat.exists
so re-running the playbook is idempotent on already-provisioned nodes.
2026-04-27 21:43:42 +02:00
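A minimal sketch of the guard described in 8da0ab98f8. The register name `k3s_status` comes from the commit message; the binary path and the bare installer command are assumptions (the real tasks presumably pass server-specific arguments).

```yaml
- name: Check whether the k3s binary is already present
  ansible.builtin.stat:
    path: /usr/local/bin/k3s   # default install location, assumed here
  register: k3s_status

- name: Install k3s only on nodes that do not have it yet
  # Placeholder command; the role's actual install arguments are not shown
  ansible.builtin.shell: curl -sfL https://get.k3s.io | sh -
  when: not k3s_status.stat.exists
```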
Tuan-Dat Tran
b4e093c9b1
fix(k3s_server): use VIP address in kubeconfig instead of k3s_server_name
k3s_server_name resolves to k3s.seyshiro.de which has no DNS entry.
Use k3s_vip (192.168.20.2) so the kubeconfig always works.
2026-04-27 21:41:55 +02:00
Tuan-Dat Tran
e8df950e87
chore(k3s): update vault-encrypted cluster join token
2026-04-27 21:39:37 +02:00
Tuan-Dat Tran
5b44c46e10
docs(arr-cleanup): improve runbook and fix api key paths
Rewrites findings.md with how-to section, cleaner summary tables,
and more detailed per-pass results. Fixes relative path for
sonarr/radarr API key files after runbook moved deeper in repo.
2026-04-27 21:39:28 +02:00
Tuan-Dat Tran
95715c7748
feat(k3s_server): persist control-plane NoSchedule taint in k3s config
Adds node-taint to /etc/rancher/k3s/config.yaml so the taint
survives node reboots. Taint is already applied live via kubectl.
2026-04-27 21:35:24 +02:00
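For reference, a persisted taint in the k3s config file might look like the fragment below. The `node-taint` key and the file path are named in 95715c7748; the exact taint key is an assumption, since the commit message only says "control-plane NoSchedule".

```yaml
# /etc/rancher/k3s/config.yaml (fragment)
# Taint key assumed; only the NoSchedule effect is stated in the commit.
node-taint:
  - "node-role.kubernetes.io/control-plane:NoSchedule"
```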
Tuan-Dat Tran
5bc3024eaf
feat(k3s): replace nginx loadbalancer with kube-vip for control-plane HA
Deploys kube-vip as a DaemonSet on all k3s server nodes, advertising a
VIP (192.168.20.2) via ARP. Eliminates the single-point-of-failure
k3s-loadbalancer VM.
- New kube_vip role: RBAC + DaemonSet templates, TLS SAN cert rotation
- playbooks/kube-vip.yaml: migration playbook (serial=1, idempotent)
- Updated k3s install tasks (server primary/secondary, agent) to use k3s_vip
instead of the loadbalancer VM IP
- Added k3s_vip: 192.168.20.2 to group_vars (below DHCP range .11-.250)
Migration steps in playbook header comment.
2026-04-26 12:08:42 +02:00
Tuan-Dat Tran
fce6f913ff
docs(plan): add docker version update plan for jellyfin and gitea
2026-04-23 08:06:35 +02:00
Tuan-Dat Tran
8239988a70
docs(runbook): add arr-stack downloads cleanup investigation and scripts
~16T freed on aya01 (92% → 57% mergerfs pool). Documents root cause
(no hardlinks across mergerfs due to cross-device mounts), cleanup
passes via Sonarr/Radarr API verification, and pending decisions
(Bleach remux, 111 skipped Sonarr entries).
2026-04-23 08:06:27 +02:00
Tuan-Dat Tran
e87dcd06f3
chore(k3s): rotate cluster token secret
2026-04-23 08:06:08 +02:00
Tuan-Dat Tran
543e9a2c97
fix(docker_host): remove /media/docker from NFS mount loop
/media/docker is no longer a valid NFS-backed path; was causing
mount failures on docker_host nodes.
2026-04-23 08:06:03 +02:00
Tuan-Dat Tran
afbc3e3c57
docs(runbook): add Longhorn orphan auto-deletion fix and etcd defrag procedure
2026-04-22 22:03:45 +02:00
Tuan-Dat Tran
b157dd0b89
feat(k3s_server): install etcd-client on control plane nodes
2026-04-22 19:40:24 +02:00
Tuan-Dat Tran
057cd7a7f0
docs(runbook): mark vaultwarden as resolved
2026-04-22 00:52:58 +02:00
Tuan-Dat Tran
db2d5dccd4
docs(runbook): mark Longhorn orphan/etcd defrag as resolved
138 orphans deleted, all 3 etcd members defragged from 634MB to ~57MB.
2026-04-22 00:40:23 +02:00
Tuan-Dat Tran
db7e130515
docs: mark server11 disk issue resolved in runbook
2026-04-21 23:41:13 +02:00
Tuan-Dat Tran
c16e7cf740
fix(k3s_server): use inventory_hostname for primary detection and delegate token fetch
Primary server detection previously used ansible_default_ipv4.address compared against
k3s_primary_server_ip, which breaks with --limit since facts are only gathered for the
targeted hosts, causing the variable to resolve to the wrong IP.
- Replace IP comparisons with `inventory_hostname == groups['k3s_server'] | first`
in main.yaml (primary install, secondary install, kubeconfig tasks)
- Delegate the node-token slurp to the primary server unconditionally so
pull_token.yaml works correctly when run against any single node with --limit
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 23:30:57 +02:00
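The two changes in c16e7cf740 can be sketched as follows. The `when` expression is quoted from the commit message; the task names, the `debug` placeholder, the token path (the k3s default), and the variable names `k3s_node_token` and `k3s_token` are assumptions.

```yaml
- name: Run primary-only steps on the first host of the group
  ansible.builtin.debug:
    msg: "{{ inventory_hostname }} is the primary k3s server"
  # groups[] is built from the inventory, so it is unaffected by --limit
  when: inventory_hostname == groups['k3s_server'] | first

- name: Read the join token from the primary, whichever host is targeted
  ansible.builtin.slurp:
    src: /var/lib/rancher/k3s/server/node-token   # k3s default path, assumed
  delegate_to: "{{ groups['k3s_server'] | first }}"
  register: k3s_node_token

- name: Decode the token for later tasks
  ansible.builtin.set_fact:
    k3s_token: "{{ k3s_node_token.content | b64decode | trim }}"
```

Because the comparison uses inventory data rather than gathered facts, it gives the same answer whether or not the primary server is part of the current run.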
Tuan-Dat Tran
c084572521
docs: add k3s-server11 reprovision implementation plan
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 21:58:13 +02:00
Tuan-Dat Tran
da7bd42f07
docs: add k3s-server11 reprovision spec and cluster outage runbook
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 21:55:18 +02:00
Tuan-Dat Tran
f0a45e3fda
fix: configure explicit NTP servers in timesyncd instead of relying on DHCP
Gateway at 192.168.20.1 was being provided via DHCP as the NTP server but
does not serve NTP, causing NodeClockNotSynchronising across all nodes.
2026-04-20 20:56:30 +02:00
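One way to express the fix in f0a45e3fda is a timesyncd drop-in, sketched below. The drop-in file name, the chosen NTP servers, and the `timesyncd_conf` register are assumptions; the commit message does not say which servers or which mechanism the role uses.

```yaml
- name: Ensure the timesyncd drop-in directory exists
  ansible.builtin.file:
    path: /etc/systemd/timesyncd.conf.d
    state: directory
    mode: "0755"

- name: Configure explicit NTP servers
  ansible.builtin.copy:
    dest: /etc/systemd/timesyncd.conf.d/10-ntp.conf   # hypothetical file name
    mode: "0644"
    content: |
      [Time]
      NTP=0.pool.ntp.org 1.pool.ntp.org
  register: timesyncd_conf

- name: Restart systemd-timesyncd when the configuration changed
  ansible.builtin.service:
    name: systemd-timesyncd
    state: restarted
    enabled: true
  when: timesyncd_conf is changed
```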
Tuan-Dat Tran
b5f82e2978
fix: install kitty terminfo on all nodes via common role
2026-04-20 20:36:23 +02:00
Tuan-Dat Tran
29561c44c8
fix: enable and start systemd-timesyncd in common time role
systemd-timesyncd was installed via common_packages but never enabled or
started, causing NodeClockNotSynchronising alerts across all k3s nodes.
2026-04-20 20:18:19 +02:00
Tuan-Dat Tran
d33117a752
chore(docker): update jellyfin to 10.11.7 and gitea to 1.25.5-rootless
2026-04-01 21:20:02 +02:00
Tuan-Dat Tran
e9e4864456
docs: add design spec for docker service version updates (jellyfin 10.11.7, gitea 1.25.5)
2026-04-01 21:17:05 +02:00
Tuan-Dat Tran
043f97ebac
docs: add design spec and implementation plan for docker service redeployment
2026-04-01 21:00:51 +02:00