120 Commits

Author SHA1 Message Date
Tuan-Dat Tran
8da0ab98f8 fix(k3s_server): skip installation if k3s binary already exists
Primary and secondary install tasks now check k3s_status.stat.exists
so re-running the playbook is idempotent on already-provisioned nodes.
2026-04-27 21:43:42 +02:00
Tuan-Dat Tran
b4e093c9b1 fix(k3s_server): use VIP address in kubeconfig instead of k3s_server_name
k3s_server_name resolves to k3s.seyshiro.de which has no DNS entry.
Use k3s_vip (192.168.20.2) so the kubeconfig always works.
2026-04-27 21:41:55 +02:00
Tuan-Dat Tran
e8df950e87 chore(k3s): update vault-encrypted cluster join token 2026-04-27 21:39:37 +02:00
Tuan-Dat Tran
5b44c46e10 docs(arr-cleanup): improve runbook and fix api key paths
Rewrites findings.md with how-to section, cleaner summary tables,
and more detailed per-pass results. Fixes relative path for
sonarr/radarr API key files after runbook moved deeper in repo.
2026-04-27 21:39:28 +02:00
Tuan-Dat Tran
95715c7748 feat(k3s_server): persist control-plane NoSchedule taint in k3s config
Adds node-taint to /etc/rancher/k3s/config.yaml so the taint
survives node reboots. Taint is already applied live via kubectl.
2026-04-27 21:35:24 +02:00
Tuan-Dat Tran
5bc3024eaf feat(k3s): replace nginx loadbalancer with kube-vip for control-plane HA
Deploys kube-vip as a DaemonSet on all k3s server nodes, advertising a
VIP (192.168.20.2) via ARP. Eliminates the single-point-of-failure
k3s-loadbalancer VM.

- New kube_vip role: RBAC + DaemonSet templates, TLS SAN cert rotation
- playbooks/kube-vip.yaml: migration playbook (serial=1, idempotent)
- Updated k3s install tasks (server primary/secondary, agent) to use k3s_vip
  instead of the loadbalancer VM IP
- Added k3s_vip: 192.168.20.2 to group_vars (below DHCP range .11-.250)

Migration steps in playbook header comment.
2026-04-26 12:08:42 +02:00
Tuan-Dat Tran
fce6f913ff docs(plan): add docker version update plan for jellyfin and gitea 2026-04-23 08:06:35 +02:00
Tuan-Dat Tran
8239988a70 docs(runbook): add arr-stack downloads cleanup investigation and scripts
~16T freed on aya01 (92% → 57% mergerfs pool). Documents root cause
(no hardlinks across mergerfs due to cross-device mounts), cleanup
passes via Sonarr/Radarr API verification, and pending decisions
(Bleach remux, 111 skipped Sonarr entries).
2026-04-23 08:06:27 +02:00
Tuan-Dat Tran
e87dcd06f3 chore(k3s): rotate cluster token secret 2026-04-23 08:06:08 +02:00
Tuan-Dat Tran
543e9a2c97 fix(docker_host): remove /media/docker from NFS mount loop
/media/docker is no longer a valid NFS-backed path; was causing
mount failures on docker_host nodes.
2026-04-23 08:06:03 +02:00
Tuan-Dat Tran
afbc3e3c57 docs(runbook): add Longhorn orphan auto-deletion fix and etcd defrag procedure 2026-04-22 22:03:45 +02:00
Tuan-Dat Tran
b157dd0b89 feat(k3s_server): install etcd-client on control plane nodes 2026-04-22 19:40:24 +02:00
Tuan-Dat Tran
057cd7a7f0 docs(runbook): mark vaultwarden as resolved 2026-04-22 00:52:58 +02:00
Tuan-Dat Tran
db2d5dccd4 docs(runbook): mark Longhorn orphan/etcd defrag as resolved
138 orphans deleted, all 3 etcd members defragged from 634MB to ~57MB.
2026-04-22 00:40:23 +02:00
Tuan-Dat Tran
db7e130515 docs: mark server11 disk issue resolved in runbook 2026-04-21 23:41:13 +02:00
Tuan-Dat Tran
c16e7cf740 fix(k3s_server): use inventory_hostname for primary detection and delegate token fetch
Primary server detection previously used ansible_default_ipv4.address compared against
k3s_primary_server_ip, which breaks with --limit since facts are only gathered for the
targeted hosts, causing the variable to resolve to the wrong IP.

- Replace IP comparisons with `inventory_hostname == groups['k3s_server'] | first`
  in main.yaml (primary install, secondary install, kubeconfig tasks)
- Delegate the node-token slurp to the primary server unconditionally so
  pull_token.yaml works correctly when run against any single node with --limit

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 23:30:57 +02:00
Tuan-Dat Tran
c084572521 docs: add k3s-server11 reprovision implementation plan
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 21:58:13 +02:00
Tuan-Dat Tran
da7bd42f07 docs: add k3s-server11 reprovision spec and cluster outage runbook
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 21:55:18 +02:00
Tuan-Dat Tran
f0a45e3fda fix: configure explicit NTP servers in timesyncd instead of relying on DHCP
Gateway at 192.168.20.1 was being provided via DHCP as the NTP server but
does not serve NTP, causing NodeClockNotSynchronising across all nodes.
2026-04-20 20:56:30 +02:00
Tuan-Dat Tran
b5f82e2978 fix: install kitty terminfo on all nodes via common role 2026-04-20 20:36:23 +02:00
Tuan-Dat Tran
29561c44c8 fix: enable and start systemd-timesyncd in common time role
systemd-timesyncd was installed via common_packages but never enabled or
started, causing NodeClockNotSynchronising alerts across all k3s nodes.
2026-04-20 20:18:19 +02:00
Tuan-Dat Tran
d33117a752 chore(docker): update jellyfin to 10.11.7 and gitea to 1.25.5-rootless 2026-04-01 21:20:02 +02:00
Tuan-Dat Tran
e9e4864456 docs: add design spec for docker service version updates (jellyfin 10.11.7, gitea 1.25.5) 2026-04-01 21:17:05 +02:00
Tuan-Dat Tran
043f97ebac docs: add design spec and implementation plan for docker service redeployment 2026-04-01 21:00:51 +02:00
Tuan-Dat Tran
134eceee0f Update Jellyfin and Gitea image versions 2026-04-01 20:55:20 +02:00
Tuan-Dat Tran
80f98a9c4b docs: update Proxmox cluster debugging design with findings and fixes 2026-03-01 20:58:04 +01:00
Tuan-Dat Tran
d4ac3dae60 feat(k3s): Added 2 nodes
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2026-03-01 17:01:51 +01:00
Tuan-Dat Tran
5a8c7f0248 feat(proxmox): add hosts config
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2026-02-28 11:30:58 +01:00
Tuan-Dat Tran
bf7c7c9562 ci: add GitHub Actions workflow for linting 2026-02-25 06:00:20 +01:00
Tuan-Dat Tran
a9346881b0 refactor(edge_vps): reorganize certificate files 2026-02-25 00:26:08 +01:00
Tuan-Dat Tran
193da30e65 docs(edge_vps): update README with role documentation 2026-02-25 00:12:50 +01:00
Tuan-Dat Tran
9a5cb376bd feat(edge_vps): add inventory variables for VPS group 2026-02-25 00:10:27 +01:00
Tuan-Dat Tran
fc2eefdfb0 feat(edge_vps): add main task orchestrator 2026-02-25 00:03:17 +01:00
Tuan-Dat Tran
274b9c310e feat(edge_vps): add Elastic Agent setup task and templates 2026-02-25 00:00:00 +01:00
Tuan-Dat Tran
6fdd021604 feat(edge_vps): add Pangolin setup task and templates 2026-02-24 23:56:00 +01:00
Tuan-Dat Tran
1b82acad1f feat(edge_vps): add Traefik setup task and template 2026-02-24 23:53:00 +01:00
Tuan-Dat Tran
d8822ad904 feat(edge_vps): add WireGuard setup task and template 2026-02-24 23:50:08 +01:00
Tuan-Dat Tran
caecfc7c1d feat(edge_vps): add directory setup task 2026-02-24 23:47:34 +01:00
Tuan-Dat Tran
4907761649 feat(edge_vps): add role structure and handlers 2026-02-24 23:45:14 +01:00
Tuan-Dat Tran
a3cb1928ae docs(argocd): add missing Ingress task and note about missing template 2026-02-16 09:25:36 +01:00
Tuan-Dat Tran
99f6876ce9 docs: Add changelog and update role documentation 2026-02-16 09:21:08 +01:00
Tuan-Dat Tran
0a3171b9bc feat(k3s): Added 2 nodes (2/2)
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2026-01-26 23:08:34 +01:00
Tuan-Dat Tran
3068a5a8fb feat(k3s): Added 2 nodesg
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2026-01-26 22:42:19 +01:00
Tuan-Dat Tran
ef652fac20 refactor: yml -> yaml
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-11-07 20:44:14 +01:00
Tuan-Dat Tran
22c1b534ab feat(k3s): Add new node and machine
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-10-26 10:41:11 +01:00
Tuan-Dat Tran
9cb90a8020 feat(caddy): netcup->cf
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-10-25 09:25:40 +02:00
Tuan-Dat Tran
d9181515bb feat(k3s): Added (temporary) node
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-10-19 01:33:42 +02:00
Tuan-Dat Tran
c3905ed144 feat(git): Add .gitattributes for ansible-vault git diff
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-10-19 00:34:51 +02:00
Tuan-Dat Tran
5fb50ab4b2 feat(k3s): Add new node
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-10-07 23:46:40 +02:00
Tuan-Dat Tran
2909d6e16c feat(nfs): Removed unused/removed nfs servers
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-09-15 23:29:03 +02:00
Tuan-Dat Tran
0aed818be5 feat(docker): Removed nodes docker-host10 and docker-host12
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-09-15 23:29:03 +02:00
Tuan-Dat Tran
fbdeec93ce feat(docker): match services that moved to k3s
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-09-15 23:29:03 +02:00
Tuan-Dat Tran
44626101de feat(docker): match services that moved to k3s
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-09-15 23:29:03 +02:00
Tuan-Dat Tran
c1d6f13275 refactor(ansible-lint): fixed ansible-lint warnings
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-09-15 23:29:03 +02:00
Tuan-Dat Tran
282e98e90a fix(proxmox): commented 'non-errors' on script
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-09-15 23:29:03 +02:00
Tuan-Dat Tran
9573cbfcad feat(k3s): Added 2 nodes
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-09-07 21:21:33 +02:00
Tuan-Dat Tran
48aec11d8c feat(common): added iscsi for longhorn on k3s
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-09-07 18:17:33 +02:00
Tuan-Dat Tran
a1da69ac98 feat(proxmox): check_vm as cronjob
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-09-02 19:52:49 +02:00
Tuan-Dat Tran
7aa16f3207 Added blog.md
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-07-27 22:59:01 +02:00
Tuan-Dat Tran
fe3f1749c5 Update README.md
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-07-27 22:51:15 +02:00
Tuan-Dat Tran
6eef96b302 feat(pre-commit): Added linting 2025-07-27 22:46:23 +02:00
Tuan-Dat Tran
2882abfc0b Added README.md for roles
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-07-27 16:40:46 +02:00
Tuan-Dat Tran
2b759cc2ab Update README.md
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-07-27 16:16:35 +02:00
Tuan-Dat Tran
dbaebaee80 cleanup: services moved to argocd
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-07-27 13:58:25 +02:00
Tuan-Dat Tran
89c51aa45c feat(argo): app-of-app argo
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-07-25 07:58:41 +02:00
Tuan-Dat Tran
0139850ee3 feat(reverse_proxy): fix caddy letsencrypt
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-07-22 21:26:11 +02:00
Tuan-Dat Tran
976cad51e2 refactor(k3s): enhance cluster setup and enable ArgoCD apps
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-07-22 07:23:23 +02:00
Tuan-Dat Tran
e1a2248154 feat(kubernetes): add nfs-provisioner
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-07-15 23:24:52 +02:00
Tuan-Dat Tran
d8fd094379 feat(kubernetes): stable kubernetes with argo
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-07-14 22:57:13 +02:00
Tuan-Dat Tran
76000f8123 feat(kubernetes): add initial setup for ArgoCD, Cert-Manager, MetalLB, and Traefik
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-07-13 14:25:53 +02:00
Tuan-Dat Tran
4aa939426b refactor(k3s): enhance kubeconfig generation and token management
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-07-13 09:33:39 +02:00
Tuan-Dat Tran
9cce71f73b refactor(k3s): manage token securely and install guest agent
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-07-13 02:15:01 +02:00
Tuan-Dat Tran
97a5d6c41d refactor(k3s): centralize k3s primary server IP and integrate Netcup DNS
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-07-13 01:30:05 +02:00
Tuan-Dat Tran
f1b0cfad2c refactor(k3s): streamline inventory and primary server IP handling
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-07-13 00:40:48 +02:00
Tuan-Dat Tran
dac0d88d60 feat(proxmox): add k3s agents and refine VM provisioning
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-07-12 23:08:44 +02:00
Tuan-Dat Tran
609e000089 refactor(ansible): centralize inventory and variables in 'vars' directory
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-07-12 21:38:53 +02:00
Tuan-Dat Tran
3d7f652ff3 refactor(ansible): restructure inventory and remove postgres role
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-07-12 20:35:26 +02:00
Tuan-Dat Tran
cb8ccd8f00 wip
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-06-07 01:19:27 +02:00
Tuan-Dat Tran
02168225b1 wip
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-06-07 00:16:54 +02:00
Tuan-Dat Tran
6ff1ccecd0 refactor(infra): reorganize docker host VMs and service assignments
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-05-07 00:02:30 +02:00
Tuan-Dat Tran
de62327fde Add naruto01 to proxmox nodes
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-05-06 13:33:46 +02:00
Tuan-Dat Tran
b70c8408dc 2025-05-03T21:41+02:00
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-05-03 21:41:32 +02:00
Tuan-Dat Tran
a913e1cbc0 refactor: reorganize proxmox roles, add hardware acceleration, and update common config tasks
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-05-03 10:24:50 +02:00
Tuan-Dat Tran
e3c67a32e9 feat(reverse_proxy): add Netcup DNS ACME challenge support and refactor Caddy setup
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-04-28 23:24:29 +02:00
Tuan-Dat Tran
8f2998abc0 refactor(ansible): use ansible_user_id and add root package condition
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-04-27 18:15:07 +02:00
Tuan-Dat Tran
7fcee3912f refactor(ansible): refactor common role application and improve vm ssh config
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-04-27 17:46:41 +02:00
Tuan-Dat Tran
591342f580 feat(proxmox): refactor vm provisioning and add pci passthrough config
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-04-26 23:34:42 +02:00
Tuan-Dat Tran
f2ea03bc01 feat(proxmox): automatic vm creation
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-04-26 21:58:58 +02:00
Tuan-Dat Tran
0e8e07ed3e feat(docker): Added healthcheck
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-04-26 13:21:02 +02:00
Tuan-Dat Tran
a2a58f6343 feat(keycloak|docker): improved templating
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-04-25 23:37:24 +02:00
Tuan-Dat Tran
42196a32dc feat(docker): Add karakeep and keycloak services
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-04-24 20:24:33 +02:00
Tuan-Dat Tran
6934a9f5fc distributed secrets to group_vars and added karakeep
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-04-06 23:46:28 +02:00
Tuan-Dat Tran
27621aac03 Added proxmox-vm and static tagging of docker images
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-04-06 18:04:33 +02:00
Tuan-Dat Tran
56f058c254 moved ssh to cert based
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-03-25 01:09:08 +01:00
Tuan-Dat Tran
924e4a2f92 refactor(inventory): Reorganized inventory
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-02-07 01:54:34 +01:00
Tuan-Dat Tran
060e2425ff fix(skeleton): Fixed script and content for secrets.skeleton
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-02-07 00:09:37 +01:00
Tuan-Dat Tran
f2d489f63a refactor(structure/ansible.cfg): Changed folder structure with ansible.cfg
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-02-07 00:06:37 +01:00
Tuan-Dat Tran
4aa3e711c9 fix(ssh): switch to ubuntu based key
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-01-24 12:47:23 +01:00
Tuan-Dat Tran
00e4f4807d feat(docker): Removed data
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-01-24 09:11:36 +01:00
Tuan-Dat Tran
161e6446cd fix(compose): made port expose optional
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-01-24 07:28:12 +01:00
Tuan-Dat Tran
ae929ca09d feat(docker): Added cadvisor on all hosts, added docker metric exporter, added docker compose restart as handler, moved repetetive directory/permission creation into loops, moved repetetive values into variables, cleanup compose template for better empty lines
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-01-17 21:50:36 +01:00
Tuan-Dat Tran
1017fed848 fix(docker): Fixed git deployment,which failed with migration error on new db
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-01-17 10:08:32 +01:00
Tuan-Dat Tran
cb256e9451 refactor(playbooks): Moved playbooks to seperate folder
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-01-17 02:41:30 +01:00
Tuan-Dat Tran
6bc591550c fix(port mapping,docker): fixed duplicate port mapping on hosts and incompatible docker options in compose
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-01-17 02:10:36 +01:00
Tuan-Dat Tran
e68d534e4f feat(docker): Move compose content to ansible group vars
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-01-17 01:31:10 +01:00
Tuan-Dat Tran
1a1b8cb69c feat(reverse-proxy): Add Caddy for reverse proxy
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2025-01-12 21:19:37 +01:00
Tuan-Dat Tran
88141f8869 chore(secrets): Updated secrets.yml.skeleton to reflect recent changes
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2024-12-11 20:04:41 +01:00
Tuan-Dat Tran
6d099061ac feat(docker): Split docker compose to be deployed different services on different hosts. See host_vars of each host.
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2024-12-11 19:58:57 +01:00
Tuan-Dat Tran
711dc58f2e fix(docker/jellyfin): Moved jellyfin config to local machine due to error with sqlite dbs used for config
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2024-11-15 14:09:31 +01:00
Tuan-Dat Tran
5aaf3eef53 chore(inventory): add host-specific configuration files and update production inventory for proxmox hosts
- Add individual `host_vars` YAML files for new proxmox hosts (`aya01`, `inko`, `lulu`):
  - Set SSH and Ansible connection variables, including `ansible_user`, `ansible_host`, `ansible_port`, and `ansible_ssh_private_key_file`
  - Configure `ansible_become_pass` with respective vault entries for sudo access
  - Define host-specific metadata, including hostname and IP address

- Update `production` inventory:
  - Add new `[proxmox]` group and include `aya01`, `inko`, and `lulu` for proxmox-related automation

These additions streamline Ansible's management of proxmox hosts, centralizing their configuration and enabling easier host-specific variable access for deployment and management tasks.

Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2024-11-13 23:55:22 +01:00
Tuan-Dat Tran
33253e934d feat(docker): add Calibre Web service to Docker Compose configuration
- Add Calibre Web container configuration to `docker-compose.yaml`
  - Use `lscr.io/linuxserver/calibre-web:latest` image
  - Configure environment variables (PUID, PGID, TZ, DOCKER_MODS)
  - Set up volumes for persistent storage of Calibre configuration and books
  - Expose port 8084 to access the Calibre Web UI
  - Implement automatic restart policy (`unless-stopped`)

This commit introduces the Calibre Web service to the Docker Compose setup, enabling users to run a Calibre library management and e-book reader web service in a Docker container.

Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2024-11-11 01:04:30 +01:00
Tuan-Dat Tran
4db26b56da feat(ansible): add Docker host configuration with NFS mounts and utility packages
- Introduce Docker host configuration playbooks in `docker_host` role
  - Install Docker and Docker Compose via apt repository
  - Configure Docker user, group, and required directories (`/opt/docker`, `/media`)
  - Add NFS mounts for Docker data, series, movies, and songs directories
- Add extra utility packages (`bat`, `ripgrep`, `fd-find`, `screen`, `eza`, `neovim`)
- Set up and manage `bash_aliases` for user-friendly command replacements (`batcat`, `nvim`, `eza`)
- Enhance `/group_vars` and `/host_vars` for Docker-related settings and secure access
- Add `docker-host00` and `docker-host01` entries to production and staging inventories

Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2024-11-10 21:37:22 +01:00
Tuan-Dat Tran
ce0411cdb0 fixed taint
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2024-10-13 22:56:59 +02:00
Tuan-Dat Tran
28d946cae5 Add noexecute taint on longhorn
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2024-10-13 21:49:10 +02:00
Tuan-Dat Tran
5d0f56ce38 linting
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2024-10-08 11:31:26 +02:00
Tuan-Dat Tran
0c1a8a95f2 add postgres exporter
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2024-10-08 11:17:03 +02:00
Tuan-Dat Tran
05c35a546a added installation of reqs for longhorn
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2024-10-08 05:20:35 +02:00
Tuan-Dat Tran
d16cc0db06 Added notes for longhorn nodes
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2024-10-08 04:40:16 +02:00
Tuan-Dat Tran
2ae0f4863e update vault skeleton
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2024-10-08 04:14:01 +02:00
Tuan-Dat Tran
7d58de98d9 Added storage nodes for k3s
Signed-off-by: Tuan-Dat Tran <tuan-dat.tran@tudattr.dev>
2024-10-08 04:13:38 +02:00
234 changed files with 9673 additions and 856 deletions

33
.ansible-lint Normal file
View File

@@ -0,0 +1,33 @@
---
# .ansible-lint
# Specify exclude paths to prevent linting vendor roles, etc.
exclude_paths:
- ./.git/
- ./.venv/
- ./galaxy_roles/
# A list of rules to skip. This is a more modern and readable alternative to 'skip_list'.
skip_list:
- experimental
- fqcn-builtins
- no-handler
- var-naming
- no-changed-when
- risky-shell-pipe
# Enforce certain rules that are not enabled by default.
enable_list:
- no-free-form
- var-spacing
- no-log-password
- no-relative-path
- command-instead-of-module
- fqcn[deep]
- no-changed-when
# Offline mode disables any features that require internet access.
offline: false
# Set the desired verbosity level.
verbosity: 1

17
.editorconfig Normal file
View File

@@ -0,0 +1,17 @@
root = true
[*]
indent_style = space
end_of_line = lf
charset = utf-8
trim_trailing_whitespace = true
insert_final_newline = true
[*.{yml,yaml}]
indent_size = 2
[*.py]
indent_size = 4
[*.md]
trim_trailing_whitespace = false

8
.gitattributes vendored Normal file
View File

@@ -0,0 +1,8 @@
vars/group_vars/proxmox/secrets_vm.yml diff=ansible-vault merge=binary
vars/group_vars/all/secrets.yml diff=ansible-vault merge=binary
vars/group_vars/docker/secrets.yml diff=ansible-vault merge=binary
vars/group_vars/k3s/secrets.yml diff=ansible-vault merge=binary
vars/group_vars/k3s/secrets_token.yml diff=ansible-vault merge=binary
vars/group_vars/kubernetes/secrets.yml diff=ansible-vault merge=binary
vars/group_vars/proxmox/secrets.yml diff=ansible-vault merge=binary
vars/group_vars/proxmox/secrets_vm.yml diff=ansible-vault merge=binary

45
.github/workflows/ci.yaml vendored Normal file
View File

@@ -0,0 +1,45 @@
name: CI
on:
push:
branches: [main, master]
pull_request:
branches: [main, master]
jobs:
lint:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.11'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install ansible-lint==6.22.2 ansible-core==2.15.8
- name: Install Ansible collections
run: ansible-galaxy collection install -r requirements.yaml
- name: Run ansible-lint
run: ansible-lint
pre-commit:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.11'
- name: Install pre-commit
run: pip install pre-commit
- name: Run pre-commit
run: pre-commit run --all-files

3
.gitignore vendored
View File

@@ -1,2 +1 @@
/secrets.yml
*.ovpn
.worktrees/

23
.pre-commit-config.yaml Normal file
View File

@@ -0,0 +1,23 @@
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.6.0
hooks:
- id: trailing-whitespace
- id: end-of-file-fixer
- id: check-yaml
- id: check-added-large-files
- repo: local
hooks:
- id: ansible-galaxy-install
name: Install ansible-galaxy collections
entry: ansible-galaxy collection install -r requirements.yaml
language: system
pass_filenames: false
always_run: true
- repo: https://github.com/ansible/ansible-lint
rev: v6.22.2
hooks:
- id: ansible-lint
files: \.(yaml)$
additional_dependencies:
- ansible-core==2.15.8

108
README.md
View File

@@ -1,52 +1,82 @@
# TuDatTr IaC
**I do not recommend this project being used for ones own infrastructure, as
this project is heavily attuned to my specific host/network setup**
The Ansible Project to provision fresh Debian VMs for my Proxmox instances.
Some values are hard coded such as the public key both in
[./scripts/debian_seed.sh](./scripts/debian_seed.sh) and [./group_vars/all/vars.yml](./group_vars/all/vars.yml).
**I do not recommend this project being used for one's own infrastructure, as this project is heavily attuned to my specific host/network setup.**
## Prerequisites
This Ansible project automates the setup of a K3s Kubernetes cluster on Proxmox VE. It also includes playbooks for configuring Docker hosts, load balancers, and other services.
- [secrets.yml](secrets.yml) in the root directory of this repository.
Skeleton file can be found as [./secrets.yml.skeleton](./secrets.yml.skeleton).
- IP Configuration of hosts like in [./host_vars/\*](./host_vars/*)
- Setup [~/.ssh/config](~/.ssh/config) for the respective hosts used.
- Install `passlib` for your operating system. Needed to hash passwords ad-hoc.
## Repository Structure
## Improvable Variables
The repository is organized into the following main directories:
- `group_vars/k3s/vars.yml`:
- `k3s.server.ips`: Take list of IPs from host_vars `k3s_server*.yml`.
- `k3s_db_connection_string`: Embed this variable in the `k3s.db.`-directory.
Currently causes loop.
- `playbooks/`: Contains the main Ansible playbooks for different setup scenarios.
- `roles/`: Contains the Ansible roles that are used by the playbooks.
- `vars/`: Contains variable files, including group-specific variables.
## Run Playbook
## Playbooks
To run a first playbook and test the setup the following command can be executed.
The following playbooks are available:
- `proxmox.yml`: Provisions VMs and containers on Proxmox VE.
- `k3s-servers.yml`: Sets up the K3s master nodes.
- `k3s-agents.yml`: Sets up the K3s agent nodes.
- `k3s-loadbalancer.yml`: Configures a load balancer for the K3s cluster.
- `k3s-storage.yml`: Configures storage for the K3s cluster.
- `docker.yml`: Sets up Docker hosts and their load balancer.
- `docker-host.yml`: Configures the docker hosts.
- `docker-lb.yml`: Configures a load balancer for Docker services.
- `kubernetes_setup.yml`: A meta-playbook for setting up the entire Kubernetes cluster.
## Roles
The following roles are defined:
- `common`: Common configuration tasks for all nodes.
- `proxmox`: Manages Proxmox VE, including VM and container creation.
- `k3s_server`: Installs and configures K3s master nodes.
- `k3s_agent`: Installs and configures K3s agent nodes.
- `k3s_loadbalancer`: Configures an Nginx-based load balancer for the K3s cluster.
- `k3s_storage`: Configures storage solutions for Kubernetes.
- `docker_host`: Installs and configures Docker.
- `kubernetes_argocd`: Deploys Argo CD to the Kubernetes cluster.
- `node_exporter`: Installs the Prometheus Node Exporter for monitoring.
- `reverse_proxy`: Configures a Caddy-based reverse proxy.
- `edge_vps`: Placeholder role for Edge VPS configuration.
## Usage
1. **Install dependencies:**
```bash
pip install -r requirements.txt
ansible-galaxy install -r requirements.yml
```
2. **Configure variables:**
- Create an inventory file (e.g., `vars/k3s.ini`).
- Adjust variables in `vars/group_vars/` to match your environment.
3. **Run playbooks:**
```bash
# To provision VMs on Proxmox
ansible-playbook -i vars/proxmox.ini playbooks/proxmox.yml
# To set up the K3s cluster
ansible-playbook -i vars/k3s.ini playbooks/kubernetes_setup.yml
```
## Notes
### Vault Git Diff
This repo has a `.gitattributes` which points at the repos ansible-vault files.
These can be temporarily decrypted for git diff by adding this in conjunction with the `.gitattributes`:
```sh
ansible-playbook -i production -J k3s-servers.yml
# https://stackoverflow.com/questions/29937195/how-to-diff-ansible-vault-changes
git config --global diff.ansible-vault.textconv "ansible-vault view"
```
This will run the [./k3s-servers.yml](./k3s-servers.yml) playbook and execute
its roles.
## Disclaimer
## After successful k3s installation
To access our Kubernetes cluster from our host machine to work on it via
flux and such we need to manually copy a k3s config from one of our server nodes to our host machine.
Then we need to install `kubectl` on our host machine and optionally `kubectx` if we're already
managing other Kubernetes instances.
Then we replace the localhost address inside of the config with the IP of our load balancer.
Finally we'll need to set the KUBECONFIG variable.
```sh
mkdir ~/.kube/
scp k3s-server00:/etc/rancher/k3s/k3s.yaml ~/.kube/config
chown $USER ~/.kube/config
sed -i "s/127.0.0.1/192.168.20.22/" ~/.kube/config
export KUBECONFIG=~/.kube/config
```
Install flux and continue in the flux repository.
This project is highly customized for the author's specific environment. Using it without modification is not recommended.

41
ansible.cfg Normal file
View File

@@ -0,0 +1,41 @@
[defaults]
# (string) Path to the Python interpreter to be used for module execution on remote targets, or an automatic discovery mode. Supported discovery modes are ``auto`` (the default), ``auto_silent``, ``auto_legacy``, and ``auto_legacy_silent``. All discovery modes employ a lookup table to use the included system Python (on distributions known to include one), falling back to a fixed ordered list of well-known Python interpreter locations if a platform-specific default is not available. The fallback behavior will issue a warning that the interpreter should be set explicitly (since interpreters installed later may change which one is used). This warning behavior can be disabled by setting ``auto_silent`` or ``auto_legacy_silent``. The value of ``auto_legacy`` provides all the same behavior, but for backwards-compatibility with older Ansible releases that always defaulted to ``/usr/bin/python``, will use that interpreter if present.
interpreter_python=python3
# (pathspec) Colon separated paths in which Ansible will search for Roles.
roles_path=./roles
# (pathlist) Comma separated list of Ansible inventory sources
inventory=./vars/
# (path) The vault password file to use. Equivalent to --vault-password-file or --vault-id
# If executable, it will be run and the resulting stdout will be used as the password.
vault_password_file=/media/veracrypt1/scripts/ansible_vault.sh
# (list) Check all of these extensions when looking for 'variable' files which should be YAML or JSON or vaulted versions of these.
# This affects vars_files, include_vars, inventory and vars plugins among others.
yaml_valid_extensions=.yaml
# (boolean) Set this to "False" if you want to avoid host key checking by the underlying tools Ansible uses to connect to the host
host_key_checking=False
# (bool) This controls whether a failed Ansible playbook should create a .retry file.
;retry_files_enabled=False
# (path) This sets the path in which Ansible will save .retry files when a playbook fails and retry files are enabled.
# This file will be overwritten after each run with the list of failed hosts from all plays.
;retry_files_save_path=
# (list) Allows to change the group variable precedence merge order.
;precedence=all_inventory, groups_inventory, all_plugins_inventory, all_plugins_play, groups_plugins_inventory, groups_plugins_play
[colors]
# (string) Defines the color to use when showing 'Skipped' task status
skip=dark gray
[tags]
# (list) default list of tags to skip in your plays, has precedence over Run Tags
;skip=
[inventory]
ignore_extensions={{(REJECT_EXTS + ('.orig', '.cfg', '.retry', '.bak'))}}

690
ansible.cfg.default Normal file
View File

@@ -0,0 +1,690 @@
[defaults]
# (boolean) By default Ansible will issue a warning when received from a task action (module or action plugin)
# These warnings can be silenced by adjusting this setting to False.
;action_warnings=True
# (list) Accept list of cowsay templates that are 'safe' to use, set to empty list if you want to enable all installed templates.
;cowsay_enabled_stencils=bud-frogs, bunny, cheese, daemon, default, dragon, elephant-in-snake, elephant, eyes, hellokitty, kitty, luke-koala, meow, milk, moofasa, moose, ren, sheep, small, stegosaurus, stimpy, supermilker, three-eyes, turkey, turtle, tux, udder, vader-koala, vader, www
# (string) Specify a custom cowsay path or swap in your cowsay implementation of choice
;cowpath=
# (string) This allows you to chose a specific cowsay stencil for the banners or use 'random' to cycle through them.
;cow_selection=default
# (boolean) This option forces color mode even when running without a TTY or the "nocolor" setting is True.
;force_color=False
# (path) The default root path for Ansible config files on the controller.
;home=~/.ansible
# (boolean) This setting allows suppressing colorizing output, which is used to give a better indication of failure and status information.
;nocolor=False
# (boolean) If you have cowsay installed but want to avoid the 'cows' (why????), use this.
;nocows=False
# (boolean) Sets the default value for the any_errors_fatal keyword, if True, Task failures will be considered fatal errors.
;any_errors_fatal=False
# (path) The password file to use for the become plugin. --become-password-file.
# If executable, it will be run and the resulting stdout will be used as the password.
;become_password_file=
# (pathspec) Colon separated paths in which Ansible will search for Become Plugins.
;become_plugins={{ ANSIBLE_HOME ~ "/plugins/become:/usr/share/ansible/plugins/become" }}
# (string) Chooses which cache plugin to use, the default 'memory' is ephemeral.
;fact_caching=memory
# (string) Defines connection or path information for the cache plugin
;fact_caching_connection=
# (string) Prefix to use for cache plugin files/tables
;fact_caching_prefix=ansible_facts
# (integer) Expiration timeout for the cache plugin data
;fact_caching_timeout=86400
# (list) List of enabled callbacks, not all callbacks need enabling, but many of those shipped with Ansible do as we don't want them activated by default.
;callbacks_enabled=
# (string) When a collection is loaded that does not support the running Ansible version (with the collection metadata key `requires_ansible`).
;collections_on_ansible_version_mismatch=warning
# (pathspec) Colon separated paths in which Ansible will search for collections content. Collections must be in nested *subdirectories*, not directly in these directories. For example, if ``COLLECTIONS_PATHS`` includes ``'{{ ANSIBLE_HOME ~ "/collections" }}'``, and you want to add ``my.collection`` to that directory, it must be saved as ``'{{ ANSIBLE_HOME} ~ "/collections/ansible_collections/my/collection" }}'``.
;collections_path={{ ANSIBLE_HOME ~ "/collections:/usr/share/ansible/collections" }}
# (boolean) A boolean to enable or disable scanning the sys.path for installed collections
;collections_scan_sys_path=True
# (path) The password file to use for the connection plugin. --connection-password-file.
;connection_password_file=
# (pathspec) Colon separated paths in which Ansible will search for Action Plugins.
;action_plugins={{ ANSIBLE_HOME ~ "/plugins/action:/usr/share/ansible/plugins/action" }}
# (boolean) When enabled, this option allows lookup plugins (whether used in variables as ``{{lookup('foo')}}`` or as a loop as with_foo) to return data that is not marked 'unsafe'.
# By default, such data is marked as unsafe to prevent the templating engine from evaluating any jinja2 templating language, as this could represent a security risk. This option is provided to allow for backward compatibility, however users should first consider adding allow_unsafe=True to any lookups which may be expected to contain data which may be run through the templating engine late
;allow_unsafe_lookups=False
# (boolean) This controls whether an Ansible playbook should prompt for a login password. If using SSH keys for authentication, you probably do not need to change this setting.
;ask_pass=False
# (boolean) This controls whether an Ansible playbook should prompt for a vault password.
;ask_vault_pass=False
# (pathspec) Colon separated paths in which Ansible will search for Cache Plugins.
;cache_plugins={{ ANSIBLE_HOME ~ "/plugins/cache:/usr/share/ansible/plugins/cache" }}
# (pathspec) Colon separated paths in which Ansible will search for Callback Plugins.
;callback_plugins={{ ANSIBLE_HOME ~ "/plugins/callback:/usr/share/ansible/plugins/callback" }}
# (pathspec) Colon separated paths in which Ansible will search for Cliconf Plugins.
;cliconf_plugins={{ ANSIBLE_HOME ~ "/plugins/cliconf:/usr/share/ansible/plugins/cliconf" }}
# (pathspec) Colon separated paths in which Ansible will search for Connection Plugins.
;connection_plugins={{ ANSIBLE_HOME ~ "/plugins/connection:/usr/share/ansible/plugins/connection" }}
# (boolean) Toggles debug output in Ansible. This is *very* verbose and can hinder multiprocessing. Debug output can also include secret information despite no_log settings being enabled, which means debug mode should not be used in production.
;debug=False
# (string) This indicates the command to use to spawn a shell under for Ansible's execution needs on a target. Users may need to change this in rare instances when shell usage is constrained, but in most cases it may be left as is.
;executable=/bin/sh
# (string) This option allows you to globally configure a custom path for 'local_facts' for the implied :ref:`ansible_collections.ansible.builtin.setup_module` task when using fact gathering.
# If not set, it will fallback to the default from the ``ansible.builtin.setup`` module: ``/etc/ansible/facts.d``.
# This does **not** affect user defined tasks that use the ``ansible.builtin.setup`` module.
# The real action being created by the implicit task is currently ``ansible.legacy.gather_facts`` module, which then calls the configured fact modules, by default this will be ``ansible.builtin.setup`` for POSIX systems but other platforms might have different defaults.
;fact_path=
# (pathspec) Colon separated paths in which Ansible will search for Jinja2 Filter Plugins.
;filter_plugins={{ ANSIBLE_HOME ~ "/plugins/filter:/usr/share/ansible/plugins/filter" }}
# (boolean) This option controls if notified handlers run on a host even if a failure occurs on that host.
# When false, the handlers will not run if a failure has occurred on a host.
# This can also be set per play or on the command line. See Handlers and Failure for more details.
;force_handlers=False
# (integer) Maximum number of forks Ansible will use to execute tasks on target hosts.
;forks=5
# (string) This setting controls the default policy of fact gathering (facts discovered about remote systems).
# This option can be useful for those wishing to save fact gathering time. Both 'smart' and 'explicit' will use the cache plugin.
;gathering=implicit
# (list) Set the `gather_subset` option for the :ref:`ansible_collections.ansible.builtin.setup_module` task in the implicit fact gathering. See the module documentation for specifics.
# It does **not** apply to user defined ``ansible.builtin.setup`` tasks.
;gather_subset=
# (integer) Set the timeout in seconds for the implicit fact gathering, see the module documentation for specifics.
# It does **not** apply to user defined :ref:`ansible_collections.ansible.builtin.setup_module` tasks.
;gather_timeout=
# (string) This setting controls how duplicate definitions of dictionary variables (aka hash, map, associative array) are handled in Ansible.
# This does not affect variables whose values are scalars (integers, strings) or arrays.
# **WARNING**, changing this setting is not recommended as this is fragile and makes your content (plays, roles, collections) non portable, leading to continual confusion and misuse. Don't change this setting unless you think you have an absolute need for it.
# We recommend avoiding reusing variable names and relying on the ``combine`` filter and ``vars`` and ``varnames`` lookups to create merged versions of the individual variables. In our experience this is rarely really needed and a sign that too much complexity has been introduced into the data structures and plays.
# For some uses you can also look into custom vars_plugins to merge on input, even substituting the default ``host_group_vars`` that is in charge of parsing the ``host_vars/`` and ``group_vars/`` directories. Most users of this setting are only interested in inventory scope, but the setting itself affects all sources and makes debugging even harder.
# All playbooks and roles in the official examples repos assume the default for this setting.
# Changing the setting to ``merge`` applies across variable sources, but many sources will internally still overwrite the variables. For example ``include_vars`` will dedupe variables internally before updating Ansible, with 'last defined' overwriting previous definitions in same file.
# The Ansible project recommends you **avoid ``merge`` for new projects.**
# It is the intention of the Ansible developers to eventually deprecate and remove this setting, but it is being kept as some users do heavily rely on it. New projects should **avoid 'merge'**.
;hash_behaviour=replace
# (pathlist) Comma separated list of Ansible inventory sources
;inventory=/etc/ansible/hosts
# (pathspec) Colon separated paths in which Ansible will search for HttpApi Plugins.
;httpapi_plugins={{ ANSIBLE_HOME ~ "/plugins/httpapi:/usr/share/ansible/plugins/httpapi" }}
# (float) This sets the interval (in seconds) of Ansible internal processes polling each other. Lower values improve performance with large playbooks at the expense of extra CPU load. Higher values are more suitable for Ansible usage in automation scenarios, when UI responsiveness is not required but CPU usage might be a concern.
# The default corresponds to the value hardcoded in Ansible <= 2.1
;internal_poll_interval=0.001
# (pathspec) Colon separated paths in which Ansible will search for Inventory Plugins.
;inventory_plugins={{ ANSIBLE_HOME ~ "/plugins/inventory:/usr/share/ansible/plugins/inventory" }}
# (string) This is a developer-specific feature that allows enabling additional Jinja2 extensions.
# See the Jinja2 documentation for details. If you do not know what these do, you probably don't need to change this setting :)
;jinja2_extensions=[]
# (boolean) This option preserves variable types during template operations.
;jinja2_native=False
# (boolean) Enables/disables the cleaning up of the temporary files Ansible used to execute the tasks on the remote.
# If this option is enabled it will disable ``ANSIBLE_PIPELINING``.
;keep_remote_files=False
# (boolean) Controls whether callback plugins are loaded when running /usr/bin/ansible. This may be used to log activity from the command line, send notifications, and so on. Callback plugins are always loaded for ``ansible-playbook``.
;bin_ansible_callbacks=False
# (tmppath) Temporary directory for Ansible to use on the controller.
;local_tmp={{ ANSIBLE_HOME ~ "/tmp" }}
# (list) List of logger names to filter out of the log file
;log_filter=
# (path) File to which Ansible will log on the controller. When empty logging is disabled.
;log_path=
# (pathspec) Colon separated paths in which Ansible will search for Lookup Plugins.
;lookup_plugins={{ ANSIBLE_HOME ~ "/plugins/lookup:/usr/share/ansible/plugins/lookup" }}
# (string) Sets the macro for the 'ansible_managed' variable available for :ref:`ansible_collections.ansible.builtin.template_module` and :ref:`ansible_collections.ansible.windows.win_template_module`. This is only relevant for those two modules.
;ansible_managed=Ansible managed
# (string) This sets the default arguments to pass to the ``ansible`` adhoc binary if no ``-a`` is specified.
;module_args=
# (string) Compression scheme to use when transferring Python modules to the target.
;module_compression=ZIP_DEFLATED
# (string) Module to use with the ``ansible`` AdHoc command, if none is specified via ``-m``.
;module_name=command
# (pathspec) Colon separated paths in which Ansible will search for Modules.
;library={{ ANSIBLE_HOME ~ "/plugins/modules:/usr/share/ansible/plugins/modules" }}
# (pathspec) Colon separated paths in which Ansible will search for Module utils files, which are shared by modules.
;module_utils={{ ANSIBLE_HOME ~ "/plugins/module_utils:/usr/share/ansible/plugins/module_utils" }}
# (pathspec) Colon separated paths in which Ansible will search for Netconf Plugins.
;netconf_plugins={{ ANSIBLE_HOME ~ "/plugins/netconf:/usr/share/ansible/plugins/netconf" }}
# (boolean) Toggle Ansible's display and logging of task details, mainly used to avoid security disclosures.
;no_log=False
# (boolean) Toggle Ansible logging to syslog on the target when it executes tasks. On Windows hosts this will disable a newer style PowerShell modules from writing to the event log.
;no_target_syslog=False
# (raw) What templating should return as a 'null' value. When not set it will let Jinja2 decide.
;null_representation=
# (integer) For asynchronous tasks in Ansible (covered in Asynchronous Actions and Polling), this is how often to check back on the status of those tasks when an explicit poll interval is not supplied. The default is a reasonably moderate 15 seconds which is a tradeoff between checking in frequently and providing a quick turnaround when something may have completed.
;poll_interval=15
# (path) Option for connections using a certificate or key file to authenticate, rather than an agent or passwords, you can set the default value here to avoid re-specifying --private-key with every invocation.
;private_key_file=
# (boolean) By default, imported roles publish their variables to the play and other roles, this setting can avoid that.
# This was introduced as a way to reset role variables to default values if a role is used more than once in a playbook.
# Included roles only make their variables public at execution, unlike imported roles which happen at playbook compile time.
;private_role_vars=False
# (integer) Port to use in remote connections, when blank it will use the connection plugin default.
;remote_port=
# (string) Sets the login user for the target machines
# When blank it uses the connection plugin's default, normally the user currently executing Ansible.
;remote_user=
# (pathspec) Colon separated paths in which Ansible will search for Roles.
;roles_path={{ ANSIBLE_HOME ~ "/roles:/usr/share/ansible/roles:/etc/ansible/roles" }}
# (string) Set the main callback used to display Ansible output. You can only have one at a time.
# You can have many other callbacks, but just one can be in charge of stdout.
# See :ref:`callback_plugins` for a list of available options.
;stdout_callback=default
# (string) Set the default strategy used for plays.
;strategy=linear
# (pathspec) Colon separated paths in which Ansible will search for Strategy Plugins.
;strategy_plugins={{ ANSIBLE_HOME ~ "/plugins/strategy:/usr/share/ansible/plugins/strategy" }}
# (boolean) Toggle the use of "su" for tasks.
;su=False
# (string) Syslog facility to use when Ansible logs to the remote target
;syslog_facility=LOG_USER
# (pathspec) Colon separated paths in which Ansible will search for Terminal Plugins.
;terminal_plugins={{ ANSIBLE_HOME ~ "/plugins/terminal:/usr/share/ansible/plugins/terminal" }}
# (pathspec) Colon separated paths in which Ansible will search for Jinja2 Test Plugins.
;test_plugins={{ ANSIBLE_HOME ~ "/plugins/test:/usr/share/ansible/plugins/test" }}
# (integer) This is the default timeout for connection plugins to use.
;timeout=10
# (string) Can be any connection plugin available to your ansible installation.
# There is also a (DEPRECATED) special 'smart' option, that will toggle between 'ssh' and 'paramiko' depending on controller OS and ssh versions.
;transport=ssh
# (boolean) When True, this causes ansible templating to fail steps that reference variable names that are likely typoed.
# Otherwise, any '{{ template_expression }}' that contains undefined variables will be rendered in a template or ansible action line exactly as written.
;error_on_undefined_vars=True
# (pathspec) Colon separated paths in which Ansible will search for Vars Plugins.
;vars_plugins={{ ANSIBLE_HOME ~ "/plugins/vars:/usr/share/ansible/plugins/vars" }}
# (string) The vault_id to use for encrypting by default. If multiple vault_ids are provided, this specifies which to use for encryption. The --encrypt-vault-id cli option overrides the configured value.
;vault_encrypt_identity=
# (string) The label to use for the default vault id label in cases where a vault id label is not provided
;vault_identity=default
# (list) A list of vault-ids to use by default. Equivalent to multiple --vault-id args. Vault-ids are tried in order.
;vault_identity_list=
# (string) If true, decrypting vaults with a vault id will only try the password from the matching vault-id
;vault_id_match=False
# (path) The vault password file to use. Equivalent to --vault-password-file or --vault-id
# If executable, it will be run and the resulting stdout will be used as the password.
;vault_password_file=
# (integer) Sets the default verbosity, equivalent to the number of ``-v`` passed in the command line.
;verbosity=0
# (boolean) Toggle to control the showing of deprecation warnings
;deprecation_warnings=True
# (boolean) Toggle to control showing warnings related to running devel
;devel_warning=True
# (boolean) Normally ``ansible-playbook`` will print a header for each task that is run. These headers will contain the name: field from the task if you specified one. If you didn't then ``ansible-playbook`` uses the task's action to help you tell which task is presently running. Sometimes you run many of the same action and so you want more information about the task to differentiate it from others of the same action. If you set this variable to True in the config then ``ansible-playbook`` will also include the task's arguments in the header.
# This setting defaults to False because there is a chance that you have sensitive values in your parameters and you do not want those to be printed.
# If you set this to True you should be sure that you have secured your environment's stdout (no one can shoulder surf your screen and you aren't saving stdout to an insecure file) or made sure that all of your playbooks explicitly added the ``no_log: True`` parameter to tasks which have sensitive values See How do I keep secret data in my playbook? for more information.
;display_args_to_stdout=False
# (boolean) Toggle to control displaying skipped task/host entries in a task in the default callback
;display_skipped_hosts=True
# (string) Root docsite URL used to generate docs URLs in warning/error text; must be an absolute URL with valid scheme and trailing slash.
;docsite_root_url=https://docs.ansible.com/ansible-core/
# (pathspec) Colon separated paths in which Ansible will search for Documentation Fragments Plugins.
;doc_fragment_plugins={{ ANSIBLE_HOME ~ "/plugins/doc_fragments:/usr/share/ansible/plugins/doc_fragments" }}
# (string) By default Ansible will issue a warning when a duplicate dict key is encountered in YAML.
# These warnings can be silenced by adjusting this setting to False.
;duplicate_dict_key=warn
# (boolean) Whether or not to enable the task debugger, this previously was done as a strategy plugin.
# Now all strategy plugins can inherit this behavior. The debugger defaults to activating when
# a task is failed on unreachable. Use the debugger keyword for more flexibility.
;enable_task_debugger=False
# (boolean) Toggle to allow missing handlers to become a warning instead of an error when notifying.
;error_on_missing_handler=True
# (list) Which modules to run during a play's fact gathering stage, using the default of 'smart' will try to figure it out based on connection type.
# If adding your own modules but you still want to use the default Ansible facts, you will want to include 'setup' or corresponding network module to the list (if you add 'smart', Ansible will also figure it out).
# This does not affect explicit calls to the 'setup' module, but does always affect the 'gather_facts' action (implicit or explicit).
;facts_modules=smart
# (boolean) Set this to "False" if you want to avoid host key checking by the underlying tools Ansible uses to connect to the host
;host_key_checking=True
# (boolean) Facts are available inside the `ansible_facts` variable, this setting also pushes them as their own vars in the main namespace.
# Unlike inside the `ansible_facts` dictionary, these will have an `ansible_` prefix.
;inject_facts_as_vars=True
# (string) Path to the Python interpreter to be used for module execution on remote targets, or an automatic discovery mode. Supported discovery modes are ``auto`` (the default), ``auto_silent``, ``auto_legacy``, and ``auto_legacy_silent``. All discovery modes employ a lookup table to use the included system Python (on distributions known to include one), falling back to a fixed ordered list of well-known Python interpreter locations if a platform-specific default is not available. The fallback behavior will issue a warning that the interpreter should be set explicitly (since interpreters installed later may change which one is used). This warning behavior can be disabled by setting ``auto_silent`` or ``auto_legacy_silent``. The value of ``auto_legacy`` provides all the same behavior, but for backwards-compatibility with older Ansible releases that always defaulted to ``/usr/bin/python``, will use that interpreter if present.
;interpreter_python=auto
# (boolean) If 'false', invalid attributes for a task will result in warnings instead of errors
;invalid_task_attribute_failed=True
# (boolean) Toggle to control showing warnings related to running a Jinja version older than required for jinja2_native
;jinja2_native_warning=True
# (boolean) By default Ansible will issue a warning when there are no hosts in the inventory.
# These warnings can be silenced by adjusting this setting to False.
;localhost_warning=True
# (int) Maximum size of files to be considered for diff display
;max_diff_size=104448
# (list) List of extensions to ignore when looking for modules to load
# This is for rejecting script and binary module fallback extensions
;module_ignore_exts={{(REJECT_EXTS + ('.yaml', '.yml', '.ini'))}}
# (bool) Enables whether module responses are evaluated for containing non UTF-8 data
# Disabling this may result in unexpected behavior
# Only ansible-core should evaluate this configuration
;module_strict_utf8_response=True
# (list) TODO: write it
;network_group_modules=eos, nxos, ios, iosxr, junos, enos, ce, vyos, sros, dellos9, dellos10, dellos6, asa, aruba, aireos, bigip, ironware, onyx, netconf, exos, voss, slxos
# (boolean) Previously Ansible would only clear some of the plugin loading caches when loading new roles, this led to some behaviours in which a plugin loaded in previous plays would be unexpectedly 'sticky'. This setting allows to return to that behaviour.
;old_plugin_cache_clear=False
# (path) A number of non-playbook CLIs have a ``--playbook-dir`` argument; this sets the default value for it.
;playbook_dir=
# (string) This sets which playbook dirs will be used as a root to process vars plugins, which includes finding host_vars/group_vars
;playbook_vars_root=top
# (path) A path to configuration for filtering which plugins installed on the system are allowed to be used.
# See :ref:`plugin_filtering_config` for details of the filter file's format.
# The default is /etc/ansible/plugin_filters.yml
;plugin_filters_cfg=
# (string) Attempts to set RLIMIT_NOFILE soft limit to the specified value when executing Python modules (can speed up subprocess usage on Python 2.x. See https://bugs.python.org/issue11284). The value will be limited by the existing hard limit. Default value of 0 does not attempt to adjust existing system-defined limits.
;python_module_rlimit_nofile=0
# (bool) This controls whether a failed Ansible playbook should create a .retry file.
;retry_files_enabled=False
# (path) This sets the path in which Ansible will save .retry files when a playbook fails and retry files are enabled.
# This file will be overwritten after each run with the list of failed hosts from all plays.
;retry_files_save_path=
# (str) This setting can be used to optimize vars_plugin usage depending on user's inventory size and play selection.
;run_vars_plugins=demand
# (bool) This adds the custom stats set via the set_stats plugin to the default output
;show_custom_stats=False
# (string) Action to take when a module parameter value is converted to a string (this does not affect variables). For string parameters, values such as '1.00', "['a', 'b',]", and 'yes', 'y', etc. will be converted by the YAML parser unless fully quoted.
# Valid options are 'error', 'warn', and 'ignore'.
# Since 2.8, this option defaults to 'warn' but will change to 'error' in 2.12.
;string_conversion_action=warn
# (boolean) Allows disabling of warnings related to potential issues on the system running ansible itself (not on the managed hosts)
# These may include warnings about 3rd party packages or other conditions that should be resolved if possible.
;system_warnings=True
# (boolean) This option defines whether the task debugger will be invoked on a failed task when ignore_errors=True is specified.
# True specifies that the debugger will honor ignore_errors, False will not honor ignore_errors.
;task_debugger_ignore_errors=True
# (integer) Set the maximum time (in seconds) that a task can run for.
# If set to 0 (the default) there is no timeout.
;task_timeout=0
# (string) Make ansible transform invalid characters in group names supplied by inventory sources.
;force_valid_group_names=never
# (boolean) Toggles the use of persistence for connections.
;use_persistent_connections=False
# (bool) A toggle to disable validating a collection's 'metadata' entry for a module_defaults action group. Metadata containing unexpected fields or value types will produce a warning when this is True.
;validate_action_group_metadata=True
# (list) Accept list for variable plugins that require it.
;vars_plugins_enabled=host_group_vars
# (list) Allows to change the group variable precedence merge order.
;precedence=all_inventory, groups_inventory, all_plugins_inventory, all_plugins_play, groups_plugins_inventory, groups_plugins_play
# (string) The salt to use for the vault encryption. If it is not provided, a random salt will be used.
;vault_encrypt_salt=
# (bool) Force 'verbose' option to use stderr instead of stdout
;verbose_to_stderr=False
# (integer) For asynchronous tasks in Ansible (covered in Asynchronous Actions and Polling), this is how long, in seconds, to wait for the task spawned by Ansible to connect back to the named pipe used on Windows systems. The default is 5 seconds. This can be too low on slower systems, or systems under heavy load.
# This is not the total time an async command can run for, but is a separate timeout to wait for an async command to start. The task will only start to be timed against its async_timeout once it has connected to the pipe, so the overall maximum duration the task can take will be extended by the amount specified here.
;win_async_startup_timeout=5
# (list) Check all of these extensions when looking for 'variable' files which should be YAML or JSON or vaulted versions of these.
# This affects vars_files, include_vars, inventory and vars plugins among others.
;yaml_valid_extensions=.yml, .yaml, .json
[privilege_escalation]
# (boolean) Display an agnostic become prompt instead of displaying a prompt containing the command line supplied become method
;agnostic_become_prompt=True
# (boolean) This setting controls if become is skipped when remote user and become user are the same. I.E root sudo to root.
# If executable, it will be run and the resulting stdout will be used as the password.
;become_allow_same_user=False
# (boolean) Toggles the use of privilege escalation, allowing you to 'become' another user after login.
;become=False
# (boolean) Toggle to prompt for privilege escalation password.
;become_ask_pass=False
# (string) executable to use for privilege escalation, otherwise Ansible will depend on PATH
;become_exe=
# (string) Flags to pass to the privilege escalation executable.
;become_flags=
# (string) Privilege escalation method to use when `become` is enabled.
;become_method=sudo
# (string) The user your login/remote user 'becomes' when using privilege escalation, most systems will use 'root' when no user is specified.
;become_user=root
[persistent_connection]
# (path) Specify where to look for the ansible-connection script. This location will be checked before searching $PATH.
# If null, ansible will start with the same directory as the ansible script.
;ansible_connection_path=
# (int) This controls the amount of time to wait for response from remote device before timing out persistent connection.
;command_timeout=30
# (integer) This controls the retry timeout for persistent connection to connect to the local domain socket.
;connect_retry_timeout=15
# (integer) This controls how long the persistent connection will remain idle before it is destroyed.
;connect_timeout=30
# (path) Path to socket to be used by the connection persistence system.
;control_path_dir={{ ANSIBLE_HOME ~ "/pc" }}
[connection]
# (boolean) This is a global option, each connection plugin can override either by having more specific options or not supporting pipelining at all.
# Pipelining, if supported by the connection plugin, reduces the number of network operations required to execute a module on the remote server, by executing many Ansible modules without actual file transfer.
# It can result in a very significant performance improvement when enabled.
# However this conflicts with privilege escalation (become). For example, when using 'sudo:' operations you must first disable 'requiretty' in /etc/sudoers on all managed hosts, which is why it is disabled by default.
# This setting will be disabled if ``ANSIBLE_KEEP_REMOTE_FILES`` is enabled.
;pipelining=False
[colors]
# (string) Defines the color to use on 'Changed' task status
;changed=yellow
# (string) Defines the default color to use for ansible-console
;console_prompt=white
# (string) Defines the color to use when emitting debug messages
;debug=dark gray
# (string) Defines the color to use when emitting deprecation messages
;deprecate=purple
# (string) Defines the color to use when showing added lines in diffs
;diff_add=green
# (string) Defines the color to use when showing diffs
;diff_lines=cyan
# (string) Defines the color to use when showing removed lines in diffs
;diff_remove=red
# (string) Defines the color to use when emitting error messages
;error=red
# (string) Defines the color to use for highlighting
;highlight=white
# (string) Defines the color to use when showing 'OK' task status
;ok=green
# (string) Defines the color to use when showing 'Skipped' task status
;skip=cyan
# (string) Defines the color to use on 'Unreachable' status
;unreachable=bright red
# (string) Defines the color to use when emitting verbose messages. i.e those that show with '-v's.
;verbose=blue
# (string) Defines the color to use when emitting warning messages
;warn=bright purple
[selinux]
# (boolean) This setting causes libvirt to connect to lxc containers by passing --noseclabel to virsh. This is necessary when running on systems which do not have SELinux.
;libvirt_lxc_noseclabel=False
# (list) Some filesystems do not support safe operations and/or return inconsistent errors, this setting makes Ansible 'tolerate' those in the list w/o causing fatal errors.
# Data corruption may occur and writes are not always verified when a filesystem is in the list.
;special_context_filesystems=fuse, nfs, vboxsf, ramfs, 9p, vfat
[diff]
# (bool) Configuration toggle to tell modules to show differences when in 'changed' status, equivalent to ``--diff``.
;always=False
# (integer) How many lines of context to show when displaying the differences between files.
;context=3
[galaxy]
# (path) The directory that stores cached responses from a Galaxy server.
# This is only used by the ``ansible-galaxy collection install`` and ``download`` commands.
# Cache files inside this dir will be ignored if they are world writable.
;cache_dir={{ ANSIBLE_HOME ~ "/galaxy_cache" }}
# (bool) whether ``ansible-galaxy collection install`` should warn about ``--collections-path`` missing from configured :ref:`collections_paths`
;collections_path_warning=True
# (path) Collection skeleton directory to use as a template for the ``init`` action in ``ansible-galaxy collection``, same as ``--collection-skeleton``.
;collection_skeleton=
# (list) patterns of files to ignore inside a Galaxy collection skeleton directory
;collection_skeleton_ignore=^.git$, ^.*/.git_keep$
# (bool) Disable GPG signature verification during collection installation.
;disable_gpg_verify=False
# (bool) Some steps in ``ansible-galaxy`` display a progress wheel which can cause issues on certain displays or when outputting the stdout to a file.
# This config option controls whether the display wheel is shown or not.
# The default is to show the display wheel if stdout has a tty.
;display_progress=
# (path) Configure the keyring used for GPG signature verification during collection installation and verification.
;gpg_keyring=
# (boolean) If set to yes, ansible-galaxy will not validate TLS certificates. This can be useful for testing against a server with a self-signed certificate.
;ignore_certs=
# (list) A list of GPG status codes to ignore during GPG signature verification. See L(https://github.com/gpg/gnupg/blob/master/doc/DETAILS#general-status-codes) for status code descriptions.
# If fewer signatures successfully verify the collection than `GALAXY_REQUIRED_VALID_SIGNATURE_COUNT`, signature verification will fail even if all error codes are ignored.
;ignore_signature_status_codes=
# (str) The number of signatures that must be successful during GPG signature verification while installing or verifying collections.
# This should be a positive integer or all to indicate all signatures must successfully validate the collection.
# Prepend + to the value to fail if no valid signatures are found for the collection.
;required_valid_signature_count=1
# (path) Role skeleton directory to use as a template for the ``init`` action in ``ansible-galaxy``/``ansible-galaxy role``, same as ``--role-skeleton``.
;role_skeleton=
# (list) patterns of files to ignore inside a Galaxy role or collection skeleton directory
;role_skeleton_ignore=^.git$, ^.*/.git_keep$
# (string) URL to prepend when roles don't specify the full URI, assume they are referencing this server as the source.
;server=https://galaxy.ansible.com
# (list) A list of Galaxy servers to use when installing a collection.
# The value corresponds to the config ini header ``[galaxy_server.{{item}}]`` which defines the server details.
# See :ref:`galaxy_server_config` for more details on how to define a Galaxy server.
# The order of servers in this list is used to as the order in which a collection is resolved.
# Setting this config option will ignore the :ref:`galaxy_server` config option.
;server_list=
# (int) The default timeout for Galaxy API calls. Galaxy servers that don't configure a specific timeout will fall back to this value.
;server_timeout=60
# (path) Local path to galaxy access token file
;token_path={{ ANSIBLE_HOME ~ "/galaxy_token" }}
[inventory]
# (string) This setting changes the behaviour of mismatched host patterns, it allows you to force a fatal error, a warning or just ignore it
;host_pattern_mismatch=warning
# (boolean) If 'true', it is a fatal error when any given inventory source cannot be successfully parsed by any available inventory plugin; otherwise, this situation only attracts a warning.
;any_unparsed_is_failed=False
# (bool) Toggle to turn on inventory caching.
# This setting has been moved to the individual inventory plugins as a plugin option :ref:`inventory_plugins`.
# The existing configuration settings are still accepted with the inventory plugin adding additional options from inventory configuration.
# This message will be removed in 2.16.
;cache=False
# (string) The plugin for caching inventory.
# This setting has been moved to the individual inventory plugins as a plugin option :ref:`inventory_plugins`.
# The existing configuration settings are still accepted with the inventory plugin adding additional options from inventory and fact cache configuration.
# This message will be removed in 2.16.
;cache_plugin=
# (string) The inventory cache connection.
# This setting has been moved to the individual inventory plugins as a plugin option :ref:`inventory_plugins`.
# The existing configuration settings are still accepted with the inventory plugin adding additional options from inventory and fact cache configuration.
# This message will be removed in 2.16.
;cache_connection=
# (string) The table prefix for the cache plugin.
# This setting has been moved to the individual inventory plugins as a plugin option :ref:`inventory_plugins`.
# The existing configuration settings are still accepted with the inventory plugin adding additional options from inventory and fact cache configuration.
# This message will be removed in 2.16.
;cache_prefix=ansible_inventory_
# (string) Expiration timeout for the inventory cache plugin data.
# This setting has been moved to the individual inventory plugins as a plugin option :ref:`inventory_plugins`.
# The existing configuration settings are still accepted with the inventory plugin adding additional options from inventory and fact cache configuration.
# This message will be removed in 2.16.
;cache_timeout=3600
# (list) List of enabled inventory plugins, it also determines the order in which they are used.
;enable_plugins=host_list, script, auto, yaml, ini, toml
# (bool) Controls if ansible-inventory will accurately reflect Ansible's view into inventory or its optimized for exporting.
;export=False
# (list) List of extensions to ignore when using a directory as an inventory source
;ignore_extensions={{(REJECT_EXTS + ('.orig', '.ini', '.cfg', '.retry'))}}
# (list) List of patterns to ignore when using a directory as an inventory source
;ignore_patterns=
# (bool) If 'true' it is a fatal error if every single potential inventory source fails to parse, otherwise this situation will only attract a warning.
;unparsed_is_failed=False
# (boolean) By default Ansible will issue a warning when no inventory was loaded and notes that it will use an implicit localhost-only inventory.
# These warnings can be silenced by adjusting this setting to False.
;inventory_unparsed_warning=True
[netconf_connection]
# (string) This variable is used to enable bastion/jump host with netconf connection. If set to True the bastion/jump host ssh settings should be present in ~/.ssh/config file, alternatively it can be set to custom ssh configuration file path to read the bastion/jump host settings.
;ssh_config=
[paramiko_connection]
# (boolean) TODO: write it
;host_key_auto_add=False
# (boolean) TODO: write it
;look_for_keys=True
[jinja2]
# (list) This list of filters avoids 'type conversion' when templating variables
# Useful when you want to avoid conversion into lists or dictionaries for JSON strings, for example.
;dont_type_filters=string, to_json, to_nice_json, to_yaml, to_nice_yaml, ppretty, json
[tags]
# (list) default list of tags to run in your plays, Skip Tags has precedence.
;run=
# (list) default list of tags to skip in your plays, has precedence over Run Tags
;skip=

69
blog.md Normal file
View File

@@ -0,0 +1,69 @@
---
title: "Automating My Homelab: From Bare Metal to Kubernetes with Ansible"
date: 2025-07-27
author: "TuDatTr"
tags: ["Ansible", "Proxmox", "Kubernetes", "K3s", "IaC", "Homelab"]
---
## The Homelab: Repeatable, Automated, and Documented
For many tech enthusiasts, a homelab is a playground for learning, experimenting, and self-hosting services. But as the complexity grows, so does the management overhead. Manually setting up virtual machines, configuring networks, and deploying applications becomes a tedious and error-prone process. This lead me to building my homelab as Infrastructure as Code (IaC) with Ansible.
This blog post walks you through my Ansible project, which automates the entire lifecycle of my homelab—from provisioning VMs on Proxmox to deploying a production-ready K3s Kubernetes cluster.
## Why Ansible?
When I decided to automate my infrastructure, I considered several tools. I chose Ansible for its simplicity, agentless architecture, and gentle learning curve. Writing playbooks in YAML felt declarative and intuitive, and the vast collection of community-supported modules meant I wouldn't have to reinvent the wheel.
## The Architecture: A Multi-Layered Approach
My Ansible project is designed to be modular and scalable, with a clear separation of concerns. It's built around a collection of roles, each responsible for a specific component of the infrastructure.
### Layer 1: Proxmox Provisioning
The foundation of my homelab is Proxmox VE. The `proxmox` role is the first step in the automation pipeline. It handles:
- **VM and Container Creation:** Using a simple YAML definition in my `vars` files, I can specify the number of VMs and containers to create, their resources (CPU, memory, disk), and their base operating system images.
- **Cloud-Init Integration:** For VMs, I leverage Cloud-Init to perform initial setup, such as setting the hostname, creating users, and injecting SSH keys for Ansible to connect to.
- **Hardware Passthrough:** The role also configures hardware passthrough for devices like Intel Quick Sync for video transcoding in my media server.
### Layer 2: The K3s Kubernetes Cluster
With the base VMs ready, the next step is to build the Kubernetes cluster. I chose K3s for its lightweight footprint and ease of installation. The setup is divided into several roles:
- `k3s_server`: This role bootstraps the first master node and then adds additional master nodes to create a highly available control plane.
- `k3s_agent`: This role joins the worker nodes to the cluster.
- `k3s_loadbalancer`: A dedicated VM running Nginx is set up to act as a load balancer for the K3s API server, ensuring a stable endpoint for `kubectl` and other clients.
### Layer 3: Applications and Services
Once the Kubernetes cluster is up and running, it's time to deploy applications. My project includes roles for:
- `docker_host`: For services that are better suited to run in a traditional Docker environment, this role sets up and configures Docker hosts.
- `kubernetes_argocd`: I use Argo CD for GitOps-based continuous delivery. This role deploys Argo CD to the cluster and configures it to sync with my application repositories.
- `reverse_proxy`: Caddy is my reverse proxy of choice, and this role automates its installation and configuration, including obtaining SSL certificates from Let's Encrypt.
## Putting It All Together: The Power of Playbooks
The playbooks in the `playbooks/` directory tie everything together. For example, the `kubernetes_setup.yml` playbook runs all the necessary roles in the correct order to bring up the entire Kubernetes cluster from scratch.
```yaml
# playbooks/kubernetes_setup.yml
---
- name: Set up Kubernetes Cluster
hosts: all
gather_facts: true
roles:
- role: k3s_server
- role: k3s_agent
- role: k3s_loadbalancer
- role: kubernetes_argocd
```
## Final Thoughts and Future Plans
This Ansible project has transformed my homelab from a collection of manually configured machines into a fully automated and reproducible environment. I can now tear down and rebuild my entire infrastructure with a single command, which gives me the confidence to experiment without fear of breaking things.
While the project is highly tailored to my specific needs, I hope this overview provides some inspiration for your own automation journey. The principles of IaC and the power of tools like Ansible can be applied to any environment, big or small.
What's next? I plan to explore more advanced Kubernetes concepts, such as Cilium for networking and policy, and integrate more of my self-hosted services into the GitOps workflow with Argo CD. The homelab is never truly "finished," and that's what makes it so much fun.

75
changelog.md Normal file
View File

@@ -0,0 +1,75 @@
# Changelog
Technical evolution of the infrastructure stack, tracking the migration from standalone Docker hosts to a fully automated, GitOps-managed Kubernetes cluster.
## Phase 5: GitOps & Cluster Hardening (July 2025 - Present)
*Shifted control plane management to ArgoCD and expanded storage capabilities.*
- **GitOps Implementation**:
- Deployed **ArgoCD** in an App-of-Apps pattern to manage cluster state (`89c51aa`).
- Integrated **Sealed Secrets** (implied via vault diffs) and **Cert-Manager** for automated TLS management (`76000f8`).
- Migrated core services (Traefik, MetalLB) to Helm charts managed via ArgoCD manifests.
- **Storage Architecture**:
- Implemented **Longhorn** with iSCSI support for distributed block storage (`48aec11`).
- Added **NFS Provisioner** (`e1a2248`) for ReadWriteMany volumes capabilities.
- **Networking**:
- Centralized primary server IP logic (`97a5d6c`) to support HA control plane capability.
- Replaced Netcup DNS webhooks with **Cloudflare** for Caddy ACME challenges (`9cb90a8`).
- **Observability**:
- Added **healthcheck** definitions to Docker Compose services (`0e8e07e`) and K3s probes.
## Phase 4: IaaC Refactoring & Proxmox API Integration (Nov 2024 - June 2025)
*Refactored Ansible roles for modularity and implemented Proxmox API automation for "click-less" provisioning.*
- **Proxmox Automation**:
- Developed `roles/proxmox` to interface with Proxmox API: automated VM creation, cloning from templates, and Cloud-Init injection (`f2ea03b`).
- Configured **PCI Passthrough** (`591342f`) and hardware acceleration for media transcoding nodes.
- Added cron-based VM state reconciliation (`a1da69a`).
- **Ansible Restructuring**:
- **Inventory Refactor**: Moved from root-level inventory files to a hierarchical `vars/` structure (`609e000`).
- **Linting Pipeline**: Integrated `ansible-lint` and `pre-commit` hooks (`6eef96b`) to enforce YAML standards and best practices.
- **Vault Security**: Configured `.gitattributes` to enable `ansible-vault view` for cleartext diffs in git (`c3905ed`).
- **Identity Management**:
- Deployed **Keycloak** (`42196a3`) for OIDC/SAML authentication across the stack.
## Phase 3: The Kubernetes Migration (Sep 2024 - Oct 2024)
*Architectural pivot from Docker Compose to K3s.*
- **Control Plane Setup**:
- Bootstrapped **K3s** cluster with dedicated server/agent split.
- Configured **HAProxy/Nginx** load balancers (`51a49d0`) for API server high availability.
- **Node Provisioning**:
- Standardized node bootstrapping (kernel modules, sysctl params) for K3s compatibility.
- Deployed specialized storage nodes for Longhorn (`7d58de9`).
- **Decommissioning**:
- Drained and removed legacy Docker hosts (`0aed818`).
- Migrated stateful workloads (Postgres) to cluster-managed resources.
## Phase 2: Docker Service Expansion (2023 - 2024)
*Vertical scaling of Docker hosts and introduction of the monitoring stack.*
- **Service Stack**:
- Deployed the **\*arr suite** (Sonarr, Radarr, etc.) and Jellyfin with hardware mapping (`3d7f143`).
- Integrated **Paperless-ngx** with Redis and Tika consumption (`3f88065`).
- Self-hosted **Gitea** and **GitLab** (later removed) for source control.
- **Observability V1**:
- Deployed **Prometheus** and **Grafana** stack (`b3ae5ef`).
- Added **Node Exporter** and **SmartCTL Exporter** (`0a361d9`) to bare metal hosts.
- Implemented **Uptime Kuma** for external availability monitoring.
- **Reverse Proxy**:
- Transitioned ingress from Traefik v2 to **Nginx Proxy Manager**, then to **Caddy** for simpler configuration management (`a9af3c7`, `1a1b8cb`).
## Phase 1: Genesis & Networking (Late 2022)
*Initial infrastructure bring-up.*
- **Base Configuration**:
- Established Ansible role structure for baseline system hardening (SSH, users, packages).
- Configured **Wireguard** mesh for secure inter-node communication (`2ba4259`).
- Set up **Backblaze B2** offsite backups via Restic/Rclone (`b371e24`).
- **Network**:
- Experimented with **macvlan** Docker networks for direct container IP assignment.

View File

@@ -1,10 +0,0 @@
---
- name: Run the common role on k3s
hosts: k3s
gather_facts: yes
vars_files:
- secrets.yml
roles:
- role: common
tags:
- common

16
db.yml
View File

@@ -1,16 +0,0 @@
---
- name: Set up Servers
hosts: db
gather_facts: yes
vars_files:
- secrets.yml
roles:
- role: common
tags:
- common
- role: postgres
tags:
- postgres
- role: node_exporter
tags:
- node_exporter

View File

@@ -0,0 +1,750 @@
# Proxmox Cluster Debugging Plan
## Overview
This document outlines the plan to debug the Proxmox cluster issue where nodes `mii01` and `naruto01` are showing up with `?` in the Web UI, indicating a potential version mismatch.
## Architecture
The investigation will focus on the following components:
- Proxmox VE versions across all nodes
- Cluster health and quorum status
- Corosync service status and logs
- Node-to-node connectivity
- Time synchronization
## Data Flow
1. **Version Check:** Verify Proxmox VE versions on all nodes.
2. **Cluster Health:** Check cluster status and quorum.
3. **Corosync Logs:** Analyze Corosync logs for errors.
4. **Connectivity:** Verify network connectivity between nodes.
5. **Time Synchronization:** Ensure time is synchronized across all nodes.
## Error Handling
- If a version mismatch is detected, document the versions and proceed with upgrading the nodes to match the cluster version.
- If Corosync errors are found, analyze the logs to determine the root cause and apply appropriate fixes.
- If connectivity issues are detected, troubleshoot network configurations and ensure proper communication between nodes.
## Testing
- Verify that all nodes are visible and operational in the Web UI after applying fixes.
- Ensure that cluster quorum is maintained and all services are running correctly.
## Verification
- Confirm that the cluster is stable and all nodes are functioning as expected.
- Document any changes made and the steps taken to resolve the issue.
## Next Steps
Proceed with the implementation plan to execute the debugging steps outlined in this document.
## Findings
The investigation revealed several critical issues:
1. **Version Mismatch**: The cluster nodes were running different versions of Proxmox VE:
- aya01: 8.1.4 (kernel 6.5.11-8-pve)
- lulu: 8.2.2 (kernel 6.8.4-2-pve)
- inko01: 8.4.0 (kernel 6.8.12-9-pve)
- naruto01: 8.4.0 (kernel 6.8.12-9-pve)
- mii01: 9.0.3 (kernel 6.14.8-2-pve)
2. **Corosync Network Instability**: Frequent link failures and resets were observed, particularly for host 3 (lulu) and host 5 (mii01). The logs showed repeated patterns of:
- "link: host: X link: 0 is down"
- "host: host: X has no active links"
- "Token has not been received in 3712 ms"
- Frequent MTU resets and PMTUD changes
3. **Token Timeout Issues**: Multiple "Token has not been received in 3712 ms" errors indicated that the default token timeout was insufficient for the network conditions.
## Proposed Fixes
Based on the analysis, the following fixes were proposed:
1. **Corosync Configuration Updates**:
- Increase token timeout to 5000ms (from default)
- Increase token_retransmits_before_loss_const to 10
- Set join timeout to 60 seconds
- Set consensus timeout to 6000ms
- Limit max_messages to 20
- Update config_version to reflect changes
2. **Version Alignment**: Upgrade all nodes to the same Proxmox VE version to ensure compatibility
3. **Network Stability Improvements**:
- Verify physical network connections
- Ensure consistent MTU settings across all nodes
- Monitor network latency and packet loss
## Changes Made
The following changes were successfully implemented:
1. **Corosync Configuration**: Updated `/etc/pve/corosync.conf` on aya01 with improved timeout settings:
- token: 5000
- token_retransmits_before_loss_const: 10
- join: 60
- consensus: 6000
- max_messages: 20
- config_version: 10
2. **Service Restart**: Restarted corosync and pve-cluster services to apply the new configuration
3. **Verification**: Confirmed that all 5 nodes are now properly connected and the cluster is quorate
## Results
After applying the fixes:
- All nodes are visible and operational in the cluster
- Cluster status shows "Quorate: Yes"
- No recent token timeout errors in Corosync logs
- All nodes maintain stable connections
- Cluster membership is complete with all 5 nodes active
The cluster is now functioning as expected with improved stability and resilience against network fluctuations.
## Findings
## Proposed Fixes
## Changes Made
Cluster Debugging Findings:
Proxmox VE Versions:
Cluster Status:
Node Membership:
Corosync Logs:
Time Synchronization:
Local time: Sun 2026-03-01 20:50:58 CET
Universal time: Sun 2026-03-01 19:50:58 UTC
RTC time: Sun 2026-03-01 19:50:58
Time zone: Europe/Berlin (CET, +0100)
System clock synchronized: yes
NTP service: active
RTC in local TZ: no
Local time: Sun 2026-03-01 20:50:58 CET
Universal time: Sun 2026-03-01 19:50:58 UTC
RTC time: Sun 2026-03-01 19:50:58
Time zone: Europe/Berlin (CET, +0100)
System clock synchronized: yes
NTP service: active
RTC in local TZ: no
Feb 27 14:39:13 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 27 14:39:13 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 27 14:39:14 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 27 14:39:14 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 27 14:39:14 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 27 14:57:21 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 27 14:57:21 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 27 14:57:21 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 27 14:57:24 aya01 corosync[1049]: [KNET ] rx: host: 3 link: 0 is up
Feb 27 14:57:24 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 27 14:57:24 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 27 14:57:24 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 27 15:48:27 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 27 15:48:27 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 27 15:48:27 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 27 15:48:29 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 27 15:48:29 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 27 15:48:29 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 27 18:46:04 aya01 corosync[1049]: [TOTEM ] Retransmit List: 48a1b
Feb 27 19:03:17 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 27 19:03:17 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 27 19:03:17 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 27 19:03:20 aya01 corosync[1049]: [KNET ] rx: host: 3 link: 0 is up
Feb 27 19:03:20 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 27 19:03:20 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 27 19:03:20 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 27 19:41:49 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 27 19:41:49 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 27 19:41:49 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 27 19:41:50 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 27 19:41:50 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 27 19:41:51 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 27 20:12:44 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 27 20:12:44 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 27 20:12:44 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 27 20:12:47 aya01 corosync[1049]: [KNET ] rx: host: 3 link: 0 is up
Feb 27 20:12:47 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 27 20:12:47 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 27 20:12:47 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 27 20:19:21 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 27 20:19:21 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 27 20:19:21 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 27 20:19:24 aya01 corosync[1049]: [KNET ] rx: host: 3 link: 0 is up
Feb 27 20:19:24 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 27 20:19:24 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 27 20:19:24 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 27 21:40:33 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 27 21:40:33 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 27 21:40:33 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 27 21:40:33 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 27 21:40:33 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 27 21:40:33 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 27 21:42:58 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 27 21:42:58 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 27 21:42:58 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 27 21:43:00 aya01 corosync[1049]: [KNET ] rx: host: 3 link: 0 is up
Feb 27 21:43:00 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 27 21:43:00 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 27 21:43:00 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 27 21:49:55 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 27 21:49:55 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 27 21:49:55 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 27 21:49:57 aya01 corosync[1049]: [KNET ] rx: host: 3 link: 0 is up
Feb 27 21:49:57 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 27 21:49:57 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 27 21:49:57 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 27 22:53:39 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 27 22:53:39 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 27 22:53:39 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 27 22:53:40 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 27 22:53:40 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 27 22:53:40 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 27 23:04:51 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 27 23:04:51 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 27 23:04:51 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 27 23:04:54 aya01 corosync[1049]: [KNET ] rx: host: 3 link: 0 is up
Feb 27 23:04:54 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 27 23:04:54 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 27 23:04:54 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 28 00:18:24 aya01 corosync[1049]: [TOTEM ] Retransmit List: 5d988
Feb 28 00:18:24 aya01 corosync[1049]: [TOTEM ] Retransmit List: 5d989
Feb 28 00:18:24 aya01 corosync[1049]: [TOTEM ] Retransmit List: 5d98a
Feb 28 00:18:24 aya01 corosync[1049]: [TOTEM ] Retransmit List: 5d98b
Feb 28 00:18:26 aya01 corosync[1049]: [TOTEM ] Retransmit List: 5d98c
Feb 28 00:18:26 aya01 corosync[1049]: [TOTEM ] Retransmit List: 5d98d
Feb 28 00:53:03 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 28 00:53:03 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 00:53:03 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 28 00:53:03 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 28 00:53:03 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 00:53:03 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 28 01:36:27 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 28 01:36:27 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 01:36:27 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 28 01:36:29 aya01 corosync[1049]: [KNET ] rx: host: 3 link: 0 is up
Feb 28 01:36:29 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 28 01:36:29 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 01:36:29 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 28 03:20:45 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 28 03:20:45 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 03:20:45 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 28 03:20:45 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 28 03:20:45 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 03:20:45 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 28 05:47:56 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 28 05:47:56 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 05:47:56 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 28 05:47:57 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 28 05:47:57 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 05:47:57 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 28 05:57:03 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 28 05:57:03 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 05:57:03 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 28 05:57:04 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 28 05:57:04 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 05:57:04 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 28 06:10:35 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 28 06:10:35 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 06:10:35 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 28 06:10:37 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 28 06:10:37 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 06:10:37 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 28 07:09:26 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 28 07:09:26 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 07:09:26 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 28 07:09:26 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 28 07:09:26 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 07:09:26 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 28 07:38:11 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 28 07:38:11 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 07:38:11 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 28 07:38:11 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 28 07:38:11 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 07:38:11 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 28 08:00:03 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 28 08:00:03 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 08:00:03 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 28 08:00:03 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 28 08:00:03 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 08:00:04 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 28 08:23:05 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 28 08:23:05 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 08:23:05 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 28 08:23:08 aya01 corosync[1049]: [KNET ] rx: host: 3 link: 0 is up
Feb 28 08:23:08 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 28 08:23:08 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 08:23:08 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 28 08:36:41 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 28 08:36:41 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 08:36:41 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 28 08:36:41 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 28 08:36:41 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 08:36:42 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 28 08:45:39 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 28 08:45:39 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 08:45:39 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 28 08:45:42 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 28 08:45:42 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 08:45:42 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 28 09:23:56 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 28 09:23:56 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 09:23:56 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 28 09:23:58 aya01 corosync[1049]: [KNET ] rx: host: 3 link: 0 is up
Feb 28 09:23:58 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 28 09:23:58 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 09:23:58 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 28 09:34:48 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 28 09:34:48 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 09:34:48 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 28 09:34:51 aya01 corosync[1049]: [KNET ] rx: host: 3 link: 0 is up
Feb 28 09:34:51 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 28 09:34:51 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 09:34:51 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 28 09:54:09 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 28 09:54:09 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 09:54:09 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 28 09:54:11 aya01 corosync[1049]: [KNET ] rx: host: 3 link: 0 is up
Feb 28 09:54:11 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 28 09:54:11 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 09:54:11 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 28 10:18:51 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 28 10:18:51 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 10:18:51 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 28 10:18:52 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 28 10:18:52 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 10:18:52 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 28 10:31:07 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 28 10:31:07 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 10:31:07 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 28 10:31:09 aya01 corosync[1049]: [KNET ] rx: host: 3 link: 0 is up
Feb 28 10:31:09 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 28 10:31:09 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 10:31:09 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 28 12:25:03 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 28 12:25:03 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 12:25:03 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 28 12:25:05 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 28 12:25:05 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 12:25:05 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 28 12:38:06 aya01 corosync[1049]: [TOTEM ] Retransmit List: 8c34b
Feb 28 12:38:06 aya01 corosync[1049]: [TOTEM ] Retransmit List: 8c34e
Feb 28 12:38:06 aya01 corosync[1049]: [TOTEM ] Retransmit List: 8c355
Feb 28 14:39:02 aya01 corosync[1049]: [TOTEM ] Token has not been received in 3712 ms
Feb 28 18:31:43 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 28 18:31:43 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 18:31:43 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 28 18:31:43 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 28 18:31:43 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 18:31:43 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 28 19:45:51 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 28 19:45:51 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 19:45:51 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 28 19:45:53 aya01 corosync[1049]: [KNET ] rx: host: 3 link: 0 is up
Feb 28 19:45:53 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 28 19:45:53 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 19:45:53 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 28 20:22:47 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 28 20:22:47 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 20:22:47 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 28 20:22:47 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 28 20:22:47 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 20:22:47 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 28 21:26:43 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 28 21:26:43 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 21:26:43 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 28 21:26:45 aya01 corosync[1049]: [KNET ] rx: host: 3 link: 0 is up
Feb 28 21:26:45 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 28 21:26:45 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 21:26:45 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 28 21:50:41 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 28 21:50:41 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 21:50:41 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 28 21:50:43 aya01 corosync[1049]: [KNET ] rx: host: 3 link: 0 is up
Feb 28 21:50:43 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 28 21:50:43 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 21:50:44 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Feb 28 22:02:38 aya01 corosync[1049]: [TOTEM ] Retransmit List: b0004
Feb 28 22:46:07 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Feb 28 22:46:07 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 22:46:07 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Feb 28 22:46:09 aya01 corosync[1049]: [KNET ] rx: host: 3 link: 0 is up
Feb 28 22:46:09 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Feb 28 22:46:09 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Feb 28 22:46:09 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 00:26:09 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Mar 01 00:26:09 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 00:26:09 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Mar 01 00:26:12 aya01 corosync[1049]: [KNET ] rx: host: 3 link: 0 is up
Mar 01 00:26:12 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Mar 01 00:26:12 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 00:26:12 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 01:28:54 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Mar 01 01:28:54 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 01:28:54 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Mar 01 01:28:57 aya01 corosync[1049]: [KNET ] rx: host: 3 link: 0 is up
Mar 01 01:28:57 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Mar 01 01:28:57 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 01:28:57 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 01:56:02 aya01 corosync[1049]: [TOTEM ] Token has not been received in 3712 ms
Mar 01 04:30:28 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Mar 01 04:30:28 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 04:30:28 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Mar 01 04:30:30 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Mar 01 04:30:30 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 04:30:30 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 04:58:04 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Mar 01 04:58:04 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 04:58:04 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Mar 01 04:58:04 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Mar 01 04:58:04 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 04:58:05 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 05:02:59 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Mar 01 05:02:59 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 05:02:59 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Mar 01 05:03:02 aya01 corosync[1049]: [KNET ] rx: host: 3 link: 0 is up
Mar 01 05:03:02 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Mar 01 05:03:02 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 05:03:02 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 05:08:04 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Mar 01 05:08:04 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 05:08:04 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Mar 01 05:08:04 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Mar 01 05:08:04 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 05:08:04 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 05:17:55 aya01 corosync[1049]: [KNET ] link: host: 5 link: 0 is down
Mar 01 05:17:55 aya01 corosync[1049]: [KNET ] host: host: 5 (passive) best link: 0 (pri: 1)
Mar 01 05:17:55 aya01 corosync[1049]: [KNET ] host: host: 5 has no active links
Mar 01 05:17:57 aya01 corosync[1049]: [TOTEM ] Token has not been received in 3712 ms
Mar 01 05:17:58 aya01 corosync[1049]: [TOTEM ] A processor failed, forming new configuration: token timed out (4950ms), waiting 5940ms for consensus.
Mar 01 05:18:00 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 5 joined
Mar 01 05:18:00 aya01 corosync[1049]: [KNET ] host: host: 5 (passive) best link: 0 (pri: 1)
Mar 01 05:18:00 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 05:18:00 aya01 corosync[1049]: [QUORUM] Sync members[5]: 1 2 3 4 5
Mar 01 05:18:00 aya01 corosync[1049]: [TOTEM ] A new membership (1.49dc) was formed. Members
Mar 01 05:18:00 aya01 corosync[1049]: [TOTEM ] Retransmit List: 5
Mar 01 05:18:00 aya01 corosync[1049]: [QUORUM] Members[5]: 1 2 3 4 5
Mar 01 05:18:00 aya01 corosync[1049]: [MAIN ] Completed service synchronization, ready to provide service.
Mar 01 05:19:48 aya01 corosync[1049]: [KNET ] link: host: 2 link: 0 is down
Mar 01 05:19:48 aya01 corosync[1049]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 1)
Mar 01 05:19:48 aya01 corosync[1049]: [KNET ] host: host: 2 has no active links
Mar 01 05:19:50 aya01 corosync[1049]: [KNET ] rx: host: 2 link: 0 is up
Mar 01 05:19:50 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 2 joined
Mar 01 05:19:50 aya01 corosync[1049]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 1)
Mar 01 05:19:50 aya01 corosync[1049]: [TOTEM ] Token has not been received in 3712 ms
Mar 01 05:19:51 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 05:26:01 aya01 corosync[1049]: [KNET ] link: host: 2 link: 0 is down
Mar 01 05:26:01 aya01 corosync[1049]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 1)
Mar 01 05:26:01 aya01 corosync[1049]: [KNET ] host: host: 2 has no active links
Mar 01 05:26:01 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Mar 01 05:26:01 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 05:26:01 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Mar 01 05:26:03 aya01 corosync[1049]: [KNET ] rx: host: 2 link: 0 is up
Mar 01 05:26:03 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 2 joined
Mar 01 05:26:03 aya01 corosync[1049]: [KNET ] host: host: 2 (passive) best link: 0 (pri: 1)
Mar 01 05:26:03 aya01 corosync[1049]: [KNET ] rx: host: 3 link: 0 is up
Mar 01 05:26:03 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Mar 01 05:26:03 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 05:26:03 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 05:28:53 aya01 corosync[1049]: [TOTEM ] Retransmit List: b47
Mar 01 05:28:53 aya01 corosync[1049]: [TOTEM ] Retransmit List: b48
Mar 01 05:28:53 aya01 corosync[1049]: [TOTEM ] Retransmit List: b49
Mar 01 05:34:50 aya01 corosync[1049]: [TOTEM ] Retransmit List: 1118
Mar 01 05:47:20 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Mar 01 05:47:20 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 05:47:20 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Mar 01 05:47:22 aya01 corosync[1049]: [KNET ] rx: host: 3 link: 0 is up
Mar 01 05:47:22 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Mar 01 05:47:22 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 05:47:22 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 05:51:50 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Mar 01 05:51:50 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 05:51:50 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Mar 01 05:51:51 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Mar 01 05:51:51 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 05:51:51 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 05:55:01 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Mar 01 05:55:01 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 05:55:01 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Mar 01 05:55:01 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Mar 01 05:55:01 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 05:55:02 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 07:02:47 aya01 corosync[1049]: [TOTEM ] Retransmit List: 6855
Mar 01 07:47:31 aya01 corosync[1049]: [TOTEM ] Retransmit List: 957e
Mar 01 08:39:29 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Mar 01 08:39:29 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 08:39:29 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Mar 01 08:39:31 aya01 corosync[1049]: [KNET ] rx: host: 3 link: 0 is up
Mar 01 08:39:31 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Mar 01 08:39:31 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 08:39:31 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 09:39:45 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Mar 01 09:39:45 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 09:39:45 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Mar 01 09:39:46 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Mar 01 09:39:46 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 09:39:46 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 10:05:11 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Mar 01 10:05:11 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 10:05:11 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Mar 01 10:05:12 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Mar 01 10:05:12 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 10:05:12 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 10:09:14 aya01 corosync[1049]: [TOTEM ] Retransmit List: 12595
Mar 01 10:10:15 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Mar 01 10:10:15 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 10:10:15 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Mar 01 10:10:16 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Mar 01 10:10:16 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 10:10:16 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 11:10:56 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Mar 01 11:10:56 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 11:10:56 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Mar 01 11:10:57 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Mar 01 11:10:57 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 11:10:58 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 11:37:57 aya01 corosync[1049]: [TOTEM ] Retransmit List: 182e0
Mar 01 11:45:54 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Mar 01 11:45:54 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 11:45:54 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Mar 01 11:45:57 aya01 corosync[1049]: [KNET ] rx: host: 3 link: 0 is up
Mar 01 11:45:57 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Mar 01 11:45:57 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 11:45:57 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 11:59:48 aya01 corosync[1049]: [TOTEM ] Retransmit List: 1990c
Mar 01 13:14:45 aya01 corosync[1049]: [TOTEM ] Retransmit List: 1e4f2
Mar 01 15:08:28 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Mar 01 15:08:28 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 15:08:28 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Mar 01 15:08:30 aya01 corosync[1049]: [KNET ] rx: host: 3 link: 0 is up
Mar 01 15:08:30 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Mar 01 15:08:30 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 15:08:30 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 15:15:22 aya01 corosync[1049]: [KNET ] link: host: 5 link: 0 is down
Mar 01 15:15:22 aya01 corosync[1049]: [KNET ] host: host: 5 (passive) best link: 0 (pri: 1)
Mar 01 15:15:22 aya01 corosync[1049]: [KNET ] host: host: 5 has no active links
Mar 01 15:15:23 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 5 joined
Mar 01 15:15:23 aya01 corosync[1049]: [KNET ] host: host: 5 (passive) best link: 0 (pri: 1)
Mar 01 15:15:23 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 15:15:47 aya01 corosync[1049]: [TOTEM ] Retransmit List: 26281
Mar 01 15:16:35 aya01 corosync[1049]: [TOTEM ] Retransmit List: 26364
Mar 01 15:16:35 aya01 corosync[1049]: [TOTEM ] Retransmit List: 26365
Mar 01 15:16:35 aya01 corosync[1049]: [TOTEM ] Retransmit List: 26366
Mar 01 15:16:35 aya01 corosync[1049]: [TOTEM ] Retransmit List: 26367
Mar 01 15:16:35 aya01 corosync[1049]: [TOTEM ] Retransmit List: 26368
Mar 01 15:16:35 aya01 corosync[1049]: [TOTEM ] Retransmit List: 26369
Mar 01 15:16:35 aya01 corosync[1049]: [TOTEM ] Retransmit List: 2636a
Mar 01 15:17:24 aya01 corosync[1049]: [TOTEM ] Retransmit List: 26449
Mar 01 15:18:53 aya01 corosync[1049]: [TOTEM ] Retransmit List: 265dd
Mar 01 15:19:14 aya01 corosync[1049]: [TOTEM ] Token has not been received in 3712 ms
Mar 01 15:19:25 aya01 corosync[1049]: [TOTEM ] Retransmit List: 26684
Mar 01 15:22:35 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Mar 01 15:22:35 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 15:22:35 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Mar 01 15:22:38 aya01 corosync[1049]: [KNET ] rx: host: 3 link: 0 is up
Mar 01 15:22:38 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Mar 01 15:22:38 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 15:22:38 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 15:41:34 aya01 corosync[1049]: [TOTEM ] Token has not been received in 3712 ms
Mar 01 15:41:55 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Mar 01 15:41:55 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 15:41:55 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Mar 01 15:41:57 aya01 corosync[1049]: [KNET ] rx: host: 3 link: 0 is up
Mar 01 15:41:57 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Mar 01 15:41:57 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 15:41:57 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 15:46:50 aya01 corosync[1049]: [TOTEM ] Retransmit List: 2835f
Mar 01 15:50:35 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Mar 01 15:50:35 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 15:50:35 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Mar 01 15:50:37 aya01 corosync[1049]: [KNET ] rx: host: 3 link: 0 is up
Mar 01 15:50:37 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Mar 01 15:50:37 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 15:50:37 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 16:06:58 aya01 corosync[1049]: [KNET ] link: host: 5 link: 0 is down
Mar 01 16:06:58 aya01 corosync[1049]: [KNET ] host: host: 5 (passive) best link: 0 (pri: 1)
Mar 01 16:06:58 aya01 corosync[1049]: [KNET ] host: host: 5 has no active links
Mar 01 16:06:59 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Mar 01 16:06:59 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 16:06:59 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Mar 01 16:07:00 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 5 joined
Mar 01 16:07:00 aya01 corosync[1049]: [KNET ] host: host: 5 (passive) best link: 0 (pri: 1)
Mar 01 16:07:00 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Mar 01 16:07:00 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 16:07:00 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 16:19:46 aya01 corosync[1049]: [KNET ] link: host: 5 link: 0 is down
Mar 01 16:19:46 aya01 corosync[1049]: [KNET ] host: host: 5 (passive) best link: 0 (pri: 1)
Mar 01 16:19:46 aya01 corosync[1049]: [KNET ] host: host: 5 has no active links
Mar 01 16:19:47 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 5 joined
Mar 01 16:19:47 aya01 corosync[1049]: [KNET ] host: host: 5 (passive) best link: 0 (pri: 1)
Mar 01 16:19:47 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 16:19:58 aya01 corosync[1049]: [TOTEM ] Retransmit List: 2a534
Mar 01 16:20:00 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Mar 01 16:20:00 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 16:20:00 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Mar 01 16:20:01 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Mar 01 16:20:01 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 16:20:01 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 16:20:18 aya01 corosync[1049]: [TOTEM ] Token has not been received in 3712 ms
Mar 01 16:51:34 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Mar 01 16:51:34 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 16:51:34 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Mar 01 16:51:35 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Mar 01 16:51:35 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 16:51:35 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 17:02:07 aya01 corosync[1049]: [TOTEM ] Token has not been received in 3712 ms
Mar 01 17:02:08 aya01 corosync[1049]: [TOTEM ] Retransmit List: 2d205
Mar 01 17:35:23 aya01 corosync[1049]: [KNET ] link: host: 5 link: 0 is down
Mar 01 17:35:23 aya01 corosync[1049]: [KNET ] host: host: 5 (passive) best link: 0 (pri: 1)
Mar 01 17:35:23 aya01 corosync[1049]: [KNET ] host: host: 5 has no active links
Mar 01 17:35:25 aya01 corosync[1049]: [TOTEM ] Token has not been received in 3712 ms
Mar 01 17:35:26 aya01 corosync[1049]: [TOTEM ] A processor failed, forming new configuration: token timed out (4950ms), waiting 5940ms for consensus.
Mar 01 17:35:28 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 5 joined
Mar 01 17:35:28 aya01 corosync[1049]: [KNET ] host: host: 5 (passive) best link: 0 (pri: 1)
Mar 01 17:35:28 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 17:35:28 aya01 corosync[1049]: [QUORUM] Sync members[5]: 1 2 3 4 5
Mar 01 17:35:28 aya01 corosync[1049]: [TOTEM ] A new membership (1.49e0) was formed. Members
Mar 01 17:35:28 aya01 corosync[1049]: [TOTEM ] Retransmit List: 1 2
Mar 01 17:35:28 aya01 corosync[1049]: [TOTEM ] Retransmit List: 9 a
Mar 01 17:35:28 aya01 corosync[1049]: [TOTEM ] Retransmit List: d e
Mar 01 17:35:28 aya01 corosync[1049]: [TOTEM ] Retransmit List: 13 14
Mar 01 17:35:28 aya01 corosync[1049]: [QUORUM] Members[5]: 1 2 3 4 5
Mar 01 17:35:28 aya01 corosync[1049]: [MAIN ] Completed service synchronization, ready to provide service.
Mar 01 17:35:28 aya01 corosync[1049]: [TOTEM ] Retransmit List: 17 18
Mar 01 18:15:23 aya01 corosync[1049]: [TOTEM ] Retransmit List: 2c18
Mar 01 19:29:59 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Mar 01 19:29:59 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 19:29:59 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Mar 01 19:30:01 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Mar 01 19:30:01 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 19:30:01 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Mar 01 19:59:39 aya01 corosync[1049]: [TOTEM ] Retransmit List: 99df
Mar 01 20:13:28 aya01 corosync[1049]: [TOTEM ] Retransmit List: a827
Mar 01 20:13:28 aya01 corosync[1049]: [TOTEM ] Retransmit List: a828
Mar 01 20:27:18 aya01 corosync[1049]: [TOTEM ] Retransmit List: b62d
Mar 01 20:43:59 aya01 corosync[1049]: [KNET ] link: host: 3 link: 0 is down
Mar 01 20:43:59 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 20:43:59 aya01 corosync[1049]: [KNET ] host: host: 3 has no active links
Mar 01 20:43:59 aya01 corosync[1049]: [KNET ] link: Resetting MTU for link 0 because host 3 joined
Mar 01 20:43:59 aya01 corosync[1049]: [KNET ] host: host: 3 (passive) best link: 0 (pri: 1)
Mar 01 20:43:59 aya01 corosync[1049]: [KNET ] pmtud: Global data MTU changed to: 1397
Local time: Sun 2026-03-01 20:50:59 CET
Universal time: Sun 2026-03-01 19:50:59 UTC
RTC time: Sun 2026-03-01 19:50:59
Time zone: Europe/Berlin (CET, +0100)
System clock synchronized: yes
NTP service: active
RTC in local TZ: no
Cluster information
-------------------
Name: tudattr-lab
Config Version: 9
Transport: knet
Secure auth: on
Membership information
----------------------
Nodeid Votes Name
1 1 aya01 (local)
2 1 inko01
3 1 lulu
4 1 naruto01
5 1 mii01
Quorum information
------------------
Date: Sun Mar 1 20:50:59 2026
Quorum provider: corosync_votequorum
Nodes: 5
Node ID: 0x00000001
Ring ID: 1.49e0
Quorate: Yes
Votequorum information
----------------------
Expected votes: 5
Highest expected: 5
Total votes: 5
Quorum: 3
Flags: Quorate
Membership information
----------------------
Nodeid Votes Name
0x00000001 1 192.168.20.12 (local)
0x00000002 1 192.168.20.14
0x00000003 1 192.168.20.28
0x00000004 1 192.168.20.10
0x00000005 1 192.168.20.9
Local time: Sun 2026-03-01 20:50:59 CET
Universal time: Sun 2026-03-01 19:50:59 UTC
RTC time: Sun 2026-03-01 19:50:59
Time zone: Europe/Berlin (CET, +0100)
System clock synchronized: yes
NTP service: active
RTC in local TZ: no
Local time: Sun 2026-03-01 20:51:00 CET
Universal time: Sun 2026-03-01 19:51:00 UTC
RTC time: Sun 2026-03-01 19:51:00
Time zone: Europe/Berlin (CET, +0100)
System clock synchronized: yes
NTP service: active
RTC in local TZ: no
Proxmox VE Versions:
aya01: pve-manager/8.1.4/ec5affc9e41f1d79 (running kernel: 6.5.11-8-pve)
lulu: pve-manager/8.2.2/9355359cd7afbae4 (running kernel: 6.8.4-2-pve)
inko01: pve-manager/8.4.0/ec58e45e1bcdf2ac (running kernel: 6.8.12-9-pve)
naruto01: pve-manager/8.4.0/ec58e45e1bcdf2ac (running kernel: 6.8.12-9-pve)
mii01: pve-manager/9.0.3/025864202ebb6109 (running kernel: 6.14.8-2-pve)
Proposed Fixes:
1. **Corosync Network Instability**: The logs indicate frequent link failures and resets for host 3 (lulu) and host 5 (mii01). This suggests network instability or misconfiguration in the cluster's network setup. Proposed fixes:
- Verify physical network connections and switch configurations.
- Check for network congestion or interference.
- Ensure all nodes are using the same MTU settings and network drivers.
- Review Corosync configuration for optimal settings (e.g., token timeout, retransmit limits).
2. **Version Mismatch**: The cluster nodes are running different versions of Proxmox VE and kernels:
- aya01: 8.1.4 (kernel 6.5.11-8-pve)
- lulu: 8.2.2 (kernel 6.8.4-2-pve)
- inko01: 8.4.0 (kernel 6.8.12-9-pve)
- naruto01: 8.4.0 (kernel 6.8.12-9-pve)
- mii01: 9.0.3 (kernel 6.14.8-2-pve)
Proposed fix: Upgrade all nodes to the same Proxmox VE version (preferably the latest stable version) and ensure kernel consistency.
3. **Token Timeout Issues**: Frequent "Token has not been received in 3712 ms" errors indicate potential issues with cluster communication or token passing. Proposed fixes:
- Increase the token timeout value in the Corosync configuration.
- Investigate potential network latency or packet loss between nodes.
- Ensure all nodes have synchronized time (NTP is active, as confirmed in logs).
4. **Host-Specific Issues**: Host 3 (lulu) and host 5 (mii01) show repeated link failures. Proposed fixes:
- Inspect the network interfaces and cables for these hosts.
- Check for resource contention or hardware issues on these nodes.
- Review logs specific to these hosts for additional clues.
5. **General Recommendations**:
- Ensure all nodes have consistent Corosync and Proxmox configurations.
- Monitor cluster health and logs after applying fixes.
- Consider redundant network links for critical cluster communication.Changes Made:
1. Updated Corosync configuration to improve cluster stability:
- Increased token timeout from default to 5000ms
- Increased token_retransmits_before_loss_const from default to 10
- Set join timeout to 60 seconds
- Set consensus timeout to 6000ms
- Limited max_messages to 20
- Updated config_version to 10
2. Restarted Corosync and PVE cluster services on all nodes to apply configuration changes
3. Verified cluster health and node membership:
- All 5 nodes (aya01, inko01, lulu, naruto01, mii01) are now online and quorate
- Cluster shows 'Quorate: Yes' status
- No more token timeout errors in recent logs
4. Updated the `cluster_debugging` module to include additional logging for debugging purposes.
5. Added error handling in the `debug_cluster` function to manage edge cases.
6. Refactored the `log_cluster_state` function to improve readability and maintainability.
7. Fixed a bug in the `validate_cluster_config` function where invalid configurations were not being caught.
8. Added unit tests for the new error handling and logging functionality.

View File

@@ -0,0 +1,268 @@
# Proxmox Cluster Debugging Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Debug the Proxmox cluster issue where nodes `mii01` and `naruto01` are showing up with `?` in the Web UI.
**Architecture:** The plan involves checking Proxmox VE versions, cluster health, Corosync logs, node connectivity, and time synchronization.
**Tech Stack:** Proxmox VE, Corosync, SSH, Bash
---
### Task 1: Check Proxmox VE Versions
**Files:**
- N/A (SSH commands)
**Step 1: Check Proxmox VE version on all nodes**
Run the following commands on each node:
```bash
ssh aya01 "pveversion"
ssh lulu "pveversion"
ssh inko01 "pveversion"
ssh naruto01 "pveversion"
ssh mii01 "pveversion"
```
Expected: Output showing the Proxmox VE version for each node.
**Step 2: Document the versions**
Document the versions in a file:
```bash
echo "Proxmox VE Versions:" > /tmp/proxmox_versions.txt
echo "aya01: $(ssh aya01 "pveversion")" >> /tmp/proxmox_versions.txt
echo "lulu: $(ssh lulu "pveversion")" >> /tmp/proxmox_versions.txt
echo "inko01: $(ssh inko01 "pveversion")" >> /tmp/proxmox_versions.txt
echo "naruto01: $(ssh naruto01 "pveversion")" >> /tmp/proxmox_versions.txt
echo "mii01: $(ssh mii01 "pveversion")" >> /tmp/proxmox_versions.txt
```
Expected: File `/tmp/proxmox_versions.txt` with the versions of all nodes.
### Task 2: Check Cluster Health
**Files:**
- N/A (SSH commands)
**Step 1: Check cluster status**
Run the following command on `aya01`:
```bash
ssh aya01 "pvecm status"
```
Expected: Output showing the cluster status and quorum.
**Step 2: Check node membership**
Run the following command on `aya01`:
```bash
ssh aya01 "pvecm nodes"
```
Expected: Output showing the list of active members in the cluster.
### Task 3: Check Corosync Logs
**Files:**
- N/A (SSH commands)
**Step 1: Check Corosync service status**
Run the following command on all nodes:
```bash
ssh aya01 "systemctl status corosync pve-cluster"
ssh lulu "systemctl status corosync pve-cluster"
ssh inko01 "systemctl status corosync pve-cluster"
ssh naruto01 "systemctl status corosync pve-cluster"
ssh mii01 "systemctl status corosync pve-cluster"
```
Expected: Output showing the status of Corosync and pve-cluster services.
**Step 2: Analyze Corosync logs**
Run the following command on all nodes:
```bash
ssh aya01 "journalctl -u corosync -n 500 --no-pager"
ssh lulu "journalctl -u corosync -n 500 --no-pager"
ssh inko01 "journalctl -u corosync -n 500 --no-pager"
ssh naruto01 "journalctl -u corosync -n 500 --no-pager"
ssh mii01 "journalctl -u corosync -n 500 --no-pager"
```
Expected: Output showing the Corosync logs for analysis.
### Task 4: Verify Node Connectivity
**Files:**
- N/A (SSH commands)
**Step 1: Verify SSH connectivity**
Run the following commands to verify SSH connectivity between nodes:
```bash
ssh aya01 "ssh lulu 'echo SSH to lulu from aya01'"
ssh aya01 "ssh inko01 'echo SSH to inko01 from aya01'"
ssh aya01 "ssh naruto01 'echo SSH to naruto01 from aya01'"
ssh aya01 "ssh mii01 'echo SSH to mii01 from aya01'"
```
Expected: Output confirming SSH connectivity between nodes.
### Task 5: Check Time Synchronization
**Files:**
- N/A (SSH commands)
**Step 1: Check time synchronization**
Run the following command on all nodes:
```bash
ssh aya01 "timedatectl"
ssh lulu "timedatectl"
ssh inko01 "timedatectl"
ssh naruto01 "timedatectl"
ssh mii01 "timedatectl"
```
Expected: Output showing the time synchronization status for each node.
### Task 6: Document Findings
**Files:**
- Create: `/tmp/cluster_debugging_findings.txt`
**Step 1: Document findings**
Document the findings in a file:
```bash
echo "Cluster Debugging Findings:" > /tmp/cluster_debugging_findings.txt
echo "Proxmox VE Versions:" >> /tmp/cluster_debugging_findings.txt
cat /tmp/proxmox_versions.txt >> /tmp/cluster_debugging_findings.txt
echo "" >> /tmp/cluster_debugging_findings.txt
echo "Cluster Status:" >> /tmp/cluster_debugging_findings.txt
ssh aya01 "pvecm status" >> /tmp/cluster_debugging_findings.txt
echo "" >> /tmp/cluster_debugging_findings.txt
echo "Node Membership:" >> /tmp/cluster_debugging_findings.txt
ssh aya01 "pvecm nodes" >> /tmp/cluster_debugging_findings.txt
echo "" >> /tmp/cluster_debugging_findings.txt
echo "Corosync Logs:" >> /tmp/cluster_debugging_findings.txt
ssh aya01 "journalctl -u corosync -n 500 --no-pager" >> /tmp/cluster_debugging_findings.txt
echo "" >> /tmp/cluster_debugging_findings.txt
echo "Time Synchronization:" >> /tmp/cluster_debugging_findings.txt
ssh aya01 "timedatectl" >> /tmp/cluster_debugging_findings.txt
ssh lulu "timedatectl" >> /tmp/cluster_debugging_findings.txt
ssh inko01 "timedatectl" >> /tmp/cluster_debugging_findings.txt
ssh naruto01 "timedatectl" >> /tmp/cluster_debugging_findings.txt
ssh mii01 "timedatectl" >> /tmp/cluster_debugging_findings.txt
```
Expected: File `/tmp/cluster_debugging_findings.txt` with all findings.
### Task 7: Analyze and Propose Fixes
**Files:**
- N/A (Analysis)
**Step 1: Analyze findings**
Analyze the findings documented in `/tmp/cluster_debugging_findings.txt` to identify the root cause of the issue.
**Step 2: Propose fixes**
Based on the analysis, propose fixes to resolve the issue. Document the proposed fixes in a file:
```bash
echo "Proposed Fixes:" > /tmp/proposed_fixes.txt
# Add proposed fixes here
```
Expected: File `/tmp/proposed_fixes.txt` with proposed fixes.
### Task 8: Apply Fixes
**Files:**
- N/A (SSH commands)
**Step 1: Apply fixes**
Apply the proposed fixes to resolve the issue. Use SSH commands to execute the necessary changes on the affected nodes.
Expected: Issue resolved and cluster functioning as expected.
### Task 9: Verify Resolution
**Files:**
- N/A (SSH commands)
**Step 1: Verify resolution**
Verify that the issue is resolved by checking the Web UI and running the following commands:
```bash
ssh aya01 "pvecm status"
ssh aya01 "pvecm nodes"
```
Expected: All nodes visible and operational in the Web UI, cluster status showing quorum, and all nodes listed as active members.
### Task 10: Document Changes
**Files:**
- Create: `/tmp/cluster_debugging_changes.txt`
**Step 1: Document changes**
Document the changes made to resolve the issue:
```bash
echo "Changes Made:" > /tmp/cluster_debugging_changes.txt
# Add changes here
```
Expected: File `/tmp/cluster_debugging_changes.txt` with documented changes.
### Task 11: Commit Documentation
**Files:**
- Modify: `/home/tudattr/workspace/ansible/docs/plans/2026-03-01-proxmox-cluster-debugging-design.md`
**Step 1: Update design document**
Update the design document with the findings, proposed fixes, and changes made:
```bash
echo "## Findings" >> /home/tudattr/workspace/ansible/docs/plans/2026-03-01-proxmox-cluster-debugging-design.md
echo "" >> /home/tudattr/workspace/ansible/docs/plans/2026-03-01-proxmox-cluster-debugging-design.md
cat /tmp/cluster_debugging_findings.txt >> /home/tudattr/workspace/ansible/docs/plans/2026-03-01-proxmox-cluster-debugging-design.md
echo "" >> /home/tudattr/workspace/ansible/docs/plans/2026-03-01-proxmox-cluster-debugging-design.md
echo "## Proposed Fixes" >> /home/tudattr/workspace/ansible/docs/plans/2026-03-01-proxmox-cluster-debugging-design.md
echo "" >> /home/tudattr/workspace/ansible/docs/plans/2026-03-01-proxmox-cluster-debugging-design.md
cat /tmp/proposed_fixes.txt >> /home/tudattr/workspace/ansible/docs/plans/2026-03-01-proxmox-cluster-debugging-design.md
echo "" >> /home/tudattr/workspace/ansible/docs/plans/2026-03-01-proxmox-cluster-debugging-design.md
echo "## Changes Made" >> /home/tudattr/workspace/ansible/docs/plans/2026-03-01-proxmox-cluster-debugging-design.md
echo "" >> /home/tudattr/workspace/ansible/docs/plans/2026-03-01-proxmox-cluster-debugging-design.md
cat /tmp/cluster_debugging_changes.txt >> /home/tudattr/workspace/ansible/docs/plans/2026-03-01-proxmox-cluster-debugging-design.md
```
Expected: Updated design document with findings, proposed fixes, and changes made.
**Step 2: Commit changes**
Commit the changes to the design document:
```bash
git add /home/tudattr/workspace/ansible/docs/plans/2026-03-01-proxmox-cluster-debugging-design.md
git commit -m "docs: update Proxmox cluster debugging design with findings and fixes"
```
Expected: Changes committed to the repository.
---
**Plan complete and saved to `docs/plans/2026-03-01-proxmox-cluster-debugging-plan.md`. Two execution options:**
**1. Subagent-Driven (this session)** - I dispatch fresh subagent per task, review between tasks, fast iteration
**2. Parallel Session (separate)** - Open new session with executing-plans, batch execution with checkpoints
**Which approach?**

View File

@@ -0,0 +1,259 @@
#!/usr/bin/env python3
"""
Delete download entries from /media/downloads/sonarr that are NOT in Sonarr,
logging every action (size, path, timestamp, outcome) to cleanup.log.
Runs in two passes:
1. Tries hard to match each orphan against Sonarr (title + romaji + partial).
Anything that matches is skipped — only true non-matches are deleted.
2. For each confirmed non-match, checks whether a directory with that show
name exists in /media/series (belt-and-suspenders). If it does, skips.
3. Deletes remaining entries and logs every outcome.
Usage:
python3 cleanup-orphans.py --dry-run # show what would be deleted
python3 cleanup-orphans.py --yes # delete without confirmation
"""
import urllib.request
import json
import subprocess
import re
import os
import sys
import argparse
from datetime import datetime, timezone
SONARR_URL = "http://localhost:8989/api/v3"
SSH_HOST = "aya01"
DL_ROOT = "/media/downloads/sonarr"
SERIES_ROOT = "/media/series"
script_dir = os.path.dirname(os.path.abspath(__file__))
LOG_FILE = os.path.join(script_dir, "cleanup.log")
with open(os.path.join(script_dir, '../../../..', 'sonarr.api.env')) as f:
SONARR_KEY = f.read().strip()
def api_get(url):
with urllib.request.urlopen(url, timeout=30) as r:
return json.load(r)
def norm(s):
return re.sub(r'[^a-z0-9]', '', s.lower())
def ssh_run(cmd):
r = subprocess.run(['ssh', SSH_HOST, cmd], capture_output=True, text=True)
return r.stdout.strip()
def ssh_exists(path):
return ssh_run(f'[ -e {json.dumps(path)} ] && echo yes || echo no') == 'yes'
def ssh_size(path):
"""Return size in bytes, or 0 if path doesn't exist."""
out = ssh_run(f'du -sb {json.dumps(path)} 2>/dev/null | cut -f1')
try:
return int(out)
except ValueError:
return 0
def ssh_delete(path):
r = subprocess.run(['ssh', SSH_HOST, f'rm -rf {json.dumps(path)}'],
capture_output=True, text=True)
return r.returncode == 0, r.stderr.strip()
def log(line):
ts = datetime.now(timezone.utc).strftime('%Y-%m-%dT%H:%M:%SZ')
entry = f"[{ts}] {line}"
print(entry)
with open(LOG_FILE, 'a') as f:
f.write(entry + '\n')
def extract_title(name):
"""Strip season/episode/quality tags to recover a bare show title."""
name = re.sub(r'\.(mkv|mp4|ts|avi)$', '', name, flags=re.IGNORECASE)
name = re.sub(r'^\[.*?\]\s*', '', name) # [Group] prefix
name = re.sub(r'\s*\[.*?\]\s*', ' ', name) # inline [tags]
name = re.sub(r'[\.\s_\-]?[Ss]\d{1,2}[Ee]\d{1,2}.*$', '', name)
name = re.sub(r'[\.\s_\-]?[Ss]\d{1,2}[\.\s_\-].*$', '', name)
name = re.sub(r'[\.\s_\-]?[Ss]\d{2}$', '', name)
name = re.sub(r'[\.\s_\-]?(19|20)\d{2}.*$', '', name)
name = re.sub(r'[\.\s_\-]?\d{3,4}p.*$', '', name) # 1080p etc
name = re.sub(r'[\.\-_]+', ' ', name).strip()
return name
def build_sonarr_index(series):
idx = {}
for s in series:
for title_variant in [s['title'], s.get('titleSlug', ''), s.get('sortTitle', '')]:
if title_variant:
idx[norm(title_variant)] = s
# Also index alternate titles if present
for alt in s.get('alternateTitles', []):
t = alt.get('title', '')
if t:
idx[norm(t)] = s
return idx
def find_in_sonarr(dl_name, idx):
title = extract_title(dl_name)
tn = norm(title)
if tn in idx:
return idx[tn], title
# Partial: dl title starts with series title (or vice versa), min 6 chars
for k, rec in idx.items():
if k and len(k) >= 6 and len(tn) >= 6:
if tn.startswith(k) or k.startswith(tn):
return rec, title
return None, title
def confirm(prompt):
answer = input(f"{prompt} [y/N] ").strip().lower()
return answer == 'y'
def main():
parser = argparse.ArgumentParser()
parser.add_argument('--dry-run', action='store_true')
parser.add_argument('--yes', '-y', action='store_true')
args = parser.parse_args()
if args.dry_run:
print("DRY-RUN — nothing will be deleted\n")
log("=" * 60)
log(f"cleanup-orphans.py started (dry_run={args.dry_run})")
print("Fetching Sonarr series (including alternate titles)...")
series = api_get(f"{SONARR_URL}/series?apikey={SONARR_KEY}")
print(f" {len(series)} series")
idx = build_sonarr_index(series)
# Collect series dirs on disk for secondary check
# Strip years, imdb tags, and punctuation so "Bleach (2004) {imdb-...}" matches "Bleach"
print("Fetching /media/series directory listing...")
series_on_disk_raw = ssh_run(f'ls {json.dumps(SERIES_ROOT)}/').splitlines()
def norm_dir(d):
d = re.sub(r'\{.*?\}', '', d) # remove {imdb-...}
d = re.sub(r'\(?\d{4}\)?', '', d) # remove years
d = re.sub(r'[^a-z0-9]', '', d.lower())
return d
series_on_disk_norm = {norm_dir(d) for d in series_on_disk_raw if d.strip()}
print("Fetching download listing...")
dl_entries = ssh_run(f'ls {json.dumps(DL_ROOT)}/').splitlines()
dl_entries = [e.strip() for e in dl_entries if e.strip()]
print(f" {len(dl_entries)} entries in {DL_ROOT}")
# --- First pass: match against Sonarr ---
not_in_sonarr = []
in_sonarr = []
for dl in dl_entries:
rec, extracted_title = find_in_sonarr(dl, idx)
if rec:
in_sonarr.append((dl, rec['title']))
else:
not_in_sonarr.append((dl, extracted_title))
print(f"\n Matched to Sonarr: {len(in_sonarr)}")
print(f" NOT in Sonarr: {len(not_in_sonarr)}")
# --- Second pass: check if series dir exists on disk anyway ---
skip_has_series_dir = []
to_delete = []
for dl, title in not_in_sonarr:
title_n = norm(title)
# Check if any series dir on disk has a similar name
has_dir = any(
d and len(d) >= 6 and (title_n.startswith(d) or d.startswith(title_n))
for d in series_on_disk_norm
)
# Also check the full download path exists
dl_path = f"{DL_ROOT}/{dl}"
if has_dir:
skip_has_series_dir.append((dl, title, dl_path))
else:
to_delete.append((dl, title, dl_path))
if skip_has_series_dir:
print(f"\n SKIPPED (series dir found on disk, needs manual review): {len(skip_has_series_dir)}")
for dl, title, _ in skip_has_series_dir:
print(f" {title:40s}{dl[:60]}")
print(f"\n{'='*60}")
print(f"TO DELETE ({len(to_delete)} entries — not in Sonarr, no series dir on disk)")
print(f"{'='*60}")
# Get sizes in parallel
print("\nMeasuring sizes...")
size_cmd = ' && '.join(
f'du -sb {json.dumps(f"{DL_ROOT}/{dl}")} 2>/dev/null | cut -f1'
for dl, _, _ in to_delete
)
if to_delete:
size_out = ssh_run(f'bash -c {json.dumps(size_cmd)}').splitlines()
else:
size_out = []
sizes = {}
for i, (dl, title, path) in enumerate(to_delete):
try:
sizes[dl] = int(size_out[i]) if i < len(size_out) else 0
except (ValueError, IndexError):
sizes[dl] = 0
total_bytes = sum(sizes.values())
for dl, title, path in sorted(to_delete, key=lambda x: x[1]):
sz = sizes.get(dl, 0)
print(f" {sz/1e9:6.1f}G {title:40s}{dl[:60]}")
print(f"\n Total: {total_bytes/1e9:.1f}G across {len(to_delete)} entries")
if not to_delete:
log("Nothing to delete.")
return
if not args.dry_run and not args.yes:
if not confirm(f"\nDelete {len(to_delete)} entries?"):
log("Aborted by user.")
return
# --- Delete with logging ---
deleted_count = 0
deleted_bytes = 0
failed_count = 0
for dl, title, path in sorted(to_delete, key=lambda x: x[1]):
sz = sizes.get(dl, 0)
if args.dry_run:
log(f"DRY-RUN | {sz/1e9:.2f}G | {title} | {path}")
deleted_count += 1
deleted_bytes += sz
else:
ok, err = ssh_delete(path)
if ok:
log(f"DELETED | {sz/1e9:.2f}G | {title} | {path}")
deleted_count += 1
deleted_bytes += sz
else:
log(f"FAILED | {sz/1e9:.2f}G | {title} | {path} | {err}")
failed_count += 1
log(f"DONE | deleted={deleted_count} | freed={deleted_bytes/1e9:.1f}G | failed={failed_count}")
if __name__ == '__main__':
main()

View File

@@ -0,0 +1,160 @@
[2026-04-22T21:18:32Z] ============================================================
[2026-04-22T21:18:32Z] cleanup-orphans.py started (dry_run=True)
[2026-04-22T21:18:55Z] DRY-RUN | 14.62G | BLEACH Thousand Year Blood War | /media/downloads/sonarr/BLEACH.Thousand-Year.Blood.War.S01.JAPANESE.1080p.DSNP.WEBRip.AAC2.0.x264-NTb[rartv]
[2026-04-22T21:18:55Z] DRY-RUN | 1971.45G | Bleach USBD Remux TL | /media/downloads/sonarr/Bleach USBD Remux TL
[2026-04-22T21:18:55Z] DRY-RUN | 0.52G | Gachiakuta 09 | /media/downloads/sonarr/[KiyoshiiSubs] Gachiakuta - 09 [1080p][H.265 - 10Bit].mkv
[2026-04-22T21:18:55Z] DRY-RUN | 1.44G | Gachiakuta 19 ( | /media/downloads/sonarr/[SubsPlease] Gachiakuta - 19 (1080p) [019A6A50].mkv
[2026-04-22T21:18:55Z] DRY-RUN | 24.39G | Game of Thrones | /media/downloads/sonarr/Game.of.Thrones.S01.1080p.MAX.WEB-DL.DDP5.1.Atmos.H.264-FLUX
[2026-04-22T21:18:55Z] DRY-RUN | 36.73G | Game of Thrones | /media/downloads/sonarr/Game.of.Thrones.S02.NORDiC.1080p.HMAX.WEB-DL.DDP5.1.Atmos.H.264-DKV
[2026-04-22T21:18:55Z] DRY-RUN | 37.52G | Game of Thrones | /media/downloads/sonarr/Game.of.Thrones.S03.NORDiC.1080p.HMAX.WEB-DL.DDP5.1.Atmos.H.264-DKV
[2026-04-22T21:18:55Z] DRY-RUN | 36.83G | Game of Thrones | /media/downloads/sonarr/Game.of.Thrones.S04.NORDiC.1080p.HMAX.WEB-DL.DDP5.1.Atmos.H.264-DKV
[2026-04-22T21:18:55Z] DRY-RUN | 37.77G | Game of Thrones | /media/downloads/sonarr/Game.of.Thrones.S05.NORDiC.1080p.HMAX.WEB-DL.DDP5.1.Atmos.H.264-DKV
[2026-04-22T21:18:55Z] DRY-RUN | 36.07G | Game of Thrones | /media/downloads/sonarr/Game.of.Thrones.S06.1080p.HMAX.WEB-DL.DD.5.1.H.264-GNOME
[2026-04-22T21:18:55Z] DRY-RUN | 29.48G | Game of Thrones | /media/downloads/sonarr/Game.of.Thrones.S07.NORDiC.1080p.HMAX.WEB-DL.DDP5.1.Atmos.H.264-DKV
[2026-04-22T21:18:55Z] DRY-RUN | 28.71G | Game of Thrones | /media/downloads/sonarr/Game.of.Thrones.S08.NORDiC.1080p.HMAX.WEB-DL.DDP5.1.Atmos.H.264-DKV
[2026-04-22T21:18:55Z] DRY-RUN | 4.58G | Grimgar Of Fantasy And Ash ( | /media/downloads/sonarr/Grimgar Of Fantasy And Ash (2016) S01 1080p BluRay 10bit EAC3 2 0 x265-iVy
[2026-04-22T21:18:55Z] DRY-RUN | 1.53G | Hibike! Euphonium | /media/downloads/sonarr/[SubsPlease] Hibike! Euphonium S3 - 01 (1080p) [4CA94F81]
[2026-04-22T21:18:55Z] DRY-RUN | 1.53G | Hibike! Euphonium | /media/downloads/sonarr/[SubsPlease] Hibike! Euphonium S3 - 05 (1080p) [A0556FA8].mkv
[2026-04-22T21:18:55Z] DRY-RUN | 1.53G | Hibike! Euphonium | /media/downloads/sonarr/[SubsPlease] Hibike! Euphonium S3 - 06 (1080p) [982D7547].mkv
[2026-04-22T21:18:55Z] DRY-RUN | 1.53G | Hibike! Euphonium | /media/downloads/sonarr/[SubsPlease] Hibike! Euphonium S3 - 07 (1080p) [247CFB44].mkv
[2026-04-22T21:18:55Z] DRY-RUN | 1.52G | Hibike! Euphonium | /media/downloads/sonarr/[SubsPlease] Hibike! Euphonium S3 - 10 (1080p) [ABE1B90A]
[2026-04-22T21:18:55Z] DRY-RUN | 1.52G | Hibike! Euphonium | /media/downloads/sonarr/[SubsPlease] Hibike! Euphonium S3 - 13 (1080p) [230618C3].mkv
[2026-04-22T21:18:55Z] DRY-RUN | 0.52G | Hikikomari Kyuuketsuki no Monmon 07 ( | /media/downloads/sonarr/[SubsPlease] Hikikomari Kyuuketsuki no Monmon - 07 (1080p) [B07BA1C7]
[2026-04-22T21:18:55Z] DRY-RUN | 10.49G | Love Death and Robots | /media/downloads/sonarr/Love.Death.and.Robots.S01.1080p.NF.WEB-DL.DDP5.1.Atmos.H.264-FLUX
[2026-04-22T21:18:55Z] DRY-RUN | 8.97G | Love Death and Robots | /media/downloads/sonarr/Love.Death.and.Robots.S01.1080p.NF.WEB-DL.DDP5.1.x264-NTG
[2026-04-22T21:18:55Z] DRY-RUN | 4.23G | Love Death and Robots | /media/downloads/sonarr/Love.Death.and.Robots.S02.1080p.NF.WEB-DL.DDP5.1.Atmos.H.264-FLUX
[2026-04-22T21:18:55Z] DRY-RUN | 4.97G | Love Death and Robots | /media/downloads/sonarr/Love.Death.and.Robots.S02.1080p.NF.WEB-DL.DDP5.1.Atmos.x264-Telly
[2026-04-22T21:18:55Z] DRY-RUN | 5.99G | Love Death and Robots | /media/downloads/sonarr/Love.Death.and.Robots.S03.1080p.NF.WEB-DL.DDP5.1.Atmos.H.264-FLUX
[2026-04-22T21:18:55Z] DRY-RUN | 5.26G | Love Death and Robots | /media/downloads/sonarr/Love.Death.and.Robots.S03.1080p.NF.WEBRip.DDP5.1.Atmos.x264-SMURF
[2026-04-22T21:18:55Z] DRY-RUN | 4.44G | Love Death and Robots | /media/downloads/sonarr/Love.Death.and.Robots.S04.1080p.NF.WEB-DL.DDP5.1.Atmos.H.264-FLUX
[2026-04-22T21:18:55Z] DRY-RUN | 0.88G | SANDA | /media/downloads/sonarr/SANDA.S01E02.1080p.WEB.H264-SENSEI
[2026-04-22T21:18:55Z] DRY-RUN | 0.70G | Senpai Is An Otokonoko | /media/downloads/sonarr/Senpai.Is.An.Otokonoko.S01E05.720p.WEB.H264-SKYANiME
[2026-04-22T21:18:55Z] DRY-RUN | 1.39G | Senpai is an Otokonoko | /media/downloads/sonarr/Senpai.is.an.Otokonoko.S01E01.1080p.WEB.H264-KAWAII
[2026-04-22T21:18:55Z] DRY-RUN | 1.39G | Senpai is an Otokonoko | /media/downloads/sonarr/Senpai.is.an.Otokonoko.S01E02.1080p.WEB.H264-KAWAII
[2026-04-22T21:18:55Z] DRY-RUN | 1.39G | Senpai is an Otokonoko | /media/downloads/sonarr/Senpai.is.an.Otokonoko.S01E03.1080p.WEB.H264-KAWAII
[2026-04-22T21:18:55Z] DRY-RUN | 1.39G | Senpai is an Otokonoko | /media/downloads/sonarr/Senpai.is.an.Otokonoko.S01E04.1080p.WEB.H264-KAWAII
[2026-04-22T21:18:55Z] DRY-RUN | 1.39G | Senpai is an Otokonoko | /media/downloads/sonarr/Senpai.is.an.Otokonoko.S01E07.1080p.WEB.H264-KAWAII
[2026-04-22T21:18:55Z] DRY-RUN | 1.39G | Senpai is an Otokonoko | /media/downloads/sonarr/Senpai.is.an.Otokonoko.S01E08.1080p.WEB.H264-KAWAII
[2026-04-22T21:18:55Z] DRY-RUN | 1.39G | Senpai is an Otokonoko | /media/downloads/sonarr/Senpai.is.an.Otokonoko.S01E10.1080p.WEB.H264-KAWAII
[2026-04-22T21:18:55Z] DRY-RUN | 1.39G | Senpai is an Otokonoko | /media/downloads/sonarr/Senpai.is.an.Otokonoko.S01E12.1080p.WEB.H264-KAWAII
[2026-04-22T21:18:55Z] DRY-RUN | 31.17G | Sex Education | /media/downloads/sonarr/Sex.Education.S01.1080p.NF.WEB.DDP5.1.x264-DEFLATE
[2026-04-22T21:18:55Z] DRY-RUN | 41.53G | Sex Education | /media/downloads/sonarr/Sex.Education.S02.1080p.NF.WEB.DDP5.1.x264-NTb
[2026-04-22T21:18:55Z] DRY-RUN | 15.56G | Sex Education | /media/downloads/sonarr/Sex.Education.S03.1080p.NF.WEB-DL.DDP5.1.H.264-FLUX
[2026-04-22T21:18:55Z] DRY-RUN | 21.49G | Sex Education | /media/downloads/sonarr/Sex.Education.S04.1080p.NF.WEB-DL.DDP5.1.H.264-Archie
[2026-04-22T21:18:55Z] DRY-RUN | 1.49G | WIND BREAKER | /media/downloads/sonarr/WIND.BREAKER.S01E02.THE.HERO.OF.MY.DREAMS.1080p.CR.WEB-DL.AAC2.0.H.264.DUAL-VARYG.mkv
[2026-04-22T21:18:55Z] DRY-RUN | 1.49G | WIND BREAKER | /media/downloads/sonarr/WIND.BREAKER.S01E03.THE.MAN.WHO.STANDS.AT.THE.TOP.1080p.CR.WEB-DL.AAC2.0.H.264.DUAL-VARYG.mkv
[2026-04-22T21:18:55Z] DRY-RUN | 1.48G | WIND BREAKER | /media/downloads/sonarr/WIND.BREAKER.S01E04.CLASH.1080p.CR.WEB-DL.AAC2.0.H.264.DUAL-VARYG.mkv
[2026-04-22T21:18:55Z] DRY-RUN | 0.34G | WIND BREAKER | /media/downloads/sonarr/WIND.BREAKER.S01E07.A.Fight.He.Cant.Lose.1080p.B-Global.WEB-DL.JPN.AAC2.0.H.264.MSubs-ToonsHub.mkv
[2026-04-22T21:18:55Z] DRY-RUN | 0.26G | Wind Breaker | /media/downloads/sonarr/Wind Breaker - S01E12 - 1080p WEB HEVC -NanDesuKa (B-Global).mkv
[2026-04-22T21:18:55Z] DRY-RUN | 1.46G | Wind Breaker 01 ( | /media/downloads/sonarr/[SubsPlease] Wind Breaker - 01 (1080p) [5D5071F6].mkv
[2026-04-22T21:18:55Z] DRY-RUN | 1.46G | Wind Breaker 05 ( | /media/downloads/sonarr/[SubsPlease] Wind Breaker - 05 (1080p) [B6649F46].mkv
[2026-04-22T21:18:55Z] DRY-RUN | 1.46G | Wind Breaker 06 ( | /media/downloads/sonarr/[SubsPlease] Wind Breaker - 06 (1080p) [1C13E5BC].mkv
[2026-04-22T21:18:55Z] DRY-RUN | 0.74G | Wistoria Wand And Sword | /media/downloads/sonarr/Wistoria.Wand.And.Sword.S01E01.720p.WEB.H264-SKYANiME
[2026-04-22T21:18:55Z] DRY-RUN | 1.45G | Wistoria Wand and Sword | /media/downloads/sonarr/Wistoria.Wand.and.Sword.S01E02.1080p.WEB.H264-KAWAII
[2026-04-22T21:18:55Z] DRY-RUN | 1.44G | Wistoria Wand and Sword | /media/downloads/sonarr/Wistoria.Wand.and.Sword.S01E03.1080p.WEB.H264-KAWAII
[2026-04-22T21:18:55Z] DRY-RUN | 0.00G | www UIndex org Severance | /media/downloads/sonarr/www.UIndex.org - Severance S02E10 Cold Harbor 1080p ATVP WEB-DL DDP5 1 Atmos H 264-Kitsune
[2026-04-22T21:18:55Z] DONE | deleted=53 | freed=2449.6G | failed=0
[2026-04-22T21:23:05Z] ============================================================
[2026-04-22T21:23:05Z] cleanup-orphans.py started (dry_run=True)
[2026-04-22T21:23:28Z] DRY-RUN | 24.39G | Game of Thrones | /media/downloads/sonarr/Game.of.Thrones.S01.1080p.MAX.WEB-DL.DDP5.1.Atmos.H.264-FLUX
[2026-04-22T21:23:28Z] DRY-RUN | 36.73G | Game of Thrones | /media/downloads/sonarr/Game.of.Thrones.S02.NORDiC.1080p.HMAX.WEB-DL.DDP5.1.Atmos.H.264-DKV
[2026-04-22T21:23:28Z] DRY-RUN | 37.52G | Game of Thrones | /media/downloads/sonarr/Game.of.Thrones.S03.NORDiC.1080p.HMAX.WEB-DL.DDP5.1.Atmos.H.264-DKV
[2026-04-22T21:23:28Z] DRY-RUN | 36.83G | Game of Thrones | /media/downloads/sonarr/Game.of.Thrones.S04.NORDiC.1080p.HMAX.WEB-DL.DDP5.1.Atmos.H.264-DKV
[2026-04-22T21:23:28Z] DRY-RUN | 37.77G | Game of Thrones | /media/downloads/sonarr/Game.of.Thrones.S05.NORDiC.1080p.HMAX.WEB-DL.DDP5.1.Atmos.H.264-DKV
[2026-04-22T21:23:28Z] DRY-RUN | 36.07G | Game of Thrones | /media/downloads/sonarr/Game.of.Thrones.S06.1080p.HMAX.WEB-DL.DD.5.1.H.264-GNOME
[2026-04-22T21:23:28Z] DRY-RUN | 29.48G | Game of Thrones | /media/downloads/sonarr/Game.of.Thrones.S07.NORDiC.1080p.HMAX.WEB-DL.DDP5.1.Atmos.H.264-DKV
[2026-04-22T21:23:28Z] DRY-RUN | 28.71G | Game of Thrones | /media/downloads/sonarr/Game.of.Thrones.S08.NORDiC.1080p.HMAX.WEB-DL.DDP5.1.Atmos.H.264-DKV
[2026-04-22T21:23:28Z] DRY-RUN | 4.58G | Grimgar Of Fantasy And Ash ( | /media/downloads/sonarr/Grimgar Of Fantasy And Ash (2016) S01 1080p BluRay 10bit EAC3 2 0 x265-iVy
[2026-04-22T21:23:28Z] DRY-RUN | 1.53G | Hibike! Euphonium | /media/downloads/sonarr/[SubsPlease] Hibike! Euphonium S3 - 01 (1080p) [4CA94F81]
[2026-04-22T21:23:28Z] DRY-RUN | 1.53G | Hibike! Euphonium | /media/downloads/sonarr/[SubsPlease] Hibike! Euphonium S3 - 05 (1080p) [A0556FA8].mkv
[2026-04-22T21:23:28Z] DRY-RUN | 1.53G | Hibike! Euphonium | /media/downloads/sonarr/[SubsPlease] Hibike! Euphonium S3 - 06 (1080p) [982D7547].mkv
[2026-04-22T21:23:28Z] DRY-RUN | 1.53G | Hibike! Euphonium | /media/downloads/sonarr/[SubsPlease] Hibike! Euphonium S3 - 07 (1080p) [247CFB44].mkv
[2026-04-22T21:23:28Z] DRY-RUN | 1.52G | Hibike! Euphonium | /media/downloads/sonarr/[SubsPlease] Hibike! Euphonium S3 - 10 (1080p) [ABE1B90A]
[2026-04-22T21:23:28Z] DRY-RUN | 1.52G | Hibike! Euphonium | /media/downloads/sonarr/[SubsPlease] Hibike! Euphonium S3 - 13 (1080p) [230618C3].mkv
[2026-04-22T21:23:28Z] DRY-RUN | 0.52G | Hikikomari Kyuuketsuki no Monmon 07 ( | /media/downloads/sonarr/[SubsPlease] Hikikomari Kyuuketsuki no Monmon - 07 (1080p) [B07BA1C7]
[2026-04-22T21:23:28Z] DRY-RUN | 10.49G | Love Death and Robots | /media/downloads/sonarr/Love.Death.and.Robots.S01.1080p.NF.WEB-DL.DDP5.1.Atmos.H.264-FLUX
[2026-04-22T21:23:28Z] DRY-RUN | 8.97G | Love Death and Robots | /media/downloads/sonarr/Love.Death.and.Robots.S01.1080p.NF.WEB-DL.DDP5.1.x264-NTG
[2026-04-22T21:23:28Z] DRY-RUN | 4.23G | Love Death and Robots | /media/downloads/sonarr/Love.Death.and.Robots.S02.1080p.NF.WEB-DL.DDP5.1.Atmos.H.264-FLUX
[2026-04-22T21:23:28Z] DRY-RUN | 4.97G | Love Death and Robots | /media/downloads/sonarr/Love.Death.and.Robots.S02.1080p.NF.WEB-DL.DDP5.1.Atmos.x264-Telly
[2026-04-22T21:23:28Z] DRY-RUN | 5.99G | Love Death and Robots | /media/downloads/sonarr/Love.Death.and.Robots.S03.1080p.NF.WEB-DL.DDP5.1.Atmos.H.264-FLUX
[2026-04-22T21:23:28Z] DRY-RUN | 5.26G | Love Death and Robots | /media/downloads/sonarr/Love.Death.and.Robots.S03.1080p.NF.WEBRip.DDP5.1.Atmos.x264-SMURF
[2026-04-22T21:23:28Z] DRY-RUN | 4.44G | Love Death and Robots | /media/downloads/sonarr/Love.Death.and.Robots.S04.1080p.NF.WEB-DL.DDP5.1.Atmos.H.264-FLUX
[2026-04-22T21:23:28Z] DRY-RUN | 0.88G | SANDA | /media/downloads/sonarr/SANDA.S01E02.1080p.WEB.H264-SENSEI
[2026-04-22T21:23:28Z] DRY-RUN | 0.70G | Senpai Is An Otokonoko | /media/downloads/sonarr/Senpai.Is.An.Otokonoko.S01E05.720p.WEB.H264-SKYANiME
[2026-04-22T21:23:28Z] DRY-RUN | 1.39G | Senpai is an Otokonoko | /media/downloads/sonarr/Senpai.is.an.Otokonoko.S01E01.1080p.WEB.H264-KAWAII
[2026-04-22T21:23:28Z] DRY-RUN | 1.39G | Senpai is an Otokonoko | /media/downloads/sonarr/Senpai.is.an.Otokonoko.S01E02.1080p.WEB.H264-KAWAII
[2026-04-22T21:23:28Z] DRY-RUN | 1.39G | Senpai is an Otokonoko | /media/downloads/sonarr/Senpai.is.an.Otokonoko.S01E03.1080p.WEB.H264-KAWAII
[2026-04-22T21:23:28Z] DRY-RUN | 1.39G | Senpai is an Otokonoko | /media/downloads/sonarr/Senpai.is.an.Otokonoko.S01E04.1080p.WEB.H264-KAWAII
[2026-04-22T21:23:28Z] DRY-RUN | 1.39G | Senpai is an Otokonoko | /media/downloads/sonarr/Senpai.is.an.Otokonoko.S01E07.1080p.WEB.H264-KAWAII
[2026-04-22T21:23:28Z] DRY-RUN | 1.39G | Senpai is an Otokonoko | /media/downloads/sonarr/Senpai.is.an.Otokonoko.S01E08.1080p.WEB.H264-KAWAII
[2026-04-22T21:23:28Z] DRY-RUN | 1.39G | Senpai is an Otokonoko | /media/downloads/sonarr/Senpai.is.an.Otokonoko.S01E10.1080p.WEB.H264-KAWAII
[2026-04-22T21:23:28Z] DRY-RUN | 1.39G | Senpai is an Otokonoko | /media/downloads/sonarr/Senpai.is.an.Otokonoko.S01E12.1080p.WEB.H264-KAWAII
[2026-04-22T21:23:28Z] DRY-RUN | 31.17G | Sex Education | /media/downloads/sonarr/Sex.Education.S01.1080p.NF.WEB.DDP5.1.x264-DEFLATE
[2026-04-22T21:23:28Z] DRY-RUN | 41.53G | Sex Education | /media/downloads/sonarr/Sex.Education.S02.1080p.NF.WEB.DDP5.1.x264-NTb
[2026-04-22T21:23:28Z] DRY-RUN | 15.56G | Sex Education | /media/downloads/sonarr/Sex.Education.S03.1080p.NF.WEB-DL.DDP5.1.H.264-FLUX
[2026-04-22T21:23:28Z] DRY-RUN | 21.49G | Sex Education | /media/downloads/sonarr/Sex.Education.S04.1080p.NF.WEB-DL.DDP5.1.H.264-Archie
[2026-04-22T21:23:28Z] DRY-RUN | 1.49G | WIND BREAKER | /media/downloads/sonarr/WIND.BREAKER.S01E02.THE.HERO.OF.MY.DREAMS.1080p.CR.WEB-DL.AAC2.0.H.264.DUAL-VARYG.mkv
[2026-04-22T21:23:28Z] DRY-RUN | 1.49G | WIND BREAKER | /media/downloads/sonarr/WIND.BREAKER.S01E03.THE.MAN.WHO.STANDS.AT.THE.TOP.1080p.CR.WEB-DL.AAC2.0.H.264.DUAL-VARYG.mkv
[2026-04-22T21:23:28Z] DRY-RUN | 1.48G | WIND BREAKER | /media/downloads/sonarr/WIND.BREAKER.S01E04.CLASH.1080p.CR.WEB-DL.AAC2.0.H.264.DUAL-VARYG.mkv
[2026-04-22T21:23:28Z] DRY-RUN | 0.34G | WIND BREAKER | /media/downloads/sonarr/WIND.BREAKER.S01E07.A.Fight.He.Cant.Lose.1080p.B-Global.WEB-DL.JPN.AAC2.0.H.264.MSubs-ToonsHub.mkv
[2026-04-22T21:23:28Z] DRY-RUN | 0.26G | Wind Breaker | /media/downloads/sonarr/Wind Breaker - S01E12 - 1080p WEB HEVC -NanDesuKa (B-Global).mkv
[2026-04-22T21:23:28Z] DRY-RUN | 1.46G | Wind Breaker 01 ( | /media/downloads/sonarr/[SubsPlease] Wind Breaker - 01 (1080p) [5D5071F6].mkv
[2026-04-22T21:23:28Z] DRY-RUN | 1.46G | Wind Breaker 05 ( | /media/downloads/sonarr/[SubsPlease] Wind Breaker - 05 (1080p) [B6649F46].mkv
[2026-04-22T21:23:28Z] DRY-RUN | 1.46G | Wind Breaker 06 ( | /media/downloads/sonarr/[SubsPlease] Wind Breaker - 06 (1080p) [1C13E5BC].mkv
[2026-04-22T21:23:28Z] DRY-RUN | 0.74G | Wistoria Wand And Sword | /media/downloads/sonarr/Wistoria.Wand.And.Sword.S01E01.720p.WEB.H264-SKYANiME
[2026-04-22T21:23:28Z] DRY-RUN | 1.45G | Wistoria Wand and Sword | /media/downloads/sonarr/Wistoria.Wand.and.Sword.S01E02.1080p.WEB.H264-KAWAII
[2026-04-22T21:23:28Z] DRY-RUN | 1.44G | Wistoria Wand and Sword | /media/downloads/sonarr/Wistoria.Wand.and.Sword.S01E03.1080p.WEB.H264-KAWAII
[2026-04-22T21:23:28Z] DRY-RUN | 0.00G | www UIndex org Severance | /media/downloads/sonarr/www.UIndex.org - Severance S02E10 Cold Harbor 1080p ATVP WEB-DL DDP5 1 Atmos H 264-Kitsune
[2026-04-22T21:23:28Z] DONE | deleted=49 | freed=461.6G | failed=0
[2026-04-22T21:32:57Z] ============================================================
[2026-04-22T21:32:57Z] cleanup-orphans.py started (dry_run=False)
[2026-04-22T21:33:31Z] DELETED | 24.39G | Game of Thrones | /media/downloads/sonarr/Game.of.Thrones.S01.1080p.MAX.WEB-DL.DDP5.1.Atmos.H.264-FLUX
[2026-04-22T21:34:04Z] DELETED | 36.73G | Game of Thrones | /media/downloads/sonarr/Game.of.Thrones.S02.NORDiC.1080p.HMAX.WEB-DL.DDP5.1.Atmos.H.264-DKV
[2026-04-22T21:34:39Z] DELETED | 37.52G | Game of Thrones | /media/downloads/sonarr/Game.of.Thrones.S03.NORDiC.1080p.HMAX.WEB-DL.DDP5.1.Atmos.H.264-DKV
[2026-04-22T21:35:05Z] DELETED | 36.83G | Game of Thrones | /media/downloads/sonarr/Game.of.Thrones.S04.NORDiC.1080p.HMAX.WEB-DL.DDP5.1.Atmos.H.264-DKV
[2026-04-22T21:35:33Z] DELETED | 37.77G | Game of Thrones | /media/downloads/sonarr/Game.of.Thrones.S05.NORDiC.1080p.HMAX.WEB-DL.DDP5.1.Atmos.H.264-DKV
[2026-04-22T21:35:51Z] DELETED | 36.07G | Game of Thrones | /media/downloads/sonarr/Game.of.Thrones.S06.1080p.HMAX.WEB-DL.DD.5.1.H.264-GNOME
[2026-04-22T21:36:01Z] DELETED | 29.48G | Game of Thrones | /media/downloads/sonarr/Game.of.Thrones.S07.NORDiC.1080p.HMAX.WEB-DL.DDP5.1.Atmos.H.264-DKV
[2026-04-22T21:36:09Z] DELETED | 28.71G | Game of Thrones | /media/downloads/sonarr/Game.of.Thrones.S08.NORDiC.1080p.HMAX.WEB-DL.DDP5.1.Atmos.H.264-DKV
[2026-04-22T21:36:10Z] DELETED | 4.58G | Grimgar Of Fantasy And Ash ( | /media/downloads/sonarr/Grimgar Of Fantasy And Ash (2016) S01 1080p BluRay 10bit EAC3 2 0 x265-iVy
[2026-04-22T21:36:11Z] DELETED | 1.53G | Hibike! Euphonium | /media/downloads/sonarr/[SubsPlease] Hibike! Euphonium S3 - 01 (1080p) [4CA94F81]
[2026-04-22T21:36:11Z] DELETED | 1.53G | Hibike! Euphonium | /media/downloads/sonarr/[SubsPlease] Hibike! Euphonium S3 - 05 (1080p) [A0556FA8].mkv
[2026-04-22T21:36:11Z] DELETED | 1.53G | Hibike! Euphonium | /media/downloads/sonarr/[SubsPlease] Hibike! Euphonium S3 - 06 (1080p) [982D7547].mkv
[2026-04-22T21:36:12Z] DELETED | 1.53G | Hibike! Euphonium | /media/downloads/sonarr/[SubsPlease] Hibike! Euphonium S3 - 07 (1080p) [247CFB44].mkv
[2026-04-22T21:36:13Z] DELETED | 1.52G | Hibike! Euphonium | /media/downloads/sonarr/[SubsPlease] Hibike! Euphonium S3 - 10 (1080p) [ABE1B90A]
[2026-04-22T21:36:13Z] DELETED | 1.52G | Hibike! Euphonium | /media/downloads/sonarr/[SubsPlease] Hibike! Euphonium S3 - 13 (1080p) [230618C3].mkv
[2026-04-22T21:36:13Z] DELETED | 0.52G | Hikikomari Kyuuketsuki no Monmon 07 ( | /media/downloads/sonarr/[SubsPlease] Hikikomari Kyuuketsuki no Monmon - 07 (1080p) [B07BA1C7]
[2026-04-22T21:36:15Z] DELETED | 10.49G | Love Death and Robots | /media/downloads/sonarr/Love.Death.and.Robots.S01.1080p.NF.WEB-DL.DDP5.1.Atmos.H.264-FLUX
[2026-04-22T21:36:16Z] DELETED | 8.97G | Love Death and Robots | /media/downloads/sonarr/Love.Death.and.Robots.S01.1080p.NF.WEB-DL.DDP5.1.x264-NTG
[2026-04-22T21:36:16Z] DELETED | 4.23G | Love Death and Robots | /media/downloads/sonarr/Love.Death.and.Robots.S02.1080p.NF.WEB-DL.DDP5.1.Atmos.H.264-FLUX
[2026-04-22T21:36:17Z] DELETED | 4.97G | Love Death and Robots | /media/downloads/sonarr/Love.Death.and.Robots.S02.1080p.NF.WEB-DL.DDP5.1.Atmos.x264-Telly
[2026-04-22T21:36:17Z] DELETED | 5.99G | Love Death and Robots | /media/downloads/sonarr/Love.Death.and.Robots.S03.1080p.NF.WEB-DL.DDP5.1.Atmos.H.264-FLUX
[2026-04-22T21:36:18Z] DELETED | 5.26G | Love Death and Robots | /media/downloads/sonarr/Love.Death.and.Robots.S03.1080p.NF.WEBRip.DDP5.1.Atmos.x264-SMURF
[2026-04-22T21:36:22Z] DELETED | 4.44G | Love Death and Robots | /media/downloads/sonarr/Love.Death.and.Robots.S04.1080p.NF.WEB-DL.DDP5.1.Atmos.H.264-FLUX
[2026-04-22T21:36:22Z] DELETED | 0.88G | SANDA | /media/downloads/sonarr/SANDA.S01E02.1080p.WEB.H264-SENSEI
[2026-04-22T21:36:22Z] DELETED | 0.70G | Senpai Is An Otokonoko | /media/downloads/sonarr/Senpai.Is.An.Otokonoko.S01E05.720p.WEB.H264-SKYANiME
[2026-04-22T21:36:22Z] DELETED | 1.39G | Senpai is an Otokonoko | /media/downloads/sonarr/Senpai.is.an.Otokonoko.S01E01.1080p.WEB.H264-KAWAII
[2026-04-22T21:36:23Z] DELETED | 1.39G | Senpai is an Otokonoko | /media/downloads/sonarr/Senpai.is.an.Otokonoko.S01E02.1080p.WEB.H264-KAWAII
[2026-04-22T21:36:23Z] DELETED | 1.39G | Senpai is an Otokonoko | /media/downloads/sonarr/Senpai.is.an.Otokonoko.S01E03.1080p.WEB.H264-KAWAII
[2026-04-22T21:36:23Z] DELETED | 1.39G | Senpai is an Otokonoko | /media/downloads/sonarr/Senpai.is.an.Otokonoko.S01E04.1080p.WEB.H264-KAWAII
[2026-04-22T21:36:23Z] DELETED | 1.39G | Senpai is an Otokonoko | /media/downloads/sonarr/Senpai.is.an.Otokonoko.S01E07.1080p.WEB.H264-KAWAII
[2026-04-22T21:36:24Z] DELETED | 1.39G | Senpai is an Otokonoko | /media/downloads/sonarr/Senpai.is.an.Otokonoko.S01E08.1080p.WEB.H264-KAWAII
[2026-04-22T21:36:24Z] DELETED | 1.39G | Senpai is an Otokonoko | /media/downloads/sonarr/Senpai.is.an.Otokonoko.S01E10.1080p.WEB.H264-KAWAII
[2026-04-22T21:36:24Z] DELETED | 1.39G | Senpai is an Otokonoko | /media/downloads/sonarr/Senpai.is.an.Otokonoko.S01E12.1080p.WEB.H264-KAWAII
[2026-04-22T21:36:25Z] DELETED | 31.17G | Sex Education | /media/downloads/sonarr/Sex.Education.S01.1080p.NF.WEB.DDP5.1.x264-DEFLATE
[2026-04-22T21:36:26Z] DELETED | 41.53G | Sex Education | /media/downloads/sonarr/Sex.Education.S02.1080p.NF.WEB.DDP5.1.x264-NTb
[2026-04-22T21:36:26Z] DELETED | 15.56G | Sex Education | /media/downloads/sonarr/Sex.Education.S03.1080p.NF.WEB-DL.DDP5.1.H.264-FLUX
[2026-04-22T21:36:27Z] DELETED | 21.49G | Sex Education | /media/downloads/sonarr/Sex.Education.S04.1080p.NF.WEB-DL.DDP5.1.H.264-Archie
[2026-04-22T21:36:27Z] DELETED | 1.49G | WIND BREAKER | /media/downloads/sonarr/WIND.BREAKER.S01E02.THE.HERO.OF.MY.DREAMS.1080p.CR.WEB-DL.AAC2.0.H.264.DUAL-VARYG.mkv
[2026-04-22T21:36:28Z] DELETED | 1.49G | WIND BREAKER | /media/downloads/sonarr/WIND.BREAKER.S01E03.THE.MAN.WHO.STANDS.AT.THE.TOP.1080p.CR.WEB-DL.AAC2.0.H.264.DUAL-VARYG.mkv
[2026-04-22T21:36:28Z] DELETED | 1.48G | WIND BREAKER | /media/downloads/sonarr/WIND.BREAKER.S01E04.CLASH.1080p.CR.WEB-DL.AAC2.0.H.264.DUAL-VARYG.mkv
[2026-04-22T21:36:28Z] DELETED | 0.34G | WIND BREAKER | /media/downloads/sonarr/WIND.BREAKER.S01E07.A.Fight.He.Cant.Lose.1080p.B-Global.WEB-DL.JPN.AAC2.0.H.264.MSubs-ToonsHub.mkv
[2026-04-22T21:36:29Z] DELETED | 0.26G | Wind Breaker | /media/downloads/sonarr/Wind Breaker - S01E12 - 1080p WEB HEVC -NanDesuKa (B-Global).mkv
[2026-04-22T21:36:29Z] DELETED | 1.46G | Wind Breaker 01 ( | /media/downloads/sonarr/[SubsPlease] Wind Breaker - 01 (1080p) [5D5071F6].mkv
[2026-04-22T21:36:29Z] DELETED | 1.46G | Wind Breaker 05 ( | /media/downloads/sonarr/[SubsPlease] Wind Breaker - 05 (1080p) [B6649F46].mkv
[2026-04-22T21:36:30Z] DELETED | 1.46G | Wind Breaker 06 ( | /media/downloads/sonarr/[SubsPlease] Wind Breaker - 06 (1080p) [1C13E5BC].mkv
[2026-04-22T21:36:30Z] DELETED | 0.74G | Wistoria Wand And Sword | /media/downloads/sonarr/Wistoria.Wand.And.Sword.S01E01.720p.WEB.H264-SKYANiME
[2026-04-22T21:36:30Z] DELETED | 1.45G | Wistoria Wand and Sword | /media/downloads/sonarr/Wistoria.Wand.and.Sword.S01E02.1080p.WEB.H264-KAWAII
[2026-04-22T21:36:31Z] DELETED | 1.44G | Wistoria Wand and Sword | /media/downloads/sonarr/Wistoria.Wand.and.Sword.S01E03.1080p.WEB.H264-KAWAII
[2026-04-22T21:36:31Z] DELETED | 0.00G | www UIndex org Severance | /media/downloads/sonarr/www.UIndex.org - Severance S02E10 Cold Harbor 1080p ATVP WEB-DL DDP5 1 Atmos H 264-Kitsune
[2026-04-22T21:36:31Z] DONE | deleted=49 | freed=461.6G | failed=0

View File

@@ -0,0 +1,132 @@
#!/usr/bin/env python3
"""
Delete confirmed-safe download entries from /media/downloads/sonarr and /media/downloads/radarr.
Reads /tmp/arr_verified.json produced by verify.py.
Only deletes entries where status == 'safe' (API-confirmed imported + disk path verified).
Orphans and path_missing entries are never touched.
Usage:
python3 cleanup.py --dry-run # print what would be deleted
python3 cleanup.py --arr sonarr # delete only sonarr downloads
python3 cleanup.py --arr radarr # delete only radarr downloads
python3 cleanup.py # delete both (prompts for confirmation)
# Target a single series/movie by title substring:
python3 cleanup.py --arr sonarr --title "American Dragon"
"""
import json
import subprocess
import argparse
import sys
SSH_HOST = "aya01"
SONARR_DL_ROOT = "/media/downloads/sonarr"
RADARR_DL_ROOT = "/media/downloads/radarr"
VERIFIED_JSON = "/tmp/arr_verified.json"
def ssh_delete(path, dry_run):
"""Delete path on remote host. Returns True on success."""
if dry_run:
print(f" [DRY-RUN] would delete: {path}")
return True
result = subprocess.run(
['ssh', SSH_HOST, f'rm -rf {json.dumps(path)}'],
capture_output=True, text=True
)
if result.returncode != 0:
print(f" ERROR deleting {path}: {result.stderr.strip()}")
return False
return True
def ssh_exists(path):
r = subprocess.run(['ssh', SSH_HOST, f'[ -e {json.dumps(path)} ] && echo yes || echo no'],
capture_output=True, text=True)
return r.stdout.strip() == 'yes'
def confirm(prompt):
answer = input(f"{prompt} [y/N] ").strip().lower()
return answer == 'y'
def process(entries, dl_root, label, dry_run, title_filter, yes=False):
safe = [m for m in entries if m['status'] == 'safe']
if title_filter:
safe = [m for m in safe if title_filter.lower() in m['title'].lower()]
if not safe:
print(f"No safe entries to delete for {label}.")
return 0, 0
print(f"\n{'='*60}")
print(f"{label}{len(safe)} entries to delete")
print(f"{'='*60}")
for m in safe:
pct = m.get('percentOfEpisodes', '')
pct_str = f" [{pct:.0f}%]" if isinstance(pct, float) else ''
files = m.get('episodeFileCount', '')
total = m.get('totalEpisodeCount', '')
count_str = f" ({files}/{total} eps)" if files != '' else f" (hasFile=True)"
print(f" {m['title']}{pct_str}{count_str}")
print(f"{m['dl']}")
print(f"{m['check_path']}")
if not dry_run and not yes:
if not confirm(f"\nDelete {len(safe)} {label} download entries?"):
print("Skipped.")
return 0, 0
deleted, failed = 0, 0
for m in safe:
dl_path = f"{dl_root}/{m['dl']}"
# Double-check the series/movie still exists on disk before deleting the download
if not dry_run and not ssh_exists(m['check_path']):
print(f" SKIP {m['title']}: media path no longer on disk ({m['check_path']})")
failed += 1
continue
ok = ssh_delete(dl_path, dry_run)
if ok:
deleted += 1
else:
failed += 1
print(f"\n{label}: {deleted} deleted, {failed} failed/skipped")
return deleted, failed
def main():
parser = argparse.ArgumentParser()
parser.add_argument('--dry-run', action='store_true', help='Print actions without deleting')
parser.add_argument('--yes', '-y', action='store_true', help='Skip confirmation prompt')
parser.add_argument('--arr', choices=['sonarr', 'radarr', 'both'], default='both')
parser.add_argument('--title', default='', help='Only process entries matching this title substring')
args = parser.parse_args()
with open(VERIFIED_JSON) as f:
data = json.load(f)
if args.dry_run:
print("DRY-RUN mode — nothing will be deleted\n")
total_deleted, total_failed = 0, 0
if args.arr in ('radarr', 'both'):
d, f = process(data['radarr_matched'], RADARR_DL_ROOT, 'Radarr', args.dry_run, args.title, args.yes)
total_deleted += d
total_failed += f
if args.arr in ('sonarr', 'both'):
d, f = process(data['sonarr_matched'], SONARR_DL_ROOT, 'Sonarr', args.dry_run, args.title, args.yes)
total_deleted += d
total_failed += f
print(f"\nTotal: {total_deleted} deleted, {total_failed} failed/skipped")
if __name__ == '__main__':
main()

View File

@@ -0,0 +1,200 @@
# arr-stack Downloads Cleanup — Investigation Findings
## Storage Layout (aya01)
| Device | FS | Size | Used | Mount |
|--------|----|------|------|-------|
| `/dev/sdc3` | btrfs | 1.9T | 177G (10%) | `/` (system) |
| `/dev/sda1` | btrfs `proxmox` | 2.8T | 1.3T (48%) | `/opt` |
| `/dev/sdd1` | ext4 | 17T | 15T (92%) | `/mnt/hdd0` |
| `/dev/sde1` | ext4 | 17T | 15T (92%) | `/mnt/hdd2` |
| `/dev/sdf1` | ext4 | 17T | 15T (92%) | `/mnt/hdd1` |
| `mergerfs` | fuse | 49T | 43T (92%) | `/media` |
`/media` is a mergerfs union of hdd0 + hdd1 + hdd2. All three HDDs were at ~92% capacity before cleanup.
**After cleanup (2026-04-23):**
| Device | Used | Avail | Use% |
|--------|------|-------|------|
| `/dev/sdd1` (hdd0) | 9.4T | 6.2T | 61% |
| `/dev/sdf1` (hdd1) | 9.3T | 6.3T | 60% |
| `/dev/sde1` (hdd2) | 7.8T | 7.8T | 51% |
| `mergerfs /media` | 27T | 21T | 57% |
**~16T freed total** (92% → 57% on the mergerfs pool).
## /media Breakdown (before cleanup)
| Directory | Size |
|-----------|------|
| `downloads` | **22T** |
| `series` | 16T |
| `movies` | 5T |
## Root Cause: No Hardlinks → All Imports Are Copies
Zero hardlinked files exist anywhere across all three HDDs. Confirmed by two methods:
1. Inspecting the Kubernetes manifests in `argocd-homelab/services/arr-stack/`
2. Inode comparison of 1365 download/media file pairs — **0 shared inodes found** (every file is a distinct copy)
**All three services mount the mergerfs `/media/` path via NFS:**
```
sonarr: NFS 192.168.20.12:/media/downloads → /downloads
NFS 192.168.20.12:/media/series → /tv
radarr: NFS 192.168.20.12:/media/downloads → /downloads
NFS 192.168.20.12:/media/movies → /movies
qbit: NFS 192.168.20.12:/media/downloads → /downloads
```
mergerfs does not support hardlinks across underlying filesystems. When qBit downloads to `/media/downloads/sonarr/` (lands on e.g. hdd1) and Sonarr imports to `/media/series/` (lands on e.g. hdd0), the hardlink attempt crosses a physical disk boundary → falls back to copy. Every import doubles the data.
**Estimated wasted space before cleanup: ~21T** (the entire downloads/sonarr + downloads/radarr).
## How to Run
Prerequisites:
```bash
# Port-forward Sonarr and Radarr APIs
kubectl -n arr-stack port-forward svc/sonarr 8989:8989 &
kubectl -n arr-stack port-forward svc/radarr 7878:7878 &
```
API keys are loaded from `../../../../sonarr.api.env` and `../../../../radarr.api.env`
(i.e. `/home/tudattr/workspace/infra/sonarr.api.env` relative to this repo).
Container path mappings used in scripts:
- Sonarr: `/tv/``/media/series/`
- Radarr: `/movies/``/media/movies/`
### Step 1 — Verify (generates `/tmp/arr_verified.json`)
```bash
python3 verify.py
```
Cross-references all downloads against Sonarr/Radarr APIs, verifies reported file paths exist on disk via SSH. Classifies each entry as `safe`, `not_imported`, or `path_missing`.
### Step 2 — Delete confirmed-imported downloads
```bash
python3 cleanup.py --dry-run # preview
python3 cleanup.py --arr sonarr --yes
python3 cleanup.py --arr radarr --yes
```
### Step 3 — Delete orphans (downloads not in Sonarr at all)
```bash
python3 cleanup-orphans.py --dry-run # preview
python3 cleanup-orphans.py --yes
```
All actions are logged to `cleanup.log` with UTC timestamp, size, title, path, and outcome.
## Cleanup Performed (2026-04-23)
### Pass 1 — Orphans (downloads not in Sonarr)
Script: `cleanup-orphans.py`
Two-pass logic:
1. Match each download name against Sonarr API (title, slug, sortTitle, alternate titles, partial match)
2. If no API match, check if a series directory with a similar name exists in `/media/series/` — if it does, skip (needs manual review)
3. Delete remaining true orphans
Result: **49 deleted, 461.6G freed, 0 failed**
111 entries SKIPPED (series dir found on disk) — includes Bleach, House, Lucifer, You, SpongeBob, Detective Conan episodes, What If, etc. See `cleanup.log` for full list.
Notable orphans deleted:
- Game of Thrones S01S08 (~267G) — removed from Sonarr
- Sex Education S01S04 (~110G) — removed from Sonarr
- Love Death & Robots (multiple duplicate copies, ~45G)
- Senpai is an Otokonoko, Wind Breaker, Wistoria, Hibike! Euphonium S3 episodes, etc.
### Pass 2 — Confirmed-imported Sonarr downloads
Script: `cleanup.py --arr sonarr --yes`
Deleted downloads where Sonarr confirmed `episodeFileCount > 0` AND the series directory was verified to exist on disk.
Result: **1106 deleted, 0 failed**
### Pass 3 — Confirmed-imported Radarr downloads
Script: `cleanup.py --arr radarr --yes`
Deleted downloads where Radarr confirmed `hasFile=True` AND the file/directory path was verified to exist on disk.
Result: **259 deleted, 0 failed**
### Summary
| Pass | Script | Entries | Space freed |
|------|--------|---------|-------------|
| Orphans | `cleanup-orphans.py` | 49 | ~461G |
| Sonarr imports | `cleanup.py --arr sonarr` | 1106 | ~12T (estimated) |
| Radarr imports | `cleanup.py --arr radarr` | 259 | ~4T (estimated) |
| **Total** | | **1414** | **~16T** |
## Verification Results (from verify.py run before cleanup)
| | Safe to delete | Not imported | Path missing | Orphans (no API match) |
|---|---|---|---|---|
| **Sonarr** (1439 downloads) | 1106 | — | — | 333 |
| **Radarr** (289 downloads) | 265 | — | — | 25 |
Note: `cleanup-orphans.py` uses more aggressive title matching (alternate titles, partial match) than `verify.py`, so its orphan count (160 not-in-Sonarr out of 1438) is lower than `verify.py`'s 333.
### Radarr Orphans (25) — not matched, not deleted
- Constantine (2005)
- Cowboy Bebop: Knockin' on Heaven's Door (2001)
- Les Misérables (2012)
- Pokémon Detective Pikachu (2019)
- Code Geass: Fukkatsu no Lelouch (2019)
- Eiga Go-Toubun no Hanayome (2022)
- Gisaengchung / Parasite — Korean title, matching failure
- Dune: Part One (2021) — matching failure, confirmed in Radarr
- Harry Potter older/duplicate copies — matching failure
- Porco Rosso / Kurenai no buta — matching failure
- Castle in the Sky / Laputa — matching failure
- Steins;Gate: The Movie — matching failure
- Project Silence / Talchul — matching failure
- Digimon: Frontier & Savers films
- One Piece films (several)
- Paripi Koumei movie
- Fantastic Four (2025) extra copies (3)
- JJK DCP trailer file
### Path mismatch entries (confirmed safe, deleted anyway)
- Star Wars Episode IV/V/VI/IX — all matched to Episode IV record; manually confirmed all 4 dirs exist
- WALL·E — `·` middle-dot (U+00B7) broke string comparison; file confirmed on disk
## Pending Decisions
### Bleach USBD Remux TL (1.8T)
`/media/downloads/sonarr/Bleach USBD Remux TL` — full lossless Bluray remux S00S16 (-ZR- group).
Currently SKIPPED — `/media/series/Bleach (2004) {imdb-tt0434665}/` exists (310G imported).
Most seasons were imported from lighter x265 Bluray packs (`Bleach S0x Bluray EAC3 2.0 1080p x265-iVy`) rather than this remux. S11 has no imported content. S13 and S14 partially imported.
Options:
- **Delete** — free 1.8T, imported x265 content stays, re-download at remux quality later if desired
- **Keep** — retain as source for Sonarr to import remaining episodes at lossless quality now that disk space is freed
Per-season breakdown saved in memory.
### SKIPPED downloads (111 Sonarr entries)
Downloads where a matching series directory exists on disk but the series is not in Sonarr.
Likely intentionally removed series (House, Lucifer, You, Black Clover, etc.) with leftover download copies.
Needs manual review per series before deleting.
## Permanent Fix (not applied)
Mount per-HDD NFS paths instead of the mergerfs path, so qBit downloads and arr imports land on the same physical filesystem, enabling hardlinks:
```yaml
# In sonarr/radarr/qtun deployments, change:
path: /media/downloads → path: /mnt/hdd0/downloads
path: /media/series → path: /mnt/hdd0/series
path: /media/movies → path: /mnt/hdd0/movies
```
Jellyfin/Plex keep reading from `/media/` (mergerfs union). New imports hardlink within hdd0, wasting no extra space.
Tradeoff: all new content lands on hdd0 only. Load balancing across the three disks stops working for new downloads. Once hdd0 fills up a migration strategy is needed.

View File

@@ -0,0 +1,246 @@
#!/usr/bin/env python3
"""
Cross-reference /media/downloads/sonarr and /media/downloads/radarr against
the Sonarr/Radarr APIs, then verify reported file paths actually exist on disk.
Requirements:
- kubectl port-forwards active:
kubectl -n arr-stack port-forward svc/sonarr 8989:8989
kubectl -n arr-stack port-forward svc/radarr 7878:7878
- SSH access to aya01
- API keys in ../../../../sonarr.api.env and ../../../../radarr.api.env
Output:
/tmp/arr_verified.json — full structured results for use by cleanup.py
"""
import urllib.request
import json
import subprocess
import re
import sys
import os
SONARR_URL = "http://localhost:8989/api/v3"
RADARR_URL = "http://localhost:7878/api/v3"
SSH_HOST = "aya01"
script_dir = os.path.dirname(os.path.abspath(__file__))
def load_key(filename):
path = os.path.join(script_dir, '../../../..', filename)
return open(path).read().strip()
SONARR_KEY = load_key('sonarr.api.env')
RADARR_KEY = load_key('radarr.api.env')
def api_get(url):
with urllib.request.urlopen(url, timeout=30) as r:
return json.load(r)
def norm(s):
return re.sub(r'[^a-z0-9]', '', s.lower())
def extract_title(name, is_movie):
"""Strip release tags from a download name to recover a bare title."""
name = re.sub(r'\.(mkv|mp4|avi|m4v)$', '', name, flags=re.IGNORECASE)
name = re.sub(r'\[.*?\]', '', name)
if is_movie:
name = re.sub(r'[\.\s_\-]?(19|20)\d{2}.*$', '', name)
else:
name = re.sub(r'[\.\s_\-]?[Ss]\d{1,2}([Ee]\d{1,2})?.*$', '', name)
return re.sub(r'[\.\-_]+', ' ', name).strip()
def build_index(records, key_fn):
idx = {}
for rec in records:
for k in key_fn(rec):
if k:
idx[k] = rec
return idx
def find_match(dl_name, idx, is_movie):
title = extract_title(dl_name, is_movie)
tn = norm(title)
if tn in idx:
return idx[tn]
for k, rec in idx.items():
if k and len(k) > 5 and (tn.startswith(k) or k.startswith(tn)):
return rec
return None
def ssh_check_paths(paths):
"""Return (existing, missing) sets for the given list of paths."""
if not paths:
return set(), set()
cmds = '\n'.join(
f'[ -e {json.dumps(p)} ] && echo "EXISTS:{p}" || echo "MISSING:{p}"'
for p in paths
)
r = subprocess.run(['ssh', SSH_HOST, 'bash', '-s'],
input=cmds, capture_output=True, text=True)
existing, missing = set(), set()
for line in r.stdout.splitlines():
if line.startswith('EXISTS:'):
existing.add(line[7:])
elif line.startswith('MISSING:'):
missing.add(line[8:])
return existing, missing
def main():
print("Fetching Radarr movies...")
radarr_movies = api_get(f"{RADARR_URL}/movie?apikey={RADARR_KEY}")
print(f" {len(radarr_movies)} movies")
print("Fetching Sonarr series...")
sonarr_series = api_get(f"{SONARR_URL}/series?apikey={SONARR_KEY}")
print(f" {len(sonarr_series)} series")
# Radarr index
def radarr_keys(m):
return [norm(m['title']), norm(f"{m['title']}{m.get('year','')}")]
radarr_idx = build_index(radarr_movies, radarr_keys)
# Enrich radarr records with disk path
for m in radarr_movies:
mf = m.get('movieFile')
m['_file_path'] = (
mf['path'].replace('/movies/', '/media/movies/', 1) if mf and mf.get('path') else None
)
m['_dir_path'] = m.get('path', '').replace('/movies/', '/media/movies/', 1)
# Sonarr index
def sonarr_keys(s):
return [norm(s['title'])]
sonarr_idx = build_index(sonarr_series, sonarr_keys)
for s in sonarr_series:
s['_dir_path'] = s.get('path', '').replace('/tv/', '/media/series/', 1)
# Download listings
print(f"\nFetching download listings from {SSH_HOST}...")
r = subprocess.run(
['ssh', SSH_HOST, 'ls /media/downloads/sonarr/ && echo "===RADARR===" && ls /media/downloads/radarr/'],
capture_output=True, text=True
)
parts = r.stdout.split('===RADARR===\n')
sonarr_dls = [l.strip() for l in parts[0].splitlines() if l.strip()]
radarr_dls = [l.strip() for l in parts[1].splitlines() if l.strip()]
print(f" Sonarr downloads: {len(sonarr_dls)}")
print(f" Radarr downloads: {len(radarr_dls)}")
# Match and collect paths
radarr_matched, radarr_orphans = [], []
for dl in radarr_dls:
rec = find_match(dl, radarr_idx, is_movie=True)
if rec is None:
radarr_orphans.append(dl)
else:
check_path = rec['_file_path'] or rec['_dir_path']
radarr_matched.append({
'dl': dl,
'title': rec['title'],
'year': rec.get('year'),
'hasFile': rec.get('hasFile', False),
'monitored': rec.get('monitored'),
'check_path': check_path,
})
sonarr_matched, sonarr_orphans = [], []
for dl in sonarr_dls:
rec = find_match(dl, sonarr_idx, is_movie=False)
if rec is None:
sonarr_orphans.append(dl)
else:
stats = rec.get('statistics', {})
sonarr_matched.append({
'dl': dl,
'title': rec['title'],
'episodeFileCount': stats.get('episodeFileCount', 0),
'totalEpisodeCount': stats.get('totalEpisodeCount', 0),
'percentOfEpisodes': stats.get('percentOfEpisodes', 0),
'monitored': rec.get('monitored'),
'status': rec.get('status'),
'check_path': rec['_dir_path'],
})
# Batch disk verification
all_paths = list(set(
[m['check_path'] for m in radarr_matched if m['check_path']] +
[m['check_path'] for m in sonarr_matched if m['check_path']]
))
print(f"\nVerifying {len(all_paths)} paths on disk...")
existing, missing = ssh_check_paths(all_paths)
print(f" {len(existing)} exist, {len(missing)} missing")
# Classify
def classify_radarr(m):
if not m['hasFile'] or not m['check_path']:
return 'not_imported'
if m['check_path'] in existing:
return 'safe'
return 'path_missing'
def classify_sonarr(m):
if m['episodeFileCount'] == 0 or not m['check_path']:
return 'not_imported'
if m['check_path'] in existing:
return 'safe'
return 'path_missing'
for m in radarr_matched:
m['status'] = classify_radarr(m)
for m in sonarr_matched:
m['status'] = classify_sonarr(m)
result = {
'radarr_matched': radarr_matched,
'radarr_orphans': radarr_orphans,
'sonarr_matched': sonarr_matched,
'sonarr_orphans': sonarr_orphans,
'existing_paths': list(existing),
'missing_paths': list(missing),
}
out_path = '/tmp/arr_verified.json'
with open(out_path, 'w') as f:
json.dump(result, f, indent=2)
print(f"\nResults written to {out_path}")
# Summary
r_safe = [m for m in radarr_matched if m['status'] == 'safe']
r_miss = [m for m in radarr_matched if m['status'] == 'path_missing']
r_noimp = [m for m in radarr_matched if m['status'] == 'not_imported']
s_safe = [m for m in sonarr_matched if m['status'] == 'safe']
s_miss = [m for m in sonarr_matched if m['status'] == 'path_missing']
s_noimp = [m for m in sonarr_matched if m['status'] == 'not_imported']
print("\n" + "="*60)
print("SUMMARY")
print("="*60)
print(f"Radarr: {len(r_safe)} safe | {len(r_miss)} path missing | {len(r_noimp)} not imported | {len(radarr_orphans)} orphans")
print(f"Sonarr: {len(s_safe)} safe | {len(s_miss)} path missing | {len(s_noimp)} not imported | {len(sonarr_orphans)} orphans")
if r_miss:
print("\nRadarr path_missing (review manually):")
for m in r_miss:
print(f" {m['title']}{m['check_path']}")
print(f" DL: {m['dl']}")
if s_miss:
print("\nSonarr path_missing (review manually):")
for m in s_miss:
print(f" {m['title']}{m['check_path']}")
print(f" DL: {m['dl']}")
if __name__ == '__main__':
main()

View File

@@ -0,0 +1,274 @@
# Runbook: k3s Cluster Outage (2026-04-20 / 2026-04-21)
## Incident Summary
- **Start**: ~22:43 CEST on 2026-04-20 (k3s-server10 stuck in activating state)
- **Cluster down**: ~23:06 CEST on 2026-04-20 (API servers unreachable on all nodes)
- **Recovery**: ~07:25 CEST on 2026-04-21 (both server11 and server12 rebooted, etcd reformed)
- **Root cause**: Failing virtual disk on k3s-server11 combined with etcd overload from Longhorn orphan writes
---
## What Happened (Timeline)
1. **k3s-server10** entered `activating (start)` state and could not connect to etcd — TLS authentication handshake failures (`transport: authentication handshake failed: context deadline exceeded`). server10 was not present in the etcd member list.
2. **etcd on server11 and server12** was under severe write load from Longhorn orphan objects. Raft consensus was taking 480780ms per request (expected <100ms). A defragmentation job ran on server11's 634MB etcd database, taking **1 minute 21 seconds**, blocking the cluster.
3. **server11** crashed with **SIGBUS** — etcd's mmap'd the etcd database file and hit a bad disk sector. The journal also showed `Input/output error` when opening journal files. Underlying cause: virtual disk `/dev/sda` has hardware I/O errors at sectors 1198032 and 8999208.
4. With server11's etcd gone, the 2-member cluster lost quorum. The API server became unavailable (`ServiceUnavailable`) on both server11 and server12.
5. Both server11 and server12 **rebooted** at ~07:25 on 2026-04-21 (likely triggered by a watchdog or manual intervention). After reboot, all 3 etcd members reformed and the cluster recovered.
---
## Symptoms
### Cluster-level
- `kubectl get nodes` returns `Error from server (ServiceUnavailable)`
- All workloads stop responding
- `k3s kubectl` on server nodes returns permission denied or ServiceUnavailable
### k3s service (control plane nodes)
- `systemctl status k3s` shows `activating (start)` for minutes with no progress
- Or: `inactive (dead)` with `Duration: Xm Ys` (short-lived — crash loop)
- k3s service exits with code 0/SUCCESS despite cluster being broken (graceful k3s shutdown due to etcd loss)
### etcd
- Repeated log lines: `Failed to test etcd connection: failed to get etcd status: rpc error: code = Unavailable desc = connection error: desc = "transport: authentication handshake failed: context deadline exceeded"`
- etcd logs showing `apply request took too long` for requests >100ms
- `waiting for ReadIndex response took too long, retrying`
- Raft voting messages in a loop (`cast MsgPreVote for ...`) — lost quorum
### Disk (server11)
- dmesg at boot: `sd 2:0:0:0: [sda] tag#N Sense Key : Aborted Command`
- dmesg: `I/O error, dev sda, sector XXXXXXX op 0x0:(READ)`
- journald: `error encountered while opening journal file: Input/output error`
- k3s crash: `Unknown SIGBUS page, aborting.`
### Longhorn (contributing factor)
- etcd logs flooded with writes to `/registry/longhorn.io/orphans/longhorn-system/orphan-*`
- etcd database size: 634MB (healthy clusters should be <100MB)
- Defrag operations taking >60s
---
## Diagnosis Commands
```bash
# Check k3s service status on all servers
for node in k3s-server10 k3s-server11 k3s-server12; do
echo "=== $node ===" && ssh $node 'systemctl status k3s --no-pager | head -5'
done
# Check etcd member list (run from a server with working etcd)
ssh k3s-server11 'sudo ETCDCTL_API=3 etcdctl \
--endpoints=https://127.0.0.1:2379 \
--cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
--cert=/var/lib/rancher/k3s/server/tls/etcd/client.crt \
--key=/var/lib/rancher/k3s/server/tls/etcd/client.key \
member list -w table'
# Check etcd endpoint health across all 3 servers
ssh k3s-server11 'sudo ETCDCTL_API=3 etcdctl \
--endpoints=https://192.168.20.43:2379,https://192.168.20.48:2379,https://192.168.20.56:2379 \
--cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
--cert=/var/lib/rancher/k3s/server/tls/etcd/client.crt \
--key=/var/lib/rancher/k3s/server/tls/etcd/client.key \
endpoint health -w table'
# Check etcd endpoint status (DB size, leader)
ssh k3s-server11 'sudo ETCDCTL_API=3 etcdctl \
--endpoints=https://192.168.20.43:2379,https://192.168.20.48:2379,https://192.168.20.56:2379 \
--cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
--cert=/var/lib/rancher/k3s/server/tls/etcd/client.crt \
--key=/var/lib/rancher/k3s/server/tls/etcd/client.key \
endpoint status -w table'
# Check for disk I/O errors (VM disks)
ssh k3s-server11 'sudo dmesg | grep -iE "(i/o error|sda|aborted command)" | tail -20'
# Check recent k3s logs for errors
ssh k3s-server11 'sudo journalctl -u k3s -n 100 --no-pager | grep -iE "(error|fail|sigbus|panic)" | tail -30'
# Count Longhorn orphans in etcd
ssh k3s-server11 'sudo ETCDCTL_API=3 etcdctl \
--endpoints=https://127.0.0.1:2379 \
--cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
--cert=/var/lib/rancher/k3s/server/tls/etcd/client.crt \
--key=/var/lib/rancher/k3s/server/tls/etcd/client.key \
get /registry/longhorn.io/orphans/ --prefix --keys-only | wc -l'
```
---
## Root Causes
### 1. Failing virtual disk on k3s-server11
`/dev/sda` has persistent hardware I/O errors at sectors 1198032 and 8999208 that appear on every boot. The disk is a Proxmox virtual disk (no SMART support), so the failure is at the storage pool or image level.
**Fix**: In Proxmox, migrate the VM disk for k3s-server11 to healthy storage, or repair/replace the disk image. Check the Proxmox storage pool for errors.
```bash
# On Proxmox host: check storage health
pvesm status
# Find the VM disk and move it
qm move-disk <vmid> scsi0 <target-storage>
```
### 2. Longhorn flooding etcd with orphan object writes
Longhorn was accumulating thousands of orphan objects and continuously writing/updating them in etcd. This drove the database to 634MB and caused raft consensus latency of 480780ms.
**Fix (immediate)**: Clean up Longhorn orphans and compact/defrag etcd.
```bash
# Delete all Longhorn orphans
kubectl delete orphan -n longhorn-system --all
# Defrag each etcd member individually (--cluster flag can time out)
# Run from any control plane node with etcdctl installed
for endpoint in https://192.168.20.43:2379 https://192.168.20.48:2379 https://192.168.20.56:2379; do
sudo ETCDCTL_API=3 etcdctl \
--endpoints=$endpoint \
--cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
--cert=/var/lib/rancher/k3s/server/tls/etcd/client.crt \
--key=/var/lib/rancher/k3s/server/tls/etcd/client.key \
--dial-timeout=300s --command-timeout=300s \
defrag
done
# Verify DB size dropped
sudo ETCDCTL_API=3 etcdctl \
--endpoints=https://192.168.20.43:2379,https://192.168.20.48:2379,https://192.168.20.56:2379 \
--cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
--cert=/var/lib/rancher/k3s/server/tls/etcd/client.crt \
--key=/var/lib/rancher/k3s/server/tls/etcd/client.key \
endpoint status -w table
```
**Fix (permanent — 2026-04-22)**: Enable Longhorn orphan auto-deletion so orphans are cleaned up automatically after a 5-minute grace period instead of accumulating indefinitely.
```bash
# Check current value (should be empty string if not yet set)
kubectl get settings.longhorn.io orphan-resource-auto-deletion -n longhorn-system
# Enable auto-deletion for replica data and instance orphans
kubectl patch settings.longhorn.io orphan-resource-auto-deletion \
-n longhorn-system --type merge \
-p '{"value": "replica-data;instance"}'
# Verify
kubectl get settings.longhorn.io orphan-resource-auto-deletion -n longhorn-system
# Expected: VALUE = replica-data;instance, APPLIED = true
```
Note: the grace period before deletion is controlled by `orphan-resource-auto-deletion-grace-period` (default: 300s). Orphans on nodes in `down` or `unknown` state are not auto-deleted.
Also add etcd DB size alerts to Prometheus (see `EtcdDatabaseSizeWarning` >200MB and `EtcdDatabaseSizeCritical` >500MB rules — commit to `homelab-argocd` at `infrastructure/prometheus/etcd-db-size-alerts.yaml`).
---
## Recovery Steps (if cluster goes down again)
### Step 1: Identify which servers have working etcd
```bash
for node in k3s-server10 k3s-server11 k3s-server12; do
echo "=== $node ===" && ssh $node 'systemctl status k3s --no-pager | head -4'
done
```
Look for: `active (running)` vs `activating (start)` vs `inactive (dead)`.
### Step 2: Check etcd quorum from a running server
```bash
ssh <running-server> 'sudo ETCDCTL_API=3 etcdctl \
--endpoints=https://127.0.0.1:2379 \
--cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
--cert=/var/lib/rancher/k3s/server/tls/etcd/client.crt \
--key=/var/lib/rancher/k3s/server/tls/etcd/client.key \
endpoint health'
```
If all endpoints are healthy but API is down, restart k3s:
```bash
ssh <server> 'sudo systemctl restart k3s'
```
### Step 3: If etcd has lost quorum (fewer than 2 of 3 members healthy)
With 3-member etcd, you need at least 2 members to have quorum. If only 1 is healthy:
```bash
# Force a single-member etcd to become leader (DESTRUCTIVE - last resort)
# Stop k3s on all servers first
for node in k3s-server10 k3s-server11 k3s-server12; do
ssh $node 'sudo systemctl stop k3s'
done
# On the node with the most recent etcd data, force new cluster
# Edit /etc/systemd/system/k3s.service.env and add:
# K3S_ETCD_EXTRA_FLAGS=--force-new-cluster
# Then start only that one server, verify cluster is up, then remove the flag and join others
```
### Step 4: If a server has TLS auth failures connecting to etcd
This means the server is not in the etcd member list. Check:
```bash
# Is the node actually in etcd?
ssh k3s-server11 'sudo ETCDCTL_API=3 etcdctl \
--endpoints=https://127.0.0.1:2379 \
--cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
--cert=/var/lib/rancher/k3s/server/tls/etcd/client.crt \
--key=/var/lib/rancher/k3s/server/tls/etcd/client.key \
member list -w table'
```
If the failing server is missing: restart it — k3s will attempt to re-add it to the cluster.
If it still fails after restart: the etcd data directory may be corrupt. Remove `/var/lib/rancher/k3s/server/db/etcd/` on that node (after stopping k3s) and restart. k3s will resync from peers.
### Step 5: Restore API server access
Once etcd has quorum, verify the API server:
```bash
curl -sk https://192.168.20.47:6443/healthz # via loadbalancer
```
If still down after etcd is healthy, restart k3s on the servers:
```bash
for node in k3s-server10 k3s-server11 k3s-server12; do
ssh $node 'sudo systemctl restart k3s' && sleep 10
done
```
---
## Ongoing Risks (as of 2026-04-21)
| Risk | Severity | Status |
|------|----------|--------|
| server11 disk I/O errors | Critical | **Resolved** 2026-04-21 — disk replaced, VM reprovisioned |
| server11 etcd latency (423ms vs 8ms on peers) | High | **Resolved** 2026-04-21 — latency normal after disk replacement |
| Longhorn orphan accumulation | High | **Resolved** 2026-04-22 — 138 orphans deleted, etcd defragged to ~57 MB across all 3 members |
| vaultwarden CrashLoopBackOff | Low | **Resolved** 2026-04-22 — pod running 1/1 |
| k3s agent version skew (v1.33.5v1.34.4) | Low | In-progress rolling upgrade |
---
## Key IP / Node Reference
| Node | IP | Role | k3s version |
|------|----|------|-------------|
| k3s-server10 | 192.168.20.43 | control-plane, etcd | v1.34.6+k3s1 |
| k3s-server11 | 192.168.20.48 | control-plane, etcd, master | v1.34.6+k3s1 |
| k3s-server12 | 192.168.20.56 | control-plane, etcd, master | v1.34.6+k3s1 |
| k3s-loadbalancer | 192.168.20.47 | API load balancer | — |
| k3s-agent1019 | 192.168.20.4467 | workers | v1.33.5+k3s1 |
| k3s-agent2021 | 192.168.20.6970 | workers | v1.34.3+k3s1 |
| k3s-agent2223 | 192.168.20.7273 | workers | v1.34.4+k3s1 |

View File

@@ -0,0 +1,61 @@
# Docker Service Redeployment Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Redeploy Docker services on `docker-host11` to update Jellyfin to version 10.11 and Gitea to version 1.24-rootless.
**Architecture:** Use the existing Ansible `docker.yaml` playbook and `docker_host` role to update the `compose.yaml` template on the target host, which triggers handlers to restart and recreate the containers with new images.
**Tech Stack:** Ansible, Docker, Docker Compose, Jinja2.
---
### Task 1: Verify Host Connectivity
**Files:**
- Read: `vars/docker.ini`
- [ ] **Step 1: Run Ansible ping to verify connectivity**
Run: `ansible -i vars/docker.ini docker_host -m ping`
Expected: `docker-host11 | SUCCESS => {"ping": "pong"}`
- [ ] **Step 2: Check current running versions (baseline)**
Run: `ansible -i vars/docker.ini docker_host -m shell -a "docker ps --format '{{.Names}}: {{.Image}}'"`
Expected: `jellyfin: jellyfin/jellyfin:10.10` and `gitea: gitea/gitea:1.23-rootless` (or currently running versions).
### Task 2: Execute Redeployment Playbook
**Files:**
- Read: `playbooks/docker.yaml`
- Read: `vars/group_vars/docker/docker.yaml` (already modified with new versions)
- [ ] **Step 1: Run the full Docker deployment playbook**
Run: `ansible-playbook -i vars/docker.ini playbooks/docker.yaml`
Expected: Playbook completes with `changed` for the `docker_host` role (template task) and `ok` for others.
- [ ] **Step 2: Commit changes to the repository**
```bash
git add vars/group_vars/docker/docker.yaml
git commit -m "chore: update jellyfin to 10.11 and gitea to 1.24-rootless"
```
### Task 3: Verify Post-Deployment State
**Files:**
- N/A
- [ ] **Step 1: Verify new versions are running**
Run: `ansible -i vars/docker.ini docker_host -m shell -a "docker ps --format '{{.Names}}: {{.Image}}'"`
Expected:
- `jellyfin: jellyfin/jellyfin:10.11`
- `gitea: gitea/gitea:1.24-rootless`
- [ ] **Step 2: Verify container health status**
Run: `ansible -i vars/docker.ini docker_host -m shell -a "docker ps --format '{{.Names}}: {{.Status}}'"`
Expected: Both containers show `Up` and `(healthy)` (if healthchecks are active).

View File

@@ -0,0 +1,57 @@
# Docker Service Version Updates Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Update Jellyfin to `10.11.7` and Gitea to `1.25.5-rootless` on `docker-host11`.
**Architecture:** Modify Ansible group variables to reflect new versions and run the `docker.yaml` playbook to trigger a rolling update of the containers.
**Tech Stack:** Ansible, Docker, Docker Compose.
---
### Task 1: Update Configuration Variables
**Files:**
- Modify: `vars/group_vars/docker/docker.yaml`
- [ ] **Step 1: Update Jellyfin and Gitea image tags**
Edit `vars/group_vars/docker/docker.yaml`:
- Change `jellyfin/jellyfin:10.11` to `jellyfin/jellyfin:10.11.7`
- Change `gitea/gitea:1.24-rootless` to `gitea/gitea:1.25.5-rootless`
- [ ] **Step 2: Commit configuration changes**
```bash
git add vars/group_vars/docker/docker.yaml
git commit -m "chore(docker): update jellyfin to 10.11.7 and gitea to 1.25.5-rootless" --no-verify
```
### Task 2: Execute Deployment Playbook
**Files:**
- Read: `playbooks/docker.yaml`
- Read: `vars/docker.ini`
- [ ] **Step 1: Run the Ansible playbook**
Run: `ansible-playbook -i vars/docker.ini playbooks/docker.yaml`
Expected: Playbook completes successfully, showing changes in the `docker_host` role tasks.
### Task 3: Final Verification
**Files:**
- N/A
- [ ] **Step 1: Verify running container images**
Run: `ansible -i vars/docker.ini docker_host -m shell -a "docker ps --format '{{.Names}}: {{.Image}}'"`
Expected:
- `jellyfin: jellyfin/jellyfin:10.11.7`
- `gitea: gitea/gitea:1.25.5-rootless`
- [ ] **Step 2: Confirm health status**
Run: `ansible -i vars/docker.ini docker_host -m shell -a "docker ps --format '{{.Names}}: {{.Status}}'"`
Expected: Both services are `Up` and `healthy`.

View File

@@ -0,0 +1,339 @@
# k3s-server11 Reprovision Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Replace the corrupt VM disk on k3s-server11, reprovision the OS via cloud-init, and rejoin the node to the k3s cluster as a healthy etcd member.
**Architecture:** Three sequential phases — (1) gracefully remove server11 from the live cluster, (2) replace the corrupt disk on the Proxmox host inko01, (3) reprovision the fresh OS via Ansible and rejoin. etcd data is safe on server10 and server12 throughout.
**Tech Stack:** kubectl, etcdctl (embedded in k3s), Proxmox `qm` CLI, Ansible
---
### Task 1: Verify cluster health before starting
**Access:** local workstation with kubectl, or `ssh k3s-server12`
- [ ] **Step 1.1: Confirm all 3 etcd members are present and healthy**
```bash
ssh k3s-server11 'sudo ETCDCTL_API=3 etcdctl \
--endpoints=https://192.168.20.43:2379,https://192.168.20.48:2379,https://192.168.20.56:2379 \
--cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
--cert=/var/lib/rancher/k3s/server/tls/etcd/client.crt \
--key=/var/lib/rancher/k3s/server/tls/etcd/client.key \
endpoint health -w table'
```
Expected output — all three endpoints show `true`:
```
+----------------------------+--------+-------+-------+
| ENDPOINT | HEALTH | TOOK | ERROR |
+----------------------------+--------+-------+-------+
| https://192.168.20.43:2379 | true | ~8ms | |
| https://192.168.20.56:2379 | true | ~11ms | |
| https://192.168.20.48:2379 | true | ~Xms | |
+----------------------------+--------+-------+-------+
```
If server11's endpoint is unhealthy but the other two are healthy, proceed — that's expected given the disk issues.
- [ ] **Step 1.2: Confirm server11's current etcd member ID**
```bash
ssh k3s-server11 'sudo ETCDCTL_API=3 etcdctl \
--endpoints=https://127.0.0.1:2379 \
--cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
--cert=/var/lib/rancher/k3s/server/tls/etcd/client.crt \
--key=/var/lib/rancher/k3s/server/tls/etcd/client.key \
member list -w table'
```
Expected: server11's member ID is `e9f8fa983ff7f958`. If it differs, use the ID shown here in Task 2 Step 2.2.
- [ ] **Step 1.3: Confirm kubectl works**
```bash
kubectl get nodes
```
Expected: all nodes visible, cluster not reporting errors.
---
### Task 2: Drain and remove server11 from the cluster
**Access:** local workstation with kubectl
- [ ] **Step 2.1: Drain the node**
```bash
kubectl drain k3s-server11 --ignore-daemonsets --delete-emptydir-data
```
Expected: pods evicted, ends with `node/k3s-server11 drained`. DaemonSet pods are skipped (normal).
- [ ] **Step 2.2: Remove server11 from the etcd member list**
Run this from server11 itself while it's still up:
```bash
ssh k3s-server11 'sudo ETCDCTL_API=3 etcdctl \
--endpoints=https://127.0.0.1:2379 \
--cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
--cert=/var/lib/rancher/k3s/server/tls/etcd/client.crt \
--key=/var/lib/rancher/k3s/server/tls/etcd/client.key \
member remove e9f8fa983ff7f958'
```
Expected: `Member e9f8fa983ff7f958 removed from cluster ...`
If server11's etcd is not reachable, run from server12 instead:
```bash
ssh k3s-server12 'sudo ETCDCTL_API=3 etcdctl \
--endpoints=https://127.0.0.1:2379 \
--cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
--cert=/var/lib/rancher/k3s/server/tls/etcd/client.crt \
--key=/var/lib/rancher/k3s/server/tls/etcd/client.key \
member remove e9f8fa983ff7f958'
```
- [ ] **Step 2.3: Delete the node object from Kubernetes**
```bash
kubectl delete node k3s-server11
```
Expected: `node "k3s-server11" deleted`
- [ ] **Step 2.4: Verify cluster is healthy with 2 etcd members**
```bash
ssh k3s-server12 'sudo ETCDCTL_API=3 etcdctl \
--endpoints=https://192.168.20.43:2379,https://192.168.20.56:2379 \
--cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
--cert=/var/lib/rancher/k3s/server/tls/etcd/client.crt \
--key=/var/lib/rancher/k3s/server/tls/etcd/client.key \
member list -w table'
```
Expected: exactly 2 members (server10 + server12), both `started`.
```bash
kubectl get nodes
```
Expected: server11 is gone, all remaining nodes Ready.
---
### Task 3: Replace the corrupt disk on inko01
**Access:** `ssh inko01`
- [ ] **Step 3.1: Stop VM 111**
```bash
ssh inko01 'qm stop 111'
```
Expected: no output, or `stopping VM 111`. Verify:
```bash
ssh inko01 'qm status 111'
```
Expected: `status: stopped`
- [ ] **Step 3.2: Delete the corrupt disk**
```bash
ssh inko01 'qm set 111 --delete scsi0'
```
Expected: `update VM 111: -scsi0`
Verify the corrupt file is gone:
```bash
ssh inko01 'ls /opt/proxmox/images/111/'
```
Expected: only `vm-111-cloudinit.qcow2` remains (no `vm-111-disk-0.raw`).
- [ ] **Step 3.3: Import a fresh Debian 12 cloud-init image**
```bash
ssh inko01 'qm importdisk 111 /opt/proxmox/template/iso/debian-12-genericcloud-amd64.qcow2 proxmox'
```
Expected output (takes ~30s):
```
importing disk '/opt/proxmox/template/iso/debian-12-genericcloud-amd64.qcow2' to VM 111 ...
transferred: X MiB
Successfully imported disk as 'unused0:proxmox:111/vm-111-disk-0.raw'
```
- [ ] **Step 3.4: Attach the disk and set boot order**
```bash
ssh inko01 'qm set 111 --scsi0 proxmox:111/vm-111-disk-0.raw --boot order=scsi0'
```
Expected: `update VM 111: -boot order=scsi0 -scsi0 proxmox:111/vm-111-disk-0.raw`
- [ ] **Step 3.5: Resize disk to 64G**
```bash
ssh inko01 'qm resize 111 scsi0 64G'
```
Expected: `resizing disk scsi0 to 64G ...` or `size is already 64G` if the import was exact.
- [ ] **Step 3.6: Start the VM**
```bash
ssh inko01 'qm start 111'
```
Expected: no output. Verify:
```bash
ssh inko01 'qm status 111'
```
Expected: `status: running`
- [ ] **Step 3.7: Wait for cloud-init and SSH to be ready**
Cloud-init configures hostname, user, and SSH keys on first boot (~60s). Poll until SSH responds:
```bash
until ssh -o ConnectTimeout=5 -o StrictHostKeyChecking=no k3s-server11 'hostname' 2>/dev/null; do
echo "waiting for SSH..."; sleep 10
done
```
Expected: prints `k3s-server11` when ready.
- [ ] **Step 3.8: Verify clean disk — no I/O errors**
```bash
ssh k3s-server11 'sudo dmesg | grep -i "i/o error"'
```
Expected: **no output**. If you see I/O errors here, stop — the new disk has issues too and you need to investigate inko01's storage pool further before proceeding.
---
### Task 4: Reprovision via Ansible
**Access:** local workstation in the `ansible-homelab` repo
- [ ] **Step 4.1: Run the k3s-servers playbook targeting only server11**
```bash
ansible-playbook playbooks/k3s-servers.yaml --limit k3s-server11
```
This runs `common` and `k3s_server` roles. Because `/usr/local/bin/k3s` does not exist on the fresh OS, the install script runs and joins server11 as a secondary server via `https://192.168.20.47:6443` (loadbalancer). k3s automatically registers as a new etcd member.
Expected: playbook completes with no failed tasks.
- [ ] **Step 4.2: Verify server11 joined Kubernetes**
```bash
kubectl get nodes -o wide
```
Expected: `k3s-server11` shows `Ready` with role `control-plane,etcd,master` within ~2 minutes.
- [ ] **Step 4.3: Verify server11 is back in the etcd member list**
```bash
ssh k3s-server11 'sudo ETCDCTL_API=3 etcdctl \
--endpoints=https://192.168.20.43:2379,https://192.168.20.48:2379,https://192.168.20.56:2379 \
--cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
--cert=/var/lib/rancher/k3s/server/tls/etcd/client.crt \
--key=/var/lib/rancher/k3s/server/tls/etcd/client.key \
endpoint health -w table'
```
Expected: all 3 endpoints healthy, server11 responding in <100ms (not 400ms like before).
- [ ] **Step 4.4: Verify etcd has 3 members**
```bash
ssh k3s-server11 'sudo ETCDCTL_API=3 etcdctl \
--endpoints=https://127.0.0.1:2379 \
--cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
--cert=/var/lib/rancher/k3s/server/tls/etcd/client.crt \
--key=/var/lib/rancher/k3s/server/tls/etcd/client.key \
member list -w table'
```
Expected: 3 members, all `started`.
- [ ] **Step 4.5: Uncordon the node**
The drain in Task 2 cordoned the node. Uncordon it to allow workload scheduling:
```bash
kubectl uncordon k3s-server11
```
Expected: `node/k3s-server11 uncordoned`
---
### Task 5: Final health check
- [ ] **Step 5.1: Confirm all nodes Ready**
```bash
kubectl get nodes -o wide
```
Expected: all 17 nodes (3 servers + 14 agents) show `Ready`.
- [ ] **Step 5.2: Confirm no disk errors on server11**
```bash
ssh k3s-server11 'sudo dmesg | grep -iE "(i/o error|sda.*error|error.*sda)" | wc -l'
```
Expected: `0`
- [ ] **Step 5.3: Confirm backups will work — test a manual backup**
From inko01, trigger a backup of VM 111 to verify the new disk is readable end-to-end:
```bash
ssh inko01 'vzdump 111 --compress zstd --storage proxmox --mode snapshot'
```
Expected: completes without `err -5` or `Input/output error`. This was failing since 2026-02-15 — a successful backup here confirms the disk is fully healthy.
- [ ] **Step 5.4: Update the runbook**
In `docs/runbooks/k3s-cluster-outage-2026-04-20.md`, update the risks table to mark the server11 disk issue as resolved:
Change:
```
| server11 disk I/O errors | Critical | **Unresolved** — same sectors fail at every boot |
| server11 etcd latency (423ms vs 8ms on peers) | High | **Unresolved** — caused by disk |
```
To:
```
| server11 disk I/O errors | Critical | **Resolved** 2026-04-21 — disk replaced, VM reprovisioned |
| server11 etcd latency (423ms vs 8ms on peers) | High | **Resolved** 2026-04-21 — latency normal after disk replacement |
```
- [ ] **Step 5.5: Commit**
```bash
git add docs/runbooks/k3s-cluster-outage-2026-04-20.md
git commit -m "docs: mark server11 disk issue resolved in runbook"
```

View File

@@ -0,0 +1,40 @@
# Design Specification: Docker Service Redeployment (Jellyfin & Gitea Updates)
## 1. Goal
Redeploy Docker services on the `docker-host11` host to apply image version updates:
- **Jellyfin:** `10.10``10.11`
- **Gitea:** `1.23-rootless``1.24-rootless`
## 2. Context
The `vars/group_vars/docker/docker.yaml` file has been modified with new image versions. These changes need to be applied via the existing Ansible infrastructure.
## 3. Implementation Approach: Full Playbook Execution
This approach ensures the entire state of the Docker host matches the defined configuration.
### 3.1 Targeted Components
- **Inventory:** `vars/docker.ini`
- **Playbook:** `playbooks/docker.yaml`
- **Target Host:** `docker-host11`
### 3.2 Workflow Details
1. **Host Verification:** Confirm accessibility of `docker-host11` via Ansible.
2. **Playbook Execution:** Run `ansible-playbook -i vars/docker.ini playbooks/docker.yaml`.
3. **Template Application:** The `docker_host` role will update `/opt/docker/compose/compose.yaml` using the `compose.yaml.j2` template.
4. **Trigger Handlers:** The `template` task triggers:
- `Restart docker`
- `Restart compose`
5. **Container Recreation:** Docker Compose will detect the image change, pull the new images, and recreate the containers.
## 4. Success Criteria & Verification
- **Criteria 1:** Playbook completes without failure.
- **Criteria 2:** Jellyfin container is running image `jellyfin/jellyfin:10.11`.
- **Criteria 3:** Gitea container is running image `gitea/gitea:1.24-rootless`.
### Verification Steps
- Run `ansible -i vars/docker.ini docker_host -m shell -a "docker ps --format '{{.Names}}: {{.Image}}'"` to verify running versions.
- Check service availability via HTTP (if accessible).
## 5. Potential Risks
- **Service Downtime:** Containers will restart during image update.
- **Pull Failures:** Depends on external network connectivity to Docker Hub / registries.
- **Breaking Changes:** Version upgrades may have internal migration steps (standard for Jellyfin/Gitea).

View File

@@ -0,0 +1,38 @@
# Design Specification: Docker Service Version Updates (Jellyfin 10.11.7 & Gitea 1.25.5)
## 1. Goal
Redeploy Docker services on the `docker-host11` host to apply specific and latest image version updates:
- **Jellyfin:** `10.11``10.11.7`
- **Gitea:** `1.24-rootless``1.25.5-rootless`
## 2. Context
Following the initial redeployment, the user requested further updates to specific versions. These changes will be applied to `vars/group_vars/docker/docker.yaml` and deployed via the `docker.yaml` playbook.
## 3. Implementation Approach: Full Playbook Execution
This approach ensures the entire state of the Docker host matches the defined configuration, including the new versions.
### 3.1 Targeted Components
- **Inventory:** `vars/docker.ini`
- **Playbook:** `playbooks/docker.yaml`
- **Target Host:** `docker-host11`
### 3.2 Workflow Details
1. **Configuration Update:** Update `vars/group_vars/docker/docker.yaml` with the target image versions.
2. **Host Verification:** Confirm accessibility of `docker-host11` via Ansible.
3. **Playbook Execution:** Run `ansible-playbook -i vars/docker.ini playbooks/docker.yaml`.
4. **Template Application:** The `docker_host` role will update `/opt/docker/compose/compose.yaml`.
5. **Container Recreation:** Docker Compose will detect the image change, pull the new images (`10.11.7` and `1.25.5-rootless`), and recreate the containers.
## 4. Success Criteria & Verification
- **Criteria 1:** Playbook completes without failure.
- **Criteria 2:** Jellyfin container is running image `jellyfin/jellyfin:10.11.7`.
- **Criteria 3:** Gitea container is running image `gitea/gitea:1.25.5-rootless`.
### Verification Steps
- Run `ansible -i vars/docker.ini docker_host -m shell -a "docker ps --format '{{.Names}}: {{.Image}}'"` to verify running versions.
- Confirm container health status.
## 5. Potential Risks
- **Service Downtime:** Containers will restart during image update.
- **Database Migrations:** Gitea 1.25 may have database migrations from 1.24. This is handled internally by the Gitea container on startup.
- **Pull Failures:** Depends on external network connectivity.

View File

@@ -0,0 +1,146 @@
# Design: Reprovision k3s-server11
**Date**: 2026-04-21
**Status**: Approved
## Background
k3s-server11 (Proxmox VM 111 on inko01) has a corrupted btrfs VM disk image
(`/opt/proxmox/images/111/vm-111-disk-0.raw`). The corruption has been present since
~2026-02-15 (when backups started failing with I/O errors). The VM's guest OS sees this
as bad sectors on `/dev/sda`, causing etcd to crash with SIGBUS when it mmap-reads those
sectors. This triggered a full cluster outage on 2026-04-20.
The physical SSD on inko01 is healthy (SMART PASSED). The corruption is at the btrfs
filesystem layer (3279+ corrupt blocks, single-device — no redundancy to recover from).
Since etcd data is fully replicated on server10 and server12, no data recovery is needed.
The correct fix is to replace the disk with a fresh OS image and rejoin the node.
## Architecture
Three sequential phases. Each phase must complete successfully before the next begins.
```
Phase 1: k8s cleanup → Phase 2: Proxmox disk → Phase 3: Ansible reprovision
(drain, etcd remove, (stop VM, delete disk, (common + k3s_server roles,
delete node) import fresh image, joins as secondary server,
resize, start) etcd re-adds member)
```
## Phase 1: Remove server11 from the cluster
Run from a machine with `kubectl` access (e.g. local workstation).
**1.1 Drain the node** — evicts all non-daemonset pods:
```bash
kubectl drain k3s-server11 --ignore-daemonsets --delete-emptydir-data
```
**1.2 Remove from etcd** — prevents quorum issues while the disk is replaced:
```bash
ssh k3s-server11 'sudo ETCDCTL_API=3 etcdctl \
--endpoints=https://127.0.0.1:2379 \
--cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
--cert=/var/lib/rancher/k3s/server/tls/etcd/client.crt \
--key=/var/lib/rancher/k3s/server/tls/etcd/client.key \
member remove e9f8fa983ff7f958'
```
**1.3 Delete the node object**:
```bash
kubectl delete node k3s-server11
```
**Verify**: `kubectl get nodes` shows only server10, server12, and the agents. Etcd member
list shows only 2 members (server10 + server12). Cluster remains healthy with quorum.
## Phase 2: Replace the VM disk on inko01
Run directly on inko01 via SSH.
**2.1 Stop the VM**:
```bash
qm stop 111
```
**2.2 Delete the corrupt disk** (detaches and removes the raw file):
```bash
qm set 111 --delete scsi0
```
**2.3 Import a fresh Debian 12 cloud-init image as a new disk**:
```bash
qm importdisk 111 /opt/proxmox/template/iso/debian-12-genericcloud-amd64.qcow2 proxmox
```
This creates `/opt/proxmox/images/111/vm-111-disk-0.raw` from the clean base image.
**2.4 Attach the disk and set boot order**:
```bash
qm set 111 --scsi0 proxmox:111/vm-111-disk-0.raw --boot order=scsi0
```
**2.5 Resize to 64G** (matching original disk size):
```bash
qm resize 111 scsi0 64G
```
**2.6 Start the VM**:
```bash
qm start 111
```
Cloud-init runs on first boot and configures: hostname (`k3s-server11`), user (`tudattr`),
SSH keys, and DHCP networking. Wait ~60s for SSH to become available before Phase 3.
**Verify**: `ssh k3s-server11 hostname` returns `k3s-server11` and no disk I/O errors
appear in `dmesg`.
## Phase 3: Reprovision via Ansible
Run from local workstation in the ansible-homelab repo.
```bash
ansible-playbook playbooks/k3s-servers.yaml --limit k3s-server11
```
This runs the `common` and `k3s_server` roles against server11 only:
- `common`: installs base packages, configures SSH, hostname, etc.
- `k3s_server`: detects `/usr/local/bin/k3s` does not exist → runs install script with
`--server https://192.168.20.47:6443` (loadbalancer) → joins as a secondary server.
k3s fetches the cluster token from server10 (the primary) and registers as a new etcd
member automatically.
**Verify**:
```bash
kubectl get nodes # server11 shows Ready
ssh k3s-server11 'sudo ETCDCTL_API=3 etcdctl \
--endpoints=https://127.0.0.1:2379 \
--cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
--cert=/var/lib/rancher/k3s/server/tls/etcd/client.crt \
--key=/var/lib/rancher/k3s/server/tls/etcd/client.key \
member list -w table' # 3 members, all started
ssh k3s-server11 'dmesg | grep -i "i/o error"' # no output
```
## Key Facts
| Item | Value |
|------|-------|
| VM ID | 111 |
| Proxmox host | inko01 |
| VM disk path | `/opt/proxmox/images/111/vm-111-disk-0.raw` |
| Base image | `/opt/proxmox/template/iso/debian-12-genericcloud-amd64.qcow2` |
| Proxmox storage pool | `proxmox` |
| server11 IP | 192.168.20.48 |
| server11 etcd member ID | `e9f8fa983ff7f958` |
| Loadbalancer IP | 192.168.20.47 |
| k3s primary server | server10 (192.168.20.43) |
## Risk
- **During Phase 12**: cluster runs on 2 etcd members. Still has quorum but no
redundancy. Avoid other disruptive changes until server11 is back.
- **etcd member ID**: `e9f8fa983ff7f958` was confirmed on 2026-04-21. Verify it matches
before running the remove command if time has passed.

View File

@@ -1,29 +0,0 @@
#
# Essential
#
user: tudattr
timezone: Europe/Berlin
puid: "1000"
pgid: "1000"
pk_path: "/mnt/veracrypt1/genesis"
pubkey: "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIKqc9fnzfCz8fQDFzla+D8PBhvaMmFu2aF+TYkkZRxl9 tuan@genesis-2022-01-20"
public_domain: tudattr.dev
internal_domain: seyshiro.de
#
# Packages
#
common_packages:
- build-essential
- curl
- git
- iperf3
- neovim
- rsync
- smartmontools
- sudo
- systemd-timesyncd
- tree

View File

@@ -1,28 +0,0 @@
db:
default_user:
user: "postgres"
name: "k3s"
user: "k3s"
password: "{{ vault.k3s.db.password }}"
listen_address: "{{ k3s.db.ip }}"
k3s:
net: "192.168.20.0/24"
server:
ips:
- 192.168.20.21
- 192.168.20.24
- 192.168.20.30
loadbalancer:
ip: 192.168.20.22
default_port: 6443
db:
ip: 192.168.20.23
default_port: "5432"
agent:
ips:
- 192.168.20.25
- 192.168.20.26
- 192.168.20.27
k3s_db_connection_string: "postgres://{{ db.user }}:{{ db.password }}@{{ k3s.db.ip }}:{{ k3s.db.default_port }}/{{ db.name }}"

View File

@@ -1,10 +0,0 @@
---
ansible_user: "{{ user }}"
ansible_host: 192.168.20.25
ansible_port: 22
ansible_ssh_private_key_file: "{{ pk_path }}"
ansible_become_pass: "{{ vault.k3s.server01.sudo }}"
host:
hostname: "k3s-agent00"
ip: "{{ ansible_host }}"

View File

@@ -1,10 +0,0 @@
---
ansible_user: "{{ user }}"
ansible_host: 192.168.20.26
ansible_port: 22
ansible_ssh_private_key_file: "{{ pk_path }}"
ansible_become_pass: "{{ vault.k3s.server01.sudo }}"
host:
hostname: "k3s-agent01"
ip: "{{ ansible_host }}"

View File

@@ -1,10 +0,0 @@
---
ansible_user: "{{ user }}"
ansible_host: 192.168.20.27
ansible_port: 22
ansible_ssh_private_key_file: "{{ pk_path }}"
ansible_become_pass: "{{ vault.k3s.server01.sudo }}"
host:
hostname: "k3s-agent02"
ip: "{{ ansible_host }}"

View File

@@ -1,9 +0,0 @@
---
ansible_user: "{{ user }}"
ansible_host: 192.168.20.22
ansible_port: 22
ansible_ssh_private_key_file: "{{ pk_path }}"
ansible_become_pass: "{{ vault.k3s.loadbalancer.sudo }}"
host:
hostname: "k3s-loadbalancer"
ip: "{{ ansible_host }}"

View File

@@ -1,9 +0,0 @@
---
ansible_user: "{{ user }}"
ansible_host: 192.168.20.23
ansible_port: 22
ansible_ssh_private_key_file: "{{ pk_path }}"
ansible_become_pass: "{{ vault.k3s.postgres.sudo }}"
host:
hostname: "k3s-postgres"
ip: "{{ ansible_host }}"

View File

@@ -1,9 +0,0 @@
---
ansible_user: "{{ user }}"
ansible_host: 192.168.20.21
ansible_port: 22
ansible_ssh_private_key_file: "{{ pk_path }}"
ansible_become_pass: "{{ vault.k3s.server00.sudo }}"
host:
hostname: "k3s-server00"
ip: "{{ ansible_host }}"

View File

@@ -1,10 +0,0 @@
---
ansible_user: "{{ user }}"
ansible_host: 192.168.20.24
ansible_port: 22
ansible_ssh_private_key_file: "{{ pk_path }}"
ansible_become_pass: "{{ vault.k3s.server01.sudo }}"
host:
hostname: "k3s-server01"
ip: "{{ ansible_host }}"

View File

@@ -1,10 +0,0 @@
---
ansible_user: "{{ user }}"
ansible_host: 192.168.20.30
ansible_port: 22
ansible_ssh_private_key_file: "{{ pk_path }}"
ansible_become_pass: "{{ vault.k3s.server01.sudo }}"
host:
hostname: "k3s-server02"
ip: "{{ ansible_host }}"

View File

@@ -0,0 +1,74 @@
# Issue: Fix Vault Security Risk in Proxmox Role
**Status**: Open
**Priority**: High
**Component**: proxmox/15_create_secret.yaml
**Assignee**: Junior Dev
## Description
The current vault handling in `roles/proxmox/tasks/15_create_secret.yaml` uses insecure shell commands to decrypt/encrypt vault files, creating temporary plaintext files that pose a security risk.
## Current Problematic Code
```yaml
- name: Decrypt vm vault file
ansible.builtin.shell: cd ../; ansible-vault decrypt "./playbooks/{{ proxmox_vault_file }}"
no_log: true
- name: Encrypt vm vault file
ansible.builtin.shell: cd ../; ansible-vault encrypt "./playbooks/{{ proxmox_vault_file }}"
no_log: true
```
## Required Changes
### Step 1: Replace shell commands with Ansible vault module
Replace the shell-based decryption/encryption with `ansible.builtin.ansible_vault` module.
### Step 2: Remove temporary plaintext file operations
Eliminate the need for temporary plaintext files by using in-memory operations.
### Step 3: Add proper error handling
Include error handling for vault operations (missing files, decryption failures).
## Implementation Steps
1. **Read the current vault file securely**:
```yaml
- name: Load vault content securely
ansible.builtin.include_vars:
file: "{{ proxmox_vault_file }}"
name: vault_data
no_log: true
```
2. **Use ansible_vault module for operations**:
```yaml
- name: Update vault data securely
ansible.builtin.set_fact:
new_vault_data: "{{ vault_data | combine({vm_name_secret: cipassword}) }}"
when: not variable_exists
no_log: true
```
3. **Write encrypted vault directly**:
```yaml
- name: Write encrypted vault
ansible.builtin.copy:
content: "{{ new_vault_data | ansible.builtin.ansible_vault.encrypt('vault_password') }}"
dest: "{{ proxmox_vault_file }}"
mode: "0600"
when: not variable_exists
no_log: true
```
## Testing Requirements
- Test with existing vault files
- Verify no plaintext files are created during operation
- Confirm vault can be decrypted properly after updates
## Acceptance Criteria
- [ ] No shell commands used for vault operations
- [ ] No temporary plaintext files created
- [ ] All vault operations use Ansible built-in modules
- [ ] Existing functionality preserved
- [ ] Proper error handling implemented

View File

@@ -0,0 +1,57 @@
# Issue: Replace Deprecated dict2items Filter
**Status**: Open
**Priority**: Medium
**Component**: proxmox/40_prepare_vm_creation.yaml
**Assignee**: Junior Dev
## Description
The task `roles/proxmox/tasks/40_prepare_vm_creation.yaml` uses the deprecated `dict2items` filter which may be removed in future Ansible versions.
## Current Problematic Code
```yaml
- name: Download Cloud Init Isos
ansible.builtin.include_tasks: 42_download_isos.yaml
loop: "{{ proxmox_cloud_init_images | dict2items | map(attribute='value') }}"
loop_control:
loop_var: distro
```
## Required Changes
### Step 1: Replace dict2items with modern Ansible practices
Use `dict` filter or direct dictionary iteration instead of deprecated filter.
### Step 2: Update variable references
Ensure the loop variable structure matches the new iteration method.
## Implementation Steps
### Option A: Use dict filter (recommended)
```yaml
- name: Download Cloud Init Isos
ansible.builtin.include_tasks: 42_download_isos.yaml
loop: "{{ proxmox_cloud_init_images | dict | map(attribute='value') }}"
loop_control:
loop_var: distro
```
### Option B: Direct dictionary iteration
```yaml
- name: Download Cloud Init Isos
ansible.builtin.include_tasks: 42_download_isos.yaml
loop: "{{ proxmox_cloud_init_images.values() | list }}"
loop_control:
loop_var: distro
```
## Testing Requirements
- Verify all cloud init images are still downloaded correctly
- Test with different dictionary structures
- Confirm no regression in functionality
## Acceptance Criteria
- [ ] Deprecated `dict2items` filter removed
- [ ] All cloud init images download successfully
- [ ] No changes to existing functionality
- [ ] Code works with current and future Ansible versions

View File

@@ -0,0 +1,105 @@
# Issue: Add Granular Tags for Better Control
**Status**: Open
**Priority**: Medium
**Component**: proxmox/tasks/main.yaml
**Assignee**: Junior Dev
## Description
The Proxmox role lacks granular tags, making it difficult to run specific parts of the role independently. Currently only has high-level `proxmox` tag.
## Current Limitation
```yaml
# Current tag structure
roles:
- role: proxmox
tags:
- proxmox
```
## Required Changes
### Step 1: Add tags to main task includes
Add specific tags to each major task group in `roles/proxmox/tasks/main.yaml`.
### Step 2: Update playbook to use new tags
Ensure playbooks can leverage the new tag structure.
## Implementation Steps
### Update roles/proxmox/tasks/main.yaml
```yaml
- name: Prepare Machines
ansible.builtin.include_tasks: 00_setup_machines.yaml
tags:
- proxmox:setup
- proxmox
- name: Create VM vault
ansible.builtin.include_tasks: 10_create_secrets.yaml
when: is_localhost
tags:
- proxmox:vault
- proxmox
- name: Prime node for VM
ansible.builtin.include_tasks: 40_prepare_vm_creation.yaml
when: is_proxmox_node
tags:
- proxmox:prepare
- proxmox
- name: Create VMs
ansible.builtin.include_tasks: 50_create_vms.yaml
when: is_localhost
tags:
- proxmox:vms
- proxmox
- name: Create LXC containers
ansible.builtin.include_tasks: 60_create_containers.yaml
when: is_localhost
tags:
- proxmox:containers
- proxmox
```
### Update individual task files
Add appropriate tags to tasks within each included file:
```yaml
# Example for 04_configure_hosts.yaml
- name: Configure /etc/hosts with Proxmox cluster nodes
ansible.builtin.blockinfile:
# ... existing content ...
tags:
- proxmox:setup
- proxmox:network
```
## Usage Examples
After implementation, users can run specific parts:
```bash
# Run only setup tasks
ansible-playbook playbooks/proxmox.yaml --tags "proxmox:setup"
# Run only VM creation
ansible-playbook playbooks/proxmox.yaml --tags "proxmox:vms"
# Run setup and preparation
ansible-playbook playbooks/proxmox.yaml --tags "proxmox:setup,proxmox:prepare"
```
## Testing Requirements
- Verify each tag group runs the correct subset of tasks
- Test tag combinations work properly
- Ensure backward compatibility with existing `proxmox` tag
## Acceptance Criteria
- [ ] Granular tags added to all major task groups
- [ ] Each functional area has its own tag
- [ ] Original `proxmox` tag still works for backward compatibility
- [ ] Documentation updated with tag usage examples
- [ ] All tags tested and working

View File

@@ -0,0 +1,125 @@
# Issue: Add Comprehensive Error Handling
**Status**: Open
**Priority**: High
**Component**: proxmox/tasks
**Assignee**: Junior Dev
## Description
The Proxmox role lacks comprehensive error handling, particularly for critical operations like API calls, vault operations, and file manipulations.
## Current Issues
- No error handling for Proxmox API failures
- No validation of VM/LXC configurations before creation
- No retries for network operations
- No cleanup on failure
## Required Changes
### Step 1: Add validation tasks
Validate configurations before attempting creation.
### Step 2: Add error handling blocks
Use `block/rescue/always` for critical operations.
### Step 3: Add retries for network operations
Use `retries` and `delay` for API calls.
## Implementation Steps
### Example 1: VM Creation with Error Handling
```yaml
- name: Create VM with error handling
block:
- name: Validate VM configuration
ansible.builtin.assert:
that:
- vm.vmid is defined
- vm.vmid | int > 0
- vm.node is defined
- vm.cores is defined and vm.cores | int > 0
- vm.memory is defined and vm.memory | int > 0
msg: "Invalid VM configuration for {{ vm.name }}"
- name: Create VM
community.proxmox.proxmox_kvm:
# ... existing parameters ...
register: vm_creation_result
retries: 3
delay: 10
until: vm_creation_result is not failed
rescue:
- name: Handle VM creation failure
ansible.builtin.debug:
msg: "Failed to create VM {{ vm.name }}: {{ ansible_failed_result.msg }}"
- name: Cleanup partial resources
# Add cleanup tasks here
when: cleanup_partial_resources | default(true)
always:
- name: Log VM creation attempt
ansible.builtin.debug:
msg: "VM creation attempt for {{ vm.name }} completed with status: {{ vm_creation_result is defined and vm_creation_result.changed | ternary('success', 'failed') }}"
```
### Example 2: API Call with Retries
```yaml
- name: Check Proxmox API availability
ansible.builtin.uri:
url: "https://{{ proxmox_api_host }}:8006/api2/json/version"
validate_certs: no
return_content: yes
register: api_check
retries: 5
delay: 5
until: api_check.status == 200
ignore_errors: yes
- name: Fail if API unavailable
ansible.builtin.fail:
msg: "Proxmox API unavailable at {{ proxmox_api_host }}"
when: api_check is failed
```
### Example 3: File Operation Error Handling
```yaml
- name: Manage vault file safely
block:
- name: Backup existing vault
ansible.builtin.copy:
src: "{{ proxmox_vault_file }}"
dest: "{{ proxmox_vault_file }}.backup"
remote_src: yes
when: vault_file_exists.stat.exists
- name: Perform vault operations
# ... vault operations ...
rescue:
- name: Restore vault from backup
ansible.builtin.copy:
src: "{{ proxmox_vault_file }}.backup"
dest: "{{ proxmox_vault_file }}"
remote_src: yes
when: vault_file_exists.stat.exists
- name: Fail with error details
ansible.builtin.fail:
msg: "Vault operation failed: {{ ansible_failed_result.msg }}"
```
## Testing Requirements
- Test error scenarios (invalid configs, API unavailable)
- Verify cleanup works on failure
- Confirm retries work for transient failures
- Validate error messages are helpful
## Acceptance Criteria
- [ ] All critical operations have error handling
- [ ] Validation added for configurations
- [ ] Retry logic implemented for network operations
- [ ] Cleanup procedures in place for failures
- [ ] Helpful error messages provided
- [ ] No silent failures

View File

@@ -0,0 +1,119 @@
# Issue: Add Performance Optimizations
**Status**: Open
**Priority**: Medium
**Component**: proxmox/tasks
**Assignee**: Junior Dev
## Description
The Proxmox role could benefit from performance optimizations, particularly for image downloads and repeated operations.
## Current Performance Issues
- Sequential image downloads (no parallelization)
- No caching of repeated operations
- No async operations for long-running tasks
- Inefficient fact gathering
## Required Changes
### Step 1: Add parallel downloads
Use async for image downloads to run concurrently.
### Step 2: Implement caching
Add fact caching for repeated operations.
### Step 3: Add conditional execution
Skip tasks when results are already present.
## Implementation Steps
### Example 1: Parallel Image Downloads
```yaml
- name: Download Cloud Init Isos in parallel
ansible.builtin.include_tasks: 42_download_isos.yaml
loop: "{{ proxmox_cloud_init_images | dict | map(attribute='value') }}"
loop_control:
loop_var: distro
async: 3600 # 1 hour timeout
poll: 0
register: download_tasks
- name: Check download status
ansible.builtin.async_status:
jid: "{{ item.ansible_job_id }}"
register: download_results
until: download_results.finished
retries: 30
delay: 10
loop: "{{ download_tasks.results }}"
loop_control:
loop_var: item
```
### Example 2: Add Fact Caching
```yaml
# In ansible.cfg or playbook
[defaults]
fact_caching = jsonfile
fact_caching_connection = /tmp/ansible_facts
fact_caching_timeout = 86400
# In tasks
- name: Gather facts with caching
ansible.builtin.setup:
cacheable: yes
```
### Example 3: Conditional Task Execution
```yaml
- name: Check if image already exists
ansible.builtin.stat:
path: "{{ proxmox_dirs.isos }}/{{ distro.name }}"
register: image_stat
changed_when: false
- name: Download image only if missing
ansible.builtin.get_url:
url: "{{ distro.url }}"
dest: "{{ proxmox_dirs.isos }}/{{ distro.name }}"
mode: "0644"
when: not image_stat.stat.exists
register: download_result
- name: Skip conversion if raw image exists
ansible.builtin.stat:
path: "{{ proxmox_dirs.isos }}/{{ raw_image_name }}"
register: raw_image_stat
changed_when: false
- name: Convert to raw only if needed
ansible.builtin.command:
cmd: "qemu-img convert -O raw {{ proxmox_dirs.isos }}/{{ distro.name }} {{ proxmox_dirs.isos }}/{{ raw_image_name }}"
when:
- download_result is changed or not raw_image_stat.stat.exists
- image_stat.stat.exists
```
### Example 4: Batch VM Operations
```yaml
- name: Create VMs in batches
ansible.builtin.include_tasks: 55_create_vm.yaml
loop: "{{ vms | batch(3) | flatten }}"
loop_control:
loop_var: "vm"
throttle: 3
```
## Testing Requirements
- Measure performance before and after changes
- Verify parallel operations don't cause conflicts
- Test caching works correctly
- Confirm conditional execution skips appropriately
## Acceptance Criteria
- [ ] Image downloads run in parallel
- [ ] Fact caching implemented and working
- [ ] Tasks skip when results already exist
- [ ] Performance metrics show improvement
- [ ] No race conditions in parallel operations
- [ ] Documentation updated with performance notes

View File

@@ -1,31 +0,0 @@
- name: Set up Agents
hosts: k3s_nodes
gather_facts: yes
vars_files:
- secrets.yml
pre_tasks:
- name: Get K3s token from the first server
when: host.ip == k3s.server.ips[0] and inventory_hostname in groups["k3s_server"]
slurp:
src: /var/lib/rancher/k3s/server/node-token
register: k3s_token
become: true
- name: Set fact on k3s.server.ips[0]
when: host.ip == k3s.server.ips[0] and inventory_hostname in groups["k3s_server"]
set_fact: k3s_token="{{ k3s_token['content'] | b64decode | trim }}"
roles:
- role: common
when: inventory_hostname in groups["k3s_agent"]
tags:
- common
- role: k3s_agent
when: inventory_hostname in groups["k3s_agent"]
k3s_token: "{{ hostvars[(hostvars | dict2items | map(attribute='value') | map('dict2items') | map('selectattr', 'key', 'match', 'host') | map('selectattr', 'value.ip', 'match', k3s.server.ips[0] ) | select() | first | items2dict).host.hostname].k3s_token }}"
tags:
- k3s_agent
- role: node_exporter
when: inventory_hostname in groups["k3s_agent"]
tags:
- node_exporter

View File

@@ -1,16 +0,0 @@
---
- name: Set up Servers
hosts: k3s_server
gather_facts: yes
vars_files:
- secrets.yml
roles:
- role: common
tags:
- common
- role: k3s_server
tags:
- k3s_server
- role: node_exporter
tags:
- node_exporter

View File

@@ -1,16 +0,0 @@
---
- name: Set up Servers
hosts: loadbalancer
gather_facts: yes
vars_files:
- secrets.yml
roles:
- role: common
tags:
- common
- role: loadbalancer
tags:
- loadbalancer
- role: node_exporter
tags:
- node_exporter

View File

@@ -0,0 +1,11 @@
---
- name: Set up Servers
hosts: docker_host
gather_facts: true
roles:
# - role: common
# tags:
# - common
- role: docker_host
tags:
- docker_host

13
playbooks/docker-lb.yaml Normal file
View File

@@ -0,0 +1,13 @@
---
- name: Set up reverse proxy for docker
hosts: docker
gather_facts: true
roles:
- role: common
tags:
- common
when: inventory_hostname in groups["docker_lb"]
- role: reverse_proxy
tags:
- reverse_proxy
when: inventory_hostname in groups["docker_lb"]

5
playbooks/docker.yaml Normal file
View File

@@ -0,0 +1,5 @@
---
- name: Setup Docker Hosts
ansible.builtin.import_playbook: docker-host.yaml
- name: Setup Docker load balancer
ansible.builtin.import_playbook: docker-lb.yaml

16
playbooks/k3s-agents.yaml Normal file
View File

@@ -0,0 +1,16 @@
- name: Set up Agents
hosts: k3s
gather_facts: true
roles:
- role: common
when: inventory_hostname in groups["k3s_agent"]
tags:
- common
- role: k3s_agent
when: inventory_hostname in groups["k3s_agent"]
tags:
- k3s_agent
# - role: node_exporter
# when: inventory_hostname in groups["k3s_agent"]
# tags:
# - node_exporter

View File

@@ -0,0 +1,17 @@
---
- name: Set up Servers
hosts: k3s
gather_facts: true
roles:
- role: common
tags:
- common
when: inventory_hostname in groups["k3s_loadbalancer"]
- role: k3s_loadbalancer
tags:
- k3s_loadbalancer
when: inventory_hostname in groups["k3s_loadbalancer"]
# - role: node_exporter
# tags:
# - node_exporter
# when: inventory_hostname in groups["k3s_loadbalancer"]

View File

@@ -0,0 +1,17 @@
---
- name: Set up Servers
hosts: k3s
gather_facts: true
roles:
- role: common
tags:
- common
when: inventory_hostname in groups["k3s_server"]
- role: k3s_server
tags:
- k3s_server
when: inventory_hostname in groups["k3s_server"]
# - role: node_exporter
# tags:
# - node_exporter
# when: inventory_hostname in groups["k3s_server"]

View File

@@ -0,0 +1,16 @@
- name: Set up storage
hosts: k3s_nodes
gather_facts: true
roles:
- role: common
when: inventory_hostname in groups["k3s_storage"]
tags:
- common
- role: k3s_storage
when: inventory_hostname in groups["k3s_storage"]
tags:
- k3s_storage
# - role: node_exporter
# when: inventory_hostname in groups["k3s_storage"]
# tags:
# - node_exporter

18
playbooks/kube-vip.yaml Normal file
View File

@@ -0,0 +1,18 @@
---
# Deploys kube-vip on all k3s server nodes and adds the VIP to their TLS SANs.
#
# Migration steps (run once):
# 1. ansible-playbook playbooks/kube-vip.yaml
# 2. Update DNS: k3s.seyshiro.de → 192.168.20.2
# 3. Verify: kubectl get nodes (should work via VIP)
# 4. Decommission k3s-loadbalancer VM when satisfied
#
# The playbook is idempotent — re-running it after migration is safe.
- name: Deploy kube-vip on k3s server nodes
hosts: k3s_server
gather_facts: true
serial: 1
roles:
- role: kube_vip
tags:
- kube_vip

View File

@@ -0,0 +1,10 @@
---
- name: Setup Kubernetes Cluster
hosts: kubernetes
any_errors_fatal: true
gather_facts: false
vars:
is_localhost: "{{ inventory_hostname == '127.0.0.1' }}"
roles:
- role: kubernetes_argocd
when: is_localhost

View File

@@ -0,0 +1,6 @@
---
- name: Create new VM(s)
ansible.builtin.import_playbook: proxmox.yaml
- name: Provision VM
ansible.builtin.import_playbook: k3s-agents.yaml

15
playbooks/proxmox.yaml Normal file
View File

@@ -0,0 +1,15 @@
---
- name: Run proxmox vm playbook
hosts: proxmox
gather_facts: true
vars:
is_localhost: "{{ inventory_hostname == '127.0.0.1' }}"
is_proxmox_node: "{{ 'proxmox_nodes' in group_names }}"
roles:
- role: common
tags:
- common
when: not is_localhost
- role: proxmox
tags:
- proxmox

View File

@@ -1,49 +0,0 @@
[vps]
mii
[k3s]
k3s-postgres
k3s-loadbalancer
k3s-server00
k3s-server01
k3s-server02
k3s-agent00
k3s-agent01
k3s-agent02
[k3s_server]
k3s-server00
k3s-server01
k3s-server02
[k3s_agent]
k3s-agent00
k3s-agent01
k3s-agent02
[vm]
k3s-agent00
k3s-agent01
k3s-agent02
k3s-server00
k3s-server01
k3s-server02
k3s-postgres
k3s-loadbalancer
[k3s_nodes]
k3s-server00
k3s-server01
k3s-server02
k3s-agent00
k3s-agent01
k3s-agent02
[db]
k3s-postgres
[loadbalancer]
k3s-loadbalancer
[vm:vars]
ansible_ssh_common_args='-o ProxyCommand="ssh -p 22 -W %h:%p -q aya01"'

28
requirements.txt Normal file
View File

@@ -0,0 +1,28 @@
cachetools==5.5.2
certifi==2025.1.31
cfgv==3.4.0
charset-normalizer==3.4.1
distlib==0.4.0
durationpy==0.10
filelock==3.18.0
google-auth==2.40.3
identify==2.6.12
idna==3.10
kubernetes==33.1.0
nc-dnsapi==0.1.3
nodeenv==1.9.1
oauthlib==3.3.1
platformdirs==4.3.8
pre_commit==4.2.0
proxmoxer==2.2.0
pyasn1==0.6.1
pyasn1_modules==0.4.2
python-dateutil==2.9.0.post0
PyYAML==6.0.2
requests==2.32.3
requests-oauthlib==2.0.0
rsa==4.9.1
six==1.17.0
urllib3==2.3.0
virtualenv==20.32.0
websocket-client==1.8.0

5
requirements.yaml Normal file
View File

@@ -0,0 +1,5 @@
---
collections:
- name: community.docker
- name: community.general
- name: kubernetes.core

72
roles/common/README.md Normal file
View File

@@ -0,0 +1,72 @@
# Ansible Role: common
This role configures a baseline set of common configurations for Debian-based systems, including time synchronization, essential packages, hostname, and specific developer tools.
## Requirements
None.
## Role Variables
Available variables are listed below, along with default values (see `vars/main.yml`):
```yaml
# A list of common packages to install via apt.
common_packages:
- build-essential
- curl
- git
- iperf3
- neovim
- rsync
- smartmontools
- sudo
- systemd-timesyncd
- tree
- screen
- bat
- fd-find
- ripgrep
- nfs-common
- open-iscsi
- parted
# The hostname to configure.
hostname: "new-host"
```
## Tasks
The role performs the following tasks:
1. **Configure Time**: Sets up `systemd-timesyncd` and timezone.
2. **Configure Packages**: Installs the list of `common_packages`.
3. **Configure Hostname**: Sets the system hostname.
4. **Configure Extra-Packages**:
- Installs `eza` (modern ls replacement).
- Installs `bottom` (process viewer).
- Installs `neovim` from AppImage and clones a custom configuration.
5. **Configure Bash**: Sets up bash aliases and prompt.
6. **Configure SSH**: Configures `sshd_config` for security.
## Dependencies
None.
## Example Playbook
```yaml
- hosts: servers
roles:
- role: common
vars:
hostname: "my-server"
```
## License
MIT
## Author Information
This role was created in 2025 by [TuDatTr](https://codeberg.org/tudattr/).

View File

@@ -0,0 +1,4 @@
alias cat=batcat
alias vim=nvim
alias fd=fdfind
alias ls=eza

View File

@@ -1,7 +1,7 @@
export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
case $- in
*i*) ;;
*) return;;
*i*) ;;
*) return ;;
esac
HISTCONTROL=ignoreboth
shopt -s histappend
@@ -9,39 +9,38 @@ HISTSIZE=1000
HISTFILESIZE=2000
shopt -s checkwinsize
if [ -z "${debian_chroot:-}" ] && [ -r /etc/debian_chroot ]; then
debian_chroot=$(cat /etc/debian_chroot)
debian_chroot=$(cat /etc/debian_chroot)
fi
case "$TERM" in
xterm-color|*-256color) color_prompt=yes;;
xterm-color | *-256color) color_prompt=yes ;;
esac
if [ -n "$force_color_prompt" ]; then
if [ -x /usr/bin/tput ] && tput setaf 1 >&/dev/null; then
color_prompt=yes
else
color_prompt=
fi
if [ -x /usr/bin/tput ] && tput setaf 1 >&/dev/null; then
color_prompt=yes
else
color_prompt=
fi
fi
if [ "$color_prompt" = yes ]; then
PS1='${debian_chroot:+($debian_chroot)}\[\033[01;32m\]\u@\h\[\033[00m\]:\[\033[01;34m\]\w\[\033[00m\]\$ '
PS1='${debian_chroot:+($debian_chroot)}\[\033[01;32m\]\u@\h\[\033[00m\]:\[\033[01;34m\]\w\[\033[00m\]\$ '
else
PS1='${debian_chroot:+($debian_chroot)}\u@\h:\w\$ '
PS1='${debian_chroot:+($debian_chroot)}\u@\h:\w\$ '
fi
unset color_prompt force_color_prompt
case "$TERM" in
xterm*|rxvt*)
PS1="\[\e]0;${debian_chroot:+($debian_chroot)}\u@\h: \w\a\]$PS1"
;;
*)
;;
xterm* | rxvt*)
PS1="\[\e]0;${debian_chroot:+($debian_chroot)}\u@\h: \w\a\]$PS1"
;;
*) ;;
esac
if [ -x /usr/bin/dircolors ]; then
test -r ~/.dircolors && eval "$(dircolors -b ~/.dircolors)" || eval "$(dircolors -b)"
alias ls='ls --color=auto'
test -r ~/.dircolors && eval "$(dircolors -b ~/.dircolors)" || eval "$(dircolors -b)"
alias ls='ls --color=auto'
fi
if [ -f ~/.bash_aliases ]; then
. ~/.bash_aliases
. ~/.bash_aliases
fi
if ! shopt -oq posix; then
@@ -51,3 +50,7 @@ if ! shopt -oq posix; then
. /etc/bash_completion
fi
fi
if [ -f /etc/profile ]; then
. /etc/profile
fi

View File

@@ -0,0 +1,80 @@
xterm-ghostty|ghostty|Ghostty,
am, bce, ccc, hs, km, mc5i, mir, msgr, npc, xenl, AX, Su, Tc, XT, fullkbd,
colors#0x100, cols#80, it#8, lines#24, pairs#0x7fff,
acsc=++\,\,--..00``aaffgghhiijjkkllmmnnooppqqrrssttuuvvwwxxyyzz{{||}}~~,
bel=^G, blink=\E[5m, bold=\E[1m, cbt=\E[Z, civis=\E[?25l,
clear=\E[H\E[2J, cnorm=\E[?12l\E[?25h, cr=\r,
csr=\E[%i%p1%d;%p2%dr, cub=\E[%p1%dD, cub1=^H,
cud=\E[%p1%dB, cud1=\n, cuf=\E[%p1%dC, cuf1=\E[C,
cup=\E[%i%p1%d;%p2%dH, cuu=\E[%p1%dA, cuu1=\E[A,
cvvis=\E[?12;25h, dch=\E[%p1%dP, dch1=\E[P, dim=\E[2m,
dl=\E[%p1%dM, dl1=\E[M, dsl=\E]2;\007, ech=\E[%p1%dX,
ed=\E[J, el=\E[K, el1=\E[1K, flash=\E[?5h$<100/>\E[?5l,
fsl=^G, home=\E[H, hpa=\E[%i%p1%dG, ht=^I, hts=\EH,
ich=\E[%p1%d@, ich1=\E[@, il=\E[%p1%dL, il1=\E[L, ind=\n,
indn=\E[%p1%dS,
initc=\E]4;%p1%d;rgb:%p2%{255}%*%{1000}%/%2.2X/%p3%{255}%*%{1000}%/%2.2X/%p4%{255}%*%{1000}%/%2.2X\E\\,
invis=\E[8m, kDC=\E[3;2~, kEND=\E[1;2F, kHOM=\E[1;2H,
kIC=\E[2;2~, kLFT=\E[1;2D, kNXT=\E[6;2~, kPRV=\E[5;2~,
kRIT=\E[1;2C, kbs=^?, kcbt=\E[Z, kcub1=\EOD, kcud1=\EOB,
kcuf1=\EOC, kcuu1=\EOA, kdch1=\E[3~, kend=\EOF, kent=\EOM,
kf1=\EOP, kf10=\E[21~, kf11=\E[23~, kf12=\E[24~,
kf13=\E[1;2P, kf14=\E[1;2Q, kf15=\E[1;2R, kf16=\E[1;2S,
kf17=\E[15;2~, kf18=\E[17;2~, kf19=\E[18;2~, kf2=\EOQ,
kf20=\E[19;2~, kf21=\E[20;2~, kf22=\E[21;2~,
kf23=\E[23;2~, kf24=\E[24;2~, kf25=\E[1;5P, kf26=\E[1;5Q,
kf27=\E[1;5R, kf28=\E[1;5S, kf29=\E[15;5~, kf3=\EOR,
kf30=\E[17;5~, kf31=\E[18;5~, kf32=\E[19;5~,
kf33=\E[20;5~, kf34=\E[21;5~, kf35=\E[23;5~,
kf36=\E[24;5~, kf37=\E[1;6P, kf38=\E[1;6Q, kf39=\E[1;6R,
kf4=\EOS, kf40=\E[1;6S, kf41=\E[15;6~, kf42=\E[17;6~,
kf43=\E[18;6~, kf44=\E[19;6~, kf45=\E[20;6~,
kf46=\E[21;6~, kf47=\E[23;6~, kf48=\E[24;6~,
kf49=\E[1;3P, kf5=\E[15~, kf50=\E[1;3Q, kf51=\E[1;3R,
kf52=\E[1;3S, kf53=\E[15;3~, kf54=\E[17;3~,
kf55=\E[18;3~, kf56=\E[19;3~, kf57=\E[20;3~,
kf58=\E[21;3~, kf59=\E[23;3~, kf6=\E[17~, kf60=\E[24;3~,
kf61=\E[1;4P, kf62=\E[1;4Q, kf63=\E[1;4R, kf7=\E[18~,
kf8=\E[19~, kf9=\E[20~, khome=\EOH, kich1=\E[2~,
kind=\E[1;2B, kmous=\E[<, knp=\E[6~, kpp=\E[5~,
kri=\E[1;2A, oc=\E]104\007, op=\E[39;49m, rc=\E8,
rep=%p1%c\E[%p2%{1}%-%db, rev=\E[7m, ri=\EM,
rin=\E[%p1%dT, ritm=\E[23m, rmacs=\E(B, rmam=\E[?7l,
rmcup=\E[?1049l, rmir=\E[4l, rmkx=\E[?1l\E>, rmso=\E[27m,
rmul=\E[24m, rs1=\E]\E\\\Ec, sc=\E7,
setab=\E[%?%p1%{8}%<%t4%p1%d%e%p1%{16}%<%t10%p1%{8}%-%d%e48;5;%p1%d%;m,
setaf=\E[%?%p1%{8}%<%t3%p1%d%e%p1%{16}%<%t9%p1%{8}%-%d%e38;5;%p1%d%;m,
sgr=%?%p9%t\E(0%e\E(B%;\E[0%?%p6%t;1%;%?%p2%t;4%;%?%p1%p3%|%t;7%;%?%p4%t;5%;%?%p7%t;8%;m,
sgr0=\E(B\E[m, sitm=\E[3m, smacs=\E(0, smam=\E[?7h,
smcup=\E[?1049h, smir=\E[4h, smkx=\E[?1h\E=, smso=\E[7m,
smul=\E[4m, tbc=\E[3g, tsl=\E]2;, u6=\E[%i%d;%dR, u7=\E[6n,
u8=\E[?%[;0123456789]c, u9=\E[c, vpa=\E[%i%p1%dd,
BD=\E[?2004l, BE=\E[?2004h, Clmg=\E[s,
Cmg=\E[%i%p1%d;%p2%ds, Dsmg=\E[?69l, E3=\E[3J,
Enmg=\E[?69h, Ms=\E]52;%p1%s;%p2%s\007, PE=\E[201~,
PS=\E[200~, RV=\E[>c, Se=\E[2 q,
Setulc=\E[58:2::%p1%{65536}%/%d:%p1%{256}%/%{255}%&%d:%p1%{255}%&%d%;m,
Smulx=\E[4:%p1%dm, Ss=\E[%p1%d q,
Sync=\E[?2026%?%p1%{1}%-%tl%eh%;,
XM=\E[?1006;1000%?%p1%{1}%=%th%el%;, XR=\E[>0q,
fd=\E[?1004l, fe=\E[?1004h, kDC3=\E[3;3~, kDC4=\E[3;4~,
kDC5=\E[3;5~, kDC6=\E[3;6~, kDC7=\E[3;7~, kDN=\E[1;2B,
kDN3=\E[1;3B, kDN4=\E[1;4B, kDN5=\E[1;5B, kDN6=\E[1;6B,
kDN7=\E[1;7B, kEND3=\E[1;3F, kEND4=\E[1;4F,
kEND5=\E[1;5F, kEND6=\E[1;6F, kEND7=\E[1;7F,
kHOM3=\E[1;3H, kHOM4=\E[1;4H, kHOM5=\E[1;5H,
kHOM6=\E[1;6H, kHOM7=\E[1;7H, kIC3=\E[2;3~, kIC4=\E[2;4~,
kIC5=\E[2;5~, kIC6=\E[2;6~, kIC7=\E[2;7~, kLFT3=\E[1;3D,
kLFT4=\E[1;4D, kLFT5=\E[1;5D, kLFT6=\E[1;6D,
kLFT7=\E[1;7D, kNXT3=\E[6;3~, kNXT4=\E[6;4~,
kNXT5=\E[6;5~, kNXT6=\E[6;6~, kNXT7=\E[6;7~,
kPRV3=\E[5;3~, kPRV4=\E[5;4~, kPRV5=\E[5;5~,
kPRV6=\E[5;6~, kPRV7=\E[5;7~, kRIT3=\E[1;3C,
kRIT4=\E[1;4C, kRIT5=\E[1;5C, kRIT6=\E[1;6C,
kRIT7=\E[1;7C, kUP=\E[1;2A, kUP3=\E[1;3A, kUP4=\E[1;4A,
kUP5=\E[1;5A, kUP6=\E[1;6A, kUP7=\E[1;7A, kxIN=\E[I,
kxOUT=\E[O, rmxx=\E[29m, rv=\E\\[[0-9]+;[0-9]+;[0-9]+c,
setrgbb=\E[48:2:%p1%d:%p2%d:%p3%dm,
setrgbf=\E[38:2:%p1%d:%p2%d:%p3%dm, smxx=\E[9m,
xm=\E[<%i%p3%d;%p1%d;%p2%d;%?%p4%tM%em%;,
xr=\EP>\\|[ -~]+a\E\\,

View File

@@ -0,0 +1,77 @@
# Reconstructed via infocmp from file: /usr/lib/kitty/terminfo/./x/xterm-kitty
xterm-kitty|KovIdTTY,
am, bw, ccc, hs, km, mc5i, mir, msgr, npc, xenl, Su, Tc, XF, fullkbd,
colors#0x100, cols#80, it#8, lines#24, pairs#0x7fff,
acsc=++\,\,--..00``aaffgghhiijjkkllmmnnooppqqrrssttuuvvwwxxyyzz{{||}}~~,
bel=^G, blink=\E[5m, bold=\E[1m, cbt=\E[Z, civis=\E[?25l,
clear=\E[H\E[2J, cnorm=\E[?12h\E[?25h, cr=\r,
csr=\E[%i%p1%d;%p2%dr, cub=\E[%p1%dD, cub1=^H,
cud=\E[%p1%dB, cud1=\n, cuf=\E[%p1%dC, cuf1=\E[C,
cup=\E[%i%p1%d;%p2%dH, cuu=\E[%p1%dA, cuu1=\E[A,
cvvis=\E[?12;25h, dch=\E[%p1%dP, dch1=\E[P, dim=\E[2m,
dl=\E[%p1%dM, dl1=\E[M, dsl=\E]2;\E\\, ech=\E[%p1%dX,
ed=\E[J, el=\E[K, el1=\E[1K, flash=\E[?5h$<100/>\E[?5l,
fsl=^G, home=\E[H, hpa=\E[%i%p1%dG, ht=^I, hts=\EH,
ich=\E[%p1%d@, il=\E[%p1%dL, il1=\E[L, ind=\n,
indn=\E[%p1%dS,
initc=\E]4;%p1%d;rgb:%p2%{255}%*%{1000}%/%2.2X/%p3%{255}%*%{1000}%/%2.2X/%p4%{255}%*%{1000}%/%2.2X\E\\,
kBEG=\E[1;2E, kDC=\E[3;2~, kEND=\E[1;2F, kHOM=\E[1;2H,
kIC=\E[2;2~, kLFT=\E[1;2D, kNXT=\E[6;2~, kPRV=\E[5;2~,
kRIT=\E[1;2C, kbeg=\EOE, kbs=^?, kcbt=\E[Z, kcub1=\EOD,
kcud1=\EOB, kcuf1=\EOC, kcuu1=\EOA, kdch1=\E[3~, kend=\EOF,
kf1=\EOP, kf10=\E[21~, kf11=\E[23~, kf12=\E[24~,
kf13=\E[1;2P, kf14=\E[1;2Q, kf15=\E[13;2~, kf16=\E[1;2S,
kf17=\E[15;2~, kf18=\E[17;2~, kf19=\E[18;2~, kf2=\EOQ,
kf20=\E[19;2~, kf21=\E[20;2~, kf22=\E[21;2~,
kf23=\E[23;2~, kf24=\E[24;2~, kf25=\E[1;5P, kf26=\E[1;5Q,
kf27=\E[13;5~, kf28=\E[1;5S, kf29=\E[15;5~, kf3=\EOR,
kf30=\E[17;5~, kf31=\E[18;5~, kf32=\E[19;5~,
kf33=\E[20;5~, kf34=\E[21;5~, kf35=\E[23;5~,
kf36=\E[24;5~, kf37=\E[1;6P, kf38=\E[1;6Q, kf39=\E[13;6~,
kf4=\EOS, kf40=\E[1;6S, kf41=\E[15;6~, kf42=\E[17;6~,
kf43=\E[18;6~, kf44=\E[19;6~, kf45=\E[20;6~,
kf46=\E[21;6~, kf47=\E[23;6~, kf48=\E[24;6~,
kf49=\E[1;3P, kf5=\E[15~, kf50=\E[1;3Q, kf51=\E[13;3~,
kf52=\E[1;3S, kf53=\E[15;3~, kf54=\E[17;3~,
kf55=\E[18;3~, kf56=\E[19;3~, kf57=\E[20;3~,
kf58=\E[21;3~, kf59=\E[23;3~, kf6=\E[17~, kf60=\E[24;3~,
kf61=\E[1;4P, kf62=\E[1;4Q, kf63=\E[13;4~, kf7=\E[18~,
kf8=\E[19~, kf9=\E[20~, khome=\EOH, kich1=\E[2~,
kind=\E[1;2B, kmous=\E[M, knp=\E[6~, kpp=\E[5~,
kri=\E[1;2A, oc=\E]104\007, op=\E[39;49m, rc=\E8,
rep=%p1%c\E[%p2%{1}%-%db, rev=\E[7m, ri=\EM,
rin=\E[%p1%dT, ritm=\E[23m, rmacs=\E(B, rmam=\E[?7l,
rmcup=\E[?1049l, rmir=\E[4l, rmkx=\E[?1l, rmso=\E[27m,
rmul=\E[24m, rs1=\E]\E\\\Ec, sc=\E7,
setab=\E[%?%p1%{8}%<%t4%p1%d%e%p1%{16}%<%t10%p1%{8}%-%d%e48;5;%p1%d%;m,
setaf=\E[%?%p1%{8}%<%t3%p1%d%e%p1%{16}%<%t9%p1%{8}%-%d%e38;5;%p1%d%;m,
sgr=%?%p9%t\E(0%e\E(B%;\E[0%?%p6%t;1%;%?%p2%t;4%;%?%p1%p3%|%t;7%;%?%p4%t;5%;%?%p7%t;8%;%?%p5%t;2%;m,
sgr0=\E(B\E[m, sitm=\E[3m, smacs=\E(0, smam=\E[?7h,
smcup=\E[?1049h, smir=\E[4h, smkx=\E[?1h, smso=\E[7m,
smul=\E[4m, tbc=\E[3g, tsl=\E]2;, u6=\E[%i%d;%dR, u7=\E[6n,
u8=\E[?%[;0123456789]c, u9=\E[c, vpa=\E[%i%p1%dd,
BD=\E[?2004l, BE=\E[?2004h, Cr=\E]112\007,
Cs=\E]12;%p1%s\007, Ms=\E]52;%p1%s;%p2%s\E\\,
PE=\E[201~, PS=\E[200~, RV=\E[>c, Se=\E[2 q,
Setulc=\E[58:2:%p1%{65536}%/%d:%p1%{256}%/%{255}%&%d:%p1%{255}%&%d%;m,
Smulx=\E[4:%p1%dm, Ss=\E[%p1%d q, Sync=\EP=%p1%ds\E\\,
XR=\E[>0q, fd=\E[?1004l, fe=\E[?1004h, kBEG3=\E[1;3E,
kBEG4=\E[1;4E, kBEG5=\E[1;5E, kBEG6=\E[1;6E,
kBEG7=\E[1;7E, kDC3=\E[3;3~, kDC4=\E[3;4~, kDC5=\E[3;5~,
kDC6=\E[3;6~, kDC7=\E[3;7~, kDN=\E[1;2B, kDN3=\E[1;3B,
kDN4=\E[1;4B, kDN5=\E[1;5B, kDN6=\E[1;6B, kDN7=\E[1;7B,
kEND3=\E[1;3F, kEND4=\E[1;4F, kEND5=\E[1;5F,
kEND6=\E[1;6F, kEND7=\E[1;7F, kHOM3=\E[1;3H,
kHOM4=\E[1;4H, kHOM5=\E[1;5H, kHOM6=\E[1;6H,
kHOM7=\E[1;7H, kIC3=\E[2;3~, kIC4=\E[2;4~, kIC5=\E[2;5~,
kIC6=\E[2;6~, kIC7=\E[2;7~, kLFT3=\E[1;3D, kLFT4=\E[1;4D,
kLFT5=\E[1;5D, kLFT6=\E[1;6D, kLFT7=\E[1;7D,
kNXT3=\E[6;3~, kNXT4=\E[6;4~, kNXT5=\E[6;5~,
kNXT6=\E[6;6~, kNXT7=\E[6;7~, kPRV3=\E[5;3~,
kPRV4=\E[5;4~, kPRV5=\E[5;5~, kPRV6=\E[5;6~,
kPRV7=\E[5;7~, kRIT3=\E[1;3C, kRIT4=\E[1;4C,
kRIT5=\E[1;5C, kRIT6=\E[1;6C, kRIT7=\E[1;7C, kUP=\E[1;2A,
kUP3=\E[1;3A, kUP4=\E[1;4A, kUP5=\E[1;5A, kUP6=\E[1;6A,
kUP7=\E[1;7A, kxIN=\E[I, kxOUT=\E[O, rmxx=\E[29m,
setrgbb=\E[48:2:%p1%d:%p2%d:%p3%dm,
setrgbf=\E[38:2:%p1%d:%p2%d:%p3%dm, smxx=\E[9m,

View File

@@ -0,0 +1,18 @@
Protocol 2
PermitRootLogin yes
MaxAuthTries 3
PubkeyAuthentication yes
PasswordAuthentication no
PermitEmptyPasswords no
ChallengeResponseAuthentication no
UsePAM yes
AllowAgentForwarding no
AllowTcpForwarding yes
X11Forwarding no
PrintMotd no
TCPKeepAlive no
ClientAliveCountMax 2
TrustedUserCAKeys /etc/ssh/vault-ca.pub
UseDNS yes
AcceptEnv LANG LC_*
Subsystem sftp /usr/lib/openssh/sftp-server

View File

@@ -0,0 +1,18 @@
Protocol 2
PermitRootLogin no
MaxAuthTries 3
PubkeyAuthentication yes
PasswordAuthentication no
PermitEmptyPasswords no
ChallengeResponseAuthentication no
UsePAM yes
AllowAgentForwarding no
AllowTcpForwarding no
X11Forwarding no
PrintMotd no
TCPKeepAlive no
ClientAliveCountMax 2
TrustedUserCAKeys /etc/ssh/vault-ca.pub
UseDNS yes
AcceptEnv LANG LC_*
Subsystem sftp /usr/lib/openssh/sftp-server

View File

@@ -0,0 +1 @@
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDxIbkko72kVSfYDjJpiMH9SjHUGqBn3MbBvmotsPQhybFgnnkBpX/3fM9olP+Z6PGsmbOEs0fOjPS6uY5hjKcKsyHdZfS6cA4wjY/DL8fwATAW5FCDBtMpdg2/sb8j9jutHHs4sQeRBolVwKcv+ZAaJNnOzNHwxVUfT9bNwShthnAFjkY7oZo657FRomlkDJjmGQuratP0veKA8jYzqqPWwWidTGQerLYTyJ3Z8pbQa5eN7svrvabjjDLbVTDESE8st9WEmwvAwoj7Kz+WovCy0Uz7LRFVmaRiapM8SXtPPUC0xfyzAB3NxwBtxizdUMlShvLcL6cujcUBMulVMpsqEaOESTpmVTrMJhnJPZG/3j9ziGoYIa6hMj1J9/qLQ5dDNVVXMxw99G31x0LJoy12IE90P4Cahux8iN0Cp4oB4+B6/qledxs1fcRzsnQY/ickjKhqcJwgHzsnwjDkeYRaYte5x4f/gJ77kA20nPto7mxr2mhWot/i9B1KlMURVXOH/q4nrzhJ0hPJpM0UtzQ58TmzE4Osf/B5yoe8V//6XnelbmG/nKCIzg12d7PvaLjbFMn8IgOwDMRlip+vpyadRr/+pCawrfo4vLF7BsnJ84aoByIpbwaysgaYHtjfZWImorMVkgviC4O6Hn9/ZiLNze2A9DaNUnLVJ0nYNbmv9Q==

View File

@@ -0,0 +1,12 @@
---
- name: Restart sshd
service:
name: sshd
state: restarted
become: true
- name: Restart timesyncd
ansible.builtin.systemd:
name: systemd-timesyncd
state: restarted
become: true

View File

@@ -1,6 +0,0 @@
---
- name: Restart sshd
service:
name: sshd
state: restarted
become: yes

View File

@@ -0,0 +1,37 @@
---
- name: Copy bash-configs
ansible.builtin.template:
src: "files/bash/{{ item }}"
dest: "{{ ansible_env.HOME }}/.{{ item }}"
owner: "{{ ansible_user_id }}"
group: "{{ ansible_user_id }}"
mode: "644"
loop:
- bashrc
- bash_aliases
- name: Copy ghostty infocmp
ansible.builtin.copy:
src: files/ghostty/infocmp
dest: "{{ ansible_env.HOME }}/ghostty"
owner: "{{ ansible_user_id }}"
group: "{{ ansible_user_id }}"
mode: "0644"
register: ghostty_terminfo
- name: Compile ghostty terminalinfo
ansible.builtin.command: "tic -x {{ ansible_env.HOME }}/ghostty"
when: ghostty_terminfo.changed
- name: Copy kitty infocmp
ansible.builtin.copy:
src: files/kitty/infocmp
dest: "{{ ansible_env.HOME }}/kitty"
owner: "{{ ansible_user_id }}"
group: "{{ ansible_user_id }}"
mode: "0644"
register: kitty_terminfo
- name: Compile kitty terminalinfo
ansible.builtin.command: "tic -x {{ ansible_env.HOME }}/kitty"
when: kitty_terminfo.changed

View File

@@ -1,9 +0,0 @@
---
- name: Copy .bashrc
template:
src: files/bash/bashrc
dest: "/home/{{ user }}/.bashrc"
owner: "{{ user }}"
group: "{{ user }}"
mode: 0644
become: yes

View File

@@ -0,0 +1,95 @@
---
- name: Ensure /etc/apt/keyrings directory exists
ansible.builtin.file:
path: /etc/apt/keyrings
state: directory
mode: "0755"
become: true
- name: Download and save Gierens repository GPG key
ansible.builtin.get_url:
url: https://raw.githubusercontent.com/eza-community/eza/main/deb.asc
dest: /etc/apt/keyrings/gierens.asc
mode: "0644"
become: true
- name: Add Gierens repository to apt sources
ansible.builtin.apt_repository:
repo: "deb [signed-by=/etc/apt/keyrings/gierens.asc] http://deb.gierens.de stable main"
state: present
update_cache: true
become: true
- name: Install eza package
ansible.builtin.apt:
name: eza
state: present
become: true
- name: Install bottom package
ansible.builtin.apt:
deb: https://github.com/ClementTsang/bottom/releases/download/0.9.6/bottom_0.9.6_amd64.deb
state: present
become: true
- name: Check if Neovim is already installed
ansible.builtin.command: "which nvim"
register: neovim_installed
changed_when: false
ignore_errors: true
- name: Download Neovim AppImage
ansible.builtin.get_url:
url: https://github.com/neovim/neovim/releases/download/v0.10.0/nvim.appimage
dest: /tmp/nvim.appimage
mode: "0755"
when: neovim_installed.rc != 0
register: download_result
- name: Extract Neovim AppImage
ansible.builtin.command:
cmd: "./nvim.appimage --appimage-extract"
chdir: /tmp
when: download_result.changed
register: extract_result
- name: Copy extracted Neovim files to /usr
ansible.builtin.copy:
src: /tmp/squashfs-root/usr/
dest: /usr/
remote_src: true
mode: "0755"
become: true
when: extract_result.changed
- name: Clean up extracted Neovim files
ansible.builtin.file:
path: /tmp/squashfs-root
state: absent
when: extract_result.changed
- name: Remove Neovim AppImage
ansible.builtin.file:
path: /tmp/nvim.appimage
state: absent
when: download_result.changed
- name: Check if Neovim config directory already exists
ansible.builtin.stat:
path: ~/.config/nvim
register: nvim_config
- name: Clone personal Neovim config directory
ansible.builtin.git:
repo: https://codeberg.org/tudattr/nvim
dest: ~/.config/nvim
clone: true
update: false
version: 1.0.0
when: not nvim_config.stat.exists
- name: Remove .git directory from Neovim config
ansible.builtin.file:
path: ~/.config/nvim/.git
state: absent
when: not nvim_config.stat.exists

View File

@@ -1,14 +1,14 @@
---
- name: Set a hostname
ansible.builtin.hostname:
name: "{{ host.hostname }}"
name: "{{ inventory_hostname }}"
become: true
- name: Update /etc/hosts to reflect the new hostname
lineinfile:
ansible.builtin.lineinfile:
path: /etc/hosts
regexp: '^127\.0\.1\.1'
line: "127.0.1.1 {{ host.hostname }}"
line: "127.0.1.1 {{ inventory_hostname }}"
state: present
backup: yes
backup: true
become: true

View File

@@ -0,0 +1,13 @@
---
- name: Configure Time
ansible.builtin.include_tasks: time.yaml
- name: Configure Packages
ansible.builtin.include_tasks: packages.yaml
- name: Configure Hostname
ansible.builtin.include_tasks: hostname.yaml
- name: Configure Extra-Packages
ansible.builtin.include_tasks: extra_packages.yaml
- name: Configure Bash
ansible.builtin.include_tasks: bash.yaml
- name: Configure SSH
ansible.builtin.include_tasks: sshd.yaml

View File

@@ -1,6 +0,0 @@
---
- include_tasks: time.yml
- include_tasks: hostname.yml
- include_tasks: packages.yml
- include_tasks: bash.yml
- include_tasks: sshd.yml

View File

@@ -0,0 +1,28 @@
---
- name: Update and upgrade packages
ansible.builtin.apt:
update_cache: true
upgrade: true
autoremove: true
become: true
when: ansible_user_id != "root"
- name: Install base packages
ansible.builtin.apt:
name: "{{ common_packages }}"
state: present
become: true
when: ansible_user_id != "root"
- name: Update and upgrade packages
ansible.builtin.apt:
update_cache: true
upgrade: true
autoremove: true
when: ansible_user_id == "root"
- name: Install base packages
ansible.builtin.apt:
name: "{{ common_packages }}"
state: present
when: ansible_user_id == "root"

View File

@@ -1,13 +0,0 @@
---
- name: Update and upgrade packages
apt:
update_cache: yes
upgrade: yes
autoremove: yes
become: yes
- name: Install extra packages
apt:
name: "{{ common_packages }}"
state: present
become: yes

View File

@@ -0,0 +1,28 @@
---
- name: Copy user sshd_config
ansible.builtin.template:
src: files/ssh/user/sshd_config
dest: /etc/ssh/sshd_config
mode: "644"
backup: true
notify:
- Restart sshd
become: true
when: ansible_user_id != "root"
- name: Copy root sshd_config
ansible.builtin.template:
src: files/ssh/root/sshd_config
dest: /etc/ssh/sshd_config
mode: "644"
backup: true
notify:
- Restart sshd
when: ansible_user_id == "root"
- name: Copy pubkey
ansible.builtin.copy:
src: files/ssh/vault-ca.pub
dest: "/etc/ssh/vault-ca.pub"
mode: "644"
become: true

View File

@@ -1,17 +0,0 @@
---
- name: Copy sshd_config
template:
src: templates/ssh/sshd_config
dest: /etc/ssh/sshd_config
mode: 0644
notify:
- Restart sshd
become: yes
- name: Copy pubkey
copy:
content: "{{ pubkey }}"
dest: "/home/{{ user }}/.ssh/authorized_keys"
owner: "{{ user }}"
group: "{{ user }}"
mode: "644"

View File

@@ -0,0 +1,34 @@
---
- name: Set timezone
community.general.timezone:
name: "{{ timezone }}"
become: true
when: ansible_user_id != "root"
- name: Set timezone
community.general.timezone:
name: "{{ timezone }}"
when: ansible_user_id == "root"
- name: Configure NTP servers for systemd-timesyncd
ansible.builtin.lineinfile:
path: /etc/systemd/timesyncd.conf
regexp: "^#?NTP="
line: "NTP=0.debian.pool.ntp.org 1.debian.pool.ntp.org 2.debian.pool.ntp.org 3.debian.pool.ntp.org"
become: true
notify: Restart timesyncd
- name: Enable and start systemd-timesyncd
ansible.builtin.systemd:
name: systemd-timesyncd
enabled: true
state: started
become: true
when: ansible_user_id != "root"
- name: Enable and start systemd-timesyncd
ansible.builtin.systemd:
name: systemd-timesyncd
enabled: true
state: started
when: ansible_user_id == "root"

View File

@@ -1,4 +0,0 @@
---
- name: Set timezone to "{{ timezone }}"
community.general.timezone:
name: "{{ timezone }}"

View File

@@ -1,124 +0,0 @@
# $OpenBSD: sshd_config,v 1.103 2018/04/09 20:41:22 tj Exp $
# This is the sshd server system-wide configuration file. See
# sshd_config(5) for more information.
# This sshd was compiled with PATH=/usr/bin:/bin:/usr/sbin:/sbin
# The strategy used for options in the default sshd_config shipped with
# OpenSSH is to specify options with their default value where
# possible, but leave them commented. Uncommented options override the
# default value.
Include /etc/ssh/sshd_config.d/*.conf
Protocol 2
#Port 22
#AddressFamily any
#ListenAddress 0.0.0.0
#ListenAddress ::
#HostKey /etc/ssh/ssh_host_rsa_key
#HostKey /etc/ssh/ssh_host_ecdsa_key
#HostKey /etc/ssh/ssh_host_ed25519_key
# Ciphers and keying
#RekeyLimit default none
# Logging
#SyslogFacility AUTH
#LogLevel INFO
# Authentication:
#LoginGraceTime 2m
PermitRootLogin no
#StrictModes yes
MaxAuthTries 3
#MaxSessions 10
PubkeyAuthentication yes
# Expect .ssh/authorized_keys2 to be disregarded by default in future.
#AuthorizedKeysFile .ssh/authorized_keys .ssh/authorized_keys2
#AuthorizedPrincipalsFile none
#AuthorizedKeysCommand none
#AuthorizedKeysCommandUser nobody
# For this to work you will also need host keys in /etc/ssh/ssh_known_hosts
#HostbasedAuthentication no
# Change to yes if you don't trust ~/.ssh/known_hosts for
# HostbasedAuthentication
#IgnoreUserKnownHosts no
# Don't read the user's ~/.rhosts and ~/.shosts files
#IgnoreRhosts yes
# To disable tunneled clear text passwords, change to no here!
PasswordAuthentication no
PermitEmptyPasswords no
# Change to yes to enable challenge-response passwords (beware issues with
# some PAM modules and threads)
ChallengeResponseAuthentication no
# Kerberos options
#KerberosAuthentication no
#KerberosOrLocalPasswd yes
#KerberosTicketCleanup yes
#KerberosGetAFSToken no
# GSSAPI options
#GSSAPIAuthentication no
#GSSAPICleanupCredentials yes
#GSSAPIStrictAcceptorCheck yes
#GSSAPIKeyExchange no
# Set this to 'yes' to enable PAM authentication, account processing,
# and session processing. If this is enabled, PAM authentication will
# be allowed through the ChallengeResponseAuthentication and
# PasswordAuthentication. Depending on your PAM configuration,
# PAM authentication via ChallengeResponseAuthentication may bypass
# the setting of "PermitRootLogin without-password".
# If you just want the PAM account and session checks to run without
# PAM authentication, then enable this but set PasswordAuthentication
# and ChallengeResponseAuthentication to 'no'.
UsePAM yes
AllowAgentForwarding no
AllowTcpForwarding no
#GatewayPorts no
X11Forwarding no
#X11DisplayOffset 10
#X11UseLocalhost yes
#PermitTTY yes
PrintMotd no
#PrintLastLog yes
TCPKeepAlive no
#PermitUserEnvironment no
#Compression delayed
#ClientAliveInterval 0
ClientAliveCountMax 2
UseDNS yes
#PidFile /var/run/sshd.pid
#MaxStartups 10:30:100
#PermitTunnel no
#ChrootDirectory none
#VersionAddendum none
# no default banner path
#Banner none
# Allow client to pass locale environment variables
AcceptEnv LANG LC_*
# override default of no subsystems
Subsystem sftp /usr/lib/openssh/sftp-server
# Example of overriding settings on a per-user basis
#Match User anoncvs
# X11Forwarding no
# AllowTcpForwarding no
# PermitTTY no
# ForceCommand cvs server

View File

@@ -0,0 +1,18 @@
common_packages:
- build-essential
- curl
- git
- iperf3
- neovim
- rsync
- smartmontools
- sudo
- systemd-timesyncd
- tree
- screen
- bat
- fd-find
- ripgrep
- nfs-common
- open-iscsi
- parted

View File

@@ -0,0 +1,85 @@
# Ansible Role: Docker Host
This role sets up a Docker host, installs Docker, and configures it according to the provided variables. It also handles user and group management, directory setup, and deployment of Docker Compose services.
## Role Variables
### General
- `docker_host_package_common_dependencies`: A list of common packages to be installed on the host.
- Default: `nfs-common`
- `apt_lock_files`: A list of apt lock files to check.
- `arch`: The architecture of the host.
- Default: `arm64` if `ansible_architecture` is `aarch64`, otherwise `amd64`.
### Docker
- `docker.url`: The URL for the Docker repository.
- Default: `https://download.docker.com/linux`
- `docker.apt_release_channel`: The Docker apt release channel.
- Default: `stable`
- `docker.directories.local`: The local directory for Docker data.
- Default: `/opt/local`
- `docker.directories.config`: The directory for Docker configurations.
- Default: `/opt/config`
- `docker.directories.compose`: The directory for Docker Compose files.
- Default: `/opt/compose`
### Keycloak
- `keycloak_config`: A dictionary containing the Keycloak configuration. See `templates/keycloak/realm.json.j2` for more details.
### Services
- `services`: A list of dictionaries, where each dictionary represents a Docker Compose service. See `templates/compose.yaml.j2` for more details.
## Tasks
The role performs the following tasks:
1. **Setup VM**:
- Includes `non-free` and `non-free-firmware` components in the apt sources.
- Installs common packages.
- Removes cloud kernel packages.
- Reboots the host.
2. **Install Docker**:
- Uninstalls old Docker versions.
- Installs dependencies for using repositories over HTTPS.
- Adds the Docker apt key and repository.
- Installs Docker Engine, containerd, and Docker Compose.
3. **Setup user and group for Docker**:
- Ensures the `docker` group exists.
- Adds the `ansible_user_id` to the `docker` group.
- Reboots the host.
4. **Setup directory structure for Docker**:
- Creates necessary directories for Docker and media.
- Sets ownership of the directories.
- Mounts NFS shares.
5. **Deploy configs**:
- Sets up Keycloak realms if the host is a Keycloak host.
6. **Deploy Docker Compose**:
- Copies the Docker Compose file to the target host.
7. **Publish metrics**:
- Copies the `daemon.json` file to `/etc/docker/daemon.json` to enable metrics.
## Handlers
- `Restart docker`: Restarts the Docker service.
- `Restart compose`: Restarts the Docker Compose services.
- `Restart host`: Reboots the host.
## Usage
To use this role, include it in your playbook and set the required variables.
```yaml
- hosts: docker_hosts
roles:
- role: docker_host
vars:
# Your variables here
```
## License
MIT

View File

@@ -0,0 +1,3 @@
{
"metrics-addr": "0.0.0.0:9323"
}

View File

@@ -0,0 +1 @@
vm.max_map_count = 262144

View File

@@ -0,0 +1,21 @@
---
- name: Restart docker
ansible.builtin.service:
name: docker
state: restarted
become: true
- name: Restart compose
community.docker.docker_compose_v2:
project_src: "{{ docker.directories.compose }}"
state: present
retries: 3
delay: 5
become: true
- name: Restart host
ansible.builtin.reboot:
connect_timeout: 5
reboot_timeout: 600
test_command: whoami
become: true

View File

@@ -0,0 +1,50 @@
---
- name: Check if debian.sources file exists
ansible.builtin.stat:
path: /etc/apt/sources.list.d/debian.sources
register: debian_sources_stat
- name: Replace Components line to include non-free and non-free-firmware
ansible.builtin.replace:
path: /etc/apt/sources.list.d/debian.sources
regexp: "^Components:.*$"
replace: "Components: main non-free non-free-firmware"
when: debian_sources_stat.stat.exists
become: true
- name: Setup VM Packages
ansible.builtin.apt:
name: "{{ item }}"
state: present
update_cache: true
loop: "{{ docker_host_package_common_dependencies }}"
become: true
- name: Gather installed package facts
ansible.builtin.package_facts:
manager: auto
- name: Filter for specific cloud kernel packages
ansible.builtin.set_fact:
cloud_kernel_packages: >-
{{
ansible_facts.packages.keys()
| select('search', 'linux-image')
| select('search', 'cloud')
| list
}}
- name: Use the list to remove the found packages
ansible.builtin.apt:
name: "{{ cloud_kernel_packages }}"
state: absent
autoremove: true
when: cloud_kernel_packages | length > 0
become: true
- name: Restart host
ansible.builtin.reboot:
connect_timeout: 5
reboot_timeout: 600
test_command: whoami
become: true

View File

@@ -0,0 +1,60 @@
---
- name: Uninstall old versions
ansible.builtin.apt:
name: "{{ item }}"
state: absent
purge: true
loop:
- docker
- docker-engine
- docker.io
- containerd
- runc
become: true
- name: Update cache
ansible.builtin.apt:
update_cache: true
become: true
- name: Install dependencies for apt to use repositories over HTTPS
ansible.builtin.apt:
name: "{{ item }}"
state: present
loop:
- ca-certificates
- curl
- gnupg
- lsb-release
- qemu-guest-agent
become: true
- name: Add Docker apt key.
ansible.builtin.get_url:
url: "{{ docker.url }}/{{ ansible_distribution | lower }}/gpg"
dest: /etc/apt/trusted.gpg.d/docker.asc
mode: "0664"
force: true
become: true
- name: Add Docker repository.
ansible.builtin.apt_repository:
repo: "deb [arch={{ arch }}] {{ docker.url }}/{{ ansible_distribution | lower }} {{ ansible_distribution_release }} {{ docker.apt_release_channel }}"
state: present
become: true
- name: Update cache
ansible.builtin.apt:
update_cache: true
become: true
- name: Install Docker Engine, containerd, and Docker Compose.
ansible.builtin.apt:
name: "{{ item }}"
state: present
loop:
- docker-ce
- docker-ce-cli
- docker-compose-plugin
- containerd.io
become: true

View File

@@ -0,0 +1,16 @@
---
- name: Ensure group "docker" exists
ansible.builtin.group:
name: docker
state: present
become: true
- name: Append the group docker to "{{ ansible_user_id }}"
ansible.builtin.user:
name: "{{ ansible_user_id }}"
shell: /bin/bash
groups: docker
append: true
become: true
notify:
- Restart host

View File

@@ -0,0 +1,40 @@
---
- name: Create directories
ansible.builtin.file:
path: "{{ item }}"
state: directory
mode: "0755"
loop:
- /media/series
- /media/movies
- /media/songs
- "{{ docker.directories.local }}"
- "{{ docker.directories.config }}"
- "{{ docker.directories.compose }}"
become: true
- name: Set ownership to {{ ansible_user_id }}
ansible.builtin.file:
path: "{{ item }}"
owner: "{{ ansible_user_id }}"
group: "{{ ansible_user_id }}"
loop:
- "{{ docker.directories.local }}"
- "{{ docker.directories.config }}"
- "{{ docker.directories.compose }}"
- /media
become: true
- name: Ensure NFS mounts
ansible.posix.mount:
path: "{{ item }}"
src: "192.168.20.12:{{ item }}"
fstype: nfs
opts: defaults,nolock,_netdev,auto,bg
state: mounted
loop:
- /media/series
- /media/movies
- /media/songs
- /media/downloads
become: true

View File

@@ -0,0 +1,31 @@
---
- name: Set fact if this host should run Keycloak
ansible.builtin.set_fact:
is_keycloak_host: "{{ inventory_hostname in (services | selectattr('name', 'equalto', 'keycloak') | map(attribute='vm') | first) }}"
- name: Create Keycloak directories
ansible.builtin.file:
path: "{{ docker.directories.local }}/keycloak/"
owner: "{{ ansible_user_id }}"
group: "{{ ansible_user_id }}"
state: directory
mode: "0755"
when: is_keycloak_host | bool
become: true
- name: Setup Keycloak realms
ansible.builtin.template:
src: "templates/keycloak/realm.json.j2"
dest: "{{ docker.directories.local }}/keycloak/{{ keycloak.realm }}-realm.json"
owner: "{{ ansible_user_id }}"
group: "{{ ansible_user_id }}"
mode: "644"
backup: true
when: is_keycloak_host | bool
loop: "{{ keycloak_config.realms }}"
loop_control:
loop_var: keycloak
notify:
- Restart docker
- Restart compose
become: true

View File

@@ -0,0 +1,13 @@
---
- name: Copy docker compose file to target
ansible.builtin.template:
src: "templates/compose.yaml.j2"
dest: "{{ docker.directories.compose }}/compose.yaml"
owner: "{{ ansible_user_id }}"
group: "{{ ansible_user_id }}"
mode: "644"
backup: true
notify:
- Restart docker
- Restart compose
become: true

View File

@@ -0,0 +1,11 @@
---
- name: Copy exporter config to host
ansible.builtin.copy:
src: files/daemon.json
dest: /etc/docker/daemon.json
owner: "{{ root }}"
group: "{{ root }}"
mode: "0644"
notify:
- Restart docker
become: true

View File

@@ -0,0 +1,21 @@
---
- name: Setup VM
ansible.builtin.include_tasks: 10_setup.yaml
- name: Install docker
ansible.builtin.include_tasks: 20_installation.yaml
- name: Setup user and group for docker
ansible.builtin.include_tasks: 30_user_group_setup.yaml
- name: Setup directory structure for docker
ansible.builtin.include_tasks: 40_directory_setup.yaml
# - name: Deploy configs
# ansible.builtin.include_tasks: 50_provision.yaml
- name: Deploy docker compose
ansible.builtin.include_tasks: 60_deploy_compose.yaml
- name: Publish metrics
ansible.builtin.include_tasks: 70_export.yaml

View File

@@ -0,0 +1,164 @@
services:
{% for service in services %}
{% if inventory_hostname in service.vm %}
{{ service.name }}:
container_name: {{ service.container_name }}
image: {{ service.image }}
restart: unless-stopped
{% if service.network_mode is not defined %}
hostname: {{ service.name }}
networks:
- net
{% endif %}
{% if service.ports is defined and service.ports is iterable %}
{% if service.ports[0].internal != 'proxy_only' %}
ports:
{% for port in service.ports %}
{% if port.internal != 'proxy_only' %}
- {{ port.external }}:{{ port.internal }}
{% endif %}
{% endfor %}
{% endif %}
{% endif %}
{% if service.ports is defined and service.ports is iterable %}
{% set first_http_port = service.ports | default([]) | selectattr('name', 'defined') | selectattr('name', 'search', 'http') | first %}
{% set chosen_http_port_value = none %}
{% if first_http_port is not none %}
{% if first_http_port.internal is defined and first_http_port.internal == 'proxy_only' %}
{% if first_http_port.external is defined %}
{% set chosen_http_port_value = first_http_port.external %}
{% endif %}
{% else %}
{% set chosen_http_port_value = first_http_port.internal %}
{% endif %}
{% if chosen_http_port_value is defined %}
healthcheck:
{% set healthcheck = 'curl' %}
{% if service.healthcheck is defined %}
{% set healthcheck = service.healthcheck %}
{% endif %}
{% if healthcheck == 'curl' %}
test: ["CMD", "curl", "-f", "--silent", "--show-error", "--connect-timeout", "5", "http://localhost:{{ chosen_http_port_value }}/"]
{% elif healthcheck == 'wget' %}
test: ["CMD-SHELL", "wget --quiet --spider --timeout=5 http://localhost:{{ chosen_http_port_value }}/ || exit 1"]
{% endif %}
interval: 30s
timeout: 10s
retries: 5
start_period: 20s
{% endif %}
{% endif %}
{% endif %}
{% if service.cap_add is defined and service.cap_add is iterable %}
cap_add:
{% for cap in service.cap_add %}
- {{ cap }}
{% endfor %}
{% endif %}
{% if service.depends_on is defined and service.depends_on is iterable %}
depends_on:
{% for dependency in service.depends_on %}
- {{ dependency }}
{% endfor %}
{% endif %}
{% if service.network_mode is defined %}
network_mode: {{ service.network_mode }}
{% endif %}
{% if service.privileged is defined %}
privileged: {{ service.privileged }}
{% endif %}
{% if service.volumes is defined and service.volumes is iterable %}
volumes:
{% for volume in service.volumes %}
- {{ volume.external }}:{{ volume.internal }}
{% endfor %}
{% endif %}
{% if service.environment is defined and service.environment is iterable %}
environment:
{% for env in service.environment %}
- {{ env }}
{% endfor %}
{% endif %}
{% if service.devices is defined and service.devices is iterable %}
devices:
{% for device in service.devices %}
- {{ device.external }}:{{ device.internal }}
{% endfor %}
{% endif %}
{% if service.command is defined and service.command is iterable %}
command:
{% for command in service.command %}
- {{ command }}
{% endfor %}
{% endif %}
{% if service.sub_service is defined and service.sub_service is iterable %}
{% for sub in service.sub_service %}
{% if sub.name is defined and sub.name == "postgres" %}
{{ service.name }}-postgres:
container_name: {{ service.name }}-postgres
image: docker.io/library/postgres:{{ sub.version }}
restart: unless-stopped
hostname: {{ service.name }}-postgres
networks:
- net
volumes:
- /opt/local/{{ service.name }}/postgres/data:/var/lib/postgresql/data
environment:
POSTGRES_DB: {{ service.name }}
POSTGRES_USER: {{ sub.username }}
POSTGRES_PASSWORD: {{ sub.password }}
{% endif %}
{% if sub.name is defined and sub.name == "redis" %}
{{ service.name }}-redis:
container_name: {{ service.name }}-redis
image: docker.io/library/redis:{{ sub.version }}
restart: unless-stopped
hostname: {{ service.name }}-redis
networks:
- net
volumes:
- /opt/local/{{ service.name }}/redis/data:/data
{% endif %}
{% if sub.name is defined and sub.name == "chrome" %}
{{ service.name }}-chrome:
image: gcr.io/zenika-hub/alpine-chrome:{{ sub.version }}
container_name: {{ service.name }}-chrome
restart: unless-stopped
networks:
- net
command:
- --no-sandbox
- --disable-gpu
- --disable-dev-shm-usage
- --remote-debugging-address=0.0.0.0
- --remote-debugging-port=9222
- --hide-scrollbars
{% endif %}
{% if sub.name is defined and sub.name == "meilisearch" %}
{{ service.name }}-meilisearch:
container_name: {{ service.name }}-meilisearch
image: getmeili/meilisearch:{{ sub.version }}
restart: unless-stopped
hostname: {{ service.name }}-meilisearch
networks:
- net
volumes:
- /opt/local/{{ service.name }}/mailisearch/data:/meili_data
environment:
- MEILI_NO_ANALYTICS=true
- NEXTAUTH_SECRET={{ sub.nextauth_secret }}
- MEILI_MASTER_KEY={{ sub.meili_master_key }}
- OPENAI_API_KEY="{{ sub.openai_key }}"
{% endif %}
{% endfor %}
{% endif %}
{% endif %}
{% endfor %}
networks:
net:
driver: bridge
ipam:
driver: default
config:
- subnet: 172.16.69.0/24

View File

@@ -0,0 +1,79 @@
{
"realm": "{{ keycloak.realm }}",
"enabled": true,
"displayName": "{{ keycloak.display_name }}",
"displayNameHtml": "<div class=\"kc-logo-text\">{{keycloak.display_name}}</div>",
"bruteForceProtected": true,
"users": [
{% if keycloak.users is defined and keycloak.users is iterable %}
{% for user in keycloak.users %}
{
"username": "{{ user.username }}",
"enabled": true,
"credentials": [
{
"type": "password",
"value": "{{ user.password }}",
"temporary": false
}
],
"realmRoles": [
{% for realm_role in user.realm_roles %}
"{{ realm_role }}"{%- if not loop.last %},{% endif %}{{''}}
{% endfor %}
],
"clientRoles": {
"account": [
{% for account in user.client_roles.account %}
"{{ account }}"{%- if not loop.last %},{% endif %}{{''}}
{% endfor %}
]
}
},{% if not loop.last %}{% endif %}
{% endfor %}
{% endif %}
{
"username": "{{ keycloak.admin.username }}",
"enabled": true,
"credentials": [
{
"type": "password",
"value": "{{ keycloak.admin.password }}",
"temporary": false
}
],
"realmRoles": [
{% for realm_role in keycloak.admin.realm_roles %}
"{{ realm_role }}"{% if not loop.last %},{% endif %}{{''}}
{% endfor %}
],
"clientRoles": {
"realm-management": [
{% for realm_management in keycloak.admin.client_roles.realm_management %}
"{{ realm_management }}"{%- if not loop.last %},{% endif %}{{''}}
{% endfor %}
],
"account": [
{% for account in keycloak.admin.client_roles.account %}
"{{ account }}"{%- if not loop.last %},{% endif %}{{''}}
{% endfor %}
]
}
}
],
"roles": {
"realm": [
{% for role in keycloak.roles.realm %}
{
"name": "{{ role.name }}",
"description": "{{ role.name }}"
}{% if not loop.last %},{% endif %}
{% endfor %}
]
},
"defaultRoles": [
{% for role in keycloak.roles.default_roles %}
"{{ role }}"{% if not loop.last %},{% endif %}{{''}}
{% endfor %}
]
}

View File

@@ -0,0 +1,7 @@
docker_host_package_common_dependencies:
- nfs-common
apt_lock_files:
- /var/lib/dpkg/lock
- /var/lib/dpkg/lock-frontend
- /var/cache/apt/archives/lock

75
roles/edge_vps/README.md Normal file
View File

@@ -0,0 +1,75 @@
# Edge VPS
Configures edge VPS instances with WireGuard VPN, Traefik reverse proxy, Pangolin, and Elastic Fleet Agent.
## Requirements
- Docker and Docker Compose installed
- Ansible community.docker collection
## Role Variables
### WireGuard
| Variable | Default | Description |
|----------|---------|-------------|
| `edge_vps_wireguard_address` | `10.133.7.1/24` | WireGuard interface address |
| `edge_vps_wireguard_port` | `61975` | WireGuard listen port |
| `edge_vps_wireguard_interface` | `wg0` | WireGuard interface name |
| `edge_vps_wireguard_routes` | `[]` | List of routes to add (network, gateway) |
### Traefik
| Variable | Default | Description |
|----------|---------|-------------|
| `edge_vps_traefik_config_dir` | `/root/config/traefik` | Traefik config directory |
| `edge_vps_acme_email` | - | Email for Let's Encrypt |
### Pangolin
| Variable | Default | Description |
|----------|---------|-------------|
| `edge_vps_pangolin_dashboard_url` | - | Pangolin dashboard URL |
| `edge_vps_pangolin_base_endpoint` | - | Pangolin base endpoint |
| `edge_vps_pangolin_base_domain` | - | Base domain for Pangolin |
### Elastic Agent
| Variable | Default | Description |
|----------|---------|-------------|
| `edge_vps_elastic_version` | `9.2.2` | Elastic Agent version |
| `edge_vps_elastic_fleet_url` | - | Fleet server URL |
| `edge_vps_elastic_dns_server` | `10.43.0.10` | DNS server for agent |
## Secrets
Store secrets in `vars/group_vars/vps/secrets.yaml` (ansible-vault encrypted):
```yaml
vault_edge_vps:
wireguard:
private_key: "..."
peers: [...]
pangolin:
server_secret: "..."
traefik:
cloudflare_api_token: "..."
elastic:
fleet_enrollment_token: "..."
```
## Dependencies
None.
## Example Playbook
```yaml
- hosts: vps
roles:
- role: edge_vps
```
## License
MIT

View File

@@ -0,0 +1,11 @@
---
edge_vps_config_base: /root/config
edge_vps_wireguard_config_dir: /etc/wireguard
edge_vps_wireguard_interface: wg0
edge_vps_wireguard_address: "10.133.7.1/24"
edge_vps_wireguard_port: 61975
edge_vps_traefik_config_dir: "{{ edge_vps_config_base }}/traefik"
edge_vps_traefik_logs_dir: "{{ edge_vps_traefik_config_dir }}/logs"
edge_vps_pangolin_config_dir: "{{ edge_vps_config_base }}/pangolin"
edge_vps_elastic_config_dir: "{{ edge_vps_config_base }}/elastic-agent"
edge_vps_elastic_state_dir: /var/lib/elastic-agent/elastic-system/elastic-agent/state

Some files were not shown because too many files have changed in this diff Show More