ansible/docs/runbooks/arr-cleanup/findings.md
Tuan-Dat Tran 5b44c46e10 docs(arr-cleanup): improve runbook and fix api key paths
Rewrites findings.md with how-to section, cleaner summary tables,
and more detailed per-pass results. Fixes relative path for
sonarr/radarr API key files after runbook moved deeper in repo.
2026-04-27 21:39:28 +02:00

arr-stack Downloads Cleanup — Investigation Findings

Storage Layout (aya01)

| Device | FS | Size | Used | Mount |
| --- | --- | --- | --- | --- |
| /dev/sdc3 | btrfs | 1.9T | 177G (10%) | / (system) |
| /dev/sda1 | btrfs proxmox | 2.8T | 1.3T (48%) | /opt |
| /dev/sdd1 | ext4 | 17T | 15T (92%) | /mnt/hdd0 |
| /dev/sde1 | ext4 | 17T | 15T (92%) | /mnt/hdd2 |
| /dev/sdf1 | ext4 | 17T | 15T (92%) | /mnt/hdd1 |
| mergerfs | fuse | 49T | 43T (92%) | /media |

/media is a mergerfs union of hdd0 + hdd1 + hdd2. All three HDDs were at ~92% capacity before cleanup.

After cleanup (2026-04-23):

| Device | Used | Avail | Use% |
| --- | --- | --- | --- |
| /dev/sdd1 (hdd0) | 9.4T | 6.2T | 61% |
| /dev/sdf1 (hdd1) | 9.3T | 6.3T | 60% |
| /dev/sde1 (hdd2) | 7.8T | 7.8T | 51% |
| mergerfs /media | 27T | 21T | 57% |

~16T freed total (92% → 57% on the mergerfs pool).

/media Breakdown (before cleanup)

| Directory | Size |
| --- | --- |
| downloads | 22T |
| series | 16T |
| movies | 5T |

Zero hardlinked files exist anywhere across all three HDDs. Confirmed by two methods:

  1. Inspecting the Kubernetes manifests in argocd-homelab/services/arr-stack/
  2. Inode comparison of 1365 download/media file pairs — 0 shared inodes found (every file is a distinct copy)
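
The inode comparison in method 2 can be sketched as below. This is a minimal illustration rather than the check that was actually run; the function name `is_hardlinked` is hypothetical. Two paths are hardlinks of the same file only when both the device number and the inode number match.

```python
import os


def is_hardlinked(path_a: str, path_b: str) -> bool:
    """Return True if both paths refer to the same inode on the same device."""
    stat_a = os.stat(path_a)
    stat_b = os.stat(path_b)
    # Inode numbers are only unique per filesystem, so compare the device too.
    return (stat_a.st_dev, stat_a.st_ino) == (stat_b.st_dev, stat_b.st_ino)
```

Running such a check against the underlying /mnt/hdd* paths avoids depending on how the union layer reports inode numbers.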

All three services mount the mergerfs /media/ path via NFS:

sonarr:      NFS 192.168.20.12:/media/downloads  → /downloads
             NFS 192.168.20.12:/media/series     → /tv
radarr:      NFS 192.168.20.12:/media/downloads  → /downloads
             NFS 192.168.20.12:/media/movies     → /movies
qbit:        NFS 192.168.20.12:/media/downloads  → /downloads

mergerfs does not support hardlinks across underlying filesystems. When qBit downloads to /media/downloads/sonarr/ (landing on e.g. hdd1) and Sonarr imports to /media/series/ (landing on e.g. hdd0), the hardlink attempt crosses a physical disk boundary and falls back to a copy. Every import therefore doubles the data.
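
The link-then-copy fallback described above can be illustrated with a short sketch. This is not Sonarr's or Radarr's actual import code; `link_or_copy` is a hypothetical helper, and it assumes the cross-filesystem failure surfaces as `EXDEV`.

```python
import errno
import os
import shutil


def link_or_copy(src: str, dst: str) -> str:
    """Try to hardlink src to dst; fall back to a full copy across filesystems."""
    try:
        os.link(src, dst)
        return "hardlink"
    except OSError as exc:
        # EXDEV means src and dst live on different filesystems.
        if exc.errno != errno.EXDEV:
            raise
        shutil.copy2(src, dst)  # second full copy: this is the wasted space
        return "copy"
```

When the function returns "copy", the download and the imported file occupy space twice, which matches the duplication measured on the pool.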

Estimated wasted space before cleanup: ~21T (the entire downloads/sonarr + downloads/radarr).

How to Run

Prerequisites:

# Port-forward Sonarr and Radarr APIs
kubectl -n arr-stack port-forward svc/sonarr 8989:8989 &
kubectl -n arr-stack port-forward svc/radarr 7878:7878 &

API keys are loaded from ../../../../sonarr.api.env and ../../../../radarr.api.env, resolved relative to this runbook's directory (e.g. /home/tudattr/workspace/infra/sonarr.api.env).
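
With the port-forwards running, an API call might look like the sketch below. It assumes the v3 API of Sonarr and Radarr, which accepts the key in an `X-Api-Key` header; the helper names `build_request` and `fetch_json` are hypothetical, and how the scripts parse the .env files is not shown here, so the key is passed in as a plain string.

```python
import json
import urllib.request


def build_request(base_url: str, api_key: str, endpoint: str) -> urllib.request.Request:
    """Build an authenticated GET request for a Sonarr/Radarr v3 endpoint."""
    url = f"{base_url.rstrip('/')}/api/v3/{endpoint.lstrip('/')}"
    return urllib.request.Request(url, headers={"X-Api-Key": api_key})


def fetch_json(request: urllib.request.Request):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Example (requires the port-forward from the prerequisites):
# series = fetch_json(build_request("http://localhost:8989", sonarr_key, "series"))
# movies = fetch_json(build_request("http://localhost:7878", radarr_key, "movie"))
```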

Container path mappings used in scripts:

  • Sonarr: /tv/ → /media/series/
  • Radarr: /movies/ → /media/movies/
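
A minimal sketch of this translation, assuming container /tv corresponds to host /media/series and container /movies to host /media/movies. The names `CONTAINER_TO_HOST` and `to_host_path` are illustrative and not taken from the scripts.

```python
from pathlib import PurePosixPath

# Container mount point -> host directory, per service.
CONTAINER_TO_HOST = {
    "sonarr": ("/tv", "/media/series"),
    "radarr": ("/movies", "/media/movies"),
}


def to_host_path(arr: str, container_path: str) -> str:
    """Translate a path reported by the API into the path on the host."""
    container_root, host_root = CONTAINER_TO_HOST[arr]
    # relative_to raises ValueError if the path is outside the mount point.
    relative = PurePosixPath(container_path).relative_to(container_root)
    return str(PurePosixPath(host_root) / relative)
```

For example, `to_host_path("sonarr", "/tv/Bleach (2004)/Season 01")` yields `/media/series/Bleach (2004)/Season 01`.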

Step 1 — Verify (generates /tmp/arr_verified.json)

python3 verify.py

Cross-references all downloads against the Sonarr/Radarr APIs and verifies over SSH that the reported file paths exist on disk. Classifies each entry as safe, not_imported, or path_missing.
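
The three-way classification could be expressed roughly as follows. How verify.py actually decides is not shown in this document, so treat this as a sketch: `classify` is a hypothetical function, `host_path` stands for the API-reported path after mapping to the host (or `None` when the API reports no imported file), and `path_exists` is any callable that checks the disk, for instance over SSH.

```python
from typing import Callable, Optional


def classify(host_path: Optional[str], path_exists: Callable[[str], bool]) -> str:
    """Classify one download as safe, not_imported, or path_missing."""
    if host_path is None:
        # The API knows the title but reports no imported file for it.
        return "not_imported"
    if not path_exists(host_path):
        # The API claims an import, but nothing is at that path on disk.
        return "path_missing"
    return "safe"
```

Only entries classified as safe are candidates for deletion; the other two outcomes keep the download in place.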

Step 2 — Delete confirmed-imported downloads

python3 cleanup.py --dry-run          # preview
python3 cleanup.py --arr sonarr --yes
python3 cleanup.py --arr radarr --yes

Step 3 — Delete orphans (downloads not in Sonarr at all)

python3 cleanup-orphans.py --dry-run  # preview
python3 cleanup-orphans.py --yes

All actions are logged to cleanup.log with UTC timestamp, size, title, path, and outcome.
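
The exact line format of cleanup.log is not documented here. The sketch below shows one possible tab-separated layout containing the listed fields; `format_log_line` and the field order are assumptions made for illustration.

```python
from datetime import datetime, timezone
from typing import Optional


def format_log_line(size_bytes: int, title: str, path: str, outcome: str,
                    now: Optional[datetime] = None) -> str:
    """Format one log entry: UTC timestamp, size, title, path, outcome."""
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    timestamp = moment.strftime("%Y-%m-%dT%H:%M:%SZ")
    size_gib = size_bytes / 1024**3
    return f"{timestamp}\t{size_gib:.1f}G\t{title}\t{path}\t{outcome}"
```

Writing the timestamp in UTC keeps entries comparable regardless of the timezone of the machine running the script.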

Cleanup Performed (2026-04-23)

Pass 1 — Orphans (downloads not in Sonarr)

Script: cleanup-orphans.py

Two-pass matching, followed by deletion:

  1. Match each download name against Sonarr API (title, slug, sortTitle, alternate titles, partial match)
  2. If no API match, check if a series directory with a similar name exists in /media/series/ — if it does, skip (needs manual review)
  3. Delete remaining true orphans
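
The decision flow above can be sketched as follows. This is not the code of cleanup-orphans.py: `normalize` and `decide` are hypothetical names, and the substring-based partial match is an assumed stand-in for whatever matching the script performs.

```python
import re


def normalize(name: str) -> str:
    """Drop bracketed suffixes such as (2004) or {imdb-...}, keep letters and digits."""
    stripped = re.sub(r"[\(\{\[][^\)\}\]]*[\)\}\]]", "", name)
    return "".join(ch for ch in stripped.lower() if ch.isalnum())


def decide(download_name: str, api_titles: list, series_dirs: list) -> str:
    """Return 'matched', 'skip' (manual review), or 'delete' for one download."""
    key = normalize(download_name)
    # Pass 1: exact or partial match against titles known to the API.
    for title in api_titles:
        candidate = normalize(title)
        if candidate and candidate in key:
            return "matched"
    # Pass 2: a similarly named series directory on disk means manual review.
    for directory in series_dirs:
        candidate = normalize(directory)
        if candidate and candidate in key:
            return "skip"
    # Neither the API nor the disk knows this download: a true orphan.
    return "delete"
```

Loose substring matching produces false positives for short titles, but both kinds of false positive lead to keeping a download rather than deleting it, which is the safer failure mode.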

Result: 49 deleted, 461.6G freed, 0 failed

111 entries SKIPPED (series dir found on disk) — includes Bleach, House, Lucifer, You, SpongeBob, Detective Conan episodes, What If, etc. See cleanup.log for full list.

Notable orphans deleted:

  • Game of Thrones S01-S08 (~267G), removed from Sonarr
  • Sex Education S01-S04 (~110G), removed from Sonarr
  • Love Death & Robots (multiple duplicate copies, ~45G)
  • Senpai is an Otokonoko, Wind Breaker, Wistoria, Hibike! Euphonium S3 episodes, etc.

Pass 2 — Confirmed-imported Sonarr downloads

Script: cleanup.py --arr sonarr --yes

Deleted downloads where Sonarr confirmed episodeFileCount > 0 AND the series directory was verified to exist on disk.

Result: 1106 deleted, 0 failed

Pass 3 — Confirmed-imported Radarr downloads

Script: cleanup.py --arr radarr --yes

Deleted downloads where Radarr confirmed hasFile=True AND the file/directory path was verified to exist on disk.

Result: 259 deleted, 0 failed

Summary

| Pass | Script | Entries | Space freed |
| --- | --- | --- | --- |
| Orphans | cleanup-orphans.py | 49 | ~461G |
| Sonarr imports | cleanup.py --arr sonarr | 1106 | ~12T (estimated) |
| Radarr imports | cleanup.py --arr radarr | 259 | ~4T (estimated) |
| Total | | 1414 | ~16T |

Verification Results (from verify.py run before cleanup)

| | Safe to delete | Not imported | Path missing | Orphans (no API match) |
| --- | --- | --- | --- | --- |
| Sonarr (1439 downloads) | 1106 | | | 333 |
| Radarr (289 downloads) | 265 | | | 25 |

Note: cleanup-orphans.py uses more aggressive title matching (alternate titles, partial match) than verify.py, so its orphan count (160 not-in-Sonarr out of 1438) is lower than verify.py's 333.

Radarr Orphans (25) — not matched, not deleted

  • Constantine (2005)
  • Cowboy Bebop: Knockin' on Heaven's Door (2001)
  • Les Misérables (2012)
  • Pokémon Detective Pikachu (2019)
  • Code Geass: Fukkatsu no Lelouch (2019)
  • Eiga Go-Toubun no Hanayome (2022)
  • Gisaengchung / Parasite — Korean title, matching failure
  • Dune: Part One (2021) — matching failure, confirmed in Radarr
  • Harry Potter older/duplicate copies — matching failure
  • Porco Rosso / Kurenai no buta — matching failure
  • Castle in the Sky / Laputa — matching failure
  • Steins;Gate: The Movie — matching failure
  • Project Silence / Talchul — matching failure
  • Digimon: Frontier & Savers films
  • One Piece films (several)
  • Paripi Koumei movie
  • Fantastic Four (2025) extra copies (3)
  • JJK DCP trailer file

Path mismatch entries (confirmed safe, deleted anyway)

  • Star Wars Episode IV/V/VI/IX — all matched to Episode IV record; manually confirmed all 4 dirs exist
  • WALL·E — · middle-dot (U+00B7) broke string comparison; file confirmed on disk
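
Comparison failures like the WALL·E case, and several of the accented titles in the orphan list, can be avoided by comparing normalized keys instead of raw strings. A minimal sketch, assuming the comparison only needs to be tolerant of punctuation, case, and diacritics; `comparable` is a hypothetical helper, not part of the existing scripts.

```python
import unicodedata


def comparable(title: str) -> str:
    """Reduce a title to lowercase letters and digits for tolerant comparison."""
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", title)
    # Dropping everything non-alphanumeric removes the combining marks as well
    # as punctuation such as the middle dot (U+00B7), hyphens, and spaces.
    return "".join(ch for ch in decomposed.casefold() if ch.isalnum())
```

With this key, "WALL·E" and "WALL-E" both reduce to "walle", and "Les Misérables" reduces to "lesmiserables".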

Pending Decisions

Bleach USBD Remux TL (1.8T)

/media/downloads/sonarr/Bleach USBD Remux TL: full lossless Blu-ray remux of S00-S16 (-ZR- group).

Currently SKIPPED — /media/series/Bleach (2004) {imdb-tt0434665}/ exists (310G imported).

Most seasons were imported from lighter x265 Bluray packs (Bleach S0x Bluray EAC3 2.0 1080p x265-iVy) rather than this remux. S11 has no imported content. S13 and S14 partially imported.

Options:

  • Delete — free 1.8T, imported x265 content stays, re-download at remux quality later if desired
  • Keep — retain as source for Sonarr to import remaining episodes at lossless quality now that disk space is freed

Per-season breakdown saved in memory.

SKIPPED downloads (111 Sonarr entries)

Downloads where a matching series directory exists on disk but the series is not in Sonarr. Likely intentionally removed series (House, Lucifer, You, Black Clover, etc.) with leftover download copies. Needs manual review per series before deleting.

Permanent Fix (not applied)

Mount per-HDD NFS paths instead of the mergerfs path, so qBit downloads and arr imports land on the same physical filesystem, enabling hardlinks:

# In sonarr/radarr/qtun deployments, change:
path: /media/downloads  →  path: /mnt/hdd0/downloads
path: /media/series     →  path: /mnt/hdd0/series
path: /media/movies     →  path: /mnt/hdd0/movies

Jellyfin/Plex keep reading from /media/ (mergerfs union). New imports hardlink within hdd0, wasting no extra space.

Tradeoff: all new content lands on hdd0 only. Load balancing across the three disks stops working for new downloads. Once hdd0 fills up, a migration strategy is needed.