docs(arr-cleanup): improve runbook and fix api key paths
Rewrites findings.md with how-to section, cleaner summary tables, and more detailed per-pass results. Fixes relative path for sonarr/radarr API key files after runbook moved deeper in repo.
@@ -32,7 +32,7 @@ SERIES_ROOT = "/media/series"
 script_dir = os.path.dirname(os.path.abspath(__file__))
 LOG_FILE = os.path.join(script_dir, "cleanup.log")
 
-with open(os.path.join(script_dir, '..', 'sonarr.api.env')) as f:
+with open(os.path.join(script_dir, '../../../..', 'sonarr.api.env')) as f:
     SONARR_KEY = f.read().strip()
@@ -34,7 +34,9 @@

## Root Cause: No Hardlinks → All Imports Are Copies

Zero hardlinked files exist anywhere across all three HDDs. Confirmed by inspecting the Kubernetes manifests in `argocd-homelab/services/arr-stack/` and by inode comparison of 1365 download/media file pairs (0 shared inodes found).
Zero hardlinked files exist anywhere across all three HDDs. Confirmed by two methods:
1. Inspecting the Kubernetes manifests in `argocd-homelab/services/arr-stack/`
2. Inode comparison of 1365 download/media file pairs — **0 shared inodes found** (every file is a distinct copy)
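
The inode comparison mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the script used for the 1365-pair check; `shares_inode` and `count_shared` are hypothetical helper names, and the paths are assumed to be readable on the host that runs it.

```python
import os


def shares_inode(path_a: str, path_b: str) -> bool:
    """Return True if both paths are hardlinks to the same file."""
    a, b = os.stat(path_a), os.stat(path_b)
    # Two names refer to one file only if device AND inode number both match.
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


def count_shared(pairs) -> int:
    """Count (download, media) path pairs that share an inode."""
    return sum(1 for download, media in pairs if shares_inode(download, media))
```

A result of 0 from `count_shared` over all download/media pairs would correspond to the "0 shared inodes found" finding.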

**All three services mount the mergerfs `/media/` path via NFS:**

@@ -48,63 +50,106 @@ qbit: NFS 192.168.20.12:/media/downloads → /downloads

mergerfs does not support hardlinks across underlying filesystems. When qBit downloads to `/media/downloads/sonarr/` (lands on e.g. hdd1) and Sonarr imports to `/media/series/` (lands on e.g. hdd0), the hardlink attempt crosses a physical disk boundary → falls back to copy. Every import doubles the data.
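
A minimal sketch of the hardlink-then-copy fallback described above. It assumes the failed link surfaces as `EXDEV` (cross-device link), which is how a hardlink across filesystems normally fails; `link_or_copy` is a hypothetical helper and not Sonarr's or Radarr's actual import code.

```python
import errno
import os
import shutil


def link_or_copy(src: str, dst: str) -> str:
    """Try to hardlink src to dst; fall back to a full copy across filesystems."""
    try:
        os.link(src, dst)
        return "hardlink"  # no extra disk space used
    except OSError as exc:
        if exc.errno != errno.EXDEV:
            raise  # unrelated failure (permissions, missing file, ...)
        shutil.copy2(src, dst)  # second physical copy: the data is doubled
        return "copy"
```

When source and destination sit on the same filesystem the function returns `"hardlink"`; the `"copy"` branch is the one every import in this setup ends up taking.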

**Estimated wasted space before cleanup: ~21T** (the entire downloads/sonarr + downloads/radarr).

## How to Run

Prerequisites:
```bash
# Port-forward Sonarr and Radarr APIs
kubectl -n arr-stack port-forward svc/sonarr 8989:8989 &
kubectl -n arr-stack port-forward svc/radarr 7878:7878 &
```

API keys are loaded from `../../../../sonarr.api.env` and `../../../../radarr.api.env`, relative to the script directory
(e.g. `/home/tudattr/workspace/infra/sonarr.api.env` for the Sonarr key).

Container path mappings used in scripts:
- Sonarr: `/tv/` → `/media/series/`
- Radarr: `/movies/` → `/media/movies/`

### Step 1 — Verify (generates `/tmp/arr_verified.json`)
```bash
python3 verify.py
```
Cross-references all downloads against the Sonarr/Radarr APIs and verifies over SSH that the reported file paths exist on disk. Classifies each entry as `safe`, `not_imported`, or `path_missing`.
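
The three-way classification can be pictured roughly as follows. This is a sketch under assumptions, not the logic of `verify.py` itself: `classify` is a hypothetical function, `imported` stands for the API check (`hasFile=True` for Radarr, `episodeFileCount > 0` for Sonarr) and `path_exists` for the result of the SSH disk check.

```python
def classify(imported: bool, path_exists: bool) -> str:
    """Map the two checks onto one of the three labels."""
    if not imported:
        return "not_imported"  # the API reports no imported file for this entry
    if not path_exists:
        return "path_missing"  # the API reports a file, but it is not on disk
    return "safe"              # both checks passed, so the download copy is redundant
```

Only entries labelled `safe` would be candidates for deletion in the next step.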

### Step 2 — Delete confirmed-imported downloads
```bash
python3 cleanup.py --dry-run # preview
python3 cleanup.py --arr sonarr --yes
python3 cleanup.py --arr radarr --yes
```

### Step 3 — Delete orphans (downloads not in Sonarr at all)
```bash
python3 cleanup-orphans.py --dry-run # preview
python3 cleanup-orphans.py --yes
```

All actions are logged to `cleanup.log` with UTC timestamp, size, title, path, and outcome.
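
For orientation, a log entry with those five fields could be produced like this. The tab-separated layout, the `G` size unit, and the name `format_log_line` are assumptions made for illustration; the real `cleanup.log` format is not shown in this commit.

```python
from datetime import datetime, timezone


def format_log_line(size_bytes, title, path, outcome, now=None):
    """Build one tab-separated log line: timestamp, size, title, path, outcome."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y-%m-%dT%H:%M:%SZ")  # UTC, ISO 8601 style
    size_gib = size_bytes / 1024**3             # bytes to GiB
    return f"{stamp}\t{size_gib:.1f}G\t{title}\t{path}\t{outcome}"
```

Passing `now` explicitly keeps the function deterministic, which makes it easy to test.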

## Cleanup Performed (2026-04-23)

Three passes using the scripts in this directory:

### Pass 1 — Orphans (not in Sonarr at all)
### Pass 1 — Orphans (downloads not in Sonarr)
Script: `cleanup-orphans.py`

Deleted 49 entries totalling **461.6G** — downloads with no matching Sonarr series and no series directory on disk. Includes Game of Thrones (all 8 seasons), Sex Education (all 4 seasons), Love Death & Robots (multiple duplicate copies), and various anime episode files.
Two checks, then deletion:
1. Match each download name against Sonarr API (title, slug, sortTitle, alternate titles, partial match)
2. If no API match, check if a series directory with a similar name exists in `/media/series/` — if it does, skip (needs manual review)
3. Delete remaining true orphans
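
The steps above can be sketched as a small classifier. Everything here is illustrative: the function names are hypothetical, the series fields (`title`, `titleSlug`, `sortTitle`, `alternateTitles`) are assumed to follow the Sonarr series resource, and "similar name" is approximated by a whole-word match on the directory title, which may differ from what `cleanup-orphans.py` actually does.

```python
import re


def normalize(name: str) -> str:
    """Lowercase and collapse every non-alphanumeric run into one space."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


def contains_title(download_name: str, title: str) -> bool:
    """Whole-word partial match, so a short title like "You" does not hit "Young"."""
    key = normalize(title)
    return bool(key) and f" {key} " in f" {normalize(download_name)} "


def dir_title(dirname: str) -> str:
    """Drop a trailing "(year) {id}" suffix from a series directory name."""
    return re.sub(r"\s*\(\d{4}\).*$", "", dirname)


def classify_download(name, series_list, series_dirs):
    # Check 1: any title field of any Sonarr series (field names assumed).
    for series in series_list:
        titles = [series.get(key, "") for key in ("title", "titleSlug", "sortTitle")]
        titles += [alt.get("title", "") for alt in series.get("alternateTitles", [])]
        if any(contains_title(name, title) for title in titles):
            return "in_sonarr"
    # Check 2: a similarly named series directory exists, so leave it for manual review.
    if any(contains_title(name, dir_title(d)) for d in series_dirs):
        return "skipped"
    # Neither check matched: a true orphan.
    return "orphan"
```

With this shape, only downloads that come back as `"orphan"` would be deleted; `"skipped"` corresponds to the entries left for manual review.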

111 entries were SKIPPED (series dir found on disk, needs manual review) — includes Bleach, House, Lucifer, You, Detective Conan episodes, What If, etc. See cleanup.log for full list.
Result: **49 deleted, 461.6G freed, 0 failed**

111 entries SKIPPED (series dir found on disk) — includes Bleach, House, Lucifer, You, SpongeBob, Detective Conan episodes, What If, etc. See `cleanup.log` for full list.

Notable orphans deleted:
- Game of Thrones S01–S08 (~267G) — removed from Sonarr
- Sex Education S01–S04 (~110G) — removed from Sonarr
- Love Death & Robots (multiple duplicate copies, ~45G)
- Senpai is an Otokonoko, Wind Breaker, Wistoria, Hibike! Euphonium S3 episodes, etc.

### Pass 2 — Confirmed-imported Sonarr downloads
Script: `cleanup.py --arr sonarr`
Script: `cleanup.py --arr sonarr --yes`

Deleted **1106 entries**, 0 failed. These were downloads where Sonarr confirmed `episodeFileCount > 0` AND the series directory was verified to exist on disk at the time of `verify.py` run.
Deleted downloads where Sonarr confirmed `episodeFileCount > 0` AND the series directory was verified to exist on disk.

Result: **1106 deleted, 0 failed**

### Pass 3 — Confirmed-imported Radarr downloads
Script: `cleanup.py --arr radarr`
Script: `cleanup.py --arr radarr --yes`

Deleted **259 entries**, 0 failed. These were downloads where Radarr confirmed `hasFile=True` AND the file/directory path was verified to exist on disk.
Deleted downloads where Radarr confirmed `hasFile=True` AND the file/directory path was verified to exist on disk.

### Totals
| Pass | Entries | Space |
|------|---------|-------|
| Orphans (cleanup-orphans.py) | 49 | ~461G |
| Sonarr imports (cleanup.py) | 1106 | ~12T (estimated) |
| Radarr imports (cleanup.py) | 259 | ~4T (estimated) |
| **Total** | **1414** | **~16T freed** |
Result: **259 deleted, 0 failed**

All deletions logged to `cleanup.log` with UTC timestamp, size, title, path, outcome.
### Summary
| Pass | Script | Entries | Space freed |
|------|--------|---------|-------------|
| Orphans | `cleanup-orphans.py` | 49 | ~461G |
| Sonarr imports | `cleanup.py --arr sonarr` | 1106 | ~12T (estimated) |
| Radarr imports | `cleanup.py --arr radarr` | 259 | ~4T (estimated) |
| **Total** | | **1414** | **~16T** |

## Verification Results (via API + disk path check)
## Verification Results (from verify.py run before cleanup)

API keys stored in `../sonarr.api.env` and `../radarr.api.env`.
Access via `kubectl -n arr-stack port-forward svc/sonarr 8989:8989` and `svc/radarr 7878:7878`.
| | Safe to delete | Not imported | Path missing | Orphans (no API match) |
|---|---|---|---|---|
| **Sonarr** (1439 downloads) | 1106 | — | — | 333 |
| **Radarr** (289 downloads) | 265 | — | — | 25 |

Container path mappings:
- Sonarr `/tv/` → `/media/series/`
- Radarr `/movies/` → `/media/movies/`
Note: `cleanup-orphans.py` uses more aggressive title matching (alternate titles, partial match) than `verify.py`, so its orphan count (160 not-in-Sonarr out of 1438) is lower than `verify.py`'s 333.

| | Safe to delete | Orphans (not in arr) | Keep |
|---|---|---|---|
| **Radarr** (289 items, ~5.2T) | **265** | 25 | 0 |
| **Sonarr** (1439 items, ~17T) | **1106** | 333 | 0 |

"Safe to delete" = API confirms `hasFile=True` (Radarr) or `episodeFileCount > 0` (Sonarr), AND the reported file/directory path was verified to exist on disk via SSH.

### Radarr Orphans (25) — not matched in Radarr, not deleted
### Radarr Orphans (25) — not matched, not deleted
- Constantine (2005)
- Cowboy Bebop: Knockin' on Heaven's Door (2001)
- Les Misérables (2012)
- Pokémon Detective Pikachu (2019)
- Code Geass: Fukkatsu no Lelouch (2019)
- Eiga Go-Toubun no Hanayome (2022)
- Gisaengchung / Parasite (Korean title — matching failure)
- Dune: Part One (2021) — matching failure, is in Radarr
- Harry Potter (older/duplicate copies — matching failure)
- Gisaengchung / Parasite — Korean title, matching failure
- Dune: Part One (2021) — matching failure, confirmed in Radarr
- Harry Potter older/duplicate copies — matching failure
- Porco Rosso / Kurenai no buta — matching failure
- Castle in the Sky / Laputa — matching failure
- Steins;Gate: The Movie — matching failure
@@ -115,32 +160,41 @@ Container path mappings:
- Fantastic Four (2025) extra copies (3)
- JJK DCP trailer file

### 6 Radarr "path mismatch" entries (all confirmed safe, deleted)
Flagged due to path comparison artifacts, manually verified on disk:
- Star Wars Episode IV/V/VI/IX — each is a separate Radarr entry; all directories exist
- WALL·E — `·` middle-dot character caused comparison failure; file exists
### Path mismatch entries (confirmed safe, deleted anyway)
- Star Wars Episode IV/V/VI/IX — all matched to Episode IV record; manually confirmed all 4 dirs exist
- WALL·E — `·` middle-dot (U+00B7) broke string comparison; file confirmed on disk
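
One way to make such comparisons tolerant of punctuation like the middle dot is to compare a reduced key instead of the raw string. The sketch below is an assumption about how this could be handled, not what `verify.py` does; `comparison_key` and `same_title` are hypothetical names. It would not help with the Star Wars case, which is a record-matching problem rather than a character problem.

```python
import unicodedata


def comparison_key(name: str) -> str:
    """Reduce a title to case-folded letters and digits only."""
    # NFKD splits accented letters into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", name)
    # Keep letters/digits; this drops punctuation such as U+00B7 and combining marks.
    return "".join(ch for ch in decomposed if ch.isalnum()).casefold()


def same_title(a: str, b: str) -> bool:
    """Compare two titles while ignoring punctuation, accents, spacing and case."""
    return comparison_key(a) == comparison_key(b)
```

The tradeoff is that titles differing only in punctuation collapse to the same key, so this suits a "does the file exist" check better than a strict identity check.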

## Pending Decisions

### Bleach USBD Remux TL (1.8T)
`/media/downloads/sonarr/Bleach USBD Remux TL` — full lossless Bluray remux S00–S16 (-ZR- group).
Currently in SKIPPED (series dir `/media/series/Bleach (2004) {imdb-tt0434665}/` exists, 310G imported).
Most seasons were imported from x265 Bluray packs (-iVy group) rather than from this remux.
S11 has no imported content at all. S13, S14 partially imported.
Decision: keep (for quality imports once disk freed) or delete (free 1.8T, accept x265 quality).
See memory file for full per-season breakdown.

Currently SKIPPED — `/media/series/Bleach (2004) {imdb-tt0434665}/` exists (310G imported).

Most seasons were imported from lighter x265 Bluray packs (`Bleach S0x Bluray EAC3 2.0 1080p x265-iVy`) rather than this remux. S11 has no imported content. S13 and S14 partially imported.

Options:
- **Delete** — free 1.8T, imported x265 content stays, re-download at remux quality later if desired
- **Keep** — retain as source for Sonarr to import remaining episodes at lossless quality now that disk space is freed

Per-season breakdown saved in memory.

### SKIPPED downloads (111 Sonarr entries)
Downloads where the series directory exists on disk but the series is not currently in Sonarr.
Likely removed series (House, Lucifer, You, Black Clover, etc.) or ongoing shows with stale episodes.
These need manual review — series may have been intentionally removed from Sonarr.
Downloads where a matching series directory exists on disk but the series is not in Sonarr.
Likely intentionally removed series (House, Lucifer, You, Black Clover, etc.) with leftover download copies.
Needs manual review per series before deleting.

## Permanent Fix (not applied)

Mount per-HDD NFS paths instead of the mergerfs path, so qBit downloads and arr imports land on the same physical filesystem, enabling hardlinks:

## Fix (not applied — future reference)
Mount per-HDD NFS paths instead of the mergerfs path, so downloads and media share the same physical filesystem and hardlinks work:
```yaml
# sonarr/radarr/qtun deployments — change NFS path from:
path: /media/downloads → path: /mnt/hdd0/downloads
path: /media/series → path: /mnt/hdd0/series
path: /media/movies → path: /mnt/hdd0/movies
# In sonarr/radarr/qtun deployments, change:
path: /media/downloads → path: /mnt/hdd0/downloads
path: /media/series → path: /mnt/hdd0/series
path: /media/movies → path: /mnt/hdd0/movies
```
Jellyfin/Plex continue reading from `/media/` (mergerfs). New imports hardlink within hdd0.

Jellyfin/Plex keep reading from `/media/` (mergerfs union). New imports hardlink within hdd0, wasting no extra space.

Tradeoff: all new content lands on hdd0 only. Load balancing across the three disks stops working for new downloads. Once hdd0 fills up a migration strategy is needed.
@@ -8,7 +8,7 @@ Requirements:
 kubectl -n arr-stack port-forward svc/sonarr 8989:8989
 kubectl -n arr-stack port-forward svc/radarr 7878:7878
 - SSH access to aya01
-- API keys in ../sonarr.api.env and ../radarr.api.env
+- API keys in ../../../../sonarr.api.env and ../../../../radarr.api.env
 
 Output:
 /tmp/arr_verified.json — full structured results for use by cleanup.py
@@ -28,7 +28,7 @@ SSH_HOST = "aya01"
 script_dir = os.path.dirname(os.path.abspath(__file__))
 
 def load_key(filename):
-    path = os.path.join(script_dir, '..', filename)
+    path = os.path.join(script_dir, '../../../..', filename)
     return open(path).read().strip()
 
 SONARR_KEY = load_key('sonarr.api.env')
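
Because the key location depends on how deep the scripts sit in the repo, a variant of the loader that fails with a clear message may be easier to debug after future moves. This is a sketch only, not part of the commit: the `script_dir` and `levels_up` parameters are assumptions introduced here so the function can be exercised outside the repo.

```python
from pathlib import Path


def load_key(filename: str, script_dir: Path, levels_up: int = 4) -> str:
    """Read an API key file stored `levels_up` directories above script_dir."""
    # parents[0] is one level up, so four levels up is parents[3].
    base = script_dir.resolve().parents[levels_up - 1]
    key_path = base / filename
    if not key_path.is_file():
        raise FileNotFoundError(f"API key file not found: {key_path}")
    return key_path.read_text().strip()
```

In a script this could be called as `load_key("sonarr.api.env", Path(__file__).parent)`, which mirrors the `'../../../..'` join in the diff above.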