docs(arr-cleanup): improve runbook and fix api key paths
Rewrites findings.md with how-to section, cleaner summary tables, and more detailed per-pass results. Fixes relative path for sonarr/radarr API key files after runbook moved deeper in repo.
@@ -32,7 +32,7 @@ SERIES_ROOT = "/media/series"
|
|||||||
script_dir = os.path.dirname(os.path.abspath(__file__))
|
script_dir = os.path.dirname(os.path.abspath(__file__))
|
||||||
LOG_FILE = os.path.join(script_dir, "cleanup.log")
|
LOG_FILE = os.path.join(script_dir, "cleanup.log")
|
||||||
|
|
||||||
with open(os.path.join(script_dir, '..', 'sonarr.api.env')) as f:
|
with open(os.path.join(script_dir, '../../../..', 'sonarr.api.env')) as f:
|
||||||
SONARR_KEY = f.read().strip()
|
SONARR_KEY = f.read().strip()
|
||||||
|
|
||||||
|
|
||||||
|
|||||||
@@ -34,7 +34,9 @@
|
|||||||
|
|
||||||
## Root Cause: No Hardlinks → All Imports Are Copies
|
## Root Cause: No Hardlinks → All Imports Are Copies
|
||||||
|
|
||||||
Zero hardlinked files exist anywhere across all three HDDs. Confirmed by inspecting the Kubernetes manifests in `argocd-homelab/services/arr-stack/` and by inode comparison of 1365 download/media file pairs (0 shared inodes found).
|
Zero hardlinked files exist anywhere across all three HDDs. Confirmed by two methods:
|
||||||
|
1. Inspecting the Kubernetes manifests in `argocd-homelab/services/arr-stack/`
|
||||||
|
2. Inode comparison of 1365 download/media file pairs — **0 shared inodes found** (every file is a distinct copy)
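The inode comparison in step 2 can be sketched roughly as follows. This is only an illustration: the function names are hypothetical, and the script that actually produced the 1365-pair result is not shown in this commit. Two paths are hardlinks of the same file only when both the device and the inode number match.

```python
import os

def is_hardlinked(path_a: str, path_b: str) -> bool:
    """Return True when both paths point at the same inode on the same device."""
    a, b = os.stat(path_a), os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)

def count_shared_inodes(pairs) -> int:
    """Count (download, media) path pairs that are hardlinks of each other."""
    return sum(1 for download, media in pairs if is_hardlinked(download, media))
```

A result of 0 from `count_shared_inodes` over every download/media pair would correspond to the "0 shared inodes" finding above.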
|
||||||
|
|
||||||
**All three services mount the mergerfs `/media/` path via NFS:**
|
**All three services mount the mergerfs `/media/` path via NFS:**
|
||||||
|
|
||||||
@@ -48,63 +50,106 @@ qbit: NFS 192.168.20.12:/media/downloads → /downloads
|
|||||||
|
|
||||||
mergerfs does not support hardlinks across underlying filesystems. When qBit downloads to `/media/downloads/sonarr/` (lands on e.g. hdd1) and Sonarr imports to `/media/series/` (lands on e.g. hdd0), the hardlink attempt crosses a physical disk boundary → falls back to copy. Every import doubles the data.
|
mergerfs does not support hardlinks across underlying filesystems. When qBit downloads to `/media/downloads/sonarr/` (lands on e.g. hdd1) and Sonarr imports to `/media/series/` (lands on e.g. hdd0), the hardlink attempt crosses a physical disk boundary → falls back to copy. Every import doubles the data.
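The link-then-copy fallback described above can be illustrated with a short sketch. It is not Sonarr's or Radarr's actual import code (which is not part of this repo); `link_or_copy` is a hypothetical helper, and the sketch assumes the failed hardlink surfaces as `EXDEV` ("Invalid cross-device link"), which is the usual error for a link across filesystems.

```python
import errno
import os
import shutil

def link_or_copy(src: str, dst: str) -> str:
    """Try a hardlink first; fall back to a full copy when the link crosses filesystems."""
    try:
        os.link(src, dst)
        return "hardlink"
    except OSError as exc:
        # Only the cross-device case is handled here; anything else is re-raised.
        if exc.errno != errno.EXDEV:
            raise
        shutil.copy2(src, dst)  # the copy consumes the file's size a second time
        return "copy"
```

When download and destination sit on different branches of the pool, every call takes the `"copy"` path, which is why each import doubles the data.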
|
||||||
|
|
||||||
|
**Estimated wasted space before cleanup: ~21T** (the entire downloads/sonarr + downloads/radarr).
|
||||||
|
|
||||||
|
## How to Run
|
||||||
|
|
||||||
|
Prerequisites:
|
||||||
|
```bash
|
||||||
|
# Port-forward Sonarr and Radarr APIs
|
||||||
|
kubectl -n arr-stack port-forward svc/sonarr 8989:8989 &
|
||||||
|
kubectl -n arr-stack port-forward svc/radarr 7878:7878 &
|
||||||
|
```
|
||||||
|
|
||||||
|
API keys are loaded from `../../../../sonarr.api.env` and `../../../../radarr.api.env`
|
||||||
|
(resolved relative to the script directory; for Sonarr this is `/home/tudattr/workspace/infra/sonarr.api.env`).
|
||||||
|
|
||||||
|
Container path mappings used in scripts:
|
||||||
|
- Sonarr: `/tv/` → `/media/series/`
|
||||||
|
- Radarr: `/movies/` → `/media/movies/`
|
||||||
|
|
||||||
|
### Step 1 — Verify (generates `/tmp/arr_verified.json`)
|
||||||
|
```bash
|
||||||
|
python3 verify.py
|
||||||
|
```
|
||||||
|
Cross-references all downloads against the Sonarr/Radarr APIs and verifies over SSH that the reported file paths exist on disk. Classifies each entry as `safe`, `not_imported`, or `path_missing`.
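As a rough sketch of the three-way classification, assuming the import check runs before the path check (the real ordering inside `verify.py` is not shown here), the decision could look like this. The helper names and the flat dictionary layout are hypothetical; only the `hasFile` and `episodeFileCount` fields come from this runbook.

```python
def sonarr_imported(series: dict) -> bool:
    """Sonarr side: at least one episode file has been imported."""
    return series.get("episodeFileCount", 0) > 0

def radarr_imported(movie: dict) -> bool:
    """Radarr side: the movie record reports an imported file."""
    return bool(movie.get("hasFile", False))

def classify(imported: bool, path_exists: bool) -> str:
    """Map the API result and the on-disk check to one of the three labels."""
    if not imported:
        return "not_imported"
    if not path_exists:
        return "path_missing"
    return "safe"
```

Only entries labelled `safe` would be candidates for deletion in Step 2.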
|
||||||
|
|
||||||
|
### Step 2 — Delete confirmed-imported downloads
|
||||||
|
```bash
|
||||||
|
python3 cleanup.py --dry-run # preview
|
||||||
|
python3 cleanup.py --arr sonarr --yes
|
||||||
|
python3 cleanup.py --arr radarr --yes
|
||||||
|
```
|
||||||
|
|
||||||
|
### Step 3 — Delete orphans (downloads not in Sonarr at all)
|
||||||
|
```bash
|
||||||
|
python3 cleanup-orphans.py --dry-run # preview
|
||||||
|
python3 cleanup-orphans.py --yes
|
||||||
|
```
|
||||||
|
|
||||||
|
All actions are logged to `cleanup.log` with UTC timestamp, size, title, path, and outcome.
|
||||||
|
|
||||||
## Cleanup Performed (2026-04-23)
|
## Cleanup Performed (2026-04-23)
|
||||||
|
|
||||||
Three passes using the scripts in this directory:
|
### Pass 1 — Orphans (downloads not in Sonarr)
|
||||||
|
|
||||||
### Pass 1 — Orphans (not in Sonarr at all)
|
|
||||||
Script: `cleanup-orphans.py`
|
Script: `cleanup-orphans.py`
|
||||||
|
|
||||||
Deleted 49 entries totalling **461.6G** — downloads with no matching Sonarr series and no series directory on disk. Includes Game of Thrones (all 8 seasons), Sex Education (all 4 seasons), Love Death & Robots (multiple duplicate copies), and various anime episode files.
|
Matching logic (two checks, then delete):
|
||||||
|
1. Match each download name against Sonarr API (title, slug, sortTitle, alternate titles, partial match)
|
||||||
|
2. If no API match, check if a series directory with a similar name exists in `/media/series/` — if it does, skip (needs manual review)
|
||||||
|
3. Delete remaining true orphans
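The three steps above can be sketched as a small classifier. This is a simplified illustration, not the code in `cleanup-orphans.py`: the function names are hypothetical, and "similar name" is assumed to mean a normalized substring match, which is deliberately loose and will over-match short titles such as "You".

```python
import re

def normalize(name: str) -> str:
    """Lowercase and keep only ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())

def base_title(name: str) -> str:
    """Drop trailing '(year)' or '{imdb-...}' decorations, then normalize."""
    return normalize(re.split(r"[({]", name, maxsplit=1)[0])

def matches_any(download: str, candidates) -> bool:
    """Partial match: a candidate's base title appears inside the download name."""
    needle = normalize(download)
    return any(t and t in needle for t in map(base_title, candidates))

def classify_download(download: str, sonarr_titles, series_dirs) -> str:
    if matches_any(download, sonarr_titles):
        return "in_sonarr"  # step 1: known to Sonarr, not an orphan
    if matches_any(download, series_dirs):
        return "skipped"    # step 2: directory on disk, manual review
    return "orphan"         # step 3: candidate for deletion
```

Here `sonarr_titles` stands for every title variant collected from the API (title, slug, sortTitle, alternate titles) and `series_dirs` for the directory names under `/media/series/`.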
|
||||||
|
|
||||||
111 entries were SKIPPED (series dir found on disk, needs manual review) — includes Bleach, House, Lucifer, You, Detective Conan episodes, What If, etc. See cleanup.log for full list.
|
Result: **49 deleted, 461.6G freed, 0 failed**
|
||||||
|
|
||||||
|
111 entries SKIPPED (series dir found on disk) — includes Bleach, House, Lucifer, You, SpongeBob, Detective Conan episodes, What If, etc. See `cleanup.log` for full list.
|
||||||
|
|
||||||
|
Notable orphans deleted:
|
||||||
|
- Game of Thrones S01–S08 (~267G) — removed from Sonarr
|
||||||
|
- Sex Education S01–S04 (~110G) — removed from Sonarr
|
||||||
|
- Love Death & Robots (multiple duplicate copies, ~45G)
|
||||||
|
- Senpai is an Otokonoko, Wind Breaker, Wistoria, Hibike! Euphonium S3 episodes, etc.
|
||||||
|
|
||||||
### Pass 2 — Confirmed-imported Sonarr downloads
|
### Pass 2 — Confirmed-imported Sonarr downloads
|
||||||
Script: `cleanup.py --arr sonarr`
|
Script: `cleanup.py --arr sonarr --yes`
|
||||||
|
|
||||||
Deleted **1106 entries**, 0 failed. These were downloads where Sonarr confirmed `episodeFileCount > 0` AND the series directory was verified to exist on disk at the time of `verify.py` run.
|
Deleted downloads where Sonarr confirmed `episodeFileCount > 0` AND the series directory was verified to exist on disk.
|
||||||
|
|
||||||
|
Result: **1106 deleted, 0 failed**
|
||||||
|
|
||||||
### Pass 3 — Confirmed-imported Radarr downloads
|
### Pass 3 — Confirmed-imported Radarr downloads
|
||||||
Script: `cleanup.py --arr radarr`
|
Script: `cleanup.py --arr radarr --yes`
|
||||||
|
|
||||||
Deleted **259 entries**, 0 failed. These were downloads where Radarr confirmed `hasFile=True` AND the file/directory path was verified to exist on disk.
|
Deleted downloads where Radarr confirmed `hasFile=True` AND the file/directory path was verified to exist on disk.
|
||||||
|
|
||||||
### Totals
|
Result: **259 deleted, 0 failed**
|
||||||
| Pass | Entries | Space |
|
|
||||||
|------|---------|-------|
|
|
||||||
| Orphans (cleanup-orphans.py) | 49 | ~461G |
|
|
||||||
| Sonarr imports (cleanup.py) | 1106 | ~12T (estimated) |
|
|
||||||
| Radarr imports (cleanup.py) | 259 | ~4T (estimated) |
|
|
||||||
| **Total** | **1414** | **~16T freed** |
|
|
||||||
|
|
||||||
All deletions logged to `cleanup.log` with UTC timestamp, size, title, path, outcome.
|
### Summary
|
||||||
|
| Pass | Script | Entries | Space freed |
|
||||||
|
|------|--------|---------|-------------|
|
||||||
|
| Orphans | `cleanup-orphans.py` | 49 | ~461G |
|
||||||
|
| Sonarr imports | `cleanup.py --arr sonarr` | 1106 | ~12T (estimated) |
|
||||||
|
| Radarr imports | `cleanup.py --arr radarr` | 259 | ~4T (estimated) |
|
||||||
|
| **Total** | | **1414** | **~16T** |
|
||||||
|
|
||||||
## Verification Results (via API + disk path check)
|
## Verification Results (from verify.py run before cleanup)
|
||||||
|
|
||||||
API keys stored in `../sonarr.api.env` and `../radarr.api.env`.
|
| | Safe to delete | Not imported | Path missing | Orphans (no API match) |
|
||||||
Access via `kubectl -n arr-stack port-forward svc/sonarr 8989:8989` and `svc/radarr 7878:7878`.
|
|---|---|---|---|---|
|
||||||
|
| **Sonarr** (1439 downloads) | 1106 | — | — | 333 |
|
||||||
|
| **Radarr** (289 downloads) | 265 | — | — | 25 |
|
||||||
|
|
||||||
Container path mappings:
|
Note: `cleanup-orphans.py` uses more aggressive title matching (alternate titles, partial match) than `verify.py`, so its orphan count (160 not-in-Sonarr out of 1438) is lower than `verify.py`'s 333.
|
||||||
- Sonarr `/tv/` → `/media/series/`
|
|
||||||
- Radarr `/movies/` → `/media/movies/`
|
|
||||||
|
|
||||||
| | Safe to delete | Orphans (not in arr) | Keep |
|
### Radarr Orphans (25) — not matched, not deleted
|
||||||
|---|---|---|---|
|
|
||||||
| **Radarr** (289 items, ~5.2T) | **265** | 25 | 0 |
|
|
||||||
| **Sonarr** (1439 items, ~17T) | **1106** | 333 | 0 |
|
|
||||||
|
|
||||||
"Safe to delete" = API confirms `hasFile=True` (Radarr) or `episodeFileCount > 0` (Sonarr), AND the reported file/directory path was verified to exist on disk via SSH.
|
|
||||||
|
|
||||||
### Radarr Orphans (25) — not matched in Radarr, not deleted
|
|
||||||
- Constantine (2005)
|
- Constantine (2005)
|
||||||
- Cowboy Bebop: Knockin' on Heaven's Door (2001)
|
- Cowboy Bebop: Knockin' on Heaven's Door (2001)
|
||||||
- Les Misérables (2012)
|
- Les Misérables (2012)
|
||||||
- Pokémon Detective Pikachu (2019)
|
- Pokémon Detective Pikachu (2019)
|
||||||
- Code Geass: Fukkatsu no Lelouch (2019)
|
- Code Geass: Fukkatsu no Lelouch (2019)
|
||||||
- Eiga Go-Toubun no Hanayome (2022)
|
- Eiga Go-Toubun no Hanayome (2022)
|
||||||
- Gisaengchung / Parasite (Korean title — matching failure)
|
- Gisaengchung / Parasite — Korean title, matching failure
|
||||||
- Dune: Part One (2021) — matching failure, is in Radarr
|
- Dune: Part One (2021) — matching failure, confirmed in Radarr
|
||||||
- Harry Potter (older/duplicate copies — matching failure)
|
- Harry Potter older/duplicate copies — matching failure
|
||||||
- Porco Rosso / Kurenai no buta — matching failure
|
- Porco Rosso / Kurenai no buta — matching failure
|
||||||
- Castle in the Sky / Laputa — matching failure
|
- Castle in the Sky / Laputa — matching failure
|
||||||
- Steins;Gate: The Movie — matching failure
|
- Steins;Gate: The Movie — matching failure
|
||||||
@@ -115,32 +160,41 @@ Container path mappings:
|
|||||||
- Fantastic Four (2025) extra copies (3)
|
- Fantastic Four (2025) extra copies (3)
|
||||||
- JJK DCP trailer file
|
- JJK DCP trailer file
|
||||||
|
|
||||||
### 6 Radarr "path mismatch" entries (all confirmed safe, deleted)
|
### Path mismatch entries (confirmed safe, deleted anyway)
|
||||||
Flagged due to path comparison artifacts, manually verified on disk:
|
- Star Wars Episode IV/V/VI/IX — all matched to Episode IV record; manually confirmed all 4 dirs exist
|
||||||
- Star Wars Episode IV/V/VI/IX — each is a separate Radarr entry; all directories exist
|
- WALL·E — `·` middle-dot (U+00B7) broke string comparison; file confirmed on disk
|
||||||
- WALL·E — `·` middle-dot character caused comparison failure; file exists
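One way to make such title comparisons robust against characters like the middle dot is to compare a normalized form instead of the raw strings. The sketch below is an assumption about a possible fix, not what `verify.py` currently does, and `comparable` is a hypothetical helper.

```python
import unicodedata

def comparable(title: str) -> str:
    """Casefold and keep only letters and digits so punctuation variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", title)
    return "".join(ch for ch in decomposed if ch.isalnum()).casefold()
```

With this, `comparable("WALL·E")` and `comparable("WALL-E")` yield the same string even though the raw titles differ. Accented titles such as "Pokémon" also reduce to their unaccented form, because NFKD splits off the combining accent and the filter then drops it.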
|
|
||||||
|
|
||||||
## Pending Decisions
|
## Pending Decisions
|
||||||
|
|
||||||
### Bleach USBD Remux TL (1.8T)
|
### Bleach USBD Remux TL (1.8T)
|
||||||
`/media/downloads/sonarr/Bleach USBD Remux TL` — full lossless Bluray remux S00–S16 (-ZR- group).
|
`/media/downloads/sonarr/Bleach USBD Remux TL` — full lossless Bluray remux S00–S16 (-ZR- group).
|
||||||
Currently in SKIPPED (series dir `/media/series/Bleach (2004) {imdb-tt0434665}/` exists, 310G imported).
|
|
||||||
Most seasons were imported from x265 Bluray packs (-iVy group) rather than from this remux.
|
Currently SKIPPED — `/media/series/Bleach (2004) {imdb-tt0434665}/` exists (310G imported).
|
||||||
S11 has no imported content at all. S13, S14 partially imported.
|
|
||||||
Decision: keep (for quality imports once disk freed) or delete (free 1.8T, accept x265 quality).
|
Most seasons were imported from lighter x265 Bluray packs (`Bleach S0x Bluray EAC3 2.0 1080p x265-iVy`) rather than this remux. S11 has no imported content. S13 and S14 partially imported.
|
||||||
See memory file for full per-season breakdown.
|
|
||||||
|
Options:
|
||||||
|
- **Delete** — free 1.8T, imported x265 content stays, re-download at remux quality later if desired
|
||||||
|
- **Keep** — retain as source for Sonarr to import remaining episodes at lossless quality now that disk space is freed
|
||||||
|
|
||||||
|
Per-season breakdown saved in memory.
|
||||||
|
|
||||||
### SKIPPED downloads (111 Sonarr entries)
|
### SKIPPED downloads (111 Sonarr entries)
|
||||||
Downloads where the series directory exists on disk but the series is not currently in Sonarr.
|
Downloads where a matching series directory exists on disk but the series is not in Sonarr.
|
||||||
Likely removed series (House, Lucifer, You, Black Clover, etc.) or ongoing shows with stale episodes.
|
Likely intentionally removed series (House, Lucifer, You, Black Clover, etc.) with leftover download copies.
|
||||||
These need manual review — series may have been intentionally removed from Sonarr.
|
Needs manual review per series before deleting.
|
||||||
|
|
||||||
|
## Permanent Fix (not applied)
|
||||||
|
|
||||||
|
Mount per-HDD NFS paths instead of the mergerfs path, so qBit downloads and arr imports land on the same physical filesystem, enabling hardlinks:
|
||||||
|
|
||||||
## Fix (not applied — future reference)
|
|
||||||
Mount per-HDD NFS paths instead of the mergerfs path, so downloads and media share the same physical filesystem and hardlinks work:
|
|
||||||
```yaml
|
```yaml
|
||||||
# sonarr/radarr/qtun deployments — change NFS path from:
|
# In sonarr/radarr/qtun deployments, change:
|
||||||
path: /media/downloads → path: /mnt/hdd0/downloads
|
path: /media/downloads → path: /mnt/hdd0/downloads
|
||||||
path: /media/series → path: /mnt/hdd0/series
|
path: /media/series → path: /mnt/hdd0/series
|
||||||
path: /media/movies → path: /mnt/hdd0/movies
|
path: /media/movies → path: /mnt/hdd0/movies
|
||||||
```
|
```
|
||||||
Jellyfin/Plex continue reading from `/media/` (mergerfs). New imports hardlink within hdd0.
|
|
||||||
|
Jellyfin/Plex keep reading from `/media/` (mergerfs union). New imports hardlink within hdd0, wasting no extra space.
|
||||||
|
|
||||||
|
Tradeoff: all new content lands on hdd0 only. Load balancing across the three disks stops working for new downloads. Once hdd0 fills up a migration strategy is needed.
|
||||||
|
|||||||
@@ -8,7 +8,7 @@ Requirements:
|
|||||||
kubectl -n arr-stack port-forward svc/sonarr 8989:8989
|
kubectl -n arr-stack port-forward svc/sonarr 8989:8989
|
||||||
kubectl -n arr-stack port-forward svc/radarr 7878:7878
|
kubectl -n arr-stack port-forward svc/radarr 7878:7878
|
||||||
- SSH access to aya01
|
- SSH access to aya01
|
||||||
- API keys in ../sonarr.api.env and ../radarr.api.env
|
- API keys in ../../../../sonarr.api.env and ../../../../radarr.api.env
|
||||||
|
|
||||||
Output:
|
Output:
|
||||||
/tmp/arr_verified.json — full structured results for use by cleanup.py
|
/tmp/arr_verified.json — full structured results for use by cleanup.py
|
||||||
@@ -28,7 +28,7 @@ SSH_HOST = "aya01"
|
|||||||
script_dir = os.path.dirname(os.path.abspath(__file__))
|
script_dir = os.path.dirname(os.path.abspath(__file__))
|
||||||
|
|
||||||
def load_key(filename):
|
def load_key(filename):
|
||||||
path = os.path.join(script_dir, '..', filename)
|
path = os.path.join(script_dir, '../../../..', filename)
|
||||||
return open(path).read().strip()
|
return open(path).read().strip()
|
||||||
|
|
||||||
SONARR_KEY = load_key('sonarr.api.env')
|
SONARR_KEY = load_key('sonarr.api.env')
|
||||||
|
|||||||