humanize docs: fix bold abuse, title case, single-item lists, ProtonVPN ref
@@ -1,24 +1,24 @@
 # Observability
 
-Two parallel stacks cover metrics and logs.
+Two parallel stacks: Prometheus for metrics, Elastic for logs.
 
 ---
 
-## Metrics — Prometheus + Grafana
+## Metrics
 
-Deployed via the **kube-prometheus-stack** Helm chart (ArgoCD-managed), running in the `prometheus` namespace.
+kube-prometheus-stack runs in the `prometheus` namespace (ArgoCD-managed). Prometheus scrapes all nodes, pods, and control plane components. Grafana has dashboards for cluster overview, node resources, Longhorn, ArgoCD, and Traefik.
 
-- **Prometheus** scrapes all nodes, pods, and K8s control plane components
-- **Grafana** dashboards: cluster overview, node resource usage, Longhorn, ArgoCD, Traefik
-- **Alertmanager** routes alerts to Ntfy (self-hosted push notifications) via a custom webhook bridge
-- **Node Exporter** runs on all VMs including docker-host11 and the edge VPS (Ansible-deployed)
-- **Goldilocks + VPA** analyse actual resource usage and recommend request/limit values
+Node Exporter is deployed via Ansible on every VM including `docker-host11` and the edge VPS, so coverage isn't limited to what's inside Kubernetes.
+
+Goldilocks and VPA run alongside and analyze actual resource usage to suggest better request/limit values.
+
+Alertmanager routes alerts to Ntfy via a custom webhook bridge.
 
 ---
 
-## Logs + Fleet — Elastic Stack (ECK)
+## Logs and fleet management
 
-Deployed via the **ECK operator** (Elastic Cloud on Kubernetes), running in the `elastic-system` namespace.
+The ECK operator (Elastic Cloud on Kubernetes) manages the Elastic stack in the `elastic-system` namespace:
 
 | Component | Purpose |
 |-----------|---------|
@@ -28,13 +28,13 @@ Deployed via the **ECK operator** (Elastic Cloud on Kubernetes), running in the
 | Elastic Agent (DaemonSet) | Ships logs and metrics from every cluster node |
 | Elastic Agent (standalone) | Runs on docker-host11 and the edge VPS |
 
-The Elastic Agent DaemonSet tolerates the control-plane `NoSchedule` taint so logs are collected from server nodes as well as agents.
+The DaemonSet tolerates the control-plane `NoSchedule` taint so server nodes are covered too.
 
-Alerts from Elasticsearch rules are bridged to Ntfy via a small CronJob (`elastic-ntfy-bridge`) that polls the Elasticsearch alerts API and forwards new alerts as push notifications.
+Elastic alert rules are bridged to Ntfy via `elastic-ntfy-bridge`, a small CronJob that polls the Elasticsearch alerts API and forwards new alerts as push notifications.
 
 ---
 
-## Alerting Flow
+## Alerting flow
 
 ```
 Prometheus Alertmanager ──► Ntfy (push notification)
@@ -42,4 +42,4 @@ Prometheus Alertmanager ──► Ntfy (push notification)
 Elasticsearch alert rule ──► elastic-ntfy-bridge CronJob ─┘
 ```
 
-All alerts land in the same Ntfy topic, accessible on mobile and desktop.
+Both sources land in the same Ntfy topic.
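
The docs in this diff mention a custom webhook bridge between Alertmanager and Ntfy but do not show it. The following is a minimal sketch of what such a bridge might do, not the actual implementation: it maps an Alertmanager webhook payload to ntfy publish requests. The base URL, the topic name, the function names, and the priority mapping are all assumptions made for illustration.

```python
from urllib import request

# Hypothetical values: the real ntfy URL and topic are not given in the docs.
NTFY_URL = "https://ntfy.example.internal"
NTFY_TOPIC = "alerts"


def alertmanager_to_ntfy(payload: dict) -> list:
    """Turn one Alertmanager webhook payload into ntfy publish requests.

    Returns one dict per alert with the target URL, ntfy headers, and body.
    Nothing is sent here, which keeps the mapping easy to test.
    """
    messages = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        firing = alert.get("status") == "firing"
        state = "FIRING" if firing else "RESOLVED"
        name = labels.get("alertname", "unknown")
        # Assumed mapping: only firing critical alerts get high priority.
        critical = firing and labels.get("severity") == "critical"
        messages.append({
            "url": f"{NTFY_URL}/{NTFY_TOPIC}",
            "headers": {
                "Title": f"[{state}] {name}",
                "Priority": "high" if critical else "default",
            },
            "body": annotations.get("summary")
            or annotations.get("description")
            or "no description",
        })
    return messages


def publish(message: dict) -> None:
    """POST one message to ntfy (message body as payload, metadata as headers)."""
    req = request.Request(
        message["url"],
        data=message["body"].encode("utf-8"),
        headers=message["headers"],
        method="POST",
    )
    with request.urlopen(req, timeout=10) as resp:
        resp.read()
```

Under the same assumptions, an `elastic-ntfy-bridge`-style job could reuse `publish` with its own mapping function, so both sources end up in one topic.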