I Ran a SIEM in My Homelab, Then Pulled It

The Objection First

A SIEM in a homelab sounds like cosplay. One operator, one cluster, a handful of services -- who is correlating events for whom? I had the same reaction, and for a long time my "security monitoring" was Grafana dashboards I looked at when something felt slow.

What changed my mind is that the access platform I built (Cloudflare Tunnel into Traefik into Authentik) produces genuinely interesting security events, and nothing was reading them. Authentik logs every login success and failure. The K3s API server writes an audit log of every API call. Those streams answer the only security question that matters at home scale: did someone who is not me authenticate, or try to? If nothing consumes those events, the answer arrives weeks late or never. So Wazuh went in.
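A minimal sketch of that one question as code, under loud assumptions: the flat event shape below is a hypothetical simplification and not Authentik's actual schema, the action names are assumed, and MY_USERNAMES is a placeholder. The logic is the point: every failed login is interesting, and so is every successful login that is not mine.

```python
# Hypothetical, simplified event shape:
#   {"action": "login" | "login_failed" | ..., "username": str, "client_ip": str}
# This is NOT Authentik's real schema; map real events into this shape first.
MY_USERNAMES = frozenset({"me"})          # placeholder for the operator's own account(s)
AUTH_ACTIONS = {"login", "login_failed"}  # assumed action names


def needs_attention(event, mine=MY_USERNAMES):
    """True if the event is an auth attempt that is not a successful login by me."""
    action = event.get("action")
    if action not in AUTH_ACTIONS:
        return False  # not an authentication event at all
    if action == "login_failed":
        return True   # someone tried; worth seeing regardless of the username
    return event.get("username") not in mine  # successful login by someone else


def review(events, mine=MY_USERNAMES):
    """Return only the events that answer 'did someone who is not me get in, or try to?'"""
    return [e for e in events if needs_attention(e, mine)]
```

At one-operator scale the allowlist is a single name, which is why the check can stay this blunt.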

What Signal Matters at Home Scale

The enterprise SIEM playbook assumes a fleet: endpoint agents everywhere, packet capture, threat intel feeds. Almost none of that earns its keep on a single-operator lab. There is no fleet to baseline lateral movement against when "normal" is one person.

What did carry signal while Wazuh ran:

Authentication events. Authentik is the front door for every externally reachable service, so its login events were the highest-value stream I had.
K3s API server audit logs. The API server is configured with audit logging at install time (/var/log/k3s-audit.log, 30-day retention). Anything creating pods or reading secrets that isn't me or a known controller is worth an alert.
Edge traffic patterns, which stayed in Cloudflare analytics. The edge absorbs the scanning noise, so I never needed to ingest it into Wazuh.
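The audit-log check above can be sketched in a few lines. This is an illustration, not the Wazuh rule I ran. It assumes the log is JSON lines in the Kubernetes audit event format (verb, user.username, objectRef.resource), and the allowlist below is a hypothetical starting point, not a vetted list of controllers.

```python
import json

# Assumption: identities expected to create pods or read secrets.
# Both the exact names and the prefixes are illustrative placeholders.
KNOWN_USERS = {"system:admin"}
KNOWN_PREFIXES = ("system:serviceaccount:kube-system:", "system:node:")

SECRET_READ_VERBS = {"get", "list", "watch"}


def is_known(username):
    """True if the identity is on the allowlist."""
    return username in KNOWN_USERS or username.startswith(KNOWN_PREFIXES)


def is_suspicious(event):
    """True for pod creation or secret reads by an identity not on the allowlist."""
    username = (event.get("user") or {}).get("username", "")
    ref = event.get("objectRef") or {}
    resource = ref.get("resource")
    verb = event.get("verb")
    pod_create = resource == "pods" and verb == "create"
    secret_read = resource == "secrets" and verb in SECRET_READ_VERBS
    return (pod_create or secret_read) and not is_known(username)


def scan(lines):
    """Yield suspicious audit events from an iterable of JSON lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or corrupt lines rather than abort the scan
        if is_suspicious(event):
            yield event
```

Pointing it at the real file is `with open("/var/log/k3s-audit.log") as f: hits = list(scan(f))`. Depending on the audit policy, one request can be logged at more than one stage, so a real rule would likely filter on the event's stage as well.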

What I deliberately skipped even at the peak: packet capture, file integrity monitoring on application containers, and agents on every workload. The cost-to-signal ratio is wrong for a lab this size.

Wazuh for Detections, Loki for Logs

The split that took me a while to articulate: Loki and Wazuh both ingest logs, but they answer different questions. Traefik access logs (JSON) go to Loki, because access logs are a debugging and forensics resource -- I query them after I already know what I'm looking for. The auth events and audit logs went to Wazuh, because those streams need rules evaluated against them continuously, without me asking.

Putting everything in one system tempted me. Loki with alerting rules can approximate detection, and Wazuh can archive logs. But Loki's query-time model is built for exploration, and Wazuh ships with detection rules and a decoder pipeline I did not want to rebuild as LogQL alerts. Two tools, each doing the thing it is shaped for, cost less than bending one tool into both roles.

Operational alerting is a third lane entirely: Alertmanager fires on pod crash loops, node resource pressure, certificate expiry within 14 days, and Authentik outpost health failures. Those are reliability signals, and mixing them into the security queue is how the security queue gets ignored.
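The three lanes fit in a lookup table. The sketch below only illustrates the separation; the source names are hypothetical labels I made up for the example, and none of this is real Loki, Wazuh, or Alertmanager configuration.

```python
from enum import Enum


class Lane(Enum):
    LOGS = "loki"                 # queried after the fact
    DETECTION = "wazuh"           # rules evaluated continuously
    RELIABILITY = "alertmanager"  # operational health, not security


# Hypothetical source names; the assignments follow the split described above.
ROUTES = {
    "traefik_access": Lane.LOGS,
    "authentik_events": Lane.DETECTION,
    "k3s_audit": Lane.DETECTION,
    "pod_crash_loop": Lane.RELIABILITY,
    "node_pressure": Lane.RELIABILITY,
    "cert_expiry": Lane.RELIABILITY,
    "outpost_health": Lane.RELIABILITY,
}


def lane_for(source):
    """Return the lane for a source, refusing sources nobody has assigned yet."""
    try:
        return ROUTES[source]
    except KeyError:
        raise ValueError(f"no lane assigned for {source!r}; decide before ingesting") from None
```

Raising on an unknown source is deliberate: a new stream should get an explicit lane rather than drift into the security queue by default.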

Alert Fatigue, Honestly

The first weeks of any SIEM are a firehose, and a homelab is no exception. Wazuh's default ruleset assumes you care about things a home operator does not. The failure mode is predictable: fifty low-severity alerts a day for a month, then you stop reading them, then the one real alert lands in a muted channel.

The standard I held while tuning: an alert that fires must be worth interrupting my evening for. Everything else should be a dashboard panel or a log query, not a notification. Most of the default ruleset's chatter got downgraded against that bar.
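That bar can be written down as a triage function. A rough sketch, assuming alerts are dicts carrying Wazuh's rule.level field (levels run 0 to 15); the threshold of 12 is an illustrative choice, not the value I actually tuned to.

```python
from collections import Counter

# Assumption: only alerts at or above this level are worth a notification.
NOTIFY_LEVEL = 12


def triage(alert, notify_level=NOTIFY_LEVEL):
    """Return 'notify' for interrupt-worthy alerts, 'dashboard' for everything else."""
    level = int((alert.get("rule") or {}).get("level", 0))
    return "notify" if level >= notify_level else "dashboard"


def summarize(alerts, notify_level=NOTIFY_LEVEL):
    """Count how many alerts land in each bucket, to see how noisy a threshold is."""
    return dict(Counter(triage(a, notify_level) for a in alerts))
```

Running summarize over a day of alerts at a few candidate thresholds is a quick way to see whether the notification count comes out nearer zero than fifty.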

Why It Came Back Out

The K3s cluster runs on small hardware -- a Raspberry Pi 5 and a mini PC. The Wazuh manager stack is not small software, and on those nodes it could not run at anything close to its full capacity. A SIEM that is resource-starved is the worst of both worlds: it consumes a real slice of a small cluster while delivering a fraction of its detection value, and it competes with the workloads it is supposed to be watching.

So I pulled it. Not as a verdict on the idea -- as a sizing decision. The event streams it consumed still exist: Authentik keeps its event log, the API server still writes its audit log with 30-day retention, Loki still holds the access logs. What's missing is the layer that evaluates rules against them while I sleep. When the cluster gains real compute, Wazuh goes back in; the pipelines that fed it are cheap to keep pointed in the right direction.

What Running It Was Worth

Even ending in a decommission, the exercise paid for itself. Wiring the streams forced me to enumerate what "suspicious" means for my own infrastructure, and that found gaps the SIEM itself never would have. I also now know my actual detection requirements -- auth anomalies and API audit events, not packet capture -- which makes the eventual re-deployment a sizing problem instead of a design problem.

The caveat stands in both directions: a SIEM you don't tune is worse than no SIEM, because it manufactures false confidence. So is a SIEM your hardware can't feed. Security tooling has to be sized like any other workload, and "pull it until the cluster can carry it" is a more honest posture than letting a starved manager limp along as a checkbox.
