the slow request my APM never told me about
A user was hitting six-minute page loads and my APM had nothing on it. Turns out APMs cap trace duration. Here is what I reached for instead.
The disk that filled itself
Homelab box said 100 percent full, du said half the disk was free. Docker was holding open a log file I had rm'd months ago. lsof +L1 untangles the knot.
The webhook that worked in Postman and nowhere else
Two bootstrap paths in one app drifted apart. The HTTP entrypoint loaded the webhook signing secret, the queue worker did not, and a silent-skip guard hid the cause behind a misleading 401.