r/UptimeKuma: How to Diagnose downtime issues

💡
This article archives a conversation from a subreddit post (original source linked below) to which I contributed a solution or answer under the u/MasterofSynapse handle, presented in a Q&A format.

Original Reddit post: https://www.reddit.com/r/UptimeKuma/comments/z0vcb3/how_to_diagnose_downtime_issues/

Question

Hey, I'm pretty new to trying out Uptime Kuma. I like the software, but I want to improve the uptime of my sites. I'm randomly seeing "timeout of 48000ms exceeded" on a number of my sites at random times, and it fixes itself a minute later.

Is there any way I can diagnose these blips? I'm running Uptime Kuma on my home broadband to sites hosted across the other side of the world so I'm also not sure where the issue is occurring.

Ideally it would be great if there was a way I could trigger a traceroute command or something to try and get to the bottom of what the issue is.

I've moved a few of the sites to a different DNS provider, and I've added an uptime check for google.com to see whether I get false positives from other sources too.

Any ideas / suggestions would be great though :-)

Answer

In my opinion, Uptime Kuma isn't trying to be a troubleshooting suite; it is just a tool for converged monitoring.

You can always start pings, traceroutes or anything else from the OS UtK is running on. That will take the same path UtK takes and could show you additional information.
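As a rough sketch of that idea (not something Uptime Kuma provides), a small script on the same host could poll a site and, whenever a check fails, append ping and traceroute output to a log so the blip can be inspected afterwards. The function names, the log file name and the 60-second interval are all made up for illustration. The 48-second timeout simply mirrors the error message quoted in the question. It assumes `ping` and `traceroute` (or `tracert` on Windows) are installed.

```python
import shutil
import subprocess
import sys
import time
import urllib.error
import urllib.request
from datetime import datetime, timezone


def site_is_up(url: str, timeout: float = 48.0) -> bool:
    """Return True if the URL yields any HTTP response within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with 4xx/5xx, so the network path itself works.
        return True
    except (urllib.error.URLError, TimeoutError, OSError):
        return False


def diagnostic_commands(host: str) -> list[list[str]]:
    """Build the ping and traceroute commands for the current OS."""
    if sys.platform.startswith("win"):
        return [["ping", "-n", "4", host], ["tracert", host]]
    return [["ping", "-c", "4", host], ["traceroute", host]]


def run_command(cmd: list[str], timeout: float = 120.0) -> str:
    """Run one command and return its combined output as text."""
    if shutil.which(cmd[0]) is None:
        return f"{cmd[0]}: not installed on this host\n"
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return f"{' '.join(cmd)}: no result after {timeout:.0f}s\n"
    return result.stdout + result.stderr


def watch(url: str, host: str, log_path: str = "blips.log",
          interval: float = 60.0) -> None:
    """Poll the URL forever; on each failed check, log ping and traceroute."""
    while True:
        if not site_is_up(url):
            stamp = datetime.now(timezone.utc).isoformat()
            with open(log_path, "a", encoding="utf-8") as log:
                log.write(f"=== {stamp} {url} unreachable ===\n")
                for cmd in diagnostic_commands(host):
                    log.write(run_command(cmd))
        time.sleep(interval)

# Example call (hypothetical site): watch("https://example.com", "example.com")
```

Because the diagnostics only start after the HTTP check has already timed out, a very short blip may be over before the traceroute runs. Even then, the timestamps in the log can still be compared against the provider's maintenance notices.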

What infrastructure do you host your sites on? It is normal for providers to make non-critical changes at any time of day, as long as they cause no major outage that exceeds the SLA. So ping timeouts lasting a few seconds are completely normal.