ping reliable service to prevent false DOWN on internet connection loss #774
Comments
Basically a dependency between services. If you were to ping a web server and also do a TCP check on port 80 and an HTTP check on the same port, the latter should not fail explicitly if the ping already fails (assuming ICMP is not blocked, of course; perhaps a bad example, but TCP/80 and HTTP by themselves are a better one). |
This sounds similar to #216, but in this case, instead of a "core load balancer", you watch some public address. |
Not quite. I had thought about this before, using https://www.npmjs.com/package/is-online. However, it may be a breaking change, since Uptime Kuma would no longer work in internal environments. |
@louislam |
@louislam Something like this: each monitor can be set to depend on another monitor. If that other monitor is down (or in warning), the current monitor stops its checks and sets a warning status (or maybe some other non-green status) until the dependency comes back up. People could then set up dependencies however they want: for external connectivity (by polling something like 1.1.1.1), core services like gateways/balancers (when many services depend on one), service chains (ping host -> poll backend -> poll frontend), etc. It is nice that "is-online" determines the status by polling more than one source, though. So what is described above would go great with #324 :) |
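The dependency rule described above can be sketched as a small shell function. This is only an illustration, not Uptime Kuma code: the function name `evaluate_monitor` and the status strings (`up`, `down`, `warning`, `none`) are assumptions made for the example.

```shell
# Hypothetical sketch, not Uptime Kuma code. Names and statuses are assumptions.
#
# evaluate_monitor DEP_STATUS CHECK_CMD [ARGS...]
#   DEP_STATUS: last known status of the monitor this one depends on
#               ("up", "down", "warning", or "none" if there is no dependency).
#   Prints the resulting status of the current monitor.
evaluate_monitor() {
    dep_status=$1
    shift
    case "$dep_status" in
        down|warning)
            # Dependency is unhealthy: skip the check and report a non-green status.
            echo "warning"
            return 0
            ;;
    esac
    # Dependency is healthy or absent: run the real check.
    if "$@" > /dev/null 2>&1; then
        echo "up"
    else
        echo "down"
    fi
}

# A service chain could then be evaluated in order, for example:
#   connectivity=$(evaluate_monitor none ping -c 1 1.1.1.1)
#   backend=$(evaluate_monitor "$connectivity" some_backend_check)
#   frontend=$(evaluate_monitor "$backend" some_frontend_check)
```

Because a skipped monitor reports `warning`, anything that depends on it is skipped as well, so one lost uplink would not produce a cascade of `down` results.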
Would love to see a feature like this. My stats are garbage if I get downs for my services just because the internet connection was lost. 🙂 |
I think something like that would be ideal; I would rather not have it ping all my alerts when the monitor's connection goes down. ETA: Pinging another service feels more like a work-around by itself, though personally, if I can show we're online and show that at least one of two other reliable services on different networks were online, that would validate my results of my own checks.
Unless I'm misunderstanding, would it be possible to make it optional, off by default to avoid this being a breaking change? |
I came here to post the same request. I believe a simple solution might be an option to ping a user-definable host before classifying anything as down. Here is how I would build this into the current system.

Each monitor has a boolean value that determines whether it should consult the global 'connectivity check' before recording a 'DOWN' value. This would be false by default, in which case the system works as it does currently. If the value is set to true for a particular monitor, then when a 'DOWN' is encountered, the system also looks at the global 'connectivity host' value in the main settings. If that is set to anything other than empty, the system pings that host (e.g. google.com would be an easy one). If the host is reachable, the system has connectivity and the original monitor does indeed get a 'DOWN' value. If, however, 'google.com' or whatever host is entered as the global 'connectivity host' is not reachable, the original monitor gets a 'NO DATA' status, which would count as 'UP' in the stats but be shown in grey.

This solution requires an extra boolean / checkbox for each monitor and one global host value in the main settings. It requires at most one extra ping whenever a DOWN value is encountered. As DOWNs should be rare, this is not going to add much extra network traffic.

Currently Uptime Kuma is perfect but for this one failing. I have monitors that are extremely unlikely to ever be genuinely DOWN, but when they are, it's a big deal. I need to ensure that the occasional loss of connectivity for the Uptime Kuma host doesn't lead to 'DOWN' values being recorded (e.g. when I reset the router for the network it's on). |
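The decision flow in this proposal can be sketched roughly as follows. This is not how Uptime Kuma works today; `classify_result`, its argument order, and the literal status strings are hypothetical, and the probe defaults to a single `ping` only because the proposal mentions pinging.

```shell
# Hypothetical sketch of the proposal above, not Uptime Kuma code.
#
# classify_result RAW_STATUS USE_CONNECTIVITY_CHECK CONNECTIVITY_HOST [PROBE_CMD...]
#   RAW_STATUS:             result of the monitor's own check ("UP" or "DOWN")
#   USE_CONNECTIVITY_CHECK: the per-monitor boolean ("true" or "false")
#   CONNECTIVITY_HOST:      the global setting; empty means "not configured"
#   PROBE_CMD:              optional command used to probe the host
#                           (defaults to one ICMP echo request)
classify_result() {
    raw_status=$1
    use_check=$2
    host=$3
    shift 3
    if [ "$#" -eq 0 ]; then
        set -- ping -c 1
    fi
    # Anything other than an opted-in DOWN with a configured host is recorded as-is.
    if [ "$raw_status" != "DOWN" ] || [ "$use_check" != "true" ] || [ -z "$host" ]; then
        echo "$raw_status"
        return 0
    fi
    # At most one extra probe, and only when a DOWN was observed.
    if "$@" "$host" > /dev/null 2>&1; then
        echo "DOWN"       # we have connectivity, so the outage is real
    else
        echo "NO DATA"    # the monitoring host itself appears to be offline
    fi
}
```

Passing the probe as a parameter keeps the sketch testable without network access and leaves room for a probe other than ICMP.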
Came to post the same thing too. |
@CommanderStorm Issue grooming is a huge pain, and I feel for you, but I don't think this is a duplicate issue. There should be a reliable method for ensuring connectivity before declaring a service down. The process outlined in that thread would be valuable for grouping, as outlined, and may function as a work-around for connectivity checking, but robust connectivity checking is a more desirable -- and decidedly separate -- solution. e.g.: Can the user contact any of:
This would tell a user if they have connectivity, and if not, where the breakdown is. The work-around as outlined in that thread would: a) not prioritize connectivity checking by default, meaning the application would be inaccurate by default. I recommend re-opening this issue and treating it as a separate feature. |
@chakflying |
Even reliable monitors go down -- there have been both Cloudflare and Fastly outages in recent memory, the latter of which caused something like 40% of the internet to go down -- and my sites and services were still online. If the #1089 proposal can be tweaked to allow a "parent service" of sorts with multiple monitors that itself goes on to then have multiple child monitors, then I think that'd be workable, e.g. for monitoring sites
...where if ANY service for the parent is up, we are considered to have connectivity and do not check the other parent services, and then check child services as normal. If all services in the parent group are down, we do NOT have connectivity, and should not assume we have data about downtime for any child services, and should instead list the monitoring service as being down. We could show the monitor as its own service (e.g. This allows for actual connectivity checking as a first-class citizen, with both self-monitoring and arbitrary service monitoring. As currently proposed, #1089 does not meet connectivity-checking needs, and indeed the most recent comment seems to declare that connectivity checking is NOT what is desired in that ticket, but something more along the lines of "if |
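The "any parent service up means we have connectivity" rule above can be sketched like this. The function names, the status strings, and the idea of passing the probe command as a parameter are assumptions for illustration only.

```shell
# Hypothetical sketch of the parent/child rule above, not Uptime Kuma code.
#
# have_connectivity PROBE_CMD TARGET...
#   Succeeds as soon as any parent target answers; later targets are not probed.
have_connectivity() {
    probe=$1
    shift
    for target in "$@"; do
        if "$probe" "$target" > /dev/null 2>&1; then
            return 0
        fi
    done
    return 1
}

# report_child PROBE_CMD CHILD_CHECK_CMD PARENT_TARGET...
#   Runs the child check only if at least one parent target is reachable.
report_child() {
    probe=$1
    child=$2
    shift 2
    if ! have_connectivity "$probe" "$@"; then
        # All parents failed: the result for the child is inconclusive.
        echo "monitor offline"
    elif "$child" > /dev/null 2>&1; then
        echo "up"
    else
        echo "down"
    fi
}
```

Using several parent targets on different networks means a single provider outage is not mistaken for a loss of connectivity.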
I think considering this as a separate feature would be more convenient for users. But of course it would be pretty complicated since there are many different use cases. |
I've come to lend my support to this idea; it's an extremely necessary feature that prompted me to temporarily stop using Uptime Kuma. I use it in my local network, which experiences frequent fluctuations, and this completely disrupts the uptime report. |
I also stopped using Uptime Kuma for this reason. I ran it on a cloud server whose uptime is not 100% (like many cloud providers), so every x days all 50 of my monitors are reported as down and 100 notifications are sent. |
Folks, I've created the simple Bash script below, which has addressed this issue for me. Whenever my internet goes offline, it shuts down the Docker container running Uptime Kuma. This way, my statistics stay clean when the internet at home experiences disruptions.

#!/bin/bash
# Get the ID of the Docker container with 'uptimekuma' in its name
container_id=$(docker ps -aq --filter "name=uptimekuma")
# Check if the internet is active by sending 3 ping packets to 8.8.8.8
if ping -c 3 8.8.8.8 > /dev/null 2>&1; then
    echo "Internet is active."
    # Check if the container is active
    if [ -n "$container_id" ] && [ "$(docker inspect -f '{{.State.Running}}' "$container_id" 2>/dev/null)" = "true" ]; then
        echo "Container is active, doing nothing."
    else
        echo "Container uptimekuma is not active, starting..."
        docker start "$container_id"
    fi
else
    echo "Internet is offline."
    # Check if the container is active
    if [ -n "$container_id" ] && [ "$(docker inspect -f '{{.State.Running}}' "$container_id" 2>/dev/null)" = "true" ]; then
        echo "Container is active, shutting down..."
        docker stop "$container_id"
    else
        echo "Container uptimekuma is not active, doing nothing."
    fi
fi

To use, save the script to a file (check_internet.sh in the cron entry below) and add a crontab entry:

* * * * * sleep 20 && /bin/bash /<EDIT HERE>/check_internet.sh

This way, the script is executed once per minute, 20 seconds after the start of each minute. It worked well in my tests, but you may need to adjust it according to your specific requirements. |
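Since cron's smallest scheduling unit is one minute, a check closer to every 20 seconds needs a helper. A minimal sketch, assuming a hypothetical wrapper function `run_repeatedly` that cron starts once per minute and that then runs the check several times with a pause in between:

```shell
# Hypothetical helper, not part of the script above.
#
# run_repeatedly COUNT INTERVAL_SECONDS CMD [ARGS...]
#   Runs CMD COUNT times, sleeping INTERVAL_SECONDS between runs.
run_repeatedly() {
    count=$1
    interval=$2
    shift 2
    i=0
    while [ "$i" -lt "$count" ]; do
        # A failing run should not abort the remaining runs.
        "$@" || true
        i=$((i + 1))
        # Sleep between runs, but not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}

# Called from a script that cron starts every minute, for example:
#   run_repeatedly 3 20 /bin/bash /<EDIT HERE>/check_internet.sh
```

Three runs spaced 20 seconds apart fit inside one minute only if each run is short; the check itself sends three pings, so the spacing is approximate.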
I remember PHPMon very well (if we are talking about the same one as here: https://github.com/phpservermon/phpservermon). It worked pretty nicely at the time; I had it in production for a couple of sites/networks for many years and only recently discontinued the last one to switch to Uptime Kuma, which is far better. And sorry to say, but PHPMon didn't handle this very well (not to say at all). When I had Pushover as the notification channel, it used to go ballistic over its own issues. Which brings me to why I wanted to reply here: the type of notification you all seem to be looking for says "I'm down, help me please!", so how do you reckon that gets sent if the Uptime Kuma machine is itself disconnected from the Internet? |
I like this answer from @Rod-Gomes, since it addresses a supposed issue, which really isn't one to begin with, with a system-oriented workaround/solution. Of course you should solve things like this in that manner rather than asking apps to do things they are not supposed to do. If I look back at the initial question, "(...)It could then send only one message instead of tens/hundreds. It also could omit the data of the services from the logs or mark them differently, to prevent incorrect uptime numbers.(...)", my first thought is "how can it send a message when it is disconnected itself?" The second point, preventing incorrect uptime stats, is not addressed only by the software itself but mainly by the environment it is running in. |
It has incorrect uptime stats, and reports it once it comes back online, as well as the "recovery" even if the site(s) in question never actually went down. In short, the monitor going down should show differently, and try to validate results -- the test isn't a failure, it's inconclusive. Connectivity checking is a pretty standard idea here. |
Glad to see others already had the idea for this feature.
Main principle
Add master checks with conditions, and do not run any other check if the conditions fail. Uptime Kuma's core check functions can be tweaked so that, if a master check is enabled, its conditions must be met before any other check is processed. That solves the problem in this issue.
Details of master check
The feature allows selecting existing monitor checks (single monitors or groups), combined with OR/AND conditions and a desired status ("Up" or "Down"). For example:
Where to add the new option
A section can be added to the "Settings" menu.
How the menu looks
Master Monitors
Description: Configure monitor rules that must be true in order for any other check to run. Prevents false-positive alerts if your monitoring system has connectivity issues.
Pros of such an implementation
This wouldn't require changing existing monitors, hence wouldn't cause any breaking change, and it would be off if unset, so users would have to manually configure it and turn it on.
Cons of such an implementation
I don't see any; let me know. |
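One way to picture the master-check rule is a function that combines per-monitor conditions with AND or OR before any other check runs. This is a sketch under assumptions: the function name `master_allows` and the `actual=desired` condition format are invented for the example and are not an Uptime Kuma feature.

```shell
# Hypothetical sketch of the master-check idea, not Uptime Kuma code.
#
# master_allows OPERATOR CONDITION...
#   OPERATOR:  "AND" or "OR"
#   CONDITION: "actual=desired", e.g. "up=up" when a master monitor that is
#              expected to be up currently is up, or "down=up" when it is not.
#   Succeeds if the combined conditions hold, i.e. other checks may run.
master_allows() {
    operator=$1
    shift
    for cond in "$@"; do
        actual=${cond%%=*}
        desired=${cond#*=}
        if [ "$actual" = "$desired" ]; then
            # One satisfied condition is enough for OR.
            if [ "$operator" = "OR" ]; then
                return 0
            fi
        else
            # One unsatisfied condition is enough to fail AND.
            if [ "$operator" = "AND" ]; then
                return 1
            fi
        fi
    done
    # AND: every condition held (also true with no conditions configured).
    # OR: no condition held.
    [ "$operator" = "AND" ]
}

# Example: run the other checks only if either master monitor is up:
#   if master_allows OR "$gateway_status=up" "$dns_status=up"; then ...; fi
```

With no conditions configured and `AND` as the operator, the function allows all checks, which matches the "off if unset" behaviour described above.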
Is there any update on this? Uptime Kuma can no longer be used by us in its current state. As already described by other users: if there is an outage of our host's internet connection, notifications for all monitors are sent, and the statistics are distorted. |
@GitBaer For now I've abandoned the idea of monitoring from the office. I'm using a little VPS. It's far slower showing the interface but still gets the job done. That is probably the way to go if you have many internet outages. The good part of it is I'm now notified on time when office loses internet, not afterwards :) |
@manprinsen looks like the feature is still unassigned. I'm sure @louislam would welcome a pull request in the spirit of the project if you have the ability to contribute, though! |
Is it a duplicated question?
Please search in Issues without filters: https://github.com/louislam/uptime-kuma/issues?q=
Haven't found any duplicate. (I'm not certain of the terminology regarding this problem though, so please correct me if I'm wrong.)
Is your feature request related to a problem? Please describe.
Sometimes my internet connection goes down. In that case, Uptime Kuma reports all of my services as down, even though only my internet connection is down. This leads to a lot of messages and incorrect uptime stats in the dashboard.
Describe the solution you'd like
I'd love some kind of 'health check' (not sure if that's the right term) in the settings, in which I ping a reliable service (e.g. Google or Cloudflare DNS). If that reliable service is down, Uptime Kuma would assume that my internet connection broke instead of all my services. It could then send only one message instead of tens/hundreds. It also could omit the data of the services from the logs or mark them differently, to prevent incorrect uptime numbers.
Describe alternatives you've considered
Instead, or in addition, you could make an exception if more than x services go down at the same time, or if all of the services go down at once. This could create problems, though, when shutting down a server with multiple services, or when monitoring only a few services (it would potentially lead to false positives).
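The "more than x services down at once" alternative could look roughly like the following. The function name, the percentage threshold, and the minimum-monitor guard are all assumptions chosen for illustration; the guard reflects the caveat about monitoring only a few services.

```shell
# Hypothetical sketch of the threshold heuristic above, not Uptime Kuma code.
#
# looks_like_local_outage DOWN_COUNT TOTAL_COUNT MIN_MONITORS THRESHOLD_PERCENT
#   Succeeds if so many monitors went down at once that a local connectivity
#   problem is more plausible than every service failing individually.
looks_like_local_outage() {
    down=$1
    total=$2
    min_monitors=$3
    threshold=$4
    # With very few monitors the ratio is not meaningful, so never suppress.
    if [ "$total" -lt "$min_monitors" ]; then
        return 1
    fi
    # Integer comparison of down/total against threshold/100.
    [ $((down * 100)) -ge $((total * threshold)) ]
}

# Example: with 50 monitors, treat 45 or more simultaneous failures as suspicious:
#   looks_like_local_outage "$down_now" 50 5 90 && echo "probably our own connection"
```

As the request notes, this heuristic cannot tell a lost uplink apart from a deliberately shut-down server hosting many services, so it would need to be optional.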