Cronjobs Monitoring - Your feedback needed #42283
Replies: 56 comments 147 replies
-
This is all experience from my previous job:
-
This is an exciting idea!
I run my crons through Heroku Scheduler + django-cron. They're all meant to be relatively short-lived (the main 'action' they take is to dump asynchronous work into worker queues rather than to do the work themselves), and cover a wide variety of uses:
The core 'cron runner' hooks into my workload database so I do get failed entries if something fails (and exceptions go to Sentry); beyond that, very little.
I'd complain about two things:
Code, rather than infra. (Or more accurately: "exogenous changes such as a third party provider", rather than infra.) And yes.
-
Thanks for this feature. I think the implementation could be better or easier without the CLI: a simple HEAD, GET, or POST request to a URL like this example would do.
-
Hey Eran, I tried this new feature today and I have some feedback:
Anyway, I'm very excited to see where this new feature is going. Great job!
-
Hello,
I tried to set up a couple of our cronjobs with Sentry Crons for a quick evaluation and, besides what was already written, have just one complaint: It can be tested with As others pointed out, it would be nice to send some accompanying data related to that cron run, either from the code with the Sentry SDK or in the format of
-
Ran into this issue while trying out the beta: Requests to 404s: PUT Same behavior for the POST endpoint to start a check.
-
It would be really nice to be able to specify an optional "Label" on a check-in. This could be used to do things like adding a date, a server identifier, etc.
-
Editing an existing monitor appears to be broken in the web UI at the moment. Changing settings and then clicking Save disables the button, but the form doesn't submit, there are JavaScript console errors, and the changes are not applied.
-
Is the I completed the first check-in after the second one had been missed. I would expect the status to still be
-
It might be nice to have a way to create and complete a check-in in one API call. My use-case: we have a nightly job that enqueues a bunch of other jobs. I don't need to monitor the duration, it is more of a "ping" operation. Right now I would have to do something like:
desc "Tasks that should run ~nightly"
task nightly_scheduler: :environment do
  check_id = Sentry.start_check("some-monitor-id")
  Hubspot::SyncMergedObjectsJob.enqueue_all!
  Segment::UpdateOrgGroupJob.perform_later
  Postmark::MaintenanceJob.perform_later
  Plans::TallyDaysAsCurrentJob.enqueue_all!
  Sentry.end_check("some-monitor-id", check_id)
end
Ideally I could just make one API call to "check in" that everything is fine.
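For illustration, a single-call "ping" could look roughly like the sketch below. It assumes the check-in endpoint accepts a POST that already carries a final status, using the same URL shape and DSN authorization header as the curl helper shared elsewhere in this thread. The function names `checkin_payload` and `sentry_ping` are hypothetical, the `sentry` organization slug is copied from that helper, and none of this is confirmed as the official API.

```shell
#!/usr/bin/env bash
# Sketch only: checkin_payload and sentry_ping are hypothetical helper names.

# Build the JSON body for a check-in with the given status and environment.
checkin_payload() {
  local status="$1" environment="${2:-production}"
  printf '{"status":"%s","environment":"%s"}' "$status" "$environment"
}

# Send one request that reports the run as finished ("ok") without a separate
# start call. Assumes SENTRY_DSN is set; the URL shape is an assumption taken
# from the helper script posted in this thread.
sentry_ping() {
  local monitor_slug="$1"
  curl -s -X POST \
    "${SENTRY_BASEURL:-https://sentry.io}/api/0/organizations/sentry/monitors/$monitor_slug/checkins/" \
    -H "Content-Type: application/json" \
    -H "Authorization: DSN $SENTRY_DSN" \
    -d "$(checkin_payload ok "${SENTRY_ENV:-production}")" > /dev/null
}
```

With something like this, the nightly task would enqueue its jobs and then call `sentry_ping some-monitor-id` once at the end.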
-
One weak spot in our Sentry setup is if our background worker queues get overloaded. I would like to periodically send a "tracer bullet" job through the queue as a way to measure how long a job is waiting in the queue before running. For example, if jobs are taking more than 10 minutes to be processed, that is a problem and we need to be alerted. Could you advise on whether you see this as an appropriate use-case for Crons? Or should I try to implement this via Sentry Performance? I could imagine creating a monitor with max runtime: 10 minutes and schedule type: every 3 hours.
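One way to picture the "tracer bullet" idea, independent of whether Crons or Performance ends up being the right home for it: the tracer job carries its enqueue timestamp, and when it finally runs it turns the measured wait into a check-in status. The sketch below is only an illustration; `queue_wait_seconds`, `tracer_status`, and `MAX_WAIT_SECONDS` are hypothetical names, and the 600-second default mirrors the 10-minute threshold mentioned above.

```shell
#!/usr/bin/env bash
# Sketch only: every name here is hypothetical, not part of any Sentry tooling.
MAX_WAIT_SECONDS="${MAX_WAIT_SECONDS:-600}"   # 10 minutes

# Seconds between the enqueue time and "now" (both Unix epoch seconds).
# The second argument is optional and defaults to the current time.
queue_wait_seconds() {
  local enqueued_at="$1" now="${2:-$(date +%s)}"
  echo $(( now - enqueued_at ))
}

# Map the measured wait to a check-in status string: "ok" if the job waited
# no longer than the threshold, "error" otherwise.
tracer_status() {
  local waited
  waited="$(queue_wait_seconds "$1" "${2:-}")"
  if [ "$waited" -le "$MAX_WAIT_SECONDS" ]; then
    echo "ok"
  else
    echo "error"
  fi
}
```

The tracer job would call `tracer_status "$ENQUEUED_AT"` when it starts running and report the resulting status as its check-in, so a backed-up queue shows up as a failed or missed check-in.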
-
Is there a way to navigate to the monitor from an issue created by one? Because if so, I didn't manage to find it. And if not, it'd be nice to have that in the future.
-
Hello, I just installed Sentry on-premise for the first time (self-hosted-23.10.0) and my logs are full of: docker/sentry-self-hosted_cron_1[861]: 15:59:39 [WARNING] sentry_sdk.errors: Intervals shorter than one minute are not supported by Sentry Crons. Monitor 'sync-options-control' has an interval of 10 seconds. Use the
Is there a doc to help me fix this?
-
It would be nice to have a read-only API so that we can show the up/down statuses of checks on an external platform!
-
It would be great to have an option to remove certain environments from a monitor slug.
-
We have multiple platforms with the same code, so different environments.
-
We have cron jobs running every minute, and we see some strange missed check-ins and timeouts when comparing with our logs. The jobs' execution time is short (1-2 sec). We have spike protection activated. I was wondering whether the check-ins could be rejected by the spike protection, since the check-ins are errors?
-
The public beta seemed to have started almost a year ago. Are there plans to mark it as production ready sometime soon? I'm not comfortable depending on a beta feature for monitoring things.
-
After using Cron Monitoring for the past couple of months, the thing I like about it is the simplicity of the setup and how well it works when it works. On the flip side, we receive a lot of "no-checkins" and timeouts even though the cron jobs ran just fine. It would be nice to see some attention on this issue, as I've seen it mentioned in a few places already, including this discussion thread.
-
We've discovered Cron Monitoring, and it is a very convenient way to track job execution (especially to catch missed executions).
I tried to use sentry-cli, but it wasn't handy because we use a self-hosted Sentry installation. I had to spend some time writing a bash script that uses plain curl to post statuses. Also, a job sometimes cannot simply be wrapped, as many scripts are coded not to throw errors. My script creates the monitor automatically if it doesn't exist, and it is handy to use different environments to avoid mixing everything together.
It would be cool to have a simple explanation of how to enrich the errors automatically created by Sentry by passing some string, uploading a short log, or something like that. I know there must be some way through envelopes and events, but I need to spend some time deciphering how to link it to a cron. I see in the UI there is an attachment column, but no documentation on how to use it. What if you added an additional body property that can contain a simple string, an event, or something like that?
Also, if you add spans, one day we would be able to have script execution checkpoints and measure the duration of each step, similar to what already exists in Performance. In the end you would have phases like blue "request", yellow "DB something", orange "something filesystem", etc.
Any help/suggestion would be appreciated. Thank you for adding this cron monitoring feature to Sentry.
CC: @rniv-cls
P.S. Sharing my helper bash in case someone sees any value:
# These parameters are read from the environment; the first three are required:
# - SENTRY_DSN (will be related to a project in sentry)
# - SENTRY_NAME (CRON monitor slug)
# - SENTRY_SCHEDULE
# - SENTRY_ENV (defaults to production)
# - SENTRY_TZ (defaults to America/Chicago)
# - SENTRY_BASEURL (defaults to sentry.io)
# If a monitor_slug is not registered to Sentry, it will add it automatically. Project must exist.
# Note: the checks are joined with || (logical OR); a single | would pipe them
# and only the last check would decide the result.
if [ -z "$SENTRY_DSN" ] || [ -z "$SENTRY_NAME" ] || [ -z "$SENTRY_SCHEDULE" ]; then
  echo "You need to set SENTRY_DSN, SENTRY_NAME (monitor slug) and SENTRY_SCHEDULE environment variables before start. Exiting...";
  exit 1;
fi
# Setting a default environment
SENTRY_ENV=${SENTRY_ENV:-production}
# Setting a default timezone
SENTRY_TZ=${SENTRY_TZ:-"America/Chicago"}
# Setting a default base url
SENTRY_BASEURL=${SENTRY_BASEURL:-https://sentry.io}
# Register functions
# Create the check-in (and the monitor, if missing) with status in_progress
function sentry_report_start(){
  curl -s \
    -X POST "$SENTRY_BASEURL/api/0/organizations/sentry/monitors/$SENTRY_NAME/checkins/" \
    -H "Content-Type: application/json" \
    -H "Authorization: DSN $SENTRY_DSN" \
    -d "{\"monitor_config\":{\"schedule\":{\"type\":\"crontab\",\"value\":\"$SENTRY_SCHEDULE\"},\"timezone\":\"$SENTRY_TZ\"},\"status\":\"in_progress\",\"environment\":\"$SENTRY_ENV\"}"
}
# Mark the current check-in as failed
function sentry_report_error(){
  curl -s \
    -X PUT "$SENTRY_BASEURL/api/0/organizations/sentry/monitors/$SENTRY_NAME/checkins/$checkin_id/" \
    -H "Content-Type: application/json" \
    -H "Authorization: DSN $SENTRY_DSN" \
    -d "{\"status\":\"error\",\"environment\":\"$SENTRY_ENV\"}" > /dev/null
}
# Mark the current check-in as successful
function sentry_report_success(){
  curl -s \
    -X PUT "$SENTRY_BASEURL/api/0/organizations/sentry/monitors/$SENTRY_NAME/checkins/$checkin_id/" \
    -H "Content-Type: application/json" \
    -H "Authorization: DSN $SENTRY_DSN" \
    -d "{\"status\":\"ok\",\"environment\":\"$SENTRY_ENV\"}" > /dev/null
}
# Update the current check-in without changing its status
function sentry_report_ping(){
  curl -s \
    -X PUT "$SENTRY_BASEURL/api/0/organizations/sentry/monitors/$SENTRY_NAME/checkins/$checkin_id/" \
    -H "Content-Type: application/json" \
    -H "Authorization: DSN $SENTRY_DSN" \
    -d "{\"environment\":\"$SENTRY_ENV\"}" > /dev/null
}
# Send a CRON start message and extract the check-in UUID from the response
checkin_id=$(sentry_report_start | grep -Eo '[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}')
To use it, you need to set variables and then call it:
SENTRY_DSN=...
SENTRY_ENV=...
...
# This is needed to handle relative path
source "$(dirname "$0")/sentry_helper.sh"
# This will report the start and then you just need to report success or error
#... do your code
# if success
sentry_report_success
# if error
sentry_report_error
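The "if success / if error" step above could be driven by the job's exit status. The sketch below is one possible way to do that, not part of the original helper: `run_with_report` is a hypothetical name, and it assumes `sentry_report_success` and `sentry_report_error` have already been defined by sourcing the helper script. As noted in the comment, it only helps for jobs that actually exit non-zero on failure.

```shell
#!/usr/bin/env bash
# Sketch only: run_with_report is a hypothetical wrapper. It assumes
# sentry_report_success and sentry_report_error are already defined
# (for example by sourcing the helper script).
run_with_report() {
  local status=0
  # Run the job; capture a non-zero exit code without aborting the script.
  "$@" || status=$?
  if [ "$status" -eq 0 ]; then
    sentry_report_success
  else
    sentry_report_error
  fi
  # Preserve the job's exit code for cron and for logging.
  return "$status"
}
```

A cron script would then source the helper and call, for example, `run_with_report /path/to/job.sh --some-flag` (a placeholder command) instead of reporting success or error by hand.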
-
Being able to specify which environments (in Laravel) the monitor will run on/check against is crucial. I have many E2E, review app, staging, and production environments. I only want and need to run the monitor actively on staging/production; however, this doesn't seem to be possible. It would be a great feature to have; until then I have to disable
-
Cronjobs Configuration:
Our cronjobs are diversified with various schedules like DailyAtTen, DailyAtNoon, etc. Each cron job triggers processors responsible for executing specific tasks determined by user settings. For instance, the DailyTasks CronJob manages a list of processes, calling each one sequentially. The initiated processor incorporates a method for retrieving the users it needs to handle. Once the processor completes the processing for all users, it proceeds to the next processor in line. The cron job is considered done only after all processors have completed their respective tasks.
Monitoring Tools:
We use a solution that records the cron start date, end date, and errors in the database. This allows us to retrieve the data for analysis, such as calculating the average time for a process and a cron task.
Monitoring Challenges:
While our current setup effectively executes cron jobs and processes, we acknowledge the challenges in monitoring. Specifically, there is difficulty in tracking processed users, understanding the status of tasks within the cron job, and distinguishing between manual and automatic triggers. To address these challenges, we are considering working on enhancements. One key improvement we are considering is the ability to cancel running processes. This enhancement aims to provide better control and visibility, allowing us to halt processes if needed and obtain more accurate insights into the progression of tasks.
Cronjob Failures and Warnings:
In addition to managing failures, our cron jobs have been enhanced to capture warnings. While the cron jobs themselves typically run successfully, our system is designed to detect scenarios where the process completes, yet certain users are skipped based on specific rules or settings. These warnings offer valuable insights into instances where the process might deviate from the expected execution. User-specific configurations or rules may lead to the exclusion of certain users during task execution. By capturing these warnings, we gain visibility into potential deviations from the intended processing flow. Our monitoring system is structured to effectively differentiate between failures and warnings. This capability allows us to proactively identify and address situations where the process completes but includes user-specific skips. This additional layer of information contributes to a more comprehensive understanding of cron job execution outcomes.
Priority of Fixing Failed Cronjobs:
Fixing failed cronjobs is not a high priority since cron jobs don't fail. However, addressing task failures for specific users is important, but the current setup lacks visibility into which process failed for each user.
Note: The current monitoring system is rather bare bones, and the existing dashboard, while functional, requires custom SQL queries for a detailed analysis of processor and cron results.
Integration Inquiry:
We are considering moving our cron monitoring to Sentry as a third-party solution to streamline management and enhance efficiency. In our current workflow, a single cron job manages multiple processes, and we are particularly interested in understanding if Sentry's tool supports a workflow where each individual process within the cron job schedule can be viewed separately. Our goal is to avoid creating a new Cronjob-Monitor for every processor, especially since processors may be reused across multiple cron jobs.
-
Hello, Thank you all for the reports on some instability with Crons recently. We found a bug in the Ruby, PHP and Go SDKs concerning Crons. If you set a sample_rate, it was incorrectly also applied to check-ins, meaning some check-ins were sampled out and never sent to Sentry. Thank you again for being part of this Beta!
-
Since last night, we've started getting
-
We're seeing issues with the change to Daylight Saving Time. Example: for a cron scheduled in the NY timezone, we see logging of jobs and their switch to DST, but are now getting alerts about missed jobs. It looks like the monitors do not recognize the switch to DST. Please advise.
-
I'm a bit confused about the pricing model of cronjobs. I'm unable to find any pricing for cron monitors but I can increase the budget for them. So how much do they cost each and why is there only a single one included in each plan by default?
-
Hi Folks!
Eran here, a member of Sentry’s product team. We are constantly thinking about ways to make your life easier, and one of the areas we are thinking about is Cronjobs or recurring tasks!
We would love to understand how you all use Cronjobs and what type of monitoring is currently done, if any. Some questions to get the conversation going -