-
Notifications
You must be signed in to change notification settings - Fork 11.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[5.6] Multi server scheduling cron support #22137
Conversation
This is a single special case. How about if I want to execute jobs every fifth minute, and server A started at 12:00, and server B at 12:01? |
Then Server B wont run the event - because the cron expression fails? We only reach this point if both servers pass the same cron expression. I've changed the original post to now read "cron expression" rather than "minute". edit: actually - I can think of a race condition where this might occur if Server B runs another long running process that takes more than 60s. I've got a solution, I'll amend the PR now. |
|
@sisve - ok, fixed. We use the time the event was created to do matching to crontab schedule and the mutex. This will prevent strange issues if one server runs a longer process first, and ensures a job can only ever run once per defined cron expression regards of other processes and timelines. |
Your description doesn't explain "why" or "how" this actually prevents the overlap across multiple servers. It just states that it "does". We need better explanation of the "how". |
@taylorotwell - sure, no dramas:
So let's say we have Server A and Server B both running
At Server B happens to be 3 microseconds faster than Server A, and creates a mutex Server A comes along, and tries to obtain a timestamp mutex. But it fails, because Server B already has it, so Server A skips that event and carries on the list. What is important here is that the timestamp mutex is never deleted, even when the event is complete, unlike the event mutex. i.e. when Server B is complete, it will delete the event mutex The reason we store the event creation time now in the constructor is to prevent race conditions that @sisve describes. So if Server A gets the event list, but runs another event for 3-4 minutes before coming along to this one - it is still evaluated against the time the cron expression passed - not the actual time. This is important to prevent strange race conditions that could occur, and makes the system robust to handle large scheduler lists. Does that make sense? Let me know if you need more info - happy to discuss further... |
Actually - another thought: Rather than use Instead we could make this default scheduler behaviour in 5.6, and add a new command This seperates the concerns, and leaves the other function of what Let me know if you want me to change the PR to be that? |
This is all sort of dependent on them using something like Redis or Memcached as their cache driver, correct? |
I do think a new method could be useful. In case you are fine with multiple servers running the same job but not overlapping locally. |
With this changes if Server A handle the command at This was the argument on the related issue. |
They need a "central" cache across all servers - yes. Ideally Redis/Memcache - although I think the Database driver would work ok too (might be issues around atomic locks?). They can also write their own Mutex driver and do it anyway they want (since the Mutex can be injected).
@taylorotwell - so can I clarify - do you want the PR to make this default behavior for 5.6 - and you can opt out with a function on your commands like My preference/idea is to make it default behavior for 5.6, and you 'opt out' as most people would want this as default I would guess? It means the scheduler is ready to go for load balancing, without needing to change code when the time comes.
@crynobone - the PR will be changing, so the So yes - this handles the issues discussed in the other thread. |
@laurencei so @crynobone is correct that server B would run the command again even though the 0:05 run is still running? I don't think that would be the expected behavior for me if I was going to use this feature. Imagine I have |
@taylorotwell - the scenario you have given and expect is exactly what will occur (once I update the PR with the new command you asked for) - yes. So we can do it like this:
What I am asking though, is if we make So instead for 5.6, perhaps we make the default behaviour to only run on one server, and you can add So what I am proposing is:
Let me know which way you want me to go on this - and I'll fix the PR. |
I would rather make it opt-in for now. Maybe later we can make it a default once it gains more traction and I play with it more. |
@laurencei do you have an ETA on finishing this out? |
@taylorotwell - today - I'm doing it now. |
Closing for a new PR as it's completely changed now. |
Simple solution to allowing scheduling to occur on multiple servers simulatenously. Issue recently came up here: #22129
Using the existing Mutex system, we create a second stamp relating to the "minute" of this cron.
So using
->withoutOverlapping()
ensures that the command is only run once per cron expression across any server in your farm, as well as ensuring it does not overlap with a command already running longer than a minute (regardless which server started it). Best of both worlds.Now multiple servers can run the scheduler without fear of the same command running twice.
p.s. Perhaps for
5.6
we could invert, and make->withoutOverlapping()
the default, and instead make developers use a new command->allowOverlapping()
if they want it to run simulatenously or on multiple servers?