Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[5.6] Multi server scheduling cron support #22216

Merged
merged 6 commits into from
Nov 30, 2017
Merged

[5.6] Multi server scheduling cron support #22216

merged 6 commits into from
Nov 30, 2017

Conversation

laurencei
Copy link
Contributor

@laurencei laurencei commented Nov 26, 2017

This is a new PR to replace #22137 - see there for some previous discussion - but many comments are outdated and no longer relevant as this PR approaches it with a entirely new concept

This PR is a solution to allow cron scheduling to occur on multiple servers simulatenously without fear of the same command running twice unintentionally. It was created after seeing the issues described in #22129

We expand on the existing mutex system to create a dedicated scheduling mutex as each server runs the command for each given cron expression. This is disabled by default. So anyone upgrading to 5.6 would not notice any changes without explicity enabling the new feature.

How it works:
Let's say we have Server A and Server B both running schedule:run each minute with the following command:

$schedule->command('run:report')->everyFiveMinutes();

At 00:05 both Server A and Server B will run schedule:run. Both servers run the event filters, and determine that the command run:report is due to run.

Server B happens to be 3 microseconds faster than Server A, and creates a mutex run_report_mutex_0005 (the scheduling mutex). Because Server B was able to obtain a scheduling mutex - it then goes ahead runs the report.

Server A comes along, and tries to obtain a scheduling mutex. But it fails, because Server B already has it, so Server A skips that event and carries on the list.

Example:

  • $schedule->enableMultiServerScheduling() - required to turn on the feature. Otherwise there is no change. Can be put at the start or end of the scheduling kernel function - it does not make a difference as long as its called somewhere there.
  • $schedule->command('something') - only runs once on a single server per cron expression across the server farm.
  • $schedule->command('something')->withoutOverlapping() - only runs once per cron expression, and will not start again if still running on a server somewhere.
  • $schedule->command('something')->runOnAllServers() - force it to run once on every server. Might be good for a command like disk:cleanup or something.
  • $schedule->command('something')->runOnAllServers()->withoutOverlapping() - this will throw a logic exception - it doesnt make sense to ask it to run on all servers while also saying you dont want it to run overlapping.

Breaking changes:
This would largely be a non-breaking change for most users and would require no change to any code to keep existing behaviour.

The only exception is if you had a custom Mutex class - the interface name has changed to EventMutex to make it more obvious what each Mutex does now. Technically we could leave the original Mutex name unchanged and make this an entirely non-breaking change - but it makes the class names less obvious what they do.

A GIF is a thousand words - showing two servers running one scheduler:

scheduler

@sisve
Copy link
Contributor

sisve commented Nov 26, 2017

I'm not sure about the naming of runOnAllServers(). It sounds like something you must use since you want your job to work on all the servers. It also sounds like something that states on which servers the code may allow.

Does it perhaps make sense to call it withOverlapping? Or something that talks about concurrency like allowConcurrentOnOtherServers? How long can we make the method name?

@laurencei
Copy link
Contributor Author

laurencei commented Nov 26, 2017

Using withOverlapping is not correct - because it makes it sound like the inverse of withoutOverlapping - and they are not the same thing at all.

What about forceRunOnAllServers - to make it more explicit?

I'll let @taylorotwell decide what he likes and I can change if needed...

@laurencei
Copy link
Contributor Author

what about runOnEveryServerOnce()

@morloderex
Copy link
Contributor

hm, Maybe the method name is to copled with the functionallity.
I'm thinking something like $schedule->command('some-commad')->withinCluster()

@taylorotwell
Copy link
Member

I don't understand why we need this enableMultiServerScheduling thing? Why not just allow people to do ->withoutOverlapping()->onOneServer(), etc.

@laurencei
Copy link
Contributor Author

laurencei commented Nov 28, 2017

I don't understand why we need this enableMultiServerScheduling thing? Why not just allow people to do ->withoutOverlapping()->onOneServer(), etc.

Realistically 99% of cron jobs would need ->onOneServer(). So if you have 25 crons scheduled - you would need to add it 25 times - which seems strange?

So now we just add one line that allows people to "opt in". Alternatively I could even literally do it as a "config" option instead perhaps? i.e. create a config/scheduler.php file - and place it in there?

Lastly - anyone using packages that schedules commands themselves would have been unable to use this feature without forking the package (or the package maintainer updating it). Not a massive issue, but makes it easier for people to use if they want...

@taylorotwell
Copy link
Member

I really only want to add one new public method: onOneServer. Yes, you have to opt-in every time.

@laurencei
Copy link
Contributor Author

laurencei commented Nov 29, 2017

@taylorotwell - ok - I've made the requested changes.

So now it's just

  • $schedule->command('something') - runs every time
  • $schedule->command('something')->onOneServer() - Only run on one server each expression
  • $schedule->command('something')->onOneServer()->withoutOverlapping() - prevent overlapping, and only runs once per cron expression
  • $schedule->command('something')->withoutOverlapping() - runs every time - but wont overlap. So you might get multi runs on the same cron expression, but never at the same time. Not recommended if running multiple servers.

@danielkleach
Copy link

I see you have this marked as 5.6, but this would be beneficial to have now if possible.

@laurencei
Copy link
Contributor Author

laurencei commented Nov 29, 2017

If you are injecting a custom event or scheduler this might break something unexpectedly - so I'd prefer to target 5.6

@taylorotwell taylorotwell merged commit a55edc6 into laravel:master Nov 30, 2017
@laurencei laurencei deleted the schedulingmutex branch November 30, 2017 17:03
@t202wes
Copy link

t202wes commented Mar 27, 2018

@laurencei when an application has automatic load balancing & scaling, a server could be shut down. When a server is shut down running the onOneServer() method, it obviously can't mark it as it's not being ran any more. The other servers then try to pick up the jobs, but it won't because it thinks another server, which doesn't exist any more is still running it. We just encountered this issue.

Will it automatically allow another server to pick it up after X amount of time?

Could we specify it like the withoutOverlapping($minutes) method?

I can never tell when we will be able to resume our jobs, and trying to go into the memcache database isn't ideal to debug.

@t202wes
Copy link

t202wes commented Mar 27, 2018

We need a way to clear the cache when it gets stuck, thinking another server is running the job when in fact, there is no more server running the job. Or setting a timeout so it'll restart again.

@laurencei
Copy link
Contributor Author

@t202wes - if your scheduled cron does not complete - there is no retry mechanism. That is more of a queue functionality.

Once the next schedule runs - it'll work on any server as normal.

@congkhuong
Copy link

@laurencei Do you have any tip/trick to resolve this issue on laravel 5.4?

@laurencei
Copy link
Contributor Author

@congkhuong - you could probably just use the same code and extend the Class's to override with the same mechanisms.

but to be honest - it's probably easier to just upgrade...

@mpyw
Copy link
Contributor

mpyw commented Jun 23, 2020

@pokono
Copy link

pokono commented Apr 15, 2021

This is actually creating me an issue at the moment.
I have a fleet of server that needs to run a disk clean up every x minutes, without overlapping (meaning I need to be sure that this clean up only runs one time per servers until it's done) but at the same time it should run on ALL servers.

With this change, as the servers share the same cache it will only run on one of them (the first to get the lock as far as I understand it).

Is there anything I can do beside removing without overlap?

@laurencei
Copy link
Contributor Author

@pokono - you just run

$schedule->command('something')->withoutOverlapping()

That will run it on all servers individually, but each server wont overlap itself.

@ascepanovic
Copy link

@laurencei is an order of chaining important ?

Do we expect same behaviour if we have:

$schedule->command('something')->everyFiveMinutes()->onOneServer()->withoutOverlapping()
and
$schedule->command('something')->everyFiveMinutes()->withoutOverlapping()->onOneServer()

@laurencei
Copy link
Contributor Author

@ascepanovic - that should give the same behavior, yes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants