Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple crons problem #22129

Closed
webmake opened this issue Nov 18, 2017 · 20 comments
Closed

Multiple crons problem #22129

webmake opened this issue Nov 18, 2017 · 20 comments

Comments

@webmake
Copy link
Contributor

webmake commented Nov 18, 2017

  • Laravel Version: 5.5.20
  • PHP Version: 7.1

Description:

After upgrading from 5.5.19 into 5.5.20 started exceptions related with crons. (very rare but happens, and I don't like that)

SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction (SQL: delete from `multi_server_event` where `mutex` = 74d07339684141a8f7edbeb61c835586 and `complete` < 2017-11-17 23:27:51 and `next` < 2017-11-17 23:29:00)

Together I use https://github.com/jdavidbakr/MultiServerEvent

CACHE_DRIVER=redis

I suspect that doesn't work https://github.com/laravel/framework/blob/5.5/src/Illuminate/Console/Scheduling/Event.php#L170 after changes in 5.5.20.
There was 9events in 2days so far (6 yesterday, 3 till now). Cron works every minute, I have 3crons:
1: every minute
2: every minute
3: every day at 6h o'clock
Duration for them is not so long, I guess up to 5seconds.
Also I have 2servers, so this error means, that same cron is executed when should not been executed.

Can it be Redis related somehow? Or is it changes in 5.5.20 broken functionality at very edge cases?

@Dylan-DPC-zz
Copy link

Hey

Can you provide us with more details? Some steps to reproduce?

@webmake
Copy link
Contributor Author

webmake commented Nov 18, 2017

Like I just said, it's rare problem 0.2% of events, can't understand what causes this problem. If during the day 6 failures out of 2881 (24h*60min * 2crons every minute +1daily cron), in compare all other previous versions then never happened, since 5.4 laravel.
I have noticed that only one interesting thing not mentioned in change log, can it be the cause?
image

@Kyslik
Copy link
Contributor

Kyslik commented Nov 18, 2017

Perhaps make an example repo that we can play with.

@webmake
Copy link
Contributor Author

webmake commented Nov 18, 2017

Could you please firstly explain how crons are protected from overlapping mechanizm works?

Because soon I will provide POC, that was not solved even in 5.4 laravel.

@webmake
Copy link
Contributor Author

webmake commented Nov 18, 2017

Okay, experiment proved for me that bug not related with redis mechanism at all, because has does not affected create as I thought..
I guess I have very strange issue, when inserted record in database won't appear for other php process. (I had such issue and I know) so my used library fails on that case.

Anyway, POC, that you both required:

<?php

namespace App\Console;

use Illuminate\Console\Command;
use Illuminate\Console\Scheduling\Schedule;
use Illuminate\Foundation\Console\Kernel as ConsoleKernel;

class Kernel extends ConsoleKernel
{
    protected $commands = [
        Command1::class,
        Command2::class,
        Command3::class,
    ];

    protected function schedule(Schedule $schedule): void
    {
        $schedule->command(Command1::class)
            ->everyMinute()
            ->withoutOverlapping();
        $schedule->command(Command2::class)
            ->everyMinute()
            ->withoutOverlapping();
        $schedule->command(Command3::class)
            ->everyMinute()
            ->withoutOverlapping();
    }

    protected function commands(): void
    {
        require base_path('routes/console.php');
    }
}

class Command1 extends Command
{
    protected $signature = 'command:first';

    public function handle()
    {
        sleep(3);
        logger($this->signature);
    }
}

class Command2 extends Command
{
    protected $signature = 'command:second';

    public function handle()
    {
        sleep(20);
        logger($this->signature);
    }
}

class Command3 extends Command
{
    protected $signature = 'command:third';

    public function handle()
    {
        sleep(4);
        logger($this->signature);
    }
}

crontab (which simulates 2servers)

* * * * * php /project/artisan schedule:run
* * * * * php /project/artisan schedule:run

result:

[2017-11-18 12:01:42] local.DEBUG: command:first
[2017-11-18 12:01:59] local.DEBUG: command:third
[2017-11-18 12:01:59] local.DEBUG: command:second
[2017-11-18 12:02:22] local.DEBUG: command:third
[2017-11-18 12:02:37] local.DEBUG: command:first
[2017-11-18 12:02:51] local.DEBUG: command:third
[2017-11-18 12:02:55] local.DEBUG: command:second
[2017-11-18 12:03:09] local.DEBUG: command:third
[2017-11-18 12:03:34] local.DEBUG: command:first
[2017-11-18 12:03:51] local.DEBUG: command:third

This is bug, because cron command itself doesn't know that it was executed already or not. Proven by two people with different approaches, but eventually got same results, that third command runs twice in the same minute.

@sisve
Copy link
Contributor

sisve commented Nov 18, 2017

Also I have 2 servers, so this error means, that same cron is executed when should not been executed.

There is a protection against overlapping scheduled jobs in Laravel, which you have to opt-in to, and is based on a shared cache iirc. That protection targets only overlapping jobs, not jobs that execute quickly and several times in one minute at different seconds.

@webmake
Copy link
Contributor Author

webmake commented Nov 18, 2017

Basically cron means, I want task to be executed once per period. This does not happen. There was a try, but not fully solved.

@crynobone
Copy link
Member

crynobone commented Nov 19, 2017 via email

@webmake
Copy link
Contributor Author

webmake commented Nov 19, 2017

Did you tried an example? It proofs that no matter how much instances you have (I only simulate it on vm with duplicated scheduler, for clearity, as it would be two instances of aws servers or smth like similar) , cron third on same minute is executed on both servers, despite the fact that it was already executed.

That is visual explanation
https://user-images.githubusercontent.com/5016464/27125161-0d185162-50fc-11e7-81c0-76bb30645637.png

@crynobone
Copy link
Member

I wouldn't reply here if I didn't have this setup on my own production code. We will be heavily affected if multiple instance running the same scheduled at the same triggered time.

Our example already proved that the cron is executed on each minutes based on the result. Which is expected since you declared it everyMinutes()->without overlapping()

@crynobone
Copy link
Member

As far as I understand it should prevent you from calling cron from multiple instance at the same time. Whether if Laravel intended to support running process cron or not will determine whether this is a bug or something Laravel never intends to support in the first place.

In our case we never want our cron to run at a very long time, it should create small job for each of it task so it can utilize queue. It much easier to fallback when a job has failed compared to a cron failing.

@sisve
Copy link
Contributor

sisve commented Nov 19, 2017

You're both right. The overlapping stuff works if all your machines execute the jobs at the same time.
However, it will not protect you against crontab jittering, slightly out-of-sync clocks and tears in the spacetime continuum.

Also, as the image shows, it will not help if there is a large scheduled task to execute. One of them gets to do it, and will then afterwards start executing the other jobs since the first server has already completed them.

This issue has been present for a long time. I wouldn't call it a bug, the implementation is not meant to be distributed onto several machines. A solution would be to have a dedicated machine that runs the scheduler and dispatches jobs handled by your queue workers.

@webmake
Copy link
Contributor Author

webmake commented Nov 19, 2017

Well, if there are still people who don't understand how wild this stuff can become
image
Everything depends on what sequence and duration are of them, and how much servers you have. Having 4crons and 4servers, in perfect world we could manage to run only 4times (if we could control duration of them to be very short).
But in samples above, you can see instead of expected executed 4crons, can be 9times or even worse 10 times. (1+2+3+4), that means, linear summation.
Don't you think that is bug?

@webmake
Copy link
Contributor Author

webmake commented Nov 19, 2017

And yes, @sisve , we were advised to dedicate 1 server only for crons, which means additional separated deployment, and so rised costs. Or either to use smart mechanism of queue managing (that crons won't be executed more than expected).
But after some experiments we found out that with some tweaks (knowing when cron must be executed next time) multiserver package could solve our problem.

Until now... Why, because for no reason we having deadlocks, I don't see any piece of ours code where starting transactions or dealing with them. So I think it is laravel bug how it deals with crons (despite parallelism problem mentioned above) that forbid from executing two at the same time

@sisve
Copy link
Contributor

sisve commented Nov 19, 2017

Don't you think that is bug?

I do not think this is a bug. The scheduler has never claimed to support the feature you're now describing. It would be an improvement if it worked in these scenarios, but it is currently not a bug in my eyes.

we were advised to dedicate 1 server only for crons, which means additional separated deployment, and so rised costs.

I'm not sure what that additional deployment would be. The only cost I see is the increase is for another machine, and both Digital Ocean and AWS has them for 5 USD per month. The actual application deployment would be identical on all your machines, only the configuration would be different. Only one would call php artisan schedule:run. There is an additional cost, but I do consider it nearly as high as you make it sound.

The multiserver package you link does almost exactly what Laravel does; except it has a few seconds delay after every job where it is still blocking. Even the description of the package states that it is focusing on overlapping jobs; "plugin to allow scheduled events across multiple servers with the same scheduler to not overlap". The distributed overlapping functionality that the linked package provides has been implemented in Laravel 5.4, except Laravel is using a shared cache instead of a database table.

The problem reported in this issue, regarding jdavidbakr/multi-server-event, is about database deadlocks. Report this to the package writer; it's their queries that are failing. Simultaneous access and state management in databases is hard. It's error-prone, it's database-dependent, and even a single index can bring havoc to the locking strategies. Consider moving off that package and use the built-in withoutOverlapping support provided by Laravel.

Neither Laravel, nor the package, provides support for the scenarios described in this issue. The package happens to have a 10 second delay between when a task ends and the mutex is removed, and that is in turn creating additional issues for that package.

In order to prevent a condition where a short-running command's lock doesn't last long enough, we are implementing a minimum 10 second break between the completion of the command and its next execution time, so if a command runs every minute but takes between 50 and 59 seconds to complete, the next command will be delayed one more minute.

Source: https://github.com/jdavidbakr/MultiServerEvent

@webmake
Copy link
Contributor Author

webmake commented Nov 19, 2017

Btw I forgot to mention, additional downside of dedicated single server for crons - would lead us to single point of failure.

FYI, currently package solves that problem completely, not even via 10s gap. (Sure I will consider to report as well about deadlocks for package owned, if will find out exact the locks problem).

So, if that's not considered to be a bug, then can such feature be implemented, to prevent from executing more than once in laravel itself?

@Kyslik
Copy link
Contributor

Kyslik commented Nov 19, 2017

I would move this to internals and just reference this issue from there. So no notifications for 900+ people are sent out.

@laurencei
Copy link
Contributor

laurencei commented Nov 19, 2017

As @Kyslik said - this is not a "bug". It is a feature limitation that has been present literally forever since Laravel 5.0.0

The original post describes a fault, but it is due to the use of a 3rd party library - so it's not directly related to the core.

Move to Internals to discuss. Or someone can do a PR for Taylor that provides true multi-server scheduling support. I've actually got a pending PR relating to re-writing part of the scheduler (#20403), I think I can add it there with some tweaks - I'll check it out.

@laurencei
Copy link
Contributor

I've done a PR for this feature. Turns out it seems really easy to add this feature: #22137

Lets move any further discussion there...

@themsaid
Copy link
Member

Closing this issue and moving the discussion to the PR: #22137

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants