
How many scheduled tasks were used for the benchmark? #209

Closed
gianielsevier opened this issue Jun 2, 2021 · 23 comments

@gianielsevier

Hi there,

I'm looking for an alternative to Quartz, and I think your solution could be the one.
Today we use Quartz heavily and can have over 14 million triggers in our DB. Quartz is not behaving well at this scale, and adding more instances to the cluster doesn't bring any benefit; the triggers are being delayed a lot.

I would like to know what the limits of db-scheduler are, and whether we can add more instances to scale with the growing number of scheduled tasks.

@kagkarlsson
Owner

Hi!

Could you describe a bit more what type of tasks you have? 14 million recurring tasks? How often are they running?

For the benchmark I created synthetic executions scheduled to run now(), maybe 2 million each time. But I don't think the number of executions in the table should affect performance that much, as long as it is indexed properly.
What kind of throughput do you require (executions/s) and what database are you using?

@kagkarlsson
Owner

Scaling depends a bit on the tasks as well. Up to the point where the database becomes the bottleneck you can increase throughput by adding instances. If the task does nothing database-related, tests indicate you should be able to reach 10k executions/s.

@gianielsevier
Author

Hi, @kagkarlsson sorry for my delayed reply.

Let me explain our use case.

We have different clients that can come to our application and create/update/delete a trigger to run at any time.
Our clients are different websites with millions of users interested in receiving recurring information, and they use our system to store it. All of our triggers are created dynamically, and we can have thousands running at the same time every second.

The number of triggers just keeps growing.

Please let me know if you have any other questions.

@kagkarlsson
Owner

The limiting factor will be the number of triggers running to completion per second. If these triggers/tasks take, say, 10s to run and there are 1000 running in parallel, that is approximately 100 completions/second (also referred to as executions/s).

If you have long-running tasks like that, you will likely first be limited by the size of the thread-pool. That can be increased both per instance (configurable) and by adding more instances.

If you reach a point where you need more than, say, 10,000 completions/s, you might need to use multiple databases and split the triggers/executions between them (i.e. sharding).
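The arithmetic above can be sketched as a quick back-of-the-envelope check, using the numbers from this comment (class and method names are just for illustration):

```java
// Back-of-the-envelope throughput estimate:
// completions/second ≈ executions running in parallel / average execution duration.
public class ThroughputEstimate {

    public static double completionsPerSecond(int parallelExecutions, double avgDurationSeconds) {
        return parallelExecutions / avgDurationSeconds;
    }

    public static void main(String[] args) {
        // 1000 executions in parallel, each taking ~10s => ~100 completions/s
        System.out.println(completionsPerSecond(1000, 10.0)); // 100.0
        // Past roughly 10,000 completions/s, sharding across databases becomes necessary.
    }
}
```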

How long does a typical trigger / execution / task run?

> create/update/delete a trigger to run any time

Are these one-time tasks, or recurring on a schedule? If recurring, what is the typical schedule?

> Our clients are different websites with millions of users interested in receiving recurrent information and for that, they use our system to save it

Is one trigger created per user?

@gianielsevier
Author

> How long does a typical trigger / execution / task run?

It should take a maximum of 1 second.

> Are these one-time tasks, or recurring on a schedule? If recurring, what is the typical schedule?

They are always recurring tasks.

> Is one trigger created per user?

It can be one or more per user.

@kagkarlsson
Owner

> Are these one-time tasks, or recurring on a schedule? If recurring, what is the typical schedule?
>
> They are always recurring tasks.

What is the schedule? Are they evenly spread in time, or are there peaks?

I still feel that I don't have the complete picture here. Currently, at what threshold of executions/s do you start to experience problems? And how far are you hoping to push that using db-scheduler? Keep in mind that the key metric here is executions/s.

@gianielsevier
Author

gianielsevier commented Jun 23, 2021

Hi @kagkarlsson, I've started the POC and I have a question.
I'm trying to use the Spring version with tasks created dynamically based on requests coming from a controller.
The tasks are being persisted to the database, but the column task_data is always null.
I'm also confused about how to handle the trigger when it's time to run it.

I've tried to follow the examples from here:
https://github.com/kagkarlsson/db-scheduler/blob/master/examples/features/src/main/java/com/github/kagkarlsson/examples/PersistentDynamicScheduleMain.java

This is the code I'm using to create the task:

Note.: scheduler is

```java
@Service
public class SchedulerService {

    private final ExecutionRunner executionRunner;
    private final CronTriggerBuilder cronTriggerBuilder;
    private final Scheduler scheduler;

    public SchedulerService(final ExecutionRunner executionRunner,
                            final CronTriggerBuilder cronTriggerBuilder,
                            final Scheduler scheduler) {
        this.executionRunner = executionRunner;
        this.cronTriggerBuilder = cronTriggerBuilder;
        this.scheduler = scheduler;
    }

    public void create(final DummyPojo pojo) {
        String idOne = pojo.getIdOne();
        String idTwo = pojo.getIdTwo();

        SerializableSchedule serializableSchedule =
                new SerializableSchedule(idOne, idTwo, cronTriggerBuilder.build(pojo));

        RecurringTask<SerializableSchedule> task =
                Tasks.recurring(UUID.randomUUID().toString(), serializableSchedule, SerializableSchedule.class)
                     .execute(executionRunner);

        Instant newNextExecutionTime =
                serializableSchedule.getNextExecutionTime(ExecutionComplete.simulatedSuccess(Instant.now()));

        TaskInstance<SerializableSchedule> instance = task.instance(idOne);

        scheduler.schedule(instance, newNextExecutionTime);
    }
}
```

This is the execution runner class:

```java
@Component
public class ExecutionRunner implements VoidExecutionHandler<SerializableSchedule> {

    private final SQSService sqsService;

    public ExecutionRunner(final SQSService sqsService) {
        this.sqsService = sqsService;
    }

    @Override
    public void execute(final TaskInstance<SerializableSchedule> taskInstance,
                        final ExecutionContext executionContext) {

        SerializableSchedule serializableSchedule = taskInstance.getData();

        if (serializableSchedule != null) {
            long scheduledTimeEpochSeconds =
                    executionContext.getExecution().executionTime.getEpochSecond();

            SQSMessage message = new SQSMessage();
            message.setIdOne(serializableSchedule.getIdOne());
            message.setIdTwo(serializableSchedule.getIdTwo());
            message.setRandomId(UUID.randomUUID().toString());
            message.setScheduledTimeEpochSeconds(scheduledTimeEpochSeconds);

            sqsService.send(message);
        }
    }
}
```

This is the SerializableSchedule class:

public class SerializableSchedule implements Serializable, Schedule {

    private final String idOne;

    private final String idTwo;

    private final String cronPattern;

    public SerializableSchedule(final String idOne, final String idTwo, final String cronPattern) {
        this.idOne = idOne;
        this.idTwo = idTwo;
        this.cronPattern = cronPattern;
    }

    @Override
    public Instant getNextExecutionTime(ExecutionComplete executionComplete) {
        return new CronSchedule(cronPattern).getNextExecutionTime(executionComplete);
    }

    @Override
    public boolean isDeterministic() {
        return true;
    }

    public String getIdOne() {
        return idOne;
    }

    public String getIdTwo() {
        return idTwo;
    }

    public String getCronPattern() {
        return cronPattern;
    }

    @Override
    public String toString() {
        return "SerializableCronSchedule pattern=" + cronPattern;
    }
}

@kagkarlsson
Owner

kagkarlsson commented Jun 24, 2021

```java
RecurringTask<SerializableSchedule> task =
        Tasks.recurring(UUID.randomUUID().toString(), serializableSchedule, SerializableSchedule.class)
             .execute(executionRunner);
```

You only do this once, at scheduler construction and startup. Inject a reference to the task into SchedulerService and create instances from that. You probably also want to use a CustomTask and disable scheduleOnStartup(...). A RecurringTask is automatically added when the scheduler starts if it does not already exist.
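A minimal wiring sketch of that advice, assuming the db-scheduler Spring Boot starter (which auto-registers `Task` beans) and the `SerializableSchedule`/`ExecutionRunner` classes from the earlier comments. Bean names are illustrative and the import paths are from memory; treat this as a sketch, not a verified implementation for a specific db-scheduler version:

```java
import java.time.Instant;

import com.github.kagkarlsson.scheduler.Scheduler;
import com.github.kagkarlsson.scheduler.task.ExecutionComplete;
import com.github.kagkarlsson.scheduler.task.helper.CustomTask;
import com.github.kagkarlsson.scheduler.task.helper.Tasks;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.stereotype.Service;

@Configuration
class TaskConfiguration {

    // Defined ONCE: a single CustomTask whose instances each carry their own schedule.
    // scheduleOnStartup(..) is deliberately left out, since instances are added at runtime.
    @Bean
    CustomTask<SerializableSchedule> dynamicRecurringTask(ExecutionRunner executionRunner) {
        return Tasks.custom("dynamic-recurring-task", SerializableSchedule.class)
            .execute((taskInstance, executionContext) -> {
                executionRunner.execute(taskInstance, executionContext);
                // Reschedule according to the schedule persisted with this instance
                return (executionComplete, executionOperations) ->
                    executionOperations.reschedule(
                        executionComplete,
                        taskInstance.getData().getNextExecutionTime(executionComplete));
            });
    }
}

@Service
class SchedulerService {

    private final CustomTask<SerializableSchedule> task;  // injected, never re-created per request
    private final Scheduler scheduler;

    SchedulerService(CustomTask<SerializableSchedule> task, Scheduler scheduler) {
        this.task = task;
        this.scheduler = scheduler;
    }

    void create(String id, SerializableSchedule schedule) {
        scheduler.schedule(
            task.instance(id, schedule),
            schedule.getNextExecutionTime(ExecutionComplete.simulatedSuccess(Instant.now())));
    }
}
```

The key point is that the Task object is created once as a bean, while `task.instance(id, data)` is cheap and can be called per request.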

@kagkarlsson
Owner

I have gotten a couple of other questions along these lines, which has made it clear that I need a better Spring Boot example for tasks with a dynamic schedule that are added at runtime.

@kagkarlsson
Owner

Also, for more robust serialization, you may want to consider setting a custom JsonSerializer (also something I need to add an example for).
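To make the serializer idea concrete, here is a self-contained round trip using plain JDK serialization. The `TaskDataSerializer` interface below is defined locally to mirror the general shape of a pluggable serializer and is NOT db-scheduler's actual type; in practice you would configure a JSON-based serializer (e.g. Jackson-backed) on the scheduler builder instead:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

public class SerializerSketch {

    // Locally-defined interface mirroring the shape of a pluggable task-data
    // serializer (hypothetical; not db-scheduler's actual type).
    public interface TaskDataSerializer {
        byte[] serialize(Object data);
        <T> T deserialize(Class<T> clazz, byte[] bytes);
    }

    // JDK-serialization implementation. A JSON-based implementation is more
    // robust across class changes, which is the point of the comment above.
    public static class JavaSerializer implements TaskDataSerializer {
        public byte[] serialize(Object data) {
            try (ByteArrayOutputStream bos = new ByteArrayOutputStream();
                 ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(data);
                oos.flush();
                return bos.toByteArray();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        public <T> T deserialize(Class<T> clazz, byte[] bytes) {
            try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
                return clazz.cast(ois.readObject());
            } catch (IOException | ClassNotFoundException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        TaskDataSerializer serializer = new JavaSerializer();
        byte[] bytes = serializer.serialize("0 0 * * *");
        System.out.println(serializer.deserialize(String.class, bytes)); // 0 0 * * *
    }
}
```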

@kagkarlsson
Owner

This is just setting up the implementation. I see that execute(..) is not the best choice of method name; maybe it should be called onExecute(...).

```java
final CustomTask<SerializableCronSchedule> task = Tasks.custom("dynamic-recurring-task", SerializableCronSchedule.class)
    .scheduleOnStartup(RecurringTask.INSTANCE, initialSchedule, initialSchedule)
    .onFailure((executionComplete, executionOperations) -> {
        final SerializableCronSchedule persistedSchedule =
                (SerializableCronSchedule) executionComplete.getExecution().taskInstance.getData();
        executionOperations.reschedule(executionComplete, persistedSchedule.getNextExecutionTime(executionComplete));
    })
    .execute((taskInstance, executionContext) -> {
        final SerializableCronSchedule persistentSchedule = taskInstance.getData();
        System.out.println("Ran using persistent schedule: " + persistentSchedule.getCronPattern());

        return (executionComplete, executionOperations) -> {
            executionOperations.reschedule(
                executionComplete,
                persistentSchedule.getNextExecutionTime(executionComplete)
            );
        };
    });
```

@gianielsevier
Author

Hey, @kagkarlsson many thanks for your help. 🙌
Now it is working as expected. We will prepare the tests and I'll give you an update.

@kagkarlsson
Owner

Np. Will be interesting to hear the results. Sounded like a very high-throughput use case.

@gianielsevier
Author

Hi @kagkarlsson,

I've finally managed to find time and come back with results.

The POC numbers:
We created 14 million custom recurring tasks.
The tasks were set up to run at roughly 2 million per day of the week, distributed across the 24 hours of the day.
We ran the application on K8s with 4 dedicated pods, each with 500 MB of memory and 0.5 CPU cores.
The database was PostgreSQL on an AWS db.m6g.large instance (8 GB of memory, 2 vCPUs); this instance also handles other applications, mainly using Quartz (this is our non-prod environment).

Application behaviour:
Saving the tasks:
We had an endpoint where a client can send a payload asking to save a scheduler (task), giving a day of the week and the time it should run (they are always recurring).

Running the tasks:
Once it was time to run a task, the app was triggered by the db-scheduler library, which collected the information about the task and sent a message to AWS SQS.

The aim of this POC was to check whether db-scheduler could handle millions of schedulers (tasks) without delaying their execution (the main issue we have with Quartz today).
We also wanted to make sure that db-scheduler could scale horizontally without overloading the DB and causing delays.
To check the delay, we basically took the current time minus the task's planned execution time and logged it. Our logs also print which pod did the job.

After tuning the configs below:
db-scheduler.threads
db-scheduler.polling-strategy-lower-limit-fraction-of-threads
db-scheduler.polling-strategy-upper-limit-fraction-of-threads

and adjusting the number of pods handling the 14 million tasks saved in our DB, we managed to eliminate the delays.

We kept the POC running for a month, and from our logs it was clear that db-scheduler ran across multiple pods, distributing the load equally among them, with no delays.
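The delay check described above (current time minus the planned execution time) can be sketched like this; in a real handler the planned time would come from the execution context:

```java
import java.time.Duration;
import java.time.Instant;

public class DelayCheck {

    // Delay = actual start time - planned execution time.
    public static long delayMillis(Instant plannedExecutionTime, Instant now) {
        return Duration.between(plannedExecutionTime, now).toMillis();
    }

    public static void main(String[] args) {
        Instant planned = Instant.parse("2021-07-01T12:00:00Z");
        Instant actual  = Instant.parse("2021-07-01T12:00:00.250Z");
        System.out.println(delayMillis(planned, actual) + " ms"); // 250 ms
    }
}
```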

We will start a new project soon to provide a scalable scheduler solution for our company and db-scheduler is the way to go.

Many thanks for your support @kagkarlsson and also for building this incredible solution.

@kagkarlsson
Owner

Good to hear! And just to let you know, I'm working on an improvement for your use case: many instances of the same recurring task with variable schedules: #257

@gianielsevier
Author

@kagkarlsson that's great, thanks for the feedback.
I was wondering if I could contribute to your repo by providing an example similar to the POC we did?

@kagkarlsson
Owner

Improved api released in 11.0.

I was wondering if I could contribute to your repo by providing an example similar to the POC we did?

I missed your comment here, sorry. If you have such code that you think might be valuable for people to see, how about pushing it to your own github-repo, and I can link from the README ? I can also add a link to this issue where you are describing your setup.

Also, if you are happy users, you are welcome to add your company to the list here:
https://github.com/kagkarlsson/db-scheduler#who-uses-db-scheduler
:)

@huynhnt

huynhnt commented Apr 11, 2023

I followed this guide and created the schedule this way, but I can't cancel the task in my Spring Boot project.

Can anyone help me?

```java
@PostMapping(path = "stop", headers = {"Content-type=application/json"})
public void stop(@RequestBody StartRequest request) {
    // TaskInstanceId.of(..) never returns null, so no null check is needed
    final TaskInstanceId scheduledExecution = TaskInstanceId.of("dynamic-recurring-task", RecurringTask.INSTANCE);
    System.out.println("TaskID: " + scheduledExecution.getId());
    schedulerClient.cancel(scheduledExecution);
}
```

@nj2208

nj2208 commented Nov 18, 2023


@gianielsevier

Thanks for providing a detailed explanation of your POC. We also have a similar use case. Would it be possible for you to share the example code you used in your POC?

Thanks in advance!

@nj2208

nj2208 commented Nov 21, 2023

@kagkarlsson Could you please share which example we can follow for a similar use case, to achieve very high throughput with short-running jobs that just post a message to a message broker?

@kagkarlsson
Owner

I think you will get the best throughput using PostgreSQL and .pollUsingLockAndFetch(1.0, 4.0) (thresholds tunable). Possibly increase the number of threads using .threads(xx) (or their spring-boot starter counterparts)
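As I read those fraction-based settings (my interpretation; exact semantics may vary between db-scheduler versions), `pollUsingLockAndFetch(lower, upper)` scales with the thread count: the scheduler fetches more work when fewer than `lower × threads` executions are queued, and fetches up to `upper × threads`. A quick illustration with hypothetical helper names:

```java
public class PollingThresholds {

    // Hypothetical helpers showing how the fractions scale with the thread count
    // (interpretation of the lower/upper-limit-fraction-of-threads settings).
    public static int lowerLimit(int threads, double lowerFraction) {
        return (int) (threads * lowerFraction);
    }

    public static int upperLimit(int threads, double upperFraction) {
        return (int) (threads * upperFraction);
    }

    public static void main(String[] args) {
        int threads = 20;
        // pollUsingLockAndFetch(1.0, 4.0): start fetching below 20 queued, fill up to 80
        System.out.println(lowerLimit(threads, 1.0)); // 20
        System.out.println(upperLimit(threads, 4.0)); // 80
    }
}
```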

@nj2208

nj2208 commented Nov 23, 2023

> I think you will get the best throughput using PostgreSQL and .pollUsingLockAndFetch(1.0, 4.0) (thresholds tunable). Possibly increase the number of threads using .threads(xx) (or their spring-boot starter counterparts)

Thanks a lot. Will use these settings in our PoC.

@kagkarlsson
Owner

Also make sure you have the necessary indices.
