Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fields beind solved sequentially in the same VirtualThread #1051

Closed
gpinzegher opened this issue Aug 30, 2024 · 12 comments
Closed

Fields beind solved sequentially in the same VirtualThread #1051

gpinzegher opened this issue Aug 30, 2024 · 12 comments
Assignees
Labels
status: invalid An issue that we don't feel is valid

Comments

@gpinzegher
Copy link

gpinzegher commented Aug 30, 2024

Hello,
I am migrating my GraphQL project to use spring-graphql and I noticed that my Controllers are called sequentially instead of parallel.
All the code I will paste here is a similar code, but with the exact same coding structure:

schema.graphqls

type Query {
    books(bookIds: Int): [Book]
}

type Book {
    bookId: Int
    name: String
    pageCount: Int
    author: Author _#this field could be resolved in parallel with qtySold_
    qtySold: Int _#this field could be resolved in parallel with author_
}

type Author {
    authorId: Int
    firstName: String
    lastName: String
}

BookController.java

@Controller
public class BookController {
    private final BookService service;

    public BookController(BookService service) {
        this.service = service;
    }

    @QueryMapping
    public CompletableFuture<List<Book>> books(@Argument List<Integer> bookIds) {
        return CompletableFuture.supplyAsync(() -> service.getBooksByIds(bookIds));
    }
}

BookService.java

@Service
public class BookService {
    private final BookRepository repository;

    public BookService(BookRepository repository) {
        this.repository = repository;
    }

    public List<Book> getBooksByIds(List<Integer> bookIds) {
        return repository.findAllByIdIn(bookIds);
    }
}

AuthorController.java

@Controller
public class AuthorController {
    private final AuthorService service;

    public AuthorController(AuthorService service, BatchLoaderRegistry batchLoaderRegistry) {
        this.service = service;

        batchLoaderRegistry
                .forTypePair(Book.class, Author.class)
                .registerMappedBatchLoader(
                        (books, env) -> {
                            return Mono.just(service.getAuthors(books));
                        }
                );
    }

    @SchemaMapping(typeName = "Book", field = "author")
    public CompletableFuture<Author> books(Book book, DataLoader<Book, Author> dataLoader) {
        return dataLoader.load(book);
    }
}

AuthorService.java

@Service
public class AuthorService {
    private final AuthorRepository repository;

    public AuthorService(AuthorRepository repository) {
        this.repository = repository;
    }

    public Map<Book, Author> getAuthors(Set<Book> books) {
        List<Author> authors = repository.findAllByAuthorIdIn(books.stream().map(Book::authorId).collect(Collectors.toList()));
        return books.stream().collect(Collectors.toMap(book -> book, book -> getAuthorForBook(book, authors)));
    }

    private Author getAuthorForBook(Book book, List<Author> authors) {
        return authors.stream().filter(author -> author.id() == book.authorId()).findFirst().orElse(null);
    }
}

SellingController.java

@Controller
public class SellingController {
    private final SellingService service;

    public SellingController(SellingService service, BatchLoaderRegistry batchLoaderRegistry) {
        this.service = service;

        batchLoaderRegistry
                .forTypePair(Book.class, Integer.class)
                .registerMappedBatchLoader(
                        (books, env) -> {
                            return Mono.just(service.getSellings(books));
                        }
                );
    }

    @SchemaMapping(typeName = "Book", field = "qtySold")
    public CompletableFuture<Integer> qtySold(Book book, DataLoader<Book, Integer> dataLoader) {
        return dataLoader.load(book);
    }
}

SellingService.java

@Service
public class SellingService {
    private final SellingRepository repository;

    public SellingService(SellingRepository repository) {
        this.repository = repository;
    }

    public Map<Book, Integer> getSellings(Set<Book> books) {
        List<Selling> sellings = repository.findAllByBookIdIn(books.stream().map(Book::bookId).collect(Collectors.toList()));
        return books.stream().collect(Collectors.toMap(book -> book, book -> getSellingQtyForBook(book, sellings).qtySold()));
    }

    private Selling getSellingQtyForBook(Book book, List<Selling> sellings) {
        return sellings.stream().filter(selling -> selling.bookId() == book.bookId()).findFirst().orElse(new Selling(book.bookId(), 0));
   

On my understanding as soon as I have all Books solved GraphQL would trigger to retrieve both author and qtySold to be executed in the same time in different Threads as they rely only on Book's.
What I noticed is that both author and qtySold Controllers/Services are triggered sequentially on the same VirtualThread.

I also tried using an AsyncConfiguration (setting Core and Max PoolSize and also QueueCapacity) but I always get the same.

Just remember: this is a translated sample from the project I'm developing. The entire entities structure is way bigger and when I migrated to spring-grapql from graphql-java-kickstart one query that took from 3 to 5 seconds now is taking from 15 to 20+ seconds.

I am using

  • Java 21
  • SpringBoot 2.7.18
  • Spring Framework 5.5.8
@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged label Aug 30, 2024
@bclozel
Copy link
Member

bclozel commented Aug 30, 2024

Can you share a minimal sample so we can have a look? Have you tried recording traces with the observability instrumentation that we provide - a trace is usually helpful to visualize the execution of a request.

Additionally, OSS support ended for Spring Boot 2.7.x so please use Spring Boot 3.2.x+ in your sample, if you can reproduce the issue. Thanks!

@bclozel bclozel added the status: waiting-for-feedback We need additional information before we can continue label Aug 30, 2024
@bclozel
Copy link
Member

bclozel commented Aug 30, 2024

This is also probably relevant to your use case: #958

@gpinzegher
Copy link
Author

Hi @bclozel,
Thanks for your answer.
I got this simple project added to GitHub:
https://github.com/gpinzegher/graphql

This happens even with Spring Boot 3.2.9.

I'm checking on how to use observability instrumentation and the case #958 you mentioned.
As soon I have more details I will share it here.

Thanks

@spring-projects-issues spring-projects-issues added status: feedback-provided Feedback has been provided and removed status: waiting-for-feedback We need additional information before we can continue labels Sep 2, 2024
@bclozel bclozel self-assigned this Sep 4, 2024
@bclozel
Copy link
Member

bclozel commented Sep 4, 2024

Hello @gpinzegher

I forked your sample and created 3 branches demonstrating the behaviors.
It's using tracing and the zipkin UI to show actual traces.

Synchronous execution, non-batched

https://github.com/bclozel/graphql/tree/base

Data fetching is performed synchronously and we obviously have the N+1 problem here.

image

2024-09-04T17:06:03.183+02:00  WARN 75520 --- [demo] [nio-8080-exec-1] [66d8775bca5b425200e87761cfca3e60-28a30e7986f345a5] c.demo.project.demo.service.BookService  : [THREAD] AuthorService.getAuthors -> Current Thread -> Name: [http-nio-8080-exec-1] Id: [78] Priority: [5] State: [RUNNABLE] ThreadGroup: [java.lang.ThreadGroup[name=main,maxpri=10]] 
2024-09-04T17:06:04.219+02:00  WARN 75520 --- [demo] [nio-8080-exec-1] [66d8775bca5b425200e87761cfca3e60-600e6b4c8bc70a79] c.d.project.demo.service.AuthorService   : [THREAD] AuthorService.getAuthor -> Current Thread -> Name: [http-nio-8080-exec-1] Id: [78] Priority: [5] State: [RUNNABLE] ThreadGroup: [java.lang.ThreadGroup[name=main,maxpri=10]] 
2024-09-04T17:06:05.240+02:00  WARN 75520 --- [demo] [nio-8080-exec-1] [66d8775bca5b425200e87761cfca3e60-bd1497b5c1fb761c] c.d.project.demo.service.AuthorService   : [THREAD] AuthorService.getAuthor -> Current Thread -> Name: [http-nio-8080-exec-1] Id: [78] Priority: [5] State: [RUNNABLE] ThreadGroup: [java.lang.ThreadGroup[name=main,maxpri=10]] 
2024-09-04T17:06:06.246+02:00  WARN 75520 --- [demo] [nio-8080-exec-1] [66d8775bca5b425200e87761cfca3e60-239dc97790d43e62] c.d.project.demo.service.AuthorService   : [THREAD] AuthorService.getAuthor -> Current Thread -> Name: [http-nio-8080-exec-1] Id: [78] Priority: [5] State: [RUNNABLE] ThreadGroup: [java.lang.ThreadGroup[name=main,maxpri=10]] 
2024-09-04T17:06:07.254+02:00  WARN 75520 --- [demo] [nio-8080-exec-1] [66d8775bca5b425200e87761cfca3e60-1cb1ac9166264a80] c.d.project.demo.service.AuthorService   : [THREAD] AuthorService.getAuthor -> Current Thread -> Name: [http-nio-8080-exec-1] Id: [78] Priority: [5] State: [RUNNABLE] ThreadGroup: [java.lang.ThreadGroup[name=main,maxpri=10]] 
2024-09-04T17:06:08.261+02:00  WARN 75520 --- [demo] [nio-8080-exec-1] [66d8775bca5b425200e87761cfca3e60-d6790b5dd70d4a47] c.d.project.demo.service.AuthorService   : [THREAD] AuthorService.getAuthor -> Current Thread -> Name: [http-nio-8080-exec-1] Id: [78] Priority: [5] State: [RUNNABLE] ThreadGroup: [java.lang.ThreadGroup[name=main,maxpri=10]]

Same thing with spring.threads.virtual.enabled=true - the trace looks the same, but blocking work is offloaded to virtual threads automatically instead of a thread pool.

2024-09-04T17:08:09.201+02:00  WARN 76281 --- [demo] [omcat-handler-0] [66d877d907cac8d14379a59f71172543-a21ab12d1073a434] c.demo.project.demo.service.BookService  : [THREAD] AuthorService.getAuthors -> Current Thread -> Name: [tomcat-handler-0] Id: [91] Priority: [5] State: [RUNNABLE] ThreadGroup: [java.lang.ThreadGroup[name=VirtualThreads,maxpri=10]] 
2024-09-04T17:08:10.242+02:00  WARN 76281 --- [demo] [omcat-handler-0] [66d877d907cac8d14379a59f71172543-b71e7352d2ff229c] c.d.project.demo.service.AuthorService   : [THREAD] AuthorService.getAuthor -> Current Thread -> Name: [tomcat-handler-0] Id: [91] Priority: [5] State: [RUNNABLE] ThreadGroup: [java.lang.ThreadGroup[name=VirtualThreads,maxpri=10]] 
2024-09-04T17:08:11.251+02:00  WARN 76281 --- [demo] [omcat-handler-0] [66d877d907cac8d14379a59f71172543-622376486eaf5bab] c.d.project.demo.service.AuthorService   : [THREAD] AuthorService.getAuthor -> Current Thread -> Name: [tomcat-handler-0] Id: [91] Priority: [5] State: [RUNNABLE] ThreadGroup: [java.lang.ThreadGroup[name=VirtualThreads,maxpri=10]] 
2024-09-04T17:08:12.257+02:00  WARN 76281 --- [demo] [omcat-handler-0] [66d877d907cac8d14379a59f71172543-1a8acf5b86cee5dd] c.d.project.demo.service.AuthorService   : [THREAD] AuthorService.getAuthor -> Current Thread -> Name: [tomcat-handler-0] Id: [91] Priority: [5] State: [RUNNABLE] ThreadGroup: [java.lang.ThreadGroup[name=VirtualThreads,maxpri=10]] 
2024-09-04T17:08:13.266+02:00  WARN 76281 --- [demo] [omcat-handler-0] [66d877d907cac8d14379a59f71172543-b6d62f70e349cabc] c.d.project.demo.service.AuthorService   : [THREAD] AuthorService.getAuthor -> Current Thread -> Name: [tomcat-handler-0] Id: [91] Priority: [5] State: [RUNNABLE] ThreadGroup: [java.lang.ThreadGroup[name=VirtualThreads,maxpri=10]] 
2024-09-04T17:08:14.273+02:00  WARN 76281 --- [demo] [omcat-handler-0] [66d877d907cac8d14379a59f71172543-a7f05a91a89c94c5] c.d.project.demo.service.AuthorService   : [THREAD] AuthorService.getAuthor -> Current Thread -> Name: [tomcat-handler-0] Id: [91] Priority: [5] State: [RUNNABLE] ThreadGroup: [java.lang.ThreadGroup[name=VirtualThreads,maxpri=10]]

Asynchronous execution, non-batched

https://github.com/bclozel/graphql/tree/async

Data fetching operations are performed in parallel.

image

2024-09-04T17:03:26.165+02:00  WARN 74510 --- [demo] [nio-8080-exec-1] [66d876be4cc41125128ff45fe51dce61-57e5e77e28865653] c.demo.project.demo.service.BookService  : [THREAD] AuthorService.getAuthors -> Current Thread -> Name: [http-nio-8080-exec-1] Id: [81] Priority: [5] State: [RUNNABLE] ThreadGroup: [java.lang.ThreadGroup[name=main,maxpri=10]] 
2024-09-04T17:03:27.182+02:00  WARN 74510 --- [demo] [onPool-worker-1] [                                                 ] c.d.project.demo.service.AuthorService   : [THREAD] AuthorService.getAuthor -> Current Thread -> Name: [ForkJoinPool.commonPool-worker-1] Id: [106] Priority: [5] State: [RUNNABLE] ThreadGroup: [java.lang.ThreadGroup[name=main,maxpri=10]] 
2024-09-04T17:03:27.183+02:00  WARN 74510 --- [demo] [onPool-worker-2] [                                                 ] c.d.project.demo.service.AuthorService   : [THREAD] AuthorService.getAuthor -> Current Thread -> Name: [ForkJoinPool.commonPool-worker-2] Id: [107] Priority: [5] State: [RUNNABLE] ThreadGroup: [java.lang.ThreadGroup[name=main,maxpri=10]] 
2024-09-04T17:03:27.183+02:00  WARN 74510 --- [demo] [onPool-worker-3] [                                                 ] c.d.project.demo.service.AuthorService   : [THREAD] AuthorService.getAuthor -> Current Thread -> Name: [ForkJoinPool.commonPool-worker-3] Id: [108] Priority: [5] State: [RUNNABLE] ThreadGroup: [java.lang.ThreadGroup[name=main,maxpri=10]] 
2024-09-04T17:03:27.183+02:00  WARN 74510 --- [demo] [onPool-worker-4] [                                                 ] c.d.project.demo.service.AuthorService   : [THREAD] AuthorService.getAuthor -> Current Thread -> Name: [ForkJoinPool.commonPool-worker-4] Id: [109] Priority: [5] State: [RUNNABLE] ThreadGroup: [java.lang.ThreadGroup[name=main,maxpri=10]] 
2024-09-04T17:03:27.184+02:00  WARN 74510 --- [demo] [onPool-worker-5] [                                                 ] c.d.project.demo.service.AuthorService   : [THREAD] AuthorService.getAuthor -> Current Thread -> Name: [ForkJoinPool.commonPool-worker-5] Id: [110] Priority: [5] State: [RUNNABLE] ThreadGroup: [java.lang.ThreadGroup[name=main,maxpri=10]]

Asynchronous execution, with batch loading

https://github.com/bclozel/graphql/tree/batch

The trace is still showing multiple data fetching operations executed in parallel. We would like to make batch loading more obvious from a trace perspective. See #1034

image

2024-09-04T17:20:20.370+02:00  WARN 78504 --- [demo] [nio-8080-exec-1] [66d87ab4d14f3eff60e9175a95a8c4ad-1086090ec75813a2] c.demo.project.demo.service.BookService  : [THREAD] AuthorService.getAuthors -> Current Thread -> Name: [http-nio-8080-exec-1] Id: [77] Priority: [5] State: [RUNNABLE] ThreadGroup: [java.lang.ThreadGroup[name=main,maxpri=10]] 
2024-09-04T17:20:22.431+02:00  WARN 78504 --- [demo] [     parallel-1] [                                                 ] c.d.project.demo.service.AuthorService   : [THREAD] AuthorService.getAuthors -> Current Thread -> Name: [parallel-1] Id: [102] Priority: [5] State: [RUNNABLE] ThreadGroup: [java.lang.ThreadGroup[name=main,maxpri=10]]

If you enable automatic context propagation with spring.reactor.context-propagation=auto, the traceId is also propagated for asynchronous calls:

2024-09-04T17:22:12.361+02:00  WARN 78695 --- [demo] [nio-8080-exec-1] [66d87b24a5a7bf4771f3460858b98680-3df7b35f86455b26] c.demo.project.demo.service.BookService  : [THREAD] AuthorService.getAuthors -> Current Thread -> Name: [http-nio-8080-exec-1] Id: [77] Priority: [5] State: [RUNNABLE] ThreadGroup: [java.lang.ThreadGroup[name=main,maxpri=10]] 
2024-09-04T17:22:14.415+02:00  WARN 78695 --- [demo] [     parallel-1] [66d87b24a5a7bf4771f3460858b98680-c52898b55067314a] c.d.project.demo.service.AuthorService   : [THREAD] AuthorService.getAuthors -> Current Thread -> Name: [parallel-1] Id: [102] Priority: [5] State: [RUNNABLE] ThreadGroup: [java.lang.ThreadGroup[name=main,maxpri=10]] 

I believe all those behaviors are consistent with the setup. Please let us know what you think is missing.

@bclozel bclozel added status: waiting-for-feedback We need additional information before we can continue and removed status: feedback-provided Feedback has been provided labels Sep 4, 2024
@rstoyanchev
Copy link
Contributor

rstoyanchev commented Sep 5, 2024

The Zipkink UI illustrates the behavior nicely.

One thing to note is the BookController has to return before others are called because you need books before you know what authors or selling quantities to get. However, authors and selling can be done in parallel, and that's the case. I'm not sure what you are migrating from, or what the behavior was, but it can't really happen in any other way.

@gpinzegher
Copy link
Author

Hi @bclozel,

Thanks a lot for your time. I was checking your changes and I saw that async option is resolved one by one (that means several database connections).

Is there any option for only Author and Selling controllers to be executed asynchronous (in parallel) at once?
What I mean is, BookController is resolved and I have a list of Books, then Author and Selling controllers could be executed in the same time with the list of books as an input. This way I will have only 2 connections to the database instead of several connections for each service.

I will expand a bit my Book:

type Book {  **BookController/Service being called first (1 second to finish) **
    bookId: Int
    name: String
    pageCount: Int
    author: Author **AuthorController/Service (1 second to finish) **
    qtySold: Int  **SellingController/Service (1 second to finish) **
    ratings: [Rating] **RatingController/Service (1 second to finish) **
    publisher: Publisher **PublisherController/Service (1 second to finish) **
} 

With the return of BookControler (List) I could trigger Author, Selling, Rating and Publisher controllers to be executed in parallel in a batch (for each one) using the list of Books as an input. Doing that in a batch I can bind the database 1 time per controller (in this case 4 times no matter how many books I want to retrieve) and doing it in parallel I would take 1 second to finish BookController and close to 1 second to finish the other 4 controller if executed in parallel.

Is there any option to achieve this behavior?

@spring-projects-issues spring-projects-issues added status: feedback-provided Feedback has been provided and removed status: waiting-for-feedback We need additional information before we can continue labels Sep 10, 2024
@rstoyanchev
Copy link
Contributor

@gpinzegher isn't that what the batch branch does?

@rstoyanchev rstoyanchev added status: waiting-for-feedback We need additional information before we can continue and removed status: feedback-provided Feedback has been provided labels Sep 11, 2024
@spring-projects-issues
Copy link
Collaborator

If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 7 days this issue will be closed.

@spring-projects-issues spring-projects-issues added the status: feedback-reminder We've sent a reminder that we need additional information before we can continue label Sep 18, 2024
@gpinzegher
Copy link
Author

Hi @rstoyanchev,

As far as I understand using @BatchMapping the following entities "D" would be resolved separately:
image

Using the BatchLoaderRegistry I could load all the entities "C" and resolve "D" in one time with only 1 database connection.
Using the @BatchMapping I would have 2 connection to the database.

@spring-projects-issues spring-projects-issues added status: feedback-provided Feedback has been provided and removed status: waiting-for-feedback We need additional information before we can continue status: feedback-reminder We've sent a reminder that we need additional information before we can continue labels Sep 19, 2024
@bclozel
Copy link
Member

bclozel commented Sep 20, 2024

@gpinzegher why would they? Can you show any evidence of this behavior?

@bclozel bclozel added status: waiting-for-feedback We need additional information before we can continue and removed status: feedback-provided Feedback has been provided labels Sep 20, 2024
@gpinzegher
Copy link
Author

Hi @bclozel,
I just saw that using BatchMapping is resolving the fields together as mentioned.

My concern about fields being resolved in parallel or sequentially still stands.

In my project I need that fields are resolved in parallel when its possible. I created this new branch with an extra field to demonstrate the behavior:
https://github.com/gpinzegher/graphql/tree/batchMapping

Fields author, rating and selling could be resolved at the same time as soon BookController has its results.

From the the following branch (asynchronous with batch loading) the AuthorController is using SchemaMapping that means several calls to my database, and that's what I want to avoid. Books are resolved in a batch but authors is not.

From the logs here you can see that's executed one after the other and not in parallel.

Is asynchronous execution only available when using SchemaMapping?

@spring-projects-issues spring-projects-issues added status: feedback-provided Feedback has been provided and removed status: waiting-for-feedback We need additional information before we can continue labels Sep 24, 2024
@rstoyanchev
Copy link
Contributor

The sample is on 1.2.8, but parallel execution on VT's is a 1.3 feature, see #958. When I switch to Boot 3.3.4, it works as expected. Below 1.3.x, controller methods can return Callable to be executed asynchronously. For example:

@BatchMapping(typeName = "Book", field = "qtySold")
public Callable<Map<Book, Integer>> qtySold(Set<Book> books) {
    return () -> service.getSellings(books);
}

@rstoyanchev rstoyanchev closed this as not planned Won't fix, can't repro, duplicate, stale Sep 26, 2024
@rstoyanchev rstoyanchev added status: invalid An issue that we don't feel is valid and removed status: feedback-provided Feedback has been provided status: waiting-for-triage An issue we've not yet triaged labels Sep 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: invalid An issue that we don't feel is valid
Projects
None yet
Development

No branches or pull requests

4 participants