Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gatsby-source-graphql: n+1 queries to remote GraphQL APIs, and other concerns #13425

Closed
jasonbahl opened this issue Apr 17, 2019 · 14 comments · Fixed by #22347
Closed

gatsby-source-graphql: n+1 queries to remote GraphQL APIs, and other concerns #13425

jasonbahl opened this issue Apr 17, 2019 · 14 comments · Fixed by #22347
Labels
topic: GraphQL Related to Gatsby's GraphQL layer

Comments

@jasonbahl
Copy link
Contributor

jasonbahl commented Apr 17, 2019

In working on the initial phase of gatsbywpthemes.com, using WPGraphQL to pull data from WordPress into Gatsby, we initially used gatsby-source-graphql and stitched in the WPGraphQL Schema.

Then the createPages query queried all the pages needed from WordPress asking for just the ID, passed that through then each page used a pageQuery to query the data needed to produce that page.

One of the big benefits of using WPGraphQL is the reduction of n+1 queries when fetching data from WP, but by using this method, it was CREATING a new n+1 problem.

If we had 1000 items in WP, we could execute 10 fetches for 100 items each, then the pageQuery would fetch once per page again, leading to a total of 1010 queries and slower builds.

Our solution was to change to fetching the GraphQL API directly up front and passing data through as context. This allows us to make just the 10 up front requests, not 1,010 requests.

However, readability becomes a concern because the pageQuery isn't coupled with the component anymore.

I tried to solve that by having a fragment at the page level and importing the fragment into the query that runs at build time, but I was unable to acheive that.

I have a Spectrum thread about this here: https://t.co/qeHM10MMHc

Related Twitter thread here: https://twitter.com/wpgraphql/status/1118570506750496772

Also, related

Another issue with gatsby-source-graphql (or at least my understanding of how to best use it) is that a stitched Query isn't meaninful to dev tools on the live API.

You can't just copy a stitched query and execute it in GraphiQL against the live server. Stitched queries only have meaning in the Gatsby build GraphQL server, so if you're building a Gatsby site that also needs to pull data from a live WPGraphQL (or other live GraphQL) endpoint, you can't re-use queries across the build and dynamic fetching, you can't use the stitched queries in dev tooling on the live API, etc.

Turns into a headache, in my experience.

Ideally, I'd like to be able to re-use fragments from build on the client, but since Types are stitched, the build fragments won't work in live queries on the client.

For example, the build query might look like:

query GET_PAGES {
  wpgraphql {
      pages {
         nodes {
            id
            title
            parent {
                __typename 
                ...on WPGraphQL_Page {
                  id
                  title
                }
            }
         }
      }
  }
}

You can't just use that on your wordpress-site.com/graphql endpoint because of the stitching. The live endpoint wouldn't understand it. Instead, it would expect the following query:

query GET_PAGES {
  pages {
     nodes {
        id
        title
        parent {
           __typename
           ...on Page {
              id
              title
           }
        }  
     }
  }
}

Subtle differences, but Page vs WordPress_Page, and { pages{ ...} } vs. { wpgraphql { pages { ... } } } make it impossible to re-use fragments across client and build.

@jasonbahl jasonbahl changed the title gatsby-source-graphql n+1 queries to remote GraphQL APIs and other concerns gatsby-source-graphql: n+1 queries to remote GraphQL APIs, and other concerns Apr 17, 2019
@KyleAMathews
Copy link
Contributor

/cc @freiksenet

@freiksenet
Copy link
Contributor

That is an issue I want to tackle. I'm not 100% sure on the approach, but first steps would be to add some kind of query combiner so that multiple queries can be combined in one query for the backend. It can be supplemented with a controllable cache. It would be possible to pre-populate it by running eg a list query beforehand. I'll look at it as soon as #13028 lands.

@jasonbahl
Copy link
Contributor Author

Ya, one thing Apollo Client does is provide cache redirects. So you can query a list of items, then if you run a { post( id: "..." ) { ... } } query, you can tell it to first look for that ID in the cache instead of making another network request

Takes some manual mapping, but might be something along those lines that could be done.

@jasonbahl
Copy link
Contributor Author

I still think if there's a way to share fragments between template files and the build queries, we could just do something like:

pageFragment, similar to pageQuery, then during the build it can make use of that fragment in the query that gets the list of items.

Seems to me to solve the readability issue and the n+1 issue. . .I just don't know node enough to know how to do this, or if it's even possible.

Seemed like exports from a React file couldn't be imported to a Node file, and not sure how to work around that.

@gatsbot gatsbot bot added the stale? Issue that may be closed soon due to the original author not responding any more. label May 15, 2019
@gatsbot
Copy link

gatsbot bot commented May 15, 2019

Hiya!

This issue has gone quiet. Spooky quiet. 👻

We get a lot of issues, so we currently close issues after 30 days of inactivity. It’s been at least 20 days since the last update here.

If we missed this issue or if you want to keep it open, please reply here. You can also add the label "not stale" to keep this issue open!

As a friendly reminder: the best way to see this issue, or any other, fixed is to open a Pull Request. Check out gatsby.dev/contributefor more information about opening PRs, triaging issues, and contributing!

Thanks for being a part of the Gatsby community! 💪💜

@jasonbahl
Copy link
Contributor Author

I've got a lot of folks that are running into issues using WPGraphQL with Gatsby, so I want to make sure this doesn't get closed because it's stale, but I'm not sure the best path forward.

Probably would be good to get some folks together and chat through some ideas.

I've talked with @pieh a bit and @lightstrike but we don't have any concrete plans for the best way forward here.

I think ultimately WPGraphQL is a better fit for Gatsby than the REST API largely because the GraphQL Schema is defined closer to the source and can better expose the possibilities of the WordPress site, where the REST API is so inconsistent and wrapping an inconsistent REST API with GraphQL in the Gatsby layer leads to an unpredictable, buggy GraphQL API.

Additionally, WPGraphQL will allow at some point for subscriptions to be implemented and that could be huge for various things, such as live previews, etc.

Let's not let this go stale, even if we don't have a concrete plan yet. 😄

@pieh pieh added not stale and removed stale? Issue that may be closed soon due to the original author not responding any more. labels May 15, 2019
@freiksenet
Copy link
Contributor

@jasonbahl This is definitely on my short-term roadmap.

@jonlow
Copy link
Contributor

jonlow commented Aug 11, 2019

👍

We have a WP site with over 4000 posts, so building all of the post pages and paginated blog/category pages results in over 5000 page queries. Needless to say the build is extremely slow (20 - 30 minutes).

@jasonbahl Is decoupling the data and passing it through context the only way to speed this up at the moment? Thanks for all your hard work! 💯

@lannonbr lannonbr added the topic: GraphQL Related to Gatsby's GraphQL layer label Aug 14, 2019
@TylerBarnes
Copy link
Contributor

For anyone here that hasn't seen it yet, this is being worked on for WordPress in #19292
I think this issue is still relevant though since it's not specific to WordPress

@yaacovCR
Copy link
Contributor

At least for n+1 issue, you should be able to use existing createLink option and use an Apollo Link that supports batching, ie https://github.com/prisma-labs/http-link-dataloader

@vladar
Copy link
Contributor

vladar commented Mar 25, 2020

For those interested in this: we implemented query batching to address the N+1 problem. Query batching must be enabled explicitly so make sure to read updated plugin documentation on how to do this.

The feature is published in gatsby@2.20.5 and gatsby-source-graphql@2.3.0

@yaacovCR
Copy link
Contributor

Awesome!

@vladar
Copy link
Contributor

vladar commented Mar 26, 2020

@yaacovCR Thanks again for all the suggestions and links you provided for this!

@yaacovCR
Copy link
Contributor

@vladar, Are you able to break out your batching link into a separate npm package? I think others users of schema stitching may find it useful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic: GraphQL Related to Gatsby's GraphQL layer
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants