Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(gatsby): cache parsing and validation results in graphql-runner #20477

Merged
merged 2 commits into from
Jan 17, 2020

Conversation

vladar
Copy link
Contributor

@vladar vladar commented Jan 8, 2020

Description

This PR introduces a cache for GraphQL parsing and validation steps in the graphql-runner, which
improves the performance of query running for about 18% on our synthetic benchmarks/query site.

It is inspired by an experiment of @pieh with graphql-jit. We saw positive numbers from the
graphql-jit branch, but it turns out most benefits came from avoiding re-parsing and re-validating
the same query.

Here are some numbers on my laptop (queries per second, higher is better):

Node 10.17.0:

graphql-js graphql-jit
without caching 460 (baseline) 422 (-8.2%)
with caching 543 (+18%) 561 (+22%)

Node 13.1.0 (surprisingly slower in general in this benchmark):

graphql-js graphql-jit
without caching 433 (baseline) 381 (-12%)
with caching 510 (+17.8%) 514 (+18.7%)

For each test, I ran the build 7 times using NUM_PAGES=20000 gatsby build, ignored the first two
results (warm-up), and averaged the remaining 5. The number in the table is from the "run queries"
steps (queries per second).

Typical scenario

Say you have 1000 pages with the same template component. Then the graphql-runner executes
GraphQL query of this template 1000 times (with different variables).

But GraphQL query execution involves 3 steps:

  1. Parsing GraphQL query string to AST
  2. Validation of this AST document against GraphQL schema
  3. Execution of the AST document (using different variables, context, etc.)

Those steps occur inside the graphql function call (which is a facade for those steps).

Before this PR graphql-runner was using this graphql facade and for the example above
it would have run each step 1000 times (for the same query).

In this PR it runs parsing and validation only once per query and still runs execute
1000 times (with different variables/context).

In general, it should improve the performance of sites executing the same query
many times during the build.

Thoughts about graphql-jit

Without cache, graphql-jit has additional overhead for query compilation. But it performs great
when the query is cached.

In the real-world projects though execution time itself is small comparing to GraphQL resolvers
time (which we see even in this benchmark) but we can get back to graphql-jit when we need
to squeeze the last millisecond from query running.

The project is also in a too early stage, but we should definitely keep an eye on it
(it is very promising).

@vladar vladar requested a review from a team as a code owner January 8, 2020 12:14
@vladar vladar self-assigned this Jan 9, 2020
Copy link
Contributor

@freiksenet freiksenet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I think it's good to merge after some testing.

@vladar
Copy link
Contributor Author

vladar commented Jan 17, 2020

I tried it on several sites and it seems to be working OK. The biggest impact noticed for https://github.com/tsriram/ifsc/ (after our other recent tweaks):

Run queries step:
Before: 618.91 q/s
After: 1003.08 q/s

Which is almost 60%+ speedup (for this specific step)

Copy link
Contributor

@sidharthachatterjee sidharthachatterjee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! 🙌🏼

@sidharthachatterjee sidharthachatterjee added the bot: merge on green Gatsbot will merge these PRs automatically when all tests passes label Jan 17, 2020
@gatsbybot gatsbybot merged commit ac7c79f into master Jan 17, 2020
@delete-merged-branch delete-merged-branch bot deleted the vladar/perf-graphql-runner branch January 17, 2020 16:13
@t2ca
Copy link
Contributor

t2ca commented Jan 18, 2020

Hello,

Im not sure how to explain this but since upgrading Gatsby to v2.18.25, I've noticed that when i scroll down to the bottom of the page, there is like a 1px line at the very bottom. You can reproduce this on the gatsbyjs.org website if you scroll to the bottom of the page and change the background color of the footer to a dark color inside dev tools.

@pvdz
Copy link
Contributor

pvdz commented Jan 20, 2020

@t2ca Could you file a new issue about this, please? If you think your issue is related to this PR please link it. To me it sounds like a completely different issue.

@t2ca
Copy link
Contributor

t2ca commented Jan 20, 2020

#20728

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bot: merge on green Gatsbot will merge these PRs automatically when all tests passes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants