perf(gatsby): cache parsing and validation results in graphql-runner #20477

vladar · 2020-01-08T12:14:16Z

Description

This PR introduces a cache for the GraphQL parsing and validation steps in the graphql-runner, which
improves query-running performance by about 18% on our synthetic benchmarks/query site.

It is inspired by @pieh's experiment with graphql-jit. We saw positive numbers from the
graphql-jit branch, but it turned out that most of the benefit came from avoiding re-parsing and
re-validating the same query.

Here are some numbers on my laptop (queries per second, higher is better):

Node 10.17.0:

| | graphql-js | graphql-jit |
|---|---|---|
| without caching | 460 (baseline) | 422 (-8.2%) |
| with caching | 543 (+18%) | 561 (+22%) |

Node 13.1.0 (surprisingly slower in general in this benchmark):

| | graphql-js | graphql-jit |
|---|---|---|
| without caching | 433 (baseline) | 381 (-12%) |
| with caching | 510 (+17.8%) | 514 (+18.7%) |

For each test, I ran the build 7 times using NUM_PAGES=20000 gatsby build, ignored the first two
results (warm-up), and averaged the remaining 5. The numbers in the tables come from the
"run queries" step (queries per second).
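
As a rough illustration of the arithmetic behind these figures, here is a minimal sketch. The helper names (`averageAfterWarmup`, `percentChange`) and the per-run numbers are made up for the example; only the averages 460 and 543 come from the Node 10.17.0 table above.

```javascript
// Hypothetical helper: average the runs that remain after discarding warm-up runs.
function averageAfterWarmup(runs, warmupCount) {
  const measured = runs.slice(warmupCount);
  if (measured.length === 0) {
    throw new Error("no runs left after discarding warm-up");
  }
  const sum = measured.reduce((total, value) => total + value, 0);
  return sum / measured.length;
}

// Relative change versus a baseline, in percent.
function percentChange(baseline, value) {
  return ((value - baseline) / baseline) * 100;
}

// Made-up per-run q/s values, for illustration only (not measurements from this PR).
const exampleRuns = [300, 320, 540, 545, 543, 542, 545];
const average = averageAfterWarmup(exampleRuns, 2); // mean of the last five runs

// Averages reported in the Node 10.17.0 table (graphql-js, without vs. with caching).
const change = percentChange(460, 543);
console.log(average, change.toFixed(1) + "%");
```

The same `percentChange` call can be used to check the other cells of both tables against their baselines.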

Typical scenario

Say you have 1000 pages that use the same template component. The graphql-runner then executes
the GraphQL query of this template 1000 times (with different variables).

But GraphQL query execution involves 3 steps:

1. Parsing GraphQL query string to AST
2. Validation of this AST document against GraphQL schema
3. Execution of the AST document (using different variables, context, etc.)

Those steps occur inside the graphql function call (which is a facade for those steps).
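
The three steps can be sketched roughly as follows. This is a simplified illustration, not the graphql-js source: `runQueryInThreeSteps` and the `steps` object are hypothetical names, and the step functions are passed in so the sketch runs without any library. Their shapes are modeled on the `parse`, `validate` and `execute` functions exported by graphql-js.

```javascript
// Sketch of what a graphql-style facade does for a single call.
// `steps` is a hypothetical object holding the three step functions.
function runQueryInThreeSteps(steps, schema, source, variables) {
  // 1. Parse the query string into an AST document.
  let document;
  try {
    document = steps.parse(source);
  } catch (syntaxError) {
    return { errors: [syntaxError] };
  }

  // 2. Validate the document against the schema; the result is a list of errors.
  const validationErrors = steps.validate(schema, document);
  if (validationErrors.length > 0) {
    return { errors: validationErrors };
  }

  // 3. Execute the validated document with this call's variables.
  return steps.execute({ schema, document, variableValues: variables });
}

// Stand-in step functions (assumptions for the demo, not a real GraphQL engine).
const standInSteps = {
  parse(source) {
    if (source.trim() === "") throw new Error("Syntax Error: empty query");
    return { kind: "Document", source };
  },
  validate() {
    return [];
  },
  execute({ variableValues }) {
    return { data: { id: variableValues.id } };
  },
};

const pageQuery = "query ($id: ID) { page(id: $id) { id } }";
const result = runQueryInThreeSteps(standInSteps, {}, pageQuery, { id: "42" });
const failed = runQueryInThreeSteps(standInSteps, {}, "", {});
console.log(result, failed.errors.length);
```

Note that steps 1 and 2 depend only on the query string and the schema, while step 3 is the only one that sees the variables.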

Before this PR, graphql-runner used this graphql facade, so for the example above
it would have run each step 1000 times (for the same query).

With this PR, it runs parsing and validation only once per query and still runs execute
1000 times (with different variables/context).
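
A minimal sketch of the idea, assuming a cache keyed by the query string and one runner per schema. This is not the code from this PR: `createCachingRunner`, `steps` and the counting stand-ins are hypothetical and only demonstrate how often each step runs.

```javascript
// Hypothetical caching runner: parse + validate once per query string,
// execute on every call. A new runner (and cache) is assumed per schema.
function createCachingRunner(steps, schema) {
  // query string -> { document, errors } produced by parse + validate
  const cache = new Map();

  return function run(source, variables) {
    let entry = cache.get(source);
    if (entry === undefined) {
      const document = steps.parse(source);
      const errors = steps.validate(schema, document);
      entry = { document, errors };
      cache.set(source, entry);
    }
    if (entry.errors.length > 0) {
      return { errors: entry.errors };
    }
    return steps.execute({
      schema,
      document: entry.document,
      variableValues: variables,
    });
  };
}

// Stand-in step functions that only count how often they are called.
const calls = { parse: 0, validate: 0, execute: 0 };
const countingSteps = {
  parse(source) {
    calls.parse += 1;
    return { kind: "Document", source };
  },
  validate() {
    calls.validate += 1;
    return [];
  },
  execute({ variableValues }) {
    calls.execute += 1;
    return { data: variableValues };
  },
};

// 1000 pages sharing one template query, each with its own variables.
const run = createCachingRunner(countingSteps, {});
const templateQuery = "query ($id: ID) { page(id: $id) { title } }";
for (let page = 0; page < 1000; page += 1) {
  run(templateQuery, { id: String(page) });
}
console.log(calls);
```

Because validation results depend on the schema, a cache like this has to be dropped whenever the schema changes; in the sketch that means creating a new runner.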

In general, it should improve the performance of sites executing the same query
many times during the build.

Thoughts about graphql-jit

Without cache, graphql-jit has additional overhead for query compilation. But it performs great
when the query is cached.

In real-world projects, though, execution time itself is small compared to the time spent in
GraphQL resolvers (which we see even in this benchmark), but we can get back to graphql-jit when
we need to squeeze the last millisecond out of query running.

The project is also still at too early a stage, but we should definitely keep an eye on it
(it is very promising).

packages/gatsby/src/query/graphql-runner.js

freiksenet

LGTM, I think it's good to merge after some testing.

vladar · 2020-01-17T16:05:44Z

I tried it on several sites and it seems to be working OK. The biggest impact I noticed was for https://github.com/tsriram/ifsc/ (after our other recent tweaks):

Run queries step:
Before: 618.91 q/s
After: 1003.08 q/s

That is a speedup of roughly 62% (for this specific step).

sidharthachatterjee

Great work! 🙌🏼

t2ca · 2020-01-18T21:42:57Z

Hello,

I'm not sure how to explain this, but since upgrading Gatsby to v2.18.25, I've noticed that when I scroll down to the bottom of the page, there is something like a 1px line at the very bottom. You can reproduce this on the gatsbyjs.org website if you scroll to the bottom of the page and change the background color of the footer to a dark color in dev tools.

pvdz · 2020-01-20T08:53:41Z

@t2ca Could you file a new issue about this, please? If you think your issue is related to this PR please link it. To me it sounds like a completely different issue.

t2ca · 2020-01-20T21:06:28Z

#20728

vladar added 2 commits January 8, 2020 18:40

perf(gatsby): cache parsing and validation results in graphql-runner

0c68b93

Fix broken test

6e6a6c9

vladar requested a review from a team as a code owner January 8, 2020 12:14

vladar added the status: WIP label Jan 8, 2020

vladar self-assigned this Jan 9, 2020

freiksenet reviewed Jan 10, 2020

packages/gatsby/src/query/graphql-runner.js

freiksenet reviewed Jan 10, 2020

vladar removed the status: WIP label Jan 17, 2020

sidharthachatterjee approved these changes Jan 17, 2020

sidharthachatterjee added the bot: merge on green label Jan 17, 2020

gatsbybot merged commit ac7c79f into master Jan 17, 2020

delete-merged-branch bot deleted the vladar/perf-graphql-runner branch January 17, 2020 16:13

pvdz mentioned this pull request Jan 17, 2020

Is there a hard limit on maximum number of pages that Gatsby can build? #20338

Closed

vladar mentioned this pull request Jan 17, 2020

refactor(gatsby): Avoid re-parsing query by storing AST after extraction #20677

Closed

ascorbic mentioned this pull request Jan 23, 2020

Explore page build optimisations #20785

Closed

pvdz mentioned this pull request Jan 28, 2020

[Request] Real-world Gatsby sites (50k+ pages) #19512

Closed
