Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Core schema support and Typescript query planner redux #622

Merged
merged 51 commits into from
Mar 31, 2021

Conversation

martijnwalraven
Copy link
Contributor

This PR re-introduces the TypeScript query planner, and adds support for core schema as input. This builds on the work @trevor-scheer has done in #588 to have composition output core schema instead of the previous incarnation of CSDL.

We've verified the output of the query planner through a large number of tests and feel confident these changes shouldn't affect the generated query plans.

@apollo/query-planner currently exposes the same API as before, but we'll change this in a follow-up PR to avoid repeated printing/parsing of schemas and queries.

trevor-scheer and others added 30 commits March 19, 2021 10:12
This re-introduces the TypeScript query planner code as of apollographql/apollo-server@60d9742, which is one of the parents of the commit that deleted that code from the `apollo-server` repository: apollographql/apollo-server@149f78e.

I wanted to start out with this to then re-apply the changes I've made in an experimental repository in a more tractable manner.

On top of that, this commit adds a `composedSchema` directory, which contains code that is used to build a schema with Federation metadata from CSDL (this will be replaced with code that takes a core schema file with join directives as input). That was needed to get things wired up with the current gateway code.

 We temporarily export the same API we used for the wasm query planner from `@apollo/query-planner`, but implemented as a facade on top of the TypeScript one. This is ugly and inefficient (we shouldn't be parsing the query again), but the goal is to get things working first without making changes to the gateway code.
The original code used `getFederationMetadata` from `@apollo/federation`, which is a private utility function that works based on the internal metadata added to the `GraphQLSchema` that's being built up during composition.

Since the query planner has its own `buildComposedSchema` now, which uses CSDL as input and builds its own `GraphQLSchema`, we need to switch to its utility functions instead.

(We expose separate `getFederationMetadataForType` and `getFederationMetadataForField` functions because there is no need to support both with the same function and that avoids runtime checks.)
The corresponding test in `@apollo/gateway` (`single-service.test.ts`) uses a simple value type, not a union. To make sure we come back to this, I've left the union type in there and introduced an additional failing test.
This replicates the behavior of the current Rust query planner, because I wanted to get the tests passing before making further changes. But the type explosion fix this depends on is fundamentally flawed and needs to be replaced.
This code has more problems and should be replaced by proper recursive selection set merging, but removing the unnecessary distinction between aliased fields and non-aliased fields at least fixes the test.
The disabled test covers code that was added to the Rust query planner to fix #177, but that logic is problematic and we should have a more principled discussion about propagation of executable directives first.
The order of the generated fragments seems to differ between the Rust query planner and the TypeScript one. Since this shouldn't change the semantics of the query, updating the snapshots to make the tests pass.
The generated core schema was invalid because the `@join__type` directive definition should be repeatable and have `key` as its argument (instead of `requires` and `provides`). We would also generate a `@join__field` directive without a graph name in case `serviceName` was undefined, which would lead to a syntax error when parsing.

This commit also renames `@http` to `@join__endpoint` and includes a directive definition for it (which was missing before).
The goal here is to get the tests passing first, while making as few changes to the gateway as possible. That means the way the gateway generates composed schemas and service lists performs redundant work and should be cleaned up later.
We should rethink the way we handle value types in composition. Currently, a type is only seen as a value type if it is defined in more than one service. Those aren't the right semantics because a type that is only defined in one service can still be a value type if it doesn't have a `@key`. That wasn't an issue before, but with the join spec that means we end up generating `@join__field` directives on fields of what should be value types.

A particularly thorny issue is the use of `@provides` on fields of value types, which if allowed, should only apply to the subgraph that includes it. We don't currently keep track of `@requires`/`@provides` per subgraph however, plus `join__field` is defined as non repeatable, so there is no easy way to generate the correct metadata.
The generated core schema was invalid because the `@join__type` directive definition should be repeatable and have `key` as its argument (instead of `requires` and `provides`). We would also generate a `@join__field` directive without a graph name in case `serviceName` was undefined, which would lead to a syntax error when parsing.

This commit also renames `@http` to `@join__endpoint` and includes a directive definition for it (which was missing before).
To make the schema returned from `buildComposedSchema` reflect the API schema, we filter out core and join elements. The approach is naive and doesn't follow the Core Schema spec, but it should work for the core SDL currently printed from `printComposedSchema` in `@apollo/federation`.
…ut `graph`

Allowing `@join__field` without `graph` gives us the same metadata previously computed from CSDL input, and should keep the query planner behavior the same as before.

In the future, we should probably change composition to keep track of `@provides` on a per subgraph basis, and reflect that in the core schema output by making `@join__field` repeatable and `graph` non-nullable.
We should change the way we detect value types. If a type is defined in only one service, we currently don't consider it a value type even if it doesn't specify any keys.

That resulted in us printing `@join__field`, with the name of the single service that defined the type, even though we didn't add a corresponding `join__type` to the parent type (because that relies on there being keys defined). That is invalid according to the join spec, and also broke the current workaround for avoiding type explosion on value types (which we should replace, but are leaving in for now to first replicate the existing behavior).
This was always the recommended style, but now that `GraphQLNonNull` and `GraphQLList` have been converted to proper classes in graphql/graphql-js#2906, calling them as functions no longer works.
trevor-scheer and others added 2 commits March 30, 2021 14:04
Remove all notions of CSDL in code. Replaced with "Supergraph SDL" or
its appropriately camelCased counterpart.
query-planner-js/src/composedSchema/buildComposedSchema.ts Outdated Show resolved Hide resolved
export type FieldSet = readonly (FieldNode | InlineFragmentNode)[];

export interface Endpoint {
serviceName: string;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid the "serviceName" terminology (which shows up a bunch in this file)?

export type ServiceName = string;
export type FieldSet = readonly (FieldNode | InlineFragmentNode)[];

export interface Endpoint {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to Graph (or Subgraph)?

query-planner-js/src/composedSchema/buildComposedSchema.ts Outdated Show resolved Hide resolved
);

for (const typeDirectiveArgs of typeDirectivesArgs) {
const serviceName = graphMap[typeDirectiveArgs['graph']].serviceName;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line also assumes the graph is in graphMap


const keyFields = parseFieldSet(typeDirectiveArgs['key']);

typeMetadata.keys?.add(serviceName, keyFields);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separate question: Is it OK for a type to have @join__type declared multiple times for the same graph? Here it looks like you want it to be OK (use of MultiMap). I do see "It is an error to @join an object against the same subgraph multiple times." in the current spec draft but is that something we are changing? I guess it looks like there can be multiple from the printSupergraphSdl implementation, so I guess this is a spec issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that should be allowed according to the Federation model, but it turns out we have a composition validation rule against it (which was put in there because the query planner currently only uses the first key). So this is something we explicitly need to change both in the spec and in composition.


const typeMetadata: FederationTypeMetadata = ownerDirectiveArgs
? {
serviceName: graphMap[ownerDirectiveArgs?.['graph']].serviceName,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code assumes that the graph arg value is found in graphMap, which seems reasonable because it's an enum value and we've already set up graphMap with all enum values. But this relies on (a) the assumption that the @join__owner directive is declare with graph: join__Graph!, which we have not validated, and (b) the assumption that graphql-js validates directive argument types which... um, I think maybe getArgumentValues does even if schema parsing does not.

OK but (a) still matters: let's make sure we get a better error message if the graph isn't in graphMap!

Part of this is a problem with the type

export type GraphMap = { [graphName: string]: Endpoint };

This type means "no matter what string you put in, you'll get an Endpoint out", which is not true. The type should be changed to Endpoint | undefined, or we can switch to an actual Map which handles this better.

I guess you could also leave this GraphMap type and validate both @join__* directives have the right argument. Or make both changes.

query-planner-js/src/composedSchema/buildComposedSchema.ts Outdated Show resolved Hide resolved
const shouldPrintOwner = isObjectType(type);
const joinOwnerString = shouldPrintOwner
? `\n @join__owner(graph: ${
context.sanitizedServiceNames?.[ownerService] ?? ownerService
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what context does this ?? ownerService trigger? It seems like this should be an error rather than creating references to a non-existent enum value.

Same for the other three similar calls, though one could imagine a helper function or method on context or something that does the lookup-and-error-check (or the lookup-and-fallback as done here).

@@ -321,54 +360,81 @@ export function printWithReducedWhitespace(ast: ASTNode): string {
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

up a few lines^^

A couple years ago at our request GraphQL gained a stripIgnoredCharacters function that maps a string to a string removing unnecessary characters (including comments) in a more rigorous way. It made it into 14.3 so it's in all versions our peer deps ask for. We should use that instead of the .replace!

It looks like the repo passes tests modulo a couple trivially-updated snapshot tests if you just replace this function with return stripIgnoredCharacters(print(ast)) today. Perhaps this is too stressful to do as part of this big PR (eg, it may slightly affect query plans in trivial ways that might confuse our validation maybe?) but can you at least add a comment suggesting we try the change at some point?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(specifically this matters because the current code mangles whitespace in string literals in arguments, and the upstream function doesn't)

@glasser
Copy link
Member

glasser commented Mar 31, 2021

While I think many of my comments should be addressed before doing a preview release, I don't have strong opinions about whether they should be addressed before merging this particular PR if it would be better to get it merged and iterate on main.

@trevor-scheer trevor-scheer changed the base branch from main to release-0.26.0 March 31, 2021 15:13
@abernix abernix marked this pull request as ready for review March 31, 2021 15:49
@trevor-scheer trevor-scheer merged commit 56f6e21 into release-0.26.0 Mar 31, 2021
@trevor-scheer trevor-scheer deleted the typescript-redux branch March 31, 2021 20:25
trevor-scheer added a commit that referenced this pull request Apr 1, 2021
This commit re-introduces the TypeScript query planner, and adds support for core schema as input. This builds on the work @trevor-scheer has done in #588 to have composition output core schema instead of the previous incarnation of CSDL.

We've verified the output of the query planner through a large number of tests and feel confident these changes shouldn't affect the generated query plans.

@apollo/query-planner currently exposes the same API as before, but we'll change this in a follow-up PR to avoid repeated printing/parsing of schemas and queries.

Any references to CSDL both internally and outward facing have been updated to their new counterpart:
`supergraphSdl`. Supergraph SDL implements both the core and join specs, necessary for powering a router.

Co-authored-by: Trevor Scheer <trevor.scheer@gmail.com>
Co-authored-by: Jesse Rosenberger <git@jro.cc>
trevor-scheer added a commit that referenced this pull request Apr 22, 2021
This commit is a set of follow-ups from #622 (and captured in #624).

Most of these changes are internal and of little to no outward consequence, however a few changes worth noting are:
* a switch to graphql-js's `stripIgnoredCharacters` during field set printing
* an update to the `join__Enum` generation algorithm
* the splitting of entity and value type metadata types
* a conversion of the `GraphMap` type to an actual `Map`
* additional assertions throughout
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants