-
Notifications
You must be signed in to change notification settings - Fork 10.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Discussion] Feature: GraphQL schema snapshots for all data sources to solve undefined/empty data issues #3344
Comments
It seems like an interesting idea since it should work regardless of the source. It would definitely work for my usecase. |
Good question about assets. I think for assets most data sources represent them as a string with a URL or path to the asset. The schema snapshot would just represent the type as a string with the URL/path or an asset object which has a string property with the URL/path. |
Seems like an interesting idea, though why not skip the extra indirection of a snapshot and and define the schema fully? Like if we're going go a head and make fully complete data example, it seems like not much more work to directly specify the gql schema while also being clearer? |
@jquense that's what I'm thinking. You could either run a command to "snapshot" the dynamically generated schema which would write out a file with the schema in normal graphql form or you could directly write the same form yourself. |
Yeah I definitely like the idea of keeping it open to write your schema from scratch along side the "snapshot" feature. The benefits I see from the "snapshot" feature is the ability to ease yourself into learning the GraphQL schema syntax and reducing the amount of boilerplate code needed to get up and running with Gatsby and any source plugin. I am currently looking into the internals of Gatsby's schema inferring logic in .. |
I totally understand easing users into a new thing, however I think plugins are a bit different. Authors are generally expected to understand gql already and users shouldn't generally have to mess with anything other than querying when consuming a source plugin. That said tho the schema language for defining said schema is probably less complicated than querying, in terms of getting up to speed. I think it'd be time better spent allowing plugin authors to provide a .graphql file than working on snapshot ting a comprehensive data object, which may not even be enough to properly define the schema wholly. I would like to mention to that manually defining the schema is already possible and supported via the gatsby api, albeit with graphql-js not the terser schema language. |
I understand that plugin authors should know gql already and users
shouldn't need to mess with schemas. But the issue is that the schema
produced from inferring from the source data is many times incomplete based
on the current state of the source data (see the issues I mentioned in the
opening of this issue). So the way I see it is in order for users not to
really mess around with having to write there schema's from scratch, they
could set up the source data to complete, take a snapshot of the schema and
then have Gatsby use that snapshot going forward. This way the user avoids
having to write schemas from scratch in order to solve empty/undefined data
causing the graphql queries to fail.
Also could you elaborate a little more on how " I think it'd be time better
spent allowing plugin authors to provide a .graphql ..." would work?
…On Thu, Dec 28, 2017 at 8:27 AM Jason Quense ***@***.***> wrote:
I totally understand easing users into a new thing, however I think
plugins are a bit different. Authors are generally expected to understand
gql already and users shouldn't generally have to mess with anything other
than querying when consuming a source plugin.
That said tho the schema language for defining said schema is probably
less complicated than querying, in terms of getting up to speed. I think
it'd be time better spent allowing plugin authors to provide a .graphql
file than working on snapshot ting a comprehensive data object, which may
not even be enough to properly define the schema wholly.
I would like to mention to that manually defining the schema is *already*
possible and supported via the gatsby api, albeit with graphql-js not the
terser schema language.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#3344 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAuc_6X_eEWEf281KUspXxB1NuPZPeC4ks5tE5clgaJpZM4RNLJn>
.
|
So I think we are missing each other on where this should be solved :) in the WordPress example, I think rather than the plugin completely relying schema inference it should fully specify fields that may exist, e.g. all the nullable fields. Then a user won't run into the problem at all, because gatsby will have a complete schema for the datasource. In cases where the plugin author can't possibly know what all the fields are I still think that, while more of a learning curve, it'll end up being more accurate and less error prone for the user to provide the rough schema vs fake data to be inferred as a schema. Yes you need to learn more up front (and we should provide good documentation for it) but you avoid users needing to understand the internal workings of how gatsby infers schemas. For example, if a user wants to specify a date field, it's (I think) simpler to learn to directly specify a date vs know to provide an iso compatible string that will be inferred as a date. Plus a lot of other features like connecting two data nodes, or specifying nullablility, is only really doable with a schema definition vs a data object |
What about 2 different plugins? One that allows us to easily specify the schema in a file and another one to generate that file from a specific source. |
I like the idea of two plugins. I think the API @jquense is talking about
is https://www.gatsbyjs.org/docs/node-apis/#setFieldsOnGraphQLNodeType. But
I think it only allows to add fields to a schema right? And not overwrite
what was inferred.
…On Thu, Dec 28, 2017 at 10:06 AM M4rrc0 ***@***.***> wrote:
What about 2 different plugins? One that allows us to easily specify the
schema in a file and another one to generate that file from a specific
source.
So the source might be different... For example a contentful space where
every field is complete might be the source to generate the schema. Then
the schema can be used on any contentful space that has the same
architecture. And of course we might still tweak the schema manually if
needed.
In any case it seems like plugin no1 is the first step. We might still
decide when it is done what to do next.
@jquense <https://github.com/jquense> is there somewhere an example of a
manually defined schema? I didn't know it was possible.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#3344 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAuc_4fqrocUXxBPkFsK2tFn2VVDduv3ks5tE66AgaJpZM4RNLJn>
.
|
Yeah I think you are right it doesn't all overwriting :/ I'm generally in favor of plugins to solve issues but I think this sort of thing may need to handled in the core api to strike the right usability notes (I could totally be wrong tho). In my head the ideal api would be just to point gatsby at a .graphql file and it does the rest, but that may not as nice as I imagine. I think it's all probably worth exploring in plugins tho if possible before possibly moving into core... I of course defer to what Kyle thinks since he's the boss :) |
Ok cool I will do some exploring around the first plugin for defining
schemas as .graphql files. I'm interested to hear Kyle's input too :)
Thanks for the great discussion everyone!
…On Thu, Dec 28, 2017 at 10:31 AM Jason Quense ***@***.***> wrote:
Yeah I think you are right it doesn't all overwriting :/
I'm generally in favor of plugins to solve issues but I think this sort of
thing may need to handled in the core api to strike the right usability
notes (I could totally be wrong tho). In my head the ideal api would be
just to point gatsby at a .graphql file and it does the rest, but that may
not as nice as I imagine. I think it's all probably worth exploring in
plugins tho if possible before possibly moving into core... I of course
defer to what Kyle thinks since he's the boss :)
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#3344 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAuc_6wNLHfte9zCZ2OqgirIMRadX609ks5tE7RTgaJpZM4RNLJn>
.
|
@jsanchez034 did you look more into it? I will propably try to work on this next week and I'm interested if you did some research and can share your findings |
Hey Michal, I did some research and have some next steps in mind. I'll post
my findings when I get back on my machine in a few hours.
…On Fri, Jan 5, 2018 at 7:16 PM Michal Piechowiak ***@***.***> wrote:
@jsanchez034 <https://github.com/jsanchez034> did you look more into it?
I will propably try to work on this next week and I'm interested if you did
some research and can share your findings
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#3344 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAuc_3w7px_NJjL2EKl8G5TP1GJgQFVWks5tHrtNgaJpZM4RNLJn>
.
|
Hey @pieh, I started looking into how we can inject whole new schemas into the schema that Gatsby infers from the source data. So I figured a plugin could be created to add schemas to Gatsby. Below are my notes for this plugin... Schema writer pluginWhat should the plugin do?
Those are my rough notes on a schema writer plugin. The other plugin that @MarcCoet mentioned would be one where you could point the plugin to endpoints on a domain where data is set up to be in a complete state then schemas would be generated and saved off to Note: Schemas would be defined using GraphQL type language |
Unfortunately I don't think From documentation of
We would rather want to use string option, as
As for second option (using
because File isn't defined in this additional schema. We could probably append builtin types so it would work but I feel like trying to "fool" this tool to do stuff it's not designed to do (at least at this point in time) is not the way to go. I'll will try to evaluate 2 other approaches:
|
I've made some progress. Decided to go with gatsby specific approach for now (adding support for "forced types" in https://github.com/gatsbyjs/gatsby/tree/master/packages/gatsby/src/schema ) as I found it easier to work with than playing with graphql schema objects. I'll try to clean up my code a bit so it would be more readable and create pull request this week to discuss it further there. Just little sneak peak of results (imo quite promising): // post #1
---
title: Hello World
date: "2015-05-01T22:12:03.284Z"
featuredImage: "sasalty_egg.jpg" // badly formatted value that wouldn't allow to make this File type
test: "some string" // incompatible value type with other posts (in other posts test field is object) - it would be skipped in schema
---
// post #2
---
title: My Second Post!
date: "2015-05-06T23:46:37.121Z"
test:
floatField: 5.5
intField: 99
arrayOfStringsField:
- "it"
- "works"
---
// post #3
---
title: New Beginnings
date: "2015-05-28T22:40:32.169Z"
test:
stringField: "string"
boolField: true
--- Used type definitions:
Note: those don't have to be full type definitions (f.e. in MarkdownRemark definition I just force type of frontmatter field to not rely on automatic type naming) Here's how frontmatter looks before using my changes (no |
Hi @pieh! Thanks for taking this on, your solution looks very promising! Sorry I couldn't contribute more to this issue in the past couple of weeks, I haven't had the time to dive into this. Let me know if there is anything I can help with. |
If anyone is interested you can view my code changes master...pieh:schema_wip (after deleting some dead code and refactoring some stuff I had yesterday it's much less code than I thought :) ). Don't have time to put PR for comments with good description and website repository used for testing, so figured I would just share my code for now. If anyone want to play with it - place |
Any chance this could make its way into a PR? |
@MarcCoet I personally have moved all my projects temporarily to react-static, as this feature is essentially a deal-breaker for me. |
Hey again! It’s been 30 days since anything happened on this issue, so our friendly neighborhood robot (that’s me!) is going to close it. Please keep in mind that I’m only a robot, so if I’ve closed this issue in error, I’m As a friendly reminder: the best way to see this issue, or any other, fixed is to open a Pull Request. Check out gatsby.dev/contribute for more information about opening PRs, triaging issues, and contributing! Thanks again for being part of the Gatsby community! |
am I the only one still having this problem? |
Sorry we're working towards this. I'll set this as not stale so the bot stops closing it. |
@silviopaganini I found this is a pretty good solution in the interim |
Yeah! trying this now, seems to work, but it's not scalable... I'm working on a huge website, if I have to create those for all non-required fields is not great, but works for now |
@moreguppy I'm trying this to fix a similar issue with gatsby-source-wordpress, but I get the following message when I add this code to gatsby-node.js: Error: Schema must contain uniquely named types but contains multiple types named "wordpress__PAGEAcf". My code:
|
Nevermind, for anyone experiencing a similar issue, this is the correct format ti fix it:
|
Hey again! It’s been 30 days since anything happened on this issue, so our friendly neighborhood robot (that’s me!) is going to close it. Please keep in mind that I’m only a robot, so if I’ve closed this issue in error, I’m As a friendly reminder: the best way to see this issue, or any other, fixed is to open a Pull Request. Check out gatsby.dev/contribute for more information about opening PRs, triaging issues, and contributing! Thanks again for being part of the Gatsby community! |
Re-opening and adding a |
I'd be super grateful if someone could give the stuff in #16291 some real-world testing. Thanks! |
We have a open PR for schema lock-down in #16291. Would be cool if those still interested in seeing this land could share some feedback. Thanks! |
Hey @stefanprobst . Awesome work! |
@stefanprobst any idea about #19210 |
Hello - I'm also having trouble with this still. I've attempted to fix it as documented here: https://www.gatsbyjs.com/docs/reference/config-files/gatsby-node/#createSchemaCustomization And: https://www.gatsbyjs.org/docs/schema-customization/#creating-type-definitions However, first I wanted to ask, why is
In my case I seem to have extra problems. Although I explicitly define fields which can be optional, in order to avoid errors if Gatsby does not find any examples and therefore can't infer their existence, Simply specifying the type as (for example) I'm using a headless CMS to allow my users to enter content, so I can't control what they do exactly, and failures crop up unpredictably depending on what they've created, are not in the least friendly to users or developers. I find myself caught between these cases. I am encountering the following:
What happens when inference is used all seems very dependent on the first content file Gatsby sees, which isn't something I know how to fix, the order seems non-deterministic. An apparent fix can later turn out to be working only by lucky ordering. I am currently using Gatsby 2.23.12, not the most recent version as this project is over a year old and still intermittently encountering problems. Upgrading seems to be another can of breaking-change worms I don't yet want to open. |
I'm being hopeful here, I think... has failed in the past, see comments in the file. Posted in a related bug here: gatsbyjs/gatsby#3344 (comment)
My team had a problem inferring complex types like dynamic zones and some fields that could be empty. Example code:
Used with gatsby 5.7.0. |
@fgroenendijk this is the solution we used. It could be better, but it's setup to refresh the schema during development, and rely on this cached schema on deployed environments. It still relies on our non-master branch having all content fields filled, but it won't break production. // Only rebuild definitions on non-master branches, and not on codebuild
const rebuildDefinitions = process.env.GATSBY_CONTENTFUL_ENVIRONMENT !== 'master' && !process.env.CODEBUILD_BUILD_ARN
const definitionPath = 'src/type-definitions.gql'
const schemaPath = path.resolve(__dirname, definitionPath)
export const onPreInit: GatsbyNode['onPreInit'] = async () => {
// On non-production environments clear outdated type definitions to force a resync
if (rebuildDefinitions && fs.existsSync(schemaPath)) {
console.log(`Deleting types from ${definitionPath}`)
fs.unlinkSync(schemaPath)
}
}
export const createSchemaCustomization: GatsbyNode['createSchemaCustomization'] = async ({ actions }) => {
const { createTypes, printTypeDefinitions } = actions
// ON UAT this will save type definitions to a file
if (rebuildDefinitions) {
// On non-master branch we persist types
console.log(`Saving types to ${definitionPath}`)
printTypeDefinitions({ path: definitionPath, withFieldTypes: true })
} else {
// Otherwise we restore types
console.log(`Restoring types from ${definitionPath}`)
const typeDefs = fs.readFileSync(schemaPath, 'utf8')
createTypes(typeDefs)
}
} |
Description
Currently there are issues with GraphQL schemas produced from data sources where at the moment
gatsby develop
orgatsby build
is executed the data shape is incomplete, parts are in an empty state or in a different type than they would be if the source data was filled out. Below are a few example issues..false
when empty even though they should be an object or nullIt would be great if as the data shape evolves on the data source side, you could create test pages or pieces of content that are fully filled out, meaning no empty fields. Then on the Gatsby side you could run a cli command called something like
gatsby snapshot-schemas
which would fetch the current data sources, run the source data through there regular plugin data normalization paths, run the data through the existing infer GraphQL schema code and then finally at the end take the schemas generated and save them off to a folder in/src
calledschemas
.On subsequent builds Gatsby could skip over inferring of the GraphQL schemas when it sees schemas defined in
/src/schemas
. These schema snapshots could then committed into a sites repo and allow for data shape changes to require new snapshots instead of just data source changes. These schema snapshots open up many possibilities such as the validation of incoming data shape changes. Schema snapshot diffs could be shown in Gatsby CLI as well whengatsby snapshot-schemas
is ran again once initial schemas have been saved.I would love some feedback on the idea itself from the contributors of the various source plugins. If the idea passes the smell test, I would like some feedback on what format the snapshots should be saved to. Maybe the GraphQL schema language using something like gestalt-graphql would be nice.
The text was updated successfully, but these errors were encountered: