-
Notifications
You must be signed in to change notification settings - Fork 660
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cronet request lifecycle not behaving correctly with Apollo. #5885
Comments
Posting some of my analysis for the "http batching turned on scenario"
to
appears to make the app behave correctly. ofc I realize this is not a proper fix - but I wanted to post my findings at the end of the day. Perhaps a more robust one should verify, that after reading the batched operation responses array, end document token appears - happy to hear your thoughts/suggestions on this. Also I have not looked into the scenario where HttpBatching is not enabled, and what causes all requests to be interpreted as cancelled by Cronet (I suspect json parsing also to be at play here, but I'm simply out of time for today). |
Thanks for the detailed issue and reproducer ❤️.
Indeed, with JSON we can know that an object/list is terminated without reading the end of stream. This would allow to read several objects concatenated in the same stream. Not that it's the case here but the API allows for that. What should happen is that the This also means that if a server sends garbage after a JSON response, Do you have issues besides lifecycle events (request finished) not being fired? How bad would it be to just ignore those missing events (and the cancelled ones as well?) |
np. fun fact - my team told me we're the og authors of batching implemented in apollo-kotlin :)
I didn't know that, my knowledge of gql is rather limited - what else could be there in the response? Wouldn't that indicate that it's no longer a JSON document (but something like JSONL instead?) - and so in turn, wouldn't that be discernible by content-type and/or some other parts of the response? also the way I understand the current batching interceptor impl, would anyone really have a chance to access the rest of the response?
Is that consistent with the apollo client behavior when there's no batching turned on? (e.g: would extra content beyond the first JSON "entity" also be ignored there?)
Just to be clear - that's not the current behavior. I've described the cancellations being fired for all the requests, but only when batching is turned off. I haven't looked at the root cause of that yet.
|
It's a small world :)
My point was entirely theoretical based on low level At a high level, no GraphQL server that I know of, batching or not batching, sends data after the payload. So it should be safe to consume that. May I suggest a fix like this? val list = responseBody.jsonReader().use {jsonReader ->
// TODO: this is most likely going to transform BigNumbers into strings, not sure how much of an issue that is
AnyAdapter.fromJson(jsonReader, CustomScalarAdapters.Empty).also {
if (jsonReader.peek() != JsonReader.Token.END_DOCUMENT) {
println("Apollo: extra tokens after payload")
}
}
} If it works for you, feel free to open a PR and I'll review it but I probably won't have time to dive into this myself as it's KotlinConf busy busy times right now. |
Ty, yeah, I think something in that vein is ok. I'll take a closer look at the cause of cancellations when batching is off, and also revisit the old issue I've submitted, but should be able to squeeze in a PR, even if it's just for the batching interceptor.
np, your responses have been very helpful. I'm traveling next week as well, though not for KotlinConf :). This is is not affecting us in any severe way, and a path forward looks relatively clear now. |
short update on the "requests cancelled when http batching is not enabled" it's sort of a similar case. excepts from stacktrace suggest:
I haven't experimented with a "hack fix" as replacing the entire transport/parser seems more involved than just an interceptor, but my bet is this is the cause. Anyways, in terms of PRs imo that should be a different one. And perhaps it could be similar - it doesn't sound unreasonable for an object called ResponseParser to assert that it has read the entire response before closing the buffer (though ofc, how to handle cases where it had not is something to be discussed, especially from compatibility perspective). idk, perhaps the control of the reader should be pulled higher up the call stack. In any case, this is a secondary issue. |
My initial thoughts on this are that both should check for the EOF token after reading batched list/response. And if not EOF then just ignore the trailing token. We might need a function that consumes a |
Now, that I've created the PRs (and after some internal discussions with my team), I'm considering if it's the right choice to just fail silently on extra payload in content, would like to discuss that part once more. From the compatibility standpoint, ofc that would introduce exceptions on paths where those weren't thrown before. On the other hand, any other "json not valid" would throw an exception so this could be seen as just fixing an inconsistency in the already existing strict approach to such errors. Additionally, like you said, in practical applications, it will always be end-of-document, so to an extent the debate is about which form of no-op is best :) |
Indeed, that's a fair point. I was going to quote the robustness principle but this is more and more questioned. See https://www.ietf.org/archive/id/draft-iab-protocol-maintenance-05.html for an example. Being strict does have its advantages. The thing is I don't think this behaviour is specified anywhere (and even less for batching) so if we want to be strict there, I'd rather make sure we're not breaking someone elses workflow. Might be worth raising the concern in the GraphQL over HTTP spec ? |
I'll try to clarify that in the spec repo - graphql/graphql-over-http#292 as far as I can read/understand the spec:
https://graphql.github.io/graphql-over-http/draft/#sec-Response
https://spec.graphql.org/draft/#sec-Response it boils down to whether including anything that's semantically different from "end of document" breaks the "well-formedness" of the response. |
Looks like a pretty good case that trailing data is an error and should be rejected by the client. Let's give it a couple of days to see what everyone thinks about this but I'm more and more team "let's make this an error" |
@duzinkie Let's make it an error? I just added a comment in your Benjie has a good point that if there is a lot of whitespace (and no trailing token), the response will now be delayed until all the whitespace is consumed but this is the case everywhere in the json response, not only at the end so backends should be aware of whitespace in all cases. What's more, it should reasonnably never happen in practice. |
@duzinkie this is all merged in and will be available in the next release. |
Fix is in v4.0.0-rc1 and v3.8.5. |
Do you have any feedback for the maintainers? Please tell us by taking a one-minute survey. Your responses will help us understand Apollo Kotlin usage and allow us to serve you better. |
Version
3.8.4
Summary
I've adapted https://github.com/apollographql/apollo-kotlin-tutorial to use https://github.com/google/cronet-transport-for-okhttp (https://github.com/duzinkie/apollo-kotlin-tutorial/tree/duzinkie/cronet-lifecycle-repro) and have observed the app is not functioning correctly when Cronet is used as an http engine.
Let me elaborate:
The app built from https://github.com/duzinkie/apollo-kotlin-tutorial/tree/duzinkie/cronet-lifecycle-repro is working fine, until one tries to adjust the logging level of HttpLoggingInterceptor (duzinkie/apollo-kotlin-tutorial@9b3e58a#diff-518bd35be782210b23f518eaf618c538f99e9f33a81f6b518ccb053b4315eff1R94) - when that happens 2 distinct failure modes occur, depending on whether http batching is enabled.
side notes:
Steps to reproduce the behavior
cronet request finished: 0
lines from the cronet http engine (indicating requests are finishing just fine)a. change the logging level of
HttpLoggingInterceptor
toBASIC
b. disable
httpBatching
cronet request finished
messages are absent and the app hangs after a few interactionscronet request finished: 2
log messagesLogs
I think reproduction steps is enough for this issue, lmk if you'd like to see my logs anyway.
The text was updated successfully, but these errors were encountered: