-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ML] Should flush endpoint also wait for in-progress forecasts to complete? #32094
Comments
Pinging @elastic/ml-core |
This sounds like a good idea to me. The implementation of this requires changes on the cpp side as well. We also would need to think about the jobs in the forecast queue. Looking at #31973 there is an alternative i like to mention: The forecasting request stats document is written last after all results have been written, but only from the engine side. On the persistence side however the |
I don't think this will create any visibility guarantees for searches from different processes. The other process (UI, integration test harness or client application) could still search at a time when the document is visible in the primary but not the replica. The thread writing results will wait until the refresh is completed, but that doesn't make the sequence of events required to |
Client applications watch for the forecast request document having status set to
With that I mean the internal communication part between the engine and the result processor, not the external client/es part. |
Yes, true, that would solve it for the case of a client reacting to seeing It would still be possible for two consecutive gets of the forecast stats document to return strange results, like the first get saying the forecast was finished and the second saying it was still running. But it's probably only tests that care about this. A real client will not get the forecast stats document twice in quick succession. |
#31973 showed that the forecast results can be inconsistent on replica nodes. Is the issue that the UI is polling for a forecast stats document with I can see the benefit of flush request ensuring any running forecasts have finished and that the results are visible. Forecasting sends a flush command immediately after the forecast request in that case flushing should not block on the forecast finishing ( |
#31973 highlights a problem with polling indices being the only mechanism to check whether a forecast has completed: the primary and replica can have different documents searchable, so you can get inconsistent results from two consecutive searches of the same forecast results.
We get around this for non-forecast results by recommending people flush a job before requesting results if they require perfect consistency. However, flushing a job does not currently also flush forecasts.
If you close a job then the request does not return until all results are committed to the results index, including forecast results. So there is a discrepancy here between close and flush. Prior to the introduction of forecasts, flush provided the same consistency guarantee as close, but now it doesn't for the case of forecast results.
To restore this parity between the consistency guarantees of flush and close I propose that a request to flush a job should not return until all forecasts are complete as well as other processing. (If there are some situations where we want to flush everything except forecasts then we could have a
?forecasts=false
option on the flush endpoint, but I think the default should be that flush and close cause the same documents to be searchable in indices.)The text was updated successfully, but these errors were encountered: