-
Notifications
You must be signed in to change notification settings - Fork 211
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OUTDATED fix: prevent timeouts while writing to object storage #1916
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good find but I'm not sure I understand yet.
/// A task that flushes the data every 500ms. This is to make sure that the | ||
/// futures within the writer are polled at least every 500ms. This is | ||
/// necessary because the internal writer buffers data and holds up to 10 | ||
/// write request futures in FuturesUnordered. These futures only make | ||
/// progress when polled, and if they are not polled for a while, they can | ||
/// cause the requests to timeout. | ||
background_flusher: tokio::task::JoinHandle<()>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't those futures be polled when this AsyncWrite
is polled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that the AsyncWrite::write
returns Poll::Ready
once it has put the read task onto it's internal FuturesUnordered
. So the scheduler has no reason to poll the internal tasks.
There is a simpler reproduction here:
if let Ok(err) = this.background_error.try_recv() { | ||
return Poll::Ready(Err(err)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these try_recv
blocks be inside the mutex? Otherwise is there a slight possiblity that you could:
Check background_error
, no error
Background error occurs
Check poll_write, it returns ready, background error is never checked again
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good idea. Thanks
let background_flusher = tokio::task::spawn(async move { | ||
loop { | ||
tokio::time::sleep(std::time::Duration::from_millis(100)).await; | ||
match writer_ref.lock().unwrap().flush().now_or_never() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if we call flush after a writer is finished writing / closed?
Also, what is the normal exit path for this task? It looks like it can only exit this loop if flush
returns an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now calling flush after it's done seems to be a no-op, but there's no strong guarantee of that in the API. We should abort this task in the shutdown method, I think.
ACTION NEEDED Lance follows the Conventional Commits specification for release automation. The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification. For details on the error please inspect the "PR Title Check" action. |
Closes #1878