-
Notifications
You must be signed in to change notification settings - Fork 25
fix possible deadlock, add retries. #104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
add retries, possibly fix deadlock.
Update seekable_http_reader.rs
Hi @pgrace-google ! please review this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for your patience with this PR!
log::debug!("Immediate cache success"); | ||
return Ok(bytes_read_from_cache); | ||
} | ||
if state.read_failed_somewhere { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where is this being set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea was when we consider two thread scenarios; when one reader thread enters this method and has already notified the second thread of the failure, but second thread is not 'on time' to receive the notification, this could result in the problem described in the issue.
but there could be better way of handling this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, this kind of resulted in problems for us, we checked in this change, we could see that there were issues of partial downloads. but when we revert the change, we still see the original problem, so maybe the fix is something more subtle.
ErrorKind::UnexpectedEof | ErrorKind::Interrupted => { | ||
retries += 1; | ||
// This is a bit of a hack, but it seems to be the only way | ||
// to get the underlying stream to retry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you explain more what happens? i am a bit hesitant about this way of retry
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sure there is a better way to perform retries in rust,
so, here when I want to retry, I ensure that we use a new stream reader, and use 'take()' to get full ownership of the reader for the thread in execution. and we retry only in case of specific failures. now that we have a new reader,
ensure that we update the reading_stuff with the right information.
Fixes #102
After we check the cache for data and before we wait on stream reader thread to notify other threads, check if stream read failed somewhere.
Add retries when we error out during read_exact block, very basic while loop to ensure we retry, when we hit UnexpectedEof or Interrupted ErrorKind,