
Conversation

ayebhi10

@ayebhi10 ayebhi10 commented May 30, 2025

Fixes #102

After checking the cache for data, and before waiting for the stream reader thread to notify other threads, check whether the stream read has already failed somewhere.

Add retries when the read_exact block errors out: a very basic while loop that retries when we hit an UnexpectedEof or Interrupted ErrorKind.
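
For illustration, here is a minimal sketch of a bounded retry loop along these lines. It is not the code in this PR: `read_exact_with_retries` and the `open_at` factory (which reopens the stream at a byte offset) are hypothetical names, and it assumes the source can be resumed from an offset. Note that the standard `Read::read_exact` already ignores `Interrupted` and leaves the buffer contents unspecified on `UnexpectedEof`, so the sketch tracks progress with plain `read` calls instead.

```rust
use std::io::{self, ErrorKind, Read};

/// Hypothetical helper: fills `buf` completely, reopening the stream at the
/// current offset whenever it ends early. `open_at` is an assumed factory,
/// not an API from this repository.
fn read_exact_with_retries<R, F>(
    mut open_at: F,
    buf: &mut [u8],
    max_retries: u32,
) -> io::Result<()>
where
    R: Read,
    F: FnMut(usize) -> io::Result<R>,
{
    let mut filled = 0; // bytes of `buf` that are known to be valid
    let mut retries = 0;
    let mut reader = open_at(0)?;

    while filled < buf.len() {
        match reader.read(&mut buf[filled..]) {
            // The stream ended before the buffer was full.
            Ok(0) => {
                if retries == max_retries {
                    return Err(ErrorKind::UnexpectedEof.into());
                }
                retries += 1;
                // Resume from the first byte we have not received yet.
                reader = open_at(filled)?;
            }
            Ok(n) => filled += n,
            // Interrupted reads are safe to repeat on the same reader.
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            // Anything else is treated as fatal in this sketch.
            Err(e) => return Err(e),
        }
    }
    Ok(())
}
```

Bounding the loop with `max_retries` keeps a permanently broken stream from spinning forever, and resuming at `filled` avoids re-reading bytes that already arrived.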

It's a good idea to open an issue first for discussion.

  • Tests pass
  • Appropriate changes to README are included in PR

ayebhi10 added 2 commits May 30, 2025 22:03
add retries, possibly fix deadlock.
@ayebhi10
Author

Hi @pgrace-google ! please review this.

Collaborator

@pgrace-google pgrace-google left a comment

thanks for your patience with this PR!

log::debug!("Immediate cache success");
return Ok(bytes_read_from_cache);
}
if state.read_failed_somewhere {
Collaborator

where is this being set?

The idea was to cover a two-thread scenario: one reader thread enters this method and has already notified the second thread of the failure, but the second thread is not 'on time' to receive the notification, so it misses it. This could result in the problem described in the issue.

There could be a better way of handling this, though.
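
To make the missed-notification scenario concrete, here is a minimal sketch of the usual condition-variable pattern: keep the failure flag inside the mutex-protected state and re-check it in a loop before every wait, so a notification sent before the waiter arrives is not lost. `State`, `Shared`, `wait_for_bytes`, and `mark_failed` are hypothetical names for illustration, not this crate's actual types.

```rust
use std::sync::{Condvar, Mutex};

// Hypothetical shared state; the field name mirrors the flag in the diff.
#[derive(Default)]
struct State {
    bytes_available: usize,
    read_failed_somewhere: bool,
}

struct Shared {
    state: Mutex<State>,
    cond: Condvar,
}

/// Waits until `wanted` bytes are available or the reader reports failure.
fn wait_for_bytes(shared: &Shared, wanted: usize) -> Result<usize, &'static str> {
    let mut state = shared.state.lock().unwrap();
    loop {
        if state.bytes_available >= wanted {
            return Ok(state.bytes_available);
        }
        // Checked under the lock on every iteration, so a failure that was
        // signalled before we started waiting is still observed.
        if state.read_failed_somewhere {
            return Err("stream read failed");
        }
        // Atomically releases the lock while waiting; spurious wakeups
        // simply go around the loop again.
        state = shared.cond.wait(state).unwrap();
    }
}

/// Called by the stream reader thread when it gives up.
fn mark_failed(shared: &Shared) {
    // Set the flag while holding the lock, then wake every waiter.
    shared.state.lock().unwrap().read_failed_somewhere = true;
    shared.cond.notify_all();
}
```

Because the flag is part of the guarded state, it does not matter whether `mark_failed` runs before or after the other thread reaches `wait`.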

FYI, this change caused problems for us: after we checked it in, we saw partial downloads. But when we reverted the change, we still saw the original problem, so the fix may be something more subtle.

ErrorKind::UnexpectedEof | ErrorKind::Interrupted => {
retries += 1;
// This is a bit of a hack, but it seems to be the only way
// to get the underlying stream to retry.
Collaborator

Can you explain more about what happens here? I am a bit hesitant about this way of retrying.

I'm sure there is a better way to perform retries in Rust.
Here, when I want to retry, I make sure we use a new stream reader, and use 'take()' to get full ownership of the reader for the executing thread. We retry only for specific failures. Once we have the new reader, we update reading_stuff with the right information.
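
As a rough sketch of the ownership step described here (hypothetical names throughout; `ReadingState` stands in for whatever reading_stuff actually holds), `Option::take()` moves the old reader out of the shared state and leaves `None` behind, after which a fresh reader and its resume offset can be stored:

```rust
use std::io::Read;

// Hypothetical stand-in for the shared reading state.
struct ReadingState {
    reader: Option<Box<dyn Read + Send>>,
    offset: u64,
}

/// Swaps a fresh reader into the state and hands the old one back to the
/// caller, who now owns it exclusively and can drop or drain it.
fn swap_in_fresh_reader(
    state: &mut ReadingState,
    fresh: Box<dyn Read + Send>,
    resume_at: u64,
) -> Option<Box<dyn Read + Send>> {
    // `take()` moves the value out and leaves `None` in its place.
    let old = state.reader.take();
    state.reader = Some(fresh);
    // Record where the new reader starts so later reads stay consistent.
    state.offset = resume_at;
    old
}
```

If the recorded offset does not match where the new reader really starts, callers could see gaps or duplicated bytes, so the two need to be updated together.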

Development

Successfully merging this pull request may close these issues.

possibility of an indefinite while loop when https stream reader thread notifies and the other thread has not yet read the condVar