Skip to content

HDDS-15422. Stream read seek should not close stream#10415

Open
szetszwo wants to merge 4 commits into
apache:masterfrom
szetszwo:HDDS-15422
Open

HDDS-15422. Stream read seek should not close stream#10415
szetszwo wants to merge 4 commits into
apache:masterfrom
szetszwo:HDDS-15422

Conversation

@szetszwo
Copy link
Copy Markdown
Contributor

@szetszwo szetszwo commented Jun 2, 2026

What changes were proposed in this pull request?

  • Sequential read: close the block (gRPC) stream at block boundary
  • Position read: do not close
  • Seek: do not close

What is the link to the Apache JIRA

HDDS-15422

How was this patch tested?

By existing tests. Checked that the StreamReader won't be closed for seek and position read.

Copy link
Copy Markdown
Contributor

@yandrey321 yandrey321 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need tests for validating that stream is not closed when:

  1. seek inside the length of prefetched buffer
  2. seek outside the length of prefetched buffer
  3. read last N bytes from the input stream, seek to 0, read

@szetszwo
Copy link
Copy Markdown
Contributor Author

szetszwo commented Jun 2, 2026

we need tests for validating that stream is not closed when ...

@yandrey321 , We already have such a lot test for seek and position read; see TestStreamBlockInputStream.

... Checked that the StreamReader won't be closed for seek and position read.

The way I checked it is to turn on debug log and run org.apache.hadoop.ozone.client.rpc.read.TestStreamBlockInputStream before and after this fix. You may try it.

LOG.debug("{}: seek {} -> {}", this, position, pos);
closeStream();
LOG.debug("{}: seek {} -> {}", getName(streamingReader), position, pos);
buffer = null;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I was checking HBase behavior, I noticed that it's quite a common scenario where it does seek {current} -> 0 and seek 0 -> current + small offset without any reads in between. Usually, the last seek lands in the existing buffer/prefetch. Would it be more reasonable if we try to preserve the buffer if the actual read happen for the data we already have in the buffer?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point! Let's put it back to the queue for seek.

@yandrey321
Copy link
Copy Markdown
Contributor

we need tests for validating that stream is not closed when ...

@yandrey321 , We already have such a lot test for seek and position read; see TestStreamBlockInputStream.

... Checked that the StreamReader won't be closed for seek and position read.

The way I checked it is to turn on debug log and run org.apache.hadoop.ozone.client.rpc.read.TestStreamBlockInputStream before and after this fix. You may try it.

we need to make sure that we'd not have regression when the next person would change the behavior.

@szetszwo
Copy link
Copy Markdown
Contributor Author

szetszwo commented Jun 3, 2026

we need to make sure that we'd not have regression when the next person would change the behavior.

@yandrey321 , this is a good suggestion only if such test is easy to be done in low cost. It would be great if you could contribute on it. I look forward to see your contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants