compute: factor out PeekResultIterator #32514

Open: aljoscha wants to merge 3 commits into main from compute-refactor-peek-result-iterator

Conversation

aljoscha
Contributor

The original motivation for this is so that the code that extracts peek results can be re-used in
https://github.com/MaterializeInc/database-issues/issues/9180, where we want to use a different transport for sending back peek responses but still need to read them out of arrangements the same way.

A nice side effect is that extracting the result is now separate from the logic that accumulates it into a response to send back, which gives a clearer separation of concerns.
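A minimal sketch of that separation, not the code in this PR: the accumulation side takes any iterator of rows and enforces a size budget, so it does not care whether the rows come out of an arrangement or from somewhere else. `Row`, `Response`, and `accumulate` are hypothetical names made up for illustration; only the `max_result_size` parameter name is taken from the diff.

```rust
/// Hypothetical stand-in for a row read out of an arrangement.
type Row = Vec<u8>;

/// Hypothetical outcome of accumulating rows into a response.
#[derive(Debug, PartialEq)]
enum Response {
    Rows(Vec<Row>),
    TooLarge { limit: usize },
}

/// Accumulates rows from any iterator while enforcing a byte budget.
/// The function knows nothing about where the rows come from.
fn accumulate<I>(rows: I, max_result_size: usize) -> Response
where
    I: IntoIterator<Item = Row>,
{
    let mut total = 0usize;
    let mut out = Vec::new();
    for row in rows {
        total += row.len();
        if total > max_result_size {
            // Stop early: the response would exceed the budget.
            return Response::TooLarge { limit: max_result_size };
        }
        out.push(row);
    }
    Response::Rows(out)
}
```

With this shape, a different transport could consume the same iterator and skip `accumulate` entirely.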

Work towards https://github.com/MaterializeInc/database-issues/issues/9180

@aljoscha aljoscha requested a review from a team as a code owner May 16, 2025 11:53
@aljoscha
Contributor Author

Maybe @antiguru and/or @teskje would be good to review this? 🙏

@antiguru antiguru self-requested a review May 16, 2025 12:06
@aljoscha aljoscha force-pushed the compute-refactor-peek-result-iterator branch from 570a68c to c3a400a Compare May 16, 2025 12:32
@aljoscha aljoscha force-pushed the compute-refactor-peek-result-iterator branch 2 times, most recently from e75dd05 to 214ebd8 Compare May 28, 2025 14:00
Comment on lines 872 to 885
            PendingPeek::Index(peek) => 'response: {
                let is_ready = peek.is_ready(upper);

                match is_ready {
                    Ok(false) => break 'response None,
                    Err(err) => break 'response Some(err),
                    Ok(true) => (), // Falling through...,
                }

                if let Some(err) = peek.extract_errs(upper) {
                    break 'response Some(err);
                }

                Some(peek.read_result(upper, self.compute_state.max_result_size))
Contributor

I like early returns usually, but in this case the nested-if version ends up more readable, imo:

            PendingPeek::Index(peek) => match peek.is_ready(upper) {
                Ok(true) => {
                    let resp = peek.extract_errs(upper).unwrap_or_else(|| {
                        peek.read_result(upper, self.compute_state.max_result_size)
                    });
                    Some(resp)
                }
                Ok(false) => None,
                Err(err) => Some(err),
            },
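To check that the two shapes agree, here is a self-contained sketch that runs both against a mock. `MockPeek` and `PeekResponse` are hypothetical stand-ins, and the mocked methods drop the `upper` and size arguments that the real calls take; only the method names come from the diff.

```rust
/// Hypothetical stand-in for the peek response type.
#[derive(Debug, PartialEq, Clone)]
enum PeekResponse {
    Rows(u32),
    Error(String),
}

/// Hypothetical stand-in for a pending index peek; the fields fake the
/// results of the three calls that appear in the diff.
struct MockPeek {
    ready: Result<bool, PeekResponse>,
    errs: Option<PeekResponse>,
    rows: u32,
}

impl MockPeek {
    fn is_ready(&self) -> Result<bool, PeekResponse> {
        self.ready.clone()
    }
    fn extract_errs(&self) -> Option<PeekResponse> {
        self.errs.clone()
    }
    fn read_result(&self) -> PeekResponse {
        PeekResponse::Rows(self.rows)
    }
}

/// Early-exit version using a labeled block.
fn with_labeled_block(peek: &MockPeek) -> Option<PeekResponse> {
    'response: {
        match peek.is_ready() {
            Ok(false) => break 'response None,
            Err(err) => break 'response Some(err),
            Ok(true) => (),
        }
        if let Some(err) = peek.extract_errs() {
            break 'response Some(err);
        }
        Some(peek.read_result())
    }
}

/// Nested version using a single match.
fn with_match(peek: &MockPeek) -> Option<PeekResponse> {
    match peek.is_ready() {
        Ok(true) => Some(peek.extract_errs().unwrap_or_else(|| peek.read_result())),
        Ok(false) => None,
        Err(err) => Some(err),
    }
}
```

Both functions return the same value for all four combinations (not ready, readiness error, extracted error, rows), so the choice between them is purely about readability.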

Contributor Author

Sorry, this was from an outdated branch. I updated it and backed out all of the extra refactorings, so it's now only about the peek iterator.

Comment on lines +1 to +4
// Copyright Materialize, Inc. and contributors. All rights reserved.
//
// Use of this software is governed by the Business Source License
// included in the LICENSE file.
Contributor

Huh, isn't there usually also a paragraph about transitioning to the Apache license here? 😅

Contributor Author

This matches the header of compute_state.rs, but I checked and some of our other source files do indeed have that Apache license blob as well. So, I don't know ... 🤷‍♂️

Contributor

Well, I guess that part is redundant anyway because it's also in the LICENSE file 🤷

peek_timestamp: mz_repr::Timestamp,
has_literal_constraints: bool,
literals: L,
oks_handle: &mut Tr,
Contributor

Nit: It's a bit confusing to have an oks_handle if we don't also have an errs_handle. We could call it trace/trace_reader/trace_handle instead?

Contributor Author

will change to trace_reader

literals: L,
oks_handle: &mut Tr,
) -> Self {
let (cursor, storage): (<Tr as TraceReader>::Cursor, <Tr as TraceReader>::Storage) =
Contributor

These type annotations are not needed, are they?

Contributor Author

I think not; these were from some work in progress where the types didn't line up.

Comment on lines +101 to +103
tracing::trace!(
    ?self.literals_exhausted,
    key_valid = self.cursor.key_valid(&self.storage),
    val_valid = self.cursor.val_valid(&self.storage), "next");
Contributor

How useful are these traces? I think they will mostly print a lot of true/false that are hard to interpret? Should they also print the actual keys and values?

Contributor Author

Reading the key/val wouldn't always succeed because it fails when they're not valid. So I should rather remove all these trace logs, yeah?
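For reference, one way the logs could show keys and values without failing on an invalid position is to go through `Option`-returning accessors. This is only a sketch on a made-up cursor: `MockCursor`, its fields, and `describe` are hypothetical, and whether the real cursor exposes accessors of this shape is an assumption not confirmed in this thread.

```rust
/// Hypothetical, simplified cursor over (key, values) pairs.
struct MockCursor {
    data: Vec<(String, Vec<i64>)>,
    key_pos: usize,
    val_pos: usize,
}

impl MockCursor {
    fn key_valid(&self) -> bool {
        self.key_pos < self.data.len()
    }

    fn val_valid(&self) -> bool {
        self.key_valid() && self.val_pos < self.data[self.key_pos].1.len()
    }

    /// Returns the current key, or `None` when the cursor is past the last key.
    fn get_key(&self) -> Option<&String> {
        self.data.get(self.key_pos).map(|(key, _)| key)
    }

    /// Returns the current value, or `None` when key or value is not valid.
    fn get_val(&self) -> Option<&i64> {
        self.data
            .get(self.key_pos)
            .and_then(|(_, vals)| vals.get(self.val_pos))
    }
}

/// Renders the cursor state for a log line without ever reading an
/// invalid key or value.
fn describe(cursor: &MockCursor) -> String {
    format!("key={:?} val={:?}", cursor.get_key(), cursor.get_val())
}
```

Removing the trace logs altogether, as proposed above, avoids the question entirely.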

Comment on lines 214 to 216
if self.cursor.val_valid(&self.storage) {
    return false;
}
Contributor

This is surprising I think. The doc string says that this method will step the key forward, but it only does so if the current value is not valid, which seems like an important detail!

A thing that would make sense to check here is key_valid. Maybe you meant to do that?

Contributor Author

The intended behavior was actually that of a maybe_step_key: step only when the val is not valid, which is what the code did. In practice the method is only called when the val is not valid, so I changed that check into an assert and left the docstring as is.
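A sketch of the assert-based variant on a made-up cursor, to make the precondition explicit. `MockCursor` and its fields are hypothetical, and the return value (whether the new key is valid) is an assumption, since the thread does not show the full method.

```rust
/// Hypothetical, simplified cursor: each key owns a list of values.
struct MockCursor {
    data: Vec<(u32, Vec<u32>)>,
    key_pos: usize,
    val_pos: usize,
}

impl MockCursor {
    fn key_valid(&self) -> bool {
        self.key_pos < self.data.len()
    }

    fn val_valid(&self) -> bool {
        self.key_valid() && self.val_pos < self.data[self.key_pos].1.len()
    }

    fn step_val(&mut self) {
        self.val_pos += 1;
    }

    /// Steps to the next key and returns whether that key is valid.
    ///
    /// Precondition, enforced by the assert: the values of the current key
    /// are exhausted. Callers are expected to uphold this, so the former
    /// silent early return becomes a loud failure.
    fn step_key(&mut self) -> bool {
        assert!(!self.val_valid(), "step_key called with values remaining");
        self.key_pos += 1;
        self.val_pos = 0;
        self.key_valid()
    }
}
```

The trade-off is that a violated precondition now panics instead of being skipped quietly, which is usually what you want for an invariant that callers already maintain.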

Comment on lines +239 to +242
if self.cursor.val_valid(&self.storage) {
    break;
}
Contributor

For this break to be invoked we would need to have a key with zero values. Is this possible?

Contributor Author

I think you meant it the other way round, right? That you expect val_valid to always be true so we always break?

I think it can happen that we go around the loop multiple times, but I don't remember why. I pushed a commit that panics when that happens, so let's see what CI has to say.
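To make the question concrete, here is a sketch of such a loop on a made-up cursor that counts how many keys it steps over. The loop shape is a guess, since only the `break` is visible in the diff, and `MockCursor` and `advance_to_next_value` are hypothetical. The mock can construct a key with zero values directly, which says nothing about whether real arrangements can contain one.

```rust
/// Hypothetical, simplified cursor: each key owns a list of values.
struct MockCursor {
    data: Vec<(u32, Vec<u32>)>,
    key_pos: usize,
    val_pos: usize,
}

impl MockCursor {
    fn key_valid(&self) -> bool {
        self.key_pos < self.data.len()
    }

    fn val_valid(&self) -> bool {
        self.key_valid() && self.val_pos < self.data[self.key_pos].1.len()
    }

    fn step_key(&mut self) {
        self.key_pos += 1;
        self.val_pos = 0;
    }
}

/// Advances past the current key until a key with at least one value is
/// found or the keys run out. Returns how many keys were stepped over;
/// a result above 1 means a key without values was encountered.
fn advance_to_next_value(cursor: &mut MockCursor) -> usize {
    let mut steps = 0;
    while cursor.key_valid() {
        cursor.step_key();
        steps += 1;
        if cursor.val_valid() {
            break;
        }
    }
    steps
}
```

Replacing the counter with a panic when a second iteration starts gives the CI experiment described above.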

@teskje
Contributor

teskje commented May 28, 2025

Pretty sure this will conflict with #32593, sorry 🙈

@aljoscha aljoscha changed the title compute: factor out PeekResultIterator, refactor peek fulfillment compute: factor out PeekResultIterator May 28, 2025
aljoscha added 3 commits May 28, 2025 18:42
The original motivation for this is so that the code that extracts peek
results can be re-used in
MaterializeInc/database-issues#9180, where we
want to use a different transport for sending back peek responses but
still need to read them out of arrangements the same way.

The nice side effect is that we separate extracting the result from the
logic that accumulates it in a response for sending it back. Which leads
to clearer separation.
@aljoscha aljoscha force-pushed the compute-refactor-peek-result-iterator branch from 4640550 to a74366e Compare May 28, 2025 16:43
@aljoscha
Contributor Author

@teskje thanks for the review! I hope I addressed all comments, could you please take a look again? 🙇‍♂️
