[ENH] Support list_prefix operations for S3 and AC/S3. #4637

Open
wants to merge 1 commit into
base: rescrv/wal3-gc

Contributor

@rescrv rescrv commented May 27, 2025

Description of changes

This adds support for a list_prefix operation that calls the
ListObjectsV2 API in the AWS SDK. I've included an integration test.

The intended consumer of this API is the wal3 garbage collector,
which needs to list cursors in the directory to find the lowest-version
cursor.
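
As a rough illustration of that consumer, here is a minimal sketch of picking the lowest-version entry out of a prefix listing. The helper name `lowest_version_key` and the key layout (each key ending in `.<version>`) are assumptions made for this example only; the actual wal3 cursor layout may differ.

```rust
/// Hypothetical helper: given the keys returned by a prefix listing, return
/// the key with the smallest numeric version.
///
/// ASSUMPTION (illustration only): each relevant key ends in `.<version>`,
/// where `<version>` is a base-10 integer. Keys that do not match are skipped.
fn lowest_version_key(keys: &[String]) -> Option<&String> {
    keys.iter()
        .filter_map(|key| {
            // Split off the text after the last '.' and try to parse it.
            let (_, suffix) = key.rsplit_once('.')?;
            let version = suffix.parse::<u64>().ok()?;
            Some((version, key))
        })
        // Keep the pair with the smallest version number.
        .min_by_key(|(version, _)| *version)
        .map(|(_, key)| key)
}
```

Keys without a parseable version are ignored rather than treated as errors; a real collector would probably want to surface them.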

Test plan

Integration test added.

  • Tests pass locally with pytest for python, yarn test for js, cargo test for rust

Documentation Changes

N/A

Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • Can you think of any use case in which the code does not behave as intended? Have they been tested?
  • Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • If appropriate, are there adequate property based tests?
  • If appropriate, are there adequate unit tests?
  • Should any logging, debugging, tracing information be added or removed?
  • Are error messages user-friendly?
  • Have all documentation changes needed been made?
  • Have all non-obvious changes been commented?

System Compatibility

  • Are there any potential impacts on other parts of the system or backward compatibility?
  • Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • Is this code of an unexpectedly high quality (Readability, Modularity, Intuitiveness)?

@HammadB HammadB self-requested a review May 28, 2025 00:22
    options: GetOptions,
) -> Result<Vec<String>, StorageError> {
    let atomic_priority = Arc::new(AtomicUsize::new(options.priority.as_usize()));
    let _permit = self.rate_limiter.enter(atomic_priority, None).await;
Collaborator

@HammadB HammadB May 28, 2025

Should list be part of the same rate limiting?

Contributor Author

I think so. It's a request to S3. A new tier might make sense, but I think rate limiting it as a read makes sense absent any other direction.

Contributor

The current rate limits assume a read is roughly 8MB and try to saturate the network bandwidth accordingly. Seems like this has different characteristics and might not be OK to use with the same rate limiter?

Contributor Author

Should we make it separate? All I'm seeing is that we'll underperform during lists. Given that the only lists are offline ops, it seems OK.

Contributor

If lists don't compete with the usual reads and writes for tokens, then it's probably fine to keep it; otherwise I'd make it separate.
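
For reference, the "separate tier" option discussed in this thread could look roughly like the sketch below. It is not the rate limiter in this codebase: `OpClass`, `TieredLimiter`, `try_enter`, and `exit` are hypothetical names, and the sketch uses a plain counter of free slots per class instead of priorities or async permits.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical operation classes. The PR itself rate limits lists as reads.
#[derive(Clone, Copy)]
enum OpClass {
    Read,
    List,
}

/// Toy admission controller: one pool of free slots per operation class, so
/// list requests never consume slots sized for large reads.
struct TieredLimiter {
    read_slots: AtomicUsize,
    list_slots: AtomicUsize,
}

impl TieredLimiter {
    fn new(read_slots: usize, list_slots: usize) -> Self {
        Self {
            read_slots: AtomicUsize::new(read_slots),
            list_slots: AtomicUsize::new(list_slots),
        }
    }

    fn slots(&self, class: OpClass) -> &AtomicUsize {
        match class {
            OpClass::Read => &self.read_slots,
            OpClass::List => &self.list_slots,
        }
    }

    /// Try to take one slot; returns false when that tier is exhausted.
    fn try_enter(&self, class: OpClass) -> bool {
        self.slots(class)
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |n| n.checked_sub(1))
            .is_ok()
    }

    /// Return a slot previously taken with `try_enter`.
    fn exit(&self, class: OpClass) {
        self.slots(class).fetch_add(1, Ordering::AcqRel);
    }
}
```

With separate pools, exhausting the list tier leaves the read tier untouched, which is the property the last comment asks about.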

Collaborator

HammadB commented May 28, 2025

Can you please add some discussion of the intended consumer of this?

@rescrv rescrv changed the base branch from rescrv/wal3-gc to main May 28, 2025 19:00
Contributor

propel-code-bot bot commented May 28, 2025

Add S3 list_prefix support and implement WAL3 garbage collection CPU-side logic

This PR adds support for a new list_prefix operation to S3 and AdmissionControlledS3 storage backends, enabling efficient listing of objects with a given prefix using the AWS SDK ListObjectsV2 API. It also introduces a comprehensive implementation of WAL3 garbage collection logic, including struct definitions, calculation and tracking of garbage (obsolete fragments/snapshots), associated manifest changes, extensive property-based testing, and integration with existing invariants. Modifications are made across the WAL3/manifest model, storage abstraction layers, and the README to document garbage file semantics and update invariants.

Key Changes:
• Implemented S3 and AdmissionControlledS3 list_prefix operations using AWS ListObjectsV2 paginator with full test coverage
• Integrated list_prefix into storage abstraction and adapted code and tests to take GetOptions as needed
• Added and documented new WAL3 garbage collection structures (Garbage, GarbageAction) for tracking dropped/replaced fragments/snapshots and setsum deltas
• Added application and invariance checking of Garbage collections to manifests, with corresponding invariant/test utilities
• Updated manifest data model to explicitly track 'collected' setsum values and ensure gross setsum accounting even post-GC
• Added heavy property/proptest-based testing of GC correctness and manifest invariants
• Added detailed documentation to README on the purpose, file layout, and invariants associated with garbage files
• Fixed type signatures and updated necessary call-sites throughout the codebase to accommodate the new logic
• Registered all new structures and logic in module imports/exports

Affected Areas:
• S3 and AdmissionControlledS3 storage backends
• Storage abstraction interfaces
• WAL3 library (garbage collection, manifests, property tests)
• Log-service (test/manifest construction to include collected)
• README documentation detailing garbage file invariants

Potential Impact:

Functionality: Enables WAL3-based garbage collection with verifiable setsum accounting; adds ability to list objects via prefix for S3-like backends. No breaking API changes for external users unless they depend on manifest internal fields.

Performance: No significant runtime/latency penalties; list_prefix operation is paginated and used for GC/bookkeeping operations, not critical path. Some extra setsum tracking may slightly increase manifest size.

Security: No negative impact; manifests and GC logic maintain strong integrity invariants via setsum.

Scalability: Scales to large logs; list_prefix paginates and GC happens out-of-band. No increases to hot path object storage contention or batch sizes.
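
To make the pagination point concrete, the sketch below shows the general continuation-token loop that a paginated listing follows. It deliberately does not use the AWS SDK: `Page`, `list_all`, and the `fetch_page` callback are hypothetical stand-ins for the real ListObjectsV2 call, so this illustrates the pattern rather than the code in this PR.

```rust
/// One page of a listing: the keys plus an optional token for the next page.
/// Hypothetical type; it stands in for a ListObjectsV2 response.
struct Page {
    keys: Vec<String>,
    next_token: Option<String>,
}

/// Drain every page of a listing into one vector.
///
/// `fetch_page` is a hypothetical stand-in for the storage call. It receives
/// the continuation token from the previous page (None for the first page).
fn list_all<F>(mut fetch_page: F) -> Result<Vec<String>, String>
where
    F: FnMut(Option<&str>) -> Result<Page, String>,
{
    let mut keys = Vec::new();
    let mut token: Option<String> = None;
    loop {
        // Any error aborts the listing instead of returning a partial result.
        let page = fetch_page(token.as_deref())?;
        keys.extend(page.keys);
        match page.next_token {
            Some(next) => token = Some(next),
            None => return Ok(keys),
        }
    }
}
```

Failing the whole listing on a page error matters for a garbage collector, since acting on a partial listing could make live objects look absent.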

Review Focus:
• Correctness of WAL3 garbage collection graph traversal and setsum accounting (including manifest math and collation logic)
• list_prefix handling in S3/AC-S3 and integration with Storage abstraction and options propagation
• Backward compatibility: manifest serialization/deserialization, especially new collected field
• Comprehensiveness and clarity of proptest-based test harness for manifest + GC operations
• Documentation: Are garbage file semantics clear for new users?

Testing Needed

• Run all new and existing property-based and integration tests; ensure manifest/garbage collection invariants are not violated
• Test list_prefix specifically with S3/MinIO and AC/S3 backends for various subdirectory prefix sizes, ensuring correct, complete listing results
• Exercise manifests with GC and non-GC scenarios to validate collected setsum math and serialization/deserialization through upgrade cycles
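
The "collected setsum math" above can be pictured with a toy model. The sketch below is not the setsum construction used by wal3: `ToySum` is a stand-in checksum combined with wrapping addition, and `ToyManifest` is a hypothetical two-field manifest. It only shows the invariant being tested, namely that moving a fragment from live to collected leaves the gross total unchanged.

```rust
/// Toy stand-in for a setsum: an order-independent checksum that supports
/// adding and removing items. NOT the real setsum construction.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Default)]
struct ToySum(u64);

impl ToySum {
    fn add(self, other: ToySum) -> ToySum {
        ToySum(self.0.wrapping_add(other.0))
    }

    fn sub(self, other: ToySum) -> ToySum {
        ToySum(self.0.wrapping_sub(other.0))
    }
}

/// Hypothetical manifest that tracks live and garbage-collected data separately.
#[derive(Default)]
struct ToyManifest {
    live: ToySum,
    collected: ToySum,
}

impl ToyManifest {
    /// Append a fragment's checksum to the live data.
    fn append(&mut self, fragment: ToySum) {
        self.live = self.live.add(fragment);
    }

    /// Gross total: everything ever written, whether or not it was collected.
    fn gross(&self) -> ToySum {
        self.live.add(self.collected)
    }

    /// Garbage-collect a fragment: move its checksum from live to collected.
    fn collect(&mut self, fragment: ToySum) {
        self.live = self.live.sub(fragment);
        self.collected = self.collected.add(fragment);
    }
}
```

A property test in this style would append random fragments, collect a random subset, and assert that `gross()` never changes.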

Code Quality Assessment

rust/wal3/src/manifest.rs: Model changes are backward-compatible and clearly documented; code is modular with added GC application/check logic

rust/storage/src/s3.rs: Standard paginator ListObjectsV2 usage; good error handling and testing; aligns with AWS SDK best practices

rust/storage/src/admissioncontrolleds3.rs: Appropriate rate-limiting, prioritized requests, follows review suggestion to rate limit list_prefix

rust/wal3/tests/properties.rs: Thorough property/proptest-based testing of manifest+GC interaction

rust/wal3/src/gc.rs: Well-structured, clearly documented; correct use of setsum differential accounting, explicit error cases for corruption

Best Practices

Testing:
• Property-based testing for functional correctness (proptest)
• Integration/e2e testing against actual S3 backends
• Explicit invariant checking in critical sections

Error Handling:
• Clear propagation and matching of custom error types
• Invariant failures are explicit and fail-closed

Documentation:
• README updated with file layout, invariants, critical path documentation
• Module and struct documentation thorough

Modularity:
• GC logic composed into focused modules; separation of data model from operational paths

Potential Issues

• Manifest backward compatibility: If old code persists and loads manifests, it may encounter a missing 'collected' field (protected by default serde behavior, but deployments should be checked)
• Reliance on proper setsum math for invariants; any bug here would only surface if setsum mismatches occur (but property tests are robust)
• Potential for silent logic bug if new GC/manifest code is not fully exercised in a non-toy deployment; reviewers should ensure invariants are meaningful

This summary was automatically generated by @propel-code-bot

This adds support for a list_prefix operation that calls the
ListObjectsV2 API in the AWS SDK. I've included an integration test.
@rescrv rescrv changed the base branch from main to rescrv/wal3-gc May 28, 2025 19:49
@rescrv rescrv requested a review from HammadB May 28, 2025 20:12
3 participants