Skip to content

Conversation

dervoeti
Copy link
Member

@dervoeti dervoeti commented Aug 4, 2025

Description

Implementation for #576

When providing credentials for S3, Azure or GCS we currently need a SecretClass. It would be easier if the user could just provide a Secret as an alternative, we had that discussion recently. To be consistent with the current state, I used SecretClass exclusively for now.

Things I did not do in this PR to ease the review process (we could tackle those as a separate PR):

  • Combine redundant logic from FTE and Catalog modules
  • Use config-utils instead of doing the load_env_from_files stunt

Definition of Done Checklist

  • Not all of these items are applicable to all PRs, the author should update this template to only leave the boxes in that are relevant
  • Please make sure all these things are done and tick the boxes

Author

  • Changes are OpenShift compatible
  • CRD changes approved
  • CRD documentation for all fields, following the style guide.
  • Helm chart can be installed and deployed operator works
  • Integration tests passed (for non trivial changes)
  • Changes need to be "offline" compatible
  • Links to generated (nightly) docs added
  • Release note snippet added

Reviewer

  • Code contains useful comments
  • Code contains useful logging statements
  • (Integration-)Test cases added
  • Documentation added or updated. Follows the style guide.
  • Changelog updated
  • Cargo.toml only contains references to git tags (not specific commits or branches)

Acceptance

  • Feature Tracker has been updated
  • Proper release label has been added
  • Links to generated (nightly) docs added
  • Release note snippet added
  • Add type/deprecation label & add to the deprecation schedule
  • Add type/experimental label & add to the experimental features tracker

@dervoeti dervoeti force-pushed the feat/fault-tolerant-execution branch 2 times, most recently from 4cc640f to 3d113df Compare August 5, 2025 18:44
@dervoeti dervoeti self-assigned this Aug 5, 2025
@dervoeti dervoeti moved this to Development: In Review in Stackable Engineering Aug 5, 2025
@dervoeti dervoeti force-pushed the feat/fault-tolerant-execution branch from fc684e3 to 85a09cc Compare August 5, 2025 19:40
Copy link
Member

@sbernauer sbernauer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dropped some comments on the CRD code and YAML.
As an possible alternative I can propose a complex enum style, where we can together think about it, just pasting it here for reference for now

spec:
  clusterConfig:
    faultTolerantExecution:
      query:
        retryAttempts: 4 # maps to query-retry-attempts
        retryInitialDelay: 10s
        retryMaxDelay: 60s
        retryDelayScaleFactor: 2.0 # f32
        exchangeManager: # Optional
          deduplicationBufferSize: 64Mi # Quantity
          encryptionEnabled: true
          sinkBufferPoolMinSize: 20
          sinkBuffersPerPartition: 4
          sinkMaxFileSize: 2Gi # Quantity
          sourceConcurrentReaders: 8
          s3:
            baseDirectories:
              - s3://trino-exchange-bucket/spooling
            connection: # Mandatory
              reference: minio-connection
            maxErrorRetries: 10
            uploadPartSize: 10Mi # Quantity
      # OR
      task:
        retryAttemptsPerTask: 4 # maps to task-retry-attempts-per-task
        retryInitialDelay: 10s
        retryMaxDelay: 60s
        retryDelayScaleFactor: 2.0 # f32
        exchangeManager: # Mandatory
          # ... same struct as above

@dervoeti dervoeti force-pushed the feat/fault-tolerant-execution branch 2 times, most recently from 2567920 to 843b18d Compare August 7, 2025 14:06
@dervoeti dervoeti force-pushed the feat/fault-tolerant-execution branch from fd42c2b to af39e6b Compare August 12, 2025 12:28
@lfrancke lfrancke linked an issue Aug 13, 2025 that may be closed by this pull request
3 tasks
@dervoeti dervoeti force-pushed the feat/fault-tolerant-execution branch from eaaff12 to 9e587d8 Compare August 13, 2025 11:39
@dervoeti dervoeti requested a review from sbernauer August 13, 2025 11:44
@sbernauer sbernauer moved this from Development: In Review to Development: Waiting for Review in Stackable Engineering Aug 18, 2025
@sbernauer sbernauer requested review from maltesander and removed request for sbernauer August 18, 2025 07:07
@sbernauer sbernauer moved this from Development: Waiting for Review to Development: In Review in Stackable Engineering Aug 18, 2025
@dervoeti dervoeti force-pushed the feat/fault-tolerant-execution branch from 539473f to b9f5ca9 Compare August 18, 2025 08:20
Copy link
Member

@maltesander maltesander left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@dervoeti dervoeti added this pull request to the merge queue Aug 18, 2025
Merged via the queue into main with commit 28260dc Aug 18, 2025
16 of 17 checks passed
@dervoeti dervoeti deleted the feat/fault-tolerant-execution branch August 18, 2025 14:35
@dervoeti dervoeti moved this from Development: In Review to Development: Done in Stackable Engineering Aug 19, 2025
@dervoeti
Copy link
Member Author

Release note

SDP now supports configuring fault-tolerant execution for Trino via the TrinoCluster CRD

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Development: Done
Development

Successfully merging this pull request may close these issues.

Support Fault tolerant execution (FTE)
3 participants