feat: Support for fault-tolerant execution #779
Merged
26 commits
dcb6f32
CRD update
dervoeti d3f9322
feat: fault tolerant execution
dervoeti 96ebd55
test: fault-tolerant execution integration test
dervoeti 04d31cb
docs: fault-tolerant execution documentation
dervoeti 5a247ff
chore: changelog
dervoeti 39eecfa
fix: lint fixes
dervoeti ee145e0
Update docs/modules/trino/pages/usage-guide/fault-tolerant-execution.…
dervoeti 47d2ef5
fix: fixed review feedback
dervoeti 056f84b
feat!: remove explicit Azure and GCS support
dervoeti aefba45
feat: use PascalCase for Query/Task / allow configOverrides for excha…
dervoeti 499665c
fix: always convert durations to seconds
dervoeti 9986801
feat!: restructured CRD
dervoeti 2bd9e45
feat: adapted graceful shutdown docs
dervoeti 3c45e01
chore: add newlines after attributes
dervoeti edac3f9
chore: MinIO legacy charts and updated version
dervoeti 6c6a897
feat: use quantities instead of strings
dervoeti 97d2d1f
fix: moved to quantities in the FTE docs example
dervoeti fab0e45
fix: moved to quantities in the FTE docs
dervoeti 096afe6
chore: cargo fmt
dervoeti b9f5ca9
chore: pre-commit fix
dervoeti 8afac31
Update docs/modules/trino/pages/usage-guide/fault-tolerant-execution.…
dervoeti 30baa69
Update docs/modules/trino/pages/usage-guide/fault-tolerant-execution.…
dervoeti 0d9e066
Update docs/modules/trino/pages/usage-guide/fault-tolerant-execution.…
dervoeti 684083c
Update docs/modules/trino/pages/usage-guide/fault-tolerant-execution.…
dervoeti cef8eac
fix: integration test fixes
dervoeti f4a438a
fix: integration test fixes
dervoeti
107 changes: 107 additions & 0 deletions
docs/modules/trino/examples/usage-guide/fault-tolerant-execution.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,107 @@ | ||
--- | ||
apiVersion: trino.stackable.tech/v1alpha1 | ||
kind: TrinoCluster | ||
metadata: | ||
  name: trino-fault-tolerant | ||
spec: | ||
  image: | ||
    productVersion: "476" | ||
  clusterConfig: | ||
    catalogLabelSelector: | ||
      matchLabels: | ||
        trino: trino-fault-tolerant | ||
    faultTolerantExecution: | ||
      task: | ||
        retryAttemptsPerTask: 4 | ||
        retryInitialDelay: 10s | ||
        retryMaxDelay: 60s | ||
        retryDelayScaleFactor: 2.0 | ||
        exchangeDeduplicationBufferSize: 64Mi | ||
        exchangeManager: | ||
          encryptionEnabled: true | ||
          sinkBufferPoolMinSize: 20 | ||
          sinkBuffersPerPartition: 4 | ||
          sinkMaxFileSize: 2Gi | ||
          sourceConcurrentReaders: 8 | ||
          s3: | ||
            baseDirectories: | ||
              - "s3://trino-exchange-bucket/spooling" | ||
            connection: | ||
              reference: minio-connection | ||
            maxErrorRetries: 10 | ||
            uploadPartSize: 10Mi | ||
  coordinators: | ||
    roleGroups: | ||
      default: | ||
        replicas: 1 | ||
  workers: | ||
    roleGroups: | ||
      default: | ||
        replicas: 3 | ||
--- | ||
apiVersion: s3.stackable.tech/v1alpha1 | ||
kind: S3Connection | ||
metadata: | ||
  name: minio-connection | ||
spec: | ||
  host: minio | ||
  port: 9000 | ||
  accessStyle: Path | ||
  credentials: | ||
    secretClass: minio-credentials | ||
  tls: | ||
    verification: | ||
      server: | ||
        caCert: | ||
          secretClass: minio-tls-certificates | ||
--- | ||
apiVersion: secrets.stackable.tech/v1alpha1 | ||
kind: SecretClass | ||
metadata: | ||
  name: minio-tls-certificates | ||
spec: | ||
  backend: | ||
    k8sSearch: | ||
      searchNamespace: | ||
        pod: {} | ||
--- | ||
apiVersion: v1 | ||
kind: Secret | ||
metadata: | ||
name: minio-tls-certificates | ||
labels: | ||
secrets.stackable.tech/class: minio-tls-certificates | ||
data: | ||
ca.crt: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUQyVENDQXNHZ0F3SUJBZ0lVTmpxdUdZV3R5SjVhNnd5MjNIejJHUmNNbHdNd0RRWUpLb1pJaHZjTkFRRUwKQlFBd2V6RUxNQWtHQTFVRUJoTUNSRVV4R3pBWkJnTlZCQWdNRWxOamFHeGxjM2RwWnkxSWIyeHpkR1ZwYmpFTwpNQXdHQTFVRUJ3d0ZWMlZrWld3eEtEQW1CZ05WQkFvTUgxTjBZV05yWVdKc1pTQlRhV2R1YVc1bklFRjFkR2h2CmNtbDBlU0JKYm1NeEZUQVRCZ05WQkFNTURITjBZV05yWVdKc1pTNWtaVEFnRncweU16QTJNVFl4TWpVeE1ESmEKR0E4eU1USXpNRFV5TXpFeU5URXdNbG93ZXpFTE1Ba0dBMVVFQmhNQ1JFVXhHekFaQmdOVkJBZ01FbE5qYUd4bApjM2RwWnkxSWIyeHpkR1ZwYmpFT01Bd0dBMVVFQnd3RlYyVmtaV3d4S0RBbUJnTlZCQW9NSDFOMFlXTnJZV0pzClpTQlRhV2R1YVc1bklFRjFkR2h2Y21sMGVTQkpibU14RlRBVEJnTlZCQU1NREhOMFlXTnJZV0pzWlM1a1pUQ0MKQVNJd0RRWUpLb1pJaHZjTkFRRUJCUUFEZ2dFUEFEQ0NBUW9DZ2dFQkFOblYvdmJ5M1JvNTdhMnF2UVJubjBqZQplS01VMitGMCtsWk5DQXZpR1VENWJtOGprOTFvUFpuazBiaFFxZXlFcm1EUzRXVDB6ZXZFUklCSkpEamZMMEQ4CjQ2QmU3UGlNS2UwZEdqb3FJM3o1Y09JZWpjOGFMUEhTSWxnTjZsVDNmSXJ1UzE2Y29RZ0c0dWFLaUhGNStlV0YKRFJVTGR1NmRzWXV6NmRLanFSaVVPaEh3RHd0VUprRHdQditFSXRxbzBIK01MRkxMWU0wK2xFSWFlN2RONUNRNQpTbzVXaEwyY3l2NVZKN2xqL0VBS0NWaUlFZ0NtekRSRGNSZ1NTald5SDRibjZ5WDIwMjZmUEl5V0pGeUVkTC82CmpBT0pBRERSMEd5aE5PWHJFZXFob2NTTW5JYlFWcXdBVDBrTWh1WFN2d3Zscm5MeVRwRzVqWm00bFVNMzRrTUMKQXdFQUFhTlRNRkV3SFFZRFZSME9CQllFRkVJM1JNTWl5aUJqeVExUlM4bmxPUkpWZDFwQk1COEdBMVVkSXdRWQpNQmFBRkVJM1JNTWl5aUJqeVExUlM4bmxPUkpWZDFwQk1BOEdBMVVkRXdFQi93UUZNQU1CQWY4d0RRWUpLb1pJCmh2Y05BUUVMQlFBRGdnRUJBSHRLUlhkRmR0VWh0VWpvZG1ZUWNlZEFEaEhaT2hCcEtpbnpvdTRicmRrNEhmaEYKTHIvV0ZsY1JlbWxWNm1Cc0xweU11SytUZDhaVUVRNkpFUkx5NmxTL2M2cE9HeG5CNGFDbEU4YXQrQytUakpBTwpWbTNXU0k2VlIxY0ZYR2VaamxkVlE2eGtRc2tNSnpPN2RmNmlNVFB0VjVSa01lSlh0TDZYYW1FaTU0ckJvZ05ICk5yYStFSkJRQmwvWmU5ME5qZVlidjIwdVFwWmFhWkZhYVNtVm9OSERwQndsYTBvdXkrTWpPYkMzU3BnT3ExSUMKUGwzTnV3TkxWOFZiT3I1SHJoUUFvS21nU05iM1A4dmFUVnV4L1gwWWZqeS9TN045a1BCYUs5bUZqNzR6d1Y5dwpxU1ExNEtsNWpPM1YzaHJHV1laRWpET2diWnJyRVgxS1hFdXN0K1E9Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K | ||
tls.crt: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUR5RENDQXJDZ0F3SUJBZ0lVQ0kyUE5OcnR6cDZRbDdHa3VhRnhtRGE2VUJvd0RRWUpLb1pJaHZjTkFRRUwKQlFBd2V6RUxNQWtHQTFVRUJoTUNSRVV4R3pBWkJnTlZCQWdNRWxOamFHeGxjM2RwWnkxSWIyeHpkR1ZwYmpFTwpNQXdHQTFVRUJ3d0ZWMlZrWld3eEtEQW1CZ05WQkFvTUgxTjBZV05yWVdKc1pTQlRhV2R1YVc1bklFRjFkR2h2CmNtbDBlU0JKYm1NeEZUQVRCZ05WQkFNTURITjBZV05yWVdKc1pTNWtaVEFnRncweU16QTJNVFl4TWpVeE1ESmEKR0E4eU1USXpNRFV5TXpFeU5URXdNbG93WGpFTE1Ba0dBMVVFQmhNQ1JFVXhHekFaQmdOVkJBZ01FbE5qYUd4bApjM2RwWnkxSWIyeHpkR1ZwYmpFT01Bd0dBMVVFQnd3RlYyVmtaV3d4RWpBUUJnTlZCQW9NQ1ZOMFlXTnJZV0pzClpURU9NQXdHQTFVRUF3d0ZiV2x1YVc4d2dnRWlNQTBHQ1NxR1NJYjNEUUVCQVFVQUE0SUJEd0F3Z2dFS0FvSUIKQVFDanluVnorWEhCOE9DWTRwc0VFWW1qb2JwZHpUbG93d2NTUU4rWURQQ2tCZW9yMFRiODdFZ0x6SksrSllidQpwb1hCbE5JSlBRYW93SkVvL1N6U2s4ZnUyWFNNeXZBWlk0RldHeEp5Mnl4SXh2UC9pYk9HT1l1aVBHWEsyNHQ2ClpjR1RVVmhhdWlaR1Nna1dyZWpXV2g3TWpGUytjMXZhWVpxQitRMXpQczVQRk1sYzhsNVYvK2I4WjdqTUppODQKbU9mSVB4amt2SXlKcjVVa2VGM1VmTHFKUzV5NExGNHR5NEZ0MmlBZDdiYmZIYW5mdlltdjZVb0RWdE1YdFdvMQpvUVBmdjNzaFdybVJMenc2ZXVJQXRiWGM1Q2pCeUlha0NiaURuQVU4cktnK0IxSjRtdlFnckx3bzNxUHJ5Smd4ClNkaWRtWjJtRVI3RXorYzVCMG0vTGlJaEFnTUJBQUdqWHpCZE1Cc0dBMVVkRVFRVU1CS0NCVzFwYm1sdmdnbHMKYjJOaGJHaHZjM1F3SFFZRFZSME9CQllFRkpRMGdENWtFdFFyK3REcERTWjdrd1o4SDVoR01COEdBMVVkSXdRWQpNQmFBRkVJM1JNTWl5aUJqeVExUlM4bmxPUkpWZDFwQk1BMEdDU3FHU0liM0RRRUJDd1VBQTRJQkFRQmNkaGQrClI0Sm9HdnFMQms1OWRxSVVlY2N0dUZzcmRQeHNCaU9GaFlOZ1pxZWRMTTBVTDVEenlmQUhmVk8wTGZTRURkZFgKUkpMOXlMNytrTVUwVDc2Y3ZkQzlYVkFJRTZIVXdUbzlHWXNQcXN1eVpvVmpOcEVESkN3WTNDdm9ubEpWZTRkcQovZ0FiSk1ZQitUU21ZNXlEUHovSkZZL1haellhUGI3T2RlR3VqYlZUNUl4cDk3QXBTOFlJaXY3M0Mwd1ViYzZSCmgwcmNmUmJ5a1NRVWg5dmdWZFhSU1I4RFQzV0NmZHFOek5CWVh2OW1xZlc1ejRzYkdqK2wzd1VsL0kzRi9tSXcKZnlPNEN0aTRha2lHVkhsZmZFeTB3a3pWYUJ4aGNYajJJM0JVVGhCNFpxamxzc2llVmFGa3d2WG1teVJUMG9FVwo1SCtOUEhjcXVTMXpQc2NsCi0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K | ||
tls.key: LS0tLS1CRUdJTiBQUklWQVRFIEtFWS0tLS0tCk1JSUV2QUlCQURBTkJna3Foa2lHOXcwQkFRRUZBQVNDQktZd2dnU2lBZ0VBQW9JQkFRQ2p5blZ6K1hIQjhPQ1kKNHBzRUVZbWpvYnBkelRsb3d3Y1NRTitZRFBDa0Jlb3IwVGI4N0VnTHpKSytKWWJ1cG9YQmxOSUpQUWFvd0pFbwovU3pTazhmdTJYU015dkFaWTRGV0d4SnkyeXhJeHZQL2liT0dPWXVpUEdYSzI0dDZaY0dUVVZoYXVpWkdTZ2tXCnJlaldXaDdNakZTK2MxdmFZWnFCK1ExelBzNVBGTWxjOGw1Vi8rYjhaN2pNSmk4NG1PZklQeGprdkl5SnI1VWsKZUYzVWZMcUpTNXk0TEY0dHk0RnQyaUFkN2JiZkhhbmZ2WW12NlVvRFZ0TVh0V28xb1FQZnYzc2hXcm1STHp3NgpldUlBdGJYYzVDakJ5SWFrQ2JpRG5BVThyS2crQjFKNG12UWdyTHdvM3FQcnlKZ3hTZGlkbVoybUVSN0V6K2M1CkIwbS9MaUloQWdNQkFBRUNnZ0VBQWQzdDVzdUNFMjdXY0llc3NxZ3NoSFAwZHRzKyswVzF6K3h6WC8xTnhPRFkKWVhWNkJmbi9mRHJ4dFQ4aVFaZ2VVQzJORTFQaHZveXJXdWMvMm9xYXJjdEd1OUFZV29HNjJLdG9VMnpTSFdZLwpJN3VERTFXV2xOdlJZVFdOYW5DOGV4eGpRRzE4d0RKWjFpdFhTeEl0NWJEM3lrL3dUUlh0dCt1SnpyVjVqb2N1CmNoeERMd293aXUxQWo2ZFJDWk5CejlUSnh5TnI1ME5ZVzJVWEJhVC84N1hyRkZkSndNVFZUMEI3SE9uRzdSQlYKUWxLdzhtcVZiYU5lbmhjdk1qUjI5c3hUekhSK2p4SU8zQndPNk9Hai9PRmhGQllVN1RMWGVsZDFxb2UwdmIyRwpiOGhQcEd1cHRyNUF0OWx3MXc1d1EzSWdpdXRQTkg1cXlEeUNwRWw2RVFLQmdRRGNkYnNsT2ZLSmo3TzJMQXlZCkZ0a1RwaWxFMFYzajBxbVE5M0lqclY0K0RSbUxNRUIyOTk0MDdCVVlRUWoxL0RJYlFjb1oyRUVjVUI1cGRlSHMKN0RNRUQ2WExIYjJKVTEyK2E3c1d5Q05kS2VjZStUNy9JYmxJOFR0MzQwVWxIUTZ6U01TRGNqdmZjRkhWZ3YwcwpDYWpoRng3TmtMRVhUWnI4ZlQzWUloajR2UUtCZ1FDK01nWjFVbW9KdzlJQVFqMnVJVTVDeTl4aldlWURUQU8vCllhWEl6d2xnZTQzOE1jYmI0Y04yU2FOU0dEZ1Y3bnU1a3FpaWhwalBZV0lpaU9CcDlrVFJIWE9kUFc0N3N5ZUkKdDNrd3JwMnpWbFVnbGNNWlo2bW1WM1FWYUFOWmdqVTRSU3Y0ZS9WeFVMamJaYWZqUHRaUnNqWkdwSzBZVTFvdApWajhJZVE3Zk5RS0JnQ1ArWk11ekpsSW5VQ1FTRlF4UHpxbFNtN0pNckpPaHRXV2h3TlRxWFZTc050dHV5VmVqCktIaGpneDR1b0JQcFZSVDJMTlVEWmI0RnByRjVPYVhBK3FOVEdyS0s3SU1iUlZidHArSVVVeEhHNGFGQStIUVgKUVhVVFRhNUpRT1RLVmJnWHpWM1lyTVhTUk1valZNcDMyVWJHeTVTc1p2MXpBamJ2QzhYWjYxSFJBb0dBZEJjUQp2aGU1eFpBUzVEbUtjSGkvemlHa3ViZXJuNk9NUGdxYUtJSEdsVytVOExScFR0ajBkNFRtL1Rydk1PUEovVEU1CllVcUtoenBIcmhDaCtjdHBvY0k2U1dXdm5SenpLbzNpbVFaY0Y1VEFqUTBjY3F0RmI5UzlkRHR5bi9YTUNqYWUKYWlNdll5VUVVRll5TFpDelBGWnNycDNoVVpHKzN5RmZoQXB
3TzJrQ2dZQkh3WWFQSWRXNld3NytCMmhpbjBvdwpqYTNjZXN2QTRqYU1Qd1NMVDhPTnRVMUdCU01md2N6TWJuUEhMclJ2Qjg3bjlnUGFSMndRR1VtckZFTzNMUFgvCmtSY09HcFlCSHBEWEVqRGhLa1dkUnVMT0ZnNEhMWmRWOEFOWmxRMFZTY0U4dTNkRERVTzg5cEdEbjA4cVRBcmwKeDlreHN1ZEVWcmtlclpiNVV4RlZxUT09Ci0tLS0tRU5EIFBSSVZBVEUgS0VZLS0tLS0K | ||
--- | ||
apiVersion: secrets.stackable.tech/v1alpha1 | ||
kind: SecretClass | ||
metadata: | ||
  name: minio-credentials | ||
spec: | ||
  backend: | ||
    k8sSearch: | ||
      searchNamespace: | ||
        pod: {} | ||
--- | ||
apiVersion: v1 | ||
kind: Secret | ||
metadata: | ||
  name: minio-credentials-secret | ||
  labels: | ||
    secrets.stackable.tech/class: minio-credentials | ||
stringData: | ||
  accessKey: minio-access-key | ||
  secretKey: minio-secret-key | ||
--- | ||
apiVersion: trino.stackable.tech/v1alpha1 | ||
kind: TrinoCatalog | ||
metadata: | ||
  name: tpch | ||
  labels: | ||
    trino: trino-fault-tolerant | ||
spec: | ||
  connector: | ||
    tpch: {} |
213 changes: 213 additions & 0 deletions
docs/modules/trino/pages/usage-guide/fault-tolerant-execution.adoc
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,213 @@ | ||
= Fault-tolerant execution | ||
:description: Configure fault-tolerant execution in Trino clusters for improved query resilience and automatic retry capabilities. | ||
:keywords: fault-tolerant execution, retry policy, exchange manager, spooling, query resilience | ||
|
||
Fault-tolerant execution is a mechanism in Trino that enables a cluster to mitigate query failures by retrying queries or their component tasks in the event of failure. | ||
With fault-tolerant execution enabled, intermediate exchange data is spooled and can be re-used by another worker in the event of a worker outage or other failures during query execution. | ||
|
||
By default, if a Trino node lacks the resources to execute a task or otherwise fails during query execution, the query fails and must be run again manually. | ||
The longer the runtime of a query, the more likely it is to be susceptible to such failures. | ||
|
||
NOTE: Fault tolerance does not apply to broken queries or other user errors. | ||
For example, Trino does not spend resources retrying a query that fails because its SQL cannot be parsed. | ||
|
||
Take a look at the link:https://trino.io/docs/current/admin/fault-tolerant-execution.html[Trino documentation for fault-tolerant execution {external-link-icon}^] to learn more. | ||
|
||
== Configuration | ||
|
||
Fault-tolerant execution is not enabled by default. | ||
It can be enabled in the `TrinoCluster` resource by adding a `faultTolerantExecution` section to the cluster configuration. | ||
The configuration is structured: you choose either the `query` or the `task` retry policy, each with its own configuration options. | ||
|
||
=== Query retry policy | ||
|
||
A `query` retry policy instructs Trino to automatically retry a query in the event of an error occurring on a worker node. | ||
This policy is recommended when the majority of the Trino cluster's workload consists of many small queries. | ||
|
||
By default, Trino does not implement fault tolerance for queries whose result set exceeds 32Mi in size. | ||
This limit can be increased by modifying the `exchangeDeduplicationBufferSize` configuration property to be greater than the default value of `32Mi`, but this results in higher memory usage on the coordinator. | ||
|
||
[source,yaml] | ||
---- | ||
spec: | ||
  clusterConfig: | ||
    faultTolerantExecution: | ||
      query: | ||
        retryAttempts: 3 | ||
        exchangeDeduplicationBufferSize: 64Mi # Increased from default 32Mi | ||
---- | ||
|
||
=== Task retry policy | ||
|
||
A `task` retry policy instructs Trino to retry individual query tasks in the event of failure. | ||
You **must** configure an exchange manager to use the task retry policy. | ||
This policy is recommended when executing large batch queries, as the cluster can more efficiently retry smaller tasks within the query, rather than retry the whole query. | ||
|
||
IMPORTANT: A `task` retry policy is best suited for long-running queries, but this policy can result in higher latency for short-running queries executed in high volume. | ||
As a best practice, run a dedicated cluster with a `task` retry policy for large batch queries, separate from the cluster that handles short queries. | ||
Tools such as link:https://github.com/stackabletech/trino-lb[trino-lb {external-link-icon}^] and link:https://github.com/trinodb/trino-gateway[trino-gateway {external-link-icon}^] can help by automatically routing queries to different Trino clusters based on criteria such as query estimates or user. | ||
|
||
[source,yaml] | ||
---- | ||
spec: | ||
  clusterConfig: | ||
    faultTolerantExecution: | ||
      task: | ||
        retryAttemptsPerTask: 4 | ||
        exchangeManager: # Mandatory for Task retry policy | ||
          encryptionEnabled: true | ||
          s3: | ||
            baseDirectories: | ||
              - "s3://trino-exchange-bucket/spooling" | ||
            connection: | ||
              reference: my-s3-connection # <1> | ||
---- | ||
<1> Reference to an xref:concepts:s3.adoc[S3Connection] resource | ||
|
||
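The example manifest in this PR also sets the optional retry delay fields on the `task` policy.
The following sketch shows how those values might interact. It assumes that Trino multiplies the delay by the scale factor after each failed attempt and caps it at the maximum; the waits listed in the comments are illustrative arithmetic under that assumption, not measured behavior.

[source,yaml]
----
spec:
  clusterConfig:
    faultTolerantExecution:
      task:
        retryAttemptsPerTask: 4
        retryInitialDelay: 10s      # wait before the first retry
        retryMaxDelay: 60s          # upper bound for any single wait
        retryDelayScaleFactor: 2.0  # growth factor per failed attempt
        # Under the assumption above, four retries would wait
        # about 10s, 20s, 40s and 60s (80s capped at retryMaxDelay).
        exchangeManager:
          s3:
            baseDirectories:
              - "s3://trino-exchange-bucket/spooling"
            connection:
              reference: my-s3-connection
----
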
== Exchange manager | ||
|
||
Exchange spooling is responsible for storing and managing spooled data for fault-tolerant execution. | ||
You can configure a filesystem-based exchange manager that stores spooled data in a specified location, such as AWS S3 and S3-compatible systems, HDFS, or a local filesystem. | ||
|
||
NOTE: An exchange manager is required when using the `task` retry policy and optional for the `query` retry policy. | ||
|
||
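Because the exchange manager is optional for the `query` retry policy, it can be added there to spool larger result sets to external storage instead of growing the coordinator's deduplication buffer.
The following is a minimal sketch that assumes `exchangeManager` nests under `query` in the same way it nests under `task`; the bucket and connection names are reused from the task example and are placeholders.

[source,yaml]
----
spec:
  clusterConfig:
    faultTolerantExecution:
      query:
        retryAttempts: 3
        exchangeManager: # optional for the query retry policy (nesting assumed)
          s3:
            baseDirectories:
              - "s3://trino-exchange-bucket/spooling"
            connection:
              reference: my-s3-connection
----
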
=== S3-compatible storage | ||
|
||
You can use S3-compatible storage systems for exchange spooling, including AWS S3 and MinIO. | ||
|
||
[source,yaml] | ||
---- | ||
spec: | ||
  clusterConfig: | ||
    faultTolerantExecution: | ||
      task: | ||
        retryAttemptsPerTask: 4 | ||
        exchangeManager: | ||
          s3: | ||
            baseDirectories: # <1> | ||
              - "s3://exchange-bucket-1/trino-spooling" | ||
            connection: | ||
              reference: minio-s3-connection # <2> | ||
--- | ||
apiVersion: s3.stackable.tech/v1alpha1 | ||
kind: S3Connection | ||
metadata: | ||
  name: minio-s3-connection | ||
spec: | ||
  host: minio.default.svc.cluster.local | ||
  port: 9000 | ||
  accessStyle: Path | ||
  credentials: | ||
    secretClass: minio-secret-class | ||
  tls: | ||
    verification: | ||
      server: | ||
        caCert: | ||
          secretClass: tls | ||
---- | ||
<1> Multiple S3 buckets can be specified to distribute I/O load | ||
<2> S3 connection defined as a reference to an xref:concepts:s3.adoc[S3Connection] resource | ||
|
||
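As a sketch of callout 1, the list below spreads spooling I/O across two buckets.
The second bucket name is hypothetical, and the sketch assumes both buckets are reachable through the same `S3Connection`.

[source,yaml]
----
spec:
  clusterConfig:
    faultTolerantExecution:
      task:
        retryAttemptsPerTask: 4
        exchangeManager:
          s3:
            baseDirectories:
              - "s3://exchange-bucket-1/trino-spooling"
              - "s3://exchange-bucket-2/trino-spooling" # hypothetical second bucket
            connection:
              reference: minio-s3-connection
----
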
For storage systems like Google Cloud Storage or Azure Blob Storage, you can use the S3-compatible configuration with `configOverrides` to provide the necessary exchange manager properties. | ||
|
||
=== HDFS storage | ||
|
||
You can configure HDFS as the exchange spooling destination: | ||
|
||
[source,yaml] | ||
---- | ||
spec: | ||
  clusterConfig: | ||
    faultTolerantExecution: | ||
      task: | ||
        retryAttemptsPerTask: 4 | ||
        exchangeManager: | ||
          hdfs: | ||
            baseDirectories: | ||
              - "hdfs://simple-hdfs/exchange-spooling" | ||
            hdfs: | ||
              configMap: simple-hdfs # <1> | ||
---- | ||
<1> ConfigMap containing HDFS configuration files (created by the HDFS operator) | ||
|
||
=== Local filesystem storage | ||
|
||
Local filesystem storage is supported, but it is only recommended for development or single-node deployments. | ||
|
||
WARNING: Use a local filesystem for exchange only in standalone, non-production clusters. | ||
A local directory can only be used for exchange in a distributed cluster if the exchange directory is shared and accessible from all nodes. | ||
|
||
[source,yaml] | ||
---- | ||
spec: | ||
  clusterConfig: | ||
    faultTolerantExecution: | ||
      task: | ||
        exchangeManager: | ||
          local: | ||
            baseDirectories: | ||
              - "/trino-exchange" | ||
  coordinators: | ||
    roleGroups: | ||
      default: | ||
        replicas: 1 | ||
        podOverrides: | ||
          spec: | ||
            volumes: | ||
              - name: trino-exchange | ||
                persistentVolumeClaim: | ||
                  claimName: trino-exchange-pvc | ||
            containers: | ||
              - name: trino | ||
                volumeMounts: | ||
                  - name: trino-exchange | ||
                    mountPath: /trino-exchange | ||
  workers: | ||
    roleGroups: | ||
      default: | ||
        replicas: 1 | ||
        podOverrides: | ||
          spec: | ||
            volumes: | ||
              - name: trino-exchange | ||
                persistentVolumeClaim: | ||
                  claimName: trino-exchange-pvc | ||
            containers: | ||
              - name: trino | ||
                volumeMounts: | ||
                  - name: trino-exchange | ||
                    mountPath: /trino-exchange | ||
--- | ||
kind: PersistentVolumeClaim | ||
apiVersion: v1 | ||
metadata: | ||
  name: trino-exchange-pvc | ||
spec: | ||
  accessModes: | ||
    - ReadWriteOnce | ||
  resources: | ||
    requests: | ||
      storage: 50Gi | ||
---- | ||
|
||
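The claim in the example above uses `ReadWriteOnce`, which only allows the volume to be mounted by pods on a single Kubernetes node.
If the coordinator and worker pods may be scheduled on different nodes, the shared directory requirement from the warning above points to a claim with `ReadWriteMany` instead.
The following is a sketch only: the `storageClassName` value is hypothetical and must refer to a storage class in your cluster that supports `ReadWriteMany`.

[source,yaml]
----
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: trino-exchange-pvc
spec:
  accessModes:
    - ReadWriteMany # lets pods on different nodes mount the same volume
  storageClassName: shared-rwx # hypothetical; must support ReadWriteMany
  resources:
    requests:
      storage: 50Gi
----
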
== Connector support | ||
|
||
Support for fault-tolerant execution of SQL statements varies on a per-connector basis. | ||
Take a look at the link:https://trino.io/docs/current/admin/fault-tolerant-execution.html#configuration[Trino documentation {external-link-icon}^] to see which connectors support fault-tolerant execution. | ||
|
||
When using connectors that do not explicitly support fault-tolerant execution, you may encounter a "This connector does not support query retries" error message. | ||
|
||
== Example | ||
|
||
The following example shows a Trino cluster with fault-tolerant execution enabled, using the `task` retry policy and MinIO-backed S3 storage for the exchange manager: | ||
|
||
[source,bash] | ||
---- | ||
stackablectl operator install commons secret listener trino | ||
helm install minio oci://registry-1.docker.io/bitnamicharts/minio --version 17.0.19 --set auth.rootUser=minio-access-key --set auth.rootPassword=minio-secret-key --set tls.enabled=true --set tls.server.existingSecret=minio-tls-certificates --set tls.existingSecret=minio-tls-certificates --set tls.existingCASecret=minio-tls-certificates --set tls.autoGenerated.enabled=false --set provisioning.enabled=true --set provisioning.buckets[0].name=trino-exchange-bucket --set global.security.allowInsecureImages=true --set image.repository=bitnamilegacy/minio --set clientImage.repository=bitnamilegacy/minio-client --set defaultInitContainers.volumePermissions.image.repository=bitnamilegacy/os-shell --set console.image.repository=bitnamilegacy/minio-object-browser | ||
---- | ||
|
||
[source,yaml] | ||
---- | ||
include::example$usage-guide/fault-tolerant-execution.yaml[] | ||
---- |