@ianton-ru ianton-ru commented Aug 22, 2025

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

JOIN with *Cluster table functions

Documentation entry for user-facing changes

Solved #819

Queries like

SELECT a.c1, b.c2
  FROM
    s3Cluster(...) AS a
  JOIN
    local_table AS b
  ON
    a.key=b.key

execute the JOIN on the remote nodes.
This remains the behavior with the default value of the new setting, object_storage_cluster_join_mode='allow'.

With this PR, setting object_storage_cluster_join_mode='local' rewrites the query sent to the remote nodes to

SELECT a.c1, a.key
  FROM
    s3Cluster(...) AS a

and the JOIN is executed on the initiator.

The 'global' mode (select from the right table and send the result to the swarm nodes, so that the join runs on the swarm nodes) will follow later in a separate PR.
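
For illustration only, the 'local' rewrite described above can be sketched as a projection over the columns that belong to the cluster table function. This is a simplified string-based model, not the PR's implementation (which works on the query tree); `ColumnRef` and `buildRemoteQuery` are hypothetical names introduced for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a resolved column reference:
// the alias of the table it belongs to and the column name.
struct ColumnRef
{
    std::string table_alias;
    std::string column;
};

// Build the text of the query that would be sent to the remote nodes:
// keep only the columns read from the cluster table function and drop the JOIN.
std::string buildRemoteQuery(
    const std::string & table_function, // e.g. "s3Cluster(...)"
    const std::string & alias,
    const std::vector<ColumnRef> & used_columns)
{
    std::vector<std::string> projection;
    std::set<std::string> seen;
    for (const auto & ref : used_columns)
        if (ref.table_alias == alias && seen.insert(ref.column).second)
            projection.push_back(ref.column); // first occurrence only, original order

    assert(!projection.empty() && "at least one column must be read from the left table");

    std::string query = "SELECT ";
    for (std::size_t i = 0; i < projection.size(); ++i)
    {
        if (i != 0)
            query += ", ";
        query += alias + "." + projection[i];
    }
    query += " FROM " + table_function + " AS " + alias;
    return query;
}
```

With the columns from the example query (`a.c1`, `b.c2`, `a.key`, `b.key`), only `a.c1` and `a.key` survive in the remote query; `b.c2` and the join condition are handled on the initiator.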

Exclude tests:

  • Fast test
  • Integration Tests
  • Stateless tests
  • Stateful tests
  • Performance tests
  • All with ASAN
  • All with TSAN
  • All with MSAN
  • All with UBSAN
  • All with Coverage
  • All with Aarch64
  • All Regression
  • Disable CI Cache

@ianton-ru
Author

Don't forget to change the merge target branch to antalya-25.6.5 after #949 is merged there.

@ianton-ru ianton-ru changed the title from "WIP s3cluster global join" to "s3cluster joins, part 1" Sep 1, 2025
auto table_function_node = table_function_searcher.getNode();

if (!table_function_node)
throw Exception(ErrorCodes::LOGICAL_ERROR, "Can't fiund table function node");
Member

Suggested change
throw Exception(ErrorCodes::LOGICAL_ERROR, "Can't fiund table function node");
throw Exception(ErrorCodes::LOGICAL_ERROR, "Can't find table function node");


- `local` — Replaces the database and table in the subquery with local ones for the destination server (shard), leaving the normal `IN`/`JOIN.`
- `global` — Unsupported for now. Replaces the `IN`/`JOIN` query with `GLOBAL IN`/`GLOBAL JOIN.`
- `allow` — Default value. Allows the use of these types of subqueries.
Member

Suggested change
- `allow` — Default value. Allows the use of these types of subqueries.
- `allow` — Default value. Allows the use of these types of subqueries. The join will be executed on the initiator.

Should we also support 'forbid'?

Member

BTW - what will happen when joining xxxCluster with non-distributed table? Maybe a test?

Author

No, it's the old default behaviour: the query is executed on the remote nodes "as is" and the join is executed there.
I guess this is optimal for the usual, non-swarm mode, when the right table is present on all cluster nodes.

using Base = InDepthQueryTreeVisitorWithContext<SearcherVisitor>;
using Base::Base;

explicit SearcherVisitor(QueryTreeNodeType type_, ContextPtr context) : Base(context), type(type_) {}
Member

It looks like the code works only with the analyzer (i.e. it is built on top of QueryTree, not on top of the AST). That's OK for new features, but maybe it's better to document it clearly. I was also thinking about throwing an exception when the analyzer is disabled. WDYT?

Member

Full support for the non-analyzer code path can be more challenging (see, for example, JoinToSubqueryTransformVisitor).

Author

Added an exception.
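
One plausible shape of such a guard, as a standalone sketch. The thread does not show the actual check, so everything here is an assumption: `JoinRewriteSettings`, its fields, and `checkAnalyzerForJoinMode` are hypothetical, and the real code would use ClickHouse's own settings and exception types.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical, minimal view of the two settings involved.
struct JoinRewriteSettings
{
    bool analyzer_enabled = true;
    std::string join_mode = "allow"; // value of object_storage_cluster_join_mode
};

// Assumption: only the modes that rewrite the query need the analyzer,
// so the default 'allow' mode keeps working without it.
void checkAnalyzerForJoinMode(const JoinRewriteSettings & settings)
{
    if (settings.join_mode != "allow" && !settings.analyzer_enabled)
        throw std::runtime_error(
            "object_storage_cluster_join_mode='" + settings.join_mode
            + "' requires the analyzer to be enabled");
}
```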

passed_node = node;
}

QueryTreeNodePtr getNode() const { return passed_node; }
Member

Actually, I have never looked closely at the analyzer code; I thought it would be simpler (i.e. avoiding tree traversal), but I could be wrong.
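
For readers unfamiliar with the pattern, the search that the visitor performs can be sketched on a toy tree as a depth-first lookup of the first node of a given type. This does not use the real `InDepthQueryTreeVisitorWithContext` API; `Node`, `NodeType` and `findFirstOfType` are hypothetical stand-ins, and returning the first match is an assumption about the visitor's behaviour.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical node kinds, loosely mirroring query tree node types.
enum class NodeType { Query, Join, Table, TableFunction };

struct Node
{
    NodeType type;
    std::string name;
    std::vector<std::shared_ptr<Node>> children;
};
using NodePtr = std::shared_ptr<Node>;

// Depth-first, pre-order search: returns the first node of the requested
// type, or nullptr when the tree contains no such node.
NodePtr findFirstOfType(const NodePtr & root, NodeType wanted)
{
    if (!root)
        return nullptr;
    if (root->type == wanted)
        return root;
    for (const auto & child : root->children)
        if (auto found = findFirstOfType(child, wanted))
            return found;
    return nullptr;
}
```

A null result corresponds to the `if (!table_function_node)` check in the diff, where the caller throws a `LOGICAL_ERROR`.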

query_tree_distributed->setAlias(table_function_ast.alias);

// Find add used columns from table function to make proper projection list
CollectUsedColumnsForSourceVisitor collector(table_function_node, context);
Member

Maybe things like tryResolveIdentifierFromJoinTreeNode / tryResolveIdentifierFromJoinTree / tryResolveIdentifierFromTableExpression can be reused.

@ianton-ru ianton-ru changed the base branch from feature/antalya-25.6.5/alternative_syntax to antalya-25.6.5 September 8, 2025 22:21
Possible values:

- `local` — Replaces the database and table in the subquery with local ones for the destination server (shard), leaving the normal `IN`/`JOIN.`
- `global` — Unsupported for now. Replaces the `IN`/`JOIN` query with `GLOBAL IN`/`GLOBAL JOIN.`
Member

If it is unsupported, then why was it added? I suggest dropping that value from the enum.

Author

It's for part 2, when the right table is calculated on the initiator and sent as part of the query to the swarm nodes.
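
A minimal sketch of how the three modes could be dispatched while 'global' stays reserved for part 2. The enum name and the `ALLOW` value appear in the diff; the `LOCAL` and `GLOBAL` spellings and the helpers `parseJoinMode` and `joinRunsOnInitiator` are assumptions made for illustration.

```cpp
#include <stdexcept>
#include <string>

enum class ObjectStorageClusterJoinMode { LOCAL, GLOBAL, ALLOW };

// Hypothetical helper: map the setting's string value to the enum.
ObjectStorageClusterJoinMode parseJoinMode(const std::string & value)
{
    if (value == "local")
        return ObjectStorageClusterJoinMode::LOCAL;
    if (value == "global")
        return ObjectStorageClusterJoinMode::GLOBAL;
    if (value == "allow")
        return ObjectStorageClusterJoinMode::ALLOW;
    throw std::invalid_argument("Unknown object_storage_cluster_join_mode: " + value);
}

// Hypothetical helper: true when the JOIN is stripped from the remote query
// and executed on the initiator instead.
bool joinRunsOnInitiator(ObjectStorageClusterJoinMode mode)
{
    switch (mode)
    {
        case ObjectStorageClusterJoinMode::ALLOW:
            return false; // old behaviour: query is sent "as is", join runs remotely
        case ObjectStorageClusterJoinMode::LOCAL:
            return true;  // part 1: remote nodes only read the left table
        case ObjectStorageClusterJoinMode::GLOBAL:
            throw std::logic_error("'global' mode is not supported yet"); // part 2
    }
    throw std::logic_error("Unexpected join mode");
}
```

Keeping the value in the enum and rejecting it at use time lets part 2 add the behaviour without changing the setting's accepted values.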

@@ -1713,6 +1713,22 @@ Possible values:
- `global` — Replaces the `IN`/`JOIN` query with `GLOBAL IN`/`GLOBAL JOIN.`
- `allow` — Allows the use of these types of subqueries.
)", IMPORTANT) \
DECLARE(ObjectStorageClusterJoinMode, object_storage_cluster_join_mode, ObjectStorageClusterJoinMode::ALLOW, R"(
Changes the behaviour of object storage cluster function ot table.
Member

Suggested change
Changes the behaviour of object storage cluster function ot table.
Changes the behaviour of object storage cluster function of table.

?

Author

'or' :)

github-actions bot commented Sep 9, 2025

Workflow [PR], commit [c0468e7]

@Enmk Enmk merged commit 53d1f5e into antalya-25.6.5 Sep 15, 2025
308 of 324 checks passed

3 participants