
cluster: expel instances removed from config in cluster.sync #424


Open: wants to merge 1 commit into master from cluster-sync-fix

Conversation

@locker locker (Member) commented Jun 11, 2025

Currently, cluster:sync() leaves instances that were removed from the config in the instance list. This makes cluster:reload() fail because it tries to reload the config for all instances, including those that are no longer in the config. To fix that, let's make cluster:sync() move instances removed from the config to a special expelled instance list, so that they become inaccessible via the cluster object but are still dropped by cluster:drop().

Also, let's add a new boolean option, start_stop, for cluster:sync(). If set, cluster:sync() will stop removed servers and start newly added ones.
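For illustration, here is a rough sketch of how a test might use the new option. The helper module path, the cluster constructor, the sync()/reload() signatures, and the config layout are assumptions made for this example, not taken from the patch.

-- Sketch only: module path, constructor, and signatures are assumptions.
local t = require('luatest')
local cluster = require('luatest.cluster') -- assumed module path

local g = t.group()

-- A declarative config with two instances; the exact shape is an assumption.
local config = {
    credentials = {users = {guest = {roles = {'super'}}}},
    iproto = {listen = {{uri = 'unix/:./{{ instance_name }}.iproto'}}},
    groups = {
        ['group-001'] = {
            replicasets = {
                ['replicaset-001'] = {
                    instances = {
                        ['instance-001'] = {},
                        ['instance-002'] = {},
                    },
                },
            },
        },
    },
}

g.test_expel_on_sync = function()
    local c = cluster:new(config) -- assumed constructor
    c:start()

    -- Remove instance-002 from the config and sync with start_stop set:
    -- the removed server is stopped and moved to the expelled list, so the
    -- subsequent reload() touches only the remaining instance, while drop()
    -- still cleans up the expelled server object.
    local new_config = table.deepcopy(config) -- Tarantool's table extension
    new_config.groups['group-001'].replicasets['replicaset-001'].instances['instance-002'] = nil
    c:sync(new_config, {start_stop = true}) -- assumed signature
    c:reload(new_config)
    c:drop()
end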

Closes #423

Comment on lines 227 to 247
for _, iserver in pairs(old_server_map) do
iserver:drop()
end
Member

That's interesting. We don't start new instances in :sync(), we just create server objects in Lua. So it looks like we shouldn't stop them either.

But,

  1. If we don't stop the server, it remains running after the test case.
  2. A user may want some automation to keep the running instances in sync with the configured ones.

The first point is simple: just track the excluded server objects in some mapping so we can call :drop() on them in the after_each hook (self._excluded_server_map or something like this). We can also reuse a server object from this mapping if a previously deleted server is added back again.
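A minimal sketch of that first point, assuming the server objects are kept in per-name maps on the cluster object; all names besides :drop() are illustrative, not the actual implementation:

-- Illustrative sketch only: field and helper names are assumptions.
function Cluster:_expel_server(iserver)
    self._server_map[iserver.alias] = nil
    -- Keep the object so it is dropped together with the cluster and can
    -- be reused if the same instance is added back to the config later.
    self._excluded_server_map[iserver.alias] = iserver
end

function Cluster:drop()
    for _, iserver in pairs(self._server_map) do
        iserver:drop()
    end
    for _, iserver in pairs(self._excluded_server_map) do
        iserver:drop()
    end
end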

The second point is really interesting. Since Tarantool 3.3, a user can configure autoexpelling if needed. So, to manage the cluster, we really just need to update the config and start/stop the processes.

It looks like we really have two suitable methods:

  • update config + update server objects
  • update config + update server objects + start/stop processes

The first one is what :sync() does now. We could implement the second one as a :sync() option. What do you think?

Member Author

Makes sense, thanks for the suggestion. Now cluster:sync() moves removed instances to a separate list so that they are dropped along with the cluster. I also added the new start_stop option, which makes cluster:sync() start added instances and stop removed ones.
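Roughly, the resulting behavior could look like the sketch below; the helper functions and field names are guesses based on this discussion, not the actual diff:

-- Sketch of the described cluster:sync() behavior; not the real code.
function Cluster:sync(config, opts)
    opts = opts or {}
    local old_server_map = self._server_map
    self._server_map = {}

    -- Create or reuse server objects for instances present in the config.
    for _, name in ipairs(instance_names(config)) do -- hypothetical helper
        local iserver = old_server_map[name]
            or self._expelled_server_map[name]
            or self:_new_server(name) -- hypothetical helper
        old_server_map[name] = nil
        self._expelled_server_map[name] = nil
        self._server_map[name] = iserver
        if opts.start_stop and iserver.process == nil then
            iserver:start()
        end
    end

    -- Instances no longer present in the config are expelled: they become
    -- inaccessible via the cluster but are still dropped by cluster:drop().
    for name, iserver in pairs(old_server_map) do
        if opts.start_stop then
            iserver:stop()
        end
        self._expelled_server_map[name] = iserver
    end
end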

@Totktonada Totktonada assigned locker and unassigned Totktonada Jun 23, 2025
@locker locker force-pushed the cluster-sync-fix branch from f21b8ae to 163c7d4 on July 18, 2025 09:59
@locker locker changed the title from "cluster: drop instances removed from config in cluster.sync" to "cluster: expel instances removed from config in cluster.sync" Jul 18, 2025
@locker locker requested a review from Totktonada July 18, 2025 10:01
@locker locker assigned Totktonada and unassigned locker Jul 18, 2025
Development

Successfully merging this pull request may close the following issue: Support removal of instances from test cluster (#423)
2 participants