Support for distributed migrations #130
Open
+350
−18
The main motivation for this PR is to fix the handling of migrations that Django performs through a load balancer. When a ClickHouse cluster with multiple nodes sits behind a load balancer with round-robin in effect, results can be inconsistent. Making migrations distributed makes every node aware of the migration data, so running manage.py migrate gives much more consistent results. It also makes distributing migration data automatic. (See discussion #114)
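To illustrate the failure mode, here is a toy model in plain Python. It is not the backend's code: `Node`, `make_balancer`, and `distributed_read` are hypothetical names, and each node's migration table is modeled as a simple set.

```python
from itertools import cycle


class Node:
    """Toy stand-in for one ClickHouse node with its own local migration table."""

    def __init__(self, name):
        self.name = name
        self.applied = set()  # names of migrations recorded on this node only


def make_balancer(nodes):
    """Return a function that hands out nodes in round-robin order."""
    it = cycle(nodes)
    return lambda: next(it)


def distributed_read(nodes):
    """Model a read through a distributed table: it sees rows from every node."""
    return set().union(*(n.applied for n in nodes))


nodes = [Node("n1"), Node("n2"), Node("n3")]
next_node = make_balancer(nodes)

# Local table: the write lands on n1, the next request is routed to n2.
next_node().applied.add("0001_initial")
seen_local = "0001_initial" in next_node().applied

# Distributed table: the read covers all nodes, whichever one answers.
seen_distributed = "0001_initial" in distributed_read(nodes)

print(seen_local, seen_distributed)
```

With local tables, whether a migration looks applied depends on which node the balancer picks. Reading through a distributed table removes that dependency.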
When `distributed_migrations` and `migration_cluster` are set, new distributed and local tables are created for migrations, and all migration querysets are routed to the distributed table.

To test the load-balancing use case, a new Docker Compose service was added for HAProxy. For simplicity, the existing ClickHouse nodes are placed behind HAProxy.
Example configuration would be
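A minimal sketch of what such a Django settings entry might look like. Only the option names `distributed_migrations` and `migration_cluster` come from this description. Their placement under `OPTIONS`, the engine path, the host, and the cluster name are assumptions for illustration.

```python
# settings.py (sketch; values are placeholders)
DATABASES = {
    "default": {
        # Assumed engine path; use whatever your ClickHouse backend documents.
        "ENGINE": "clickhouse_backend.backend",
        # Hypothetical load balancer address in front of the cluster nodes.
        "HOST": "clickhouse-lb.example.com",
        "OPTIONS": {
            # Assumed to live under OPTIONS; the cluster name is a placeholder.
            "migration_cluster": "cluster",
            "distributed_migrations": True,
        },
    }
}
```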
In my case, a ClickHouse cluster with 3 nodes sits behind an AWS ELB, and every time I ran `makemigrations` or `migrate` I could get a different result. With distributed migrations, all of these issues went away.