-
Notifications
You must be signed in to change notification settings - Fork 731
feat: prevent duplicates [CM-2320] #3282
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR titles must follow Conventional Commits. Love from, Your reviewers ❤️.
WHERE "deletedAt" IS NULL; | ||
|
||
-- 3. Create or replace the sync function | ||
CREATE OR REPLACE FUNCTION sync_repositories() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm so far we didn't use triggers or functions or procedures in the db itself so I'm wondering if it would be better to have this logic in the code itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking of doing it this way for performance reasons. We can also implement it in code and see how it performs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ye I would do that instead of triggers as it hides somewhat the logic. I mean if we were using this before I would let it be but if it's gonna be the only function we have in the db with such logic it will be hidden in a sense where people won't expect to find the logic here :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that makes perfectly sense, thanks!
9ec1c10
to
54f5bd6
Compare
6a2679d
to
0c14779
Compare
-- 2. Enforce that a repository can be assigned to only one active segment | ||
CREATE UNIQUE INDEX IF NOT EXISTS unique_active_segment_repos | ||
ON "segmentRepositories" (repository) | ||
WHERE "deletedAt" IS NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we add the unique index for every row, even if it's deleted, to avoid issues in case we ever remove the soft deleted marker?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this constraint to keep track of deletions. If we use a unique index for every row, I guess we must use hard deletion. If this is the case I think we can go directly with hard deletion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have to have another unique index for each row because we already have the primary key on the repository and segmentId, which already makes them unique. Or am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved. 👍 Just one question that isn't critical.
✍️ Proposed Changes
What:
insightsProjects
andsegments
.SegmentRepository.mappedRepos
, which was not accounting for thedeletedAt
field.How:
segmentRepositories
— stores the association between each repository and its correspondingsegment
with aUNIQUE INDEX
on the combination in order to prevent multiple assosiations.InsightsProject
a new functionupsertSegmentRepositories
is called. This will upsert in bulk all the new repo associated.softDeleteMissingSegmentRepositories
is called, which set the deletedAt on the repo which are not anymore related to the projectNotes:
This approach may introduce some duplication of data already available in the
githubRepos
andgitlabRepos
tables. However, since repository data is derived from theintegrations.settings
field too, enforcing database-level constraints directly on those tables is not feasible. Hence, this duplication is a necessary trade-off to maintain data integrity.🔗 colses: CM-2320
Checklist ✅
Feature
,Improvement
, orBug
.