-
Notifications
You must be signed in to change notification settings - Fork 27
improvement: rework Store.toSQL function to minimize UPDATE statements #21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@@ -18,9 +40,230 @@ export abstract class StoreBase implements Store { | |||
); | |||
} | |||
|
|||
protected _toSQL(props: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the relevant part with the algorithm.
return this._toSQL({ | ||
serializeInsertStatement, | ||
serializeUpdateStatement, | ||
getEndStatements, | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The dialect implementation is now slimmed down to serializing the insert and update statements! (+ providing optional "end statements", in Postgres' case, to fix the sequences).
@@ -253,5 +253,50 @@ describe.each(["postgresJs"] as const)("store: %s", (adapter) => { | |||
]), | |||
); | |||
}); | |||
test("should handle circular references", async () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a simple case, but I'm sure @avallete will be able to provide more elaborated ones and break all the things! 😂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
crack his fingers
import { type CodegenContext } from "#core/codegen/codegen.js"; | ||
import { type Shape, type TableShapePredictions } from "#trpc/shapes.js"; | ||
import { type CodegenContext } from "../../codegen/codegen.js"; | ||
import { shouldGenerateFieldValue } from "../../dataModel/shouldGenerateFieldValue.js"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jgoux bikeshed, but just checking if these ..
paths are intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so, they live in the same "module" which is #core
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, so convention-wise, we should use ../../
if its in the same "module" (e.g. code in core/
importing other code in core/
, and #
if its in a different module (e.g. code in dialect/
importing code in core/
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but it's very informal, so I wouldn't obsess on it. It's just a way for me to keep layers of responsibility separated. If I see a "#" I know it's from another layer of responsibility.
// do we need to keep track of the exact chain of parents? | ||
if (ctx.pending[parent.type].includes(parentRowId)) { | ||
if (parent.isRequired) { | ||
throw new Error( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question
Should I try to add tests case on this PR so we address that before merging ?
I wonder if the fact that we don't handle circular deps will decrease the UX for most of the users or not. Because before we were just appending some update and that was it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't handle non-nullable circular deps, which wouldn't be insertable anyway.
I don't think this comment is relevant anymore, I think the way we generate UPDATE statements and replace these nullable FK with NULL when necessary solves it. But you can try breaking it, of course, maybe I'm missing obvious cases!
serializeUpdateStatement: ctx.serializeUpdateStatement, | ||
}); | ||
|
||
ctx.insertStatements.push(insertStatement); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the idea that this line would be reached for parents first, then their children later? Such that we're always inserting parents before children, to avoid needing to later on update already inserted rows?
const insertStatements: Array<string> = []; | ||
const updateStatements: Array<string> = []; | ||
|
||
const sortedModels = sortModels(this.dataModel); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For what reason do we topologically sort the models? To make sure the parents are reached before other ancestors, or something?
].join(EOL), | ||
); | ||
} else { | ||
updatableParents.push(parent); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this line only be reached if there is a cyclic dependency + there is at least one nullable relation in the chain causing the cycle?
foreign key (last_order_id) references "order"(id); | ||
|
||
alter table product add constraint fk_first_order | ||
foreign key (first_order_id) references "order"(id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth also testing cycles where there are relations in-between? e.g. instead of order->customer->order
and order->product->order
, something like order->transaction->customer->order
, or something
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just left a bunch of questions for my understanding, but looks great! Code makes sense (provided I understood it correctly :D but the questions should indicate that at least). Nice that we also were able to abstract out the bulk of the logic so it is not per-dialect. Great work.
Reconverting the PR to draft as @avallete was able to destroy my implementation with more complicated tests! |
Cool, but was really just me wanting to understand the approach. If the implementation changing though, then I'll just re-grok and re-ask when the PR is open again, so works for me. |
I'd like us to consider using a more strategic approach:
|
We have a user who encountered a "bug" because we split our insertions between
INSERT
andUPDATE
statements.This PR's goal is to use as many
INSERT
statements as we can.UPDATE
statements should only be used if we encounter a circular dependency.The strategy here is to follow the rows structure in order to generate the statements.
For a given row, we look for parents, and if there are, we recursively (sorry @avallete 😂) create those parents statements.
It ensures that we will always create the rows in the right order, so we can get rid of the
UPDATE
.Context: https://discord.com/channels/788353076129038346/1214886051886268426
I also updated the dependencies and rewrote some imports, I commented on the interesting files so it's easier to follow. 👍