-
Notifications
You must be signed in to change notification settings - Fork 428
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement Connection.predefine_table_schema (#422) #749
base: master
Are you sure you want to change the base?
Implement Connection.predefine_table_schema (#422) #749
Conversation
Provide existing model schema to a table connection in model to reduce DescribeTable requests
Avoid using unittest.TestCase.subTest, which is supported only since version 3.4
I didn't do any refactoring to avoid unnecessary big pull requests, but I'd suggest
I could do it if there are no objections. |
Do anybody has an idea when the PR is going to be reviewed? Thanks! |
Would it be possible to make that DescribeTable call unnecessary? Our model definition already describes the table's keys and indices so we can just use them. Of course DescribeTable gets that same information from the source of truth, but we don't actually take advantage of that to call out discrepancies between the model definition and the table definition: instead, in those cases PynamoDB fails in confusing ways (e.g. if the table defines a key with a different underlying data type vs. our model definition). I remember @jpinner-lyft was working on removing that DescribeTable call, but I don't think he ever completed that effort. |
That’s the point of the PR. Connection doesn’t call DescribeTable if the
scheme is provided, which is already the case for models.
Ilya Konstantinov <[email protected]> schrieb am Sa. 4. Apr. 2020 um
22:16:
Would it be possible to make that DescribeTable call unnecessary? Our
model definition already describes the table's keys and indices so we can
just use them.
Of course DescribeTable gets that same information from the source of
truth, but we don't actually take advantage of that to call out
discrepancies between the model definition and the table definition:
instead, in those cases PynamoDB fails in confusing ways (e.g. if the table
defines a key with a different underlying data type vs. our model
definition).
I remember @jpinner-lyft <https://github.com/jpinner-lyft> was working on
removing that DescribeTable call, but I don't think he ever completed that
effort.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#749 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAGFF33SRR6X6VIO2METBGTRK6ITNANCNFSM4KNRTQWQ>
.
--
Best Regards, Ilya Shmygol.
|
Ah, I missed the fact that you automatically produce a predefined schema in One thought, though, is that the only reason |
@ikonst do you mean avoid caching schemas in a class and store the table schema in an instance variable instead (we still have to store the schema somewhere)? I was afraid to introduce decline of performance for clients who use But I find your suggestion reasonable and can do it. Here how I see it: Does it make sense to you? |
(Sorry, been very busy lately, will have a look over the weekend) |
It's hard for me to imagine TableConnection is used by anyone but the library itself, though with enough users there's someone for every obscure use case :)
What I'm questioning is why the connection/model needs to be aware of the server's table schema. While it's the source of truth, in many other places we act based on the local schema. The schemas can be out of sync -- for example, server thinks HK is a number (N), locally it's a UnicodeAttribute (S). Do we use server's schema or local schema when serializing and deserializing? From last time I read the code, it's a mix of both, which is probably the most confusing and error prone approach :/ The library defines a local schema that's more extensive than the server-side schema -- we define non-key attributes while DynamoDB defines the keys only, and we have some complex data types on top of DynamoDB's built-in ones, so the server's schema is not sufficient to serialize/deserialize a model; thus, we need the local schema. Because of this, I'd rather see PynamoDB use only the local schema in all serialization/deserialization, in which case DescribeTable should be completely optional in normal operation. I think there's a good function where DescribeTable would be useful, and that's an (imaginary) "Model.validate_schema" method that would ensure that the local schema overlaps with the server's one (e.g. that we don't mistype indexes, which can create some really confusing error messages... from experience). |
Implement
Connection.predefine_table_schema
, which provides manual populating of cached table schemes (Connection._tables
) and helps avoid to requestDescribeTable
which takes some time.The method is integrated in
TableConnection
, which is used in model. It means all models don't sendDescribeTable
requests before querying the data any more, butModel.describe_table
sends the real request every time it's called.