INTPYTHON-527 Add Queryable Encryption support #329
base: main
Wrong commit message for 65bd15a and I don't want to force push yet. It should have said:
I'm aware that
It's not working as you think it is. As I said elsewhere, Does this fix the "command not supported for auto encryption: buildinfo" error? If so, it's perhaps because I'd suggest to use my patch is as a starting point for maintaining two connections.
I don't disagree, but it feels a lot like
Yes it works by design, not a side effect. I'm
I've made a few passes at it but did not get anywhere; I'll try again though.
Your "stumble" theory of how it's working isn't correct.
Copy that, thanks! I've removed
Still working on an unencrypted connection, but perhaps the only time we need it is for the version check.
@ShaneHarvey @Jibola @timgraham FYI here is the
And here is the error again with some additional debug:
And the full traceback:
Test settings:
This is happening in the
Also,
- Use db_table in management command
- Move feature check to base class
django_mongodb_backend/models.py
Outdated
    class Meta:
        abstract = True
        required_db_features = {"supports_queryable_encryption"}
I don't think required_db_features is appropriate for EncryptedModel since that will silently cause encrypted models not to be created in user projects.
tests/backend_/test_features.py
Outdated
        connection.features.__dict__.pop("supports_queryable_encryption", None)

    def tearDown(self):
        connection.features.__dict__.pop("supports_queryable_encryption", None)
I mean in tearDown() since each test will (presumably) initialize the attribute.
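For context, the `pop()` call in the diff above works because a cached property stores its computed value in the instance's `__dict__`. Below is a minimal sketch of that mechanism, assuming `functools.cached_property` and a hypothetical `Features` class standing in for `connection.features`. It is not taken from the PR.

```python
from functools import cached_property


class Features:
    """Hypothetical stand-in for connection.features."""

    def __init__(self):
        self.computed = 0  # counts how often the property body runs

    @cached_property
    def supports_queryable_encryption(self):
        self.computed += 1
        return True


features = Features()

# The first access runs the property body and caches the result in __dict__;
# the second access is served from the cache.
features.supports_queryable_encryption
features.supports_queryable_encryption
assert features.computed == 1

# Popping the cached value (as tearDown() does) forces a recompute on the
# next access. The None default makes the pop a no-op if nothing was cached.
features.__dict__.pop("supports_queryable_encryption", None)
features.supports_queryable_encryption
assert features.computed == 2
```

Because the value is only cached once a test reads the attribute, clearing it after each test should be enough to keep tests isolated from one another.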
tests/encryption_/models.py
Outdated
class Patient(EncryptedModel):
    class Meta:
        db_table = "patient"
If the custom table names are only temporary for debugging or something that's fine, but it's not really appropriate to use an unprefixed table name that could collide with other test apps.
tests/encryption_/models.py
Outdated
    class Meta:
        db_table = "billing"

    cc_type = EncryptedCharField("cc_type", max_length=20, queries=QueryType.equality())
Why do you add an explicit verbose name to every field (e.g. "cc_type")?
docs/source/howto/encryption.rst
Outdated
:class:`~pymongo.encryption_options.AutoEncryptionOpts` requires a key vault
namespace to store encryption keys. The key vault namespace is typically a
combination of a database and collection name. ``KEY_VAULT_COLLECTION_NAME``
and ``KEY_VAULT_DATABASE_NAME`` are defined in :mod:`~django_mongodb_backend.encryption`
and used to create the key vault namespace with can be imported and used as follows.

``KEY_VAULT_NAMESPACE``
~~~~~~~~~~~~~~~~~~~~~~~

E.g.::

    AutoEncryptionOpts(
        key_vault_namespace=encryption.KEY_VAULT_NAMESPACE,
        ...
    )
I think it would be simpler and more transparent to document an example:
    AutoEncryptionOpts(
        key_vault_namespace="keyvault.__keyvault",
rather than to provide and use a constant.
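To illustrate how little the constant would save, here is a small sketch of what a key vault namespace is: a database name and a collection name joined by a dot. The two variable names mirror the ones in the docs diff and the values come from the literal suggested above; nothing else here is from the PR.

```python
# Names mirror the constants in the docs diff; the values come from the
# literal "keyvault.__keyvault" suggested in the review comment.
KEY_VAULT_DATABASE_NAME = "keyvault"
KEY_VAULT_COLLECTION_NAME = "__keyvault"

# The namespace is simply "<database>.<collection>".
key_vault_namespace = f"{KEY_VAULT_DATABASE_NAME}.{KEY_VAULT_COLLECTION_NAME}"

# Split on the first dot only, since collection names may contain dots.
database, collection = key_vault_namespace.split(".", 1)
```

Since the whole value fits in one string literal, documenting the literal is arguably as clear as importing a constant.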
KMS_CREDENTIALS = {
    "aws": {
        "key": os.getenv("AWS_KEY_ARN", ""),
        "region": os.getenv("AWS_KEY_REGION", ""),
    },
    "azure": {
        "keyName": os.getenv("AZURE_KEY_NAME", ""),
        "keyVaultEndpoint": os.getenv("AZURE_KEY_VAULT_ENDPOINT", ""),
    },
    "gcp": {
        "projectId": os.getenv("GCP_PROJECT_ID", ""),
        "location": os.getenv("GCP_LOCATION", ""),
        "keyRing": os.getenv("GCP_KEY_RING", ""),
        "keyName": os.getenv("GCP_KEY_NAME", ""),
    },
    "kmip": {},
    "local": {},
}
Is there some documentation we can link to help users know how to configure credentials, providers, etc? It doesn't feel like Django's job to document and maintain this sort of mapping.
I also read:
To enable the driver’s behavior to obtain credentials from the environment, add the appropriate key (“aws”, “gcp”, or “azure”) with an empty map to “kms_providers” in either AutoEncryptionOpts or ClientEncryption options.
so this won't work for that use case (I think).
I'd suggest trying to minimize the amount of "helpers" in this PR. We can always add things later if there are user pain points, but I feel these things shouldn't be our focus for v1. Really, we should enhance MongoDB/pymongo docs if it's unclear how to construct the providers dictionary. I don't think a solution of "set these environment variables instead" is making things simpler.
> Is there some documentation we can link to help users know how to configure credentials, providers, etc? It doesn't feel like Django's job to document and maintain this sort of mapping.
It's definitely not Django's job but it may be Django MongoDB Backend's job since we are trying to support QE and we may need that mapping or something like it in the schema.
> I also read:
> To enable the driver’s behavior to obtain credentials from the environment, add the appropriate key (“aws”, “gcp”, or “azure”) with an empty map to “kms_providers” in either AutoEncryptionOpts or ClientEncryption options.
> so this won't work for that use case (I think).
Good catch! Let me test some vendors with what I have now and if we can rely on PyMongo for some of this even better.
> I'd suggest trying to minimize the amount of "helpers" in this PR. We can always add things later if there are user pain points, but I feel these things shouldn't be our focus for v1. Really, we should enhance MongoDB/pymongo docs if it's unclear how to construct the providers dictionary. I don't think a solution of "set these environment variables instead" is making things simpler.
Agreed. I definitely don't want to be in the env var business but I do want to be in the "make this feature work with minimal effort" business.
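For reference, a minimal sketch of the `kms_providers` shapes under discussion, written as plain dictionaries with no backend helper. The "local" and explicit "aws" layouts follow my understanding of PyMongo's `AutoEncryptionOpts` documentation; the environment variable names are placeholders, not something this PR defines.

```python
import os

# "local" provider: a 96-byte master key, suitable for testing only.
local_providers = {"local": {"key": os.urandom(96)}}

# Explicit AWS credentials. The environment variable names are placeholders.
aws_explicit = {
    "aws": {
        "accessKeyId": os.getenv("AWS_ACCESS_KEY_ID", ""),
        "secretAccessKey": os.getenv("AWS_SECRET_ACCESS_KEY", ""),
    }
}

# Empty map: per the PyMongo note quoted above, this asks the driver to
# obtain credentials from the environment itself.
aws_on_demand = {"aws": {}}
```

A mapping that always fills in values from `os.getenv()` can never produce the empty-map form, which is the use case the quoted note describes.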
django_mongodb_backend/schema.py
Outdated
kms_providers = options._kms_providers
codec_options = CodecOptions()

ce = ClientEncryption(kms_providers, key_vault_namespace, client, codec_options)
There is some example code that uses:
    codec_options=client.codec_options
which might be more appropriate (though you wonder why codec_options is a required argument if options can be retrieved from the also passed client).
class Tee(StringIO):
    """Print the output of management commands to stdout."""

    def write(self, txt):
        sys.stdout.write(txt)
        super().write(txt)

out = Tee()
Tee is for temporary debugging?
Tee is because I want to be able to use the output to update EXPECTED_ENCRYPTED_FIELDS_MAP and I got tired of adding and removing print statements, so I'd like to leave it in if possible.
You can leave it until the PR is ready to merge but we don't want any test routinely writing to stdout.
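One possible middle ground, sketched here as an assumption rather than anything proposed in the PR, is to keep `Tee` but make the echo opt-in so that tests stay silent by default. The `echo` argument and the `DEBUG_COMMAND_OUTPUT` environment variable are hypothetical names.

```python
import os
import sys
from io import StringIO


class Tee(StringIO):
    """Capture output; echo it to stdout only when explicitly requested."""

    def __init__(self, echo=False):
        super().__init__()
        self.echo = echo

    def write(self, txt):
        if self.echo:
            sys.stdout.write(txt)
        return super().write(txt)


# DEBUG_COMMAND_OUTPUT is a hypothetical environment variable name.
out = Tee(echo=os.environ.get("DEBUG_COMMAND_OUTPUT") == "1")
out.write("encrypted fields map\n")
captured = out.getvalue()
```

With this shape, the captured text is always available to assertions, and the echo needed to refresh EXPECTED_ENCRYPTED_FIELDS_MAP only appears when the flag is set.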
Co-authored-by: Tim Graham <[email protected]>
@blink1073 FYI b88b167
A couple extra things you'll need:
Move the encryption checks for patient to test_patient.
(see previous attempts in #318, #319 and #323 for additional context)