-
Notifications
You must be signed in to change notification settings - Fork 11
feat(input_schema): Enable secret objects #515
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Thanks for the proposal. I have a few notes/questions:
|
Ah sorry, the first one is my fault, I forgot that cookies are an array when specifying it |
Yeah, that should be straightforward, in the same way as with objects
The issue is that if the type of the field is defined as We can add
Yeah it would be, this works the same as string field decryption and if the value doesn't look like it's encrypted no decryption is done at all. Value looks like it's encrypted means in this case that the object has field
There won't by any change in API call at all, since the input is provided unencrypted and is encrypted by API internally |
Hey, I understand that the representation was chosen so that the basic type is always matching, and it's not a problem for the validation. But the basic type (string/array/object) is all that can be validated, nothing else - so it's not very strong. So instead I'd go with something like Until it's decrypted, nobody should know what's inside (except the user). From the perspective of types, you'd have We can also use that later to cover the string case we have now. To compare it, the solution in the PR would look like this when you pass it encrypted:
And like this when you pass it un-encrypted:
Meaning that the encrypted case looks weird - we need a special representation for every type. I'd be more inclined to
And if passed as not-encrypted:
Meaning that the not-encrypted case looks weird - which is fine, because the encrypted is the default. I won't make it to the rest of the discussion, just wanted to chip in with this :) |
I don't see any additional value having the
Generally speaking with you suggestion we still wouldn't support encryption objects and arrays. Only one thing we can do is to add support for |
I only read this quickly, but I feel:
is a weird "Potemkin" JSON, which will not even be compatible with objects produced by our editors and now the newly introduced sub-schemas, so it only solves some cases, but not the others.
Is this really a problem? When you call an Actor via API, you pass a full unencrypted JSON input, which is validated as is, and encrypted when storing to KV-store. It's only when Actor one would fetch the JSON from KV-store they will receive an object which will not pass validation. But everyone should use |
Yes when you're calling an Actor via API, this should work as you are describing, but when you would start the Actor from UI, the encrypted input is stored right as you pasted it to UI input and this encrypted value is then used to call the API. And in this step we would be sending What we can do is to catch the error produced by JSON Schema and check if the field has
I supposed that the secret field wouldn't be able to define sub-schema, same as we're doing with properties like |
Thanks for the clarification. But it really feels like we're changing the external logic (and consequently the Actor specification that we want to make an open standard) just to make some internal implementation easier for us, which is a short-sighted way to design open systems. I'm sure there's some smarter way to do it. For example, we could prefix the encrypted JSON with Also I don't think it's a good idea to block sub-schemas for encrypted objects or arrays. For example, we want to use this new functionality for storing cookies, which need to have a specific format. If we don't validate what users enter in UI or use in API, it will be just another source of user errors and bad UX. We probably cut this corner back in the days when we designed encrypted value for strings, but let's not perpetuate that hack in the future if we can do it better now. CCing @mtrunkat for visibility |
I gave it a few more thoughts. I am not a big fan of this as it might be confusing:
Which is there just to allow us to use an unmodified JSONSchema validator and save some work. I am neither a big fan of having I see two more options for this problem - [1] What about doing it the other way around. Encrypted object will always be
The other option would be to enable developers to define a JSON schema for this object. I.e., allow nested JSON schema to be defined for a At the SDK level, we could automatically parse the value when it has this property. Or maybe better, we don't do it as it's not implied, but you will always be able to parse it as we validate the input to be parsable. [2] The other option is field-level encryption. This might be complex for largely nested objects, but it would preserve the shape. https://www.npmjs.com/package/prisma-field-encryption |
Yeah I see and I agree, my goal was to make it as close as possible to secure
Issue with this approach is that when executing via API the input value has to be stringified first by the client, because the JSON Schema would not accept
I don't like this exactly because the shape would be preserved - field names (or at least their count and structure) would not be encrypted.
I actually like this idea. We would just need to extend the schema validation to accept string value for
|
Seems we're in a bit of a deadlock here...
Imagine how this would look like in SDK or API examples ... people would think it's a broken design and it would be a source of errors when callers forget to format the input. Also LLMs would struggle with this, as it's non-standard.
I agree with @MFori, this leaks too much information and doesn't seem very secure I don't think it's a problem that encrypted value will no longer conform to some schema. It's pretty expected actually - any entity that is encrypted loses a structure and turns into a raw byte array, and we just do the same. So I'd go with the encrypt to string and capture schema hash approach... |
Fair, let's move forward this way. |
# Conflicts: # packages/input_schema/package.json
"pattern": { "type": "string" }, | ||
"nullable": { "type": "boolean" }, | ||
"minLength": { "type": "integer" }, | ||
"maxLength": { "type": "integer" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can now validate these for secret
strings too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not 100% sure here - wouldn't we end up applying the validation to the encrypted value in this case? (That would not be useful)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! I removed these three properties for now. We can add it later but would need to catch the Ajv errors
errors = validator.errors! | ||
.filter((error) => { | ||
// We are storing encrypted objects/arrays as strings, so AJV will throw type the error here. | ||
// So we need to skip these errors. | ||
if (error.keyword === 'type' && error.instancePath) { | ||
const path = error.instancePath.replace(/^\//, '').split('/')[0]; | ||
const propSchema = inputSchema.properties?.[path]; | ||
const value = input[path]; | ||
|
||
// Check if the property is a secret and if the value is an encrypted value. | ||
// We do additional validation of the field schema in the later part of this function | ||
if ( | ||
propSchema?.isSecret | ||
&& typeof value === 'string' | ||
&& (propSchema.type === 'object' || propSchema.type === 'array') | ||
&& isEncryptedValueForFieldType(value, propSchema.type) | ||
) { | ||
return false; | ||
} | ||
} | ||
return true; | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This ignores type errors produced by Ajv
validation when secret string is passed to object
/array
fields.
// Additional validation for secret fields | ||
if (isSecret && value && typeof value === 'string') { | ||
// If the value is a valid encrypted string for the field type, | ||
// we check if the field schema is likely to be still valid (is unchanged from the time of encryption). | ||
if (isEncryptedValueForFieldType(value, type) && !isEncryptedValueForFieldSchema(value, properties[property])) { | ||
// If not, we add an error message to the field errors and user needs to update the value in the input editor. | ||
fieldErrors.push(m('inputSchema.validation.secretFieldSchemaChanged', { rootName: 'input', fieldKey: property })); | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we store field schema's hash to encrypted value, here we validate that the field schema (ignoring properties like title, description,..) doesn't change from the time of encryption.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Look good, just few small comments
I also opened the apify-core PR that handle secret UI input and modal here: https://github.com/apify/apify-core/pull/21454 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good 👍
"pattern": { "type": "string" }, | ||
"nullable": { "type": "boolean" }, | ||
"minLength": { "type": "integer" }, | ||
"maxLength": { "type": "integer" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not 100% sure here - wouldn't we end up applying the validation to the encrypted value in this case? (That would not be useful)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just two small comments (one of them is a concern rather than a thing to surely fix).
// If the value is a valid encrypted string for the field type, | ||
// we check if the field schema is likely to be still valid (is unchanged from the time of encryption). | ||
if (isEncryptedValueForFieldType(value, type) && !isEncryptedValueForFieldSchema(value, properties[property])) { | ||
// If not, we add an error message to the field errors and user needs to update the value in the input editor. | ||
fieldErrors.push(m('inputSchema.validation.secretFieldSchemaChanged', { fieldKey: property })); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a slight concern here, but I'm not sure how big of a problem it will really be.
If you have e.g. an array
secret field, and you increase its maxItems
, the encrypted field value will still be semantically valid, but this validation will start throwing errors, which will be annoying. You'll have to hunt for all the passwords / secret keys you put in the field before, some of which might not be available anymore (since some secrets are only shown once after generation).
I'm not sure what changes to the input schema happen more often, but I would wager it's backwards compatible ones, so we'll be forcing people to update their inputs even when not necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, maybe showing this only as a warning in the UI would be better?
Ajv does not have anything like warning, but we can updated the validateInputUsingValidator
function to return list of errors and list of warnings and show this warnings in the UI.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can keep it it as an error for now, see how many of these validation errors are happening, and possibly change it to a warning later?
If we put it as a warning now, but decide to change it to an error later, that would be a breaking change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving, but please see the comment regarding the errors when the schema changes.
# Conflicts: # package-lock.json
Add support for `isSecret` objects and arrays in input schema and the new form of encrypted value: ``` ENCRYPTED_JSON_VALUE:{FIELD_SCHEMA_HASH}:{ENCRYPTED_PASSWORD}:{ENCRYPTED_VALUE} ``` More context here apify/apify-shared-js#515
To support decryption of secret object/array - apify/apify-shared-js#515
Add notes about `isSecret` availability for `object`/`array` inputs. Context: apify/apify-shared-js#515 --------- Co-authored-by: Michał Olender <[email protected]>
Add support for `isSecret` objects and arrays in input schema and the new form of encrypted value: ``` ENCRYPTED_JSON_VALUE:{FIELD_SCHEMA_HASH}:{ENCRYPTED_PASSWORD}:{ENCRYPTED_VALUE} ``` More context here apify/apify-shared-js#515
Secret object/array inputs
Based on the issue:
This is proposal to enable
isSecret
boolean property forobject
andarray
fields in Input Schema.Currently this is available only for
string
fields with editors eithertextfield
ortextarea
.Update based on the outcome of discussion in this PR:
array
properties (not justobject
)minProperties
,maxProperties
, sub-schema in the future,..) even for secured fieldsobject
/array
fields.The actual implementation stringify the object value and encrypts the string to final value in form of:
Where the second group (
FIELD_SCHEMA_HASH
) is optional so all existing stored encrypted values are still matching and are backwards compatible.Original description (❗outdated):