Commit 41573d9

Add structured field and rule paths to Violation (#265)
This PR introduces a new structured field path format, and uses it to provide a structured path to the field and rule of a violation.

- The new message `buf.validate.FieldPathElement` is added.
  - It describes a single path segment, e.g. equivalent to a string like `repeated_field[1]`.
  - Both the text name and the field number of the field are provided; this allows the field path to be rendered into a string trivially without the need for descriptor lookups, and will work for e.g. unknown fields. (Example: A new field is marked required; old clients can still print the field path, even if they do not have the new field in their schema.)
  - It also contains the kind of field, to make it possible to interpret unknown field values.
  - Finally, it contains a subscript oneof. This contains either a repeated field index or a map key. This is needed because maps in protobuf are unordered. There are multiple map key entries, one for each distinctly encoded valid kind of map key.
- The new message `buf.validate.FieldPath` is added. It just contains a repeated field of `buf.validate.FieldPathElement`.
  - It would be possible to just have `repeated buf.validate.FieldPathElement` anywhere a path is needed to save a level of pointer chasing, but it is inconvenient for certain uses, e.g. comparing paths with `proto.Equal`.
- Two new `buf.validate.Violation` fields are added: `field` and `rule`, both of type `buf.validate.FieldPath`. The old `field_path` field is left for now, but deprecated.
- The conformance tests are updated to match the expectations.

Note that there are a number of very subtle edge cases:

- In one specific case, field paths point to oneofs. In this case, the last element of the field path will contain only the field name, set to the name of the oneof. The field number, field type and subscript fields will all be unset. This is only intended to be used for display purposes.
- Only field constraints will output rule paths, because a rule path is relative to the `FieldConstraints` message. (In other cases, `constraint_id` is always sufficient anyway, but we can change this behavior later.)
- Custom constraints will not contain rule paths, since they don't have a corresponding rule field. (Predefined constraints will contain rule paths, of course.)

Implementations:

- bufbuild/protovalidate-go#154
- bufbuild/protovalidate-python#217
- bufbuild/protovalidate-cc#63
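To illustrate the claim that a path can be rendered without descriptor lookups, here is a minimal Python sketch. It does not use the generated protobuf classes: `PathElement` is a hypothetical stand-in that mirrors only the name, number and subscript of `buf.validate.FieldPathElement`, and the way map keys are formatted (quoted strings, `true`/`false` for bools) is an assumption rather than something this commit specifies.

```python
import json
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class PathElement:
    """Hypothetical stand-in for buf.validate.FieldPathElement (not the generated class)."""

    field_name: str
    field_number: Optional[int] = None
    # At most one subscript is expected: a repeated-field index or a map key.
    index: Optional[int] = None
    key: Union[bool, int, str, None] = None


def render_element(element: PathElement) -> str:
    """Render one path segment, e.g. `repeated_field[1]`."""
    if element.index is not None:
        return f"{element.field_name}[{element.index}]"
    if element.key is not None:
        # Assumed formatting: bools as true/false, strings quoted, integers bare.
        if isinstance(element.key, bool):
            subscript = "true" if element.key else "false"
        elif isinstance(element.key, str):
            subscript = json.dumps(element.key)
        else:
            subscript = str(element.key)
        return f"{element.field_name}[{subscript}]"
    # No subscript: a plain field, or the name-only element used for a oneof.
    return element.field_name


def render_path(elements: List[PathElement]) -> str:
    """Join the segments with dots, starting from the root."""
    return ".".join(render_element(e) for e in elements)
```

Because only `field_name` and the subscript are consulted, the same function also covers the oneof edge case, where the last element carries nothing but a name.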
1 parent 3ce0417 commit 41573d9

38 files changed: +5588 -1402 lines changed

proto/protovalidate-testing/buf/validate/conformance/cases/custom_constraints/custom_constraints.proto

+14
@@ -87,6 +87,20 @@ message FieldExpressions {
     message: "c.a must be a multiple of 4"
     expression: "this.a % 4 == 0"
   }];
+  int32 d = 4 [
+    (field).cel = {
+      id: "field_expression_scalar_multiple_1"
+      expression:
+        "this < 1 ? ''"
+        ": 'd must be less than 1'"
+    },
+    (field).cel = {
+      id: "field_expression_scalar_multiple_2"
+      expression:
+        "this < 2 ? ''"
+        ": 'd must be less than 2'"
+    }
+  ];
 
   message Nested {
     int32 a = 1 [(field).cel = {
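The new `d` field carries two CEL rules whose expressions return an empty string on success and a message on failure. The following Python sketch mimics that behaviour with plain lambdas, to show which inputs trigger one or both rules. It is not the protovalidate evaluator: `check_d` and the tuple layout are hypothetical names chosen for illustration.

```python
from typing import Callable, List, Tuple

# Each entry pairs a rule id from the test case with a Python equivalent of its
# CEL ternary: an empty string means the rule passed.
RULES: List[Tuple[str, Callable[[int], str]]] = [
    (
        "field_expression_scalar_multiple_1",
        lambda this: "" if this < 1 else "d must be less than 1",
    ),
    (
        "field_expression_scalar_multiple_2",
        lambda this: "" if this < 2 else "d must be less than 2",
    ),
]


def check_d(value: int) -> List[Tuple[str, str]]:
    """Return (rule id, message) for every rule that `value` fails."""
    violations = []
    for rule_id, expression in RULES:
        message = expression(value)
        if message:  # non-empty string signals a violation
            violations.append((rule_id, message))
    return violations
```

Under these assumptions, `d = 0` passes both rules, `d = 1` fails only the first, and `d = 2` fails both, so a single field can yield several violations that differ only in which rule produced them.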

proto/protovalidate/buf/validate/validate.proto

+117 -2
@@ -4770,9 +4770,62 @@ message Violations {
 // }
 // ```
 message Violation {
-  // `field_path` is a machine-readable identifier that points to the specific field that failed the validation.
+  // `field` is a machine-readable path to the field that failed validation.
   // This could be a nested field, in which case the path will include all the parent fields leading to the actual field that caused the violation.
-  optional string field_path = 1;
+  //
+  // For example, consider the following message:
+  //
+  // ```proto
+  // message Message {
+  //   bool a = 1 [(buf.validate.field).required = true];
+  // }
+  // ```
+  //
+  // It could produce the following violation:
+  //
+  // ```textproto
+  // violation {
+  //   field { element { field_number: 1, field_name: "a", field_type: 8 } }
+  //   ...
+  // }
+  // ```
+  optional FieldPath field = 5;
+
+  // `rule` is a machine-readable path that points to the specific constraint rule that failed validation.
+  // This will be a nested field starting from the FieldConstraints of the field that failed validation.
+  // For custom constraints, this will provide the path of the constraint, e.g. `cel[0]`.
+  //
+  // For example, consider the following message:
+  //
+  // ```proto
+  // message Message {
+  //   bool a = 1 [(buf.validate.field).required = true];
+  //   bool b = 2 [(buf.validate.field).cel = {
+  //     id: "custom_constraint",
+  //     expression: "!this ? 'b must be true': ''"
+  //   }]
+  // }
+  // ```
+  //
+  // It could produce the following violations:
+  //
+  // ```textproto
+  // violation {
+  //   rule { element { field_number: 25, field_name: "required", field_type: 8 } }
+  //   ...
+  // }
+  // violation {
+  //   rule { element { field_number: 23, field_name: "cel", field_type: 11, index: 0 } }
+  //   ...
+  // }
+  // ```
+  optional FieldPath rule = 6;
+
+  // `field_path` is a human-readable identifier that points to the specific field that failed the validation.
+  // This could be a nested field, in which case the path will include all the parent fields leading to the actual field that caused the violation.
+  //
+  // Deprecated: use the `field` instead.
+  optional string field_path = 1 [deprecated = true];
 
   // `constraint_id` is the unique identifier of the `Constraint` that was not fulfilled.
   // This is the same `id` that was specified in the `Constraint` message, allowing easy tracing of which rule was violated.
@@ -4785,3 +4838,65 @@ message Violation {
   // `for_key` indicates whether the violation was caused by a map key, rather than a value.
   optional bool for_key = 4;
 }
+
+// `FieldPath` provides a path to a nested protobuf field.
+//
+// This message provides enough information to render a dotted field path even without protobuf descriptors.
+// It also provides enough information to resolve a nested field through unknown wire data.
+message FieldPath {
+  // `elements` contains each element of the path, starting from the root and recursing downward.
+  repeated FieldPathElement elements = 1;
+}
+
+// `FieldPathElement` provides enough information to nest through a single protobuf field.
+//
+// If the selected field is a map or repeated field, the `subscript` value selects a specific element from it.
+// A path that refers to a value nested under a map key or repeated field index will have a `subscript` value.
+// The `field_type` field allows unambiguous resolution of a field even if descriptors are not available.
+message FieldPathElement {
+  // `field_number` is the field number this path element refers to.
+  optional int32 field_number = 1;
+
+  // `field_name` contains the field name this path element refers to.
+  // This can be used to display a human-readable path even if the field number is unknown.
+  optional string field_name = 2;
+
+  // `field_type` specifies the type of this field. When using reflection, this value is not needed.
+  //
+  // This value is provided to make it possible to traverse unknown fields through wire data.
+  // When traversing wire data, be mindful of both packed[1] and delimited[2] encoding schemes.
+  //
+  // [1]: https://protobuf.dev/programming-guides/encoding/#packed
+  // [2]: https://protobuf.dev/programming-guides/encoding/#groups
+  //
+  // N.B.: Although groups are deprecated, the corresponding delimited encoding scheme is not, and
+  // can be explicitly used in Protocol Buffers 2023 Edition.
+  optional google.protobuf.FieldDescriptorProto.Type field_type = 3;
+
+  // `key_type` specifies the map key type of this field. This value is useful when traversing
+  // unknown fields through wire data: specifically, it allows handling the differences between
+  // different integer encodings.
+  optional google.protobuf.FieldDescriptorProto.Type key_type = 4;
+
+  // `value_type` specifies map value type of this field. This is useful if you want to display a
+  // value inside unknown fields through wire data.
+  optional google.protobuf.FieldDescriptorProto.Type value_type = 5;
+
+  // `subscript` contains a repeated index or map key, if this path element nests into a repeated or map field.
+  oneof subscript {
+    // `index` specifies a 0-based index into a repeated field.
+    uint64 index = 6;
+
+    // `bool_key` specifies a map key of type bool.
+    bool bool_key = 7;
+
+    // `int_key` specifies a map key of type int32, int64, sint32, sint64, sfixed32 or sfixed64.
+    int64 int_key = 8;
+
+    // `uint_key` specifies a map key of type uint32, uint64, fixed32 or fixed64.
+    uint64 uint_key = 9;
+
+    // `string_key` specifies a map key of type string.
+    string string_key = 10;
+  }
+}
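The comments on the `subscript` oneof group the valid map key kinds into four members. As a rough sketch of how an implementation might choose the member from a map's `key_type`, the Python below maps the numeric values of `google.protobuf.FieldDescriptorProto.Type` to the oneof field names. The table and the helper name `subscript_field` are illustrative assumptions, not code from any of the linked implementations.

```python
# Numeric values of google.protobuf.FieldDescriptorProto.Type for the kinds
# that protobuf allows as map keys.
TYPE_INT64 = 3
TYPE_UINT64 = 4
TYPE_INT32 = 5
TYPE_FIXED64 = 6
TYPE_FIXED32 = 7
TYPE_BOOL = 8
TYPE_STRING = 9
TYPE_UINT32 = 13
TYPE_SFIXED32 = 15
TYPE_SFIXED64 = 16
TYPE_SINT32 = 17
TYPE_SINT64 = 18

# Which `subscript` member holds a key of each kind, following the field
# comments in FieldPathElement.
SUBSCRIPT_FOR_KEY_TYPE = {
    TYPE_BOOL: "bool_key",
    TYPE_INT32: "int_key",
    TYPE_INT64: "int_key",
    TYPE_SINT32: "int_key",
    TYPE_SINT64: "int_key",
    TYPE_SFIXED32: "int_key",
    TYPE_SFIXED64: "int_key",
    TYPE_UINT32: "uint_key",
    TYPE_UINT64: "uint_key",
    TYPE_FIXED32: "uint_key",
    TYPE_FIXED64: "uint_key",
    TYPE_STRING: "string_key",
}


def subscript_field(key_type: int) -> str:
    """Return the name of the `subscript` member for a map key of `key_type`."""
    try:
        return SUBSCRIPT_FOR_KEY_TYPE[key_type]
    except KeyError:
        raise ValueError(f"type {key_type} is not a valid map key type") from None
```

Signed and unsigned integers land in separate members because `int64` and `uint64` cover different ranges, while the specific wire encoding (varint, zigzag or fixed width) is recorded separately in `key_type`.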
