
Clarify data model types and formats #4045

Merged
merged 9 commits into from
Aug 27, 2024
26 changes: 16 additions & 10 deletions versions/3.1.1.md
@@ -199,25 +199,31 @@ Note that no aspect of implicit connection resolution changes how [URIs are reso

### Data Types

Data types in the OAS are based on the types supported by the [JSON Schema Specification Draft 2020-12](https://tools.ietf.org/html/draft-bhutton-json-schema-00#section-4.2.1).
Note that `integer` as a type is also supported and is defined as a JSON number without a fraction or exponent part.
Models are defined using the [Schema Object](#schema-object), which is a superset of JSON Schema Specification Draft 2020-12.

<a name="data-type-format"></a>As defined by the [JSON Schema Validation specification](https://tools.ietf.org/html/draft-bhutton-json-schema-validation-00#section-7.3), data types can have an optional modifier property: `format`. As described in that specification, `format` is treated as a non-validating annotation by default; the ability to validate `format` varies across implementations.
Data types in the OAS are based on the six types supported by the [JSON Schema Specification Draft 2020-12 data model](https://tools.ietf.org/html/draft-bhutton-json-schema-00#section-4.2.1): array, boolean, null, number, object, or string. JSON Schema keywords and `format` values operate on these six types, with certain keywords and formats only applying to a specific type. For example, the `pattern` keyword and the `date-time` format only apply to strings, and treat any instance of the other five types as _automatically valid._ This means JSON Schema keywords and formats do **NOT** implicitly require the expected type. Use the `type` keyword to explicitly constrain the type.
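The "automatically valid" behavior described above can be illustrated with a minimal sketch. This is not a JSON Schema implementation: the helper names `pattern_keyword`, `type_string`, and `validate` are hypothetical, and Python's `re` module is assumed to be a close enough stand-in for the regular expression dialect that JSON Schema expects.

```python
import re


def pattern_keyword(instance, pattern):
    """Mimic the `pattern` keyword: it only applies to strings."""
    if not isinstance(instance, str):
        # Instances of the other five types are automatically valid.
        return True
    # JSON Schema patterns are not implicitly anchored, hence re.search.
    return re.search(pattern, instance) is not None


def type_string(instance):
    """Mimic `type: string`, the explicit type constraint."""
    return isinstance(instance, str)


def validate(instance, pattern, require_string):
    """Combine `pattern` with an optional `type: string` constraint."""
    ok = pattern_keyword(instance, pattern)
    if require_string:
        ok = ok and type_string(instance)
    return ok


# Without `type`, the number 42 passes a schema that only has `pattern`.
print(validate(42, "^[a-z]+$", require_string=False))
# With `type: string` added, the same instance is rejected.
print(validate(42, "^[a-z]+$", require_string=True))
```

The two calls differ only in whether the type constraint is present, which is the point the paragraph makes: the keyword alone never rejects a non-string.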
Contributor

This is an excellent addition! A common problem I see is schemas that contain "properties" and assume that this implies "type: object" -- and I too made this error early on, until a wiser colleague educated me.

Member Author

I'm glad this makes sense - I am generally opposed to repeating information from JSON Schema, except where it has clearly been problematic and there isn't an easy single spot to reference in the JSON Schema text. That's why I'm OK with this but do not want to do anything but link to the section on what constitutes an integer.

Note that the `type` keyword allows `"integer"` as a value for convenience, but keyword and format applicability does not recognize integers as being distinct from other numbers because [[RFC7159|JSON]] itself does not make that distinction. JSON Schema defines integers [mathematically](https://datatracker.ietf.org/doc/html/draft-bhutton-json-schema-00#section-6.3), meaning that both `1` and `1.0` are considered to be integers for the purpose of the `type` keyword.
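As a rough illustration of the mathematical reading of integers, the sketch below parses JSON text with Python's standard `json` module and checks the parsed value rather than its textual form. The function name `is_json_schema_integer` is hypothetical, and the mapping from JSON numbers to Python `int` and `float` is an assumption of this example, not something the OAS or JSON Schema prescribes.

```python
import json


def is_json_schema_integer(value):
    """Return True if a parsed JSON value is a number with no fractional part."""
    # bool is a subclass of int in Python, but JSON booleans are not numbers.
    if isinstance(value, bool):
        return False
    if isinstance(value, int):
        return True
    if isinstance(value, float):
        # 1.0 has a zero fractional part, so it counts as an integer.
        return value.is_integer()
    return False


for text in ["1", "1.0", "1.5", '"1"', "true"]:
    print(text, is_json_schema_integer(json.loads(text)))
```

Checking the value instead of the source text is what makes `1` and `1.0` indistinguishable for the purpose of `type: integer`.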
Contributor @mikekistler, Aug 22, 2024

Suggested change
Note that the `type` keyword allows `"integer"` as a value for convenience, but keyword and format applicability does not recognize integers as being distinct from other numbers because [[RFC7159|JSON]] itself does not make that distinction. JSON Schema defines integers [mathematically](https://datatracker.ietf.org/doc/html/draft-bhutton-json-schema-00#section-6.3), meaning that both `1` and `1.0` are considered to be integers for the purpose of the `type` keyword.
Note that the `type` keyword allows `"integer"` as a value for convenience, but JSON Schema keyword and format applicability does not recognize integers as being distinct from other numbers because [[RFC7159|JSON]] itself does not make that distinction. JSON Schema Validation defines integers [mathematically](https://datatracker.ietf.org/doc/html/draft-bhutton-json-schema-validation-00#section-6.1.1) as "any number with a zero fractional part", meaning that both `1` and `1.0` are considered to be integers for the purpose of the `type` keyword.

Member Author

Section 6.3 "Mathematical Integers" is correct here. There is no 6.1.1 in that draft of JSON Schema.

I am against "any number with a zero fractional part" as a description as it is ambiguous as to whether it refers to the value or the textual representation.

Again, this is a JSON Schema thing. JSON Schema's wording is what matters, and OpenAPI CANNOT change the requirement. The only thing we can or should do here is reference the appropriate section of JSON Schema.

Contributor

Quite right that there is no 6.1.1 in that draft. Let me see if I can find what I was looking at. In any event, I don't believe section 6.3 "defines integers" -- this is the problem I was trying to fix, so let's find a good solution to that.

Contributor @ralfhandl, Aug 22, 2024

JSON Schema Validation defines "integer" in section 6.1.1:

> or "integer" which matches any number with a zero fractional part

Contributor

I've updated my suggested change. @handrews please review.

Member Author

@mikekistler the point of this link is to explain why 1.0 is an integer. The core spec §6.3 does that. That is literally the entire purpose of that section.

This text here already explains that the type keyword allows "integer". If you also want to link to the type keyword's section, we can do that on the word type (although I'd rather not, since we don't do that for any other mention of any other JSON Schema keyword AFAICT, so it would be a confusing inconsistency).

If you want to look up the details of type, it's obvious that you should look up its section in the validation spec. On the other hand, the mathematical nature of integers in JSON Schema is a recurring point of confusion, and it's less obvious where to resolve it. Hence linking to the section that explains it.

Contributor

@handrews When I follow the link to section 6.3 I see this text:

> Some programming languages and parsers use different internal representations for floating point numbers than they do for integers.
>
> For consistency, integer JSON numbers SHOULD NOT be encoded with a fractional part.

I don't think there is anything here that "defines integers", and I don't know why I would conclude from this text that "1.0" is an integer.

Member Author

@mikekistler OK let me think on this more. §6.1.1 of the validation spec still isn't what should go here.


#### Data Type Format

As defined by the [JSON Schema Validation specification](https://tools.ietf.org/html/draft-bhutton-json-schema-validation-00#section-7.3), data types can have an optional modifier property: `format`. As described in that specification, `format` is treated as a non-validating annotation by default; the ability to validate `format` varies across implementations.

The OpenAPI Initiative also hosts a [Format Registry](https://spec.openapis.org/registry/format/) for formats defined by OAS users and other specifications. Support for any registered format is strictly OPTIONAL, and support for one registered format does not imply support for any others.

Types that are not accompanied by a `format` property follow the type definition in the JSON Schema. Tools that do not recognize a specific `format` MAY default back to the `type` alone, as if the `format` is not specified.
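The fallback behavior described above might look roughly like the following sketch. Everything here is an assumption made for illustration: the `KNOWN_FORMATS` registry, the `check_string` helper, the `assert_format` switch, and the deliberately simplistic `date` checker are hypothetical and do not correspond to any particular tool.

```python
import re

# Hypothetical registry of formats this imaginary tool understands.
# The `date` checker is intentionally simplistic (shape only, no calendar logic).
KNOWN_FORMATS = {
    "date": lambda s: re.fullmatch(r"\d{4}-\d{2}-\d{2}", s) is not None,
}


def check_string(instance, fmt=None, assert_format=False):
    """Check `type: string` plus an optional `format`."""
    if not isinstance(instance, str):
        return False  # the `type` constraint itself fails
    checker = KNOWN_FORMATS.get(fmt)
    if checker is None or not assert_format:
        # Unrecognized format, or format treated as an annotation only:
        # fall back to the type alone.
        return True
    return checker(instance)


# An unrecognized format never causes a failure.
print(check_string("hello", fmt="made-up-format", assert_format=True))
# By default a recognized format is still only an annotation.
print(check_string("not a date", fmt="date"))
# Opt-in validation of a recognized format.
print(check_string("not a date", fmt="date", assert_format=True))
```

Keeping format validation behind an explicit opt-in mirrors the annotation-by-default behavior the text describes, while leaving room for tools that choose to validate.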
Contributor

BTW ... I think there is a minor conflict with JSON Schema here, since it has the Format Assertion Vocabulary and:

> When the Format-Assertion vocabulary is declared with a value of true, implementations MUST provide full validation support for all of the formats defined by this specification.

Member Author

@mikekistler by default what you get is the Format Annotation Vocabulary, to which implementations MAY add whatever opt-in validation they want to provide. This is because some major JSON Schema implementations refused to fully validate format, but did a certain amount of best-effort validation.

I really don't want to get into the Format Assertion Vocabulary here. I'm not sure many implementations support it, and you need to know how to change the meta-schema to include it.


The formats defined by the OAS are:

| [`type`](#data-types) | [`format`](#data-type-format) | Comments |
| -------------------- | --------------------------- | ---------------------------- |
| `integer` | `int32` | signed 32 bits |
| `integer` | `int64` | signed 64 bits (a.k.a long) |
| `number` | `float` | |
| `number` | `double` | |
| `string` | `password` | A hint to obscure the value. |
| `format` | JSON Data Type | Comments |
| ---------- | -------------- | ---------------------------- |
| `int32` | number | signed 32 bits |
| `int64` | number | signed 64 bits (a.k.a long) |
| `float` | number | |
| `double` | number | |
| `password` | string | A hint to obscure the value. |

As noted under [Data Type](#data-types), both `type: number` and `type: integer` are considered to be numbers in the data model.
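One way a tool could choose to interpret the `int32` and `int64` rows is sketched below. This is an assumption-laden example rather than required behavior: the `INT_RANGES` table and the `check_int_format` helper are hypothetical, validating these formats at all is optional, and treating a number with a fractional part as invalid for these formats is this sketch's own reading of "signed 32 bits" and "signed 64 bits".

```python
# Hypothetical value ranges for the two integer formats in the table.
INT_RANGES = {
    "int32": (-2**31, 2**31 - 1),
    "int64": (-2**63, 2**63 - 1),
}


def check_int_format(instance, fmt):
    """Apply an integer format to a parsed JSON value."""
    # These formats apply to the JSON number type only; booleans, strings,
    # and other types are automatically valid for the format itself.
    if isinstance(instance, bool) or not isinstance(instance, (int, float)):
        return True
    if fmt not in INT_RANGES:
        return True  # unrecognized format: nothing to check
    # Assumption of this sketch: a fractional value does not fit the format.
    if isinstance(instance, float) and not instance.is_integer():
        return False
    low, high = INT_RANGES[fmt]
    return low <= instance <= high


print(check_int_format(2**31 - 1, "int32"))
print(check_int_format(2**31, "int32"))
print(check_int_format("abc", "int32"))
```

Because the format is keyed to the JSON data type `number`, the string `"abc"` is not rejected by the format; rejecting it would be the job of the `type` keyword.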

#### Working With Binary Data
