v3.2: Support ordered multipart including streaming #4589

Status: Open. Wants to merge 6 commits into base: v3.2-dev

handrews
Member

Fixes:

This adds support for all multipart media types that do not have named parts, including support for streaming such media types. Note that multipart/mixed defines the basic processing rules for all multipart types, and implementations that encounter unrecognized multipart subtypes are required to process them as multipart/mixed. Therefore support for multipart/mixed addresses all other subtypes to some degree.
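The fallback rule mentioned above can be sketched in a few lines of Python. This is only an illustration: the `KNOWN_MULTIPART` set and the `effective_type` helper are hypothetical names invented here, not part of the specification or of any tool.

```python
# Hypothetical set of multipart subtypes that a tool handles specially.
KNOWN_MULTIPART = {"multipart/form-data", "multipart/mixed"}


def effective_type(content_type: str, known=KNOWN_MULTIPART) -> str:
    """Return the media type whose processing rules should be applied.

    Unrecognized multipart subtypes fall back to multipart/mixed;
    everything else is returned without its parameters.
    """
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type.startswith("multipart/") and media_type not in known:
        return "multipart/mixed"
    return media_type


print(effective_type("multipart/x-custom; boundary=abc"))
print(effective_type("multipart/form-data; boundary=abc"))
```

Under this assumption, anything written for `multipart/mixed` gives a baseline for every other unnamed-part subtype.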

This builds on the recent support for sequential media types:

  • multipart/mixed and similar meet the definition for a sequential media type, requiring it to be modeled as an array. This does use an expansive definition of "repeating the same structure", where the structure is literally any content with a media type.
  • As a sequential media type, it also supports itemSchema
  • Adding a parallel itemEncoding is the obvious solution to multipart/mixed streams requiring an Encoding Object
  • We have regularly received requests to support truly mixed multipart/mixed payloads, and previously claimed such support from 3.0.0 onwards, without actually supporting it. Adding prefixEncoding along with itemEncoding supports this use case with a clear parallel to prefixItems, which is the schema construct needed to support this case.
  • There is no need for a prefixSchema field because the streaming use case requires a repetition of the same schema for each item. Therefore all mixed use cases can use schema and prefixItems
  • schema changes are included in this pull request
  • schema changes are needed for this pull request but not done yet
  • no schema changes are needed for this pull request
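As a rough illustration of the parallel between `prefixEncoding`/`itemEncoding` and `prefixItems`, the Python sketch below picks an Encoding Object by part position. It assumes that `prefixEncoding` entries apply positionally and that `itemEncoding` covers every remaining part; the `encoding_for_part` helper and the sample content types are hypothetical.

```python
from typing import Optional


def encoding_for_part(index: int,
                      prefix_encoding: list,
                      item_encoding: Optional[dict] = None) -> Optional[dict]:
    """Pick the Encoding Object for the part at `index` (hypothetical helper).

    Assumption: the first len(prefix_encoding) parts use the matching
    prefixEncoding entry, and all later parts share itemEncoding.
    """
    if index < len(prefix_encoding):
        return prefix_encoding[index]
    return item_encoding


# Illustrative Media Type Object fragment, written as a plain dict.
media_type = {
    "prefixEncoding": [
        {"contentType": "application/json"},
        {"contentType": "text/plain"},
    ],
    "itemEncoding": {"contentType": "image/png"},
}

for i in range(4):
    print(i, encoding_for_part(i, media_type["prefixEncoding"],
                               media_type["itemEncoding"]))
```

A pure streaming description would leave `prefixEncoding` empty, so every part would fall through to `itemEncoding`.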

We do not seem to run tests on the 3.2 schemas, and I couldn't quickly figure out how to add that, so we should do that separately and include coverage for this and other new fields.

Also paging @thecheatah, @jeremyfiel

@jeremyfiel
Contributor

Thanks @handrews for taking this on. I'm really happy to see it coming to fruition, and hopefully the tooling catches up with it sooner rather than later.

I couldn't immediately make out if this would support nested multipart.

POST /things HTTP/1.1
content-type: multipart/mixed;boundary=aaa

--aaa
content-type: application/json

{
   "data": ""
}
--aaa
content-type: multipart/mixed;boundary=bbb

        --bbb
        content-type: application/json

        {
            "more_data": ""
        }
        --bbb
        content-type: text/plain

        test file
        --bbb
        content-type: application/zip

        <binary data>
        --bbb
        content-type: application/pdf

        <binary data>
        --bbb--
--aaa--

multipart/mixed:
  schema:
    prefixItems:
      - type: object
        properties:
          data:
            type: string
      - prefixItems:
          - type: object
            properties:
              more_data:
                type: string
          - {}
          - {}
          - {}
  prefixEncoding:
    - {}
    - contentType: multipart/mixed
      # not sure how to further document a nested structure here.
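To make the shape of the nesting concrete, here is a small Python sketch that parses a shortened version of the payload above with the standard library `email` package and lists each part's content type by depth. The `RAW` sample (trimmed to two inner parts) and the `outline` helper are illustrative only; nothing here reflects how OpenAPI tooling would process the description.

```python
from email import message_from_bytes

# Shortened version of the nested payload, with CRLF line endings.
RAW = (
    b"Content-Type: multipart/mixed; boundary=aaa\r\n"
    b"\r\n"
    b"--aaa\r\n"
    b"Content-Type: application/json\r\n"
    b"\r\n"
    b'{"data": ""}\r\n'
    b"--aaa\r\n"
    b"Content-Type: multipart/mixed; boundary=bbb\r\n"
    b"\r\n"
    b"--bbb\r\n"
    b"Content-Type: application/json\r\n"
    b"\r\n"
    b'{"more_data": ""}\r\n'
    b"--bbb\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"test file\r\n"
    b"--bbb--\r\n"
    b"--aaa--\r\n"
)


def outline(part, depth=0):
    """Return (depth, content type) pairs for a part and its descendants."""
    rows = [(depth, part.get_content_type())]
    if part.is_multipart():
        for child in part.get_payload():
            rows.extend(outline(child, depth + 1))
    return rows


message = message_from_bytes(RAW)
for depth, ctype in outline(message):
    print("  " * depth + ctype)
```

The parsed result is a tree of parts, which is the structure a nested description would need to mirror in both the schema and the encoding information.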

@handrews
Member Author

@jeremyfiel aww... I was hoping no one would bring up nested multipart... 😵‍💫

I think it would be hard to do that, because there isn't anywhere to put the nested Encoding Object. I think we'd have to add encoding, prefixEncoding, and itemEncoding to the Encoding Object as well as the Media Type Object. I'm a bit hesitant to do that, but we could talk about it at the Thursday call and I could submit it as a follow-up if it gains traction.

Alternatively, we could recommend trying that as an extension given that it adds significant complexity and is a rare case that is deprecated by the current RFC (I know that's small consolation when you're the "rare case" and built things in good faith using older RFCs when they were current).

The complexity is not just the recursive structure, but also that you are now correlating two separate trees of structure.

@jeremyfiel
Contributor

I'm not entirely sure it is correct to include multipart/mixed in this statement. It is registered in the IANA registry, and it does technically have an envelope in the form of the boundary parameter.

Sequential Media Types

Within this specification, a sequential media type is defined as any media type that consists of a repeating structure, without any sort of header, footer, envelope, or other metadata in addition to the sequence.
Some examples of sequential media types (including some that are not IANA-registered but are in common use) are:

  application/jsonl
  application/x-ndjson
  application/json-seq
  application/geo+json-seq
  text/event-stream
  multipart/mixed
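For readers unfamiliar with the term, the sketch below shows what "modeled as an array" means for one of the simpler sequential media types. It treats `application/jsonl` as one JSON value per line; the `parse_jsonl` helper and the sample data are hypothetical.

```python
import json


def parse_jsonl(text: str) -> list:
    """Parse one JSON value per non-empty line into a list (the array model)."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


stream = '{"id": 1}\n{"id": 2}\n{"id": 3}\n'
items = parse_jsonl(stream)
print(items)
```

Each line becomes one array item, so a single item schema can describe every element of the sequence.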

@handrews
Member Author

handrews commented May 27, 2025

[EDIT: This goes with the nested multipart discussion]

@jeremyfiel the problem is that instead of just re-using the Media Type Object, we came up with the contentType field :-(

@jeremyfiel
Contributor

I totally understand the complexity, just trying to confirm my initial impression.

@handrews
Member Author

@jeremyfiel That statement only says that some of the listed types are not registered. application/json-seq, application/geo+json-seq, and multipart/mixed are all registered.

I decided not to get into the preamble and postamble of multipart because AFAICT they're supposed to be ignored and are there for historical purposes. Media type parameters are not part of the actual media type content, and the boundaries in the content are no more (or less) significant than the various differences in the three sequential JSON media type delimiters.
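As a rough illustration of why the preamble and epilogue can be left out of the model, the Python standard library `email` parser keeps them apart from the ordered list of parts. The `RAW` sample below is made up for this sketch.

```python
from email import message_from_string

# Made-up multipart/mixed body with a preamble and an epilogue.
RAW = (
    "Content-Type: multipart/mixed; boundary=xyz\n"
    "\n"
    "This preamble is ignored by receivers.\n"
    "--xyz\n"
    "Content-Type: text/plain\n"
    "\n"
    "first part\n"
    "--xyz\n"
    "Content-Type: application/json\n"
    "\n"
    '{"second": "part"}\n'
    "--xyz--\n"
    "This epilogue is ignored as well.\n"
)

message = message_from_string(RAW)
parts = message.get_payload()  # a list of parts for multipart messages
part_types = [part.get_content_type() for part in parts]

print(part_types)
print(repr(message.preamble))
print(repr(message.epilogue))
```

Only the two parts appear in the sequence; the surrounding text is available through `preamble` and `epilogue` but is not part of the ordered content.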

@handrews
Member Author

@jeremyfiel I added some clarifications about the envelope/preamble/epilogue and the lack of nesting support.

handrews added 4 commits May 30, 2025 10:04
@handrews
Member Author

This force-push was just a plain re-base with no conflicts or other changes. Exactly the same commits applied, I just wanted to make sure the other big PRs wouldn't cause merge issues.

@jeremyfiel GitHub won't let me request a review from you, but if you could provide an approval when you are satisfied with the PR it would be much appreciated as you probably have more expertise with this than just about anyone else.

@thecheatah if you are able to review, even just for the streaming support part, that would also be greatly appreciated. I did not use application/json in the streaming multipart example, but the principle would be the same.

@thecheatah thecheatah left a comment

Looks good. This change allows us to describe the multipart/mixed streaming use case. Thanks!

src/oas.md (outdated)
- The fourth repeats `application/geo+json`-structured values, while the last repeats a custom text format related to Server-Sent Events.
+ The fourth repeats `application/geo+json`-structured values, while `text/event-stream` repeats a custom text format related to Server-Sent Events.
+ The final media type listed above, `multipart/mixed`, provides an ordered list of documents of any media type, and is sometimes streamed.
+ Note that while `multipart` formats technically allow a preamble and an epilogue, the RFC directs that they are to be ignored, making the effectively comments, and this specification does not model them.

making them effectively

Member Author

@thecheatah thanks, just added a commit to fix this.

Thanks to @thecheatah for catching this.
Contributor

@jeremyfiel jeremyfiel left a comment

LGTM!

Labels: enhancement; media and encoding (issues regarding media type support and how to encode data outside of query/path params)
3 participants