OLMv1: catalogd metas https endpoint proposal #1749
base: master
Skipping CI for Draft Pull Request.
Signed-off-by: Jordan Keister <[email protected]>
c263327 to 27cfe0c
## Proposal

This proposal introduces an additional HTTPS endpoint to an existing catalogd API. The existing HTTPS "all" endpoint will remain as a default option; the user will be able to enable this new capability via a feature gate.
Should we talk about how we plan to deprecate "all" once the new endpoint is GA?

This option would require clients to query the entirety of the data (~21 MB for the operatorhubio catalog) and parse the response to retrieve relevant information every time the client needs the data. Even if clients implement some form of caching, the first query a client makes to the catalogd server is still the dealbreaker. In a highly resource-constrained environment (e.g. clusters on Edge devices), this becomes a chokepoint for clients trying to get started.
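
To illustrate the cost described above, here is a minimal sketch of what a client has to do to pull one package out of the full response. It assumes the response is a stream of newline-delimited JSON blobs with top-level `schema`, `name`, and `package` fields; the function name `select_package` and the sample data are hypothetical and are not taken from the proposal.

```python
import io
import json


def select_package(stream, package):
    """Scan an entire FBC stream and keep only blobs belonging to `package`.

    Assumes one JSON object per line (an assumption about the response
    format). Returns the kept blobs and the number of bytes scanned.
    """
    kept = []
    scanned = 0
    for raw in stream:
        scanned += len(raw)
        line = raw.strip()
        if not line:
            continue
        blob = json.loads(line)
        # olm.package blobs identify the package via `name`; other schemas
        # reference it via `package`.
        is_package_blob = (
            blob.get("schema") == "olm.package" and blob.get("name") == package
        )
        if is_package_blob or blob.get("package") == package:
            kept.append(blob)
    return kept, scanned


# Hypothetical sample data standing in for a full catalog.
sample = b"\n".join(
    json.dumps(b).encode()
    for b in [
        {"schema": "olm.package", "name": "foo", "defaultChannel": "stable"},
        {"schema": "olm.channel", "name": "stable", "package": "foo",
         "entries": [{"name": "foo.v1.0.0"}]},
        {"schema": "olm.bundle", "name": "foo.v1.0.0", "package": "foo"},
        {"schema": "olm.package", "name": "bar", "defaultChannel": "alpha"},
        {"schema": "olm.bundle", "name": "bar.v0.1.0", "package": "bar"},
    ]
)

blobs, scanned = select_package(io.BytesIO(sample), "foo")
```

Every byte of the stream is read even though only part of it is kept, which is the behavior the paragraph above flags as a problem for constrained clients.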

- A “path hierarchy” based construction of API endpoints to expose filtered FBC metadata
If we are worried about the fact that query endpoint responses will almost always be incomplete, a middle ground might be an endpoint that returns all of the FBC metadata for a specific package, but I'm not sure that endpoint would provide the necessary latency requirements we're shooting for.
I think this comes down to whether we require the new endpoint to always provide valid FBC.
When we start revising FBC schemas I think we're going to have to juggle this.
For example, if we revise `olm.package.v2` which uses its `package` field self-referentially, then we also get package-scoped valid FBC without a change to this endpoint.
But how does a client request this? Does it have to request the v2 schema specifically?
b478376 to f1d014c
1e97427 to 41f501b
41f501b to 49b6735

### Non-Goals

* Redesigning FBC schema to facilitate additional efficiencies.
I am very curious what @spadgett and @TheRealJon think about whether this enhancement, on its own (i.e. without further FBC schema evolution), will be a significant enough improvement for Console's use cases that it is reasonable to keep schema evolution out of scope.
I ask because we may need to include the schema evolution in scope in order to reference it in a graduation criterion for taking the combined OLM and Console changes for OLMv1 support to GA.
A fair take. All of this originated from our feature RFC, and it was concerned solely with what our team was going to implement in this iteration.
That may not be aligned with the scope of this process, and we can adjust as makes sense.
It's hard to know for sure whether this will make significant improvements to the network latency issues. We still require almost all of the FBC data in order to populate the catalog view as designed. We hope that making more strategic asynchronous requests will help. The data model issues still remain.
Thanks, @grokspawn. Really appreciate all that you guys are doing to help the console team.
The existing `all` endpoint also incentivizes clients to conserve resources via local cache to avoid making
many (potentially duplicate) requests. However, the OCP console proof of concept
required what was deemed an unsupportable amount of code and complexity to cache, decompose, and render the
complete FBC.
On duplicate requests: If using cache control headers, this is presumably a quick check if the catalog has been modified and a 304 response. If the catalog has changed, we would want to refresh the cache anyway, so the extra request is actually desirable?
In practice, I think console will either be fetching the entire catalog or just one item. Generally, users work with extensions they've already installed much more often than they install a new extension, so we'll usually be getting one item. It's a steep cost to download everything for a single item, even if we only do it once per session. And depending on how often the catalog updates, it could be many times per session if we refresh the cache.
I think this is more about performance than code complexity (although complexity is a consideration as well).
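
For reference, the revalidation flow described in this comment can be sketched as below. This is an in-memory simulation with no network access, and it assumes the server validates with an `ETag` / `If-None-Match` pair; the thread does not say which cache validator catalogd uses, and `serve` and `CachingClient` are hypothetical names.

```python
import hashlib


def serve(body, request_headers):
    """Answer a GET, honoring If-None-Match (assumed validator).

    Returns (status, response_headers, payload).
    """
    etag = '"' + hashlib.sha256(body).hexdigest() + '"'
    if request_headers.get("If-None-Match") == etag:
        # Catalog unchanged: no body is sent.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


class CachingClient:
    """Keeps the last response and revalidates it on every request."""

    def __init__(self):
        self.etag = None
        self.body = None
        self.bytes_downloaded = 0

    def get(self, server_body):
        headers = {"If-None-Match": self.etag} if self.etag else {}
        status, resp_headers, payload = serve(server_body, headers)
        if status == 200:
            # Fresh content: replace the cached copy.
            self.etag, self.body = resp_headers["ETag"], payload
            self.bytes_downloaded += len(payload)
        return status, self.body


client = CachingClient()
first, _ = client.get(b"catalog-v1")
second, _ = client.get(b"catalog-v1")   # unchanged catalog
third, body = client.get(b"catalog-v2")  # catalog was updated
```

The duplicate request costs only a header exchange while the catalog is unchanged, but each catalog update triggers a full download again, which is the per-session cost the comment is concerned about.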
This was something that Joe mentioned in an earlier review and I interpreted as a drawback, but it feels more like expected behavior w.r.t. caching, and also frequently-updated catalogs incur more network bandwidth for /either/ endpoint.
-->
> 1. If a query comes in with `/api/v1/metas?package=foo`, should we include the blob with schema: `olm.package` and name: `foo`?

We feel that it is incorrect for the metas service endpoint to mutate the data model (specifically, to create a synthetic package attribute for the `olm.package` schema). To access all the data modeled for an installable package, separate queries need to be made for the package-level metadata (`schema=olm.package&name=foo`) versus the channel/bundle-level metadata (`package=foo`).
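
The two-query pattern above can be illustrated with a small in-memory sketch. It assumes the endpoint matches query parameters against top-level blob fields exactly and synthesizes nothing; `query_metas` and the sample catalog are hypothetical stand-ins, not the catalogd implementation.

```python
def query_metas(blobs, **params):
    """Return blobs whose top-level fields equal every given parameter.

    No fields are synthesized, so a blob that lacks a queried field
    never matches (assumed semantics of the proposed endpoint).
    """
    return [
        b for b in blobs
        if all(k in b and b[k] == v for k, v in params.items())
    ]


# Hypothetical catalog content for a package named "foo".
catalog = [
    {"schema": "olm.package", "name": "foo", "defaultChannel": "stable"},
    {"schema": "olm.channel", "name": "stable", "package": "foo",
     "entries": [{"name": "foo.v1.0.0"}]},
    {"schema": "olm.bundle", "name": "foo.v1.0.0", "package": "foo"},
]

# Equivalent of schema=olm.package&name=foo
package_level = query_metas(catalog, schema="olm.package", name="foo")
# Equivalent of package=foo
package_scoped = query_metas(catalog, package="foo")

everything_for_foo = package_level + package_scoped
```

Under these assumptions the `package=foo` query returns the channel and bundle blobs but not the `olm.package` blob, because that blob carries the package name in `name` and has no `package` field.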
This section could probably use a little more explanation as to why this is even a question, i.e. include the context that `olm.package` objects do not contain `packageName`.

#### Completeness
The previous `all` endpoint always returns valid FBC. The new service cannot make that promise,
so clients could make incorrect assumptions about the suitability of results. See Open Questions.
Does a collection of valid FBC blobs not constitute valid FBC?
Not necessarily. Valid FBC blobs can include `metas?schema=olm.channel&name=foo`, but this will not include bundles or packages, and would fail an `opm validate` call. I'll add some language that we're contrasting "valid FBC element collection" with "well-formed FBC for a package and constituents".
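
The distinction drawn here can be made concrete with a rough check. The sketch below is a much simpler stand-in for `opm validate`, not a reimplementation of it: it only verifies that a package blob exists, that at least one channel exists, and that every channel entry names a bundle present in the collection. The function name `package_problems` and the sample data are hypothetical.

```python
def package_problems(blobs, package):
    """List reasons a blob collection is not well-formed for `package`.

    Simplified, assumed rules; an empty list means no problems found.
    """
    problems = []
    has_package = any(
        b.get("schema") == "olm.package" and b.get("name") == package
        for b in blobs
    )
    if not has_package:
        problems.append("missing olm.package blob")

    bundles = {
        b["name"] for b in blobs
        if b.get("schema") == "olm.bundle" and b.get("package") == package
    }
    channels = [
        b for b in blobs
        if b.get("schema") == "olm.channel" and b.get("package") == package
    ]
    if not channels:
        problems.append("no olm.channel blobs")
    for ch in channels:
        for entry in ch.get("entries", []):
            if entry["name"] not in bundles:
                problems.append(
                    f"channel {ch['name']} references missing bundle {entry['name']}"
                )
    return problems


channel = {"schema": "olm.channel", "name": "stable", "package": "foo",
           "entries": [{"name": "foo.v1.0.0"}]}

# What a channel-only query might return: each blob is valid on its own.
channel_only = [channel]

# The same channel together with its package and bundle.
full = [
    {"schema": "olm.package", "name": "foo", "defaultChannel": "stable"},
    channel,
    {"schema": "olm.bundle", "name": "foo.v1.0.0", "package": "foo"},
]

channel_only_problems = package_problems(channel_only, "foo")
full_problems = package_problems(full, "foo")
```

The channel-only result is a collection of valid elements yet fails both checks it can fail, while the full collection passes, which matches the contrast between a "valid FBC element collection" and "well-formed FBC for a package and constituents".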
Signed-off-by: Jordan Keister <[email protected]>
49b6735 to dfce6b0
@grokspawn: all tests passed!
/label tide/merge-method-squash