-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pluggable transfer types for record files. #91
base: master
Are you sure you want to change the base?
Pluggable transfer types for record files. #91
Conversation
* Introduces pluggable transfer types * Adds support for multipart upload and remote transfer types * Adds support for per-transfer type permissions Co-authored-by: Mirek Simek <[email protected]>
Note: This pull request was originally proposed at inveniosoftware/rfcs#76 and has been relocated here to allow additional contributions from the CESNET team. |
* Implementation of multipart upload, as described in RFC 0072 * See inveniosoftware/rfcs#91 Co-authored-by: Mirek Simek <[email protected]>
* See inveniosoftware/rfcs#91 for details Co-authored-by: Mirek Simek <[email protected]>
* Implementation of multipart upload, as described in RFC 0072 * See inveniosoftware/rfcs#91 Co-authored-by: Mirek Simek <[email protected]>
This PR is a continuation of #76 |
* Customization of files ui design * Uppy.io files ui integration Co-authored-by: Miroslav Bauer <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great to me, and all concerns we've discussed have been addressed both on the backend and frontend. I'll let some more frontend-savvy folks to chime in for the use of window.invenio.files
.
Thank you for putting the work in, looking very much forward to review/test the code!
To make file transfers more consistent, we are introducing a new `transfer` | ||
field to replace the current inconsistent usage of `storage_class`. | ||
|
||
- In **invenio-files-rest**, `storage_class` indicates the file’s intended location or usage (e.g., `archive`, `standard`). | ||
- In **invenio-records-resources**, it has been (mis)used to denote the transfer type (e.g., `local`, `fetch`, `remote`). | ||
|
||
Going forward, the new `transfer` field will hold all transfer-related | ||
information, while `storage_class` will continue to serve its original purpose | ||
from `invenio-files-rest`. | ||
|
||
The `transfer` field will be a dictionary with the following structure: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice clarification of the difference (and original purpose) of the two fields!
Here we describe integration of [Uppy.io](https://uppy.io/) as a custom Files UI implementation addressing | ||
use cases regarding reliable multi-part uploads. | ||
|
||
<!-- TODO: move under inveniosoftware org? --> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine with us of course :)
I think there's some procedure to take care off for publishing on NPM.
|
||
- A slightly more complicated code. | ||
- Some files UI components and Redux store actions that were private now need to be exported to allow reuse. | ||
- Reliance on the global state (`window.invenio.files`) for registering the Files UI, along with an additional Webpack entry point that must be included in a specific order. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the decoupling of where these objects come from by storing them in window.invenio.files
.
On my first pass, I thought that swapping the loaded entrypoint via FILES_UI_ENTRYPOINT
to be able to override is somewhat unconventional compared to our typical use of Overridable. In the end, I don't personally feel the difference is that big (both ways are just setting some global state), but probably @kpsherva and @zzacharo can chime in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would be concerned here about increasing the maintenance cost - we are aiming here at supporting two file uploaders, while one of them, uppy, seems to be superior in terms of file upload handling and overall user experience. What would be the added value of keeping the default file uploader as an option and maintaining both? ping @ntarocco
Personally, I would try to keep the UI as simple as it gets. I wouldn't rely on global state, I would rather try to come up with some abstraction layer (class) which can handle different apiClient
and service
implementation(s).
I agree for the overridable, it is too complex use case to use it here. I would rather try to understand if we need to support two file uploaders instead of introducing another method of overriding/replacing components (even if it is as clever as the one you proposed :)) .
My overall tendency is simplification where it is possible, my proposal is to consider together having only one uploader.
thank you for the massive work and the amazing RFC, it is super clear and reads well!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I completely understand your concerns about the impact this has on code maintenance & complexity. It seemed quite a lot of work to me also, to make both UIs work while changing things in fileapi services around. If Uppy looks feasible to you as a replacement, without the option to fall back to the current lightweight UI, I'd be glad to update the RFC as to be a replacement of the current UI implementation. Just let me know @ntarocco @kpsherva.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kpsherva the main motivation of developing this new uploader in parallel to the existing one was to make the development easier by using a feature flag, instead of long-lasting branches, and allow easier testing.
We cannot completely replace the uploader in production instances without running it for a while and gather enough knowledge and experience first: files upload is the critical feature of a repository.
I would limit the changes to any existing code as much as possible, and avoid using any Overiddable if possible. I would implement completely new and isolated code for this new uploader and, when enabled via a Python config, the new JS bundle should be injected instead of the existing uploader (if feasible).
The idea, again if possible, is to make these 2 swappable: test the new implementation for a defined amount of time, in different instances, and then switch when mature enough.
- A slightly more complicated code. | ||
- Some files UI components and Redux store actions that were private now need to be exported to allow reuse. | ||
- Reliance on the global state (`window.invenio.files`) for registering the Files UI, along with an additional Webpack entry point that must be included in a specific order. | ||
- Need to create a Webpack entrypoint in site's webpack to be able to use any custom Files UI implementations, that are just NPM packages (like the Uppy.io integration above). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if it's intended or not, but I feel that the Uppy.io component and Webpack entrypoint should/can already live alongside the current file uploader component, as a first-class supported option for InvenioRDM.
via `react-overridable` by mapping the desired component in `mapping.js`. That would avoid the need for an extra webpack entrypoint and using the global state altogether. However, | ||
it remains unclear how site administrators could easily configure the correct services server-side (e.g., DepositFilesService, DepositFilesAPIClient), as they are purely JavaScript objects. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, often when we want to translate Flask/Python config to more dynamic JS functionality, we lose a lot of flexibility, and usually have to start double-keeping config-key-to-JS-object "registries".
#### UI workflow | ||
|
||
Default dashboard state: | ||
 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These look great, especially the auto-thumbnails :)
I feel there's a bit of information and functionality duplication with having both the tiles and table below.
I see the Uppy Dashboard components have decent customization, so I'm not that worried about being able to sync/combine the two for a good UX.
|
||
As an admin, I want to: | ||
- Configure the default upload mechanism (TransferType) for new uploads. | ||
- Configure which uploader UI should be used in the deposit form. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for very clear use cases, it reads well!
I am curious, what would be a motivation behind an admin wanting to change the UI of the file uploader? Do you mean in application runtime?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for your review. This point comes from a discussion here: #86 (comment). I had an impression, that it would be useful for the admin to be able to choose whenever to keep the current simple UI, use the more powerful Uppy uploader, or to choose any other custom uploader UI implementation for a site. This is configured at runtime by setting, which webpack entrypoint should be used, but it must be included at build time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand, by runtime I mean more "I go to the administration panel and I switch the file uploader as a setting from there, without redeploying the application" - but from the rest of the RFC I understand it is not the case, since you've mentioned it would be controlled by a config variable.
I think my confusion is coming from the fact that we call the actor "admin" in many cases, in how we are naming things it is not really distinguished from the "application maintainer" or "devops" :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see :) I was writing this with "admin" meaning actually the devops guy, who configures invenio.cfg
/ runtime env config for a site. Changing this from the administration panel would pose additional challenge, but it's not impossible, given that all possible uploader bundles must be already built-in into the running site
[{ | ||
"key": "file.txt", | ||
"transfer": { | ||
"type": "R", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope the amount of types won't explode in the future, causing naming clashes :)
|
||
We will also introduce a `TransferSchema` to handle the new `transfer` field. | ||
It will allow using different schemas for each transfer type. To achieve this, | ||
we will leverage `marshmallow-oneofschema`, which is already included in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally find the usage of oneofschema
a bit complex, so I would like to suggest documenting it well in the code.
Description
This pull request proposes refactoring of Invenio’s file transfer mechanism by making transfer types pluggable, moving beyond the existing fixed options (Local, Fetch, Remote). It also introduces a dedicated transfer field to contain all transfer-related metadata.
Checklist
Ticks in all boxes and 🟢 on all GitHub actions status checks are required to merge:
Frontend
Reminder
By using GitHub, you have already agreed to the GitHub’s Terms of Service including that: