Wire preservation workflow to Archival Packaging Tool (APT) #1465
Why these changes are being introduced:

DataEng has developed [APT](https://github.com/MITLibraries/archival-packaging-tool/) as middleware between ETD and Archivematica. This new application handles the BagIt logic, including creating bags in an S3 bucket connected to Archivematica. Thus, much of the SIP logic in ETD is no longer required.

Relevant ticket(s):

* [ETD-669](https://mitlibraries.atlassian.net/browse/ETD-669)

How this addresses that need:

This adds an Archivematica Payload model that effectively replaces the SIP model. The new model constructs the payload JSON expected by APT. Instantiations of the model generate and persist this JSON on create, along with the metadata CSV as an ActiveStorage attachment.

The other significant change is in the Preservation Submission Job. Previously, this job invoked the Submission Information Package Zipper model to stream a serialized bag to S3. Now, it is responsible for POSTing the JSON data to APT and handling the response.

Side effects of this change:

* The tests that call APT use webmock and stubbed responses. We would normally use VCR for external API calls, but in this case it doesn't seem prudent to pollute the APT S3 bucket, as it's possible the current test bucket will become the bucket we use.
* The SIP model is retained for historical purposes. This is not ideal in terms of maintainability, but it feels important to retain that data, at least for the time being.
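For readers unfamiliar with the new flow, here is a minimal sketch of what "POSTing the JSON data to APT and handling the response" could look like using only Ruby's standard library. It is not the code in this PR: the method names, the `AptSubmissionError` class, the example URL, and the payload keys are all hypothetical, and APT's real request and response format may differ.

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical error class for illustration; not part of ETD or APT.
class AptSubmissionError < StandardError; end

# Build a POST request that carries the payload JSON as its body.
# `apt_url` is an assumed endpoint, not APT's documented URL.
def build_apt_request(apt_url, payload_json)
  request = Net::HTTP::Post.new(URI.parse(apt_url))
  request["Content-Type"] = "application/json"
  request.body = payload_json
  request
end

# Interpret a response: return the parsed JSON body on any 2xx status,
# raise otherwise so the calling job can record or retry the failure.
def handle_apt_response(code, body)
  unless (200..299).cover?(code.to_i)
    raise AptSubmissionError, "APT returned HTTP #{code}: #{body}"
  end

  JSON.parse(body)
end

# Send the request and hand the result to the response handler.
def submit_to_apt(apt_url, payload_json)
  uri = URI.parse(apt_url)
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(build_apt_request(apt_url, payload_json))
  end
  handle_apt_response(response.code, response.body)
end
```

Keeping request construction and response handling separate from the network call means both can be exercised with stubbed responses, in the same spirit as the webmock approach described above, without writing to the APT S3 bucket.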
a83339c to 8a8ed01
I've left a few comments. I'm not sure any of them require changes, but I wanted to share my initial thoughts so you can decide whether to make any changes before we test in dev1 APT.
The latest changes look good.
Let's figure out how to test this in Dev1 while CB is on vacation to confirm it works as expected, so that we are ready to merge/promote when he is back.
Developer
our guide and
all issues introduced by these changes have been resolved or opened as new issues (link to those issues in the Pull Request details above)
no UI changes
Code Reviewer
(not just this pull request message)
Requires database migrations? YES
Includes new or updated dependencies? YES