Canned/Queue based Delete Operations for ADO #229

Open
DiegoPino opened this issue Feb 6, 2025 · 0 comments

What?

When we need to delete Objects, especially when dealing with hierarchies/nested/compounds and collections, we either use VBO with filters/facets to narrow down the lists, or AMI sets (using each row's UUID) via "process deleted".

The first approach, even if needed (and it will be kept), adds a level of difficulty for end users, but also uncertainty. Delete operations don't have an UNDO. Those lists might also include more items than the VBO limit (which is set via Solr); the operation runs inline via AJAX, so a dropped internet connection might leave things unfinished; and a list might even include children of a Collection where some of those children are CWS (compounds) with more children of their own, which implies that secondary, manual deletions are needed to get rid of any leftover orphans.

The second one, AMI sets, is of course the safest and gives you complete control (one per row) over what is going to be deleted. But it requires either keeping the original AMI sets around or running a CSV export via VBO that can then be used.

This ticket is here to enable a simpler operation, without the fine-grained options that a VBO/Views-based deletion list provides, but one that ensures there is 100% control over what is deleted, that no orphans are left behind, and that uses a solid backend algorithm based on a Queue processor (re-using a lot of code/functions we already have in AMI) to execute the deletion on the backend.

The main idea here is:

  • Modify/intercept the current ADO (NODE) /delete action and form, and add a secondary checkbox/some options that let users, when visiting an Object that has children connected to it, do a deep deletion directly, with safety checks/limits and consistent behavior

@alliomeria and I discussed the following ideas:

  • One cannot reliably delete if the Solr index is not at 100%. Either disable the action until it is, or add a check on the "deletion" controller queue entry: still allow deletions to be enqueued, but do not let them start processing until that condition is met. Why Solr? Because, given the flexible number of possible child-to-parent relations, we don't store that data in a DB but in Solr.
  • Have a global setting (which can also be automatic: I can read Solr fields and detect whether they connect ADOs to ADOs) to take some or all ADO-to-ADO properties into account when making a deletion list.
  • The deletion should be deep. This means that if collections have children and those have children, everything needs to go. We can have a checkbox here to stop at a certain level (collections of collections of collections), to avoid allowing an admin to totally wipe out a repository.
  • Access checks are key. If the user sends a CWS to deletion but its children are not owned by that user (so the user can't delete them), then the whole tree "from leaf to trunk" that makes those objects dependent is blocked from being deleted.
  • We could, in the case of a shallow deletion, remove any pointers from children to removed parents, but that will also depend on the access check (whether the user can actually "edit" those children, even though we would do it automatically as part of the deletion).
  • Delete queues could either have a schedule (run at midnight only) or be delayed. But I think users might want deletions to happen immediately.
  • Deletions (the list) should generate a precise log that cannot be deleted by the user and that would allow an admin to find out exactly who deleted what.
  • There should never be orphans/errors left behind, which probably means queue items should retry?
  • Also, queue items should not delete things (same UUIDs) created after they were enqueued. E.g. I enqueue a deletion, I manually delete the ADO (while the queue is running), and then I re-ingest it. If the ADO to be deleted has a creation date later than the "I want to delete" request, the deletion should expire.
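
To make a few of the ideas above more concrete, here is a rough, language-agnostic sketch written in Python (not Drupal/PHP, and not existing Archipelago or AMI code). It only illustrates three of the rules: a deep, leaf-to-trunk deletion order, one access failure blocking the whole tree, and a queue item that waits for the index and expires if the ADO was re-created after the request. All names (`Ado`, `DeleteRequest`, `plan_deep_deletion`, `decide`) and the in-memory `children_of` map are assumptions for illustration; in practice the parent/child relations would come from Solr.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, List, Optional, Set


@dataclass(frozen=True)
class Ado:
    """Hypothetical, minimal stand-in for an ADO as seen through the index."""
    uuid: str
    owner: str
    created: datetime


@dataclass(frozen=True)
class DeleteRequest:
    """Hypothetical queue item: which UUID to delete and when that was asked."""
    uuid: str
    requested_at: datetime


def plan_deep_deletion(
    root: str,
    ados: Dict[str, Ado],
    children_of: Dict[str, List[str]],
    can_delete: Callable[[Ado], bool],
) -> Optional[List[str]]:
    """Return UUIDs in leaf-to-trunk order, or None if any ADO is blocked."""
    order: List[str] = []
    seen: Set[str] = set()
    # Iterative post-order walk: every child is emitted before its parent.
    stack = [(root, False)]
    while stack:
        uuid, expanded = stack.pop()
        if expanded:
            order.append(uuid)
            continue
        if uuid in seen:
            continue  # guards against cycles and ADOs with several parents
        seen.add(uuid)
        if not can_delete(ados[uuid]):
            return None  # one protected ADO blocks the whole dependent tree
        stack.append((uuid, True))
        for child in children_of.get(uuid, []):
            stack.append((child, False))
    return order


def decide(request: DeleteRequest, current: Optional[Ado], index_complete: bool) -> str:
    """Decide what a queue worker should do with one enqueued deletion."""
    if not index_complete:
        return "wait"      # stay enqueued until the index is at 100%
    if current is None:
        return "done"      # already gone, nothing left to delete
    if current.created > request.requested_at:
        return "expired"   # same UUID was re-ingested after the request
    return "delete"
```

A real implementation would also need the retry and logging behavior discussed above. The sketch returns a plan or a decision without deleting anything, so a caller could log the full list before acting on it.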

Please feel free to talk/share ideas, use cases, needs and concerns here

thanks

@DiegoPino DiegoPino self-assigned this Feb 6, 2025
@DiegoPino DiegoPino added documentation Improvements or additions to documentation enhancement New feature or request UI Buttons and pixels Reporting Errors, Logs, etc. queue FIFO queue workers Ones taking the FI and doing the FO Event Subscribers You trigger things and I react to those labels Feb 6, 2025
@DiegoPino DiegoPino added this to the 0.9.0 milestone Feb 6, 2025