Batch Commits #130

Open
theduke opened this issue Jul 27, 2022 · 6 comments

theduke commented Jul 27, 2022

There should be a way to change multiple subjects in one commit to guarantee data integrity.

There could either be a new BatchCommit class or a new property on Commit that contains multiple actions, with each action being a nested resource of subject + action (destroy, remove, set, push).

I'd be in favor of adapting Commit to not introduce multiple concepts.

I'd also suggest deprecating a plain subject + action directly in the commit, and having all commits use actions instead.

Author

theduke commented Jul 27, 2022

Example:

{
  "@id": "https://atomicdata.dev/commits/SOME-ID",
  "https://atomicdata.dev/properties/createdAt": 0,
  "https://atomicdata.dev/properties/isA": [
    "https://atomicdata.dev/classes/Commit"
  ],
  "https://atomicdata.dev/properties/commitActions": [
    {
      "https://atomicdata.dev/properties/subject": "some-subject",
      "https://atomicdata.dev/properties/set": {
        "https://atomicdata.dev/properties/shortname": "1611489928"
      }
    }
  ],
  "https://atomicdata.dev/properties/signature": "3n+U/3OvymF86Ha6S9MQZtRVIQAAL0rv9ZQpjViht4emjnqKxj4wByiO9RhfL+qwoxTg0FMwKQsNg6d0QU7pAw==",
  "https://atomicdata.dev/properties/signer": "https://surfy.ddns.net/agents/9YCs7htDdF4yBAiA4HuHgjsafg+xZIrtZNELz4msCmc=",
  "https://atomicdata.dev/properties/previousCommit": "https://surfy.ddns.net/commits/9YCs7htDdF4yBAiA4HuHgjsafg+xZIrtZNELz4msCmc="
}

Author

theduke commented Jul 27, 2022

Thinking about it, this would also enable a marginally cleaner schema by requiring a class for each action. This also allows for additional action types in the future.

E.g. CommitActionDestroy and CommitActionModify:

  "https://atomicdata.dev/properties/commitActions": [
    {
      "https://atomicdata.dev/properties/isA": ["https://atomicdata.dev/classes/CommitActionDestroy"],
      "https://atomicdata.dev/properties/subject": "some-subject1"
    },
    {
      "https://atomicdata.dev/properties/isA": ["https://atomicdata.dev/classes/CommitActionModify"],
      "https://atomicdata.dev/properties/subject": "some-subject1",
      "https://atomicdata.dev/properties/set": {
        "https://atomicdata.dev/properties/shortname": "1611489928"
      }
    }
  ],
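
For illustration, the per-action classes above could map onto a tagged enum on the Rust side. This is only a sketch: the `CommitAction` type, its variants, and `touched_subjects` are hypothetical names, not existing atomic-server types, and values are simplified to strings.

```rust
use std::collections::BTreeMap;

/// One entry of the proposed `commitActions` array.
/// Hypothetical mirror of CommitActionDestroy / CommitActionModify.
#[derive(Debug, Clone, PartialEq)]
pub enum CommitAction {
    /// Removes the resource at `subject`.
    Destroy { subject: String },
    /// Sets property URL -> value on the resource at `subject`.
    Modify {
        subject: String,
        set: BTreeMap<String, String>,
    },
}

impl CommitAction {
    /// The subject this action targets.
    pub fn subject(&self) -> &str {
        match self {
            CommitAction::Destroy { subject } => subject.as_str(),
            CommitAction::Modify { subject, .. } => subject.as_str(),
        }
    }
}

/// Distinct subjects touched by a list of actions, sorted and de-duplicated.
pub fn touched_subjects(actions: &[CommitAction]) -> Vec<&str> {
    let mut subjects: Vec<&str> = actions.iter().map(|a| a.subject()).collect();
    subjects.sort();
    subjects.dedup();
    subjects
}
```

New action types would then be new enum variants, and any code matching on the enum is forced by the compiler to handle them.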

Member

joepio commented Jul 29, 2022

I like this idea! Also, having actions instead of having a boolean modifier for destroying seems like a good idea. Having multiple subjects also opens up some questions / considerations.

When constructing a version of a resource, we currently use the identifier of the Commit as the identifier of the Resource version. If we allow multiple subjects per Commit, this can no longer work - we need a second parameter (i.e. the subject of the resource itself). This will make version references a bit more verbose, but I don't think that's a big issue.
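
A minimal sketch of what such a two-part version reference might look like, assuming versions are looked up by the pair (Commit ID, subject). `VersionKey` and `VersionIndex` are hypothetical names for illustration, and snapshots are simplified to strings.

```rust
use std::collections::HashMap;

/// Hypothetical key for a resource version once a Commit may touch
/// several subjects: the Commit ID alone is no longer unique.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct VersionKey {
    pub commit_id: String,
    pub subject: String,
}

/// Toy in-memory index from version key to a serialized resource snapshot.
#[derive(Default)]
pub struct VersionIndex {
    snapshots: HashMap<VersionKey, String>,
}

impl VersionIndex {
    /// Stores the state of `subject` as it was after `commit_id` was applied.
    pub fn record(&mut self, commit_id: &str, subject: &str, snapshot: &str) {
        let key = VersionKey {
            commit_id: commit_id.to_string(),
            subject: subject.to_string(),
        };
        self.snapshots.insert(key, snapshot.to_string());
    }

    /// Looks up a version; both parts of the key are required.
    pub fn get(&self, commit_id: &str, subject: &str) -> Option<&str> {
        let key = VersionKey {
            commit_id: commit_id.to_string(),
            subject: subject.to_string(),
        };
        self.snapshots.get(&key).map(|s| s.as_str())
    }
}
```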

However, if we also allow multiple commitActions in the same Commit, then we could not refer to intermediate states (e.g. after applying only 3 of the actions instead of all the actions). I don't think that's a big problem either, but we should be aware of it.

Also I think I'll have to rewrite quite a bit of logic to allow for this - both client and server side. Commits are used throughout the entire application. Most logic resides in the handle_commit function in atomic-server. Rewriting this to work with multiple actions doesn't seem that complex. Executing these multi-Commits atomically (i.e. fail or succeed in its entirety) will pose some challenges, but also doable I think.
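
One way to picture the all-or-nothing requirement is to apply the actions to a staged copy and only swap it in when every action succeeds. The sketch below is not how handle_commit works; `Store`, `Action`, and `apply_all_or_nothing` are hypothetical names, and the rule that destroying a missing subject is an error is an assumption made for the example.

```rust
use std::collections::BTreeMap;

/// A resource is a map from property URL to value (simplified to strings).
pub type Resource = BTreeMap<String, String>;
/// The store maps a subject to its resource.
pub type Store = BTreeMap<String, Resource>;

/// Hypothetical action type, reduced to two cases.
#[derive(Debug, Clone, PartialEq)]
pub enum Action {
    Destroy { subject: String },
    Set { subject: String, property: String, value: String },
}

/// Applies every action or none: all work happens on a staged copy,
/// which replaces the live store only if no action failed.
pub fn apply_all_or_nothing(store: &mut Store, actions: &[Action]) -> Result<(), String> {
    let mut staged = store.clone();
    for action in actions {
        match action {
            Action::Destroy { subject } => {
                // Assumption: destroying a subject that does not exist is an error.
                if staged.remove(subject).is_none() {
                    return Err(format!("cannot destroy missing subject {}", subject));
                }
            }
            Action::Set { subject, property, value } => {
                staged
                    .entry(subject.clone())
                    .or_default()
                    .insert(property.clone(), value.clone());
            }
        }
    }
    // Every action succeeded, so the staged state becomes the live state.
    *store = staged;
    Ok(())
}
```

Cloning the whole store is only workable for a toy example; a real server would more likely rely on a transaction in the storage layer.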

Could you perhaps provide some use cases that you had in mind for this?

Author

theduke commented Jul 30, 2022

An in-between solution could be to still give each commit its own identity, and use batch commits just to group individual commits together with a resource array.

I have many use cases, which are all application-centric, not so much related to external data. Every time an application wants to alter multiple subjects atomically there needs to be some batch commit logic, otherwise data integrity can't be guaranteed.

Member

joepio commented Aug 3, 2022

> An in-between solution could be still giving each commit its own identity, and using batch commits just to group individual commits together with a resource array.

That is possible, but currently the ID of every commit is its signature. We could sign every Commit individually. But if we want to have one signature for multiple actions, we'd need a different way of thinking of Commit identifiers.

If we think of BatchCommits as a new type of resource, which contains a set of Commits, and only applies them if all of them are valid, I think we're good.

But implementing this will definitely be a bit of a challenge. I think I'll need to create a sled::Transaction or something similar and pass this around. Not impossible, but probably quite a refactor.

Author

theduke commented Aug 10, 2022

> That is possible, but currently the ID of every commit is its signature. We could sign every Commit individually. But if we want to have one signature for multiple actions, we'd need a different way of thinking of Commit identifiers.

I think this still would work fine: each commit is signed individually, and the batch commit just contains an array with the IDs of the individual commits and is also signed.

An interesting aspect here is also syncing: one might want to expose/sync individual subjects, while batch commits might touch multiple subjects, including synced and unsynced ones. So the sync could only involve individual commits, not batches.
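
As a rough sketch of that idea, a sync step could take a batch, look at the individual commits it references, and pass on only those whose subject is exposed. `Commit`, `BatchCommit`, and `commits_to_sync` are hypothetical, simplified types for illustration; signatures are left out entirely.

```rust
use std::collections::HashSet;

/// An individually signed commit, reduced to the fields the filter needs.
#[derive(Debug, Clone, PartialEq)]
pub struct Commit {
    pub id: String,
    pub subject: String,
}

/// Hypothetical batch that only lists the IDs of the commits it groups.
pub struct BatchCommit {
    pub commit_ids: Vec<String>,
}

/// Returns the commits referenced by `batch` whose subject is exposed for
/// syncing. The batch itself is never part of the result.
pub fn commits_to_sync<'a>(
    batch: &BatchCommit,
    commits: &'a [Commit],
    synced_subjects: &HashSet<String>,
) -> Vec<&'a Commit> {
    commits
        .iter()
        .filter(|c| batch.commit_ids.contains(&c.id))
        .filter(|c| synced_subjects.contains(&c.subject))
        .collect()
}
```

A receiving instance would then see ordinary single-subject commits and would not need to know that a batch existed, at the cost of losing the atomicity guarantee across instances.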

I haven't thoroughly thought this through, but I think batch commits are probably only really relevant locally for a single instance, not for distribution.
