Add event UUID to event schema #40

Open
wants to merge 4 commits into
base: main

Conversation

rhysrevans3
Contributor

No description provided.

@djspstfc

djspstfc commented Nov 6, 2024

I see that this UUID is in the Metadata section, which I agree is where it belongs. However, I think this field will not make it into the STAC catalog? If so, can it still be used for end-to-end error detection? Or, since we want to use it for error analysis and errors will not get into the STAC catalog anyway, is that OK?

TL;DR: I think this is the right place but do we also want the UUID in the STAC catalog?

@rhysrevans3
Contributor Author

I don't think it needs to be in the STAC record as it will be in the materialised view. If it were in the STAC record would this be a list of all events associated with the record or just the last event?

@jbryan
Contributor

jbryan commented Nov 6, 2024

I agree that it probably doesn't need to be in the catalog, since the consumer responsible for putting the data into the catalog will have access to it and should be able to add it to any error message generated.

At the risk of bikeshedding one of the hardest problems in software, I would suggest giving the field a somewhat more descriptive name. Perhaps event_id, request_id, or similar, to give some hint as to what it identifies.

@rhysrevans3
Contributor Author

Just to note I've included the changes to the auth section that were suggested in the ad-hoc meeting.

@rhysrevans3
Contributor Author

A note on separate event and request IDs: the request ID will stay the same throughout the life of the request, while the event ID will change when the event moves to a new stream/queue, for example when it moves to the error queue.
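
The distinction between the two IDs can be sketched in Python as follows. This is only an illustration under stated assumptions, not the project's schema: the `Event` dataclass, the `payload` and `queue` fields, and the `new_id` and `forward_to` helpers are hypothetical; only the names event_id and request_id come from this thread.

```python
import uuid
from dataclasses import dataclass, field, replace


def new_id() -> str:
    """Return a random UUID4 as a string."""
    return str(uuid.uuid4())


@dataclass(frozen=True)
class Event:
    # Hypothetical event shape; only event_id and request_id come from the discussion.
    payload: dict
    queue: str
    request_id: str = field(default_factory=new_id)
    event_id: str = field(default_factory=new_id)


def forward_to(event: Event, queue: str) -> Event:
    """Copy the event onto another queue with a fresh event_id and the same request_id."""
    return replace(event, queue=queue, event_id=new_id())


original = Event(payload={"item": "example"}, queue="ingest")
errored = forward_to(original, "error")
```

Here `errored.request_id` equals `original.request_id`, so every hop of one request can be correlated, while `errored.event_id` differs, so each individual message stays distinguishable.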

@sturoscy-personal
Contributor

Is the idea of removing auth_basis_data that group (or other) affiliations can be looked up later based off the requester_data?

@joshbryan-globus
Collaborator

Yes, it could be looked up by requester_data, or, more authoritatively, it would ideally come from system logs from the ingest service that give a reason tied to the request_id / event_id. Since each system (east/west) will have its own policy representation, they can each write logs that make sense from the perspective of that authorization policy. In other words, I think what we landed on today was to simplify coordination by moving the bar from "agree on a common representation of authorization decisions" to "agree on a common way to tie events back to authorization decisions".

@sturoscy-personal
Contributor

So are we not doing any auth checks of our own? Or are we doing the auth checks AND system logging?
https://github.com/esgf2-us/stac-transaction-api/blob/main/src/client.py#L50-L60

@joshbryan-globus
Collaborator

We'll still do the auth checks; the only change is that we record the reason for the +/- authz decision in the system log rather than in the event payload.
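
A minimal sketch of what logging the decision outside the payload could look like, using Python's standard `logging` and `json` modules. Everything specific here is an assumption for illustration: the logger name, the `authz_log_record` and `record_authz_decision` helpers, the log field names other than request_id and event_id, and the example reason string. Each system would word the reason according to its own policy representation.

```python
import json
import logging

logger = logging.getLogger("ingest.authz")  # hypothetical logger name


def authz_log_record(request_id: str, event_id: str, allowed: bool, reason: str) -> str:
    """Build a JSON log line tying an authorization decision to the request and event IDs."""
    return json.dumps(
        {
            "request_id": request_id,
            "event_id": event_id,
            "authz": "allow" if allowed else "deny",
            "reason": reason,  # free-form; each system (east/west) words this its own way
        },
        sort_keys=True,
    )


def record_authz_decision(request_id: str, event_id: str, allowed: bool, reason: str) -> None:
    """Write the decision to the system log; the event payload itself carries no reason."""
    logger.info(authz_log_record(request_id, event_id, allowed, reason))


# Placeholder IDs and reason, for illustration only.
line = authz_log_record("req-1", "evt-1", False, "requester not in publisher group")
record_authz_decision("req-1", "evt-1", False, "requester not in publisher group")
```

With this shape, anyone investigating an event only needs its request_id or event_id to find the matching decision in the ingest service's logs.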

@lukaszlacinski

@rhysrevans3 The PR adds user_id. Is user_id different from sub? The West authorizer introspects the access token sent by the STAC client in the Authorization header. A sample introspection response from Globus Auth is:

{
    "active": true,
    "token_type": "Bearer",
    "scope": "https://auth.globus.org/scopes/6fa3b827-5484-42b9-84db-f00c7a183a6a/ingest",
    "client_id": "ec5f07c0-7ed8-4f2b-94f2-ddb6f8fc91a3",
    "username": "[email protected]",
    "name": "Lukasz Lacinski",
    "email": "[email protected]",
    "exp": 1730170318,
    "iat": 1729997518,
    "nbf": 1729997518,
    "sub": "a511c7bc-d274-11e5-9aea-4bedf3cb22c7",
    "aud": [
        "6fa3b827-5484-42b9-84db-f00c7a183a6a",
        "ec5f07c0-7ed8-4f2b-94f2-ddb6f8fc91a3"
    ],
    "iss": "https://auth.globus.org",
    "dependent_tokens_cache_id": "6b2c671e89f1f624f00cf46dfee50d911bf7efd306d9f396db3bae47ed5a549f",
    "identity_set_detail": [
        {
            "sub": "a511c7bc-d274-11e5-9aea-4bedf3cb22c7",
            "organization": "ANL",
            "name": "Lukasz Lacinski",
            "username": "[email protected]",
            "identity_provider": "41143743-f3c8-4d60-bbdb-eeecaba85bd9",
            "identity_provider_display_name": "Globus ID",
            "email": "[email protected]",
            "last_authentication": 1729271774
        }, {
            "sub": "1a552be5-ec3b-409e-8022-9a8e52546f6b",
            "name": null,
            "username": "[email protected]",
            "identity_provider": "1e0161ce-b32d-461a-ac9f-88c8099bffed",
            "identity_provider_display_name": "NCCS Open SSO",
            "email": null,
            "last_authentication": 1729200459
        }, {
            "sub": "4e868912-e4be-11e5-88c5-dbe133110c3e",
            "organization": "The University of Chicago",
            "name": "Lukasz Lacinski",
            "username": "[email protected]",
            "identity_provider": "0dcf5063-bffd-40f7-b403-24f97e32fa47",
            "identity_provider_display_name": "University of Chicago",
            "email": "[email protected]",
            "last_authentication": 1722359036
        }
    ]
}

Fields in the requester_data are copied from the introspection response (except auth_service, which I would change to iss to keep it consistent with https://www.rfc-editor.org/rfc/inline-errata/rfc7662.html and the introspection response).

What does the authorization flow look like with EGI Check-in? Does EGI Check-in use user_id?

@rhysrevans3
Contributor Author

@lukaszlacinski I added user_id to replace the personal information, as I thought this would be better for GDPR. The key name was chosen arbitrarily, so we could switch to using sub. I'll see if I can find someone who knows what the equivalent of sub is for EGI Check-in.

@djspstfc

@rhysrevans3 sub is part of the OAuth 2.0 standard and stands for "subject" - i.e. the canonical identity of the agent requesting authentication. We don't have to call it "sub", but whatever "sub" value EGI or Globus gives us is probably the ID we want to use.
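
As a rough sketch of that idea, the identifying fields could be copied straight out of the introspection response. The `requester_from_introspection` helper is hypothetical, and the choice of exactly `sub` and `iss` as the output keys is an assumption; the sample values are taken from the Globus Auth response quoted earlier in this thread. Note that RFC 7662 makes both `sub` and `iss` optional, so a real implementation would need to decide how to handle a response that omits them (this sketch would raise `KeyError`).

```python
def requester_from_introspection(introspection: dict) -> dict:
    """Pick the identifying fields out of an RFC 7662 introspection response."""
    # "active" is the only member RFC 7662 requires; reject inactive tokens outright.
    if not introspection.get("active", False):
        raise ValueError("token is not active")
    return {
        "sub": introspection["sub"],  # subject: the provider's identifier for the user
        "iss": introspection["iss"],  # issuer: which auth service vouched for the token
    }


sample = {
    "active": True,
    "sub": "a511c7bc-d274-11e5-9aea-4bedf3cb22c7",
    "iss": "https://auth.globus.org",
}
requester = requester_from_introspection(sample)
```

Keeping only the opaque `sub` and the `iss` avoids copying name, email, or username into the event, which matches the GDPR concern raised above.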

@rhysrevans3
Contributor Author

I've removed user_id and updated auth_service to iss. Are there any other fields we think should be included or is auth_policy_id enough information for the event?


6 participants