Skip to content

Conversation

taimoorzaeem
Copy link
Collaborator

@taimoorzaeem taimoorzaeem commented May 6, 2025

Combines #4059 and #4062.

  • Removes vault library (no longer needed).
  • Refactor
  • Code cleanup

The log format remains the same except on error cases we wouldn't be able to get the user/role.

Old: 127.0.0.1 - postgrest_test_anonymous [05/May/2025:01:39:57 -0500] "GET /projects HTTP/1.1" 200 212 "" "curl/7.81.0"
New: 127.0.0.1 - postgrest_test_anonymous [05/May/2025:01:39:57 -0500] "GET /projects HTTP/1.1" 200 212 "" "curl/7.81.0"

Old: 127.0.0.1 - postgrest_test_anonymous [05/May/2025:01:39:57 -0500] "GET /unknown HTTP/1.1" 404 212 "" "curl/7.81.0"
New: 127.0.0.1 - - [05/May/2025:01:39:57 -0500] "GET /unknown HTTP/1.1" 404 212 "" "curl/7.81.0"

This commit removes auth middleware for it hides
side effects and obscures logic. The auth operations
are now done in its own stage in the request-response
cycle.

It also removes the logging middleware because now
we instead use observation module to log the response.
@taimoorzaeem taimoorzaeem marked this pull request as draft May 6, 2025 18:23
@steve-chavez
Copy link
Member

The log format remains the same except on error cases we wouldn't be able to get the user/role.

Hm, that's no good, why is that?

@taimoorzaeem
Copy link
Collaborator Author

The log format remains the same except on error cases we wouldn't be able to get the user/role.

Hm, that's no good, why is that?

That's because it's a little harder to get that with the no middleware approach. In the middleware approach, we added the role to the vault that was attached to the request object. Because the request object was ubiquitously available, we could get the role from anywhere.

Now without middleware, we are doing the authentication in a linear way, it's harder to do it. I think it should be possible if I spend a bit more time on it. I'll try to reinstate it.

@taimoorzaeem
Copy link
Collaborator Author

taimoorzaeem commented May 7, 2025

I think we should not remove these middlewares because authentication and logging are generally considered a cross-cutting concern. This essentially means they should not be implemented in linear fashion and are better suited as middlewares. @wolfgangwalther WDYT?

Even if we do succeed to refactor these out, I think they would still cause issues in the future and we might end up reverting this change.

As for the original issue ref, I think we should solve it some other way.

@wolfgangwalther
Copy link
Member

I think we should not remove these middlewares [...]

I fully agree.

@steve-chavez
Copy link
Member

steve-chavez commented May 7, 2025

I think we should not remove these middlewares because authentication and logging are generally considered a cross-cutting concern.

I'm fully aware of the cross-cutting concern concept and I'm old enough to have used Aspect-oriented programming (AspectJ) in Java to try and tame it.

So far, we've been successful in implementing logging and metrics in a maintainable way, thanks to the Observation module. We don't have logging messages nor counter increments scattered everywhere. This is the best way I've seen a system implement these cross-cutting concerns.

This essentially means they should not be implemented in linear fashion and are better suited as middlewares.
Even if we do succeed to refactor these out, I think they would still cause issues in the future and we might end up reverting this change.

I think the codebase will be much more predictable, reasonable and maintainable if we do it in a linear fashion, middlewares just hide the complexity in a bad way IMO. Just the fact we have to use unsafe and partial functions plus vault to retain state.

For me the ideal end result would be having a type for request stages and then enforcing the order of the steps via GADTs. Something like:

data Stage
  = NotCORSValidated   -- Before we've done CORS validation
  | CORSValidated      -- CORS validated
  | Authenticated      -- Auth done

data Request (s :: Stage) where...

validateCORS :: Request 'NotCORSValidated -> Either Error (Request 'CORSValidated)

authenticate :: Request 'CORSValidated -> Either Error (Request 'Authenticated)

runMainLogic :: Request 'Authenticated -> IO ()

I believe this is all doable and PostgREST overall will be easier to extend and maintain. An observation at the end of these stages can add logging as needed.


@taimoorzaeem That being said, I see this is too complicated now. So for #3507, maybe we can use vault as you mention. Or otherwise leave it for later until a refactor makes it easier.

@wolfgangwalther
Copy link
Member

If we get rid of middlewares, do we need still need WAI? I'm not opposed to that in principle, I'm just thinking removing middlewares, but keeping WAI is kind of pointless. So we'd need to restructure this a bit more from the ground up. But by using WAI we essentially buy into the middleware approach.

@steve-chavez
Copy link
Member

If we get rid of middlewares, do we need still need WAI? I'm not opposed to that in principle, I'm just thinking removing middlewares, but keeping WAI is kind of pointless.

Yeah, it looks like WAI could be removed. An additional benefit of that is that we could integrate with other web servers since PostgREST would pretty much be just pure functions. We had the idea of integrating with Nginx as a module before and there's also #3932. It might help with #2452 too.

@mkleczek
Copy link
Contributor

mkleczek commented Jun 4, 2025

For me the ideal end result would be having a type for request stages and then enforcing the order of the steps via GADTs. Something like:

data Stage
  = NotCORSValidated   -- Before we've done CORS validation
  | CORSValidated      -- CORS validated
  | Authenticated      -- Auth done

data Request (s :: Stage) where...

validateCORS :: Request 'NotCORSValidated -> Either Error (Request 'CORSValidated)

authenticate :: Request 'CORSValidated -> Either Error (Request 'Authenticated)

runMainLogic :: Request 'Authenticated -> IO ()

I believe this is all doable and PostgREST overall will be easier to extend and maintain. An observation at the end of these stages can add logging as needed.

IMHO it should be a mix of both transparent middleware and using an effect system (like https://github.com/haskell-effectful/effectful).

There is no reason to couple anything with traceHeaderMiddleware artificially by adding a phantom kind TraceHeaderHandled. So having it as a middleware seems fine to me. Cors is similar as nothing depends on it being present or absent.

OTOH authentication middleware would be better implemented as an effect handler so subsequent stages could have signature like:
runMainLogic :: (Authenticated :> es, Error Error :> es) => Request -> Eff es ()

The actual Authenticated effect would be something like a Reader AuthResult but specialized and provide:
getAuth :: Authenticated :> es => Eff es AuthResult.

The signature of authentication effect handler would be:
authenticated :: (IOE :> es, Error Error :> es) => Eff (Authenticated : es) a -> Eff es a

(the above are examples - the actual effects and handlers structure should be well thought out - especially in terms of error handling)

Cors and tracę header could be left as "non-discharging effect handlers" (ie. wrappers) or as middlewares.

That would achieve linearity where necessary while still allowing to have really cross cutting concerns independent from any other logic.

@steve-chavez
Copy link
Member

There is no reason to couple anything with traceHeaderMiddleware artificially by adding a phantom kind TraceHeaderHandled. So having it as a middleware seems fine to me.

Trace header is cross-cutting but thinking about its nature it can be linearized and be part of our function pipeline:

  • It essentially adds an identifier to our requests, so it should be a first step (before Auth.hs and ApiRequest.hs) that populates an attribute on a type.
  • It will end up as part of the query comment (as a query tag, discussed on traceparent or X-Opaque-Id headers to query tags #2506), thus will reach the last module, Query.hs
  • The other modules in between (like Plan) should also include it (it's an id, so it makes sense)

Now it will affect the Error.hs module too, but maybe it can be worked in a way there's no much duplication.
Granted, I haven't tried it, but might be possible.

@mkleczek
Copy link
Contributor

mkleczek commented Jun 4, 2025

There is no reason to couple anything with traceHeaderMiddleware artificially by adding a phantom kind TraceHeaderHandled. So having it as a middleware seems fine to me.

Trace header is cross-cutting but thinking about its nature it can be linearized and be part of our function pipeline:

  • It essentially adds an identifier to our requests, so it should be a first step (before Auth.hs and ApiRequest.hs) that populates an attribute on a type.
  • It will end up as part of the query comment (as a query tag, discussed on traceparent or X-Opaque-Id headers to query tags #2506), thus will reach the last module, Query.hs
  • The other modules in between (like Plan) should also include it (it's an id, so it makes sense)

Now it will affect the Error.hs module too, but maybe it can be worked in a way there's no much duplication. Granted, I haven't tried it, but might be possible.

Right, so iff other code depends on trace header one way or another then it is not transparent cross cutting concern any more.

What do you think about modeling it as effects and effect handlers discharging them though?

@taimoorzaeem
Copy link
Collaborator Author

If we get rid of middlewares, do we need still need WAI? I'm not opposed to that in principle, I'm just thinking removing middlewares, but keeping WAI is kind of pointless. So we'd need to restructure this a bit more from the ground up. But by using WAI we essentially buy into the middleware approach.

How much restructuring are we talking about here? WAI also includes our Warp server, would we remove that too? If yes, what other options do we have? Yesod?

@wolfgangwalther
Copy link
Member

WAI also includes our Warp server, would we remove that too?

No, I don't think we want to actually change all of our stack. Warp can be used without WAI, right?

@taimoorzaeem
Copy link
Collaborator Author

[...] Warp can be used without WAI, right?

I don't think so. See runSettingsSocket from WAI, we run this with Wai.Application as input (postgrest server in our case). So, it seems like it can't be used without WAI.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants