Replies: 3 comments 1 reply
-
Hi,
-
As mentioned in #1707, Martin could provide some basic solution, but it would never compare to something like Varnish, where one can write fully custom caching logic. I would actually prefer to keep Martin's functionality "practical" (whatever that even means), but caching is a fairly complex and involved topic, so I am not even certain how best to do it. I introduced per-tile caching at some point to optimize Martin's performance, but I am still not certain whether that was a mistake.
-
We looked at other caching strategies. The issue is zoom-level cache busting, which would be the best-performing option on constrained infrastructure. Varnish was going to be very complex to handle; it was basically just a little bit better than Nginx performance-wise, but it also suffered a lot. As a next step, I'll submit a PR with the modification we ended up using. It has some security implications: if the endpoint is exposed, anyone can continually invalidate the cache. Something that would be interesting is managing a SQL tile hash; checking for tile recreation could reduce Martin's performance, but it could then trigger invalidation automatically.
-
Hello All 👋 ,
TL;DR
Let's discuss some approaches for actively invalidating cache tiles when sources are changed. We hope to submit a pull request for review once there is some consensus on the approach.
The Problem 🔢
We use and love the martin tile server. One issue we face is invalidating tiles when sources are updated with user changes. Our specific use case involves a user editing geometry and saving it to a PostgreSQL database. After a user change, we call a redraw on the map layer, but martin continues to serve the cached tiles without the change. Other visitors who request those cached tiles don't see the changes either.
Architecture 🏗️
User's browser -> React Frontend -> Backend Authentication -> Martin TS -> PostgreSQL/PostGIS
Users submit a geometry change that is written to PostgreSQL.
Developer Use Approaches 👀
First we'll go into how a frontend might request a tile or tile set to be re-fetched, then how it might be implemented in martin.
Versioning
Append a version "number" to the request URL; when something changes on the frontend, we could increment the version number (or timestamp) to tell martin to re-fetch a tile or tile set. It's a well-understood technique for cache busting, even if it is from the before times. However, martin would need to know about tile versions for any smart re-fetching; otherwise a version change would just be analogous to a boolean flag that removes the tile from the cache and re-fetches it from the source. It also complicates URL parsing in martin.
It might look something like:
https://www-to-martin-endpoint/mvt?source/z/x/y/version
OR
https://www-to-martin-endpoint/mvt?source-version/z/x/y
etc.
It's worth noting this approach would be handled well by other caches like NGINX or CDNs.
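As a rough frontend sketch of the versioning idea: the `versionedTileUrl` helper, the `parcels` source name, and the exact path layout are assumptions for illustration only, not routes that martin parses today.

```typescript
// Hypothetical helper: builds a tile URL template carrying a version segment,
// loosely following the "source-version/z/x/y" layout suggested above.
function versionedTileUrl(base: string, source: string, version: number): string {
  return `${base}/${source}-${version}/{z}/{x}/{y}`;
}

const base = "https://www-to-martin-endpoint"; // placeholder host from the examples
let version = 1;
const before = versionedTileUrl(base, "parcels", version);

// After a successful save, bump the version so every tile URL changes and
// any URL-keyed cache (browser, NGINX, CDN) treats it as a new resource.
version += 1;
const after = versionedTileUrl(base, "parcels", version);
```

A timestamp could stand in for the counter; either way the server side still has to decide what a new version means for the tiles it already holds.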
Fingerprinting
The process of calculating a hash of the tile on the frontend and passing it in the request URL. The main issue with this approach is that the frontend doesn't know the fingerprint of the tile or source beforehand. The other issue is that vector tiles don't really know about the feature geometry changes we add on top, at least in our implementation, so calculating an accurate fingerprint of a resulting tile when a geometry crosses more than one tile would be problematic. So this seems like the wrong mechanism for this feature, but I'm happy to explore something like this if anyone has a good idea.
It might look like versioning in practice:
https://www-to-martin-endpoint/mvt?source-437b930db84b8079c2dd804a71936b5f/z/x/y/
where fingerprint is the hash hex string attached to source.
It's worth noting this approach would be handled well by other caches like NGINX or CDNs.
Query String
Appending a version or flag to the query parameters of the GET request:
?version=2
OR
?invalidate=true
etc.
Easy to parse in martin; it could also be a timestamp. It is not handled well by other caching schemes, such as some CDNs.
Use Cache-Control Header
The Cache-Control header could be sent on the request with a value of no-cache or no-store. In this case we would be abusing the meaning of the terms, but both are valid Cache-Control values. Frontend developers would modify transformRequest in Mapbox (or the equivalent in other libraries) to add the Cache-Control value to the request. Looking forward to thoughts on this option; one approach I can see is driving the header from a store value that is set after a successful PostgreSQL update.
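A minimal sketch of what that frontend hook might look like. The `geometryChanged` flag and the local `RequestParameters` interface are stand-ins introduced here so the snippet is self-contained; how martin would react to the header is exactly what this proposal is asking about.

```typescript
// Local stand-in for the object a transformRequest callback may return.
interface RequestParameters {
  url: string;
  headers?: Record<string, string>;
}

// Hypothetical flag the app would set after a successful PostgreSQL update.
let geometryChanged = false;

function transformRequest(url: string, resourceType?: string): RequestParameters {
  // Only touch tile requests, and only while a change is pending.
  if (resourceType === "Tile" && geometryChanged) {
    return { url, headers: { "Cache-Control": "no-cache" } };
  }
  return { url };
}
```

The function would be passed as the `transformRequest` option when the map is constructed. The same shape would work for a custom header by swapping the header name.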
Use ETag Header
Entity tags in the header have some neat properties, like a case-sensitive prefix of W/ to indicate a weak validator, but this would have similar issues to versioning and fingerprinting above: martin and the frontend would need to sync on versions. It would be supplied to martin from the frontend using the transformRequest approach discussed under Cache-Control Header.
Custom Header
The X- prefix for custom headers is deprecated; as long as we avoid conflicting with existing header names, we could choose a terse header such as
martin-refetch: true | false | next
or similar. Developers would implement the custom header in the same way as described under Cache-Control above.
A New Endpoint
🔬 Potentially we could expand this to a wider discussion about some form of webhook interface that would allow sources to flag when layers or features change, which would remove tiles from the cache. This would be a powerful addition that could do a lot more than just invalidate tiles, though tile invalidation is what this discussion is about specifically.
Implementation Approaches 🛠️
One of the issues we face is invalidating not just one cached tile but every cached tile that lacks the source changes. I think the strategy should be to remove from the cache all "connected" tiles within zoom-level bounds defined in the configuration. For example, if a tile is requested to be re-fetched, martin would remove the tile from the cache if it exists, then fetch and return the new tile, while also removing the tile one zoom level above and the tiles one zoom level below (if they exist). The fetch retrieves only the requested tile, but the cache then forces a re-fetch on any new request for the other affected tiles.
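To make the "connected tiles" idea concrete, here is a small sketch under stated assumptions: the `TileCoord` interface, the `TileCache` class, and the `minZoom`/`maxZoom` bounds are hypothetical names for illustration and do not model martin's actual cache. It relies only on the standard XYZ pyramid, where a tile's parent is at (z-1, floor(x/2), floor(y/2)) and its four children are at z+1.

```typescript
interface TileCoord { z: number; x: number; y: number; }

const key = (t: TileCoord): string => `${t.z}/${t.x}/${t.y}`;

// Tiles one zoom level above and below `t` that cover the same area,
// limited to the configured zoom bounds.
function connectedTiles(t: TileCoord, minZoom: number, maxZoom: number): TileCoord[] {
  const out: TileCoord[] = [];
  if (t.z > minZoom) {
    // Parent tile: halve x and y, rounding down.
    out.push({ z: t.z - 1, x: Math.floor(t.x / 2), y: Math.floor(t.y / 2) });
  }
  if (t.z < maxZoom) {
    // The four child tiles.
    for (const dy of [0, 1]) {
      for (const dx of [0, 1]) {
        out.push({ z: t.z + 1, x: 2 * t.x + dx, y: 2 * t.y + dy });
      }
    }
  }
  return out;
}

// Hypothetical in-memory cache; tile bodies are plain strings here.
class TileCache {
  private tiles = new Map<string, string>();
  private minZoom: number;
  private maxZoom: number;

  constructor(minZoom: number, maxZoom: number) {
    this.minZoom = minZoom;
    this.maxZoom = maxZoom;
  }

  has(t: TileCoord): boolean {
    return this.tiles.has(key(t));
  }

  // Return the cached tile, or fetch and store it on a miss.
  get(t: TileCoord, fetchTile: (t: TileCoord) => string): string {
    let tile = this.tiles.get(key(t));
    if (tile === undefined) {
      tile = fetchTile(t);
      this.tiles.set(key(t), tile);
    }
    return tile;
  }

  // Drop the tile and its connected tiles, then re-fetch only the requested one.
  refetch(t: TileCoord, fetchTile: (t: TileCoord) => string): string {
    this.tiles.delete(key(t));
    for (const c of connectedTiles(t, this.minZoom, this.maxZoom)) {
      this.tiles.delete(key(c));
    }
    return this.get(t, fetchTile);
  }
}
```

Only the requested tile is fetched eagerly; the parent and children are simply evicted, so they are rebuilt lazily on their next request. A real implementation would fetch asynchronously and would need to decide how many zoom levels to walk.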
As for custom headers, we can add an actix_web allow_header for the custom name martin-refetch, for example. In the DynTileSource we can add a new invalidation expr to pass to the get_or_insert_cached_value macro, along with a new expr for removing other tiles around the TileCoord.
Webhook
Alternatively, we could add a new get_webhook factory and routes, with a simple definition of a source name and a timestamp/serial of a change. The implementation in the tile cache would need some thought: would it invalidate the source everywhere, or calculate the geometry change? Any thoughts on this idea or implementation are greatly appreciated.
Thank you for reading my wall of text!