Summary
Change the Cache interface so that Open() returns an io.ReadSeekCloser instead of an io.ReadCloser, in order to support HTTP Range requests when serving cached objects.
For most backends this is trivial. For backends that stream over the network (S3 and the Remote cache client), we introduce a wrapper that supports a single Seek() to set the start offset, followed by purely sequential reads, backed by a range request. The higher-level Range-serving code is written to use exactly this access pattern.
Interface change
In internal/cache/api.go, change:
Open(ctx context.Context, key Key) (io.ReadCloser, http.Header, error)
to return io.ReadSeekCloser. Document that callers serving ranges MUST use a single seek-to-start followed by sequential reads (no seek-to-end probing). Because io.ReadSeekCloser is a superset of io.ReadCloser, all existing sequential consumers (http.Fetch, git snapshot/bundle, gomod cacher, cachetest suite, etc.) continue to compile and work unchanged.
Shared seek helper
Add one reusable "seek-once, lazily open at offset, then sequential" wrapper implementing io.ReadSeekCloser, parameterised by an "open underlying stream at offset" function:
- Holds a pending start offset (default 0).
- Seek is only meaningful before the first Read: it sets the start offset, resolving io.SeekStart / io.SeekCurrent / io.SeekEnd against the known object size. After reading begins, Seek returns an error.
- On first Read, lazily opens the underlying stream at the offset, then reads sequentially.
- Close tears down the underlying stream.
This helper is shared by the S3 and Remote backends (DRY).
Backend changes
- disk (disk.go): return *os.File directly, which is already an io.ReadSeekCloser. Signature only.
- memory (memory.go): wrap the existing *bytes.Reader (already seekable) in a no-op-close wrapper instead of io.NopCloser.
- noop (noop.go): signature only (always returns a cache miss).
- s3 (s3.go): implement the helper's "open at offset" using the existing parallelGet / GetObject path, starting from the seek offset instead of 0.
- remote (remote.go + client/*.go): implement "open at offset" via a ranged GET. This requires:
  - a new Range(start) RequestOption in the client package that sets Range: bytes=start-;
  - client.Open accepting 206 Partial Content in addition to 200 OK.
Server-side Range support
Add single-range support to httputil.ServeCacheHit (shared by the API handler and the generic caching handler). Because the S3/Remote readers only support seek-to-start (not seek-to-end), parse the Range header manually rather than using http.ServeContent (which probes the end via Seek(0, io.SeekEnd)):
- Use the existing Content-Length header for the object size (no seek-to-end).
- For a satisfiable single range: Seek(start, io.SeekStart) once, then io.CopyN, emitting 206 Partial Content, Content-Range, Content-Length, and Accept-Ranges: bytes.
- For an unsatisfiable range: 416 Range Not Satisfiable with Content-Range: bytes */size.
- No range / full request: behave as today (advertise Accept-Ranges: bytes).
- Preserve existing conditional (If-Match / If-None-Match) handling.
This change powers both the API endpoint and, transitively, the Remote backend's ranged reads.
Tiered cache behaviour
The tiered backfill must not commit a truncated object when a range request reads only a slice.
- Full sequential read from a higher tier: keep today's free tee-backfill into tier 0 (no extra GET).
- Ranged read from a higher tier (a non-trivial Seek): abandon the tee (cancel the tier-0 write so the partial entry is discarded) and kick off a singleton full copy: a background, request-independent (context.WithoutCancel) download of the whole object from the hitting tier into tier 0, deduplicated so N concurrent range readers trigger at most one copy.
- A bytes=0- whole-object range (Seek to current position 0 before any read) is treated as a no-op and keeps the cheap tee path.
Mechanics:
- backfillReadCloser becomes seekable and tracks bytes read. Seek to the current position before reading delegates to the source and keeps teeing; any other Seek cancels the tee, fires the singleton-copy trigger once, then delegates the seek to the source.
- Singleton copy dedup lives on Tiered via a shared *sync.Map keyed by namespace + "/" + key. Since Tiered.Namespace() returns a fresh value per request, this map (and a namespace field) must be carried through Namespace() by pointer so dedup spans requests.
- On trigger: LoadOrStore the key; if present, no-op. Otherwise spawn a goroutine that re-Opens the object from the hitting tier (full, unseeked read), writes it to tier 0 via WriteFunc, and deletes the dedup entry on completion. Errors are logged, not returned (best-effort warming).
Consequence: a ranged read against a cold local tier causes two reads from the higher tier (the range plus the deduplicated background full copy). This is the cost of warming tier 0 on range access.
Tests
- S3 seekable reader: seek-then-sequential-read, error on seek-after-read.
- ServeCacheHit ranges: 206 + Content-Range, 416 unsatisfiable, Accept-Ranges advertised, full request unchanged.
- Remote range round-trip (client Range option + 206 handling end-to-end).
- Tiered: ranged read does not commit a truncated tier-0 entry; ranged read triggers a (deduplicated) full singleton copy that warms tier 0; full read still tees as before.
- Add a Range case to the cachetest suite so every backend is exercised.
Validation
- just tasks / go test ./...
- linters (golangci-lint via the repo's just target)
Out of scope / notes
- Multi-range (multipart/byteranges) responses are not supported; only single ranges.
- The "warm tier 0 on range access" copy is best-effort and fire-and-forget.