10 changes: 7 additions & 3 deletions README.md
@@ -16,7 +16,7 @@
Tavern is a high-performance HTTP caching proxy server implemented in Go. It leverages a modern service framework to deliver a flexible architecture, strong extensibility, and excellent performance.
</p>

-Other languages: [简体中文](README.zh-CN.md)
+Other languages: [简体中文](README.zh-CN.md)

## ✨ Features

@@ -25,9 +25,9 @@ Other languages: [简体中文](README.zh-CN.md)
- [x] Cache Push (URL/DIR Push)
- [x] URL mark expired
- [x] URL cache file delete
-- [ ] DIR mark expired
+- [x] DIR mark expired
- [x] DIR cache file delete
-- [ ] Fuzzy refresh (Fuzzing fetch)
+- [x] Fuzzy refresh (Fuzzing fetch)
- [x] Auto refresh
- [x] Cache validation
- [ ] Hot migration
@@ -102,6 +102,10 @@ Once started, you can monitor and debug using the following (ports depend on `co
- `server/`: HTTP server implementation and middleware
- `storage/`: Storage engine abstractions and implementations

+## 📚 Documentation
+
+- PURGE design and operations: [docs/purge.md](docs/purge.md)
+
## 📝 License

[MIT License](LICENSE)
14 changes: 7 additions & 7 deletions README.zh-CN.md
@@ -16,18 +16,18 @@
Tavern 是一个 Go 实现的高性能 HTTP 缓存代理服务器,旨在利用现代化的服务框架提供更灵活的架构、更强的扩展性以及更优秀的性能。
</p>

-其他语言: [English](README.md)
+其他语言: [English](README.md)

## ✨ 特性 (Features)

- **核心缓存能力**:
- [x] 缓存预取 (Prefetch)
- [x] 缓存推送 (URL/DIR Push)
-- [x] URL标记过期 (Mark Expired)
-- [x] URL缓存文件删除 (CacheFile Delete)
-- [ ] DIR标记过期 (DirPath Mark Expired)
-- [x] DIR缓存文件删除 (DirPath Delete)
-- [ ] 模糊刷新 (Fuzzing fetch)
+- [x] URL 标记过期 (Mark Expired)
+- [x] URL 缓存文件删除 (CacheFile Delete)
+- [x] DIR 标记过期 (DirPath Mark Expired)
+- [x] DIR 缓存文件删除 (DirPath Delete)
+- [x] 模糊刷新 (Fuzzing fetch)
- [x] 自动刷新 (Auto Refresh)
- [x] 缓存变更校验 (Cache Validation)
- [ ] 热点迁移 (Hot Migration)
@@ -37,7 +37,7 @@ Tavern 是一个 Go 实现的高性能 HTTP 缓存代理服务器,旨在利用
- [x] Vary 分版本缓存 (Vary Cache)
- [x] 头部重写 (Headers Rewrite)
- [x] 支持 Multiple Range 请求
-- [x] 缓存HASH校验 (CRC checksum/EdgeMode)
+- [x] 缓存 HASH 校验 (CRC checksum/EdgeMode)
- 你可能需要 [缓存校验中心](https://github.com/omalloc/trust-receive) 服务
- **现代化架构**:
- 基于 **Kratos** 框架,提供高扩展、模块复用能力
13 changes: 13 additions & 0 deletions api/defined/v1/storage/storage.go
@@ -3,6 +3,7 @@ package storage
import (
	"context"
	"errors"
+	"fmt"
	"io"

	"github.com/omalloc/tavern/api/defined/v1/storage/object"
@@ -84,6 +85,18 @@ type PurgeControl struct {
	MarkExpired bool `json:"mark_expired"` // whether to mark as expired; default: false; conflicts with Hard
}

+func (r PurgeControl) String() string {
+	mode := "file"
+	if r.Dir {
+		mode = "dir"
+	}
+	expOrHard := "mark_expired"
+	if r.Hard {
+		expOrHard = "hard_del"
+	}
+	return fmt.Sprintf("mode:%s@%s", mode, expOrHard)
+}

var ErrSharedKVKeyNotFound = errors.New("key not found")

type SharedKV interface {
189 changes: 189 additions & 0 deletions docs/purge.md
@@ -0,0 +1,189 @@
# PURGE Design

Tavern implements cache invalidation via a `purge` plugin plus storage-layer primitives that support both single-object and directory purges. This document details configuration, request semantics, internal flow, storage behavior, and known caveats.

## Overview

- Purpose: Invalidate cached content by URL or by directory prefix.
- Entry point: PURGE HTTP requests intercepted by the `purge` plugin and translated into storage operations.
- Modes:
  - File purge: target a single cached object.
  - Directory purge: target all cached objects whose `storeUrl` path shares a prefix.
- Strategies:
  - Hard: delete cached file(s).
  - MarkExpired: set the expiry to a past time so that the next access triggers revalidation.

## Configuration

Enable the plugin and set options in `config.yaml`:

```yaml
plugins:
  - name: purge
    options:
      allow_hosts:
        - "127.0.0.1"
        - "::1"
      header_name: "Purge-Type" # default: Purge-Type
      log_path: "logs/purge.log" # optional
      threshold: 0 # reserved for queue sizing/backpressure
```

Options:

- allow_hosts: IP allowlist; only requests from these source IPs may PURGE.
- header_name: header used to define purge type; default `Purge-Type`.
- log_path: optional plugin log file path.
- threshold: reserved; not currently used in the request path.
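
As a rough illustration of how an `allow_hosts` check might behave, here is a minimal sketch. The helper name `sourceAllowed` is hypothetical, and exact IP equality is an assumption; the plugin's real matching rules (for example CIDR support) may differ.

```go
package main

import (
	"fmt"
	"net"
)

// sourceAllowed reports whether the IP part of remoteAddr (the "host:port"
// form used by http.Request.RemoteAddr) appears in allowHosts.
// Hypothetical helper: exact IP equality is an assumption, not Tavern's code.
func sourceAllowed(remoteAddr string, allowHosts []string) bool {
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		host = remoteAddr // no port present; treat the whole string as the host
	}
	ip := net.ParseIP(host)
	if ip == nil {
		return false
	}
	for _, allowed := range allowHosts {
		if a := net.ParseIP(allowed); a != nil && a.Equal(ip) {
			return true
		}
	}
	return false
}

func main() {
	allow := []string{"127.0.0.1", "::1"}
	fmt.Println(sourceAllowed("127.0.0.1:51234", allow))
	fmt.Println(sourceAllowed("[::1]:51234", allow))
	fmt.Println(sourceAllowed("10.0.0.8:51234", allow))
}
```

With the allowlist from the configuration above, the two loopback addresses pass and `10.0.0.8` is rejected, which corresponds to the `403 Forbidden` case.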

## Request API

Send PURGE requests to the resource URL you want to invalidate.

- Method: PURGE
- URL: the target resource (absolute URL)
- Headers:
  - `Purge-Type`: controls mode and strategy
    - `file` (default): single-object purge.
    - `dir`: directory/prefix purge.
    - Append `,hard` to perform hard delete. Examples: `dir,hard`, `file,hard`.
  - `i-x-store-url` (optional): override the stored cache key URL used by storage.

Responses:

- 200 OK: purge executed successfully.
- 403 Forbidden: source IP not in allowlist.
- 404 Not Found: object(s) not present in cache.
- 500 Internal Server Error: internal error while processing purge.

Examples:

```bash
# Single file: mark-expired (soft)
curl -X PURGE http://example.com/static/js/main.js

# Single file: hard delete
curl -X PURGE -H "Purge-Type: file,hard" http://example.com/static/js/main.js

# Directory prefix: mark-expired
curl -X PURGE -H "Purge-Type: dir" http://example.com/static/js/

# Directory prefix: hard delete
curl -X PURGE -H "Purge-Type: dir,hard" http://example.com/static/js/

# Use internal store-url override
curl -X PURGE -H "i-x-store-url: http://example.com/static/js/" -H "Purge-Type: dir,hard" http://example.com/anything
```
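
The same requests can be issued from Go. The sketch below relies only on the standard `net/http` package; the helper names `newPurgeRequest` and `purge` are hypothetical and not part of Tavern.

```go
package main

import (
	"fmt"
	"net/http"
)

// newPurgeRequest builds a PURGE request for target. purgeType is sent in the
// Purge-Type header when non-empty (for example "dir,hard"); an empty value
// leaves the header out, which the server treats as the default file purge.
func newPurgeRequest(target, purgeType string) (*http.Request, error) {
	req, err := http.NewRequest("PURGE", target, nil)
	if err != nil {
		return nil, err
	}
	if purgeType != "" {
		req.Header.Set("Purge-Type", purgeType)
	}
	return req, nil
}

// purge sends the request with the given client and returns the status code.
func purge(client *http.Client, target, purgeType string) (int, error) {
	req, err := newPurgeRequest(target, purgeType)
	if err != nil {
		return 0, err
	}
	resp, err := client.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

func main() {
	// Only build and inspect the request here; call purge(http.DefaultClient, ...)
	// against a running proxy to actually send it.
	req, err := newPurgeRequest("http://example.com/static/js/", "dir,hard")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL, req.Header.Get("Purge-Type"))
}
```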

## Internal Flow

Primary code paths:

- Plugin handler: [plugin/purge/purge.go](plugin/purge/purge.go)
- Storage implementation: [storage/storage.go](storage/storage.go)
- SharedKV implementation: [storage/sharedkv/kv_pebble.go](storage/sharedkv/kv_pebble.go)
- Internal header definitions: [internal/constants/global.go](internal/constants/global.go)

### Plugin: request handling

1. Accept only `PURGE` method; otherwise pass through to next handler.
2. Extract source IP from `RemoteAddr`; enforce allowlist (`allow_hosts`).
3. Compute `storeUrl`:
   - Prefer `i-x-store-url` (`InternalStoreUrl`); otherwise `req.URL.String()`.
4. Parse `Purge-Type` header:
   - First token: `dir` for directory, otherwise file.
   - Second token: `hard` → hard delete; default is soft (MarkExpired).
5. Log request and look up current storage via `storage.Current()`.
6. For directory purge:
   - If no domain counter exists for `u.Host` in SharedKV key `if/domain/<host>`, log and exit early.
   - Call `storage.PURGE(storeUrl, ctrl)` and translate errors to HTTP status (404 for `ErrKeyNotFound`, 500 otherwise).
7. For file purge: call `storage.PURGE()` and translate errors as above.
8. On success, respond `200` with `{"message":"success"}`.
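
Step 4 can be sketched as below. The `purgeControl` struct only mirrors the flags of `storage.PurgeControl`, and `parsePurgeType` is a hypothetical helper; trimming whitespace and ignoring case are assumptions that the real parser may not share.

```go
package main

import (
	"fmt"
	"strings"
)

// purgeControl mirrors the three flags of storage.PurgeControl for this sketch.
type purgeControl struct {
	Dir, Hard, MarkExpired bool
}

// parsePurgeType turns a Purge-Type header value into control flags.
// First token: "dir" selects directory mode, anything else means file mode.
// Second token: "hard" selects hard delete; otherwise the purge is soft.
// Whitespace trimming and case-insensitivity are assumptions of this sketch.
func parsePurgeType(header string) purgeControl {
	parts := strings.Split(header, ",")
	first := strings.ToLower(strings.TrimSpace(parts[0]))
	ctrl := purgeControl{Dir: first == "dir"}
	if len(parts) > 1 && strings.EqualFold(strings.TrimSpace(parts[1]), "hard") {
		ctrl.Hard = true
	}
	ctrl.MarkExpired = !ctrl.Hard // soft purge is the default
	return ctrl
}

func main() {
	for _, h := range []string{"", "file,hard", "dir", "dir,hard"} {
		fmt.Printf("%q -> %+v\n", h, parsePurgeType(h))
	}
}
```

An empty or missing header falls through to the default, a soft purge of a single file.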

### Storage: purge behavior

File purge:

- Resolve `bucket` via selector on `object.ID(storeUrl)`.
- Hard: `bucket.Discard(id)`.
- Soft (MarkExpired): `bucket.Lookup(id)`, set `ExpiresAt` to a past timestamp, and `bucket.Store(md)`.

Directory purge:

- When `Hard` is set, or when the purge is soft but `MarkExpired` is not set:
  - Prefer SharedKV inverted index for fast targeting:
    - Index key schema: `ix/<bucketID>/<storeUrl>`.
    - Value: `object.IDHash` bytes (fixed size).
  - Iterate buckets:
    - For each inverted-index hit: delete file by hash (`DiscardWithHash`) if hard; delete the index mapping; count processed.
    - Fallback scan when no index hits:
      - Iterate bucket metadata; for any `md.ID.Path()` prefixed by `storeUrl`, delete (hard) or mark-expired (soft), and count processed.
  - If processed is zero: return `ErrKeyNotFound`.
- When `Dir` with `MarkExpired` set:
  - Current implementation returns `nil` immediately (no changes). See Caveats.

SharedKV keys used by PURGE:

- `if/domain/<host>`: domain counter (presence used by plugin to gate dir purges).
- `ix/<bucketID>/<storeUrl>`: inverted index mapping to object hashes for efficient dir purges.

## Flowchart

```mermaid
flowchart TD
A[Client PURGE Request] --> B{Method == PURGE?}
B -- no --> C[Pass to next handler]
B -- yes --> D{Source IP in allow_hosts?}
D -- no --> E[403 Forbidden]
D -- yes --> F[Compute storeUrl from i-x-store-url or req.URL]
F --> G[Parse Purge-Type: dir/file, hard/soft]
G --> H{Dir purge?}
H -- no --> I["Storage.PURGE(file)"]
H -- yes --> J["SharedKV has if/domain/<host>?"]
J -- no --> K["Exit early (no response body; logs)"]
J -- yes --> L["Storage.PURGE(dir)"]
I --> M{Error?}
L --> N{ErrKeyNotFound?}
M -- yes --> O[404 if not found; 500 otherwise]
M -- no --> P[200 OK]
N -- yes --> Q[404 Not Found]
N -- no --> P[200 OK]
```

Storage.PURGE(dir) detail:

```mermaid
flowchart TB
S["Dir purge"] --> T{"MarkExpired?"}
T -- yes --> U["Add dir purge task in KV"]
T -- no --> V["Iterate buckets"]
V --> W["Use SharedKV ix/<bucketID>/<storeUrl>"]
W --> X{"Has hits?"}
X -- yes --> Y["DiscardWithHash if hard; delete index"]
X -- no --> Z["Fallback: scan indexdb and match path prefix"]
Y --> AA["processed++"]
Z --> AB["Hard: Discard; Soft: set ExpiresAt"]
AB --> AC["processed++"]
AC --> AD{"processed == 0?"}
AD -- yes --> AE["ErrKeyNotFound"]
AD -- no --> AF["Return nil"]
```

## Caveats & Notes

- Dir purge with `MarkExpired`: The current storage implementation returns success (`nil`) without marking entries expired. A code comment suggests a fallback full scan for soft purges, but that scan is never executed because of the early return. If you rely on soft dir purge, consider using `dir,hard` or updating the implementation.
- Plugin response for dir purge when the `if/domain/<host>` counter is missing: the handler logs and returns without writing a response body; clients may see a `200` or connection close depending on upstream wrapping. Validate behavior in your deployment.
- Inverted index population: Ensure your storage buckets populate `ix/<bucketID>/<storeUrl>` keys to leverage fast dir purges; otherwise the fallback scan is used.

## Operational Guidance

- Use `allow_hosts` to restrict purge sources to trusted control planes.
- Prefer `hard` for immediate deletions when correctness is paramount; use soft (MarkExpired) to trigger revalidation while retaining metadata.
- For large dir purges, ensure SharedKV is healthy; fallback scans may be expensive.

## Related Files

- [plugin/purge/purge.go](plugin/purge/purge.go)
- [storage/storage.go](storage/storage.go)
- [internal/constants/global.go](internal/constants/global.go)
- [storage/sharedkv/kv_pebble.go](storage/sharedkv/kv_pebble.go)
8 changes: 5 additions & 3 deletions main.go
@@ -37,6 +37,7 @@ import (
	"github.com/omalloc/tavern/proxy"
	"github.com/omalloc/tavern/server"
	"github.com/omalloc/tavern/storage"
+	"github.com/omalloc/tavern/storage/marked"
)

var (
@@ -113,11 +114,12 @@ func newApp(bc *conf.Bootstrap, logger log.Logger) (*kratos.App, error) {
	}

	// init storage
-	st, err := storage.New(bc.Storage, log.GetLogger())
+	store, err := storage.New(bc.Storage, log.GetLogger())
	if err != nil {
		log.Fatalf("failed to initialize storage: %v", err)
	}
-	storage.SetDefault(st)
+	store = marked.WrapStorage(store, marked.NewSharedKVChecker(store.SharedKV()))
+	storage.SetDefault(store)

	// init upstream
	nodes := make([]selector.Node, 0, len(bc.Upstream.Address))
@@ -184,7 +186,7 @@ func newApp(bc *conf.Bootstrap, logger log.Logger) (*kratos.App, error) {
	log.Info("tavern sig event SIGUSR2")

	// close all db
-	if err := st.Close(); err != nil {
+	if err := store.Close(); err != nil {
		log.Errorf("failed to close storage: %s", err)
	}

3 changes: 3 additions & 0 deletions pkg/e2e/e2e.go
@@ -85,6 +85,9 @@ func New(caseUrl string, srcHandler http.HandlerFunc) *E2E {
}

func (e *E2E) Do(rewrite func(r *http.Request)) (*http.Response, error) {
+	// wait briefly so the server has time to become ready
+	time.Sleep(time.Millisecond * 100)
+
	rewrite(e.req)

	method := e.req.Method
8 changes: 6 additions & 2 deletions pkg/e2e/e2e_file.go
@@ -40,12 +40,16 @@ func SumMD5(buf []byte) string {
	return hex.EncodeToString(h.Sum(nil))
}

-func DiscardBody(resp *http.Response) int {
+func DiscardBody(resp *http.Response, readSpeedKbps int) int {
	if resp == nil || resp.Body == nil {
		return 0
	}
+	if readSpeedKbps <= 0 {
+		readSpeedKbps = 5 * 1024 // 5MB/s
+	}

-	n, _ := io.Copy(io.Discard, resp.Body)
+	// n, _ := io.Copy(io.Discard, resp.Body)
+	n, _ := io.Copy(io.Discard, iobuf.NewRateLimitReader(resp.Body, readSpeedKbps))
	_ = resp.Body.Close()
	return int(n)
}