Commit e7b822a

docs(pii): describe Presidio as a standalone service, not a sidecar
Presidio now runs as its own ECS service (and, in Helm, its own Deployment + Service) reached over the network via PII_URL — not a sidecar in the app task. Update README, code comments, env docs, Dockerfiles, and the Helm chart docs to match, and note the deploy requirement that PII_URL must be reachable.
1 parent 965eb65 commit e7b822a

17 files changed

Lines changed: 77 additions & 68 deletions

apps/sim/app/api/guardrails/mask-batch/route.ts

Lines changed: 5 additions & 4 deletions
@@ -11,9 +11,10 @@ const logger = createLogger('GuardrailsMaskBatchAPI')
 
 /**
  * Internal batch PII masking. The log-redaction persist path runs in both the
- * Next.js server and the trigger.dev runtime, but the Presidio sidecars live only
- * in the app task — so redaction calls this endpoint server-to-server (internal
- * JWT) to keep Presidio centralized here.
+ * Next.js server and the trigger.dev runtime, but only the app task reaches the
+ * Presidio service (it holds `PII_URL` and the internal-network access) — so
+ * redaction calls this endpoint server-to-server (internal JWT) to keep the
+ * Presidio call centralized here.
  */
 export const POST = withRouteHandler(async (request: NextRequest) => {
   const auth = await checkInternalAuth(request, { requireWorkflowId: false })
@@ -35,7 +36,7 @@ export const POST = withRouteHandler(async (request: NextRequest) => {
     })
     return NextResponse.json({ masked })
   } catch (error) {
-    // An unreachable/misconfigured Presidio sidecar makes maskPIIBatch throw; fail
+    // An unreachable/misconfigured Presidio service makes maskPIIBatch throw; fail
     // loudly here (the caller scrubs to REDACTION_FAILED, so PII is never leaked).
     logger.error('PII batch masking failed', {
       error: getErrorMessage(error),
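
The comment in this hunk says the caller scrubs to `REDACTION_FAILED` when masking throws. A minimal sketch of that fail-closed pattern follows. It is not the repository's implementation: `maskOrScrub`, the `MaskFn` type, and the sentinel's literal value are hypothetical names chosen for illustration, and the masking function is injected so the sketch stays self-contained.

```typescript
// Hypothetical sentinel value; the source only names it REDACTION_FAILED.
const REDACTION_FAILED = '[REDACTION_FAILED]'

// Assumed shape of a batch masker: same number of strings out as in.
type MaskFn = (texts: string[]) => Promise<string[]>

/**
 * Mask every string, or replace all of them with the sentinel if masking
 * throws. The raw input is never returned on the failure path.
 */
async function maskOrScrub(texts: string[], mask: MaskFn): Promise<string[]> {
  try {
    const masked = await mask(texts)
    // Treat a length mismatch as a failure too, so no string goes unmasked.
    if (masked.length !== texts.length) {
      throw new Error('mask result length mismatch')
    }
    return masked
  } catch {
    return texts.map(() => REDACTION_FAILED)
  }
}
```

The design point is that the error path discards the originals rather than passing them through, which is what makes an unreachable Presidio service safe for log redaction.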

apps/sim/lib/core/config/env.ts

Lines changed: 2 additions & 2 deletions
@@ -325,8 +325,8 @@ export const env = createEnv({
     PORT: z.number().optional(), // Main application port
     INTERNAL_API_BASE_URL: z.string().optional(), // Optional internal base URL for server-side self-calls; must include protocol if set (e.g., http://sim-app.namespace.svc.cluster.local:3000)
     ALLOWED_ORIGINS: z.string().optional(), // CORS allowed origins
-    PII_URL: z.string().optional(), // Presidio PII sidecar base URL serving /analyze + /anonymize (default http://localhost:5001)
-    PII_MASK_CHUNK_CONCURRENCY: z.coerce.number().int().positive().optional(), // Max in-flight mask-batch requests per redaction (default 4); raise for a scaled Presidio service, lower to 1 for a single sidecar
+    PII_URL: z.string().optional(), // Presidio PII service base URL serving /analyze + /anonymize (standalone ECS service; default http://localhost:5001 for local dev)
+    PII_MASK_CHUNK_CONCURRENCY: z.coerce.number().int().positive().optional(), // Max in-flight mask-batch requests per redaction (default 4); raise for a scaled-out Presidio service, lower to 1 for a single instance
 
     // OAuth Integration Credentials - All optional, enables third-party integrations
     GOOGLE_CLIENT_ID: z.string().optional(), // Google OAuth client ID for Google services

apps/sim/lib/guardrails/README.md

Lines changed: 21 additions & 14 deletions
@@ -19,29 +19,36 @@ For **hallucination detection**, you'll need:
 - A knowledge base with documents
 - An LLM provider API key (or use hosted models)
 
-### PII Detection (Presidio sidecar)
+### PII Detection (Presidio service)
 
-PII detection runs against **one** long-lived Presidio sidecar — a combined service (built from
-`docker/pii.Dockerfile`, source in `apps/pii/server.py`) that constructs a warm `AnalyzerEngine` +
-`AnonymizerEngine` once and exposes both `/analyze` and `/anonymize` (plus `/health`) on a single
-port. In deployment it runs alongside the app container in the same ECS task; locally, build and run
-it:
+PII detection runs against a **standalone Presidio service** — a combined analyzer + anonymizer
+(built from `docker/pii.Dockerfile`, source in `apps/pii/server.py`) that constructs a warm
+`AnalyzerEngine` + `AnonymizerEngine` once and exposes `/analyze`, `/anonymize`, and `/health` on a
+single port. In deployment it is its **own ECS service** (a dedicated task/service, not a sidecar in
+the app task), reached over the network via `PII_URL` and scaled independently of the app. The app
+(both the Next.js server and the trigger.dev runtime) is a thin HTTP client (`validate_pii.ts`) — no
+Python, no local venv.
+
+Locally, build and run it as a container:
 
 ```bash
 docker build -f docker/pii.Dockerfile -t sim-pii .
 docker run -d -p 5001:5001 sim-pii
 ```
 
-Point the app at it (default shown):
+Point the app at it with `PII_URL`:
 
-```bash
-PII_URL=http://localhost:5001
-```
+- **Local**: `PII_URL=http://localhost:5001` (the default)
+- **Deployed**: `PII_URL` points to the Presidio ECS service's internal endpoint (service-discovery
+  DNS / internal load balancer) — never `localhost`, since the service runs in a separate task
 
 The image bakes in the recognizers itself — a check-digit-validated **VIN** recognizer and
-multi-language NLP models (en/es/it/pl/fi) — so the app is a thin HTTP client (`validate_pii.ts`) with
-no Python or local venv. The redaction language is configured per rule (Data Retention) and defaults
-to English.
+multi-language NLP models (en/es/it/pl/fi). The redaction language is configured per rule (Data
+Retention) and defaults to English.
+
+> **Deploy requirement:** the execution-altering redaction stages (workflow input + block outputs)
+> fail-fast and abort a run if the Presidio service is unreachable. Every environment that can run
+> workflows must have a reachable Presidio service at `PII_URL`.
 
 ## Usage
 
@@ -100,7 +107,7 @@ See [Presidio documentation](https://microsoft.github.io/presidio/supported_enti
 - `validate_json.ts` - JSON validation (TypeScript)
 - `validate_regex.ts` - Regex validation (TypeScript)
 - `validate_hallucination.ts` - Hallucination detection with RAG + LLM scoring (TypeScript)
-- `validate_pii.ts` - PII detection client: calls the Presidio sidecar's /analyze + /anonymize (TypeScript)
+- `validate_pii.ts` - PII detection client: calls the Presidio service's /analyze + /anonymize (TypeScript)
 - `pii-entities.ts` - Client-safe PII entity + language catalog (shared by the block and Data Retention)
 - `mask-client.ts` - Internal HTTP client for batch PII masking from the log-redaction persist path
 - `validate.test.ts` - Test suite for JSON and regex validators
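
The README change above requires a reachable Presidio service at `PII_URL` and names a `/health` endpoint. One way a deploy-time smoke check could look is sketched below. It is an assumption-laden illustration, not code from this commit: `isPresidioReachable` and `FetchLike` are hypothetical names, the response body of `/health` is not inspected because the source does not describe it, and the fetch function is injected so the check can be exercised without a network.

```typescript
// Minimal structural type for the part of fetch this sketch relies on.
type FetchLike = (url: string) => Promise<{ ok: boolean }>

/**
 * Returns true only if GET {baseUrl}/health answers with a 2xx status.
 * Network errors (for example, connection refused) are reported as false.
 */
async function isPresidioReachable(baseUrl: string, fetchImpl: FetchLike): Promise<boolean> {
  try {
    // Drop trailing slashes so the path is not doubled.
    const url = `${baseUrl.replace(/\/+$/, '')}/health`
    const response = await fetchImpl(url)
    return response.ok
  } catch {
    return false
  }
}
```

In a Node 18+ runtime the global `fetch` can be passed as `fetchImpl`, with `baseUrl` taken from `PII_URL` and falling back to `http://localhost:5001` only for local development.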

apps/sim/lib/guardrails/mask-client.ts

Lines changed: 7 additions & 7 deletions
@@ -8,19 +8,19 @@ import { chunkIndicesByBudget } from '@/lib/guardrails/pii-batching'
 /**
  * Max in-flight mask-batch requests per call. Each request is a CPU-heavy NER
  * batch, so a single Presidio instance is easily saturated — default 4, raise it
- * via `PII_MASK_CHUNK_CONCURRENCY` for a scaled/load-balanced service, or set 1
- * for a single sidecar. No request timeout: masking a large batch is slow and the
- * (scaled) Presidio service is expected to eventually respond; an unreachable
- * sidecar still rejects fast (connection refused) so the caller scrubs.
+ * via `PII_MASK_CHUNK_CONCURRENCY` for a scaled-out/load-balanced service, or set
+ * 1 for a single instance. No request timeout: masking a large batch is slow and
+ * the (scaled) Presidio service is expected to eventually respond; an unreachable
+ * service still rejects fast (connection refused) so the caller scrubs.
  */
 const CHUNK_CONCURRENCY = env.PII_MASK_CHUNK_CONCURRENCY ?? 4
 
 /**
  * Mask PII across many strings via the internal app-container endpoint.
  *
- * The Presidio sidecars run only in the app task, but the log-redaction persist
- * path also runs inside the trigger.dev runtime — so redaction always routes
- * through HTTP, the same way the guardrails tool does.
+ * Only the app task reaches the Presidio service (it holds `PII_URL`), but the
+ * log-redaction persist path also runs inside the trigger.dev runtime — so
+ * redaction always routes through HTTP, the same way the guardrails tool does.
  * Strings are grouped into byte/count-budgeted chunks (keeping each request far
  * under the 10MB Next body limit) and the chunks are sent with bounded
  * concurrency, so a large payload fans out rather than serializing; order is
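
The doc comment above describes sending chunks with bounded concurrency while keeping results in input order. A small worker-pool sketch of that idea follows. The helper name `mapWithConcurrency` is hypothetical and the repository may implement this differently; the sketch only shows one common way to cap in-flight requests at a limit such as `CHUNK_CONCURRENCY`.

```typescript
/**
 * Run `worker` over `items` with at most `limit` calls in flight.
 * Results are written by index, so output order matches input order.
 * If any call rejects, the returned promise rejects (the batch fails as a whole).
 */
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length)
  let next = 0 // index of the next unclaimed item, shared by all runners

  async function run(): Promise<void> {
    while (next < items.length) {
      const index = next++ // claim synchronously, before awaiting
      results[index] = await worker(items[index], index)
    }
  }

  // Start at most `limit` runners; each pulls items until none remain.
  const runnerCount = Math.max(1, Math.min(limit, items.length))
  await Promise.all(Array.from({ length: runnerCount }, () => run()))
  return results
}
```

With a limit of 1 this degrades to sequential requests, which matches the guidance to set the concurrency to 1 for a single Presidio instance.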

apps/sim/lib/guardrails/pii-batching.ts

Lines changed: 2 additions & 2 deletions
@@ -1,8 +1,8 @@
 /**
  * Per-request bounds shared by both Presidio hops: the app→route HTTP call
- * (`mask-client`) and the route→sidecar call (`validate_pii`). Keeping a single
+ * (`mask-client`) and the route→service call (`validate_pii`). Keeping a single
  * source of truth ensures every request stays far under the 10MB Next body limit
- * and small enough for one short spaCy NER pass under the sidecar timeout.
+ * and small enough for one short spaCy NER pass per Presidio request.
  */
 
 /** Max UTF-8 bytes of text per Presidio request. ~40× under the 10MB Next limit. */
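
To make the byte/count budgeting concrete, here is a hedged sketch of grouping string indices into chunks under a UTF-8 byte budget and an item-count budget. It is not the body of `chunkIndicesByBudget`, which this diff does not show; `chunkIndicesSketch` and its parameters are hypothetical, and the real limits live in `pii-batching.ts`.

```typescript
/**
 * Greedily group indices of `texts` so each chunk holds at most `maxCount`
 * strings and at most `maxBytes` UTF-8 bytes. A single string larger than
 * `maxBytes` still gets a chunk of its own (this sketch does not split strings).
 */
function chunkIndicesSketch(texts: string[], maxBytes: number, maxCount: number): number[][] {
  const encoder = new TextEncoder()
  const chunks: number[][] = []
  let current: number[] = []
  let currentBytes = 0

  for (let index = 0; index < texts.length; index++) {
    // Measure encoded bytes, not string length: 'é' is 1 char but 2 bytes.
    const bytes = encoder.encode(texts[index]).length
    const overBytes = currentBytes + bytes > maxBytes
    const overCount = current.length >= maxCount
    if (current.length > 0 && (overBytes || overCount)) {
      chunks.push(current)
      current = []
      currentBytes = 0
    }
    current.push(index)
    currentBytes += bytes
  }

  if (current.length > 0) chunks.push(current)
  return chunks
}
```

Returning indices rather than the strings themselves lets the caller write masked results back into their original positions, which is how input order can be preserved across chunks.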

apps/sim/lib/guardrails/validate_pii.test.ts

Lines changed: 2 additions & 2 deletions
@@ -27,7 +27,7 @@ function emailSpans(text: string, entities: string[] | undefined): Span[] {
   return idx === -1 ? [] : [{ entity_type: 'EMAIL_ADDRESS', start: idx, end: idx + 7, score: 0.9 }]
 }
 
-describe('validate_pii (Presidio sidecar)', () => {
+describe('validate_pii (Presidio service)', () => {
   let analyzeBodies: Array<{ text: string; language: string; entities?: string[] }>
   let fetchMock: ReturnType<typeof vi.fn>
 
@@ -87,7 +87,7 @@ describe('validate_pii (Presidio sidecar)', () => {
     expect(await maskPIIBatch([''], [])).toEqual([''])
   })
 
-  it('throws on a sidecar failure so the caller can scrub', async () => {
+  it('throws on a service failure so the caller can scrub', async () => {
     fetchMock.mockResolvedValueOnce(new Response('boom', { status: 500 }))
     await expect(maskPIIBatch(['email a@b.com'], [])).rejects.toThrow(/Presidio analyze failed/)
   })

apps/sim/lib/guardrails/validate_pii.ts

Lines changed: 14 additions & 14 deletions
@@ -7,14 +7,14 @@ import { chunkIndicesByBudget } from '@/lib/guardrails/pii-batching'
 const logger = createLogger('PIIValidator')
 
 /**
- * Concurrent chunk requests in flight. Each chunk is itself a batched sidecar call
- * (spaCy `nlp.pipe` over many strings), so a small concurrency keeps the single-model
- * sidecar from holding too many parallel docs in memory while still overlapping
- * HTTP/JSON with the next chunk's NER.
+ * Concurrent chunk requests in flight. Each chunk is itself a batched service call
+ * (spaCy `nlp.pipe` over many strings), so a small concurrency keeps a single-model
+ * Presidio instance from holding too many parallel docs in memory while still
+ * overlapping HTTP/JSON with the next chunk's NER.
  */
 const CHUNK_CONCURRENCY = 4
 
-/** Single Presidio sidecar serving both /analyze and /anonymize (VIN is native there). */
+/** Presidio service serving both /analyze and /anonymize (VIN is native there). */
 const PII_URL = env.PII_URL || 'http://localhost:5001'
 
 export interface PIIValidationInput {
@@ -58,7 +58,7 @@ async function analyze(
 ): Promise<AnalyzerSpan[]> {
   const entities = entityTypes.length > 0 ? entityTypes : undefined
 
-  // boundary-raw-fetch: internal call to the Presidio analyzer sidecar over localhost
+  // boundary-raw-fetch: internal call to the Presidio analyzer service via PII_URL
   const response = await fetch(`${PII_URL}/analyze`, {
     method: 'POST',
     headers: { 'content-type': 'application/json' },
@@ -83,7 +83,7 @@ async function analyzeBatch(
 ): Promise<AnalyzerSpan[][]> {
   const entities = entityTypes.length > 0 ? entityTypes : undefined
 
-  // boundary-raw-fetch: internal call to the Presidio analyzer sidecar over localhost
+  // boundary-raw-fetch: internal call to the Presidio analyzer service via PII_URL
   const response = await fetch(`${PII_URL}/analyze_batch`, {
     method: 'POST',
     headers: { 'content-type': 'application/json' },
@@ -110,7 +110,7 @@ interface AnonymizeBatchItem {
 async function anonymizeBatch(items: AnonymizeBatchItem[]): Promise<string[]> {
   if (items.length === 0) return []
 
-  // boundary-raw-fetch: internal call to the Presidio anonymizer sidecar over localhost
+  // boundary-raw-fetch: internal call to the Presidio anonymizer service via PII_URL
   const response = await fetch(`${PII_URL}/anonymize_batch`, {
     method: 'POST',
     headers: { 'content-type': 'application/json' },
@@ -125,13 +125,13 @@ async function anonymizeBatch(items: AnonymizeBatchItem[]): Promise<string[]> {
 }
 
 /**
- * Mask spans via the Presidio anonymizer sidecar. Omitting `anonymizers` uses the
+ * Mask spans via the Presidio anonymizer service. Omitting `anonymizers` uses the
  * default `replace` operator, which yields `<ENTITY_TYPE>`. Throws on failure.
  */
 async function anonymize(text: string, spans: AnalyzerSpan[]): Promise<string> {
   if (spans.length === 0) return text
 
-  // boundary-raw-fetch: internal call to the Presidio anonymizer sidecar over localhost
+  // boundary-raw-fetch: internal call to the Presidio anonymizer service via PII_URL
   const response = await fetch(`${PII_URL}/anonymize`, {
     method: 'POST',
     headers: { 'content-type': 'application/json' },
@@ -146,7 +146,7 @@ async function anonymize(text: string, spans: AnalyzerSpan[]): Promise<string> {
 }
 
 /**
- * Validate text for PII using the Presidio sidecar.
+ * Validate text for PII using the Presidio service.
  *
  * - block: fails validation if any PII is detected
  * - mask: passes and returns masked text with PII replaced by `<ENTITY_TYPE>`
@@ -209,14 +209,14 @@ export async function validatePII(input: PIIValidationInput): Promise<PIIValidat
 }
 
 /**
- * Mask PII across many strings via the Presidio sidecar, preserving input order.
+ * Mask PII across many strings via the Presidio service, preserving input order.
  *
  * Strings are grouped into byte/count-budgeted chunks (see {@link chunkIndicesByBudget}),
  * and each chunk runs one batched `analyze` pass followed by one batched `anonymize`
- * pass over only the strings that actually matched — so the sidecar round-trip count
+ * pass over only the strings that actually matched — so the service round-trip count
  * scales with payload size, not leaf count, and spaCy batches NER via `nlp.pipe`.
  * Chunks run with bounded concurrency. Strings with no detected PII pass through
- * unchanged. Rejects on any sidecar failure (which fails the whole batch) so callers
+ * unchanged. Rejects on any service failure (which fails the whole batch) so callers
  * can apply their own fail-safe (scrub).
  */
 export async function maskPIIBatch(
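
The comments in this file say the default `replace` operator yields `<ENTITY_TYPE>` for each detected span. The sketch below shows what that substitution looks like on the client side, purely as an illustration: the real masking happens inside the Presidio anonymizer, `applyReplaceSketch` is a hypothetical helper, and the `Span` shape mirrors the one used in the test file above. It assumes spans do not overlap and that offsets index the string the same way JavaScript does, which may not hold for text containing characters outside the Basic Multilingual Plane.

```typescript
// Span shape as it appears in validate_pii.test.ts.
interface Span {
  entity_type: string
  start: number
  end: number
  score: number
}

/**
 * Replace each span with `<ENTITY_TYPE>`. Spans are applied from the end of
 * the string backwards so earlier offsets stay valid after each replacement.
 * Assumes non-overlapping spans; overlap resolution is left to Presidio.
 */
function applyReplaceSketch(text: string, spans: Span[]): string {
  const ordered = [...spans].sort((a, b) => b.start - a.start)
  let out = text
  for (const span of ordered) {
    out = out.slice(0, span.start) + `<${span.entity_type}>` + out.slice(span.end)
  }
  return out
}
```

Working right to left avoids having to re-compute offsets, since a replacement only shifts the text after it.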

docker/app.Dockerfile

Lines changed: 3 additions & 2 deletions
@@ -125,8 +125,9 @@ COPY --from=builder --chown=nextjs:nodejs /app/apps/sim/lib/execution/isolated-v
 # apps/sim/lib/execution/sandbox/bundles/build.ts to regenerate.
 COPY --from=builder --chown=nextjs:nodejs /app/apps/sim/lib/execution/sandbox/bundles ./apps/sim/lib/execution/sandbox/bundles
 
-# Guardrails PII runs in dedicated Presidio sidecar containers (analyzer +
-# anonymizer), reached over localhost — no Python/Presidio in this image.
+# Guardrails PII runs in a standalone Presidio service (combined analyzer +
+# anonymizer, docker/pii.Dockerfile), reached over the network via PII_URL —
+# no Python/Presidio in this image.
 
 # Create .next/cache directory with correct ownership
 RUN mkdir -p apps/sim/.next/cache && \

docker/pii.Dockerfile

Lines changed: 2 additions & 2 deletions
@@ -38,8 +38,8 @@ RUN groupadd -g 1001 pii && \
     chown -R pii:pii /app
 USER pii
 
-# Listen on 5001. In the ECS task all containers share one network namespace
-# (awsvpc) and the app owns 3000, so this sidecar must not use 3000.
+# Listen on 5001. Runs as its own ECS service (separate task), reached via PII_URL;
+# 5001 avoids colliding with the app's 3000 in local/compose runs on one host.
 EXPOSE 5001
 
 # start-period is generous: five large spaCy models load at import before

helm/sim/README.md

Lines changed: 3 additions & 3 deletions
@@ -48,7 +48,7 @@ Optional components (off by default):
 
 * **`copilot`** — the Sim Copilot service plus its own Postgres StatefulSet.
 * **`ollama`** — local LLM inference, with optional NVIDIA GPU support.
-* **`pii`** — Presidio PII redaction sidecar (analyzer + anonymizer) for the Guardrails PII block and log redaction. See [PII redaction](#pii-redaction).
+* **`pii`** — Presidio PII redaction service (analyzer + anonymizer) for the Guardrails PII block and log redaction. See [PII redaction](#pii-redaction).
 * **`telemetry`** — OpenTelemetry Collector wired to Jaeger / Prometheus / OTLP backends.
 * **`ingress`** — NGINX-style Ingress for the app and realtime services.
 * **`networkPolicy`** — east-west and egress isolation (blocks cloud metadata endpoints by default).
@@ -357,14 +357,14 @@ Requires the Prometheus Operator CRDs. Scrapes `/metrics` on the app and realtim
 
 ## PII redaction
 
-Sim can redact personally identifiable information using a [Presidio](https://microsoft.github.io/presidio/) sidecar (analyzer + anonymizer combined into one image listening on port 5001). Enable it with:
+Sim can redact personally identifiable information using a [Presidio](https://microsoft.github.io/presidio/) service (analyzer + anonymizer combined into one image listening on port 5001). Enable it with:
 
 ```yaml
 pii:
   enabled: true
 ```
 
-When enabled, the chart deploys the sidecar (`<release>-pii` Deployment + Service) and **auto-wires** `PII_URL` on the app to the in-cluster service. The sidecar bundles five large spaCy models (en/es/it/pl/fi, ~2.2GB), so the first start takes ~3 minutes while models load — the `startupProbe` allows for this. Size the `pii.resources` for at least ~4Gi memory.
+When enabled, the chart deploys it as a standalone `<release>-pii` Deployment + Service and **auto-wires** `PII_URL` on the app to the in-cluster service. The service bundles five large spaCy models (en/es/it/pl/fi, ~2.2GB), so the first start takes ~3 minutes while models load — the `startupProbe` allows for this. Size the `pii.resources` for at least ~4Gi memory.
 
 This alone powers the **Guardrails PII block** and on-demand masking. To additionally turn on **automatic log redaction** (the org/workspace data-retention scrub), you must:
 
