docs(pii): describe Presidio as a standalone service, not a sidecar
Presidio now runs as its own ECS service (and, in Helm, its own Deployment +
Service) reached over the network via PII_URL — not a sidecar in the app task.
Update README, code comments, env docs, Dockerfiles, and the Helm chart docs to
match, and note the deploy requirement that PII_URL must be reachable.
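As a rough illustration of the "PII_URL must be reachable" requirement, the sketch below resolves the base URL and polls until a probe succeeds or a timeout elapses. The helper names (`resolvePiiUrl`, `waitUntilReachable`) and the probe callback are hypothetical and not part of this change; only the `PII_URL` variable and its local-dev default come from the env docs.

```typescript
// Hypothetical helpers for illustration; not code from this commit.

// Resolve the Presidio base URL, falling back to the documented local-dev default.
function resolvePiiUrl(env: Record<string, string | undefined>): string {
  return (env.PII_URL ?? "http://localhost:5001").replace(/\/+$/, "");
}

type Probe = () => Promise<boolean>;

// Poll a caller-supplied probe until it reports success or the timeout elapses.
async function waitUntilReachable(
  probe: Probe,
  opts: { intervalMs: number; timeoutMs: number },
): Promise<boolean> {
  const deadline = Date.now() + opts.timeoutMs;
  for (;;) {
    try {
      if (await probe()) return true;
    } catch {
      // Treat network errors (connection refused, DNS failure) as "not ready yet".
    }
    // Give up if another full interval would overrun the deadline.
    if (Date.now() + opts.intervalMs > deadline) return false;
    await new Promise<void>((resolve) => setTimeout(resolve, opts.intervalMs));
  }
}
```

The probe is left to the caller on purpose: it would issue a request against whichever path the deployed Presidio image answers on, so the sketch does not assume a particular health endpoint.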
  PORT: z.number().optional(), // Main application port
  INTERNAL_API_BASE_URL: z.string().optional(), // Optional internal base URL for server-side self-calls; must include protocol if set (e.g., http://sim-app.namespace.svc.cluster.local:3000)
  ALLOWED_ORIGINS: z.string().optional(), // CORS allowed origins
- PII_MASK_CHUNK_CONCURRENCY: z.coerce.number().int().positive().optional(), // Max in-flight mask-batch requests per redaction (default 4); raise for a scaled Presidio service, lower to 1 for a single sidecar
+ PII_URL: z.string().optional(), // Presidio PII service base URL serving /analyze + /anonymize (standalone ECS service; default http://localhost:5001 for local dev)
+ PII_MASK_CHUNK_CONCURRENCY: z.coerce.number().int().positive().optional(), // Max in-flight mask-batch requests per redaction (default 4); raise for a scaled-out Presidio service, lower to 1 for a single instance

  // OAuth Integration Credentials - All optional, enables third-party integrations
  GOOGLE_CLIENT_ID: z.string().optional(), // Google OAuth client ID for Google services
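The `PII_MASK_CHUNK_CONCURRENCY` comment describes a cap on in-flight mask-batch requests. A minimal sketch of such a cap follows, assuming a worker-pool style mapper; `resolveChunkConcurrency` and `mapWithConcurrency` are hypothetical names, and the actual redaction code may be structured differently. Only the variable name, the positive-integer constraint, and the default of 4 are taken from the schema comment.

```typescript
// Hypothetical sketch; the real redaction code is not shown in this diff.

// Mirror the schema: unset means the default of 4, anything else must be a positive integer.
function resolveChunkConcurrency(raw: string | undefined, fallback = 4): number {
  if (raw === undefined) return fallback;
  const n = Number(raw);
  if (raw.trim() === "" || !Number.isInteger(n) || n <= 0) {
    throw new RangeError(`PII_MASK_CHUNK_CONCURRENCY must be a positive integer, got "${raw}"`);
  }
  return n;
}

// Run `worker` over `items` with at most `limit` calls in flight, preserving input order.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;
  // Each runner claims the next unprocessed index until the list is exhausted.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }
  const runners = Array.from({ length: Math.min(limit, items.length) }, () => runner());
  await Promise.all(runners);
  return results;
}
```

With a limit of 1 the chunks are processed strictly one at a time, which matches the "single instance" guidance; a higher limit only helps when the Presidio service has capacity to serve the requests in parallel.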
helm/sim/README.md: 3 additions & 3 deletions
@@ -48,7 +48,7 @@ Optional components (off by default):
  * **`copilot`** — the Sim Copilot service plus its own Postgres StatefulSet.
  * **`ollama`** — local LLM inference, with optional NVIDIA GPU support.
- * **`pii`** — Presidio PII redaction sidecar (analyzer + anonymizer) for the Guardrails PII block and log redaction. See [PII redaction](#pii-redaction).
+ * **`pii`** — Presidio PII redaction service (analyzer + anonymizer) for the Guardrails PII block and log redaction. See [PII redaction](#pii-redaction).
  * **`ingress`** — NGINX-style Ingress for the app and realtime services.
  * **`networkPolicy`** — east-west and egress isolation (blocks cloud metadata endpoints by default).
@@ -357,14 +357,14 @@ Requires the Prometheus Operator CRDs. Scrapes `/metrics` on the app and realtim

  ## PII redaction

- Sim can redact personally identifiable information using a [Presidio](https://microsoft.github.io/presidio/) sidecar (analyzer + anonymizer combined into one image listening on port 5001). Enable it with:
+ Sim can redact personally identifiable information using a [Presidio](https://microsoft.github.io/presidio/) service (analyzer + anonymizer combined into one image listening on port 5001). Enable it with:

  ```yaml
  pii:
    enabled: true
  ```

- When enabled, the chart deploys the sidecar (`<release>-pii` Deployment + Service) and **auto-wires** `PII_URL` on the app to the in-cluster service. The sidecar bundles five large spaCy models (en/es/it/pl/fi, ~2.2GB), so the first start takes ~3 minutes while models load — the `startupProbe` allows for this. Size the `pii.resources` for at least ~4Gi memory.
+ When enabled, the chart deploys it as a standalone `<release>-pii` Deployment + Service and **auto-wires** `PII_URL` on the app to the in-cluster service. The service bundles five large spaCy models (en/es/it/pl/fi, ~2.2GB), so the first start takes ~3 minutes while models load — the `startupProbe` allows for this. Size the `pii.resources` for at least ~4Gi memory.

  This alone powers the **Guardrails PII block** and on-demand masking. To additionally turn on **automatic log redaction** (the org/workspace data-retention scrub), you must: