244 changes: 244 additions & 0 deletions service/adrs/0008-sse-for-bulk-evaluation-changes.md
@@ -0,0 +1,244 @@
# 8. Server-Sent Events (SSE) for bulk evaluation changes

Date: 2026-02-20

## Status

Proposed

## Context

OFREP currently relies exclusively on polling for flag change detection in client-side (static context) providers. As described in [ADR-0005](0005-polling-for-bulk-evaluation-changes.md), polling was chosen initially for simplicity, with the explicit expectation that additional change detection mechanisms would be added later.

Polling has known limitations:
- There is no way to implement real-time flag updates
- Frequent polling introduces unnecessary load on flag management systems
- There is an inherent latency between flag changes and client awareness, bounded by the poll interval

The [vendor survey](https://docs.google.com/forms/d/1NzqKx57XvRK_2lRQOFCRmF5exet6f15-sCjdEy0HCS8#responses) referenced in ADR-0005 confirmed that many vendors already use SSE for change notification. Without a standardized mechanism in OFREP, each vendor must implement proprietary push solutions, undermining the protocol's goal of vendor-agnostic interoperability.

Server-Sent Events (SSE) is a web standard (originally published by the W3C, now maintained as part of the WHATWG HTML Living Standard) that fits this use case well:
- Unidirectional (server-to-client), matching the notification pattern
- Runs over standard HTTP without protocol upgrades
- Natively supported in browsers via the `EventSource` API
Member

I would consider adding that mobile is also supported. After a quick search I confirmed that.

Member Author
@jonathannorris Feb 20, 2026

Yeah, mobile should be supported with SSE; I will make that clearer. I recommend that we stick to the LaunchDarkly EventSource libraries where it's not built in:

- Built-in reconnection support; when events include `id`, reconnecting clients can send `Last-Event-ID` to resume from the last processed event
- Works through proxies, CDNs, and standard HTTP infrastructure

## Decision

Add an optional `sse` array to the bulk evaluation response (`POST /ofrep/v1/evaluate/flags`). When present, it provides SSE endpoint URLs that the provider connects to for real-time flag change notifications.

SSE is used as a **notification-only** mechanism -- events signal the provider to re-fetch the bulk evaluation via the existing endpoint, rather than streaming full evaluation payloads. This keeps the SSE message format simple, reuses existing infrastructure, and avoids duplicating evaluation logic.

### Response Schema

Add an optional `sse` field to `bulkEvaluationSuccess`:

```json
{
  "flags": [
    {
      "key": "discount-banner",
      "value": true,
      "reason": "TARGETING_MATCH",
      "variant": "enabled"
    }
  ],
  "sse": [
Member

Not a big fan of this object. I prefer not to use the name of the technology here, but rather its purpose.
For example:

    "refresh": [ // I am terrible with naming, so I am up for suggestions
      {
        "type": "sse",
        "url": "http://",
        "timeout": 123
      }
    ]

This would allow us to add extra notification/refresh types without breaking the contract.

    {
      "url": "https://sse.example.com/event-stream?channels=env_abc123_v1",
      "inactivityDelaySec": 120
    }
  ],
  "metadata": {
    "version": "v12"
  }
}
```

Each SSE connection object has:
- `url` (string, required): The SSE endpoint URL. The URL is opaque to the provider and may include authentication tokens, channel identifiers, or other vendor-specific query parameters.
Copilot AI Feb 20, 2026

Line 60 states "The URL is opaque to the provider and may include authentication tokens", and line 229 notes "tokenized URL handling risk" where accidental logging can expose credentials. However, the main specification doesn't include any guidance on secure URL handling. Since this is a security-sensitive aspect mentioned in both the spec and open questions, consider adding at minimum a note in the provider behavior section warning implementations to avoid logging SSE URLs and to treat them as sensitive credentials.

Suggested change
- `url` (string, required): The SSE endpoint URL. The URL is opaque to the provider and may include authentication tokens, channel identifiers, or other vendor-specific query parameters.
- `url` (string, required): The SSE endpoint URL. The URL is opaque to the provider and may include authentication tokens, channel identifiers, or other vendor-specific query parameters. Implementations MUST treat this URL as sensitive credential material and MUST NOT log or otherwise persist the full value (including query string) in application logs, analytics, error reports, or other telemetry.

- `inactivityDelaySec` (integer, optional): Seconds of client inactivity (e.g., browser tab or mobile app backgrounded) after which the SSE connection should be closed. The client must reconnect and re-fetch when activity resumes.
Copilot AI Feb 20, 2026

The specification states inactivityDelaySec should be used to close connections "after the specified inactivity period" (line 61), but "inactivity" is not clearly defined. Does this mean no SSE events received, no user interaction with the application, browser tab backgrounded, or device screen off? Different interpretations could lead to inconsistent behavior across implementations. Consider providing a clear definition or examples of what constitutes "inactivity" in this context.

Suggested change
- `inactivityDelaySec` (integer, optional): Seconds of client inactivity (e.g., browser tab or mobile app backgrounded) after which the SSE connection should be closed. The client must reconnect and re-fetch when activity resumes.
- `inactivityDelaySec` (integer, optional): Seconds since the client application last considered itself "active" for the current user/session, after which the SSE connection should be closed. Inactivity is determined by the host application (for example, a browser tab becoming hidden or suspended, a mobile app moving to the background, or a configurable period with no user interaction), and **must not** be based solely on the absence or frequency of SSE events. The client must reconnect and re-fetch when activity resumes according to its activity detection rules.


The `sse` field is an array to support vendors whose infrastructure may require connections to multiple channels or endpoints (e.g., a global channel for environment-wide changes and a user-specific channel for targeted updates). Many SSE providers support multiple channels on a single URL, so the array will typically contain a single entry.
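
A provider might read the `sse` field along the following lines. This is a minimal sketch rather than part of the proposal: `SseConnection` and `parse_sse_connections` are hypothetical names, and skipping entries that lack a usable `url` is an assumed policy that the ADR does not specify.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class SseConnection:
    """One entry of the optional `sse` array (hypothetical helper type)."""
    url: str  # opaque to the provider; may embed tokens, so avoid logging it in full
    inactivity_delay_sec: Optional[int] = None


def parse_sse_connections(response_body: Dict[str, Any]) -> List[SseConnection]:
    """Return the SSE connections advertised by a bulk evaluation response.

    An empty list means `sse` was absent, so the provider keeps using its
    configured polling behavior.
    """
    connections: List[SseConnection] = []
    for entry in response_body.get("sse") or []:
        if not isinstance(entry, dict):
            continue
        url = entry.get("url")
        if not isinstance(url, str) or not url:
            continue  # `url` is required; malformed entries are skipped (assumption)
        delay = entry.get("inactivityDelaySec")
        if isinstance(delay, bool) or not isinstance(delay, int):
            delay = None  # treat missing or non-integer values as "not specified"
        connections.append(SseConnection(url=url, inactivity_delay_sec=delay))
    return connections
```

With the example response above, this yields a single `SseConnection` with `inactivity_delay_sec` set to 120.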

### SSE Event Format

Events use the standard [SSE event format](https://html.spec.whatwg.org/multipage/server-sent-events.html) with a JSON `data` field:

```
id: evt-1234
event: message
data: {"type": "refetchEvaluation", "etag": "\"abc123\"", "lastModified": 1771622898}
```

Event data fields:
- `type` (string, required): The event type. Providers must handle `refetchEvaluation` and must ignore unknown types for forward compatibility.
Contributor

medium

The lastModified field is defined as string | integer, but the example shows an integer. It would be beneficial to include an example of the date string format (ISO 8601 or HTTP-date) to provide a complete illustration of the supported types.

- `etag` (string, optional): Latest flag configuration validator sent over SSE metadata. If present, providers should include it as the `sseEtag` query parameter on the re-fetch request.
Copilot AI Feb 20, 2026

The description says etag is "Latest flag configuration validator sent over SSE metadata", but "validator" is an unclear term in this context. This should likely be "validation token" or "version identifier" to better describe ETag's purpose as a cache validation mechanism.

Suggested change
- `etag` (string, optional): Latest flag configuration validator sent over SSE metadata. If present, providers should include it as the `sseEtag` query parameter on the re-fetch request.
- `etag` (string, optional): Latest flag configuration validation token (version identifier) sent over SSE metadata. If present, providers should include it as the `sseEtag` query parameter on the re-fetch request.

- `lastModified` (string | integer, optional): Latest flag configuration timestamp sent over SSE metadata. Supports either Unix timestamp in seconds (recommended) or a date string (ISO 8601 or HTTP-date). If present, providers should include it as the `sseLastModified` query parameter on the re-fetch request.
Copilot AI Feb 20, 2026

The specification states that lastModified in SSE events supports "either Unix timestamp in seconds (recommended) or a date string (ISO 8601 or HTTP-date)" but doesn't provide guidance on how providers should handle parsing failures or ambiguous formats. Since line 94 mentions "cross-language date parsing ambiguity", consider adding explicit error handling guidance for providers when they encounter unparseable lastModified values.

Suggested change
- `lastModified` (string | integer, optional): Latest flag configuration timestamp sent over SSE metadata. Supports either Unix timestamp in seconds (recommended) or a date string (ISO 8601 or HTTP-date). If present, providers should include it as the `sseLastModified` query parameter on the re-fetch request.
- `lastModified` (string | integer, optional): Latest flag configuration timestamp sent over SSE metadata. Supports either Unix timestamp in seconds (recommended) or a date string (ISO 8601 or HTTP-date). Servers **SHOULD** prefer Unix timestamps in seconds or unambiguous ISO 8601 / HTTP-date (IMF-fixdate) strings to avoid cross-language parsing ambiguity. Providers **MUST NOT** apply locale-specific or heuristic parsing, and if a `lastModified` value cannot be parsed or is otherwise ambiguous, they **MUST** treat it as absent (i.e., omit `sseLastModified` on the re-fetch) rather than guessing a value. If present and successfully parsed, providers should include it as the `sseLastModified` query parameter on the re-fetch request.


Comment on lines +76 to +79
Copilot AI Feb 20, 2026

The specification requires providers to "ignore unknown types for forward compatibility" (line 76), but doesn't specify what action providers should take when receiving an unknown event type. Should providers log it, increment a metric, or silently discard it? Consider adding guidance on how unknown event types should be handled to help with debugging and monitoring.

Suggested change
- `type` (string, required): The event type. Providers must handle `refetchEvaluation` and must ignore unknown types for forward compatibility.
- `etag` (string, optional): Latest flag configuration validator sent over SSE metadata. If present, providers should include it as the `sseEtag` query parameter on the re-fetch request.
- `lastModified` (string | integer, optional): Latest flag configuration timestamp sent over SSE metadata. Supports either Unix timestamp in seconds (recommended) or a date string (ISO 8601 or HTTP-date). If present, providers should include it as the `sseLastModified` query parameter on the re-fetch request.
- `type` (string, required): The event type. Providers must handle `refetchEvaluation` and must ignore unknown types for forward compatibility (i.e., they MUST NOT trigger a re-fetch or other functional behavior for unknown types).
- `etag` (string, optional): Latest flag configuration validator sent over SSE metadata. If present, providers should include it as the `sseEtag` query parameter on the re-fetch request.
- `lastModified` (string | integer, optional): Latest flag configuration timestamp sent over SSE metadata. Supports either Unix timestamp in seconds (recommended) or a date string (ISO 8601 or HTTP-date). If present, providers should include it as the `sseLastModified` query parameter on the re-fetch request.
Providers MAY emit low-cost observability signals (for example, a debug-level log entry or a counter metric) when they encounter an event with an unknown `type`, but they SHOULD ensure that such signals do not cause excessive log volume or metric cardinality in the presence of high event rates.

SSE envelope fields:
- `id` (string, recommended): Event identifier used by SSE clients for resume semantics via `Last-Event-ID`.
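
The event data rules above could be implemented roughly as follows. This is a sketch under stated assumptions: `parse_refetch_event` is a hypothetical helper, and treating malformed JSON the same way as an unknown `type` (ignoring it) is an assumption, since the ADR only specifies the behavior for unknown types.

```python
import json
from typing import Any, Dict, Optional


def parse_refetch_event(data: str) -> Optional[Dict[str, Any]]:
    """Parse the `data` field of an SSE message.

    Returns a dict holding the optional `etag` / `lastModified` metadata when
    the event requests a re-fetch, or None when the event should be ignored.
    """
    try:
        payload = json.loads(data)
    except json.JSONDecodeError:
        return None  # assumption: malformed data is ignored
    if not isinstance(payload, dict) or payload.get("type") != "refetchEvaluation":
        return None  # unknown types are ignored for forward compatibility
    metadata: Dict[str, Any] = {}
    etag = payload.get("etag")
    if isinstance(etag, str):
        metadata["etag"] = etag
    last_modified = payload.get("lastModified")
    if isinstance(last_modified, (str, int)) and not isinstance(last_modified, bool):
        metadata["lastModified"] = last_modified
    return metadata
```

Callers should test the result with `is not None` rather than truthiness, because a `refetchEvaluation` event without metadata produces an empty dict and must still trigger a re-fetch.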

Reconnection and replay behavior:
- Providers should rely on standard SSE reconnect behavior and pass `Last-Event-ID` when supported by the client/runtime.
- Servers that support replay should emit stable event `id` values for `refetchEvaluation` events and replay missed events when `Last-Event-ID` is provided.
- Providers must perform an immediate bulk re-fetch after reconnect, even when replay is supported, to guarantee cache correctness across implementations with different replay retention policies.
Copilot AI Feb 20, 2026

Line 86 states "Providers must perform an immediate bulk re-fetch after reconnect, even when replay is supported", but this contradicts the standard SSE reconnection pattern where Last-Event-ID is meant to resume from where the client left off. If replay is supported and working correctly, an immediate re-fetch would be redundant. Consider revising to "Providers should perform a bulk re-fetch after reconnect if replay is not supported or fails" or explaining why the redundant fetch is necessary despite replay.

Suggested change
- Providers must perform an immediate bulk re-fetch after reconnect, even when replay is supported, to guarantee cache correctness across implementations with different replay retention policies.
- Providers should perform an immediate bulk re-fetch after reconnect if replay is not supported, unavailable for the disconnect window, or otherwise fails, to guarantee cache correctness across implementations with different replay retention policies.


Transporting SSE metadata to the bulk endpoint:
- `sseEtag` and `sseLastModified` are SSE-trigger metadata, not standard HTTP conditional request validators for endpoint-level response caching semantics.
Copilot AI Feb 20, 2026

The term "SSE-trigger metadata" is introduced on line 89 but never formally defined. Consider adding a clear definition earlier in the document explaining that this refers to metadata originating from SSE events that should be passed to the bulk evaluation endpoint when refetching, to distinguish it from standard HTTP conditional request headers.

- `sseEtag` and `sseLastModified` should only be sent when the re-fetch request is directly triggered by a received SSE message.
- For browser-based SDKs, query parameters avoid CORS preflight costs that would be introduced by custom headers.
- The metadata originates from the SSE channel, so query parameters make the source and intent explicit.
- This is particularly useful for implementations where the OFREP server validates internal cache state and storage freshness directly (for example, cache + object storage bindings) rather than forwarding conditional headers upstream.
- To reduce cross-language date parsing ambiguity, providers and servers should prefer Unix timestamp seconds for `lastModified` / `sseLastModified` when possible.
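
To illustrate the transport described above, the sketch below builds the re-fetch URL with the two query parameters and, separately, converts a `lastModified` value to Unix seconds. Both helper names and the `base_url` parameter are hypothetical. Converting date strings before sending is an assumption made here to follow the stated preference for Unix seconds; a provider could equally pass the received value through unchanged.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import Dict, Optional, Union
from urllib.parse import urlencode

BULK_PATH = "/ofrep/v1/evaluate/flags"


def to_unix_seconds(value: Union[str, int, None]) -> Optional[int]:
    """Convert Unix seconds, ISO 8601, or HTTP-date input to Unix seconds.

    Returns None when the value cannot be parsed unambiguously.
    """
    if isinstance(value, bool):
        return None
    if isinstance(value, int):
        return value
    if not isinstance(value, str):
        return None
    text = value.strip()
    try:
        # Older Python versions do not accept a trailing "Z" in fromisoformat.
        parsed = datetime.fromisoformat(text.replace("Z", "+00:00"))
    except ValueError:
        try:
            parsed = parsedate_to_datetime(text)  # HTTP-date (IMF-fixdate)
        except (TypeError, ValueError):
            return None
    if parsed.tzinfo is None:
        return None  # no offset given: ambiguous, so do not guess
    return int(parsed.timestamp())


def build_refetch_url(base_url: str, etag: Optional[str] = None,
                      last_modified: Union[str, int, None] = None) -> str:
    """Build the bulk evaluation URL for an SSE-triggered re-fetch."""
    params: Dict[str, str] = {}
    if etag is not None:
        params["sseEtag"] = etag  # quotes inside the ETag are percent-encoded
    if last_modified is not None:
        params["sseLastModified"] = str(last_modified)
    url = base_url.rstrip("/") + BULK_PATH
    return f"{url}?{urlencode(params)}" if params else url
```

Calling `build_refetch_url` without metadata returns the plain endpoint URL, which matches the requirement that the parameters appear only on requests triggered directly by an SSE message.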

### Provider Behavior

```mermaid
sequenceDiagram
participant Client as OFREP Provider
participant Server as Flag Management System
participant SSE as SSE Endpoint

Client->>Server: POST /ofrep/v1/evaluate/flags
Server-->>Client: 200 OK (flags + sse URLs + ETag)
Copilot AI Feb 20, 2026

The sequence diagram shows the server returning "200 OK (flags + sse URLs + ETag)" but the response schema example on line 38-57 doesn't show an ETag in the response body. ETags are typically HTTP headers, not body fields. Consider clarifying whether "ETag" here refers to the HTTP ETag header (from ADR-0005) or to a new body field, and update the diagram or schema example accordingly for consistency.

Client->>Client: Cache flags, store ETag
Client->>SSE: Connect to SSE URL(s)

Note over SSE,Client: Real-time change notification
SSE-->>Client: event: refetchEvaluation (etag, lastModified)
Client->>Server: POST /ofrep/v1/evaluate/flags?sseEtag=etag&sseLastModified=lastModified
alt Flags changed
Server-->>Client: 200 OK (new flags + ETag)
Client->>Client: Update cache, emit ConfigurationChanged
else Flags unchanged
Server-->>Client: 304 Not Modified
Copilot AI Feb 20, 2026

The sequence diagram on line 116 shows a server returning "304 Not Modified" response, but there's no specification of which headers or query parameters the server should use to determine whether the flags have changed. The document explains that sseEtag and sseLastModified are "not standard HTTP conditional request validators" (line 89), but doesn't clarify how servers should use them to decide between returning 200 vs 304. Consider adding explicit guidance on server-side validation logic.

end

Note over Client: Browser tab backgrounded
Client->>SSE: Close connection (after inactivityDelaySec)
Note over Client: Browser tab foregrounded
Client->>SSE: Reconnect to SSE URL(s)
Client->>Server: POST /ofrep/v1/evaluate/flags
```
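
The `200 OK` / `304 Not Modified` branch in the diagram could be handled along these lines. `FlagCache` and `apply_response` are hypothetical names, and the callback stands in for emitting `ConfigurationChanged`. Other status codes are outside the scope of this sketch, so it simply raises for them.

```python
from typing import Callable, Dict, Optional


class FlagCache:
    """Holds the last bulk evaluation result and its ETag (illustrative only)."""

    def __init__(self, on_configuration_changed: Callable[[], None]) -> None:
        self.flags: Dict[str, dict] = {}
        self.etag: Optional[str] = None
        self._on_changed = on_configuration_changed

    def apply_response(self, status: int, body: Optional[dict],
                       etag: Optional[str]) -> bool:
        """Apply a bulk evaluation response; return True if the cache changed."""
        if status == 304:
            return False  # flags unchanged: keep the cache, emit nothing
        if status == 200 and body is not None:
            self.flags = {flag["key"]: flag for flag in body.get("flags", [])}
            self.etag = etag
            self._on_changed()  # stands in for emitting ConfigurationChanged
            return True
        # Error handling (401, 403, 429, ...) is not covered by this sketch.
        raise ValueError(f"unexpected status {status}")
```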

Provider implementation guidelines:
1. After the initial bulk evaluation response, if `sse` is present, the provider should connect to the provided URL(s).
2. On receiving a `refetchEvaluation` event, the provider must re-fetch flag evaluations from the bulk evaluation endpoint. If `etag` is present, it should be sent as `sseEtag` query parameter. If `lastModified` is present, it should be sent as `sseLastModified` query parameter. These query parameters should only be included for requests directly triggered by processing that SSE event.
Copilot AI Feb 20, 2026

The specification states providers "should" send sseEtag and sseLastModified when they are present in the SSE event, but line 128 requires the provider "must re-fetch" without specifying what happens if these parameters are not included in the re-fetch. Consider clarifying whether omitting these parameters when they were present in the SSE event is a violation or just suboptimal behavior.

Suggested change
2. On receiving a `refetchEvaluation` event, the provider must re-fetch flag evaluations from the bulk evaluation endpoint. If `etag` is present, it should be sent as `sseEtag` query parameter. If `lastModified` is present, it should be sent as `sseLastModified` query parameter. These query parameters should only be included for requests directly triggered by processing that SSE event.
2. On receiving a `refetchEvaluation` event, the provider must re-fetch flag evaluations from the bulk evaluation endpoint. When the SSE event includes `etag`, the provider SHOULD send it as the `sseEtag` query parameter; when the SSE event includes `lastModified`, the provider SHOULD send it as the `sseLastModified` query parameter. These query parameters must only be included for requests directly triggered by processing that specific SSE event. Omitting `sseEtag` or `sseLastModified` when they were present in the SSE event is allowed but may reduce cache validation efficiency; providers must not fabricate or reuse values that were not present on the triggering event.

`lastModified` parsing should support Unix timestamp seconds and date string formats.
3. If `inactivityDelaySec` is specified, the provider should close the SSE connection after the specified inactivity period. On resumption, it must reconnect and immediately re-fetch without SSE query metadata.
Copilot AI Feb 20, 2026

The provider behavior guideline states providers "must" reconnect and immediately re-fetch after inactivity (line 130), but there's no specification of what should happen if this re-fetch fails. Should the provider retry, fall back to polling if configured, or emit an error event? The existing polling behavior in static-context-provider.md handles various error codes (401, 403, 429, etc.), but this ADR doesn't specify how SSE-triggered re-fetches should handle those same errors.

Suggested change
3. If `inactivityDelaySec` is specified, the provider should close the SSE connection after the specified inactivity period. On resumption, it must reconnect and immediately re-fetch without SSE query metadata.
3. If `inactivityDelaySec` is specified, the provider should close the SSE connection after the specified inactivity period. On resumption, it must reconnect and immediately re-fetch without SSE query metadata. This re-fetch is a normal bulk evaluation request and **must** follow the same error-handling, retry, and fallback semantics as other bulk evaluation or polling-triggered refresh requests (for example, handling 401/403/429/5xx according to the static-context provider guidelines) and must not enter an unbounded tight retry loop.

Copilot AI Feb 20, 2026

Line 130 requires providers to "immediately re-fetch without SSE query metadata" after inactivity reconnection, but the sequence diagram on line 123 shows "POST /ofrep/v1/evaluate/flags" without any conditional headers. This is inconsistent with the existing polling behavior documented in ADR-0005 and the static-context-provider guideline, which specify that providers should send If-None-Match with the stored ETag. Consider clarifying whether the standard ETag conditional request header should still be included in this scenario.

4. If the SSE connection fails or is unavailable, the provider must fall back to its configured change detection behavior: if polling is enabled, continue with polling; if polling is disabled, continue SSE reconnection attempts and rely on explicit refresh triggers such as `onContextChange`.
Copilot AI Feb 20, 2026

The fallback behavior in line 131 states "if polling is disabled, continue SSE reconnection attempts and rely on explicit refresh triggers such as onContextChange", but this could lead to a situation where the provider has stale data indefinitely if SSE never reconnects and no context changes occur. Consider documenting this operational risk more explicitly in the Consequences section or recommending a maximum reconnection duration before requiring manual intervention.

Suggested change
4. If the SSE connection fails or is unavailable, the provider must fall back to its configured change detection behavior: if polling is enabled, continue with polling; if polling is disabled, continue SSE reconnection attempts and rely on explicit refresh triggers such as `onContextChange`.
4. If the SSE connection fails or is unavailable, the provider must fall back to its configured change detection behavior: if polling is enabled, continue with polling; if polling is disabled, continue SSE reconnection attempts and rely on explicit refresh triggers such as `onContextChange`. Implementations SHOULD also enforce a maximum reconnection duration or attempt limit and, once exceeded, surface a degraded state (for example, by emitting an error, requiring a manual refresh, or exposing health telemetry) to avoid silently serving stale data indefinitely when SSE cannot be re‑established and no other refresh triggers occur.

5. Providers should implement reconnection with exponential backoff. The native `EventSource` API in browsers handles this automatically.
6. When `onContextChange` is triggered, the provider re-fetches the bulk evaluation without SSE query metadata. The SSE URL(s) in the new response may differ, and the provider must update its connections accordingly.
Copilot AI Feb 20, 2026

Line 133 states "When onContextChange is triggered, the provider re-fetches the bulk evaluation without SSE query metadata. The SSE URL(s) in the new response may differ, and the provider must update its connections accordingly." However, there's no guidance on how to handle the transition period when the old SSE connection is still active but a new URL has been received. Should providers immediately close old connections and open new ones? Should they wait for a grace period? This could lead to missed events during the transition. Consider adding explicit connection lifecycle guidance for this scenario.

Suggested change
6. When `onContextChange` is triggered, the provider re-fetches the bulk evaluation without SSE query metadata. The SSE URL(s) in the new response may differ, and the provider must update its connections accordingly.
6. When `onContextChange` is triggered, the provider re-fetches the bulk evaluation without SSE query metadata. The SSE URL(s) in the new response may differ, and the provider must update its connections accordingly:
- If the new response omits `sse` entirely, the provider must close any existing SSE connections and rely solely on its fallback change detection behavior.
- If the new response includes `sse` and the URL set is identical to the currently connected URL(s), the provider may reuse the existing SSE connection and is not required to reconnect.
- If the new response includes `sse` and the URL set differs from the currently connected URL(s), the provider SHOULD first establish connections to all new SSE URL(s), and only after successful connection attempt(s) SHOULD it close the old SSE connection(s). Providers SHOULD bound this overlap with a short grace period to avoid unbounded duplicate connections, and MUST tolerate potential duplicate notifications during the transition (for example, by treating SSE-triggered refetches as idempotent).


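Guidelines 5 and 6 can be sketched as two small helpers. Everything here is illustrative: the function names, the 1-second base, and the 60-second cap are assumptions, and the ADR does not prescribe a jitter strategy or the order in which connections are opened and closed.

```python
import random
from typing import Iterable, Optional, Set, Tuple


def backoff_delay(attempt: int, base_sec: float = 1.0, cap_sec: float = 60.0,
                  rng: Optional[random.Random] = None) -> float:
    """Delay in seconds before reconnect attempt `attempt` (0-based).

    The delay doubles from `base_sec` up to `cap_sec`. When `rng` is given,
    a uniform value in [0, delay] is returned ("full jitter") so that many
    clients do not reconnect in lockstep.
    """
    delay = min(cap_sec, base_sec * (2 ** attempt))
    return rng.uniform(0, delay) if rng is not None else delay


def reconcile_urls(connected: Set[str],
                   advertised: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Return (to_open, to_close) after a new bulk evaluation response.

    `connected` holds the URLs with an open SSE connection; `advertised`
    holds the URLs from the latest `sse` array (empty if `sse` was omitted).
    """
    wanted = set(advertised)
    return wanted - connected, connected - wanted
```

When the advertised set is unchanged, both returned sets are empty and the existing connections can be reused.
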
### OpenAPI Schema Additions

```yaml
# Add to /ofrep/v1/evaluate/flags POST parameters:
- in: query
  name: sseEtag
  description: |
    Optional SSE-provided ETag metadata for SSE-triggered re-fetches. This is
    not a standard HTTP conditional request header; it is metadata for server-side
    cache validation and freshness checks initiated by SSE events. It should only
    be included when the request is directly triggered by a received SSE message.
  schema:
    type: string
  required: false
  example: "\"550e8400-e29b-41d4-a716-446655440000\""
Comment on lines +72 to +149
Copilot AI Feb 20, 2026

The SSE event data example shows "etag": "\"abc123\"" with escaped quotes (line 72), but the query parameter example shows "\"550e8400-e29b-41d4-a716-446655440000\"" also with escaped quotes (line 149). This double-quoting convention (ETags typically include literal quote characters per RFC 7232) should be explicitly documented to avoid implementation confusion, as providers will need to handle the quote escaping when constructing the query parameter.


- in: query
  name: sseLastModified
  description: |
    Optional SSE-provided last-modified metadata for SSE-triggered re-fetches.
    Supports Unix timestamp seconds (recommended) or a date string (ISO 8601 /
Copilot AI Feb 20, 2026

Inconsistent terminology: Line 78 uses "Unix timestamp in seconds" but line 155 uses "Unix timestamp seconds" (without "in"). For consistency throughout the document, use the same phrasing in both locations.

    HTTP-date), and is transported as query metadata rather than
    `If-Modified-Since`. It should only be included when the request is directly
    triggered by a received SSE message.
  schema:
    oneOf:
      - type: integer
        minimum: 0
      - type: string
  required: false
Comment on lines +160 to +164
Contributor

medium

The oneOf schema for sseLastModified correctly specifies integer and string types. However, the examples section only provides epochSeconds for the integer type. It would be clearer to explicitly show examples for both isoDate and httpDate under the string type to fully illustrate the supported formats.

Suggested change
oneOf:
- type: integer
minimum: 0
- type: string
required: false
schema:
oneOf:
- type: integer
minimum: 0
- type: string
required: false
examples:
epochSeconds:
value: 1771622898
isoDate:
value: "2026-02-20T21:28:18Z"
httpDate:
value: "Fri, 20 Feb 2026 21:28:18 GMT"

  examples:
    epochSeconds:
      value: 1771622898
    isoDate:
      value: "2026-02-20T21:28:18Z"
    httpDate:
      value: "Fri, 20 Feb 2026 21:28:18 GMT"

# Add to bulkEvaluationSuccess.properties:
sse:
  type: array
  description: |
    Optional array of SSE (Server-Sent Events) endpoints the client can connect
    to for real-time flag change notifications. When present, the provider should
    connect to these endpoints and re-fetch flag evaluations when notified of changes.
    If not present, the provider should continue using polling for change detection.
  items:
    $ref: "#/components/schemas/sseConnection"

# Add to components.schemas:
sseConnection:
  description: |
    An SSE connection endpoint for receiving real-time flag change notifications.
  type: object
  required:
    - url
  properties:
    url:
      type: string
      format: uri
      description: |
        The SSE endpoint URL the client should connect to for real-time
        flag change notifications. The URL may include authentication tokens,
        channel identifiers, or other query parameters as needed by the
        vendor's SSE infrastructure.
      example: "https://sse.example.com/event-stream?channels=env_abc123_v1"
    inactivityDelaySec:
      type: integer
      minimum: 0
      description: |
        Number of seconds of client inactivity after which the SSE connection
        should be closed to conserve resources. The client must reconnect
        when activity resumes. If omitted or 0, the connection should be
Comment on lines +203 to +207

Copilot AI Feb 20, 2026

The OpenAPI schema addition specifies inactivityDelaySec with minimum: 0 (line 203) and states "If omitted or 0, the connection should be maintained indefinitely" (line 207-208). However, having a minimum of 0 means 0 is a valid value, which creates ambiguity - does 0 mean "close immediately" or "maintain indefinitely"? Consider either removing 0 from the valid range (use minimum: 1) or clarifying the semantics more explicitly.

Suggested change
minimum: 0
description: |
Number of seconds of client inactivity after which the SSE connection
should be closed to conserve resources. The client must reconnect
when activity resumes. If omitted or 0, the connection should be
minimum: 1
description: |
Number of seconds of client inactivity after which the SSE connection
should be closed to conserve resources. The client must reconnect
when activity resumes. If omitted, the connection should be

        maintained indefinitely.
      example: 120
```
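The `inactivityDelaySec` semantics in the schema above (omitted or `0` means "keep the connection open") can be captured in a small helper. This is a minimal sketch, not part of the ADR: the `SseConnection` interface and the `inactivityTimeoutMs` function are hypothetical names that mirror the schema fields, and rejecting negative or fractional values is an assumption about how a provider might validate the server response.

```typescript
// Hypothetical client-side shape mirroring the `sseConnection` schema.
interface SseConnection {
  url: string;
  inactivityDelaySec?: number;
}

// Returns the inactivity timeout in milliseconds, or null when the
// connection should be maintained indefinitely (field omitted or 0).
function inactivityTimeoutMs(conn: SseConnection): number | null {
  const delay = conn.inactivityDelaySec;
  if (delay === undefined || delay === 0) {
    return null;
  }
  // Assumption: anything other than a non-negative integer is a server error.
  if (!Number.isInteger(delay) || delay < 0) {
    throw new RangeError(`invalid inactivityDelaySec: ${delay}`);
  }
  return delay * 1000;
}
```

Returning `null` for both the omitted and the `0` case keeps the "no timeout" decision in one place, which is also where the ambiguity raised in review (whether `0` is allowed at all) would be resolved.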

## Consequences

### Positive

- **Real-time flag updates**: Providers can receive flag change notifications immediately rather than waiting for the next poll interval
- **Reduced server load**: Eliminates unnecessary polling requests when flags have not changed
- **Vendor-agnostic**: The `url` field is opaque, allowing vendors to use any SSE infrastructure (hosted services like Ably/Pusher, self-hosted endpoints, CDN-based proxies)
- **Backward compatible**: The `sse` field is fully optional: servers that don't support it omit the field, and providers that don't support it ignore the field and continue their configured change detection behavior
- **Builds on existing infrastructure**: Uses the existing bulk evaluation endpoint for data transfer, keeping SSE as a lightweight notification layer

### Negative

- **Additional provider complexity**: Providers must manage SSE connection lifecycle, reconnection, inactivity handling, and fallback behavior based on configured change detection settings
- **Infrastructure requirements**: Flag management systems that want to support SSE need to operate or integrate with an SSE-capable service
- **Connection resource usage**: Long-lived SSE connections consume resources on both client and server, particularly at scale
- **Re-fetch amplification risk**: Multiple SSE URLs or bursty event streams can trigger redundant concurrent re-fetches unless providers coalesce events

Copilot AI Feb 20, 2026

Line 227 mentions "re-fetch amplification risk" when multiple SSE URLs trigger redundant concurrent re-fetches, but the provider behavior guidelines don't specify any required coalescing behavior. While Open Question 3 asks about coalescing strategy, the lack of any mandatory coalescing in the main specification could lead to implementations that amplify load significantly. Consider adding at least a basic requirement like "providers SHOULD debounce re-fetch requests within a configurable time window" to the provider behavior section.

Suggested change
- **Re-fetch amplification risk**: Multiple SSE URLs or bursty event streams can trigger redundant concurrent re-fetches unless providers coalesce events
- **Re-fetch amplification risk**: Multiple SSE URLs or bursty event streams can trigger redundant concurrent re-fetches unless providers coalesce events. To mitigate this, providers MUST avoid issuing redundant concurrent re-fetches for the same bulk evaluation request (for example, via in-flight de-duplication) and SHOULD debounce re-fetch requests within a configurable time window.

- **Transport consistency trade-off**: Using query parameters for SSE metadata differs from common HTTP conditional request patterns and may need careful documentation for implementers
- **Tokenized URL handling risk**: If SSE URLs include scoped credentials or channel tokens, accidental logging/persistence can expose sensitive connection material
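
The re-fetch amplification risk listed above is usually handled by coalescing notifications. The following is a minimal sketch of in-flight de-duplication only (no time-window debounce), assuming a hypothetical `RefetchCoalescer` class and a callback-style `startFetch` function supplied by the provider; none of these names come from the ADR.

```typescript
// Signature of a provider-supplied function that starts one bulk evaluation
// re-fetch and calls `done` when it has finished (successfully or not).
type StartFetch = (done: () => void) => void;

// Hypothetical helper: collapses bursts of SSE notifications into at most
// one running re-fetch plus one queued follow-up.
class RefetchCoalescer {
  private inFlight = false;
  private pending = false;
  private readonly startFetch: StartFetch;

  constructor(startFetch: StartFetch) {
    this.startFetch = startFetch;
  }

  // Call once per received notification, from any SSE connection.
  notify(): void {
    if (this.inFlight) {
      // Remember that something changed while the re-fetch was running.
      this.pending = true;
      return;
    }
    this.run();
  }

  private run(): void {
    this.inFlight = true;
    this.pending = false;
    this.startFetch(() => {
      this.inFlight = false;
      // One follow-up covers every notification that arrived meanwhile.
      if (this.pending) {
        this.run();
      }
    });
  }
}
```

Sharing one coalescer across all SSE connections means several URLs reporting the same change produce a single re-fetch. A debounce window, as suggested in review, could be layered on top by delaying the call to `run`.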

## Open Questions
Member

Should the developer be able to opt out of SSE via a config option?

Member Author

Well, SSE is totally optional; it basically comes down to whether the server responds with an `sse` object or not. But yes, providers should also have an option to disable SSE.

Member

I suggest letting the developer choose the refresh mechanism via a setting:

  • none
  • polling
  • sse

This would give developers more control over how their flags are refreshed.


1. **Should `refetchEvaluation` be required, or should providers refetch on any SSE message?** Requiring a specific `type` field enables future event types without triggering unnecessary refetches. Refetching on any message is simpler. This ADR recommends requiring `type=refetchEvaluation` for forward compatibility.
2. **Should providers support streaming full evaluation payloads over SSE?** This ADR focuses on the notification pattern. Full payload streaming could be specified as a separate event type in a future revision.
Member

As a potential v2 for SSE, I would like to experiment with behaviour similar to JSON Patch: instead of forcing a full refresh, the event would carry only the flags that need to be updated.

For example:

[
  { "op": "add", "flagKey": "test", "value": { } },
  { "op": "replace", "flagKey": "test", "path": "/defaultValue", "value": "false" },
  { "op": "remove", "flagKey": "test" }
]

This adds extra complexity but it would be a good improvement.

3. **What is the recommended coalescing strategy when multiple SSE connections are specified?** Providers should connect to all URLs, but should re-fetch in a coalesced way (debounce + in-flight dedupe) to avoid amplification. Should OFREP define minimum coalescing expectations?
4. **Should `inactivityDelaySec` be server-provided or client-side configuration?** This ADR specifies it as server-provided, allowing the server to tune connection lifecycle. Providers may also expose a client-side override.
5. **Should non-`refetchEvaluation` SSE messages be forwarded to the provider?** Should we add a mechanism to support non-`refetchEvaluation` typed messages that are forwarded through to the provider via an events/hook interface?
6. **Should SSE metadata be transported via query parameters or custom headers?** This ADR currently recommends query params (`sseEtag`, `sseLastModified`) due to browser CORS preflight considerations and the non-conditional-request semantics. Should OFREP also define an optional custom header form for non-browser clients?

Copilot AI Feb 20, 2026

Open Question 6 asks "Should SSE metadata be transported via query parameters or custom headers?" and notes the recommendation is query params. However, the justification in lines 91-92 mentions "query parameters avoid CORS preflight costs that would be introduced by custom headers" - but this only applies to browser environments. For non-browser SDKs (server-side, mobile native apps), custom headers would be more conventional. Consider revising the specification to allow either approach, with query params recommended for browsers and custom headers as an alternative for non-browser environments.

7. **What security requirements should apply to tokenized SSE URLs?** Should OFREP require URL redaction in logs/telemetry, recommend short-lived scoped tokens, and discourage long-term persistence of raw SSE URLs?
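
One way to approach the log-redaction part of question 7 is to strip everything that commonly carries credentials before an SSE URL reaches logs or telemetry. The sketch below is illustrative only: `redactSseUrl` is a hypothetical helper, and it assumes tokens live in the userinfo, query string, or fragment. A vendor that embeds tokens in the path would need a stricter rule.

```typescript
// Hypothetical helper: returns a log-safe form of an SSE URL by dropping
// the query string, the fragment, and any `user:password@` userinfo.
function redactSseUrl(rawUrl: string): string {
  // Cut at the first '?' or '#', whichever comes first.
  const cut = rawUrl.search(/[?#]/);
  const base = cut === -1 ? rawUrl : rawUrl.slice(0, cut);
  // Remove userinfo that directly follows the scheme, if present.
  return base.replace(/^([a-z][a-z0-9+.-]*:\/\/)[^\/?#@]*@/i, "$1");
}
```

Applied to the schema's example URL, `redactSseUrl("https://sse.example.com/event-stream?channels=env_abc123_v1")` keeps only `https://sse.example.com/event-stream`, which still identifies the endpoint without exposing the channel token.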

## Implementation Notes

- **Existing SSE libraries**: The LaunchDarkly open-source SSE client libraries ([Java](https://github.com/launchdarkly/okhttp-eventsource), [.NET](https://github.com/launchdarkly/dotnet-eventsource), [JavaScript](https://github.com/launchdarkly/js-eventsource), [Python](https://github.com/launchdarkly/python-eventsource)) are well-maintained and could be used by OFREP provider implementations. Browser environments can use the native `EventSource` API.
- **Static context provider guideline update**: The [static context provider guideline](../../guideline/static-context-provider.md) would need a new section describing SSE connection management alongside the existing polling section.
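
Whichever SSE client a provider uses, the decision from Open Question 1 (re-fetch only on `refetchEvaluation`) can be isolated in a small pure function that is easy to test. This is a sketch under an assumption: it treats the SSE `data` field as a JSON object carrying a `type` property, and `isRefetchMessage` is a hypothetical name.

```typescript
// Hypothetical helper: decides whether an SSE message should trigger a
// bulk evaluation re-fetch. Assumes `data` is JSON with a `type` field.
function isRefetchMessage(data: string): boolean {
  try {
    const parsed: unknown = JSON.parse(data);
    return (
      typeof parsed === "object" &&
      parsed !== null &&
      (parsed as { type?: unknown }).type === "refetchEvaluation"
    );
  } catch (_err) {
    // Malformed or non-JSON payloads are ignored rather than re-fetched.
    return false;
  }
}
```

Ignoring unknown or malformed messages keeps the provider forward compatible: future event types can be added to the stream without causing unnecessary re-fetches in older providers.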