Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(client): add Realtime API support #1266

Merged
merged 4 commits into from
Jan 16, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
87 changes: 87 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -83,6 +83,93 @@ main();
If you need to cancel a stream, you can `break` from the loop
or call `stream.controller.abort()`.

## Realtime API beta

The Realtime API enables you to build low-latency, multi-modal conversational experiences. It currently supports text and audio as both input and output, as well as [function calling](https://platform.openai.com/docs/guides/function-calling) through a `WebSocket` connection.

The Realtime API works through a combination of client-sent events and server-sent events. Clients can send events to do things like update session configuration or send text and audio inputs. Server events confirm when audio responses have completed, or when a text response from the model has been received. A full event reference can be found [here](https://platform.openai.com/docs/api-reference/realtime-client-events) and a guide can be found [here](https://platform.openai.com/docs/guides/realtime).

This SDK supports accessing the Realtime API through the [WebSocket API](https://developer.mozilla.org/en-US/docs/Web/API/WebSocket) or with [ws](https://github.com/websockets/ws).

Basic text based example with `ws`:

```ts
// requires `yarn add ws @types/ws`
import { OpenAIRealtimeWS } from 'openai/beta/realtime/ws';

const rt = new OpenAIRealtimeWS({ model: 'gpt-4o-realtime-preview-2024-12-17' });

// access the underlying `ws.WebSocket` instance
rt.socket.on('open', () => {
console.log('Connection opened!');
rt.send({
type: 'session.update',
session: {
modalities: ['text'],
model: 'gpt-4o-realtime-preview',
},
});

rt.send({
type: 'conversation.item.create',
item: {
type: 'message',
role: 'user',
content: [{ type: 'input_text', text: 'Say a couple paragraphs!' }],
},
});

rt.send({ type: 'response.create' });
});

rt.on('error', (err) => {
// in a real world scenario this should be logged somewhere as you
// likely want to continue procesing events regardless of any errors
throw err;
});

rt.on('session.created', (event) => {
console.log('session created!', event.session);
console.log();
});

rt.on('response.text.delta', (event) => process.stdout.write(event.delta));
rt.on('response.text.done', () => console.log());

rt.on('response.done', () => rt.close());

rt.socket.on('close', () => console.log('\nConnection closed!'));
```

To use the web API `WebSocket` implementation, replace `OpenAIRealtimeWS` with `OpenAIRealtimeWebSocket` and adjust any `rt.socket` access:

```ts
import { OpenAIRealtimeWebSocket } from 'openai/beta/realtime/websocket';

const rt = new OpenAIRealtimeWebSocket({ model: 'gpt-4o-realtime-preview-2024-12-17' });
// ...
rt.socket.addEventListener('open', () => {
// ...
});
```

A full example can be found [here](https://github.com/openai/openai-node/blob/master/examples/realtime/web.ts).

### Realtime error handling

When an error is encountered, either on the client side or returned from the server through the [`error` event](https://platform.openai.com/docs/guides/realtime/realtime-api-beta#handling-errors), the `error` event listener will be fired. However, if you haven't registered an `error` event listener then an `unhandled Promise rejection` error will be thrown.

It is **highly recommended** that you register an `error` event listener and handle errors approriately as typically the underlying connection is still usable.

```ts
const rt = new OpenAIRealtimeWS({ model: 'gpt-4o-realtime-preview-2024-12-17' });
rt.on('error', (err) => {
// in a real world scenario this should be logged somewhere as you
// likely want to continue procesing events regardless of any errors
throw err;
});
```

### Request & Response types

This library includes TypeScript definitions for all request params and response fields. You may import and use them like so:
Expand Down
7 changes: 4 additions & 3 deletions examples/package.json
Original file line number Diff line number Diff line change
Expand Up @@ -6,14 +6,15 @@
"license": "MIT",
"private": true,
"dependencies": {
"@azure/identity": "^4.2.0",
"express": "^4.18.2",
"next": "^14.1.1",
"openai": "file:..",
"zod-to-json-schema": "^3.21.4",
"@azure/identity": "^4.2.0"
"zod-to-json-schema": "^3.21.4"
},
"devDependencies": {
"@types/body-parser": "^1.19.3",
"@types/express": "^4.17.19"
"@types/express": "^4.17.19",
"@types/web": "^0.0.194"
}
}
48 changes: 48 additions & 0 deletions examples/realtime/websocket.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
import { OpenAIRealtimeWebSocket } from 'openai/beta/realtime/websocket';

async function main() {
const rt = new OpenAIRealtimeWebSocket({ model: 'gpt-4o-realtime-preview-2024-12-17' });

// access the underlying `ws.WebSocket` instance
rt.socket.addEventListener('open', () => {
console.log('Connection opened!');
rt.send({
type: 'session.update',
session: {
modalities: ['text'],
model: 'gpt-4o-realtime-preview',
},
});

rt.send({
type: 'conversation.item.create',
item: {
type: 'message',
role: 'user',
content: [{ type: 'input_text', text: 'Say a couple paragraphs!' }],
},
});

rt.send({ type: 'response.create' });
});

rt.on('error', (err) => {
// in a real world scenario this should be logged somewhere as you
// likely want to continue procesing events regardless of any errors
throw err;
});

rt.on('session.created', (event) => {
console.log('session created!', event.session);
console.log();
});

rt.on('response.text.delta', (event) => process.stdout.write(event.delta));
rt.on('response.text.done', () => console.log());

rt.on('response.done', () => rt.close());

rt.socket.addEventListener('close', () => console.log('\nConnection closed!'));
}

main();
55 changes: 55 additions & 0 deletions examples/realtime/ws.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
import { OpenAIRealtimeWS } from 'openai/beta/realtime/ws';

async function main() {
const rt = new OpenAIRealtimeWS({ model: 'gpt-4o-realtime-preview-2024-12-17' });

// access the underlying `ws.WebSocket` instance
rt.socket.on('open', () => {
console.log('Connection opened!');
rt.send({
type: 'session.update',
session: {
modalities: ['foo'] as any,
model: 'gpt-4o-realtime-preview',
},
});
rt.send({
type: 'session.update',
session: {
modalities: ['text'],
model: 'gpt-4o-realtime-preview',
},
});

rt.send({
type: 'conversation.item.create',
item: {
type: 'message',
role: 'user',
content: [{ type: 'input_text', text: 'Say a couple paragraphs!' }],
},
});

rt.send({ type: 'response.create' });
});

rt.on('error', (err) => {
// in a real world scenario this should be logged somewhere as you
// likely want to continue procesing events regardless of any errors
throw err;
});

rt.on('session.created', (event) => {
console.log('session created!', event.session);
console.log();
});

rt.on('response.text.delta', (event) => process.stdout.write(event.delta));
rt.on('response.text.done', () => console.log());

rt.on('response.done', () => rt.close());

rt.socket.on('close', () => console.log('\nConnection closed!'));
}

main();
6 changes: 6 additions & 0 deletions package.json
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,7 @@
"@swc/core": "^1.3.102",
"@swc/jest": "^0.2.29",
"@types/jest": "^29.4.0",
"@types/ws": "^8.5.13",
"@typescript-eslint/eslint-plugin": "^6.7.0",
"@typescript-eslint/parser": "^6.7.0",
"eslint": "^8.49.0",
Expand All @@ -52,6 +53,7 @@
"tsc-multi": "^1.1.0",
"tsconfig-paths": "^4.0.0",
"typescript": "^4.8.2",
"ws": "^8.18.0",
"zod": "^3.23.8"
},
"sideEffects": [
Expand Down Expand Up @@ -126,9 +128,13 @@
},
"bin": "./bin/cli",
"peerDependencies": {
"ws": "^8.18.0",
"zod": "^3.23.8"
},
"peerDependenciesMeta": {
"ws": {
"optional": true
},
"zod": {
"optional": true
}
Expand Down
1 change: 1 addition & 0 deletions src/beta/realtime/index.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
export { OpenAIRealtimeError } from './internal-base';
83 changes: 83 additions & 0 deletions src/beta/realtime/internal-base.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,83 @@
import { RealtimeClientEvent, RealtimeServerEvent, ErrorEvent } from '../../resources/beta/realtime/realtime';
import { EventEmitter } from '../../lib/EventEmitter';
import { OpenAIError } from '../../error';

export class OpenAIRealtimeError extends OpenAIError {
/**
* The error data that the API sent back in an `error` event.
*/
error?: ErrorEvent.Error | undefined;

/**
* The unique ID of the server event.
*/
event_id?: string | undefined;

constructor(message: string, event: ErrorEvent | null) {
super(message);

this.error = event?.error;
this.event_id = event?.event_id;
}
}

type Simplify<T> = { [KeyType in keyof T]: T[KeyType] } & {};

type RealtimeEvents = Simplify<
{
event: (event: RealtimeServerEvent) => void;
error: (error: OpenAIRealtimeError) => void;
} & {
[EventType in Exclude<RealtimeServerEvent['type'], 'error'>]: (
event: Extract<RealtimeServerEvent, { type: EventType }>,
) => unknown;
}
>;

export abstract class OpenAIRealtimeEmitter extends EventEmitter<RealtimeEvents> {
/**
* Send an event to the API.
*/
abstract send(event: RealtimeClientEvent): void;

/**
* Close the websocket connection.
*/
abstract close(props?: { code: number; reason: string }): void;

protected _onError(event: null, message: string, cause: any): void;
protected _onError(event: ErrorEvent, message?: string | undefined): void;
protected _onError(event: ErrorEvent | null, message?: string | undefined, cause?: any): void {
message =
event?.error ?
`${event.error.message} code=${event.error.code} param=${event.error.param} type=${event.error.type} event_id=${event.error.event_id}`
: message ?? 'unknown error';

if (!this._hasListener('error')) {
const error = new OpenAIRealtimeError(
message +
`\n\nTo resolve these unhandled rejection errors you should bind an \`error\` callback, e.g. \`rt.on('error', (error) => ...)\` `,
event,
);
// @ts-ignore
error.cause = cause;
Promise.reject(error);
return;
}

const error = new OpenAIRealtimeError(message, event);
// @ts-ignore
error.cause = cause;

this._emit('error', error);
}
}

export function buildRealtimeURL(props: { baseURL: string; model: string }): URL {
const path = '/realtime';

const url = new URL(props.baseURL + (props.baseURL.endsWith('/') ? path.slice(1) : path));
url.protocol = 'wss';
url.searchParams.set('model', props.model);
return url;
}
Loading
Loading