perf(embedding): default embedding creation to base64 #1312

Merged: 11 commits, Mar 28, 2025
21 changes: 21 additions & 0 deletions src/core.ts
@@ -1287,6 +1287,27 @@ export const toBase64 = (str: string | null | undefined): string => {
throw new OpenAIError('Cannot generate b64 string; Expected `Buffer` or `btoa` to be defined');
};

/**
* Converts a Base64 encoded string to a Float32Array.
* @param base64Str - The Base64 encoded string.
* @returns An Array of numbers interpreted as Float32 values.
*/
export const toFloat32Array = (base64Str: string): Array<number> => {
if (typeof Buffer !== 'undefined') {
// for Node.js environment
return Array.from(new Float32Array(Buffer.from(base64Str, 'base64').buffer));
Collaborator

Curious if you've benchmarked how much of a difference just returning the Float32Array directly would make?

If it's a big difference, we should probably have an opt-in flag to do just that. (Doesn't block this PR.)

Contributor Author

Did a quick benchmark. Returning a Float32Array<ArrayBufferLike> buffer is 5.3x (81.2%) smaller than returning a number[]. Note that both objects contain the base64-encoded embeddings.

Does that mean devs would need to decode Float32Array<ArrayBufferLike> to number[] in userland code?

Commenter

This implementation seems to produce wrong results, see #1448

} else {
// for legacy web platform APIs
const binaryStr = atob(base64Str);
const len = binaryStr.length;
const bytes = new Uint8Array(len);
for (let i = 0; i < len; i++) {
bytes[i] = binaryStr.charCodeAt(i);
}
return Array.from(new Float32Array(bytes.buffer));
}
};

export function isObj(obj: unknown): obj is Record<string, unknown> {
return obj != null && typeof obj === 'object' && !Array.isArray(obj);
}
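
A later comment in the review thread reports that the Node.js branch of `toFloat32Array` seems to produce wrong results (#1448). One plausible cause, stated here as an assumption rather than something this page confirms: `Buffer.from(str, 'base64')` can return a view into a larger pooled `ArrayBuffer`, so `.buffer` may contain unrelated bytes and start at a non-zero `byteOffset`. A minimal sketch of a decoder that reads only the decoded bytes is below; `decodeBase64ToFloats` is a hypothetical name, not part of the library.

```typescript
// Hypothetical helper, not part of the openai-node API.
// Assumes a Node.js runtime where the global `Buffer` is available.
const NodeBuffer = (globalThis as any).Buffer;

function decodeBase64ToFloats(base64Str: string): number[] {
  const buf = NodeBuffer.from(base64Str, 'base64');
  if (buf.byteLength % Float32Array.BYTES_PER_ELEMENT !== 0) {
    throw new RangeError('Decoded byte length is not a multiple of 4');
  }
  // Copy exactly the decoded bytes into a fresh ArrayBuffer, so the
  // Float32Array never sees the rest of a shared allocation pool.
  const bytes: ArrayBuffer = buf.buffer.slice(buf.byteOffset, buf.byteOffset + buf.byteLength);
  return Array.from(new Float32Array(bytes));
}
```

The `slice` call costs one extra allocation. Constructing `new Float32Array(buf.buffer, buf.byteOffset, buf.byteLength / 4)` would avoid the copy, but it requires `byteOffset` to be a multiple of 4.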
42 changes: 40 additions & 2 deletions src/resources/embeddings.ts
@@ -9,9 +9,47 @@ export class Embeddings extends APIResource {
*/
create(
body: EmbeddingCreateParams,
- options?: Core.RequestOptions,
options?: Core.RequestOptions<EmbeddingCreateParams>,
): Core.APIPromise<CreateEmbeddingResponse> {
- return this._client.post('/embeddings', { body, ...options });
const hasUserProvidedEncodingFormat = !!body.encoding_format;
// No encoding_format specified, defaulting to base64 for performance reasons
// See https://github.com/openai/openai-node/pull/1312
let encoding_format: EmbeddingCreateParams['encoding_format'] =
hasUserProvidedEncodingFormat ? body.encoding_format : 'base64';

if (hasUserProvidedEncodingFormat) {
Core.debug('Request', 'User defined encoding_format:', body.encoding_format);
}

const response: Core.APIPromise<CreateEmbeddingResponse> = this._client.post('/embeddings', {
body: {
...body,
encoding_format: encoding_format as EmbeddingCreateParams['encoding_format'],
},
...options,
});

// if the user specified an encoding_format, return the response as-is
if (hasUserProvidedEncodingFormat) {
return response;
}

// in this stage, we are sure the user did not specify an encoding_format
// and we defaulted to base64 for performance reasons
// we are sure then that the response is base64 encoded, let's decode it
// the returned result will be a float32 array since this is OpenAI API's default encoding
Core.debug('response', 'Decoding base64 embeddings to float32 array');

return (response as Core.APIPromise<CreateEmbeddingResponse>)._thenUnwrap((response) => {
if (response && response.data) {
response.data.forEach((embeddingBase64Obj) => {
const embeddingBase64Str = embeddingBase64Obj.embedding as unknown as string;
embeddingBase64Obj.embedding = Core.toFloat32Array(embeddingBase64Str);
});
}

return response;
});
}
}
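
The branching in `create` above boils down to two decisions: which `encoding_format` to send, and whether the client should decode the response afterwards. A minimal sketch of that logic in isolation is below. Every name in it (`EmbeddingParams`, `ResolvedRequest`, `resolveEncodingFormat`) is hypothetical and exists only for illustration; it is not the SDK's code.

```typescript
// Hypothetical types and helper; a sketch of the defaulting logic only.
type EncodingFormat = 'float' | 'base64';

interface EmbeddingParams {
  input: string;
  model: string;
  encoding_format?: EncodingFormat;
}

interface ResolvedRequest {
  body: EmbeddingParams & { encoding_format: EncodingFormat };
  // True only when base64 was chosen on the caller's behalf.
  decodeResponse: boolean;
}

function resolveEncodingFormat(params: EmbeddingParams): ResolvedRequest {
  const userProvided = params.encoding_format != null;
  // Default to base64 when the caller did not choose a format.
  const encoding_format: EncodingFormat = params.encoding_format ?? 'base64';
  return {
    body: { ...params, encoding_format },
    decodeResponse: !userProvided,
  };
}
```

Under this sketch, a caller who explicitly passes `encoding_format: 'base64'` gets `decodeResponse: false`, which mirrors the "return the response as-is" branch in the diff.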

31 changes: 31 additions & 0 deletions tests/api-resources/embeddings.test.ts
@@ -32,4 +32,35 @@ describe('resource embeddings', () => {
user: 'user-1234',
});
});

test('create: encoding_format=float should create float32 embeddings', async () => {
const response = await client.embeddings.create({
input: 'The quick brown fox jumped over the lazy dog',
model: 'text-embedding-3-small',
encoding_format: 'float',
});

expect(response.data?.at(0)?.embedding).toBeInstanceOf(Array);
expect(Number.isFinite(response.data?.at(0)?.embedding.at(0))).toBe(true);
});

test('create: encoding_format=base64 should create float32 embeddings', async () => {
const response = await client.embeddings.create({
input: 'The quick brown fox jumped over the lazy dog',
model: 'text-embedding-3-small',
encoding_format: 'base64',
});

expect(response.data?.at(0)?.embedding).toBeInstanceOf(Array);
expect(Number.isFinite(response.data?.at(0)?.embedding.at(0))).toBe(true);
});

test('create: encoding_format=default should create float32 embeddings', async () => {
const response = await client.embeddings.create({
input: 'The quick brown fox jumped over the lazy dog',
model: 'text-embedding-3-small',
});

expect(response.data?.at(0)?.embedding).toBeInstanceOf(Array);
expect(Number.isFinite(response.data?.at(0)?.embedding.at(0))).toBe(true);
});
});
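
The tests above go through `client.embeddings.create`, so they only check that some finite number comes back. A decoder can also be checked offline against a known vector. The sketch below does a round trip with the `atob`/`btoa` path; `floatsToBase64` and `base64ToFloats` are hypothetical helpers, and the code assumes those globals exist (browsers, recent Node.js) and that the platform is little-endian.

```typescript
// Hypothetical test helpers, not part of the SDK.
// Float32Array uses native byte order, so this assumes a little-endian platform.
function floatsToBase64(values: number[]): string {
  const bytes = new Uint8Array(new Float32Array(values).buffer);
  let binary = '';
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

function base64ToFloats(base64Str: string): number[] {
  const binary = atob(base64Str);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return Array.from(new Float32Array(bytes.buffer));
}

// Values chosen to be exactly representable as 32-bit floats.
const fixture = [0.25, -2, 1.5];
const roundTripped = base64ToFloats(floatsToBase64(fixture));
```

Values that are not exactly representable in 32 bits come back rounded, so a comparison against arbitrary inputs would need `Math.fround` on the expected side.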