Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inserting Buffer / ArrayBuffer values for BYTEA fields are corrupt under HTTP client #118

Closed
andyjy opened this issue Nov 11, 2024 · 2 comments · Fixed by #119
Closed

Inserting Buffer / ArrayBuffer values for BYTEA fields are corrupt under HTTP client #118

andyjy opened this issue Nov 11, 2024 · 2 comments · Fixed by #119
Assignees

Comments

@andyjy
Copy link
Contributor

andyjy commented Nov 11, 2024

Inserting binary data by passing a Buffer or ArrayBuffer value works fine under the webhook client (handled by the underlying pg implementation), but is unimplemented under the http client and silently breaks, instead inserting corrupted values representing the result of causing JSON.stringify() on the Buffer value.

Steps to reproduce

CREATE TABLE example (
    binary_data BYTEA
);
const sql = neon(DATABASE_URL);
await sql`INSERT INTO example VALUES (${Buffer.from('test')})`

Expected result

SELECT CONVERT_FROM(binary_data, 'utf8') FROM example;
> 'test'

Actual result

SELECT CONVERT_FROM(binary_data, 'utf8') FROM example;
> '{"data":[xx, xx, xx...],"format":"Buffer"}'
@jawj
Copy link
Collaborator

jawj commented Nov 11, 2024

Thanks for the thorough report. I can see this is a problem. Let me see if this is best solved on the front- or back-end.

@jawj
Copy link
Collaborator

jawj commented Nov 12, 2024

OK. I wanted to figure out where this issue came from. Note that we use the buffer npm package in our driver to shim Buffer for platforms that aren't Node.

We call prepareValues on query parameters for HTTP queries: https://github.com/brianc/node-postgres/blob/95d7e620ef8b51743b4cbca05dd3c3ce858ecea7/packages/pg/lib/utils.js#L47

This leaves Buffer objects well alone, and it turns typed arrays (such as Uint8Array) into Buffer objects too (using the test ArrayBuffer.isView).

As an aside, ArrayBuffer objects get no special handling: they're identified only as objects and get inserted as the 2 bytes{} even over TCP/WebSockets. It might be nice to fix that, but my key priority is node-postgres compatibility, so I'm going to park this idea for now.

When querying over WebSockets, the prepared Buffer objects are then handled specially in the Postgres protocol/serializer code, so everything works out: https://github.com/brianc/node-postgres/blob/95d7e620ef8b51743b4cbca05dd3c3ce858ecea7/packages/pg-protocol/src/serializer.ts#L123.

But over HTTP the Buffer objects just get passed to JSON.stringify, which uses toJSON() to turn them into an object like { type: 'Buffer', data: [ 0 ] }.

So: I think we do need to handle these objects specially for HTTP on the client side. We can't do this nicely on the back-end, since it will be hard at that point to distinguish the stringified data from any other string.

What you've done in #119 looks pretty sensible on this score, so many thanks.

I've also investigated how the buffer package handles toString('hex'), and it's fine but not brilliant. I'm sounding them out about making it 2 – 4x quicker: feross/buffer#364

@jawj jawj closed this as completed in #119 Nov 22, 2024
jawj pushed a commit that referenced this issue Nov 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants