I think we've hit performance trouble from this at Pydantic, at service boundaries where we pass IPC streams. We will either add a mitigation in our own code or contribute a fix here.
Also, is this problem relevant for the Parquet writer too?
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
The IPC writer typically tries to avoid writing unreferenced data, e.g. by re-encoding offsets for StringArray, ListArray, etc.; see write_array_data.
Currently the IPC writer writes view data types as is, potentially including significant unreferenced data.
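To illustrate the problem, here is a minimal std-only sketch. It is a simplified model, not arrow-rs code: the `View` and `ViewArray` types and their methods are hypothetical, and real Arrow views also inline short values, which is omitted here. It shows that slicing a view array narrows the views but keeps every data buffer, so writing the buffers as is carries bytes that no view references.

```rust
/// Simplified stand-in for an Arrow view: where the bytes of one value live.
/// (Assumption: short-value inlining in real views is ignored.)
#[derive(Clone, Copy, Debug, PartialEq)]
struct View {
    buffer: usize, // index into the data buffers
    offset: usize, // start of the value inside that buffer
    len: usize,    // length of the value in bytes
}

/// Hypothetical model of a view array: views plus the data buffers they point into.
struct ViewArray {
    views: Vec<View>,
    buffers: Vec<Vec<u8>>,
}

impl ViewArray {
    /// Pack all values back to back into a single data buffer.
    fn from_strs(values: &[&str]) -> Self {
        let mut data = Vec::new();
        let mut views = Vec::new();
        for v in values {
            views.push(View { buffer: 0, offset: data.len(), len: v.len() });
            data.extend_from_slice(v.as_bytes());
        }
        ViewArray { views, buffers: vec![data] }
    }

    /// Slicing narrows the views but keeps every data buffer.
    /// (The model clones the buffers; a real array would share them.)
    fn slice(&self, start: usize, len: usize) -> Self {
        ViewArray {
            views: self.views[start..start + len].to_vec(),
            buffers: self.buffers.clone(),
        }
    }

    /// Bytes of the i-th value, resolved through its view.
    fn value(&self, i: usize) -> &[u8] {
        let v = self.views[i];
        &self.buffers[v.buffer][v.offset..v.offset + v.len]
    }

    /// Bytes a writer would emit if it wrote the data buffers as is.
    fn buffer_bytes(&self) -> usize {
        self.buffers.iter().map(Vec::len).sum()
    }

    /// Bytes actually reachable from the views (assumes views do not overlap).
    fn referenced_bytes(&self) -> usize {
        self.views.iter().map(|v| v.len).sum()
    }
}
```

In this model, taking a one-element slice of a four-element array leaves `buffer_bytes()` unchanged while `referenced_bytes()` drops to the length of that single value.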
Describe the solution you'd like
I think by default the IPC writer should at least invoke the GC API on view arrays, and potentially deduplicate their data. We could make this configurable.
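A rough sketch of what such a compaction step could look like, again on a simplified std-only model rather than the actual arrow-rs GC API: the `View` type, the `compact` function, and the `dedup` flag are all hypothetical names chosen for illustration. It copies only the referenced bytes into a fresh buffer, rewrites the views, and optionally lets identical values share one copy.

```rust
use std::collections::HashMap;

/// Simplified stand-in for an Arrow view (assumption: no inlined short values).
#[derive(Clone, Copy, Debug, PartialEq)]
struct View {
    buffer: usize, // index into the data buffers
    offset: usize, // start of the value inside that buffer
    len: usize,    // length of the value in bytes
}

/// Copy only the bytes reachable from `views` into one new buffer and
/// return the rewritten views together with that buffer.
/// With `dedup` set, identical values point at a single shared copy.
fn compact(views: &[View], buffers: &[Vec<u8>], dedup: bool) -> (Vec<View>, Vec<u8>) {
    let mut out: Vec<u8> = Vec::new();
    // Maps value bytes (borrowed from the input buffers) to their offset in `out`.
    let mut seen: HashMap<&[u8], usize> = HashMap::new();
    let mut new_views = Vec::with_capacity(views.len());

    for v in views {
        let bytes: &[u8] = &buffers[v.buffer][v.offset..v.offset + v.len];
        let existing = if dedup { seen.get(&bytes).copied() } else { None };
        let offset = match existing {
            Some(o) => o, // reuse the copy written earlier
            None => {
                let o = out.len();
                out.extend_from_slice(bytes);
                if dedup {
                    seen.insert(bytes, o);
                }
                o
            }
        };
        // All rewritten views point into the single new buffer (index 0).
        new_views.push(View { buffer: 0, offset, len: v.len });
    }
    (new_views, out)
}
```

A configurable writer could choose between writing buffers as is, `compact(.., false)`, and `compact(.., true)`, trading extra copying and hashing on write against a smaller stream.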
Describe alternatives you've considered
Additional context
Potentially some overlap with "interleave kernel with deep copy buffers for StringViewArray" (#7184).
FYI @alamb @XiangpengHao