-
-
Notifications
You must be signed in to change notification settings - Fork 5.7k
Handle large read/write/sendfile to File; return actual bytes written #60321
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
633307e to
0dae450
Compare
The return value for the `jl_fs_write/read/sendfile` functions is an `int`, even through they all take `size_t` length arguments, resulting in potentially truncated counts for large file operations. Instead, on success (including partial read/write), we should return the `uv_fs_t.result` field. I have added a test for this but have left it disabled because it allocates a 5 GiB Vector to read into. One consequence of the truncated return values is that we previously mangled large files while copying them with sendfile. If the return value is truncated to a random negative value, it manifests as JuliaLang#30723 or JuliaLang#39868. If it is truncated to a positive value, sendfile will create a very large destination file with thousands of repeated sections of the source file, causing JuliaLang#56537. The bug will only manifest if LLVM decides to zero-extend the return value. This also changes `jl_fs_write` to return the actual number of bytes written, rather than the number of bytes it was asked to write. If a write returns EAGAIN (usually because the file is O_NONBLOCK/O_NODELAY), we never see the actual number of bytes written.
5034aab to
e3f29f1
Compare
| { | ||
| uv_fs_t req; | ||
| JL_SIGATOMIC_BEGIN(); | ||
| int ret = uv_fs_sendfile(unused_uv_loop_arg, &req, dst_fd, src_fd, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| int ret = uv_fs_sendfile(unused_uv_loop_arg, &req, dst_fd, src_fd, | |
| ssize_t ret = uv_fs_sendfile(unused_uv_loop_arg, &req, dst_fd, src_fd, |
and same below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand the suggestion. Does this avoid a bad implicit conversion later on?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's more for consistency: the function is supposed to return an ssize_t now, but the value returned is an int
The return value for the
jl_fs_write/read/sendfilefunctions is anint, even through they all takesize_tlength arguments, resulting in potentially truncated counts for large file operations. Instead, on success (including partial read/write), we should return theuv_fs_t.resultfield.I have added a test for this but have left it disabled because it allocates a 5 GiB Vector to read into.
One consequence of the truncated return values is that we previously mangled large files while copying them with sendfile. If the return value is truncated to a random negative value, it manifests as #30723 or #39868. If it is truncated to a positive value, sendfile will create a very large destination file with thousands of repeated sections of the source file, causing #56537. The bug will only manifest if LLVM decides to zero-extend the return value.
This also changes
jl_fs_writeto return the actual number of bytes written, rather than the number of bytes it was asked to write. If a write returns EAGAIN (usually because the file is O_NONBLOCK/O_NODELAY), we never see the actual number of bytes written.