-
Notifications
You must be signed in to change notification settings - Fork 5.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix download workspace zip file event loop hanging #6722
base: main
Are you sure you want to change the base?
Fix download workspace zip file event loop hanging #6722
Conversation
@@ -21,12 +20,13 @@ | |||
|
|||
from fastapi import Depends, FastAPI, HTTPException, Request, UploadFile | |||
from fastapi.exceptions import RequestValidationError | |||
from fastapi.responses import JSONResponse, StreamingResponse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dont try to hand build a StreamingResponse
just let FileResponse
handle it, it's already specialized for sending large files
05a2aaf
to
ecb7d53
Compare
for root, _, files in os.walk(path): | ||
for file in files: | ||
file_path = os.path.join(root, file) | ||
zipf.write( | ||
file_path, arcname=os.path.relpath(file_path, path) | ||
) | ||
temp_zip.seek(0) # Rewind the file to the beginning after writing | ||
content = temp_zip.read() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is bad, seek(0), read()
just brings the entire file back in memory, won't scale to workspace with 1GB+ of files
ecb7d53
to
1bf4bfa
Compare
A bunch of problems with the whole download workspace:
action_execution_server
the entire zip operation (compute intensive, blocking IO), and the entire zip file was read back in memory, on theasync def ...
route handler, blocking that fastapi event loop, fix this by running that route as a sync handler, and not reading the file back in-mem, letFileResponse
handle streaming it back out asynchronouslyaction_execution_client
theres unnecessary looping of the content stream, just letshutil.copyfileobj
handle streaming that raw socket out to the temp filezip_current_workspace
even though its an async handler, it immediately throws the real work onto call_sync_from_async, so might as well just run this route handler entirely on the fastAPI blocking threadpool instead, fix the mimetype of the zip file to the right standard oneFileResponse
object, so when it finishes streaming, it will delete the temp fileVerified by adding a 1GB random incompressible file into the workspace and hitting the
/zip-directory
endpoint to retrieve the zip. The only issue remaining is that the entire workspace zip file has to be fully generated before the download stream begins. Ideally, it would be streaming the zip the entire way through incrementally as files are compressed.