Fix download workspace zip file event loop hanging #6722

Open
wants to merge 3 commits into main from fix-download-file-event-loop-hang


@diwu-sf (Contributor) commented Feb 14, 2025

This fixes several problems with the download-workspace flow:

  • In action_execution_server, the entire zip operation (compute-intensive, blocking IO) ran inside the async def ... route handler, and the whole zip file was then read back into memory, blocking the FastAPI event loop. Fix this by running that route as a sync handler and by not reading the file back into memory; let FileResponse stream it back out asynchronously.
  • In action_execution_client, there's an unnecessary loop over the content stream. Just let shutil.copyfileobj stream the raw socket out to the temp file.
  • In zip_current_workspace, even though it's an async handler, it immediately hands the real work to call_sync_from_async, so the route handler might as well run entirely on the FastAPI blocking threadpool. Also fix the zip file's MIME type to the standard one.
  • Attach the background unlink directly to the FileResponse object, so the temp file is deleted as soon as the response finishes streaming.
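The client-side change in the second bullet can be sketched with the standard library alone. This is a minimal illustration, not the PR's actual code: `save_stream_to_temp_file` is a hypothetical helper name, and the `io.BytesIO` object stands in for the raw response stream that the real client would read from.

```python
import io
import shutil
import tempfile


def save_stream_to_temp_file(stream, suffix=".zip"):
    """Copy a readable binary stream to a temp file and return the file's path.

    Hypothetical helper for illustration; the name is not from the PR.
    """
    # delete=False keeps the file after the handle closes; the caller
    # is responsible for removing it once it has been served.
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as temp_file:
        # copyfileobj moves the data in fixed-size chunks, so memory use
        # stays bounded no matter how large the payload is.
        shutil.copyfileobj(stream, temp_file)
        return temp_file.name


# Stand-in for a raw response body (assumption: any object with .read()).
fake_body = io.BytesIO(b"x" * 1_000_000)
saved_path = save_stream_to_temp_file(fake_body)
```

Because `shutil.copyfileobj` already does the chunked read/write loop, there is nothing left for a hand-written loop to add.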

Verified by adding a 1GB random, incompressible file to the workspace and hitting the /zip-directory endpoint to retrieve the zip. The only remaining issue is that the entire workspace zip file has to be fully generated before the download stream begins. Ideally, the zip would be streamed incrementally as files are compressed.
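For the remaining issue, one possible direction (not implemented in this PR) is to have `zipfile` write into a non-seekable sink and hand out bytes as they are produced. The sketch below assumes Python 3.8+; `_ChunkSink` and `iter_zip_chunks` are hypothetical names, and wiring the generator into a streaming HTTP response is left out.

```python
import os
import zipfile


class _ChunkSink:
    """Write-only, non-seekable sink that queues whatever zipfile writes."""

    def __init__(self):
        self._chunks = []

    def write(self, data):
        self._chunks.append(bytes(data))
        return len(data)

    def flush(self):
        pass

    def drain(self):
        # Hand back everything written so far and reset the queue.
        chunks, self._chunks = self._chunks, []
        return chunks


def iter_zip_chunks(path, chunk_size=64 * 1024):
    """Yield the bytes of a zip of `path` while it is still being built."""
    sink = _ChunkSink()
    # The sink has no tell()/seek(), so zipfile falls back to its
    # unseekable-stream mode and never needs to rewind.
    with zipfile.ZipFile(sink, "w", zipfile.ZIP_DEFLATED) as zipf:
        for root, _, files in os.walk(path):
            for file in files:
                file_path = os.path.join(root, file)
                zinfo = zipfile.ZipInfo.from_file(
                    file_path, arcname=os.path.relpath(file_path, path)
                )
                zinfo.compress_type = zipfile.ZIP_DEFLATED
                with open(file_path, "rb") as src, zipf.open(zinfo, "w") as dest:
                    while block := src.read(chunk_size):
                        dest.write(block)
                        yield from sink.drain()
                # Flush what closing the entry wrote (compressor tail).
                yield from sink.drain()
    # Closing the archive writes the central directory last.
    yield from sink.drain()
```

Memory use is bounded by `chunk_size` plus the compressor's buffers rather than by the workspace size. The trade-off is that the total size is unknown up front, so a response built on this could not send a Content-Length header.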

@@ -21,12 +20,13 @@

from fastapi import Depends, FastAPI, HTTPException, Request, UploadFile
from fastapi.exceptions import RequestValidationError
from fastapi.responses import JSONResponse, StreamingResponse
Contributor Author

Don't try to hand-build a StreamingResponse; just let FileResponse handle it. It's already specialized for sending large files.

@diwu-sf force-pushed the fix-download-file-event-loop-hang branch from 05a2aaf to ecb7d53 on February 14, 2025 07:38
for root, _, files in os.walk(path):
    for file in files:
        file_path = os.path.join(root, file)
        zipf.write(
            file_path, arcname=os.path.relpath(file_path, path)
        )
temp_zip.seek(0)  # Rewind the file to the beginning after writing
content = temp_zip.read()
Contributor Author

This is bad: seek(0) followed by read() brings the entire file back into memory, which won't scale to workspaces with 1GB+ of files.
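A minimal sketch of the alternative, assuming the same os.walk loop as the snippet above: write the archive to a temp file on disk and return only its path, which a response object such as FileResponse can then stream. `zip_directory_to_temp_file` is a hypothetical name, and deleting the file afterwards is left to the caller.

```python
import os
import tempfile
import zipfile


def zip_directory_to_temp_file(path):
    """Zip everything under `path` into a temp file and return that file's path.

    Hypothetical helper for illustration; the archive is never read back
    into memory here.
    """
    fd, zip_path = tempfile.mkstemp(suffix=".zip")
    os.close(fd)  # zipfile reopens the file by name below
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zipf:
        for root, _, files in os.walk(path):
            for file in files:
                file_path = os.path.join(root, file)
                # Store paths relative to the workspace root.
                zipf.write(file_path, arcname=os.path.relpath(file_path, path))
    return zip_path
```

Returning a path instead of bytes keeps peak memory independent of workspace size; only zipfile's internal buffers are held at any time.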

@diwu-sf force-pushed the fix-download-file-event-loop-hang branch from ecb7d53 to 1bf4bfa on February 14, 2025 07:46
@enyst requested a review from tofarr on February 14, 2025 22:22