-
Notifications
You must be signed in to change notification settings - Fork 26
perf(push): add parallel HTTP requests and compression optimization #63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- Add parallel execution utility (ThreadPoolExecutor) for concurrent tasks - Add pigz-based parallel compression with automatic fallback to tarfile - Execute check_env() and state_environment() concurrently in push command - Run compression and signed URL fetch in parallel during upload - Add AENV_DISABLE_PARALLEL env var to disable all optimizations Performance improvement: ~34% faster push time in real-world testing Co-Authored-By: Claude (claude-opus-4-5) <[email protected]>
Summary of ChangesHello @puzhen-ryan, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces significant performance enhancements to the Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces significant performance improvements to the push command by parallelizing network requests and compression. The implementation of parallel utilities for task execution and compression is well-done and includes fallbacks for robustness. The changes are accompanied by a good set of unit and integration tests.
My review focuses on a few areas for improvement:
- Error handling in the parallel workflow to ensure safe operation.
- A minor bug in a new test file.
- A potential regression in flexibility in one of the modified classes.
Overall, this is a great enhancement. Addressing these points will make the implementation even more robust.
| if state_result and state_result.success: | ||
| env_state = EnvStatus.parse_state(state_result.result) | ||
| if env_state.running() and not force: | ||
| click.echo("❌ Environment is being prepared, use --force to overwrite") | ||
| raise click.Abort() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the state_env task fails, the check for a running environment is skipped. This could lead to unintentionally overwriting a running environment without the --force flag because the failure is handled silently. For safety, the operation should be aborted if the state of an existing environment cannot be determined.
| if state_result and state_result.success: | |
| env_state = EnvStatus.parse_state(state_result.result) | |
| if env_state.running() and not force: | |
| click.echo("❌ Environment is being prepared, use --force to overwrite") | |
| raise click.Abort() | |
| if state_result and state_result.success: | |
| env_state = EnvStatus.parse_state(state_result.result) | |
| if env_state.running() and not force: | |
| click.echo("❌ Environment is being prepared, use --force to overwrite") | |
| raise click.Abort() | |
| else: | |
| click.echo("❌ Could not determine the state of the existing environment. Aborting to prevent accidental overwrite.", err=True) | |
| if state_result and state_result.error: | |
| raise state_result.error | |
| raise click.Abort() |
| yield tmpdir | ||
|
|
||
|
|
||
| from pathlib import Path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| if archive_path and os.path.exists(archive_path): | ||
| os.unlink(archive_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The finally block correctly cleans up the temporary archive file. However, it's worth noting that os.unlink can raise an exception (e.g., PermissionError). While unlikely in this context, wrapping it in a try...except block would make the cleanup even more robust and prevent a cleanup failure from masking the original exception if one occurred in the try block.
| if archive_path and os.path.exists(archive_path): | |
| os.unlink(archive_path) | |
| if archive_path and os.path.exists(archive_path): | |
| try: | |
| os.unlink(archive_path) | |
| except OSError: | |
| pass |
| if self.use_parallel: | ||
| from cli.utils.compression import pack_directory_parallel | ||
|
|
||
| self.archive_path = pack_directory_parallel( | ||
| self.source_dir, | ||
| exclude_patterns=self.pack_kwargs.get("exclude_patterns"), | ||
| use_parallel=True, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When use_parallel=True, TempArchive only passes exclude_patterns to pack_directory_parallel. This is a reduction in flexibility compared to the non-parallel path, which passes all **self.pack_kwargs. Other arguments like compression_level would be ignored. While this works for the current usage, it makes the class's interface inconsistent and less reusable.
JacksonMei
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Performance improvement: ~34% faster push time in real-world testing:
aenv pushAENV_DISABLE_PARALLEL=1 aenv pushaenv push(v0.1.4)