Open
Description
When a pip subprocess fails and the command is passed as a short list (e.g. [python, '-m', 'pip', 'install', 'pkg']), the exception handler in sdks/python/apache_beam/utils/processes.py raises IndexError instead of the intended RuntimeError with traceback and pip output.
Root cause
The pip-specific branch in call, check_call, and check_output uses a hardcoded index 6 for the "package name" when formatting the error message:
    if isinstance(args, tuple) and (args[0][2] == "pip"):
      raise RuntimeError(
          "Full traceback: {}\n Pip install failed for package: {} \n Output from execution of subprocess: {}"
          .format(traceback.format_exc(), args[0][6], error.output)) from error
- For ['python', '-m', 'pip', 'install', 'somepkg'] the list has only 5 elements (indices 0–4), so args[0][6] raises IndexError.
- The "friendly" pip error path is never shown; users see an IndexError instead.
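The out-of-range lookup can be seen without Beam installed. This is a minimal sketch, assuming the handler receives its positional arguments as a tuple whose first element is the command list, as the args[0][2] check in the snippet above implies. The variable names are illustrative only.

```python
# Positional arguments as the handler would see them: a tuple whose
# first element is the command list (assumption based on args[0][2]).
args = (['python', '-m', 'pip', 'install', 'somepkg'],)

cmd = args[0]
# Same condition as the pip branch: this part succeeds for a short command.
is_pip = isinstance(args, tuple) and cmd[2] == "pip"

try:
    package = cmd[6]  # same hardcoded index used when formatting the message
    raised = None
except IndexError as exc:
    package = None
    raised = exc

print(is_pip, len(cmd), type(raised).__name__)
```

The pip check passes, so the branch is entered, and the lookup at index 6 then fails before the RuntimeError can be constructed.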
Additional problem
Even when index 6 exists (e.g. stager’s pip download -r requirements_file with many args), that index may not be a package name (e.g. it can be --find-links). The message "Pip install failed for package: --find-links" is misleading.
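A small illustration of the misleading message. The command below is hypothetical, built only so that index 6 lands on an option; it is not claimed to be the exact command the stager issues.

```python
# Hypothetical long pip command (illustrative paths and flags only).
cmd = [
    'python', '-m', 'pip', 'download',
    '--dest', '/tmp/cache',
    '--find-links', '/tmp/cache',
    '-r', 'requirements.txt',
]

reported = cmd[6]  # what the hardcoded index would pick up
message = "Pip install failed for package: {}".format(reported)
print(message)
```

Here the index exists, so no IndexError is raised, but the value reported as the "package" is an option flag.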
Steps to reproduce
- Use apache_beam.utils.processes.check_call (or check_output / call) with a short pip command that fails:
    from apache_beam.utils import processes
    # Short pip command (5 elements) that will fail (nonexistent package)
    cmd = ['python', '-m', 'pip', 'install', 'nonexistent-package-xyz']
    processes.check_call(cmd)
- When pip fails (e.g. package not found), the code hits the pip branch and formats the message with args[0][6].
- Actual: IndexError: list index out of range (index 6 does not exist).
- Expected: A RuntimeError whose message includes the full traceback and pip subprocess output (no IndexError).
Expected behavior
- When a pip subprocess fails, the code should always raise a RuntimeError (with from error) whose message includes:
  - The full traceback
  - Useful context (e.g. that it was a pip failure; package name only when it can be determined safely)
  - The subprocess output (error.output)
- No IndexError should occur regardless of the length or shape of the command list.
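One possible shape for a fix is sketched below. This is not the project's actual patch: the helper names is_pip_command and pip_failure_message are hypothetical, and the rule for naming a package (only for an exact `<python> -m pip install <name>` command) is an assumption chosen to be conservative.

```python
import traceback


def is_pip_command(args):
    """Return True if args looks like (['<python>', '-m', 'pip', ...], ...)."""
    if not isinstance(args, tuple) or not args:
        return False
    cmd = args[0]
    return isinstance(cmd, (list, tuple)) and len(cmd) > 2 and cmd[2] == "pip"


def pip_failure_message(args, output, tb=None):
    """Build the error text without indexing past the end of the command."""
    cmd = [str(part) for part in args[0]]
    if tb is None:
        tb = traceback.format_exc()
    lines = [
        "Full traceback: {}".format(tb),
        "Pip command failed: {}".format(" ".join(cmd)),
    ]
    # Name a package only in the unambiguous case: `... pip install <name>`.
    if len(cmd) == 5 and cmd[3] == "install" and not cmd[4].startswith("-"):
        lines.append("Package: {}".format(cmd[4]))
    lines.append("Output from execution of subprocess: {}".format(output))
    return "\n".join(lines)


# Example inputs: one short command, one longer command with options.
short_args = (['python', '-m', 'pip', 'install', 'nonexistent-package-xyz'],)
long_args = (['python', '-m', 'pip', 'download', '--dest', '/tmp/cache',
              '--find-links', '/tmp/cache'],)

msg_short = pip_failure_message(short_args, "no matching distribution", tb="<tb>")
msg_long = pip_failure_message(long_args, "download failed", tb="<tb>")
```

Inside each exception handler, the pip branch could then be reduced to something like `if is_pip_command(args): raise RuntimeError(pip_failure_message(args, error.output)) from error`. Reporting the whole command instead of a guessed position keeps the message accurate for commands of any length.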
Actual behavior
- For short pip commands (e.g. pip install <pkg>), IndexError is raised when building the error message, so the intended RuntimeError is never shown.
- For some longer pip commands, the message can show a wrong "package" (e.g. an option like --find-links) because index 6 is assumed to be the package name.
Affected code
- File: sdks/python/apache_beam/utils/processes.py
- Functions: call, check_call, check_output (pip branch in each, e.g. lines 55–59, 74–78, 93–97)
- Relevant line: .format(traceback.format_exc(), args[0][6], error.output); args[0][6] is unsafe.