Closed
2 changes: 1 addition & 1 deletion packages/dash_evals/src/dash_evals/runner/args_runner.py
@@ -35,7 +35,7 @@ def _run_from_args(args: argparse.Namespace) -> bool:
     dataset = json_dataset(str(args.dataset))
 
     # Build the task instance
-    task_def = {"name": args.task}
+    task_def = {"task_name": args.task}
high

Updating the key to `task_name` is correct, but this dictionary should also include `sandbox_type` when a sandbox is specified via CLI arguments. Without `sandbox_type`, task-level validation (such as in `validate_sandbox_tools`) defaults to assuming a local sandbox. Tasks that require a Linux container (e.g., those using `bash_session` or `text_editor`) can then fail validation incorrectly, even when the user has provided a valid sandbox configuration via the `--sandbox` flag.

Suggested change
-    task_def = {"task_name": args.task}
+    task_def = {"task_name": args.task}
+    if args.sandbox:
+        task_def["sandbox_type"] = args.sandbox[0]
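
For illustration, the suggestion above can be sketched as a standalone snippet. This is a minimal sketch, not the project's actual code: `build_task_def` is a hypothetical helper, and the parser below only assumes that `--sandbox` yields a list whose first element is the sandbox type (as `args.sandbox[0]` in the suggestion implies). The real CLI definition may differ.

```python
import argparse


def build_task_def(args: argparse.Namespace) -> dict:
    """Hypothetical helper: assemble the task definition from CLI args."""
    task_def = {"task_name": args.task}
    # Assumption: args.sandbox is None (or empty) when --sandbox is not given,
    # otherwise a sequence whose first element is the sandbox type.
    if getattr(args, "sandbox", None):
        task_def["sandbox_type"] = args.sandbox[0]
    return task_def


# Hypothetical parser; the real runner's flags and nargs may differ.
parser = argparse.ArgumentParser()
parser.add_argument("--task", required=True)
parser.add_argument("--sandbox", nargs="+", default=None)

with_sandbox = build_task_def(
    parser.parse_args(["--task", "demo", "--sandbox", "docker"])
)
without_sandbox = build_task_def(parser.parse_args(["--task", "demo"]))

print(with_sandbox)
print(without_sandbox)
```

With this shape, `sandbox_type` is only present when the flag is passed, so downstream validation that falls back to a local sandbox keeps working for runs without `--sandbox`.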

     task_instance = task_func(dataset, task_def) if dataset else task_func(None, task_def)
 
     # Set up logging