Allow Generalized SWE-Bench format for evaluation (All-Hands-AI#3752)
* allow generalized swe-bench format

* Update run_infer.py

* fix linter

---------

Co-authored-by: Xingyao Wang <[email protected]>
Jiayi-Pan and xingyaoww authored Sep 6, 2024
1 parent 5718741 commit 43c4a7f
Showing 1 changed file with 6 additions and 0 deletions.
6 changes: 6 additions & 0 deletions evaluation/swe_bench/run_infer.py
@@ -456,6 +456,12 @@ def filter_dataset(dataset: pd.DataFrame, filter_column: str) -> pd.DataFrame:
 output_file = os.path.join(metadata.eval_output_dir, 'output.jsonl')
 instances = prepare_dataset(swe_bench_tests, output_file, args.eval_n_limit)

+if not isinstance(
+    instances['PASS_TO_PASS'][instances['PASS_TO_PASS'].index[0]], str
+):
+    for col in ['PASS_TO_PASS', 'FAIL_TO_PASS']:
+        instances[col] = instances[col].apply(lambda x: str(list(x)))
+
 run_evaluation(
     instances, metadata, output_file, args.eval_num_workers, process_instance
 )
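
For reviewers, here is a minimal sketch of what the added block appears to do, written against plain Python lists rather than a pandas DataFrame so it runs without dependencies. `normalize_test_column` and the sample values are hypothetical and not part of `run_infer.py`. As in the diff, only the first cell is inspected to decide whether the column needs converting.

```python
import ast


def normalize_test_column(values):
    """Hypothetical helper mirroring the check in the diff.

    If the first cell is already a string, the column is returned unchanged.
    Otherwise every cell is converted with str(list(x)), as the diff does.
    The empty-column guard is an addition of this sketch.
    """
    if not values or isinstance(values[0], str):
        return list(values)
    return [str(list(v)) for v in values]


# Column already stored as strings: left untouched.
as_strings = ['["test_a", "test_b"]']

# Column stored as sequences (assumed shape of a "generalized" dataset).
as_sequences = [("test_a", "test_b"), ("test_c",)]

converted = normalize_test_column(as_sequences)

# str(list(x)) yields a Python repr (single quotes), which
# ast.literal_eval can turn back into a list.
round_tripped = [ast.literal_eval(cell) for cell in converted]
```

Note that `str(list(x))` produces a Python list repr with single-quoted items, so the result parses with `ast.literal_eval` but is not guaranteed to be valid JSON.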
