upload to spanner and add min input and output len #43
This pull request enhances the `benchmark_serving.py` script and related files, focusing on data validation, Spanner integration, and improved configurability for benchmarking datasets. Key changes include adding support for uploading benchmark results to Google Cloud Spanner, introducing minimum input/output length filters, and enabling additional arguments for dataset filtering and Spanner configuration.

Enhancements to Benchmarking and Data Validation:
- Added a `safe_json_value` function to handle NaN and Infinity values for JSON serialization, ensuring compatibility with Spanner and other systems. (benchmark_serving.py, R45-R239)
- Added `min_input_len` and `min_output_len` parameters in `get_filtered_dataset` to filter datasets based on minimum sequence lengths. (benchmark_serving.py [1] [2])

Integration with Google Cloud Spanner:
- Added an `upload_to_spanner_batch_with_retry` function to upload benchmark results to Spanner with retry logic for batch uploads. (benchmark_serving.py, R45-R239)
- Added new arguments (`--spanner-instance-id`, `--spanner-database-id`) to the CLI parser for configuring Spanner uploads. (benchmark_serving.py, R1345-R1356)
- Updated `save_json_results` to optionally upload results to Spanner, controlled by the `spanner_upload` flag. (benchmark_serving.py, R837-R847)

Updates to Benchmark Workflow:
- Updated `async def benchmark` to pass minimum input/output lengths and enable Spanner uploads. (benchmark_serving.py [1] [2])
- Updated `print_and_save_result` to support Spanner uploads and optional server metrics scraping. (benchmark_serving.py [1] [2])

Shell Script Modifications:
- Added support for `--min-input-length`, `--min-output-length`, `--spanner-instance-id`, and `--spanner-database-id` in `latency_throughput_curve.sh`. (latency_throughput_curve.sh [1] [2])
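
To illustrate why NaN and Infinity need special handling before serialization: strict JSON has no representation for them, so a sanitizing pass is needed before results are stored. The following is a minimal sketch only. It reuses the name `safe_json_value` from the summary above, but the body is an assumption (mapping non-finite floats to `None`), and the function in this pull request may behave differently.

```python
import json
import math


def safe_json_value(value):
    """Assumed behavior: replace NaN/Infinity with None, recursing into containers."""
    if isinstance(value, float) and not math.isfinite(value):
        return None
    if isinstance(value, dict):
        return {key: safe_json_value(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [safe_json_value(item) for item in value]
    return value


# Hypothetical metrics dictionary, for illustration only.
metrics = {
    "mean_latency_ms": float("nan"),
    "throughput": 12.5,
    "p99_candidates": [float("inf"), 3.0],
}

# allow_nan=False makes json.dumps raise on NaN/Infinity, so this call
# only succeeds because the values were sanitized first.
encoded = json.dumps(safe_json_value(metrics), allow_nan=False)
```

Without the sanitizing step, `json.dumps(metrics, allow_nan=False)` raises `ValueError`, and the default `allow_nan=True` emits bare `NaN` and `Infinity` tokens that strict JSON parsers reject.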
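
The minimum-length filtering can be pictured roughly as follows. This is a sketch under stated assumptions: the helper name `filter_by_min_len` and the `(prompt, input_len, output_len)` tuple layout are hypothetical and are not taken from `get_filtered_dataset` itself.

```python
from typing import List, Tuple

# Assumed sample layout: (prompt text, input token count, output token count).
Sample = Tuple[str, int, int]


def filter_by_min_len(
    samples: List[Sample],
    min_input_len: int = 0,
    min_output_len: int = 0,
) -> List[Sample]:
    """Keep only samples that meet both minimum lengths."""
    return [
        sample
        for sample in samples
        if sample[1] >= min_input_len and sample[2] >= min_output_len
    ]


# Hypothetical data, for illustration only.
samples = [("a", 10, 5), ("b", 3, 50), ("c", 20, 40)]
kept = filter_by_min_len(samples, min_input_len=8, min_output_len=10)
```

With the defaults of 0, every sample passes, so existing runs that do not set the new parameters would be unaffected in this sketch.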
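
The retry logic for batch uploads might look something like the sketch below. It deliberately avoids any Spanner client calls: `upload_batch` is a hypothetical callable standing in for whatever writes one batch, and the batch size, attempt count, and exponential backoff are assumptions rather than the values used by `upload_to_spanner_batch_with_retry`.

```python
import time
from typing import Any, Callable, List, Sequence


def upload_in_batches_with_retry(
    rows: Sequence[Any],
    upload_batch: Callable[[List[Any]], None],  # hypothetical uploader
    batch_size: int = 100,
    max_attempts: int = 3,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Upload rows batch by batch, retrying each failed batch with backoff."""
    uploaded = 0
    for start in range(0, len(rows), batch_size):
        batch = list(rows[start:start + batch_size])
        for attempt in range(1, max_attempts + 1):
            try:
                upload_batch(batch)
                uploaded += len(batch)
                break
            except Exception:
                # Give up on the last attempt; otherwise wait and retry.
                if attempt == max_attempts:
                    raise
                sleep(base_delay_s * 2 ** (attempt - 1))
    return uploaded
```

Injecting `sleep` as a parameter keeps the sketch testable without real delays. A production version would likely catch only transient error types rather than every `Exception`.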
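
For the two Spanner flags, a parser definition along these lines would be one way to wire them up. The flag names come from the summary above; the defaults, help text, and the rule that uploads are enabled only when both IDs are set are assumptions made for this sketch.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Sketch of the Spanner flags")
    parser.add_argument(
        "--spanner-instance-id",
        type=str,
        default=None,  # assumed default
        help="Spanner instance to upload results to.",
    )
    parser.add_argument(
        "--spanner-database-id",
        type=str,
        default=None,  # assumed default
        help="Spanner database to upload results to.",
    )
    return parser


def spanner_upload_enabled(args: argparse.Namespace) -> bool:
    """Assumed rule: upload only when both identifiers are provided."""
    return bool(args.spanner_instance_id and args.spanner_database_id)


# Hypothetical identifiers, for illustration only.
args = build_parser().parse_args(
    ["--spanner-instance-id", "my-instance", "--spanner-database-id", "my-db"]
)
```

argparse converts the dashes in each flag name to underscores, which is why the values are read back as `args.spanner_instance_id` and `args.spanner_database_id`.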