upload to spanner and add min input and output len #43
This pull request extends the `benchmark_serving.py` script and related files, focusing on data validation, Spanner integration, and improved configurability for benchmarking datasets. Key changes include support for uploading benchmark results to Google Cloud Spanner, minimum input/output length filters, and additional arguments for dataset filtering and Spanner configuration.

Enhancements to Benchmarking and Data Validation:

- Added a `safe_json_value` function to handle NaN and Infinity values for JSON serialization, ensuring compatibility with Spanner and other systems. (`benchmark_serving.py`, `benchmark_serving.py` R45-R239)
- Added `min_input_len` and `min_output_len` parameters in `get_filtered_dataset` to filter datasets based on minimum sequence lengths. (`benchmark_serving.py`, [1] [2])

Integration with Google Cloud Spanner:

- Added an `upload_to_spanner_batch_with_retry` function to upload benchmark results to Spanner with retry logic for batch uploads. (`benchmark_serving.py`, `benchmark_serving.py` R45-R239)
- Added new arguments (`--spanner-instance-id`, `--spanner-database-id`) to the CLI parser for configuring Spanner uploads. (`benchmark_serving.py`, `benchmark_serving.py` R1345-R1356)
- Updated `save_json_results` to optionally upload results to Spanner, controlled by the `spanner_upload` flag. (`benchmark_serving.py`, `benchmark_serving.py` R837-R847)

Updates to Benchmark Workflow:

- Updated `async def benchmark` to pass minimum input/output lengths and enable Spanner uploads. (`benchmark_serving.py`, [1] [2])
- Updated `print_and_save_result` to support Spanner uploads and optional server metrics scraping. (`benchmark_serving.py`, [1] [2])

Shell Script Modifications:

- Added support for `--min-input-length`, `--min-output-length`, `--spanner-instance-id`, and `--spanner-database-id` in `latency_throughput_curve.sh`. (`latency_throughput_curve.sh`, [1] [2])
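
For readers who have not opened the diff, the two data-validation changes can be illustrated with a minimal sketch. This is not the code from the PR: the function bodies, the choice to map non-finite floats to `None`, the helper name `filter_by_min_len`, and the `(prompt, input_len, output_len)` tuple layout are all assumptions made for illustration. Only the names `safe_json_value`, `min_input_len`, and `min_output_len` come from the description above.

```python
import json
import math
from typing import Any, List, Tuple


def safe_json_value(value: Any) -> Any:
    """Recursively replace NaN and +/-Infinity floats with None.

    Assumption: None (JSON null) is the replacement value; the PR may
    use a different convention.
    """
    if isinstance(value, float) and not math.isfinite(value):
        return None
    if isinstance(value, dict):
        return {key: safe_json_value(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [safe_json_value(item) for item in value]
    return value


# Hypothetical sample layout: (prompt, input_len, output_len).
Sample = Tuple[str, int, int]


def filter_by_min_len(
    samples: List[Sample], min_input_len: int, min_output_len: int
) -> List[Sample]:
    """Keep only samples that meet both minimum lengths."""
    return [
        sample
        for sample in samples
        if sample[1] >= min_input_len and sample[2] >= min_output_len
    ]


if __name__ == "__main__":
    metrics = {"p99_latency": float("nan"), "throughput": [1.5, float("inf")]}
    # allow_nan=False makes json.dumps raise on any non-finite float left over.
    print(json.dumps(safe_json_value(metrics), allow_nan=False))

    dataset = [("short", 2, 50), ("long enough", 10, 50), ("tiny output", 10, 1)]
    print(filter_by_min_len(dataset, min_input_len=4, min_output_len=8))
```

Sanitizing before serialization matters because Python's `json.dumps` emits bare `NaN` and `Infinity` tokens by default, which are not valid JSON and may be rejected by strict consumers.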