ADVANCED_USAGE.md (3 additions & 0 deletions)
@@ -57,12 +57,15 @@ Below are all the arguments for `bigcodebench.evaluate` for the remote evaluatio
 - `--id_range`: The range of the tasks to evaluate, default to `None`, e.g. `--id_range 10-20` will evaluate the tasks from 10 to 20
 - `--backend`: The backend to use, default to `vllm`
 - `--base_url`: The base URL of the backend for OpenAI-compatible APIs, default to `None`
+- `--instruction_prefix`: The instruction prefix for the Anthropic backend, default to `None`
+- `--response_prefix`: The response prefix for the Anthropic backend, default to `None`
 - `--revision`: The revision of the model with the vLLM or HF backend, default to `main`
 - `--tp`: The tensor parallel size for the vLLM backend, default to `1`
 - `--trust_remote_code`: Whether to trust the remote code, default to `False`
 - `--tokenizer_name`: The name of the customized tokenizer, default to `None`
 - `--tokenizer_legacy`: Whether to use the legacy tokenizer, default to `False`
 - `--samples`: The path to the generated samples file, default to `None`
+- `--no_execute`: Whether to not execute the samples, default to `False`
 - `--local_execute`: Whether to execute the samples locally, default to `False`
 - `--remote_execute_api`: The API endpoint for remote execution, default to `https://bigcode-bigcodebench-evaluator.hf.space/`, you can also use your own Gradio API endpoint by cloning the [bigcodebench-evaluator](https://huggingface.co/spaces/bigcode/bigcodebench-evaluator) repo and check `Use via API` at the bottom of the HF space page.
 - `--pass_k`: The `k` in `Pass@k`, default to `[1, 5, 10]`, e.g. `--pass_k 1,5,10` will evaluate `Pass@1`, `Pass@5` and `Pass@10`
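
To illustrate the value formats of `--id_range` (e.g. `10-20`) and `--pass_k` (e.g. `1,5,10`) listed above, here is a minimal Python sketch. It is not BigCodeBench's actual implementation: the helper names `parse_id_range`, `parse_pass_k`, and `pass_at_k` are hypothetical, the text does not say whether the end of the range is inclusive, and `pass_at_k` uses the commonly cited unbiased Pass@k estimator, which is an assumption about what the tool computes.

```python
from math import comb


def parse_id_range(value: str) -> tuple[int, int]:
    """Parse a string such as "10-20" into (10, 20).

    Hypothetical helper; whether the end bound is inclusive is not
    stated in the argument list, so only the two bounds are returned.
    """
    start_text, end_text = value.split("-", 1)
    start, end = int(start_text), int(end_text)
    if start > end:
        raise ValueError(f"start {start} is greater than end {end}")
    return start, end


def parse_pass_k(value: str) -> list[int]:
    """Parse a string such as "1,5,10" into [1, 5, 10] (hypothetical helper)."""
    return [int(part) for part in value.split(",") if part.strip()]


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@k estimate for n samples, c of which pass.

    Assumption: this is the widely used estimator 1 - C(n-c, k) / C(n, k);
    the tool's internal computation may differ.
    """
    if k > n:
        raise ValueError("k must not exceed the number of samples n")
    if n - c < k:
        # Fewer than k failing samples: every k-subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For example, `parse_id_range("10-20")` yields `(10, 20)` and `parse_pass_k("1,5,10")` yields `[1, 5, 10]`.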