Simplify v1 taskset and harness TOML config #1314
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0e47bf9736
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
if v1_config:
    env_kwargs["config"] = v1_config
Preserve existing env_args.config when applying aliases
When a config mixes the previously documented [eval.env_args.config...] shape with the new top-level/per-eval harness or taskset aliases, this assignment replaces the entire existing config object instead of merging into it. For example, adding a top-level [harness] default to an eval that still has [eval.env_args.config.taskset] silently drops the taskset before vf.load_environment(...), so v1 loaders receive incomplete configuration.
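A minimal sketch of the merge this comment asks for, assuming the legacy config is a plain dict stored under `env_kwargs["config"]`. The helper name `apply_v1_aliases` and the keys `name` and `max_turns` are hypothetical and used only for illustration; only `env_kwargs` and `v1_config` appear in the reviewed diff.

```python
def apply_v1_aliases(env_kwargs: dict, v1_config: dict) -> dict:
    """Merge alias-derived config into any existing env_kwargs["config"].

    Hypothetical helper: alias values win per top-level key, but keys that
    exist only in the legacy config (for example "taskset") are kept.
    """
    if not v1_config:
        return env_kwargs
    existing = env_kwargs.get("config") or {}
    merged = dict(env_kwargs)  # copy so the caller's dict is not mutated
    merged["config"] = {**existing, **v1_config}
    return merged


# Legacy shape ([eval.env_args.config.taskset]) plus a top-level [harness] alias.
# The "name" and "max_turns" fields are made up for this example.
env_kwargs = {"config": {"taskset": {"name": "demo"}}}
v1_config = {"harness": {"max_turns": 4}}
result = apply_v1_aliases(env_kwargs, v1_config)
```

The merge here is shallow: if both sides define `harness`, the alias table replaces the legacy one wholesale. Whether a deeper, per-field merge is wanted is a design choice the comment does not settle.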
if v1_config:
    env_kwargs["config"] = v1_config
Merge aliases with env args config in vf-train
For vf-train configs that still use the old [env.args.config...] shape, adding a new top-level [harness] or [env.taskset] alias causes this assignment to overwrite args.config entirely. That makes mixed/migrating v1 training configs lose any existing taskset/harness settings before vf.load_environment(...) is called.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit b258188.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ed2fa56e13
)
env_args["config"] = {**existing_config, **child_config}

legacy_config = config_table(env_args.pop("config", {}), "env_args.config")
When a TOML config uses env_args.config as an ordinary load_environment(config=...) kwarg with keys other than taskset or harness, this pop removes that kwarg and the remaining legacy_config entries are never put back into either env_args or the first-class config. For example [eval.env_args.config] foo = "bar" previously reached the environment as config={"foo": "bar"}, but now it is silently dropped before run_evaluation calls vf.load_environment, breaking non-v1 loaders or custom config fields that use the generic env_args path.
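One way to avoid dropping generic keys is to extract only the v1 sections and put the remainder back. This is a sketch under assumptions, not the PR's implementation: `split_legacy_config` and `V1_KEYS` are hypothetical names, and the `seed` and `split` fields are invented for the example.

```python
V1_KEYS = ("taskset", "harness")  # sections treated as first-class v1 config


def split_legacy_config(env_args: dict) -> tuple[dict, dict]:
    """Return (env_args without v1 sections, extracted v1 config).

    Hypothetical helper: any non-v1 keys found under env_args["config"]
    are written back, so config={"foo": "bar"} still reaches the loader.
    """
    remaining_args = dict(env_args)
    legacy = dict(remaining_args.pop("config", {}) or {})
    v1_config = {key: legacy.pop(key) for key in V1_KEYS if key in legacy}
    if legacy:
        # Restore generic keys instead of silently discarding them.
        remaining_args["config"] = legacy
    return remaining_args, v1_config


args, v1 = split_legacy_config(
    {"config": {"foo": "bar", "taskset": {"split": "dev"}}, "seed": 1}
)
```

With this shape, whatever later bridges `taskset` and `harness` back into `config={...}` would need to merge into the restored `config` dict rather than replace it.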

Summary
- [harness] defaults plus per-env/per-eval taskset and harness tables
- env_args config={...} only at environment load time, preserving existing v1 loaders
Tests
uv run pytest tests/test_eval_cli.py tests/test_v1_config_aliases.py tests/test_eval_display.py tests/test_eval_utils.py -q
Note
Medium Risk
Touches TOML normalization and environment loading for both eval and RL training; mis-merges could change harness/taskset behavior or break existing configs, though changes are localized and covered by new tests.
Overview
Simplifies v1 TOML config by making taskset/harness first-class sections on eval/env entries while allowing a shared top-level [harness] table to act as defaults across all [[eval]]/[[env]] blocks.
Updates config normalization to merge legacy env_args.config + per-entry overrides + global harness defaults, and delays bridging into config={taskset,harness} until vf.load_environment(...) is called (eval runner and vf-train). Docs and tests are updated to reflect the new TOML shape and precedence.
Reviewed by Cursor Bugbot for commit ed2fa56. Bugbot is set up for automated code reviews on this repo.