Background
Follow-up from a review comment by @tnetennba3 on PR #113.
response_generator.py accepts deployment as a parameter (resolving to gpt-5-nano in practice), while the other generators (context_generator.py, question_generator.py, theme_generator.py) hardcode "gpt-5-mini". The mixed approach is inconsistent and the rationale for which model is used where is not documented.
Proposal
- Decide on a consistent way to specify the deployment across all four generators (either all hardcoded, or all parameterised — pick one).
- Add brief comments at each call site explaining why a given model was chosen (e.g. gpt-5-nano for cheaper bulk response generation, gpt-5-mini for higher-quality context/question/theme generation).
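One possible shape for the parameterised option is sketched below. This is only an illustration: the module location, the constant names, and the `resolve_deployment` helper are all hypothetical and do not exist in the repository today. The model assignments mirror what the generators currently use, and the rationale comments reuse the example wording from the bullet above, which still needs to be confirmed.

```python
from typing import Dict, Final, Optional

# Hypothetical shared module, e.g. a deployments.py next to the generators.
# The rationale comments are assumptions taken from the proposal, not
# confirmed decisions.

# Cheaper model: response generation runs in bulk.
BULK_DEPLOYMENT: Final = "gpt-5-nano"

# Higher-quality model: context, question and theme generation.
QUALITY_DEPLOYMENT: Final = "gpt-5-mini"

# One documented default per generator, so the choice lives in one place.
DEFAULT_DEPLOYMENTS: Final[Dict[str, str]] = {
    "context": QUALITY_DEPLOYMENT,
    "question": QUALITY_DEPLOYMENT,
    "theme": QUALITY_DEPLOYMENT,
    "response": BULK_DEPLOYMENT,
}


def resolve_deployment(generator: str, override: Optional[str] = None) -> str:
    """Return the deployment for a generator, honouring an explicit override."""
    if override is not None:
        return override
    try:
        return DEFAULT_DEPLOYMENTS[generator]
    except KeyError:
        raise ValueError(f"Unknown generator: {generator!r}") from None
```

Each generator could then accept an optional `deployment` argument and call `resolve_deployment("theme", deployment)` (or the equivalent for its own name), which keeps the defaults documented in a single file while still allowing a per-call override.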
Scope
- evals/synthetic/llm_generators/context_generator.py
- evals/synthetic/llm_generators/question_generator.py
- evals/synthetic/llm_generators/theme_generator.py
- evals/synthetic/llm_generators/response_generator.py