Bug Report
Error Message:
TypeError: Object of type ChatCompletionMessage is not JSON serializable
Location: sdk/py/src/braintrust/functions/invoke.py:173
Affected Versions: Current Braintrust SDK (Python)
Impact: Blocks evaluation of multi-turn conversations when using Gemini 2.5 models with LLM judges
Symptoms
- Model-specific: Only occurs with Gemini 2.5 models (e.g., gemini-2.5-flash)
- Multi-turn specific: Single-turn conversations work fine
- Judge-specific: Only fails when LLM judge scorers are used (local scorers work)
- Consistent failure: Reproducible 100% of the time under the conditions described below
- Other models unaffected: GPT-4o-mini and other models work correctly with the same setup
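The failure mode can be reproduced in isolation: Python's json module cannot serialize arbitrary objects, so any non-dict message object raises exactly this TypeError. A minimal sketch, using a hypothetical stand-in class in place of the real ChatCompletionMessage (which in the OpenAI SDK is a pydantic model):

```python
import json

# Hypothetical stand-in for the OpenAI SDK's ChatCompletionMessage;
# any plain object without JSON support triggers the same error.
class ChatCompletionMessage:
    def __init__(self, role: str, content: str):
        self.role = role
        self.content = content

msg = ChatCompletionMessage("assistant", "The answer is 42.")

try:
    json.dumps({"messages": [msg]})
except TypeError as e:
    print(e)  # Object of type ChatCompletionMessage is not JSON serializable
```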
Reproduction
The issue occurs when:
- Loading a multi-turn conversation dataset
- Evaluating with gemini-2.5-flash (or other Gemini 2.5 models)
- Using LLM judge scorers loaded via init_function() (e.g., rag-factuality-llm, rag-relevance-llm, rag-completeness-llm)
Key code path:
# LLM judges loaded from Braintrust
rag_factuality_llm = init_function(project_name=project_name, slug="rag-factuality-llm")
# Evaluation with multi-turn conversation dataset
# → triggers JSON serialization of ChatCompletionMessage at invoke.py:173

Root Cause
The ChatCompletionMessage object returned from the Gemini 2.5 model response is not being serialized to a plain dict/JSON-compatible format before being passed to the LLM judge scorer. This works with other models (e.g., GPT-4o-mini), which may return serializable types by default.
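Until the SDK serializes these objects itself, one possible user-side workaround is to convert messages to plain dicts before they reach the judge. A sketch, assuming the message objects are pydantic models as in the OpenAI SDK (to_jsonable is a hypothetical helper, not part of the Braintrust SDK):

```python
import json

def to_jsonable(obj):
    """Fallback serializer for json.dumps: convert pydantic-style message
    objects (e.g., ChatCompletionMessage) into plain dicts."""
    if hasattr(obj, "model_dump"):   # pydantic v2 (current OpenAI SDK types)
        return obj.model_dump(exclude_none=True)
    if hasattr(obj, "dict"):         # pydantic v1 fallback
        return obj.dict()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

# Usage: serialize the conversation before handing it to the judge scorer
# payload = json.dumps({"messages": messages}, default=to_jsonable)
```

Passing the helper via json.dumps(..., default=to_jsonable) keeps already-serializable inputs untouched, so the same eval code continues to work with models that return plain dicts.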
Labels
- python
- vertex-ai
Linear issue: https://linear.app/braintrustdata/issue/BRA-2972/json-serialization-error-with-gemini-25-models-in-multi-turn