@knjiang knjiang commented Jan 29, 2026

This PR adds a testing framework for transformRequest and transformResponse. Specifically, it is an offline testing framework that checks lingua transformations against saved snapshots.

We use the existing cases and the WASM bindings generated in the previous PR (transform_request, transform_response, and validate_*_json) to verify that transforms are valid and do not regress.

At a high level, my mental model is:

  • coverage-report ensures internal consistency between the Universal model and all the providers
  • transforms ensures external compatibility: it runs OpenAPI schema validation on every transform and calls the actual SDK post-transform.

Diagram:

  Phase 1: Capture (one-time, requires API keys)                                                                                
                                                                                                                                
  getCaseForProvider(caseName, source)                                                                                          
             │                                                                                                                  
             ▼                                                                                                                  
  ┌─────────────────────────────────────┐                                                                                       
  │   transformAndValidateRequest()     │                                                                                       
  │   • transform_request() ────────────┼──► validate_*_request()                                                               
  └──────────────────┬──────────────────┘                                                                                       
                     │                                                                                                          
                     ▼                                                                                                          
  ┌─────────────────────────────────────┐                                                                                       
  │   callProvider(target, request)     │                                                                                       
  │   • openai.chat.completions.create()│                                                                                       
  │   • anthropic.messages.create()     │                                                                                       
  └──────────────────┬──────────────────┘                                                                                       
                     │                                                                                                          
                     ▼                                                                                                          
           validate_*_response()                                                                                                
                     │                                                                                                          
                     ▼                                                                                                          
           writeFileSync(path, response)                                                                                        
           transforms/{src}_to_{tgt}/{case}.json                                                                                
                                                                                                                                
  Phase 2: Test (CI, no API calls)                                                                                              
                                                                                                                                
  getCaseForProvider(caseName, source)                                                                                          
             │                                                                                                                  
             ▼                                                                                                                  
  ┌─────────────────────────────────────┐                                                                                       
  │   transformAndValidateRequest()     │                                                                                       
  │   • transform_request() ────────────┼──► validate_*_request()                                                               
  └──────────────────┬──────────────────┘                                                                                       
                     │                                                                                                          
                     ▼                                                                                                          
             toMatchSnapshot("request")                                                                                         
                     │                                                                                                          
                     ▼                                                                                                          
  ┌─────────────────────────────────────┐                                                                                       
  │   loadAndValidateResponse()         │                                                                                       
  │   • readFileSync(path) ─────────────┼──► validate_*_response()                                                              
  └──────────────────┬──────────────────┘                                                                                       
                     │                                                                                                          
                     ▼                                                                                                          
  ┌─────────────────────────────────────┐                                                                                       
  │   transformResponseData()           │                                                                                       
  │   • transform_response() ───────────┼──► validate_*_response()                                                              
  └──────────────────┬──────────────────┘                                                                                       
                     │                                                                                                          
                     ▼                                                                                                          
             toMatchSnapshot("response")                                                                                        
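
As a rough illustration of Phase 2, the offline flow could be sketched as below. This is only a sketch: the `Bindings` interface, `runOfflineCase`, and the fake bindings are hypothetical stand-ins, since the real signatures of the WASM bindings are not shown in this PR description. Validators are assumed to throw on an invalid payload.

```typescript
// Sketch only: these types and function names are hypothetical stand-ins
// for the WASM bindings (transform_request, transform_response, validate_*_json).
type Provider = "anthropic" | "chatcompletions";
type Json = Record<string, unknown>;

interface Bindings {
  transformRequest(input: Json, target: Provider): Json;
  transformResponse(payload: Json, source: Provider): Json;
  // Validators are assumed to throw on an invalid payload.
  validateRequest(payload: Json, provider: Provider): void;
  validateResponse(payload: Json, provider: Provider): void;
}

// Phase 2: no API calls. The provider response is a previously captured payload.
function runOfflineCase(
  bindings: Bindings,
  sourceRequest: Json,
  source: Provider,
  target: Provider,
  savedResponse: Json,
): { request: Json; response: Json } {
  const request = bindings.transformRequest(sourceRequest, target);
  bindings.validateRequest(request, target);

  bindings.validateResponse(savedResponse, target);
  const response = bindings.transformResponse(savedResponse, source);
  bindings.validateResponse(response, source);

  // In the real suite these two values would be passed to toMatchSnapshot().
  return { request, response };
}

// Fake bindings so the sketch runs on its own; they only tag payloads with a format.
const fakeBindings: Bindings = {
  transformRequest: (input, target) => ({ ...input, format: target }),
  transformResponse: (payload, source) => ({ ...payload, format: source }),
  validateRequest: (payload, provider) => {
    if (payload.format !== provider) throw new Error("invalid request");
  },
  validateResponse: (payload, provider) => {
    if (payload.format !== provider) throw new Error("invalid response");
  },
};

const offlineResult = runOfflineCase(
  fakeBindings,
  { format: "anthropic", max_tokens: 100 },
  "anthropic",
  "chatcompletions",
  { format: "chatcompletions", id: "saved" },
);
```

Passing the bindings in as a parameter keeps the flow testable without loading the WASM module; the actual tests may well call the bindings directly.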
                                                                                                                                


knjiang commented Jan 29, 2026

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

This stack of pull requests is managed by Graphite.

Comment on lines +1 to +4
{
"error": "400 {\"type\":\"error\",\"error\":{\"type\":\"invalid_request_error\",\"message\":\"`max_tokens` must be greater than `thinking.budget_tokens`. Please consult our documentation at https://docs.claude.com/en/docs/build-with-claude/extended-thinking#max-tokens-and-context-window-size\"},\"request_id\":\"req_011CXasVyLu26rs4f6bS7DRJ\"}",
"name": "Error"
} No newline at end of file

@knjiang knjiang Jan 29, 2026


This is the only error I've found so far. It seems to be a valid error, since our case defines max_tokens = 100 with high reasoning.

I'm not sure whether we want to throw our own error here. https://github.com/braintrustdata/lingua/blob/main/payloads/cases/simple.ts#L146
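
If we did want our own error, one option is a pre-flight check that mirrors the provider's constraint. The sketch below is hypothetical: `checkThinkingBudget` and `MessagesRequestLike` are not part of lingua, and the budget of 1024 is an illustrative value, not what lingua maps "high" reasoning to. The `max_tokens` and `thinking.budget_tokens` field names come from the error message above.

```typescript
// Field names follow the error message above; the guard itself is hypothetical.
interface ThinkingConfig {
  type: "enabled";
  budget_tokens: number;
}

interface MessagesRequestLike {
  max_tokens: number;
  thinking?: ThinkingConfig;
}

// Returns an error message when the request would be rejected, otherwise null.
function checkThinkingBudget(req: MessagesRequestLike): string | null {
  if (req.thinking === undefined) return null;
  if (req.max_tokens > req.thinking.budget_tokens) return null;
  return (
    `max_tokens (${req.max_tokens}) must be greater than ` +
    `thinking.budget_tokens (${req.thinking.budget_tokens})`
  );
}

// 1024 is an illustrative budget, not the value lingua derives from "high" reasoning.
const rejected = checkThinkingBudget({
  max_tokens: 100,
  thinking: { type: "enabled", budget_tokens: 1024 },
});

const accepted = checkThinkingBudget({
  max_tokens: 2048,
  thinking: { type: "enabled", budget_tokens: 1024 },
});
```

Returning a message instead of throwing leaves the choice between failing the transform and surfacing a warning to the caller.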

@knjiang knjiang force-pushed the 01-28-testing_framework_for_transformrequest_response branch from 0cc799d to 6093a42 Compare January 29, 2026 02:50
@knjiang knjiang force-pushed the 01-27-request_typescript_and_python_bindings branch 2 times, most recently from 9e4d458 to e383b62 Compare January 29, 2026 04:07
@knjiang knjiang force-pushed the 01-28-testing_framework_for_transformrequest_response branch from 6093a42 to 5c74eae Compare January 29, 2026 04:07
Comment on lines +327 to +330
map.insert(
"input_tokens_details".into(),
serde_json::json!({ "cached_tokens": self.prompt_cached_tokens.unwrap_or(0) }),
);

@knjiang knjiang Jan 29, 2026


Caught this via the validate_response_json JS binding. Responses payloads require input_tokens_details and output_tokens_details.
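
For reference, here is a TypeScript sketch of the same defaulting idea as the Rust snippet above, producing a usage object with both details fields always present. `UniversalUsage` and `toResponsesUsage` are hypothetical names; only `prompt_cached_tokens` and `input_tokens_details.cached_tokens` appear in the diff, and the remaining fields are assumed to follow the OpenAI Responses usage shape.

```typescript
// Hypothetical input shape; only prompt_cached_tokens is taken from the Rust snippet.
interface UniversalUsage {
  prompt_tokens?: number;
  completion_tokens?: number;
  prompt_cached_tokens?: number;
  completion_reasoning_tokens?: number; // assumed name, not confirmed by the diff
}

interface ResponsesUsage {
  input_tokens: number;
  input_tokens_details: { cached_tokens: number };
  output_tokens: number;
  output_tokens_details: { reasoning_tokens: number };
  total_tokens: number;
}

// Always emits both *_details objects, defaulting missing counts to 0,
// which mirrors the unwrap_or(0) in the Rust code.
function toResponsesUsage(u: UniversalUsage): ResponsesUsage {
  const input = u.prompt_tokens ?? 0;
  const output = u.completion_tokens ?? 0;
  return {
    input_tokens: input,
    input_tokens_details: { cached_tokens: u.prompt_cached_tokens ?? 0 },
    output_tokens: output,
    output_tokens_details: {
      reasoning_tokens: u.completion_reasoning_tokens ?? 0,
    },
    total_tokens: input + output,
  };
}

const usage = toResponsesUsage({ prompt_tokens: 3, completion_tokens: 4 });
```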

}
/* eslint-enable @typescript-eslint/consistent-type-assertions */

const isParamCase = (name: string) => name.endsWith("Param");

This is temporary. I didn't want to blow up the diff, so I'm doing this here.

@knjiang knjiang marked this pull request as ready for review January 29, 2026 05:04
@@ -0,0 +1,35 @@
{

anthropic_to_chatcompletions means an Anthropic-format payload sent to a chat completions model. We save the actual chat completions response payload so that we don't incur the LLM cost on every test run.
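
To make the naming convention concrete, here is a small sketch of how the `transforms/{src}_to_{tgt}/{case}.json` layout from the diagram could be built and parsed. `snapshotPath` and `parseDirection` are hypothetical helpers, not functions from this PR, and the case name `simple` is used only for illustration.

```typescript
// Hypothetical helper: builds the capture path following the layout in the diagram.
function snapshotPath(source: string, target: string, caseName: string): string {
  return `transforms/${source}_to_${target}/${caseName}.json`;
}

// Hypothetical helper: splits a directory name such as
// "anthropic_to_chatcompletions" back into its source and target.
function parseDirection(dir: string): { source: string; target: string } {
  const [source, target, ...rest] = dir.split("_to_");
  if (source === undefined || target === undefined || rest.length > 0) {
    throw new Error(`unexpected directory name: ${dir}`);
  }
  return { source, target };
}

const examplePath = snapshotPath("anthropic", "chatcompletions", "simple");
const direction = parseDirection("anthropic_to_chatcompletions");
```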

@knjiang knjiang force-pushed the 01-28-testing_framework_for_transformrequest_response branch from 5c74eae to 0c166a9 Compare January 29, 2026 05:38
@knjiang knjiang force-pushed the 01-27-request_typescript_and_python_bindings branch from e383b62 to 8d60e4b Compare January 29, 2026 05:38
@knjiang knjiang requested review from ankrgyl and remh January 30, 2026 20:00