"""Evaluate the quality of an AI-generated `response` based on the same exact inputs that your LLM used to generate the response.
@@ -176,6 +177,7 @@ def validate(
context (str): All retrieved context (e.g., from your RAG/retrieval/search system) that was supplied as part of `messages` for generating the LLM `response`. Specifying the `context` (as part of the full `messages` object) enables Cleanlab to run certain Evals and display the retrieved context in the Web Interface.
rewritten_query (str, optional): An optional reformulation of `query` (e.g. to form a self-contained question out of a multi-turn conversation history) to improve retrieval quality. If you are using a query-rewriter in your RAG system, you can provide its output here. If not provided, Cleanlab may internally do its own query rewrite when necessary.
metadata (object, optional): Arbitrary metadata to associate with this LLM `response` for logging/analytics inside the Project.
+ tools (list[ChatCompletionToolParam], optional): Optional definitions of tools that were provided to the LLM in the response-generation call. Should match the `tools` argument in OpenAI's Chat Completions API. When tools are provided, the LLM may respond by calling one of them rather than with natural language.
eval_scores (dict[str, float], optional): Pre-computed evaluation scores. Providing `eval_scores` for specific evaluations bypasses automated scoring and uses the supplied scores instead, which can reduce runtime when you already have these scores computed.
Returns:
@@ -188,7 +190,6 @@ def validate(
When available, consider swapping your AI response with the expert answer before serving the response to your user.
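
For context, here is a minimal sketch of how the documented parameters, including the newly added `tools`, might flow through a single `validate()` call. The client setup (`Project.from_access_key`), model name, tool definition, and the `expert_answer` field name are illustrative assumptions rather than verified details of the API.

```python
from openai import OpenAI
from cleanlab_codex import Project  # import path assumed; adjust to your installed client

client = OpenAI()
project = Project.from_access_key("<PROJECT_ACCESS_KEY>")  # assumed project setup

# Context from your retrieval system, included in `messages` and passed as `context`.
retrieved_context = "Order #123 shipped on 2024-05-01 via ground freight."

# Tool definitions in OpenAI Chat Completions format, matching the `tools` argument described above.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_order_status",  # hypothetical tool
            "description": "Look up the status of a customer order.",
            "parameters": {
                "type": "object",
                "properties": {"order_id": {"type": "string"}},
                "required": ["order_id"],
            },
        },
    }
]

messages = [
    {"role": "system", "content": f"Answer using this context:\n{retrieved_context}"},
    {"role": "user", "content": "Where is my order #123?"},
]

# Generate the response with the same messages and tools you will later validate.
response = client.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)

# Validate using the exact inputs that produced the response.
result = project.validate(
    response=response,
    messages=messages,
    query="Where is my order #123?",
    context=retrieved_context,
    tools=tools,
    metadata={"session_id": "abc-123"},
)

# If an expert answer is available, consider serving it instead of the AI response.
expert_answer = getattr(result, "expert_answer", None)  # field name assumed from the guidance above
final_answer = expert_answer or response.choices[0].message.content
```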