[FEATURE]: Evaluate prompt responses using another model #26

@colbytimm

Description

Problem Statement

  • Be able to evaluate prompt responses using another LLM

Proposed Solution

  • Define a new evaluator that sends the response to a second LLM, along with an evaluator prompt, and uses the returned score as the test result.

Example

Input Prompt

You are a helpful assistant. Answer the user's question clearly and politely.

What is the capital of France?

Example Response

The capital of France is Paris.

Evaluator Prompt

You are an evaluator checking the quality of a model's response to a user's question.

Rate the model's response from 1 to 5 based on these criteria:
- Accuracy: Is the answer factually correct?
- Friendliness: Is the tone polite and helpful?

Scoring:
- 5: Fully accurate and very friendly
- 3: Mostly accurate or somewhat friendly
- 1: Inaccurate or unfriendly

Question: {{input}}
Response: {{response}}

Only return the numeric score (1–5).
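A minimal sketch of how such an evaluator could work: fill the `{{input}}`/`{{response}}` template, send it to the judge model, and parse the numeric score from the reply. The `call_model` hook and the parsing details are illustrative assumptions, not part of the proposal.

```python
import re

EVALUATOR_PROMPT = """You are an evaluator checking the quality of a model's response to a user's question.

Rate the model's response from 1 to 5 based on these criteria:
- Accuracy: Is the answer factually correct?
- Friendliness: Is the tone polite and helpful?

Scoring:
- 5: Fully accurate and very friendly
- 3: Mostly accurate or somewhat friendly
- 1: Inaccurate or unfriendly

Question: {input}
Response: {response}

Only return the numeric score (1-5).
"""


def evaluate(question: str, response: str, call_model) -> int:
    """Fill the evaluator template, ask the judge model, and parse its 1-5 score."""
    prompt = EVALUATOR_PROMPT.format(input=question, response=response)
    raw = call_model(prompt)
    # Judge models don't always return a bare number, so grab the first digit in range.
    match = re.search(r"[1-5]", raw)
    if match is None:
        raise ValueError(f"Evaluator returned no score: {raw!r}")
    return int(match.group())


# Stand-in judge for demonstration; in practice this would call a real LLM API.
def fake_judge(prompt: str) -> str:
    return "Score: 5"


score = evaluate(
    "What is the capital of France?",
    "The capital of France is Paris.",
    fake_judge,
)
print(score)  # 5
```

Parsing defensively (rather than trusting the judge to emit only a digit) matters in practice, since even with "Only return the numeric score" in the prompt, judge models often wrap the score in extra text.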

    Labels

    enhancement (New feature or request)
