feat: blind second-opinion tool for independent cross-model review #91
Open
jamubc wants to merge 1 commit into main from feat/blind-second-opinion
+539
−1
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,87 @@ | ||
| # Second Opinion (Blind Independent Review) | ||
|
|
||
| The `second-opinion` tool sends a problem to Gemini and obtains an independent answer, produced without Gemini ever seeing the orchestrator's existing analysis. This prevents *anchoring bias*, where a model's output is shaped by a prior answer it was shown. | ||
|
|
||
| ## Why anchoring matters | ||
|
|
||
| When a model is shown an existing answer before being asked to evaluate or improve it, it tends to: | ||
|
|
||
| - Adopt the framing and assumptions of the prior answer uncritically | ||
| - Miss alternative approaches that the first answer did not consider | ||
| - Agree with the prior answer even when it contains errors | ||
|
|
||
| By hiding the orchestrator's answer from the independent solve step, the `second-opinion` tool ensures the second perspective is genuinely fresh. | ||
|
|
||
| ## How it works | ||
|
|
||
| 1. **Blind solve** — The problem text is sent to Gemini with a prompt that instructs it to reason from first principles. The orchestrator's own answer is *not* included in this call, regardless of whether one is provided. | ||
|
|
||
| 2. **Optional comparison** — If `ownAnswer` is provided and `compare` is `true` (the default), a second call compares the two answers and lists agreements and divergences. This comparison step can freely see both answers because the independent answer is already locked in. | ||
|
|
||
| ## Usage | ||
|
|
||
| ### Independent answer only | ||
|
|
||
| ```json | ||
| { | ||
| "tool": "second-opinion", | ||
| "problem": "What database indexing strategy should we use for a write-heavy time-series workload?" | ||
| } | ||
| ``` | ||
|
|
||
| The tool returns the independent answer under a `## Independent answer` heading. | ||
|
|
||
| ### With divergence comparison | ||
|
|
||
| ```json | ||
| { | ||
| "tool": "second-opinion", | ||
| "problem": "What database indexing strategy should we use for a write-heavy time-series workload?", | ||
| "ownAnswer": "We should use a B-tree index on the timestamp column and partition by month.", | ||
| "compare": true | ||
| } | ||
| ``` | ||
|
|
||
| The tool returns the independent answer and then a `## Points of divergence` section that lists where the two answers agree or differ and which position is better supported. | ||
|
|
||
| ### Skipping the comparison | ||
|
|
||
| Set `compare: false` to obtain only the independent answer even when `ownAnswer` is provided. This is useful when you want the raw independent perspective without the comparison overhead. | ||
|
|
||
| ```json | ||
| { | ||
| "tool": "second-opinion", | ||
| "problem": "Explain the tradeoffs between eventual and strong consistency.", | ||
| "ownAnswer": "Strong consistency is always safer.", | ||
| "compare": false | ||
| } | ||
| ``` | ||
|
|
||
| ## Parameters | ||
|
|
||
| | Parameter | Type | Required | Default | Description | | ||
| |-------------|---------|----------|----------------|-------------| | ||
| | `problem` | string | yes | — | The problem or question to be answered independently. Must contain only the problem — no existing answer. | | ||
| | `ownAnswer` | string | no | — | The orchestrator's own answer. Used only in the optional compare step; never forwarded to the solve call. | | ||
| | `model` | string | no | `gemini-2.5-pro` | Gemini model to use for both calls. | | ||
| | `compare` | boolean | no | `true` | Whether to run the divergence comparison when `ownAnswer` is provided. | | ||
|
|
||
| ## Output format | ||
|
|
||
| ``` | ||
| ## Independent answer | ||
|
|
||
| <Gemini's independent answer> | ||
|
|
||
| --- | ||
|
|
||
| ## Points of divergence | ||
|
|
||
| <Comparison of the two answers, listing agreements and divergences> | ||
| ``` | ||
|
|
||
| The `## Points of divergence` section is omitted if `ownAnswer` was not provided or `compare` is `false`. | ||
|
|
||
| ## Anti-anchoring guarantee | ||
|
|
||
| The `buildSolvePrompt` function — which constructs the prompt for the independent solve call — accepts only the `problem` string. It has no parameter for an existing answer. This is enforced both by the TypeScript type signature and by the tool's execution flow, where `ownAnswer` is explicitly kept out of the first executor call and is only passed to `buildComparePrompt` in the second call. |
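The rule for when the comparison call runs, as described in the Parameters and Output format sections, can be sketched as a small dependency-free function. This is only an illustration: `planCalls`, `SecondOpinionArgs`, and `Call` are hypothetical names that do not appear in the PR, and the real tool makes this decision inline in `execute`.

```typescript
// Hypothetical sketch: which Gemini calls the tool makes for a given input.
// `planCalls`, `SecondOpinionArgs`, and `Call` are illustrative names, not part of the PR.
interface SecondOpinionArgs {
  problem: string;
  ownAnswer?: string;
  model?: string;
  compare?: boolean; // treated as true when omitted, mirroring the documented default
}

type Call = 'solve' | 'compare';

function planCalls(args: SecondOpinionArgs): Call[] {
  const compare = args.compare ?? true;
  // The blind solve always runs and only ever needs `problem`.
  const calls: Call[] = ['solve'];
  // The comparison runs only when there is a non-empty answer to compare against
  // and the caller has not switched it off.
  if (args.ownAnswer && compare) {
    calls.push('compare');
  }
  return calls;
}

console.log(planCalls({ problem: 'Q' }));
console.log(planCalls({ problem: 'Q', ownAnswer: 'A' }));
console.log(planCalls({ problem: 'Q', ownAnswer: 'A', compare: false }));
```

Only the second input yields both calls; the first and third yield the solve call alone, which is why their output would contain no `## Points of divergence` section.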
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,130 @@ | ||
| import { z } from 'zod'; | ||
| import { UnifiedTool } from './registry.js'; | ||
| import { Logger } from '../utils/logger.js'; | ||
| import { executeGeminiCLI } from '../utils/geminiExecutor.js'; | ||
| import { | ||
| buildSolvePrompt, | ||
| buildComparePrompt, | ||
| formatResult, | ||
| } from '../utils/secondOpinion.js'; | ||
| import { STATUS_MESSAGES } from '../constants.js'; | ||
|
|
||
| /** | ||
| * Type signature for an executor function compatible with executeGeminiCLI. | ||
| * Accepting an injected executor makes the anti-anchoring invariant testable | ||
| * without spawning real subprocesses. | ||
| */ | ||
| export type GeminiExecutor = ( | ||
| prompt: string, | ||
| model?: string, | ||
| sandbox?: boolean, | ||
| changeMode?: boolean, | ||
| onProgress?: (output: string) => void | ||
| ) => Promise<string>; | ||
|
|
||
| const secondOpinionArgsSchema = z.object({ | ||
| problem: z | ||
| .string() | ||
| .min(1) | ||
| .describe( | ||
| 'The problem or question to be answered independently. Must not include any existing answer — state only the problem.' | ||
| ), | ||
| ownAnswer: z | ||
| .string() | ||
| .optional() | ||
| .describe( | ||
| "The orchestrator's own answer to the problem. Provided only for the optional divergence comparison step — it is NEVER forwarded to the independent solve call." | ||
| ), | ||
| model: z | ||
| .string() | ||
| .optional() | ||
| .describe( | ||
| "Optional Gemini model to use (e.g., 'gemini-2.5-flash'). Defaults to gemini-2.5-pro." | ||
| ), | ||
| compare: z | ||
| .boolean() | ||
| .default(true) | ||
| .describe( | ||
| 'When true (default) and ownAnswer is provided, perform a divergence comparison after the independent solve.' | ||
| ), | ||
| }); | ||
|
|
||
| /** | ||
| * Factory that produces the second-opinion UnifiedTool with a configurable | ||
| * executor. Production code uses the default (executeGeminiCLI). Tests inject | ||
| * a fake executor to capture prompts without spawning subprocesses. | ||
| */ | ||
| export function createSecondOpinionTool( | ||
| executor: GeminiExecutor = executeGeminiCLI | ||
| ): UnifiedTool { | ||
| return { | ||
| name: 'second-opinion', | ||
| description: | ||
| 'Obtain a blind, independent Gemini answer to a problem without exposing any existing answer (anti-anchoring). Optionally compare the independent answer with the orchestrator\'s own answer to surface agreements and divergences.', | ||
| zodSchema: secondOpinionArgsSchema, | ||
| prompt: { | ||
| description: | ||
| 'Obtain an independent second opinion on a problem, then optionally compare it with an existing answer to identify divergences.', | ||
| }, | ||
| category: 'gemini', | ||
|
|
||
| execute: async (args, onProgress) => { | ||
| const { problem, ownAnswer, model, compare = true } = args; | ||
|
|
||
| const problemStr = typeof problem === 'string' ? problem : String(problem ?? ''); | ||
| if (!problemStr.trim()) { | ||
| throw new Error( | ||
| 'A non-empty problem description is required for the second-opinion tool.' | ||
| ); | ||
| } | ||
|
|
||
| // ── Step 1: Independent solve ────────────────────────────────────────── | ||
| // ANTI-ANCHORING: buildSolvePrompt receives only the problem. ownAnswer | ||
| // is in scope here but is deliberately never passed to this call. | ||
| const solvePrompt = buildSolvePrompt(problemStr); | ||
|
|
||
| Logger.debug('second-opinion: requesting independent solution'); | ||
| onProgress?.(STATUS_MESSAGES.PROCESSING_START); | ||
|
|
||
| const independentAnswer = await executor( | ||
| solvePrompt, | ||
| model as string | undefined, | ||
| false, | ||
| false, | ||
| onProgress | ||
| ); | ||
|
|
||
| // ── Step 2: Optional divergence comparison ───────────────────────────── | ||
| let comparison: string | undefined; | ||
|
|
||
| const ownAnswerStr = typeof ownAnswer === 'string' ? ownAnswer : undefined; | ||
|
|
||
| if (ownAnswerStr && compare) { | ||
| Logger.debug('second-opinion: performing divergence comparison'); | ||
| onProgress?.('Comparing answers for points of divergence...'); | ||
|
|
||
| const comparePrompt = buildComparePrompt( | ||
| problemStr, | ||
| ownAnswerStr, | ||
| independentAnswer | ||
| ); | ||
|
|
||
| comparison = await executor( | ||
| comparePrompt, | ||
| model as string | undefined, | ||
| false, | ||
| false, | ||
| onProgress | ||
| ); | ||
| } | ||
|
|
||
| return formatResult({ independentAnswer, comparison }); | ||
| }, | ||
| }; | ||
| } | ||
|
|
||
| /** | ||
| * The production tool instance registered in the tool registry. | ||
| * Uses the real executeGeminiCLI executor. | ||
| */ | ||
| export const secondOpinionTool: UnifiedTool = createSecondOpinionTool(); | ||
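The injected-executor design makes the anti-anchoring invariant checkable without spawning a subprocess. The following is a minimal, self-contained sketch of such a check, not the PR's actual test suite: the prompt builders are shortened stand-ins for the real ones, and `Executor`, `runSecondOpinion`, `createRecordingExecutor`, and `demo` are hypothetical names introduced only for this illustration.

```typescript
// Simplified stand-ins for the PR's helpers; the real prompts are longer.
type Executor = (prompt: string, model?: string) => Promise<string>;

function buildSolvePrompt(problem: string): string {
  return `## Problem\n\n${problem}`;
}

function buildComparePrompt(
  problem: string,
  ownAnswer: string,
  independentAnswer: string
): string {
  return `## Problem\n\n${problem}\n\n## Answer A\n\n${ownAnswer}\n\n## Answer B\n\n${independentAnswer}`;
}

// Hypothetical reduction of the tool's two-step `execute` flow.
async function runSecondOpinion(
  executor: Executor,
  args: { problem: string; ownAnswer?: string; compare?: boolean }
): Promise<{ independentAnswer: string; comparison?: string }> {
  // Step 1: blind solve. Only the problem reaches the executor.
  const independentAnswer = await executor(buildSolvePrompt(args.problem));
  // Step 2: optional comparison, run after the independent answer is fixed.
  let comparison: string | undefined;
  if (args.ownAnswer && (args.compare ?? true)) {
    comparison = await executor(
      buildComparePrompt(args.problem, args.ownAnswer, independentAnswer)
    );
  }
  return { independentAnswer, comparison };
}

// Fake executor: records every prompt it receives and returns a canned reply.
function createRecordingExecutor(): { executor: Executor; prompts: string[] } {
  const prompts: string[] = [];
  const executor: Executor = async (prompt) => {
    prompts.push(prompt);
    return `reply ${prompts.length}`;
  };
  return { executor, prompts };
}

async function demo(): Promise<string[]> {
  const { executor, prompts } = createRecordingExecutor();
  await runSecondOpinion(executor, {
    problem: 'Pick an index type.',
    ownAnswer: 'SECRET-OWN-ANSWER',
  });
  return prompts;
}

demo().then((prompts) => {
  // The marker string must be absent from the first prompt and present in the second.
  console.log(prompts.length);
  console.log(prompts[0].includes('SECRET-OWN-ANSWER'));
  console.log(prompts[1].includes('SECRET-OWN-ANSWER'));
});
```

A distinctive marker string as the own answer makes a leak easy to detect: if it ever shows up in the first recorded prompt, the invariant has been broken.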
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,91 @@ | ||
| /** | ||
| * Pure string-manipulation helpers for the blind second-opinion workflow. | ||
| * | ||
| * ANTI-ANCHORING GUARANTEE: | ||
| * buildSolvePrompt(problem) ONLY takes the problem description — no answer | ||
| * parameter exists — so the orchestrator's own answer can never leak into the | ||
| * independent solve call, even by accident. | ||
| */ | ||
|
|
||
| /** | ||
| * Builds the prompt sent to the independent solver. | ||
| * | ||
| * HARD INVARIANT: this function signature intentionally accepts only `problem`. | ||
| * There is no second parameter for an existing answer. Any attempt to pass an | ||
| * existing answer at call-site would be a TypeScript compile error. This makes | ||
| * the anti-anchoring guarantee statically enforced. | ||
| */ | ||
| export function buildSolvePrompt(problem: string): string { | ||
| return `You are an independent expert providing a fresh solution to the following problem. Approach it from first principles without reference to any prior analysis. | ||
|
|
||
| ## Problem | ||
|
|
||
| ${problem} | ||
|
|
||
| ## Instructions | ||
|
|
||
| - Reason through the problem independently and thoroughly. | ||
| - State your assumptions clearly. | ||
| - Provide a complete, well-structured answer. | ||
| - Do not hedge or truncate your response — give your full analysis.`; | ||
| } | ||
|
|
||
| /** | ||
| * Builds the prompt used to compare the orchestrator's answer with the | ||
| * independently generated answer. | ||
| * | ||
| * This prompt is only executed AFTER the independent solve is complete, so it | ||
| * has no influence on the independent answer. | ||
| */ | ||
| export function buildComparePrompt( | ||
| problem: string, | ||
| ownAnswer: string, | ||
| independentAnswer: string | ||
| ): string { | ||
| return `You are a neutral analyst comparing two independent answers to the same problem. Identify where they agree, where they diverge, and which (if any) divergences are substantive. | ||
|
|
||
| ## Problem | ||
|
|
||
| ${problem} | ||
|
|
||
| ## Answer A | ||
|
|
||
| ${ownAnswer} | ||
|
|
||
| ## Answer B | ||
|
|
||
| ${independentAnswer} | ||
|
|
||
| ## Instructions | ||
|
|
||
| 1. List key **points of agreement** between A and B. | ||
| 2. List key **points of divergence** — focus on substantive differences in conclusions, recommendations, or reasoning, not merely phrasing. | ||
| 3. For each divergence, briefly assess which position (if either) is better supported. | ||
| 4. Conclude with an overall summary of alignment. | ||
|
|
||
| Structure your output with clear headings.`; | ||
| } | ||
|
|
||
| /** | ||
| * Formats the combined output as markdown. | ||
| * | ||
| * The "Independent answer" section is always present. The "Points of | ||
| * divergence" section is included only when a comparison was performed. | ||
| */ | ||
| export function formatResult({ | ||
| independentAnswer, | ||
| comparison, | ||
| }: { | ||
| independentAnswer: string; | ||
| comparison?: string; | ||
| }): string { | ||
| const sections: string[] = [ | ||
| `## Independent answer\n\n${independentAnswer.trim()}`, | ||
| ]; | ||
|
|
||
| if (comparison !== undefined && comparison.trim().length > 0) { | ||
| sections.push(`## Points of divergence\n\n${comparison.trim()}`); | ||
| } | ||
|
|
||
| return sections.join('\n\n---\n\n'); | ||
| } |
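To make the output shape concrete, here is a short worked example of `formatResult`. The function body is copied from the diff above so the snippet runs on its own; the sample answer strings are made up for illustration.

```typescript
// Copy of formatResult from the diff, repeated so this snippet is self-contained.
function formatResult({
  independentAnswer,
  comparison,
}: {
  independentAnswer: string;
  comparison?: string;
}): string {
  const sections: string[] = [
    `## Independent answer\n\n${independentAnswer.trim()}`,
  ];

  if (comparison !== undefined && comparison.trim().length > 0) {
    sections.push(`## Points of divergence\n\n${comparison.trim()}`);
  }

  return sections.join('\n\n---\n\n');
}

// Both sections present: surrounding whitespace is trimmed and a rule separates them.
const withComparison = formatResult({
  independentAnswer: '  Use LSM trees.\n',
  comparison: 'Both agree on partitioning.',
});
console.log(withComparison);

// A whitespace-only comparison is treated the same as no comparison at all.
const withoutComparison = formatResult({
  independentAnswer: 'Use LSM trees.',
  comparison: '   ',
});
console.log(withoutComparison);
```

The second call shows a detail that the doc comment does not spell out: an empty or whitespace-only comparison string drops the `## Points of divergence` section instead of rendering an empty one.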
Since you defined specific messages for the second-opinion tool in `SECOND_OPINION_MESSAGES` within `src/constants.ts`, you should import and use them here instead of importing the generic `STATUS_MESSAGES`.