[WIP] Fix flag resolution bug in dry-run mode #1

Open
Copilot wants to merge 1 commit into main from copilot/fix-flag-resolution-bug

Copilot AI commented Jan 12, 2026

Thanks for asking me to work on this. I will get started on it and keep this PR's description up to date as I form a plan and make progress.

Original prompt

You are GitHub Copilot acting as a senior Go engineer.

Context:
This repository is "Jermator", a Linux-first, command-centric terminal repair tool.
It must NOT become an autonomous agent. It only plans and executes terminal commands
under strict policy gates. Safety is fail-closed.

Your task is to implement fixes for three known issues. Do NOT add new features.
Do NOT redesign architecture. Only fix what is specified.

========================================
ISSUE 1 — Flag Resolution Bug (CRITICAL)

Problem:
--dry-run is default true and blocks --apply and --exec.
Currently, even when users pass --apply or --exec, the code returns early
in dry-run mode and never executes sandbox/live logic.

Required behavior:

  1. Default mode should be dry-run ONLY when neither --apply nor --exec is set.
  2. If --apply is set:
    • dry-run must be automatically disabled
    • sandbox execution must run (subject to policy)
  3. If --exec is set:
    • dry-run must be automatically disabled
    • live execution path must be entered (with confirmation)
  4. If user explicitly sets --dry-run together with --apply or --exec:
    • reject with a clear error (fail-closed)

Acceptance:

  • Fix flag resolution logic in cmd/jerm/main.go
  • processFailure must correctly reach sandbox/live execution
  • Add table-driven unit tests for flag combinations
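
The required behavior above could be sketched roughly as follows. This is a minimal illustration, not the code in cmd/jerm/main.go: it assumes the standard library flag package, and the names Mode, resolveMode, and parseMode are hypothetical. The prompt does not say what should happen when --apply and --exec are passed together; rejecting that combination here is an assumption.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"io"
)

// Mode is a hypothetical enum for the resolved execution mode.
type Mode int

const (
	ModeDryRun Mode = iota
	ModeSandbox
	ModeLive
)

var (
	errConflict  = errors.New("--dry-run cannot be combined with --apply or --exec")
	errAmbiguous = errors.New("--apply and --exec cannot be combined") // assumption, not in the prompt
)

// resolveMode maps flag values to a mode. explicitDryRun is true only when
// the user passed --dry-run (as true) on the command line, so the default
// value of the flag never blocks --apply or --exec.
func resolveMode(explicitDryRun, apply, exec bool) (Mode, error) {
	switch {
	case explicitDryRun && (apply || exec):
		return ModeDryRun, errConflict // fail-closed on conflicting flags
	case apply && exec:
		return ModeDryRun, errAmbiguous
	case exec:
		return ModeLive, nil
	case apply:
		return ModeSandbox, nil
	default:
		return ModeDryRun, nil // neither --apply nor --exec: stay in dry-run
	}
}

// parseMode parses args and uses FlagSet.Visit, which only visits flags
// that were actually set, to detect an explicit --dry-run.
func parseMode(args []string) (Mode, error) {
	fs := flag.NewFlagSet("jerm", flag.ContinueOnError)
	fs.SetOutput(io.Discard)
	dryRun := fs.Bool("dry-run", true, "plan only, execute nothing")
	apply := fs.Bool("apply", false, "execute in sandbox")
	exec := fs.Bool("exec", false, "execute live")
	if err := fs.Parse(args); err != nil {
		return ModeDryRun, err
	}
	explicit := false
	fs.Visit(func(f *flag.Flag) {
		if f.Name == "dry-run" {
			explicit = true
		}
	})
	return resolveMode(explicit && *dryRun, *apply, *exec)
}

func main() {
	cases := [][]string{{}, {"--apply"}, {"--exec"}, {"--dry-run", "--apply"}}
	for _, args := range cases {
		mode, err := parseMode(args)
		fmt.Println(args, mode, err)
	}
}
```

Keeping resolveMode a pure function of three booleans makes the table-driven tests requested above straightforward, since each row is just a flag combination plus the expected mode or error.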

========================================
ISSUE 2 — Routing Uses Provider Only, Ignores Model

Problem:
Config routing.by_task supports {provider, model}, but router only selects provider.
Model override is ignored; providers always use their default model.

Required behavior:

  1. Router must return both Provider AND Model (RouteDecision).
  2. If routing.by_task[task].model is set:
    • that model must be used for the LLM request
  3. If lock_model=true:
    • routing model overrides must be ignored
  4. If the routed model is not in providers..models allowlist:
    • fail immediately (do NOT silently fallback)

Acceptance:

  • Update internal/providers/router.go to propagate model decisions
  • Update provider request builders to accept per-call model override
  • Add unit tests for:
    • model override via routing
    • lock_model behavior
    • invalid model rejection
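
One way the routing rules above might fit together is sketched below. The types Config, Route, ProviderConfig, and the function Decide are hypothetical stand-ins, not the contents of internal/providers/router.go; only the name RouteDecision comes from the prompt. The sketch also assumes that the provider default is what gets used when lock_model is true, and that the allowlist check applies to whichever model is finally resolved.

```go
package main

import "fmt"

// Route mirrors one routing.by_task entry (hypothetical shape).
type Route struct {
	Provider string
	Model    string // optional override; empty means "use provider default"
}

// ProviderConfig is a hypothetical per-provider config.
type ProviderConfig struct {
	DefaultModel string
	Models       []string // allowlist
}

// Config is a hypothetical slice of the routing configuration.
type Config struct {
	LockModel bool
	ByTask    map[string]Route
	Providers map[string]ProviderConfig
}

// RouteDecision carries both the provider and the model to use.
type RouteDecision struct {
	Provider string
	Model    string
}

func allowed(models []string, m string) bool {
	for _, x := range models {
		if x == m {
			return true
		}
	}
	return false
}

// Decide resolves provider and model for a task. It never falls back
// silently: an unknown task, unknown provider, or a model outside the
// allowlist is an error.
func Decide(cfg Config, task string) (RouteDecision, error) {
	route, ok := cfg.ByTask[task]
	if !ok {
		return RouteDecision{}, fmt.Errorf("no route configured for task %q", task)
	}
	pc, ok := cfg.Providers[route.Provider]
	if !ok {
		return RouteDecision{}, fmt.Errorf("unknown provider %q", route.Provider)
	}
	model := pc.DefaultModel
	if route.Model != "" && !cfg.LockModel {
		model = route.Model // routing override applies only when not locked
	}
	if !allowed(pc.Models, model) {
		return RouteDecision{}, fmt.Errorf("model %q is not in the allowlist for provider %q", model, route.Provider)
	}
	return RouteDecision{Provider: route.Provider, Model: model}, nil
}

func main() {
	cfg := Config{
		ByTask: map[string]Route{"plan": {Provider: "p1", Model: "big"}},
		Providers: map[string]ProviderConfig{
			"p1": {DefaultModel: "small", Models: []string{"small", "big"}},
		},
	}
	fmt.Println(Decide(cfg, "plan"))
	cfg.LockModel = true
	fmt.Println(Decide(cfg, "plan"))
}
```

A provider request builder would then take RouteDecision.Model as its per-call model instead of reading its own default.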

========================================
ISSUE 3 — FixPlan Schema Validation Can Fail-Open

Problem:
If the FixPlan JSON schema validator fails to load,
providers may accept unvalidated LLM output (fail-open).

Required behavior:

  1. Schema validation must be fail-closed.
  2. If schema validator cannot be initialized:
    • LLM planning must be rejected with a clear error
  3. If FixPlan JSON does not conform to schema:
    • reject and allow router fallback
  4. Validation logic must be consistent across OpenAI, Anthropic, Gemini providers.

Acceptance:

  • Initialize schema validator once (shared or injected)
  • Providers must refuse to parse plans if validator is unavailable
  • Add unit tests covering:
    • validator init failure
    • invalid schema rejection
    • valid plan acceptance

========================================
CONSTRAINTS

  • Do NOT add new CLI flags.
  • Do NOT change user-facing semantics beyond what is required.
  • Do NOT add networking, agents, retries, or chat loops.
  • Keep changes minimal and well-scoped.
  • All changes must compile and pass go test ./....

========================================
OUTPUT

  • Implement fixes in code.
  • Add/adjust unit tests.
  • Briefly summarize changes at the end.


Summary by cubic

Fixes dry-run flag handling so --apply triggers sandbox execution and --exec triggers live execution, removing the premature early-return that blocked both. If --dry-run is passed with --apply or --exec, we now fail with a clear error.

  • Bug Fixes
    • Corrected flag resolution in cmd/jerm/main.go: dry-run is default only when neither --apply nor --exec is set; conflicting flags are rejected.
    • Ensured processFailure reaches sandbox/live paths (policy and confirmation respected).
    • Added table-driven tests for all flag combinations.

Written for commit 00878a9. Summary will update on new commits.

coderabbitai bot commented Jan 12, 2026

Important

Review skipped

Bot user detected.

Copilot AI requested a review from fentz26 January 12, 2026 06:01
Copilot stopped work on behalf of fentz26 due to an error January 12, 2026 06:01
@fentz26 fentz26 marked this pull request as ready for review January 12, 2026 08:36
Copilot AI review requested due to automatic review settings January 12, 2026 08:36
Copilot AI left a comment

Copilot wasn't able to review any files in this pull request.
