Fix Model Requirements & Fallbacks (PR #3) #4
… ports
- Revert src/hooks/runtime-enforcement/hook.ts to HEAD (runtime authority unchanged)
- Revert src/agents/runtime/state-ledger.ts to HEAD (runtime authority unchanged)
- Revert src/agents/runtime/tool-runner.ts to HEAD (runtime authority unchanged)
- Revert src/agents/oracle.ts to HEAD (Heidi baseline is already stronger)
- Revert src/agents/sisyphus.ts (flat file) to HEAD - Heidi's modular flow preserved
- Revert src/agents/builtin-agents.ts to HEAD - sisyphus-agent wiring preserved
- Hephaestus dir: prompt-only, imports hard-blocks/anti-patterns from Heidi's prompts module
- Sisyphus dir: prompt-only, imports hard-blocks/anti-patterns from Heidi's prompts module
- Add isGpt5_4Model/isGpt5_3CodexModel type guards to types.ts for Hephaestus dispatch
- Update doctor check: remove dynamic-agent-prompt-builder as forbidden (passive library)
- Update doctor check: loop guard deferred to separate runtime-only PR
Phase 0 changes (Atlas/Gemini/GPT verification wave) and Sisyphus/Hephaestus prompt-layer capability improvements are preserved. No runtime changes in this PR.
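For readers unfamiliar with the dispatch guards named in this commit, here is a minimal sketch of what isGpt5_4Model and isGpt5_3CodexModel could look like. Only the two function names come from the commit message. The "provider/model" ID format, the matching rules, the normalizeModelId helper, and the pickHephaestusVariant dispatcher are all assumptions made for illustration; the actual code in types.ts is not shown on this page.

```typescript
// Hypothetical sketch. Only the two guard names appear in the commit message;
// the ID format, regexes, and helper names below are assumptions.

type HephaestusVariant = "gpt-5-4" | "gpt-5-3-codex" | "generic";

/** Lowercase the ID and drop an optional "provider/" prefix (assumed format). */
function normalizeModelId(model: string): string {
  const lower = model.trim().toLowerCase();
  const slash = lower.lastIndexOf("/");
  return slash >= 0 ? lower.slice(slash + 1) : lower;
}

/** True when the ID looks like a GPT 5.3 Codex model (assumed naming). */
function isGpt5_3CodexModel(model: string): boolean {
  return /^gpt-5[.-]3-codex\b/.test(normalizeModelId(model));
}

/** True when the ID looks like a GPT 5.4 model (assumed naming). */
function isGpt5_4Model(model: string): boolean {
  // The negative lookahead keeps a hypothetical "gpt-5.40" from matching.
  return /^gpt-5[.-]4(?![0-9])/.test(normalizeModelId(model));
}

/** Pick a prompt variant; any other model falls through to a generic prompt. */
function pickHephaestusVariant(model: string): HephaestusVariant {
  if (isGpt5_4Model(model)) return "gpt-5-4";
  if (isGpt5_3CodexModel(model)) return "gpt-5-3-codex";
  return "generic";
}
```

The point of the fall-through branch is that a non-GPT model still gets a usable prompt instead of being rejected, which matches the stated goal of removing the GPT-only restriction.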
…edger
- Removed dynamic-agent-prompt-builder imports from Sisyphus and Hephaestus
- Ported official wording/orchestration into src/agents/prompts/orchestration.ts
- Moved Agent/Skill/Tool/Category types into src/agents/types.ts
- Deleted dynamic-agent-prompt-builder and restored it to the doctor forbidden list
- Tightened runtime enforcement hook to scan current chat flow instead of global historical ledger
- Dropped generic phrase matching ('success') from enforcement checks
- Deleted unused/shadow state-ledger from agents runtime
- Unified state-ledger across agent and core runtime
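The two enforcement changes in the list above (scanning only the current chat flow, and dropping generic phrase matching such as 'success') can be illustrated with a small sketch. The LedgerEntry and StateClaim shapes, the flowId field, and both function names are hypothetical; the real ledger types are not shown on this page. The tool name git_safe is taken from the review further down.

```typescript
// Hypothetical shapes: the real ledger entry and claim types are not shown here.
interface LedgerEntry {
  flowId: string; // assumed identifier for one chat/execution flow
  tool: string;   // e.g. "git_safe"
  action: string; // e.g. "commit"
  ok: boolean;    // whether the tool call succeeded
}

interface StateClaim {
  tool: string;
  action: string;
}

/** Keep only entries recorded by the given flow; older flows are ignored. */
function entriesForFlow(ledger: readonly LedgerEntry[], flowId: string): LedgerEntry[] {
  return ledger.filter((entry) => entry.flowId === flowId);
}

/**
 * A claim counts as backed only if the same flow recorded a successful entry
 * with the same tool and action. No free-text matching (e.g. on the word
 * "success") is involved.
 */
function isClaimBacked(
  ledger: readonly LedgerEntry[],
  flowId: string,
  claim: StateClaim,
): boolean {
  return entriesForFlow(ledger, flowId).some(
    (entry) => entry.ok && entry.tool === claim.tool && entry.action === claim.action,
  );
}

// Usage: a commit recorded in an earlier flow does not back a claim in a later one.
const exampleLedger: LedgerEntry[] = [
  { flowId: "flow-a", tool: "git_safe", action: "commit", ok: true },
  { flowId: "flow-b", tool: "git_safe", action: "commit", ok: false },
];
const commitClaim: StateClaim = { tool: "git_safe", action: "commit" };
const backedInA = isClaimBacked(exampleLedger, "flow-a", commitClaim);
const backedInB = isClaimBacked(exampleLedger, "flow-b", commitClaim);
```

Matching on structured fields rather than phrases means an agent message that merely contains the word "success" cannot satisfy the check.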
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly enhances the flexibility, intelligence, and reliability of the agent system. It removes previous hardcoded model restrictions, allowing agents like Hephaestus to operate with a wider range of models through dynamic prompt generation. The iterative development loop (Ralph Loop) has been made more robust with advanced verification and session management. Crucially, the underlying 'Truth Model' has been hardened to ensure that all agent actions are verifiable and accurately recorded, preventing agents from making unproven claims. These changes collectively lead to more adaptable and trustworthy autonomous agents.
…ating ledger entries and implement flow isolation in runtime enforcement to validate state claims against the current execution flow.
…nd allow non-gpt models for hephaestus
e63d6a3 to 34de622
Code Review
This pull request is a significant and impressive refactoring that achieves two major goals. First, it successfully decouples agents from hardcoded model requirements, particularly removing the GPT-only restriction for Hephaestus and other categories, which aligns perfectly with the PR's stated objective. The addition of regression tests for model resolution is a great touch.
Second, it introduces a substantial new reliability layer, which you've called the "Truth Model" and "Runtime Enforcement". This includes new safe tools (git_safe, fs_safe), a state ledger, and several hooks to enforce tool contracts and prevent agent hallucinations about state changes. The introduction of flow isolation in the state ledger is a particularly strong improvement for ensuring deterministic verification. While this was a large addition not mentioned in the PR description, the implementation appears robust and well-tested.
Overall, these changes represent a major step forward in the platform's reliability and flexibility. I've left a couple of minor comments below.
Note: Security Review did not run due to the size of the PR.
I am having trouble creating individual review comments, so my feedback is listed here.
.official/src/agents/hephaestus/gpt-5-3-codex.ts (525)
There's a small typo in the agent description: it says 'GPT 5.2 Codex' in a file that is for 'gpt-5-3-codex'. This should be updated for consistency, for example:
"Autonomous Deep Worker - goal-oriented execution with GPT 5.3 Codex. Explores thoroughly before acting, uses explore/librarian agents for comprehensive context, completes tasks end-to-end. Inspired by AmpCode deep mode. (Hephaestus - OhMyOpenCode)",
.runtime/journal/execution.jsonl (1-17)
This appears to be a runtime log file. Log files are typically not checked into source control as they can bloat the repository and cause unnecessary merge conflicts. Please consider adding this file path (.runtime/journal/execution.jsonl) to your .gitignore file.
This PR addresses the hardcoded model requirements that restricted Hephaestus and other agents to GPT models, as requested in PR #3.
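As a rough illustration of resolving a model without a hardcoded provider requirement, the sketch below walks a fallback chain and never rejects a model for not being GPT. The AgentModelConfig shape, the resolveModel function, the resolution order, and the placeholder model IDs are all assumptions, not the repository's actual resolver.

```typescript
// Hypothetical sketch: names, shapes, and resolution order are assumptions.
interface AgentModelConfig {
  userOverride?: string;   // model explicitly configured by the user, if any
  fallbackChain: string[]; // preferred models, in priority order
}

/**
 * Resolve a model for an agent:
 *   1. the user's override, if that model is available;
 *   2. the first available model in the agent's fallback chain;
 *   3. the system default.
 * No step checks the provider, so non-GPT models are accepted.
 */
function resolveModel(
  config: AgentModelConfig,
  available: ReadonlySet<string>,
  systemDefault: string,
): string {
  if (config.userOverride !== undefined && available.has(config.userOverride)) {
    return config.userOverride;
  }
  for (const candidate of config.fallbackChain) {
    if (available.has(candidate)) return candidate;
  }
  return systemDefault;
}

// Usage with placeholder IDs: only a non-GPT model is available.
const availableModels = new Set(["provider-b/model-y"]);
const hephaestusConfig: AgentModelConfig = {
  fallbackChain: ["provider-a/model-x", "provider-b/model-y"],
};
const resolved = resolveModel(hephaestusConfig, availableModels, "provider-c/default");
```

A regression test for this kind of resolver would typically assert that an agent previously limited to one provider still resolves to a model when only other providers are available.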
Key changes: