Master#3
- Enhanced State Ledger schema (success, verified, changedState, stdout, sessionID)
- complete_task & query_ledger mapped to strictly verified, session-scoped truth
- execution-journal hook forwards full status booleans and stdout to ledger
- tool-contract hook strictly types metadata booleans and asserts ledger presence for state changes
- Purged prototype scaffolding (agent-runner, tool-runner, context-builder)
- Updated doctor checks and docs for Phase 1 closure
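The ledger fields named in this commit can be pictured with a minimal sketch. Only the field names (success, verified, changedState, stdout, sessionID) come from the commit message; the `LedgerEntry` type, the `tool` field, the `StateLedger` class, and the `queryVerified` method are hypothetical names chosen for illustration and are not the PR's actual implementation.

```typescript
// Hypothetical sketch: only the five schema field names come from the commit message.
interface LedgerEntry {
  tool: string; // assumed field: which tool produced the entry
  success: boolean;
  verified: boolean;
  changedState: boolean;
  stdout: string;
  sessionID: string;
}

class StateLedger {
  private readonly entries: LedgerEntry[] = [];

  record(entry: LedgerEntry): void {
    this.entries.push(entry);
  }

  // Session-scoped, strictly verified view: entries from other sessions
  // and entries that were never verified are invisible to the caller.
  queryVerified(sessionID: string): LedgerEntry[] {
    return this.entries.filter(
      (e) => e.sessionID === sessionID && e.success && e.verified,
    );
  }
}

const ledger = new StateLedger();
ledger.record({ tool: "bash", success: true, verified: true, changedState: true, stdout: "ok", sessionID: "s1" });
ledger.record({ tool: "bash", success: true, verified: false, changedState: true, stdout: "unchecked", sessionID: "s1" });
ledger.record({ tool: "bash", success: true, verified: true, changedState: false, stdout: "other", sessionID: "s2" });

const visible = ledger.queryVerified("s1");
```

Under these assumptions, a tool like query_ledger would only ever see the first entry for session `s1`; the unverified entry and the entry from `s2` are filtered out.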
- Removed 'Hephaestus is designed exclusively for GPT' block in no-hephaestus-non-gpt hook
- Removed requiresProvider limitations from shared and CLI model requirements for Hephaestus
- Added model resolution regression tests specifying Grok resolves properly for config
- Updated upstream capability doctor to strictly forbid the return of these GPT-only rule strings
- Wired DETERMINISTIC_TOOLS into active createToolRegistry
- Exposed execution, plan, and runtime enforcement hooks via createToolGuardHooks
- Enforced strict boolean contract (success, verified, changedState) on all deterministic tools
- Added unit tests for isolated query_ledger and complete_task behavior
- Updated upstream capability tracker
… for xAI/Grok models by adding cache read tokens back to input.
…t metadata, create `execution.jsonl` for journaling, and add truth model integration tests.
…er entries and adding runtime validation for message claims.
Summary of Changes
This pull request introduces new agents, enhances existing ones, and implements runtime enforcement and verification mechanisms to improve the reliability and effectiveness of the AI system. It also adds support for new models and optimizes prompts for better performance.
Code Review
This pull request introduces several changes, including creating new files for Oracle and Sisyphus agents, modifying Hephaestus-related files, and updating hooks and tool configurations. The review comments suggest correcting a typo in the Hephaestus agent description, removing a redundant function, simplifying type assertions, improving the prompt patching for Gemini models, and removing an extra space. The changes also involve migrating logic from dynamic prompt builders to native prompts, enforcing stricter tool contracts, and adding a new hook for xAI usage patching.
return {
  description:
    "Autonomous Deep Worker - goal-oriented execution with GPT 5.2 Codex. Explores thoroughly before acting, uses explore/librarian agents for comprehensive context, completes tasks end-to-end. Inspired by AmpCode deep mode. (Hephaestus - OhMyOpenCode)",

There's a small typo in the agent description. It mentions GPT 5.2 Codex, but this file is for GPT 5.3 Codex.

Suggested change:
- "Autonomous Deep Worker - goal-oriented execution with GPT 5.2 Codex. Explores thoroughly before acting, uses explore/librarian agents for comprehensive context, completes tasks end-to-end. Inspired by AmpCode deep mode. (Hephaestus - OhMyOpenCode)",
+ "Autonomous Deep Worker - goal-oriented execution with GPT 5.3 Codex. Explores thoroughly before acting, uses explore/librarian agents for comprehensive context, completes tasks end-to-end. Inspired by AmpCode deep mode. (Hephaestus - OhMyOpenCode)",
export function createHephaestusAgent(
  model: string,
  availableAgents?: AvailableAgent[],
  availableToolNames?: string[],
  availableSkills?: AvailableSkill[],
  availableCategories?: AvailableCategory[],
  useTaskSystem = false,
): AgentConfig {
  const tools = availableToolNames ? categorizeTools(availableToolNames) : [];
  const skills = availableSkills ?? [];
  const categories = availableCategories ?? [];
  const prompt = availableAgents
    ? buildHephaestusPrompt(
        availableAgents,
        tools,
        skills,
        categories,
        useTaskSystem,
      )
    : buildHephaestusPrompt([], tools, skills, categories, useTaskSystem);

  return {
    description:
      "Autonomous Deep Worker - goal-oriented execution with GPT 5.2 Codex. Explores thoroughly before acting, uses explore/librarian agents for comprehensive context, completes tasks end-to-end. Inspired by AmpCode deep mode. (Hephaestus - OhMyOpenCode)",
    mode: MODE,
    model,
    maxTokens: 32000,
    prompt,
    color: "#D97706", // Forged Amber - Golden heated metal, divine craftsman
    permission: {
      question: "allow",
      call_omo_agent: "deny",
    } as AgentConfig["permission"],
    reasoningEffort: "medium",
  };
}
createHephaestusAgent.mode = MODE;
This createHephaestusAgent function appears to be redundant. The main factory function in src/agents/hephaestus/agent.ts already handles the logic for selecting the correct prompt builder based on the model. This duplicate function is unused and could cause confusion. I recommend removing it to keep the codebase clean and maintain a single source of truth for agent creation.
  if (isGptModel(model)) {
    return {
      ...base,
      prompt: ORACLE_GPT_PROMPT,
      reasoningEffort: "medium",
      textVerbosity: "high",
    } as AgentConfig;
  }

  return {
    ...base,
    thinking: { type: "enabled", budgetTokens: 32000 },
  } as AgentConfig;
}
The type assertions as AgentConfig on the return statements are likely redundant. Since the function's return type is already declared as AgentConfig, the TypeScript compiler should be able to infer the type of the returned objects. Removing these explicit casts would make the code slightly cleaner.
Suggested change:
-   if (isGptModel(model)) {
-     return {
-       ...base,
-       prompt: ORACLE_GPT_PROMPT,
-       reasoningEffort: "medium",
-       textVerbosity: "high",
-     } as AgentConfig;
-   }
-   return {
-     ...base,
-     thinking: { type: "enabled", budgetTokens: 32000 },
-   } as AgentConfig;
- }
+   if (isGptModel(model)) {
+     return {
+       ...base,
+       prompt: ORACLE_GPT_PROMPT,
+       reasoningEffort: "medium",
+       textVerbosity: "high",
+     };
+   }
+   return {
+     ...base,
+     thinking: { type: "enabled", budgetTokens: 32000 },
+   };
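Beyond tidiness, there is a practical reason to drop `as` casts on returned object literals: a cast can hide a missing required field that a plain return would be checked for. The sketch below illustrates this with a hypothetical `AgentConfigSketch` type; it is not the project's real `AgentConfig`, and whether the casts in this PR are actually hiding anything depends on how `base` is typed, which the quoted snippet does not show.

```typescript
// Hypothetical stand-in for AgentConfig; field names are illustrative only.
interface AgentConfigSketch {
  model: string;
  prompt: string;
  reasoningEffort?: "low" | "medium" | "high";
}

function withCast(model: string): AgentConfigSketch {
  // Compiles even though the required `prompt` field is missing:
  // the assertion overrides the compiler's structural check.
  return { model } as AgentConfigSketch;
}

function withoutCast(model: string, prompt: string): AgentConfigSketch {
  // Without the cast, the declared return type is enforced; omitting
  // `prompt` here would be a compile-time error.
  return { model, prompt, reasoningEffort: "medium" };
}

const casted = withCast("some-model");
const checked = withoutCast("some-model", "You are an oracle.");
```

At runtime `casted.prompt` is `undefined` despite its declared type of `string`, which is the kind of gap the uncast version rules out.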
if (isGeminiModel(model)) {
  // 1. Intent gate + tool mandate - early in prompt (after intent verbalization)
  prompt = prompt.replace(
    "</intent_verbalization>",
    `</intent_verbalization>\n\n${buildGeminiIntentGateEnforcement()}\n\n${buildGeminiToolMandate()}`
  );

  // 2. Tool guide + examples - after tool_usage_rules (where tools are discussed)
  prompt = prompt.replace(
    "</tool_usage_rules>",
    `</tool_usage_rules>\n\n${buildGeminiToolGuide()}\n\n${buildGeminiToolCallExamples()}`
  );

  // 3. Delegation + verification overrides - before Constraints (NOT at prompt end)
  // Gemini suffers from lost-in-the-middle: content at prompt end gets weaker attention.
  // Placing these before <Constraints> ensures they're in a high-attention zone.
  prompt = prompt.replace(
    "<Constraints>",
    `${buildGeminiDelegationOverride()}\n\n${buildGeminiVerificationOverride()}\n\n<Constraints>`
  );
}
Using string.replace() to patch the prompt for Gemini models is a bit brittle. If the base prompt's structure or the placeholder strings change, these replacements could fail silently. A more robust approach would be to make the prompt builder more modular, allowing different sections to be conditionally included or replaced based on the model. For example, you could have a buildGeminiSpecificSections function that returns the necessary overrides, and incorporate them during the initial prompt construction.
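One way to picture the modular approach suggested above is to keep the prompt as a list of named sections and attach model-specific inserts to section ids, failing loudly when an anchor is missing. This is a minimal sketch under assumptions: `PromptSection`, `SectionInsert`, `composePrompt`, the section ids, and the placeholder bodies are all hypothetical and do not come from the PR.

```typescript
// Hypothetical sketch; names, section ids, and bodies are illustrative only.
interface PromptSection {
  id: string;
  body: string;
}

interface SectionInsert {
  anchor: string; // id of an existing base section
  position: "before" | "after";
  section: PromptSection;
}

function composePrompt(base: PromptSection[], inserts: SectionInsert[]): string {
  const ids = new Set(base.map((s) => s.id));
  for (const ins of inserts) {
    if (!ids.has(ins.anchor)) {
      // Unlike String.prototype.replace, a missing anchor is an error, not a no-op.
      throw new Error(`Unknown prompt section anchor: ${ins.anchor}`);
    }
  }

  const out: string[] = [];
  for (const s of base) {
    for (const ins of inserts) {
      if (ins.anchor === s.id && ins.position === "before") out.push(ins.section.body);
    }
    out.push(s.body);
    for (const ins of inserts) {
      if (ins.anchor === s.id && ins.position === "after") out.push(ins.section.body);
    }
  }
  return out.join("\n\n");
}

const baseSections: PromptSection[] = [
  { id: "intent_verbalization", body: "<intent_verbalization>...</intent_verbalization>" },
  { id: "constraints", body: "<Constraints>...</Constraints>" },
];

const geminiInserts: SectionInsert[] = [
  { anchor: "intent_verbalization", position: "after", section: { id: "gemini_tool_mandate", body: "[Gemini tool mandate]" } },
  { anchor: "constraints", position: "before", section: { id: "gemini_delegation", body: "[Gemini delegation override]" } },
];

const geminiPrompt = composePrompt(baseSections, geminiInserts);
```

Because inserts are keyed by section id rather than by a literal substring of the prompt text, rewording a section does not silently drop the Gemini overrides; renaming or removing a section surfaces as an exception at build time.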
export function getGeminiAtlasPrompt(): string {
  return ATLAS_GEMINI_SYSTEM_PROMPT
  return ATLAS_GEMINI_SYSTEM_PROMPT
* fix(phase1): revert runtime file changes, keep prompt-only capability ports
- Revert src/hooks/runtime-enforcement/hook.ts to HEAD (runtime authority unchanged)
- Revert src/agents/runtime/state-ledger.ts to HEAD (runtime authority unchanged)
- Revert src/agents/runtime/tool-runner.ts to HEAD (runtime authority unchanged)
- Revert src/agents/oracle.ts to HEAD (Heidi baseline is already stronger)
- Revert src/agents/sisyphus.ts (flat file) to HEAD - Heidi's modular flow preserved
- Revert src/agents/builtin-agents.ts to HEAD - sisyphus-agent wiring preserved
- Hephaestus dir: prompt-only, imports hard-blocks/anti-patterns from Heidi's prompts module
- Sisyphus dir: prompt-only, imports hard-blocks/anti-patterns from Heidi's prompts module
- Add isGpt5_4Model/isGpt5_3CodexModel type guards to types.ts for Hephaestus dispatch
- Update doctor check: remove dynamic-agent-prompt-builder as forbidden (passive library)
- Update doctor check: loop guard deferred to separate runtime-only PR
Phase 0 changes (Atlas/Gemini/GPT verification wave) and Sisyphus/Hephaestus
prompt-layer capability improvements are preserved. No runtime changes in this PR.
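The model guards mentioned in this commit, and the dispatch they enable, might look roughly like the sketch below. Only the two function names come from the commit message; the matching rules, the `normalizeModelId` helper, the `HephaestusPromptVariant` type, and `selectHephaestusPromptVariant` are assumptions for illustration, and the functions are written as plain boolean predicates because the real signatures are not shown.

```typescript
// Hypothetical sketch: only the two guard names come from the commit message.
function normalizeModelId(model: string): string {
  // Assumed convention: strip an optional "provider/" prefix and lowercase the rest.
  const slash = model.lastIndexOf("/");
  return (slash >= 0 ? model.slice(slash + 1) : model).toLowerCase();
}

function isGpt5_3CodexModel(model: string): boolean {
  return normalizeModelId(model).startsWith("gpt-5.3-codex");
}

function isGpt5_4Model(model: string): boolean {
  return normalizeModelId(model).startsWith("gpt-5.4");
}

type HephaestusPromptVariant = "gpt-5.4" | "gpt-5.3-codex" | "default";

// Dispatch on the model id; anything unrecognized falls through to a default
// prompt rather than being rejected.
function selectHephaestusPromptVariant(model: string): HephaestusPromptVariant {
  if (isGpt5_4Model(model)) return "gpt-5.4";
  if (isGpt5_3CodexModel(model)) return "gpt-5.3-codex";
  return "default";
}
```

The fall-through default matters for the later change that allows non-GPT models for Hephaestus: an unrecognized id selects a generic prompt instead of failing.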
* fix(phase1): completely eliminate builder from capabilities & unify ledger
- Removed dynamic-agent-prompt-builder imports from Sisyphus and Hephaestus
- Ported official wording/orchestration into src/agents/prompts/orchestration.ts
- Moved Agent/Skill/Tool/Category types into src/agents/types.ts
- Deleted dynamic-agent-prompt-builder and restored it to the doctor forbidden list
- Tightened runtime enforcement hook to scan current chat flow instead of global historical ledger
- Dropped generic phrase matching ('success') from enforcement checks
- Unused/shadow state-ledger from agents runtime deleted
- Unified state-ledger across agent and core runtime
* Refactor the truth model to prevent unverified bash commands from creating ledger entries and implement flow isolation in runtime enforcement to validate state claims against the current execution flow.
* fix(model-requirements): remove hardcoded gpt-5.3-codex constraints and allow non-gpt models for hephaestus
Summary
Changes
Screenshots
Testing
bun run typecheck
bun test
Related Issues