docs(gepa): rewrite Section 8 with accurate custom proposer behavior for ReAct

Ju-usc · Ju-usc · commit 776ab9bbb3d8 · 2025-10-25T20:08:41.000-07:00
- Clarify custom proposer receives ALL components (regular + ReAct)
- Add realistic signature with ReAct failure patterns and component types
- Use exact naming from implementation: examples_with_feedback, component_reflective_data, propose_instruction
- Show _format_examples() helper matching real markdown formatting
- Remove regular component handling to keep example focused on ReAct
- Test code example validates successfully
- Fix contradiction: optimize_react_components must be True (not irrelevant)

docs(gepa): clarify custom proposer behavior in routing section

Change 'overrides the default routing' to 'receives all components and handles the optimization logic' to avoid confusion with optimize_react_components which still controls discovery/serialization

docs(gepa): remove discouraging recommendation from custom proposer section

Users reading this section want to learn how to implement custom proposers for ReAct - don't discourage them from doing so
diff --git a/docs/docs/api/optimizers/GEPA/GEPA_Advanced.md b/docs/docs/api/optimizers/GEPA/GEPA_Advanced.md
@@ -550,7 +550,7 @@ When `optimize_react_components=True`, GEPA:
 1. **Discovers ReAct modules** - Finds all `dspy.ReAct` instances in your program (including nested modules)
 2. **Extracts components** - Collects react instructions, extract instructions, and tool schemas from each ReAct module
 3. **Routes to proposers** - Separates components by type and routes them appropriately:
-   - **With custom `instruction_proposer`**: Your custom proposer overrides the default routing and receives all components (both regular instructions and ReAct components)
+   - **With custom `instruction_proposer`**: Your custom proposer receives all components (both regular instructions and ReAct components) and handles the optimization logic
    - **With default proposer**: Regular instructions use default instruction proposer, ReAct components use specialized `ReActModuleProposer`
 4. **Optimizes jointly** - ReAct proposer improves all four components together based on execution feedback
 5. **Applies updates** - Updates your ReAct modules with improved instructions and tool descriptions
@@ -686,30 +686,166 @@ for tool_name, tool in optimized_agent.tools.items():
         print(f"  Argument descriptions:", tool.arg_desc)
 ```
 
-### Compatibility with Custom Instruction Proposers
+### Custom Instruction Proposers and ReAct Optimization
+
+**Important:** When you provide a custom `instruction_proposer`, it receives ALL components (regular predictors AND ReAct modules). You must set `optimize_react_components=True` to enable ReAct module discovery and serialization, then handle the optimization logic yourself.
+
+**How it works internally:**
+
+1. **Component Discovery** - GEPA discovers components in your program:
+   - Regular predictors → keys like `"predict"`, `"chain_of_thought"`
+   - ReAct modules → keys like `"react_module"` or `"react_module:agent_name"`
+
+2. **ReAct Serialization** - When `optimize_react_components=True`, GEPA serializes ReAct modules as JSON:
+   ```json
+   {
+     "react": "instruction for reasoning and tool selection",
+     "extract": "instruction for answer extraction",
+     "tools": {
+       "tool_name": {
+         "desc": "what the tool does",
+         "args": {"param": {"type": "string"}},
+         "arg_desc": {"param": "description of param"}
+       }
+     }
+   }
+   ```
+
+3. **Custom Proposer Receives**:
+   - `candidate: dict[str, str]` - **All values are strings**
+     - Regular component: `candidate["predict"]` → `"Your instruction here"`
+     - ReAct component: `candidate["react_module"]` → `'{"react": "...", "extract": "...", "tools": {...}}'` (JSON as a string)
+   - `reflective_dataset: dict[str, list[ReflectiveExample]]` - **GEPA provides this** 
+     - Contains execution traces: inputs, outputs (including full ReAct trajectory), and your metric's feedback
+     - For ReAct: `Generated_Outputs` includes the entire trajectory with all tool calls and reasoning
+     - Use this to understand what went wrong and guide your improvements
+   - `components_to_update: list[str]` - Component keys to optimize this round
+
+4. **Your Responsibility**:
+   - For ReAct components: Use `json.loads()` to parse, improve all 4 parts, use `json.dumps()` to return
+   - For regular components: Improve the instruction string directly
+   - Return `dict[str, str]` with same keys
+
+**What this means:**
+- Your custom proposer receives ALL components: regular signatures AND ReAct modules
+- GEPA still does discovery and JSON serialization, but YOU handle the optimization logic
+- ReAct components are passed with keys like `"react_module"` or `"react_module:agent_name"`
+
+#### Implementing a Custom Proposer for ReAct
+
+If you need custom logic, you must handle ReAct components yourself. ReAct components are stored as JSON strings containing all 4 parts:
 
-ReAct component optimization works seamlessly with custom instruction proposers. When you provide a custom instruction proposer AND enable `optimize_react_components=True`:
-
-**Component routing:**
-- **Signature instructions** → Your custom instruction proposer
-- **Tool descriptions** → Built-in `ToolProposer` with specialized tool reflection prompt
+```python
+import json
 
-**Key points:**
-- Both operate independently during the same GEPA run
-- Tools receive domain-appropriate optimization guidance (tool selection patterns, usage context)
-- Signatures use your custom logic (task-specific reasoning, formatting, etc.)
-- The built-in tool proposer is not customizable - it always uses `GenerateImprovedToolDescriptionFromFeedback`
+# Define signature for improving ReAct components
+class ImproveReActInstruction(dspy.Signature):
+    """Analyze agent execution failures and improve the instruction.
+    
+    Focus on common ReAct failure patterns:
+    - Tool selection errors (wrong tool chosen)
+    - Missing tool calls (agent gave up without trying)
+    - Incorrect tool arguments
+    - Extraction failures (couldn't extract answer from trajectory)
+    """
+    current_instruction = dspy.InputField(desc="The current instruction being optimized")
+    component_type = dspy.InputField(desc="Type: 'react' (reasoning), 'extract' (extraction), or 'tool' (tool description)")
+    examples_with_feedback = dspy.InputField(desc="Examples showing what went wrong: inputs, outputs, and feedback")
+    improved_instruction = dspy.OutputField(desc="Improved instruction addressing the observed failures")
 
-This separation ensures tools and signatures get appropriate optimization strategies without interference.
 
-```python
-from dspy.teleprompt.gepa.instruction_proposal import MultiModalInstructionProposer
+class CustomProposer:
+    def __call__(self, candidate, reflective_dataset, components_to_update):
+        """
+        When you provide a custom proposer, it receives ALL components (regular + ReAct).
+        
+        Args:
+            candidate: dict[str, str] - All component instructions to update
+                - Regular: "predict" -> "Your instruction..."
+                - ReAct: "react_module" -> JSON string: {"react": "...", "extract": "...", "tools": {...}}
+            reflective_dataset: dict[str, list[ReflectiveExample]]
+                - Component name -> list of examples with Inputs, Generated_Outputs, Feedback
+            components_to_update: list[str] - All components to update this round
+        
+        Returns:
+            dict[str, str] - Updated instructions for all components
+        """
+        propose_instruction = dspy.Predict(ImproveReActInstruction)
+        results = {}
+        
+        for component in components_to_update:
+            if not component.startswith("react_module"):
+                continue  # Skip non-ReAct components (handle them separately if needed)
+            
+            # Parse the JSON config
+            config = json.loads(candidate[component])
+            # config contains: {"react": "...", "extract": "...", "tools": {...}}
+            
+            component_reflective_data = reflective_dataset[component]
+            
+            # Format examples (limit to first 3 for efficiency)
+            formatted_examples = self._format_examples(component_reflective_data[:3])
+            
+            # Improve react instruction (reasoning and tool selection)
+            improved_react = propose_instruction(
+                current_instruction=config["react"],
+                component_type="react",
+                examples_with_feedback=formatted_examples
+            ).improved_instruction
+            
+            # Improve extract instruction (answer extraction from trajectory)
+            improved_extract = config.get("extract", "")
+            if improved_extract:
+                improved_extract = propose_instruction(
+                    current_instruction=improved_extract,
+                    component_type="extract",
+                    examples_with_feedback=formatted_examples
+                ).improved_instruction
+            
+            # Improve tool descriptions (what each tool does and when to use it)
+            improved_tools = {}
+            for tool_name, tool_info in config.get("tools", {}).items():
+                improved_desc = propose_instruction(
+                    current_instruction=tool_info["desc"],
+                    component_type="tool",
+                    examples_with_feedback=formatted_examples
+                ).improved_instruction
+                
+                improved_tools[tool_name] = {
+                    "desc": improved_desc,
+                    "args": tool_info["args"],  # Keep args schema unchanged
+                    "arg_desc": tool_info.get("arg_desc", {})  # Can also improve these
+                }
+            
+            # Return as JSON string
+            results[component] = json.dumps({
+                "react": improved_react,
+                "extract": improved_extract,
+                "tools": improved_tools
+            })
+        
+        return results
+    
+    def _format_examples(self, reflective_data: list) -> str:
+        """Format reflective examples into markdown for the LM."""
+        formatted_parts = []
+        for i, example in enumerate(reflective_data):
+            s = f"# Example {i + 1}\n"
+            for key, val in example.items():
+                s += f"## {key}\n{str(val).strip()}\n\n"
+            formatted_parts.append(s)
+        return "\n\n".join(formatted_parts)
 
 gepa = dspy.GEPA(
     metric=my_metric,
-    reflection_lm=dspy.LM(model="gpt-5", temperature=1.0, max_tokens=32000, api_key=api_key),
-    instruction_proposer=MultiModalInstructionProposer(),  # For signatures
-    optimize_react_components=True,  # Enables ReActModuleProposer
+    reflection_lm=dspy.LM(model="gpt-5", temperature=1.0, max_tokens=32000),
+    instruction_proposer=CustomProposer(),  # Receives ALL components (regular + ReAct)
+    optimize_react_components=True,  # Must be True to discover ReAct modules
     auto="medium"
 )
 ```
+
+**Key points:**
+- ReAct components are JSON strings - use `json.loads()` to parse, `json.dumps()` to return
+- 4 parts to improve: `react` instruction, `extract` instruction, tool `desc`, tool `arg_desc`
+- Tools structure: `{"tool_name": {"desc": "...", "args": {...}, "arg_desc": {...}}}`