Handling OutputValidationError retry loops and workflow hangs with local LLMs

I am currently testing Shannon using a local LLM (Qwen 3.5 122B via an OpenAI-compatible endpoint). During my testing, I've encountered a few workflow stability issues and would appreciate some guidance on how to configure or handle them.

### 1. `OutputValidationError` and Temporal Retry Loops
The `pre-recon` agent runs for many turns without problems but eventually fails with an `OutputValidationError`.

The main problem is that Temporal treats this validation failure as a retryable error (`nonRetryable: false`). Instead of failing fast, the activity is retried repeatedly. I observed a single `pre-recon` activity running for about 75 minutes (`elapsedSeconds: 4484`) across multiple attempts, consuming significant resources and making the workflow look hung.

<img width="962" height="276" alt="Image" src="https://github.com/user-attachments/assets/3f0eaf59-56ae-44aa-bd72-45ff99fe7270" />
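To make the retry behaviour concrete, here is a minimal, self-contained sketch of the decision I would expect for this failure. It does not import or call the Temporal SDK. The field names `maximumAttempts` and `nonRetryableErrorTypes` mirror Temporal's retry policy, but the `preReconRetry` values, the `shouldRetry` helper, and the `TimeoutError` name in the usage example are hypothetical and only there for illustration. I don't know whether Shannon exposes these settings.

```typescript
// Local model of a retry decision. Nothing here talks to Temporal.
interface RetryPolicySketch {
  maximumAttempts: number;          // hard cap on attempts (1-based)
  nonRetryableErrorTypes: string[]; // failure types that should fail fast
}

interface FailureSketch {
  type: string;
  nonRetryable: boolean;
}

// Hypothetical policy: cap attempts and fail fast on validation errors.
const preReconRetry: RetryPolicySketch = {
  maximumAttempts: 3,
  nonRetryableErrorTypes: ["OutputValidationError"],
};

// Returns true if another attempt should be scheduled after `attempt` failed.
function shouldRetry(
  failure: FailureSketch,
  attempt: number,
  policy: RetryPolicySketch
): boolean {
  // The failure itself says "do not retry".
  if (failure.nonRetryable) return false;
  // The policy lists this failure type as fail-fast.
  if (policy.nonRetryableErrorTypes.indexOf(failure.type) !== -1) return false;
  // The attempt budget is used up.
  if (attempt >= policy.maximumAttempts) return false;
  return true;
}

// The failure from my log: retryable by flag, but blocked by the policy.
const validationFailure: FailureSketch = {
  type: "OutputValidationError",
  nonRetryable: false,
};
console.log(shouldRetry(validationFailure, 2, preReconRetry));
```

In the Temporal TypeScript SDK, the analogous settings would be the `retry` option passed to `proxyActivities`, or throwing `ApplicationFailure.nonRetryable(...)` from the activity. Whether either can be set from Shannon's configuration is exactly what I'm asking.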

#### Temporal Dashboard Log
```json
{
  "sdkComponent": "worker",
  "taskQueue": "shannon-pipeline",
  "attempt": 2,
  "activityType": "runPreReconAgent",
  "error": "ApplicationFailure: Agent pre-recon failed output validation",
  "cause": {
    "type": "OutputValidationError",
    "nonRetryable": false
  },
  "durationMs": 386443
}
```
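For anyone triaging similar entries, this is a small sketch of how I read the log above. The `ActivityLogEntry` type and the `summarize` helper are my own names, not part of Shannon or Temporal, and the type only covers the fields used here.

```typescript
// Shape of the fields used from the log entry above (hypothetical type).
interface ActivityLogEntry {
  attempt: number;
  activityType: string;
  cause: { type: string; nonRetryable: boolean };
  durationMs: number;
}

// Subset of the log entry, copied from the dashboard output.
const entry: ActivityLogEntry = JSON.parse(`{
  "attempt": 2,
  "activityType": "runPreReconAgent",
  "cause": { "type": "OutputValidationError", "nonRetryable": false },
  "durationMs": 386443
}`);

// Summarize one failed attempt: its duration in minutes (one decimal place)
// and whether the failure is flagged as eligible for another attempt.
function summarize(e: ActivityLogEntry): { minutes: number; retryable: boolean } {
  return {
    minutes: Math.round(e.durationMs / 6000) / 10,
    retryable: !e.cause.nonRetryable,
  };
}

const summary = summarize(entry);
console.log(`${entry.activityType} attempt ${entry.attempt}: ` +
  `${summary.minutes} min, retryable=${summary.retryable}`);
```

So this one failed attempt took a bit over six minutes and was still marked as retryable, which matches the long total runtime I saw.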

