Heartbeat marks idle reactive agents as `Crashed`, conflating silence with failure

### Description

  ### Title
  Heartbeat marks idle reactive agents as `Crashed`, conflating silence with failure

  ### Summary

  The kernel heartbeat monitor marks an agent `Unresponsive` after 60s of
  inactivity and `Crashed` after another 30s, then fires auto-recovery via
  the supervisor. For reactive agents (no continuous/cron schedule), silence
  is the steady state — there is no activity between user messages. This
  causes false-positive crash detection on every idle period, consuming
  `max_restarts` budget and producing noisy UI state flips even when
  nothing is actually wrong.
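
  To make the failure mode concrete, here is a minimal sketch of a
  silence-only check using the thresholds described above. The names
  `Health` and `classify_by_silence` are hypothetical and are not taken
  from the OpenFang source; the real monitor may be structured differently.

  ```rust
  use std::time::Duration;

  /// Health states named in this report (hypothetical enum, not OpenFang's type).
  #[derive(Debug, PartialEq, Eq, Clone, Copy)]
  enum Health {
      Running,
      Unresponsive,
      Crashed,
  }

  // Thresholds from the report: 60s of inactivity, then another 30s.
  const UNRESPONSIVE_AFTER: Duration = Duration::from_secs(60);
  const CRASHED_AFTER: Duration = Duration::from_secs(90);

  /// Assumed model of the current behavior: only elapsed silence is
  /// considered, so an agent waiting for input looks identical to a hung one.
  fn classify_by_silence(silence: Duration) -> Health {
      if silence > CRASHED_AFTER {
          Health::Crashed
      } else if silence > UNRESPONSIVE_AFTER {
          Health::Unresponsive
      } else {
          Health::Running
      }
  }
  ```

  Under this model, a reactive agent that has been waiting two minutes for
  the next user message is classified as `Crashed`, because nothing in the
  input says whether any work was in flight.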

  ### Environment

  - OpenFang 0.6.0 (Linux x86_64)
  - Agent: reactive mode, `schedule: reactive`, no cron/continuous
  - `autonomous.heartbeat_interval_secs = 30`, `max_restarts = 10`

  ### Actual

  - `Crashed` fires on any silence > 90s regardless of whether the agent
    is mid-operation or simply waiting for input.
  - Auto-recovery consumes a restart from `max_restarts`; users hitting
    the cap get their agent permanently marked failed for no real reason.
  - Dashboard briefly flashes `Crashed` on window focus before auto-recovery
    completes, which is alarming for users who aren't aware it's a
    false positive.
  - Users can't tell from the logs whether a real hang occurred or the
    agent was simply idle.

  ### Proposed change

  Introduce a new state (e.g. `Idle` or `Awaiting`) that represents
  "no LLM activity but agent is healthy and ready for input." The
  `Crashed` state should only fire when:

  - an LLM call is in-flight and exceeds a configurable deadline, or
  - a `shell_exec`/tool call is blocked beyond its own timeout, or
  - the agent task process has actually panicked/exited.
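
  The three conditions above could be sketched roughly as follows. This is
  an illustration only: `Activity`, `AgentState`, `classify`, and the
  `llm_deadline` parameter are hypothetical names, and the sketch assumes
  the kernel can tell what the agent is currently doing. Whether
  `Unresponsive` should remain as an intermediate state is left open here.

  ```rust
  use std::time::Duration;

  /// Proposed states (hypothetical names; `Idle` could equally be `Awaiting`).
  #[derive(Debug, PartialEq, Eq, Clone, Copy)]
  enum AgentState {
      Running,
      Idle,
      Crashed,
  }

  /// What the agent is doing right now, as assumed to be tracked by the kernel.
  #[derive(Debug, Clone, Copy)]
  enum Activity {
      /// Turn finished; waiting for the next user message.
      AwaitingInput,
      /// An LLM call is in flight.
      LlmCall { elapsed: Duration },
      /// A `shell_exec`/tool call is in flight with its own timeout.
      ToolCall { elapsed: Duration, timeout: Duration },
      /// The agent task has panicked or exited.
      Exited,
  }

  /// Silence alone never yields `Crashed`; only a blown deadline or an exit does.
  fn classify(activity: Activity, llm_deadline: Duration) -> AgentState {
      match activity {
          Activity::AwaitingInput => AgentState::Idle,
          Activity::LlmCall { elapsed } if elapsed > llm_deadline => AgentState::Crashed,
          Activity::ToolCall { elapsed, timeout } if elapsed > timeout => AgentState::Crashed,
          Activity::Exited => AgentState::Crashed,
          // In-flight work that is still within its deadline.
          Activity::LlmCall { .. } | Activity::ToolCall { .. } => AgentState::Running,
      }
  }
  ```

  With this shape, an agent in `AwaitingInput` stays `Idle` no matter how
  long the silence lasts, so idle periods would no longer consume
  `max_restarts`.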

### Expected Behavior

An idle reactive agent should remain in a steady `Idle` (or `Awaiting`) state, distinct from `Crashed`. `Crashed` should reflect actual failure (a panic, an unrecoverable error, or an LLM call that stays stuck past its timeout mid-operation), not the mere absence of activity.

### Steps to Reproduce

  1. Create or activate any reactive agent (e.g., Browser Hand, Clip Hand,
     or a user-created agent with `schedule: reactive`).
  2. Send it a message, let the turn complete, then leave the chat window
     idle for >90 seconds.
  3. Observe in the dashboard or via `GET /api/agents/<id>`: state
     transitions `Running → Unresponsive → Crashed → (auto-recover) →
     Running`.
  4. Confirm that there are no errors in `journalctl -u openfang`, no
     messages in any queue, and no exception traces. The agent never
     actually failed.


### OpenFang Version

0.6.0

### Operating System

Linux (x86_64)

### Logs / Screenshots

_No response_

Heartbeat marks idle reactive agents as `Crashed`, conflating silence with failure #1102
