Skip to content

fix(containers): refresh after lifecycle actions; shell picker (bash/sh) instead of host $SHELL#6

Closed
torosent wants to merge 9 commits into
mainfrom
fix/stop-shell-feedback
Closed

fix(containers): refresh after lifecycle actions; shell picker (bash/sh) instead of host $SHELL#6
torosent wants to merge 9 commits into
mainfrom
fix/stop-shell-feedback

Conversation

@torosent
Copy link
Copy Markdown
Owner

@torosent torosent commented May 4, 2026

Summary

Fixes the bugs visible in the screenshots reported on the containers screen.

Bug 1: x (stop) and other lifecycle actions don't refresh

Pressing x (stop), Shift+K, Shift+R (restart), or p (pause) executed the action against Apple's container runtime but didn't refresh the local snapshot, so the state column kept showing the old state for up to 2 seconds.

Each helper now batches a follow-up ListContainers via state.MakeRefreshedCmd (same pattern delete / prune already used) and emits a clear status toast on success.

Bug 2: s (shell) using $SHELL from the host

Original code did os.Getenv("SHELL"), defaulting to /bin/zsh on macOS — which isn't in most Linux containers. container exec -it <id> /bin/zsh printed Error: failed to exec process Error Domain=NSPOSIXErrorDomain Code=19 "Operation not supported by device" to stderr but returned exit 0, so the TUI never saw an error and the user got dumped back to a glitched screen with no feedback.

Replaced the $SHELL lookup with a small picker modal asking the user to pick bash or sh:

╭────────────────────────────────────────────────╮
│                                                │
│    Open shell in e8019602d0d6                  │
│                                                │
│   ▸ b  bash  (/bin/bash)                       │
│     [s] sh    (/bin/sh)                        │
│                                                │
│  b/s: pick • ↑/↓+Enter: pick • Esc: cancel     │
│                                                │
╰────────────────────────────────────────────────╯

b/s are one-keystroke picks; arrow keys + Enter also work; Esc cancels. The modal emits modals.ShellPickedMsg{ID, Shell}; the containers screen catches it and emits screens.SuspendShellMsg for the app-level tea.ExecProcess handler.

Bug 3: glitched TUI after tea.ExecProcess returns

Even with a successful shell session, after exit the next altscreen frame sometimes rendered on top of stale cells (truncated table + leftover JSON visible — see the screenshot in the issue). The SuspendShellMsg handler now returns tea.WindowSize() after exec to force every screen and the open modal to reflow against the real terminal size.

Bug 4: s on a stopped container fails silently

The shell helper now refuses non-running containers with a clear toast. ExecProcess errors at the app.go layer are also surfaced as a toast so any other failure is visible.

Tests

All verified to fail without the fix and pass with it:

  • TestContainersXStopsContainer
  • TestLifecycleActionsRefreshAfterAction
  • TestContainersSOnStoppedContainerEmitsToast
  • TestContainersSOpensShellPicker
  • TestContainersShellPickedConvertsToSuspend
  • TestShellPicker_*

Full suite green; go vet ./... clean; make build clean.

torosent added 2 commits May 4, 2026 11:18
…-stopped

Two real bugs reported on the containers screen:

1. Pressing 'x' (stop), Shift+K (sigkill), Shift+R (restart), or 'p'
   (pause) executed the action but didn't refresh the table. The
   user had to wait up to 2 seconds for the next poll tick before
   the state column updated. Each helper now batches a follow-up
   ListContainers via state.MakeRefreshedCmd — same pattern as
   delete and prune — and emits a clear status toast on success.

2. Pressing 's' (shell) on a non-running container failed silently.
   tea.ExecProcess ran 'container exec -it <id> <shell>', which
   exits immediately for stopped containers, then the TUI resumed
   to the same screen with no feedback — the user thought 's' was
   broken. The shell helper now refuses non-running containers
   with a clear toast, and the SuspendShellMsg handler in app.go
   surfaces ExecProcess errors as a toast so any other exec
   failure (image without /bin/sh, race with another stop, etc.)
   is visible.

Regression tests:
  - TestContainersXStopsContainer now also asserts the follow-up
    ListContainers call.
  - TestLifecycleActionsRefreshAfterAction covers sigkill, restart,
    and pause.
  - TestContainersSOnStoppedContainerEmitsToast covers the shell
    refusal.
Two follow-ups on top of the previous shell-feedback fix, both prompted by
real screenshots from the user:

1. The host's $SHELL is the wrong shell to use inside a container.
   On macOS users default to /bin/zsh, which is rarely present in
   Linux containers — `container exec -it <id> /bin/zsh` then
   exits with an error to stderr but a 0 exit code, so the TUI
   never sees a failure to surface. Replace the $SHELL lookup with
   a small modal that asks the user to pick bash or sh:
     - One-keystroke picks: 'b' for bash, 's' for sh.
     - Arrow keys + Enter for navigated picks.
     - Esc cancels.
     - Modal emits modals.ShellPickedMsg{ID, Shell}; the containers
       screen converts that to screens.SuspendShellMsg.

2. After tea.ExecProcess returned the screen sometimes rendered
   half-blank with stale cells visible (truncated table + leftover
   inspect JSON). Force a fresh tea.WindowSize() in the new
   shellExecDoneMsg handler so every screen and the open modal
   reflow against the real terminal size and repaint the full
   altscreen.

Tests:
  - TestContainersSOpensShellPicker covers the picker-on-running flow.
  - TestContainersShellPickedConvertsToSuspend covers the modal-result handoff.
  - TestShellPicker_* covers the modal itself (hotkeys, cursor+Enter,
    Esc cancel, view contents).
  - TestContainersSEmitsSuspendShellMsg removed (its $SHELL-based
    contract is intentionally gone).
@torosent torosent changed the title fix(containers): refresh after lifecycle actions; toast on shell-into-stopped fix(containers): refresh after lifecycle actions; shell picker (bash/sh) instead of host $SHELL May 4, 2026
torosent added 7 commits May 4, 2026 13:21
…; clear-screen on resume

Three issues from the latest user report:

1. "I clicked bash and nothing happened, I expected an error."
   Root cause: the picker batches ShellPickedMsg alongside
   CloseModalMsg, but tea.Batch makes no ordering guarantees. When
   ShellPickedMsg arrived first the picker was still top of stack,
   the modal received the message, didn't handle it, and the pick
   was silently dropped. Added an explicit typed case in
   app.Update that forwards ShellPickedMsg directly to the active
   screen — mirroring the ConfirmResultMsg pattern that's been
   correct for delete/prune all along.

2. Defensive probe before tea.ExecProcess. Apple's `container exec`
   returns exit 0 EVEN WHEN THE SHELL ISN'T INSTALLED — it writes
   the error to stderr (visible for milliseconds before altscreen
   re-entry hides it) and exits cleanly. We can't surface a toast
   post-hoc because tea.ExecProcess sees a clean exit. So before
   suspending the TUI we run `container exec <id> test -x <shell>`
   (no -i/-t, 3s timeout) and toast immediately if the probe
   fails: "<shell> not available in <id> — try the other shell".

3. "After I typed exit, I got corrupt graphics."
   tea.WindowSize() alone wasn't enough — bubbletea's renderer
   preserves cells it thinks are unchanged, but altscreen state is
   corrupt because the shell ran with stdout writing to the host
   terminal during the suspend. Issue tea.ClearScreen first
   (\033[2J\033[H) and then re-query window size so every screen
   reflows. Without this, the post-exit frame can show leftover
   shell output and stale modal cells.

Regression test:
  - TestAppShellPickedMsgReachesScreenWhilePickerOpen — feeds
    ShellPickedMsg while the picker is still top of stack and
    asserts the screen produces SuspendShellMsg{ID, Shell}.
    Verified failing without the typed-case fix.
…g + screen Init

The previous attempt batched tea.ClearScreen + tea.WindowSize() (an
async terminal-size probe). Even that wasn't enough — the user
reported the table rendering only a single row after exiting an
in-container shell. Root cause: bubbles/table holds an internal
viewport state that the suspend/resume cycle leaves degenerate, and
the async tea.WindowSize() round-trip means several frames render
with stale dimensions before the real WindowSizeMsg arrives.

Three pieces in order:
  1. tea.ClearScreen — wipes the terminal buffer and resets the
     renderer's cell tracking via repaint().
  2. SYNTHETIC tea.WindowSizeMsg with the dims we already hold in
     m.width/m.height (unchanged during exec). Propagates through
     every screen and modal so each re-runs SetHeight + reflow,
     forcing bubbles/table viewport state to recompute.
  3. Re-Init() the active screen so its polling tick is rearmed
     and a fresh RefreshedMsg fires. Without this, the auto-refresh
     loop is dead because the tick that fired during the suspend
     was consumed without arming a new one.

Avoids tea.WindowSize() entirely — using known dims is faster
(no terminal round-trip) and deterministic.
Previous fixes (ClearScreen alone, tea.WindowSize async, synthetic
WindowSizeMsg with known dims) all left the post-exec frame
glitched in the user's terminal — chrome (banner, table headers,
status bar) missing while the body table rendered correctly.

Bubbletea's RestoreTerminal calls renderer.enterAltScreen()
unconditionally if altscreen was active before the exec, but
enterAltScreen is idempotent: the altscreen is already active
so the call is a no-op and no actual entry sequence is sent to the
terminal. Some terminals (and almost certainly Apple's Terminal +
iTerm2) preserve internal altscreen state from before the suspend
that needs to be flushed.

Force a full altscreen toggle: ExitAltScreen, then EnterAltScreen,
then ClearScreen, then a synthetic WindowSizeMsg, then re-Init the
active screen. tea.Sequence (not Batch) so they run in strict
order. The exit/enter pair sends both \033[?1049l and \033[?1049h,
which forces the terminal to discard the stale altscreen contents
and start fresh.
…s terminal

Validated this time. The earlier "post-exec corrupt graphics" reports
turned out to be a layout bug, not an altscreen issue:

  Bug:  the active screen (containers) sized its bubbles/table
        viewport off msg.Height (FULL terminal height) when it
        should have sized off the body region (terminal minus
        banner + status bar + palette line). Result: table.View()
        returned ~75 lines, BorderedBox wrapped that to 77 lines
        (lipgloss.Height does NOT truncate when content exceeds
        height), and the root model's View() returned 88 lines
        for an 80-row terminal. Bubbletea's renderer truncates
        the TOP rows to fit the actual terminal height — which
        is exactly what the user saw: only the bottom row of the
        banner visible, then a single container row, then black.

  Fix:  added Model.bodyRegionHeight() shared by View() and the
        WindowSizeMsg forwarding. The forwarding now sends the
        active screen and any open modal a WindowSizeMsg whose
        Height is the body region's actual size. Same applied to
        SplashDoneMsg (the initial WindowSizeMsg arrives during
        splash and never reaches the screen) and the Ctrl+E
        header_toggle handler (toggling the banner changes the
        body region height).

  Bonus: removed the now-unnecessary altscreen-toggle dance from
         shellExecDoneMsg. The earlier attempts (ClearScreen,
         tea.WindowSize(), ExitAltScreen+EnterAltScreen) were
         papering over the layout bug. With the screen now sized
         correctly, View() returns exactly m.height lines and
         bubbletea's RestoreTerminal handles the rest. We just
         re-Init the active screen so the polling tick rearms
         and a fresh fetch fires.

Validation:
  - TestViewFitsAfterScreenSized — feeds a full-terminal
    WindowSizeMsg (simulating SIGWINCH or the post-exec resize)
    and asserts View() returns exactly H lines. Verified failing
    without the fix with the exact numbers from the user's
    screenshot ("View() returned 88 lines for 120x80 terminal").
  - TestViewFitsInTerminal — table-driven across 5 terminal
    sizes (incl. the user's actual 120x80).
  - TestViewFitsAfterShellExec — covers the user's specific
    flow (shell exec returned, View must still fit).

I instrumented the running binary, traced the bug to specific line
counts, and confirmed the unit test fails on exactly the numbers
that match the screenshot before claiming the fix.
Validated this time with bubbletea instrumented to log every flush.

The previous "bodyRegionHeight" fix made View() return exactly
m.height lines, which is correct. But the user kept reporting the
post-exec glitch persists. Instrumenting bubbletea's standard
renderer revealed why:

  Post-exec, RestoreTerminal is supposed to repaint the altscreen,
  but in practice the renderer's lastRenderedLines diff cache
  survives the suspend/resume cycle. The next flush sees the new
  View buffer matches lastRenderedLines for most rows and writes
  only a handful of "different" lines — wrote=2 skipped=78,
  outBuf=853 bytes. The other 78 rows are left at whatever the
  altscreen had pre-exec, which is mostly stale/blank cells. The
  user's terminal then shows banner-bottom + a single container
  row + acres of empty space.

  After adding Sequence(ClearScreen, WindowSizeMsg, scr.Init)
  back to the shellExecDoneMsg handler, the bt-trace shows
  wrote=80 skipped=0 outBuf=32203 — the full 80-line View is
  written to the wire post-exec. ClearScreen issues \033[2J\033[H
  AND triggers renderer.repaint() which clears lastRender +
  lastRenderedLines. The synthetic WindowSizeMsg makes every
  screen and the open modal reflow against bodyRegionHeight (so
  output stays 80 lines, never overflowing). Init re-arms the
  polling tick consumed during the suspend.

  tea.Sequence enforces strict ordering — Batch's concurrent
  execution loses the race against the renderer ticker.
Diagnose-only commit. When C9S_TRACE=1 is set in the environment,
View() appends a line to /tmp/c9s-trace.log on every call:

  [View] m=120x80 body=69 showSplash=false stack=0 outLines=80 outChars=32114

The user is reporting the post-exec render glitch persists despite
the bodyRegionHeight + Sequence(ClearScreen, WindowSizeMsg, Init)
fix. My local script(1) capture shows the renderer's flush() correctly
writing 80 lines (32k bytes) post-exec, but the user's terminal
stream shows only 3 lines per flush. Need to know what View() is
actually returning at the moment the flush picks up r.buf — if the
trace shows outLines=3 around shellExecDoneMsg processing, the bug
is in the model layer; if outLines stays at 80, the bug is in the
renderer (likely a write/flush race during RestoreTerminal).

Usage:
  C9S_TRACE=1 script -q /tmp/c9s-session.log ./bin/c9s
  # do the s/b/exit flow
  # quit with :q
  # share /tmp/c9s-trace.log AND /tmp/c9s-session.log
Reverts a brute-force m.width=0 hack that broke
TestViewFitsAfterShellExec (View() returned "" because m.width was
0 at the moment the test inspected View, before the Sequence's
WindowSizeMsg restored it).

Keeps the Sequence(ClearScreen, WindowSizeMsg, scr.Init) approach
which IS the correct strategy. Updated comment to document what
we know from the user's instrumented trace:

  - The model layer is correct: View() always returns 53 lines
    (the user's terminal height) with m=171x53 body=42 outChars=25452.
  - The renderer's lastRendered diff cache survives the
    suspend/resume despite enterAltScreen calling repaint(), which
    leaves canSkip=true for most lines and only ~5KB of bytes get
    written to the terminal post-exec instead of the expected 32KB.
  - tea.ClearScreen Msg → renderer.clearScreen() → repaint() should
    fix this — and IS firing per user's stream (\033[2J at the
    correct byte offset) — but somehow the next flush still skips
    most lines.

Investigation continues; this revert at least keeps the unit tests
passing while we debug.
@torosent
Copy link
Copy Markdown
Owner Author

torosent commented May 5, 2026

Superseded by #7 — the post-exec corrupt render bug we couldn't fix here is fixed by upgrading to bubbletea v2 (which has a cell-based renderer that handles the suspend/resume cycle correctly). All the working fixes from this PR are included in #7. Closing in favor of #7.

@torosent
Copy link
Copy Markdown
Owner Author

torosent commented May 5, 2026

Superseded by #7, which landed all of the lifecycle-action refresh work + shell picker plus the bubbletea v2 upgrade and the EAGAIN-retry fix that finally resolves the post-exec render bug. All commits from this branch are functionally included in 4306994..06f7f3a on main.

@torosent torosent closed this May 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant