Draft
20 commits
7f282b6
[codex] Add toolchain guardrails and docs (#95)
cschubiner Apr 4, 2026
acbb70d
refresh codex auth status
cschubiner Apr 7, 2026
bec802c
Fix Codex auth refresh and errored-session recovery
cschubiner Apr 7, 2026
13a0f17
Clean up merged Codex provider refresh wiring
cschubiner Apr 7, 2026
b8fd4f0
test(codex): align replayed tests with upstream provider API
cschubiner Apr 16, 2026
1794582
Sync upstream + replay Codex refresh features (2026-04-16) (#112)
cschubiner Apr 16, 2026
deae3a1
feat(desktop): rebrand T3 Code → ClayCode in user-visible surfaces
cschubiner Apr 16, 2026
838f5f9
Merge pull request #113 from cschubiner/feature/claycode-rebrand
cschubiner Apr 16, 2026
4a9908c
feat(sidebar): add cmd+[/cmd+] shortcuts for browser-history navigation
cschubiner Apr 16, 2026
ac32a8c
feat: rebuild chat workflow feature set
cschubiner Apr 17, 2026
e90eb05
feat: rebuild composer skill picker
cschubiner Apr 17, 2026
d683362
feat: restore deep thread and project search dialogs
cschubiner Apr 17, 2026
06e99be
feat: import codex sessions into durable threads
cschubiner Apr 17, 2026
60ab8ff
feat: restore sidebar rename hotkey
cschubiner Apr 17, 2026
3bebdc7
fix: close rebuild qa gaps
cschubiner May 2, 2026
f2766e5
fix: show reconnect recovery toast reliably
cschubiner May 2, 2026
428b4cf
Fix queued follow-up dispatch gating
cschubiner May 2, 2026
0542032
Document queue steer switching QA
cschubiner May 2, 2026
8c44550
Document final queue steer QA
cschubiner May 2, 2026
5f14407
Merge upstream main into rebuild rollout
cschubiner May 2, 2026
126 changes: 126 additions & 0 deletions .codex/artifacts/qa/codex-auth-refresh-and-send.md
@@ -0,0 +1,126 @@
# Codex Auth Refresh And Send QA

Date: 2026-04-07
Repo: `/Users/canal/.codex/worktrees/1cce/t3code`
Desktop state: `~/.t3/userdata`
Build under test: local desktop rebuild from this worktree

## Scope

Verify the reported auth-refresh/send failure:

1. The visible provider refresh control should complete instead of spinning forever.
2. A resend on a previously errored Codex thread should recover instead of inheriting the stale `Quota exceeded` session state.
3. A brand-new Codex thread should still send and receive a response normally.

## Inventory

- Visible settings refresh on the running desktop app
- Existing errored thread recovery via the live orchestration websocket
- Fresh thread send via the live orchestration websocket

## Environment

- Desktop app launched with `bun run start:desktop:main-state`
- Live ports from the patched launch:
  - `http://127.0.0.1:61902`
  - `http://127.0.0.1:61903`
- Codex provider reported `Authenticated · ChatGPT Pro Subscription`

## Results

### 1. Visible refresh control

Status: Passed

Steps:

1. Opened `http://127.0.0.1:61903/settings/general`
2. Clicked `Refresh provider status`
3. Waited for the checked timestamp to update

Evidence:

- Page text updated to `Checked just now`
- Provider row still showed `Codex v0.118.0`
- Auth label still showed `Authenticated · ChatGPT Pro Subscription`
- Refresh button was re-enabled after completion

### 2. Existing errored thread recovery

Status: Passed

Target thread:

- Thread id: `a7172a15-7002-4de4-8942-221f1ce58f9c`
- Existing title: `test123`
- Previous persisted state before retry:
  - session status: `error`
  - last error: `Quota exceeded. Check your plan and billing details.`

Action:

- Dispatched a new `thread.turn.start` against that same thread with:
  - user text: `Reply with exactly OK. QA_AUTH_RETRY_1775590365845`

Observed outcome:

- Provider log showed a fresh session restart:
  - `session/connecting`
  - `session/threadOpenRequested` with `Attempting to resume thread 019d6909-55e7-7e00-adb6-ef516eea229c.`
  - `session/threadOpenResolved`
- Persisted session row moved to:
  - status: `ready`
  - last_error: `NULL`
  - updated_at: `2026-04-07T19:32:56.517Z`
- Persisted messages now include:
  - user: `Reply with exactly OK. QA_AUTH_RETRY_1775590365845`
  - assistant: `OK`

Interpretation:

- This is the exact stale-session recovery path we needed.
- The old errored thread no longer stays poisoned after a resend.
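
The recovery behavior observed above can be pictured with a minimal sketch. This is not the code under test: `SessionRow`, `TurnPlan`, and `planTurnStart` are hypothetical names, and the only assumption carried over from the QA evidence is that a resend on an `error` session triggers a fresh session start and clears the stale error instead of reusing it.

```typescript
// Hypothetical session row shape, loosely mirroring the persisted fields named above.
type SessionStatus = "connecting" | "ready" | "error";

interface SessionRow {
  status: SessionStatus;
  lastError: string | null;
}

interface TurnPlan {
  row: SessionRow; // the row to persist before the turn is dispatched
  restartSession: boolean; // whether the provider session must be reopened first
}

// Decide how to start a turn without inheriting a stale error state.
function planTurnStart(row: SessionRow): TurnPlan {
  if (row.status === "error") {
    // Assumption: an errored session is restarted and its last error is cleared.
    return { row: { status: "connecting", lastError: null }, restartSession: true };
  }
  // Healthy sessions are left untouched.
  return { row, restartSession: false };
}
```

Under this sketch, the `Quota exceeded` row from the target thread would be planned as a restart with `lastError` reset to `null`, which is consistent with the `session/connecting` entry in the provider log.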

### 3. Fresh thread send

Status: Passed

Target thread:

- Project id: `2dd348e7-e576-411c-98ba-dd161318420a` (`t3code`)
- New thread id: `c244983d-5fb8-44dc-9941-0f63d57429d4`
- Title: `QA_AUTH_FRESH_1775590432926`

Action:

1. Created a new thread in `t3code`
2. Sent:
   - `Reply with exactly OK. QA_AUTH_FRESH_1775590432926`

Observed outcome:

- Provider log showed a fresh Codex thread start
- Persisted session row ended in:
  - status: `ready`
  - last_error: `NULL`
  - updated_at: `2026-04-07T19:34:02.480Z`
- Persisted messages now include:
  - user: `Reply with exactly OK. QA_AUTH_FRESH_1775590432926`
  - assistant: `OK`

## Ship Readiness

Passed:

- Visible provider refresh control
- Existing errored-thread resend recovery
- Fresh thread send/response

Not directly verified:

- The exact same click path inside the Electron sidebar/composer chrome, because this desktop shell was still rendering the empty `No projects yet` sidebar state in automation even while the underlying state DB and websocket server were healthy.

Residual risk:

- There may still be a separate Electron/sidebar state hydration issue, but the auth refresh path and the Codex send/recovery path both behaved correctly against the patched backend/runtime.
53 changes: 53 additions & 0 deletions .codex/artifacts/qa/codex-import-durable-thread.md
@@ -0,0 +1,53 @@
# Codex Import Durable Thread QA

Date: 2026-04-17
Branch: `codex/rebuild-feature-rollout`

## Environment

- Branch-backed local dev server in Chrome via the Computer Use plugin
- Target project: `t3-qa-codex-import-project`
- Full verification gate rerun on this checkpoint:
  - `bun fmt`
  - `bun lint`
  - `bun typecheck`
  - `bun run test`
  - `bun run build`
  - `bun run build:desktop`

## Targeted automated coverage

- `cd apps/web && bun x vitest run --config vitest.browser.config.ts src/components/ChatView.browser.tsx -t "imports a Codex transcript into a durable thread from the global shortcut"`
- `cd apps/server && bun x vitest run src/codexImport/Layers/CodexImport.test.ts`

## Manual QA

### Scenario 1: Import a local Codex session into a durable ClayCode thread

1. Opened the branch-backed app in Chrome on a fresh project-backed thread context.
2. Triggered `Cmd+Shift+I`.
3. Verified the `Import from Codex` dialog opened and loaded live local Codex sessions.
4. Selected a real local Codex session.
5. Chose the target project `t3-qa-codex-import-project`.
6. Confirmed the import.
7. Verified the app navigated to a durable thread route (`/$environmentId/$threadId`) instead of a `/draft/...` route.
8. Verified the imported transcript content rendered in the message timeline.
9. Verified the thread activity showed the import provenance for the Codex session.

Result: pass

### Scenario 2: Reopen the import dialog after import

1. Reopened `Import from Codex` on the imported thread.
2. Verified the imported session row showed the `Imported` pill in the list.
3. Verified the preview pane showed `Import status: Already imported`.
4. Verified the primary action label changed to `Open imported thread`.
5. Confirmed the action and verified the app stayed on the already-imported durable thread rather than creating a duplicate thread.

Result: pass

## Observations

- The rebuild now matches the intended durable-history model: importing creates a real local thread instead of just prefilling a draft.
- Repeat imports are idempotent at the UX level: the UI clearly marks the existing imported thread and reopens it instead of duplicating content.
- The earlier stale React Query state bug is fixed. After importing, the list row, preview pane, and action button all update consistently without requiring a full dialog refresh.
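
The repeat-import behavior noted above can be illustrated with a small sketch. It assumes imports are keyed by the Codex session id; `ImportRegistry`, `ImportResult`, and the `thread-N` id format are hypothetical and do not describe the actual `CodexImport` layer.

```typescript
interface ImportResult {
  threadId: string;
  created: boolean; // false when an earlier import is being reopened
}

// Hypothetical in-memory registry mapping Codex session ids to durable thread ids.
class ImportRegistry {
  private readonly threadBySession = new Map<string, string>();
  private nextThreadNumber = 1;

  importSession(sessionId: string): ImportResult {
    const existing = this.threadBySession.get(sessionId);
    if (existing !== undefined) {
      // Already imported: reopen the existing thread instead of duplicating it.
      return { threadId: existing, created: false };
    }
    const threadId = `thread-${this.nextThreadNumber++}`;
    this.threadBySession.set(sessionId, threadId);
    return { threadId, created: true };
  }
}
```

A UI built on a lookup like this could switch the primary action between an import label and `Open imported thread` based on `created`, which matches what Scenario 2 observed.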
56 changes: 56 additions & 0 deletions .codex/artifacts/qa/deep-search-and-project-search.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,56 @@
# Deep Search + Project Search QA

Date: 2026-04-17
Branch: `codex/rebuild-feature-rollout`

## Terminal gate

- `bun fmt`
- `bun lint`
- `export PATH="$HOME/.nvm/versions/node/v24.13.1/bin:$PATH" && bun typecheck`
- `export PATH="$HOME/.nvm/versions/node/v24.13.1/bin:$PATH" && bun run test`
- `export PATH="$HOME/.nvm/versions/node/v24.13.1/bin:$PATH" && bun run build`
- `export PATH="$HOME/.nvm/versions/node/v24.13.1/bin:$PATH" && bun run build:desktop`

Result:

- passed
- `bun lint` still reports the same 8 pre-existing warnings and 0 errors

## Targeted automated coverage

- `cd apps/web && bun run test -- keybindings.test.ts globalThreadSearch.test.ts projectFolderSearch.test.ts quickThreadSearch.test.ts`
- `cd apps/web && bun run test:browser -- src/components/GlobalThreadSearchDialog.browser.tsx src/components/ProjectFolderSearchDialog.browser.tsx src/components/QuickThreadSearchDialog.browser.tsx`
- `cd apps/web && bun run test:browser -- src/components/GlobalThreadSearchDialog.browser.tsx src/components/ProjectFolderSearchDialog.browser.tsx src/components/QuickThreadSearchDialog.browser.tsx src/components/ChatView.browser.tsx -t "global shortcut"`

Result:

- passed

## Computer Use QA

Environment:

- Chrome against local dev server at `http://localhost:5737`

Checks:

1. Opened the command palette from the live app.
2. Confirmed `Search all threads` was present as a top-level action.
3. Opened `Search All Threads`.
4. Searched for `new` and verified the current `New thread` result appeared with title highlighting.
5. Closed the dialog and reopened the command palette.
6. Confirmed `Search project folders` was present as a top-level action.
7. Opened `Search Project Folders`.
8. Verified the live sidebar project appeared as a selectable result.
9. Selected the result and confirmed the app navigated to a fresh draft-thread route (`/draft/...`).

Result:

- passed for both dialog flows and the project-selection navigation path

Notes:

- Direct shortcut injection through Computer Use remained browser-sensitive, so the authoritative live verification came from opening the dialogs through the app surface rather than relying only on raw modifier chords.
- The live command palette showed older shortcut hints for `threads.searchAll` / `projects.search`. That was not a checked-in code regression: this machine has saved overrides in `~/.t3/userdata/keybindings.json` and `~/.t3/dev/keybindings.json` mapping those commands to older shortcuts. The checked-in defaults, docs, and tests now expect `Cmd/Ctrl+Alt+F` and `Cmd/Ctrl+Alt+P`.
- Queue-row `Alt+Up/Down` parity was already covered in `apps/web/src/components/QueuedFollowUpsPanel.browser.tsx`; a fresh manual live pass for that interaction was blocked here because the local Codex provider on this dev server timed out before a runnable queued-follow-up state could be created.
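
The override note above comes down to precedence: a saved per-machine binding wins over the checked-in default. A minimal sketch of that lookup follows. The `mod+alt+f` notation, the `Keymap` shape, and the example override value are assumptions for illustration; the real `keybindings.json` format is not shown in this artifact.

```typescript
// Hypothetical keymap shape: command id -> shortcut string.
type Keymap = Readonly<Record<string, string | undefined>>;

interface ResolvedBinding {
  shortcut: string;
  source: "default" | "override";
}

// Assumed notation for the checked-in defaults named in the notes above.
const DEFAULT_KEYMAP: Keymap = {
  "threads.searchAll": "mod+alt+f",
  "projects.search": "mod+alt+p",
};

// Saved overrides take precedence; defaults apply only when no override exists.
function resolveBinding(command: string, defaults: Keymap, overrides: Keymap): ResolvedBinding | null {
  const override = overrides[command];
  if (override !== undefined) return { shortcut: override, source: "override" };
  const fallback = defaults[command];
  return fallback !== undefined ? { shortcut: fallback, source: "default" } : null;
}
```

Reporting the `source` alongside the shortcut is one way to tell a stale local override apart from a regression in the checked-in defaults.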
65 changes: 65 additions & 0 deletions .codex/artifacts/qa/electron-desktop-rebuild.md
@@ -0,0 +1,65 @@
# Electron Desktop QA

Date: 2026-04-17
Branch: `codex/rebuild-feature-rollout`
App: `ClayCode (Alpha)` built desktop bundle
State dir: `/tmp/t3-electron-qa-state`

## Build + launch

- Passed `bun run build:desktop`
- Passed `bun run test:desktop-smoke`
- Launched the built Electron app via `apps/desktop` `bun run start`

## QA inventory

- Desktop app launches from the built Electron bundle
- Project onboarding works in Electron
- Draft-thread creation works in Electron
- Snippet picker opens and inserts a snippet into the composer
- Quick thread search opens from the keyboard shortcut and navigates to a thread
- Desktop Connections settings render the Tailnet access row and network-access confirmation dialog
- Identify any Electron-specific blockers for provider-backed turn flows

## Passed

- Added the repo as a project from the desktop `Add project` flow using `/Users/canal/.codex/worktrees/28c4/t3code`
- Verified the app created a fresh draft thread route on project open
- Switched the provider selector from unavailable Codex to `Claude Sonnet 4.6`
- Opened the snippet picker from the composer and confirmed built-in snippets render in the dialog
- Filtered/selected the built-in `Write Tests` snippet and verified it inserted into the composer
- Opened Quick Thread Search with `Cmd+Shift+F`
- Verified the Quick Thread Search dialog rendered with the expected search/help copy
- Searched for the existing thread and confirmed navigation back into that thread route
- Created another new draft thread from the sidebar and verified the route changed to a new `/draft/<id>`
- Opened `Settings -> Connections`
- Verified the `Tailnet access` row rendered with the live Tailnet hostname and IP
- Toggled `Network access` and verified the confirmation dialog opened with `Cancel` and `Restart and enable`
- Cancelled the dialog successfully and confirmed the settings page returned to its prior state

## Hotkeys

- `Cmd+Shift+F`: passed. Opened Quick Thread Search and navigated back into the existing thread route from a fresh draft thread.
- `Cmd+Shift+S`: passed. Opened the snippet picker in Electron and exposed the built-in snippet list; I also verified snippet insertion into the composer in the same desktop run.
- `Cmd+[` / `Cmd+]`: attempted against real draft/thread route history, but no route change was observed in the later Electron session. Because Computer Use key injection became inconsistent for command shortcuts in that session, I am treating this result as inconclusive rather than claiming a definite product regression.
- `Tab` while a turn is running: passed. I typed `Then list the exact Queue + Steer functions and where each one is called.` during an active `GPT-5.4` run, pressed `Tab`, and verified the queued panel appeared with `1 queued follow-up`, `Ready to dispatch in order.`, and the expected queued row actions.
- `Enter` while a turn is running: passed. I then typed `Also explain how Steer with Enter differs from Queue with Tab.` during the same live run, pressed `Enter`, and verified it posted immediately as a new live user turn instead of going back into the queue.
- `Shift+Tab` composer mode toggle: not separately re-verified in Electron once shortcut delivery became inconsistent; this still needs a clean follow-up pass if we want explicit desktop-only evidence for every shortcut.
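
The `Tab` and `Enter` results above suggest a simple routing rule for composer keys while a turn is running. The sketch below is an assumption-level illustration only: `routeComposerKey` and `ComposerAction` are hypothetical, and the behavior of `Tab` when no turn is running was not exercised in this pass, so the sketch leaves it unhandled.

```typescript
type ComposerAction = "queue" | "steer" | "send" | "unhandled";

// Route a composer key press based on whether a turn is currently running.
function routeComposerKey(key: string, turnRunning: boolean): ComposerAction {
  if (turnRunning) {
    if (key === "Tab") return "queue"; // held until the active turn settles
    if (key === "Enter") return "steer"; // posted immediately as a live user turn
    return "unhandled";
  }
  // Assumption: with no active turn, Enter is an ordinary send.
  return key === "Enter" ? "send" : "unhandled";
}
```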

## Blocked / not fully verified

- Codex provider-backed turn execution is blocked in this Electron run because the desktop app reports ``Codex CLI is not authenticated. Run `codex login` and try again.``
- Claude-backed turn execution is also blocked for full happy-path validation because the attempted message failed immediately with `Credit balance is too low`
- I was able to re-verify live-response behavior in Electron with `GPT-5.4`, including:
  - successful response rendering from a live provider-backed turn
  - queueing a follow-up during an active response with `Tab`
  - steering immediately during an active response with `Enter`
  - queued follow-up auto-dispatch after the active turn settled
- I did not re-verify queued follow-up save-from-queue behavior in this Electron pass.

## Notes

- The Electron shell, routing, project onboarding, settings surfaces, snippet picker, quick thread search, and draft-thread navigation all behaved correctly in the built desktop app
- Live provider-backed turns are workable in Electron through `GPT-5.4`, even though the Codex and Claude providers remained blocked in this environment for separate auth/quota reasons
- In the successful queue/steer pass, the queued panel disappeared after dispatch and both follow-up user turns were visible in-thread while the assistant continued working, which is the expected user-facing behavior
- Command-shortcut delivery through Computer Use became inconsistent later in the session, so the sidebar-history hotkeys need one cleaner follow-up check before I would mark them definitively passed or failed in Electron
55 changes: 55 additions & 0 deletions .codex/artifacts/qa/final-electron-qa-2026-05-02.md
@@ -0,0 +1,55 @@
# Final Electron QA - 2026-05-02

Branch: `codex/rebuild-feature-rollout`

Workspace: `/Users/canal/.codex/worktrees/28c4/t3code`

Electron QA state:

- App launched from branch with `T3CODE_HOME=/tmp/t3code-electron-qa-home-final-2`.
- QA project: `/tmp/qa-project-claycode-replay`.
- App title and sidebar masthead rendered as `ClayCode (Alpha)` / `ClayCode ALPHA`.
- Electron rebuild completed before the final manual pass, so the visible app used current branch assets.

Manual QA performed with Computer Use:

- Added `/tmp/qa-project-claycode-replay` from the desktop Add project flow in a clean profile.
- Sent a live GPT-5.4 thread seed and verified a rendered model response.
- Verified `cmd+k`, `cmd+shift+s`, `cmd+shift+k`, `cmd+shift+f`, `cmd+alt+f`, and `cmd+alt+p` in Electron after the physical-key fallback fix for Option-modified macOS keys.
- Verified sidebar traversal hotkeys in Electron: `cmd+shift+]`, `cmd+shift+[`, `alt+Down`, `alt+Up`, `alt+shift+Down`, and `alt+shift+Up`.
- Verified Settings rebrand text and Tailscale/network-access confirmation copy in Electron.
- Verified Queue + Steer in Electron with a controlled `sleep 60` run:
  - A running turn showed the stop button and active working state.
  - The composer displayed `Steer` and `Queue` actions while the turn was active.
  - Clicking `Queue` created the `1 queued follow-up` panel with `Alt+Up/Down` and `Alt+Shift+Up/Down` row guidance.
  - The queued follow-up auto-dispatched after the running turn settled and rendered `queued follow-up processed`.
  - Clicking `Steer` inserted a live steer message during the active turn; the model acknowledged it before the queued follow-up ran.
- Verified controlled command execution path during Queue + Steer: model ran `sleep 20` and `sleep 60`, then rendered the requested completions.
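
The ordering observed in the Queue + Steer pass (steer lands during the active turn, the queued follow-up runs only after the turn settles) can be modeled with a short sketch. `FollowUpQueue` and its method names are hypothetical and are not taken from the app's queue implementation.

```typescript
// Hypothetical model of follow-up ordering while a turn is running.
class FollowUpQueue {
  private readonly pending: string[] = [];
  private running = false;
  readonly delivered: string[] = []; // messages in the order the model receives them

  get pendingCount(): number {
    return this.pending.length;
  }

  startTurn(text: string): void {
    this.running = true;
    this.delivered.push(text);
  }

  // Queue: hold the message until the active turn settles.
  queue(text: string): void {
    if (this.running) this.pending.push(text);
    else this.startTurn(text);
  }

  // Steer: deliver immediately, even while a turn is active.
  steer(text: string): void {
    this.delivered.push(text);
  }

  // Called when the active turn settles; dispatches the next queued message in order.
  settle(): void {
    this.running = false;
    const next = this.pending.shift();
    if (next !== undefined) this.startTurn(next);
  }
}
```

With a seed turn running, calling `queue("A")` and then `steer("B")` delivers `B` first and `A` only after `settle()`, which is the sequence this pass verified.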

Issues found and fixed during this QA pass:

- Sidebar masthead still showed legacy `T3 Code`; fixed by rendering `APP_BASE_NAME` in `Sidebar`.
- macOS Option-modified direct hotkeys such as `cmd+alt+f` and `cmd+alt+p` did not resolve reliably; fixed by adding `event.code` letter aliases in keybinding resolution.
- Disconnect/reconnect toast copy still hardcoded `T3 Server`; fixed to use `ClayCode Server` via `APP_BASE_NAME`.
- Electron launcher attempted the renamed launcher path during local start; fixed by only using the renamed launcher when `T3CODE_USE_RENAMED_ELECTRON_LAUNCHER=1`.
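
The Option-key fix listed above relies on a general browser behavior: on macOS, holding Option changes `event.key` to the character the layout produces (for example `ƒ` for Option+F on a US layout), while `event.code` keeps reporting the physical key (`KeyF`). A minimal sketch of a physical-key fallback follows. `KeyEventLike`, `letterFromCode`, and `normalizeChord` are hypothetical names, not the repository's keybinding resolver.

```typescript
// Hypothetical shape: only the fields this sketch reads from a keyboard event.
interface KeyEventLike {
  key: string; // layout-dependent character, e.g. "ƒ" for Option+F on a US macOS layout
  code: string; // physical key position, e.g. "KeyF"
  metaKey: boolean;
  ctrlKey: boolean;
  altKey: boolean;
  shiftKey: boolean;
}

// Map a physical letter code such as "KeyF" to "f"; return null for anything else.
function letterFromCode(code: string): string | null {
  const match = /^Key([A-Z])$/.exec(code);
  return match?.[1]?.toLowerCase() ?? null;
}

// Build a normalized chord string such as "cmd+alt+f".
function normalizeChord(event: KeyEventLike): string {
  const parts: string[] = [];
  if (event.metaKey) parts.push("cmd");
  if (event.ctrlKey) parts.push("ctrl");
  if (event.altKey) parts.push("alt");
  if (event.shiftKey) parts.push("shift");
  // With Option held, event.key may be rewritten, so prefer the physical letter.
  const physicalLetter = event.altKey ? letterFromCode(event.code) : null;
  parts.push(physicalLetter ?? event.key.toLowerCase());
  return parts.join("+");
}
```

Limiting the fallback to letter codes keeps non-letter keys and non-Option chords on the existing `event.key` path.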

Additional gap QA performed after the final Electron session:

- Verified GitHub PR pill rendering in Electron with a local project on branch `codex/fake-pr-qa`, a fake GitHub remote, a local upstream tracking ref, and a temporary fake `gh` shim that returned open PR `#4242`. The header action changed to `View PR`, and the sidebar thread row exposed `#4242 PR open: QA fake PR pill`.
- Verified disconnect state in Electron by repeatedly killing the embedded server process. The visible toast and accessibility tree both showed `Disconnected from ClayCode Server`, retry countdown text, and a `Retry now` button.
- Found and fixed a reconnect-success edge case: the success toast could be skipped if WebSocket recovery passed through an intermediate `connecting` state before `connected`. The recovered toast now keys off a remembered prior disconnect timestamp instead of only the immediate previous UI state.
- Rebuilt and relaunched Electron from the patched branch, repeated the server-kill recovery test, and visually verified `Reconnected to ClayCode Server` with the disconnect/reconnect timestamps.
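
The reconnect edge case described above is easier to see in a small state sketch. It assumes three connection states and remembers the disconnect timestamp until a `connected` state arrives, so an intermediate `connecting` state cannot hide the recovery. `RecoveryTracker`, `Toast`, and `onConnectionState` are hypothetical names, not the app's implementation.

```typescript
type ConnectionState = "connected" | "connecting" | "disconnected";

interface RecoveryTracker {
  lastDisconnectedAt: number | null; // remembered until a reconnect is reported
}

interface Toast {
  kind: "disconnected" | "reconnected";
  disconnectedAt: number;
  at: number;
}

// Return the toast to show for a state change, or null when none is needed.
function onConnectionState(tracker: RecoveryTracker, next: ConnectionState, now: number): Toast | null {
  if (next === "disconnected") {
    // Keep the earliest disconnect time across repeated failures.
    if (tracker.lastDisconnectedAt === null) tracker.lastDisconnectedAt = now;
    return { kind: "disconnected", disconnectedAt: tracker.lastDisconnectedAt, at: now };
  }
  if (next === "connected" && tracker.lastDisconnectedAt !== null) {
    const disconnectedAt = tracker.lastDisconnectedAt;
    tracker.lastDisconnectedAt = null;
    return { kind: "reconnected", disconnectedAt, at: now };
  }
  // "connecting" neither shows a toast nor clears the remembered disconnect.
  return null;
}
```

A check that only compared the new state with the immediately previous one would see `connecting -> connected` and skip the success toast, which is the gap this pass found.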

Residual risks / not fully re-exercised manually:

- GitHub PR pills were verified with a controlled local `gh` shim rather than a live GitHub API response, so the UI path is covered but live credential/network behavior remains dependent on the real `gh` environment.

Automated gates to run after this artifact:

- `bun fmt`
- `bun lint`
- `bun typecheck`
- `bun run test`
- `bun run build`
- `bun run build:desktop`
- `bun run test:desktop-smoke`