Add MCP support to command mode #290
Conversation
The Ollama docs at github.com/ollama/ollama/blob/main/docs/openai.md now return a 404 — the OpenAI-compatibility doc was moved to docs/api/openai-compatibility.mdx and is published at https://docs.ollama.com/api/openai-compatibility. Point the Setup Guide link there so users can actually read the docs.
Sub-1s recordings were silently returning an empty string, so short utterances like 'yes', 'no', or 'stop' never made it into the transcript and the user had no indication anything went wrong. The original guard existed to avoid a whisper.cpp assertion on buffers shorter than 1s, but it applied unconditionally — including to Parakeet, Apple Speech, and Cohere which have no such constraint. Rather than branch per provider, pad the buffer with trailing silence on the way in: whisper.cpp no longer asserts, every other provider just sees a moment of silence after the speech. An empty buffer (no audio at all) is still an early-return, since padding zero samples wouldn't help. Closes #276
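A minimal sketch of the padding idea, assuming 16 kHz mono Float samples (the buffer type, sample rate, and function name are illustrative, not the PR's actual code):

```swift
/// Pads a mono sample buffer with trailing silence so it is at least `minSeconds` long.
/// An empty buffer is returned unchanged so callers can keep their existing
/// early-return for "no audio at all".
func padWithTrailingSilence(_ samples: [Float],
                            sampleRate: Int = 16_000,
                            minSeconds: Double = 1.0) -> [Float] {
    guard !samples.isEmpty else { return samples }
    let minSampleCount = Int(Double(sampleRate) * minSeconds)
    guard samples.count < minSampleCount else { return samples }
    return samples + [Float](repeating: 0, count: minSampleCount - samples.count)
}
```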
Claude Opus 4.7 uses extended thinking by default and rejects the `temperature` parameter. Sending it causes every request to fail with HTTP 400: `temperature is deprecated for this model`. Add `SettingsStore.isTemperatureUnsupported(_:)` that covers both reasoning models (o1/o3/gpt-5/...) and Claude Opus 4.7+, and use it at each live call site (ContentView, CommandModeService, RewriteModeService) to gate the temperature parameter. `isReasoningModel` still gates `max_completion_tokens` / reasoning token budgets, which remain OpenAI-specific. Fixes #285
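A rough sketch of what such a predicate could look like; the model-name matching below is an assumption for illustration, not the shipped check:

```swift
// Illustrative only — the real prefixes/substrings may differ.
func isReasoningModel(_ model: String) -> Bool {
    let name = model.lowercased()
    return name.hasPrefix("o1") || name.hasPrefix("o3") || name.hasPrefix("gpt-5")
}

func isTemperatureUnsupported(_ model: String) -> Bool {
    let name = model.lowercased()
    // Reasoning models and Claude Opus 4.7+ both reject the `temperature` parameter.
    return isReasoningModel(name)
        || name.contains("claude-opus-4.7")
        || name.contains("claude-opus-4-7")
}

// At a call site, the parameter is only attached when supported:
// if !isTemperatureUnsupported(config.model) { body["temperature"] = config.temperature }
```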
Adds an abstraction that chooses the provider strategy based on the model provider: for Anthropic we use their Messages API; for other providers we start with the Responses API and fall back to the Chat Completions API.
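A hedged sketch of that routing decision; `RoutePlan` and `fallbackFormat` echo names that appear later in the diff, while the enum and function below are invented for illustration:

```swift
enum RequestFormat {
    case anthropicMessages
    case openAIResponses
    case chatCompletions
}

struct RoutePlan {
    let primaryFormat: RequestFormat
    let fallbackFormat: RequestFormat?
}

/// Anthropic always talks to its Messages API; every other provider tries the
/// Responses API first and keeps Chat Completions as the fallback.
func makeRoutePlan(provider: String) -> RoutePlan {
    if provider.lowercased() == "anthropic" {
        return RoutePlan(primaryFormat: .anthropicMessages, fallbackFormat: nil)
    }
    return RoutePlan(primaryFormat: .openAIResponses, fallbackFormat: .chatCompletions)
}
```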
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d8de00483b
| "type": "function", | ||
| "function": [ | ||
| "name": "execute_terminal_command", | ||
| "description": """ | ||
| "name": "execute_terminal_command", | ||
| "description": """ |
Keep chat tool definitions in function wrapper
This tool schema was flattened to top-level name/description/parameters, but the chat-completions path still sends config.tools as-is, and chat-completions expects each tool under {"type":"function","function":{...}}. When /responses is unavailable and the client falls back to /chat/completions, command-mode tool calls can be rejected or ignored, so terminal/MCP execution stops working on chat-only OpenAI-compatible endpoints.
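A possible shape for that conversion, sketched with assumed dictionary keys:

```swift
/// Re-wraps flattened Responses-style tool definitions into the
/// {"type": "function", "function": {...}} shape chat-completions expects.
/// Sketch only — the real tool dictionaries may carry additional keys.
func wrapToolsForChatCompletions(_ tools: [[String: Any]]) -> [[String: Any]] {
    tools.map { tool -> [String: Any] in
        // Already chat-completions shaped? Leave it alone.
        if tool["function"] != nil { return tool }
        var function: [String: Any] = [:]
        for key in ["name", "description", "parameters"] {
            if let value = tool[key] { function[key] = value }
        }
        return ["type": "function", "function": function]
    }
}
```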
} else if endpoint.contains("/responses") || endpoint.contains("/chat/completions") || endpoint.contains("/api/chat") || endpoint.contains("/api/generate") {
    fullURL = endpoint
} else {
    fullURL = endpoint + "/chat/completions"
    fullURL = endpoint + "/responses"
}
Retry API verification on chat-completions endpoint
The connection test now defaults non-Anthropic providers to .../responses and treats failures as hard errors, but there is no fallback to .../chat/completions. For providers that are OpenAI-compatible but chat-only, verification will report failure even though runtime requests can still work via chat-completions, creating false negatives in settings and blocking a reliable “verify connection” flow.
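One way the verification flow could retry, sketched with an assumed `probe` helper standing in for the real test request:

```swift
/// Verifies an OpenAI-compatible base URL, retrying on /chat/completions when
/// /responses fails, so chat-only providers don't report false negatives.
func verifyConnection(baseURL: String,
                      probe: (String) async throws -> Void) async -> Bool {
    do {
        try await probe(baseURL + "/responses")
        return true
    } catch {
        // Chat-only providers: fall back before reporting a failure.
        return (try? await probe(baseURL + "/chat/completions")) != nil
    }
}
```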
💡 Codex Review
Reviewed commit: bf353d4ff6
where !attemptedResponsesFallback &&
    self.isResponsesRequest(currentRequest) &&
    routePlan.fallbackFormat != nil &&
Allow fallback for responses payloads on chat-style endpoints
executeWithRetry only triggers the Responses→chat-completions fallback when isResponsesRequest(currentRequest) is true, but this predicate is path-based. In the same change, ResponsesRouteStrategy.endpoint intentionally keeps /api/chat and /api/generate unchanged, so those requests are sent with Responses-format bodies but are not recognized as “responses requests” here. On chat-only OpenAI-compatible providers, a 400/404 from that first attempt will now fail the turn instead of retrying with chat-completions, breaking the intended compatibility flow.
while usedToolNames.contains(candidate) {
    candidate = Self.sanitizeToolName("\(base)_\(counter)")
    counter += 1
Avoid non-terminating dedupe loop for long MCP tool names
This deduplication loop can become non-terminating when the sanitized base name is already 64 characters. sanitizeToolName truncates to 64 chars, so "\(base)_2", "\(base)_3", etc. collapse back to the same truncated string, candidate never changes, and while usedToolNames.contains(candidate) never exits. A server set with two colliding long tool names would hang MCP tool catalog rebuild/reload.
One blocker before merge: the chat-completions fallback still receives Responses-style tool definitions. The fix should be small: add a conversion that re-wraps each tool under {"type":"function","function":{...}} before the fallback path sends config.tools.
One more concrete blocker I found in MCP tool registration:
Scenario: two MCP tools with names that differ only after the first 64 sanitized characters, or long server/tool combinations with the same prefix. That would hang MCP bootstrap/tool catalog rebuild. Fix should be small: when adding a collision suffix, reserve the suffix length before truncating the base, e.g. truncate the base to 64 minus the suffix length before appending it.
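A sketch of that suffix-aware truncation under the same 64-character cap (the helper name is assumed):

```swift
/// Deduplicates a sanitized tool name without letting the length cap collapse
/// every candidate back to the same truncated string. Sketch only.
func dedupedToolName(_ base: String, used: Set<String>, maxLength: Int = 64) -> String {
    var candidate = String(base.prefix(maxLength))
    var counter = 2
    while used.contains(candidate) {
        let suffix = "_\(counter)"
        // Reserve room for the suffix *before* truncating, so successive
        // candidates actually differ and the loop terminates.
        candidate = String(base.prefix(maxLength - suffix.count)) + suffix
        counter += 1
    }
    return candidate
}
```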
💡 Codex Review
Reviewed commit: 7d9c8f5711
} else {
    // Auto-execute MCP tool
    await self.executeMCPTool(name: tc.name, arguments: tc.arguments, callId: tc.id)
Apply approval gating to MCP tool execution
When approval mode is enabled, only terminal commands go through the confirmation branch; MCP calls in the else path are always executed immediately. In a session where an MCP server exposes state-changing actions, the model can invoke them without any user confirmation, which bypasses the expected manual-approval safety behavior.
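A hedged sketch of routing MCP calls through the same gate; every name below is an assumption except the `executeMCPTool` parameter shape, which mirrors the quoted diff:

```swift
/// Gates MCP tool execution behind the same approval flow terminal commands use.
func runMCPToolCall(name: String,
                    arguments: String,
                    callId: String,
                    approvalModeEnabled: Bool,
                    requestApproval: (String) async -> Bool,
                    executeMCPTool: (String, String, String) async -> Void) async {
    if approvalModeEnabled {
        // Surface the call to the user before anything state-changing runs.
        guard await requestApproval(name) else { return }
    }
    await executeMCPTool(name, arguments, callId)
}
```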
self.currentStep = stepType == .checking ? .checking(tc.command) : .executing(tc.command)
switch response.turnKind {
case .toolCallOnly, .toolCallWithText:
    let tc = response.toolCalls[0]
Handle all tool calls emitted in a single assistant turn
This path only executes response.toolCalls[0] and drops any additional tool calls returned by the model. For Responses/Anthropic turns that include multiple function calls, only one result is produced and sent back, leaving unresolved calls and causing incorrect follow-up reasoning or incomplete workflows.
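A minimal sketch of handling the whole batch instead of only the first element (the `ToolCall` shape is a stand-in, not the PR's type):

```swift
struct ToolCall {
    let id: String
    let name: String
    let arguments: String
}

/// Executes every tool call from an assistant turn and collects the results,
/// so no function_call is left unresolved when the follow-up request is built.
func handleToolCalls(_ toolCalls: [ToolCall],
                     execute: (ToolCall) async -> String) async -> [(id: String, output: String)] {
    var results: [(id: String, output: String)] = []
    for tc in toolCalls {
        let output = await execute(tc)
        results.append((id: tc.id, output: output))
    }
    return results
}
```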
| source: "LLMClient" | ||
| ) | ||
| guard let fallbackFormat = routePlan.fallbackFormat else { continue } | ||
| currentRequest = try self.buildRequest(config, forcedFormat: fallbackFormat) |
Reapply timeout on responses→chat fallback request
The initial request gets config.timeoutSeconds, but on fallback the code rebuilds currentRequest without restoring that timeout. If the fallback endpoint is slow/unreachable, the retry uses default URLRequest/session timeouts instead of the configured limit, so requests that should fail fast can block much longer.
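A small sketch of re-applying the configured limit when the fallback request is rebuilt (the helper name is assumed):

```swift
import Foundation

/// Returns a copy of the rebuilt fallback request with the user-configured
/// timeout restored, so the retry fails as fast as the first attempt would.
func applyingConfiguredTimeout(_ request: URLRequest, seconds: TimeInterval) -> URLRequest {
    var retried = request
    retried.timeoutInterval = seconds
    return retried
}
```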
💡 Codex Review
Reviewed commit: f6f671218d
let id = item["call_id"] as? String ?? item["id"] as? String ?? "call_\(UUID().uuidString.prefix(8))"
var accumulator = toolCallAccumulators[id] ?? StreamingToolAccumulator()
Key streamed Responses tool calls by item ID
response.function_call_arguments.* events are accumulated under item_id, but response.output_item.* switches to call_id as the dictionary key here. On Responses streaming turns where item_id and call_id differ (the common case), the name lands in one accumulator and the arguments in another, so the emitted tool call often has empty {} arguments. In command mode this can turn valid tool invocations into empty/invalid executions (e.g., missing command), breaking tool-calling on streamed Responses providers.
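A sketch of keying both event families the same way; the dictionary keys mirror the review comment, the helper itself is assumed:

```swift
import Foundation

/// Picks one stable key per streamed tool call so the name (from
/// response.output_item.*) and the argument deltas (from
/// response.function_call_arguments.*) land in the same accumulator.
func accumulatorKey(for item: [String: Any]) -> String {
    // Prefer the item ID, which the argument-delta events are keyed by;
    // fall back to call_id, then to a generated placeholder.
    (item["id"] as? String)
        ?? (item["call_id"] as? String)
        ?? "call_\(UUID().uuidString.prefix(8))"
}
```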
💡 Codex Review
Reviewed commit: d1ad9f9c12
if format == .responses {
    return self.parseResponsesResponse(json)
Parse chat-style payloads before accepting Responses success
In processNonStreaming, .responses requests are always parsed as Responses JSON without validating schema compatibility. If an OpenAI-compatible /api/chat endpoint returns HTTP 200 with chat-completions payload (choices) instead of output, this path returns an empty response instead of throwing, so the Responses→chat fallback is never triggered and command mode degrades into empty/incorrect assistant turns.
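A minimal sketch of the shape check that would let the fallback trigger (the helper is an assumption):

```swift
/// Returns true only when a 200 response actually looks like a Responses
/// payload; a `choices` array means the endpoint answered in
/// chat-completions shape and should be parsed (or retried) accordingly.
func looksLikeResponsesPayload(_ json: [String: Any]) -> Bool {
    json["output"] != nil && json["choices"] == nil
}
```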
if let runtime = self.runtimes[server.id], runtime.config == server {
    continue
Reconnect unchanged MCP servers on forced reload
The forced reload flow (reloadConfiguration(force: true)) still skips servers whose config matches the cached runtime, so “Reload MCP” does not actually reconnect existing sessions. If a server process has died or its tool list changed without a config edit, this continue keeps stale runtimes/tool catalogs in place and the user cannot recover via reload alone.
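A hedged sketch of the gate, in the shape of the quoted diff; `force` is assumed to be the flag passed into `reloadConfiguration(force:)`:

```swift
// Sketch: only reuse the cached runtime on a non-forced reload, so
// "Reload MCP" really tears down and reconnects unchanged servers.
if !force, let runtime = self.runtimes[server.id], runtime.config == server {
    continue
}
```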
💡 Codex Review
Reviewed commit: 8e5e3c5c3f
if isResponses {
    let eventType = json["type"] as? String ?? ""
Parse chat-completions SSE when responses stream shape is absent
When format == .responses, the loop always takes this branch and later continues, so events without a Responses type field are dropped before reaching the choices/delta parser below. On chat-only OpenAI-compatible endpoints that return HTTP 200 with chat-completions SSE (choices) instead of Responses events, streaming turns become empty (no content/tool calls) and the Responses→chat fallback never triggers because no error is thrown.
private func showCompletionBadgeIfNeeded(success: Bool) {
    guard success else { return }
Show completion feedback for failed command-mode runs
This early return makes failure completions skip notch/system feedback entirely, even though callers pass success: false and the badge pipeline already supports failure state. As a result, failed command runs can end without any compact completion indicator/notification, which regresses user-visible completion feedback outside the in-app window.
# Conflicts:
#   Sources/Fluid/ContentView.swift
#   Sources/Fluid/Services/CommandModeService.swift
#   Sources/Fluid/Services/NotchOverlayManager.swift
#   Sources/Fluid/Views/NotchContentViews.swift
💡 Codex Review
Reviewed commit: 4338514ac8
let mcpServers =
    try container.decodeIfPresent([String: Server].self, forKey: .mcpServers) ?? [:]
self.servers = mcpServers.keys.sorted().compactMap { mcpServers[$0] }
Decode legacy servers array in MCP settings
SettingsDocument.init(from:) only reads mcpServers and silently ignores the servers key even though it is declared in CodingKeys. Any existing settings.json that uses a top-level servers array will therefore load as an empty configuration (no validation error), which disables all MCP servers on reload and makes migration/debugging difficult.
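A rough sketch of decoding both layouts; `Server` and the coding keys mirror names from the comment, everything else is illustrative:

```swift
init(from decoder: Decoder) throws {
    let container = try decoder.container(keyedBy: CodingKeys.self)
    let mcpServers = try container.decodeIfPresent([String: Server].self, forKey: .mcpServers) ?? [:]
    if !mcpServers.isEmpty {
        self.servers = mcpServers.keys.sorted().compactMap { mcpServers[$0] }
    } else {
        // Legacy settings.json layout: a top-level `servers` array.
        self.servers = try container.decodeIfPresent([Server].self, forKey: .servers) ?? []
    }
}
```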
Description
This PR adds a number of features to support MCP for command-mode execution. The features are:
Type of Change
Related Issues
Testing
swiftlint --strict --config .swiftlint.yml Sources
swiftformat --config .swiftformat Sources
sh build_incremental.sh
Notes