
Add MCP support to command mode #290

Open
grohith327 wants to merge 28 commits into main from mcp-support

Conversation

@grohith327
Collaborator

Description

This PR adds a set of features to support MCP in command mode execution. The features are:

  • Add separate strategies for routing Anthropic requests and OpenAI-compatible requests.
    • If the model provider is Anthropic, the Messages API is used; otherwise the Responses API is preferred, with Chat Completions as a fallback.
  • Minor updates to the command mode chat UI: rearranged the buttons.
  • Added a mechanism to define the MCP config and edit it from the FluidVoice app.
  • Fixed some issues with how tool calls are rendered in the chat.
  • Once an action completes: if the user's active display has a notch, a compact Dynamic Island notification is shown on success; otherwise a system notification is sent.
  • Added Markdown syntax rendering in the command mode chat.

Type of Change

  • 🐞 Bug fix (non-breaking change which fixes an issue)
  • ✨ New feature (non-breaking change which adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • 📝 Documentation update

Related Issues

Testing

  • Tested on Apple Silicon Mac
  • Tested on macOS [version]
  • Ran linter locally: swiftlint --strict --config .swiftlint.yml Sources
  • Ran formatter locally: swiftformat --config .swiftformat Sources
  • Built locally: sh build_incremental.sh

Notes

  • Due to the size of the change, I'd recommend pulling the code and testing it locally.
  • Do a beta release first

grohith327 and others added 22 commits April 14, 2026 16:29
The Ollama docs at github.com/ollama/ollama/blob/main/docs/openai.md
now 404 — the OpenAI-compatibility doc was moved to
docs/api/openai-compatibility.mdx and published at
https://docs.ollama.com/api/openai-compatibility. Point the Setup
Guide link there so users can actually read the docs.
Sub-1s recordings were silently returning an empty string, so short
utterances like 'yes', 'no', or 'stop' never made it into the transcript
and the user had no indication anything went wrong.

The original guard existed to avoid a whisper.cpp assertion on buffers
shorter than 1s, but it applied unconditionally — including to Parakeet,
Apple Speech, and Cohere which have no such constraint. Rather than
branch per provider, pad the buffer with trailing silence on the way
in: whisper.cpp no longer asserts, every other provider just sees a
moment of silence after the speech.

An empty buffer (no audio at all) is still an early-return, since
padding zero samples wouldn't help.

Closes #276
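
The padding approach described in this commit can be sketched as follows; the function name and the 16 kHz sample-rate default are illustrative assumptions, not FluidVoice's actual API:

```swift
/// Pad a mono PCM buffer with trailing silence so it spans at least
/// `minSeconds`. Sketch of the commit's approach; names and defaults
/// are assumptions, not the project's real code.
func padWithTrailingSilence(_ samples: [Float],
                            sampleRate: Int = 16_000,
                            minSeconds: Double = 1.0) -> [Float] {
    // A completely empty buffer stays empty: padding zero samples wouldn't help.
    guard !samples.isEmpty else { return samples }
    let minCount = Int(Double(sampleRate) * minSeconds)
    guard samples.count < minCount else { return samples }
    // Append zeros (silence) so whisper.cpp's >= 1 s expectation is met;
    // other providers just see a moment of silence after the speech.
    return samples + [Float](repeating: 0, count: minCount - samples.count)
}
```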
Claude Opus 4.7 uses extended thinking by default and rejects the
`temperature` parameter. Sending it causes every request to fail with
HTTP 400: `temperature is deprecated for this model`.

Add `SettingsStore.isTemperatureUnsupported(_:)` that covers both
reasoning models (o1/o3/gpt-5/...) and Claude Opus 4.7+, and use it at
each live call site (ContentView, CommandModeService, RewriteModeService)
to gate the temperature parameter.

`isReasoningModel` still gates `max_completion_tokens` / reasoning token
budgets, which remain OpenAI-specific.

Fixes #285
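
A minimal sketch of that gate; the exact model-name matching here is an assumption, not `SettingsStore`'s actual implementation:

```swift
/// Hedged sketch: decide whether `temperature` must be omitted for a model.
func isTemperatureUnsupported(_ model: String) -> Bool {
    let m = model.lowercased()
    // OpenAI reasoning models reject `temperature`.
    if m.hasPrefix("o1") || m.hasPrefix("o3") || m.hasPrefix("gpt-5") { return true }
    // Claude Opus 4.7+ (extended thinking on by default) rejects it too.
    if m.contains("opus-4.7") || m.contains("opus 4.7") { return true }
    return false
}

// At each live call site, attach the parameter only when supported:
var body: [String: Any] = ["model": "gpt-4o", "messages": [[String: Any]]()]
if !isTemperatureUnsupported("gpt-4o") {
    body["temperature"] = 0.7
}
```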
Adds an abstraction to choose a provider strategy based on the model provider: for Anthropic we use their Messages API; for other providers we start with the Responses API and fall back to the Chat Completions API.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d8de00483b

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines 22 to +24
  "type": "function",
- "function": [
-     "name": "execute_terminal_command",
-     "description": """
+ "name": "execute_terminal_command",
+ "description": """

P1 Badge Keep chat tool definitions in function wrapper

This tool schema was flattened to top-level name/description/parameters, but the chat-completions path still sends config.tools as-is, and chat-completions expects each tool under {"type":"function","function":{...}}. When /responses is unavailable and the client falls back to /chat/completions, command-mode tool calls can be rejected or ignored, so terminal/MCP execution stops working on chat-only OpenAI-compatible endpoints.


Comment on lines +580 to 584
  } else if endpoint.contains("/responses") || endpoint.contains("/chat/completions") || endpoint.contains("/api/chat") || endpoint.contains("/api/generate") {
      fullURL = endpoint
  } else {
-     fullURL = endpoint + "/chat/completions"
+     fullURL = endpoint + "/responses"
  }

P2 Badge Retry API verification on chat-completions endpoint

The connection test now defaults non-Anthropic providers to .../responses and treats failures as hard errors, but there is no fallback to .../chat/completions. For providers that are OpenAI-compatible but chat-only, verification will report failure even though runtime requests can still work via chat-completions, creating false negatives in settings and blocking a reliable “verify connection” flow.



@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bf353d4ff6


Comment on lines +309 to +311
where !attemptedResponsesFallback &&
self.isResponsesRequest(currentRequest) &&
routePlan.fallbackFormat != nil &&

P1 Badge Allow fallback for responses payloads on chat-style endpoints

executeWithRetry only triggers the Responses→chat-completions fallback when isResponsesRequest(currentRequest) is true, but this predicate is path-based. In the same change, ResponsesRouteStrategy.endpoint intentionally keeps /api/chat and /api/generate unchanged, so those requests are sent with Responses-format bodies but are not recognized as “responses requests” here. On chat-only OpenAI-compatible providers, a 400/404 from that first attempt will now fail the turn instead of retrying with chat-completions, breaking the intended compatibility flow.


Comment on lines +505 to +507
while usedToolNames.contains(candidate) {
candidate = Self.sanitizeToolName("\(base)_\(counter)")
counter += 1

P2 Badge Avoid non-terminating dedupe loop for long MCP tool names

This deduplication loop can become non-terminating when the sanitized base name is already 64 characters. sanitizeToolName truncates to 64 chars, so "\(base)_2", "\(base)_3", etc. collapse back to the same truncated string, candidate never changes, and while usedToolNames.contains(candidate) never exits. A server set with two colliding long tool names would hang MCP tool catalog rebuild/reload.


@altic-dev
Owner

One blocker before merge: chat completions fallback still receives Responses-style tool definitions.

In buildChatCompletionsBody, body["tools"] = config.tools forwards tools shaped like {"type":"function","name":...,"parameters":...}. If /v1/responses is rejected and we fall back to /v1/chat/completions, Chat Completions expects {"type":"function","function":{"name":...,"parameters":...}}, so command mode tool calls can 400 or be ignored exactly on fallback providers.

Fix should be small: add a normalizeToolsForChatCompletions(...) helper and use it in buildChatCompletionsBody, while keeping normalizeToolsForResponses(...) for Responses.
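
A hedged sketch of what that helper could look like; the name follows the suggestion above, but the field handling is an assumption:

```swift
/// Wrap flattened Responses-style tool definitions back into the
/// `{"type": "function", "function": {...}}` shape Chat Completions
/// expects. Sketch only, not the project's actual implementation.
func normalizeToolsForChatCompletions(_ tools: [[String: Any]]) -> [[String: Any]] {
    tools.map { tool in
        // Already wrapped in chat-completions shape? Pass through unchanged.
        if tool["function"] != nil { return tool }
        var fn: [String: Any] = [:]
        for key in ["name", "description", "parameters", "strict"] {
            if let value = tool[key] { fn[key] = value }
        }
        return ["type": "function", "function": fn]
    }
}
```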

@altic-dev
Owner

One more concrete blocker I found in MCP tool registration: uniqueToolName can loop forever when two tool names collide after the 64-character truncation.

base is already sanitized/truncated to 64 chars. On collision, candidate = sanitizeToolName("\(base)_\(counter)") appends a suffix, but sanitizeToolName truncates back to 64 chars, so the suffix is dropped and candidate remains identical to base forever.

Scenario: two MCP tools with names that differ only after the first 64 sanitized characters, or long server/tool combinations with the same prefix. That would hang MCP bootstrap/tool catalog rebuild.

Fix should be small: when adding a collision suffix, reserve suffix length before truncating the base, e.g. truncate to 64 - suffix.count then append _2, _3, etc.
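
That suffix-reservation fix can be sketched as below, approximating `sanitizeToolName` with simple truncation (the real function also rewrites characters):

```swift
/// Sketch of the fix: reserve room for the "_N" suffix BEFORE truncating,
/// so the suffix survives the 64-character limit and the loop terminates.
func uniqueToolName(_ raw: String, used: Set<String>, maxLength: Int = 64) -> String {
    let base = String(raw.prefix(maxLength))
    guard used.contains(base) else { return base }
    var counter = 2
    while true {
        let suffix = "_\(counter)"
        // Truncate the base to (maxLength - suffix length) first, then
        // append, so the candidate actually changes each iteration.
        let candidate = String(base.prefix(maxLength - suffix.count)) + suffix
        if !used.contains(candidate) { return candidate }
        counter += 1
    }
}
```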


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7d9c8f5711


Comment on lines +706 to +708
} else {
// Auto-execute MCP tool
await self.executeMCPTool(name: tc.name, arguments: tc.arguments, callId: tc.id)

P1 Badge Apply approval gating to MCP tool execution

When approval mode is enabled, only terminal commands go through the confirmation branch; MCP calls in the else path are always executed immediately. In a session where an MCP server exposes state-changing actions, the model can invoke them without any user confirmation, which bypasses the expected manual-approval safety behavior.
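
One way to close that gap, sketched with illustrative stand-ins for the service's real dispatch path:

```swift
/// Sketch only: in approval mode, gate every tool call, not just terminal
/// commands, since an MCP server can expose state-changing actions too.
enum ToolAction { case requestConfirmation, executeTerminal, executeMCP }

func dispatchToolCall(isTerminalCommand: Bool, approvalMode: Bool) -> ToolAction {
    if approvalMode {
        // MCP calls get the same confirmation step as terminal commands.
        return .requestConfirmation
    }
    return isTerminalCommand ? .executeTerminal : .executeMCP
}
```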


switch response.turnKind {
case .toolCallOnly, .toolCallWithText:
    let tc = response.toolCalls[0]
    self.currentStep = stepType == .checking ? .checking(tc.command) : .executing(tc.command)

P2 Badge Handle all tool calls emitted in a single assistant turn

This path only executes response.toolCalls[0] and drops any additional tool calls returned by the model. For Responses/Anthropic turns that include multiple function calls, only one result is produced and sent back, leaving unresolved calls and causing incorrect follow-up reasoning or incomplete workflows.


source: "LLMClient"
)
guard let fallbackFormat = routePlan.fallbackFormat else { continue }
currentRequest = try self.buildRequest(config, forcedFormat: fallbackFormat)

P2 Badge Reapply timeout on responses→chat fallback request

The initial request gets config.timeoutSeconds, but on fallback the code rebuilds currentRequest without restoring that timeout. If the fallback endpoint is slow/unreachable, the retry uses default URLRequest/session timeouts instead of the configured limit, so requests that should fail fast can block much longer.
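
A minimal sketch of the repair; the function is illustrative, not the project's actual `buildRequest`:

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking
#endif

/// Sketch: reapply the configured timeout whenever the request is rebuilt
/// for the Responses→chat-completions fallback.
func rebuildForFallback(url: URL, timeoutSeconds: TimeInterval) -> URLRequest {
    var request = URLRequest(url: url)
    request.httpMethod = "POST"
    // Without this, the retry silently reverts to URLRequest's default
    // 60 s timeout instead of the configured limit.
    request.timeoutInterval = timeoutSeconds
    return request
}
```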



@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f6f671218d


Comment thread Sources/Fluid/Services/LLMClient.swift Outdated
Comment on lines +1094 to +1095
let id = item["call_id"] as? String ?? item["id"] as? String ?? "call_\(UUID().uuidString.prefix(8))"
var accumulator = toolCallAccumulators[id] ?? StreamingToolAccumulator()

P1 Badge Key streamed Responses tool calls by item ID

response.function_call_arguments.* events are accumulated under item_id, but response.output_item.* switches to call_id as the dictionary key here. On Responses streaming turns where item_id and call_id differ (the common case), the name lands in one accumulator and the arguments in another, so the emitted tool call often has empty {} arguments. In command mode this can turn valid tool invocations into empty/invalid executions (e.g., missing command), breaking tool-calling on streamed Responses providers.
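
The single-key accumulation could look like the sketch below, with simplified stand-in types: every event is keyed by `item_id`, and `call_id` is carried inside the accumulator instead of being used as a second key.

```swift
/// Simplified stand-in for the streaming accumulator.
struct StreamingToolAccumulator {
    var callId: String?
    var name = ""
    var argumentsJSON = ""
}

var toolCallAccumulators: [String: StreamingToolAccumulator] = [:]

/// response.output_item.* events: record name and call_id under item_id.
func handleOutputItem(_ item: [String: Any], itemId: String) {
    var acc = toolCallAccumulators[itemId] ?? StreamingToolAccumulator()
    acc.callId = item["call_id"] as? String ?? acc.callId
    acc.name = item["name"] as? String ?? acc.name
    toolCallAccumulators[itemId] = acc
}

/// response.function_call_arguments.* events: same item_id key.
func handleArgumentsDelta(_ delta: String, itemId: String) {
    var acc = toolCallAccumulators[itemId] ?? StreamingToolAccumulator()
    acc.argumentsJSON += delta
    toolCallAccumulators[itemId] = acc
}
```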



@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d1ad9f9c12


Comment on lines +861 to +862
if format == .responses {
return self.parseResponsesResponse(json)

P1 Badge Parse chat-style payloads before accepting Responses success

In processNonStreaming, .responses requests are always parsed as Responses JSON without validating schema compatibility. If an OpenAI-compatible /api/chat endpoint returns HTTP 200 with chat-completions payload (choices) instead of output, this path returns an empty response instead of throwing, so the Responses→chat fallback is never triggered and command mode degrades into empty/incorrect assistant turns.
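
A shape check before parsing would address this; the sketch below uses stand-in names, not the project's real parsers:

```swift
enum LLMError: Error { case unexpectedPayload }
enum PayloadShape { case responses, chatCompletions }

/// Sketch: classify the payload by shape instead of trusting the requested
/// format, so a chat-completions body on a `.responses` request throws and
/// lets the Responses→chat fallback fire.
func classifyPayload(_ json: [String: Any]) throws -> PayloadShape {
    if json["output"] != nil { return .responses }          // Responses schema
    if json["choices"] != nil { return .chatCompletions }   // chat-completions schema
    throw LLMError.unexpectedPayload                        // neither: trigger fallback
}
```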


Comment on lines +273 to +274
if let runtime = self.runtimes[server.id], runtime.config == server {
continue

P2 Badge Reconnect unchanged MCP servers on forced reload

The forced reload flow (reloadConfiguration(force: true)) still skips servers whose config matches the cached runtime, so “Reload MCP” does not actually reconnect existing sessions. If a server process has died or its tool list changed without a config edit, this continue keeps stale runtimes/tool catalogs in place and the user cannot recover via reload alone.
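
The fix can be sketched with minimal stand-in types: only skip unchanged servers when the reload is not forced.

```swift
struct ServerConfig: Equatable { let id: String; let command: String }
struct Runtime { let config: ServerConfig }

var runtimes: [String: Runtime] = [:]

/// Sketch: honor `force` so "Reload MCP" really tears down and reconnects.
/// Types are minimal stand-ins for the manager's real ones.
func reloadConfiguration(servers: [ServerConfig], force: Bool) -> [String] {
    var reconnected: [String] = []
    for server in servers {
        if !force, let runtime = runtimes[server.id], runtime.config == server {
            continue  // unchanged and not forced: keep the live session
        }
        runtimes[server.id] = Runtime(config: server)  // tear down & reconnect
        reconnected.append(server.id)
    }
    return reconnected
}
```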



@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8e5e3c5c3f


Comment on lines +1063 to +1065
if isResponses {
let eventType = json["type"] as? String ?? ""


P1 Badge Parse chat-completions SSE when responses stream shape is absent

When format == .responses, the loop always takes this branch and later continues, so events without a Responses type field are dropped before reaching the choices/delta parser below. On chat-only OpenAI-compatible endpoints that return HTTP 200 with chat-completions SSE (choices) instead of Responses events, streaming turns become empty (no content/tool calls) and the Responses→chat fallback never triggers because no error is thrown.


Comment on lines +889 to +890
private func showCompletionBadgeIfNeeded(success: Bool) {
guard success else { return }

P2 Badge Show completion feedback for failed command-mode runs

This early return makes failure completions skip notch/system feedback entirely, even though callers pass success: false and the badge pipeline already supports failure state. As a result, failed command runs can end without any compact completion indicator/notification, which regresses user-visible completion feedback outside the in-app window.


# Conflicts:
#	Sources/Fluid/ContentView.swift
#	Sources/Fluid/Services/CommandModeService.swift
#	Sources/Fluid/Services/NotchOverlayManager.swift
#	Sources/Fluid/Views/NotchContentViews.swift

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4338514ac8


Comment on lines +25 to +27
let mcpServers =
try container.decodeIfPresent([String: Server].self, forKey: .mcpServers) ?? [:]
self.servers = mcpServers.keys.sorted().compactMap { mcpServers[$0] }

P2 Badge Decode legacy servers array in MCP settings

SettingsDocument.init(from:) only reads mcpServers and silently ignores the servers key even though it is declared in CodingKeys. Any existing settings.json that uses a top-level servers array will therefore load as an empty configuration (no validation error), which disables all MCP servers on reload and makes migration/debugging difficult.
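
Decoding both shapes is a small change; the sketch below uses a minimal stand-in `Server` type rather than the project's real one:

```swift
import Foundation

/// Minimal stand-in for the project's server type.
struct Server: Codable, Equatable { var command: String }

struct SettingsDocument: Decodable {
    var servers: [Server]

    enum CodingKeys: String, CodingKey { case mcpServers, servers }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        let byName = try container.decodeIfPresent([String: Server].self, forKey: .mcpServers) ?? [:]
        if !byName.isEmpty {
            // Current shape: dictionary keyed by server name.
            self.servers = byName.keys.sorted().compactMap { byName[$0] }
        } else {
            // Legacy shape: fall back to the top-level array so old
            // settings.json files load instead of silently becoming empty.
            self.servers = try container.decodeIfPresent([Server].self, forKey: .servers) ?? []
        }
    }
}
```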




Development

Successfully merging this pull request may close these issues.

[✨ FEATURE] MCP server support in Command Mode
