Skip to content

fix: enable cluster shard/node switching and improve dropdown usability#253

Open
alexey-temnikov wants to merge 1 commit intovalkey-io:mainfrom
alexey-temnikov:fix/cluster-node-switching
Open

fix: enable cluster shard/node switching and improve dropdown usability#253
alexey-temnikov wants to merge 1 commit intovalkey-io:mainfrom
alexey-temnikov:fix/cluster-node-switching

Conversation

@alexey-temnikov
Copy link
Copy Markdown
Collaborator

Summary

Fixes the inability to switch between Valkey cluster shards/nodes when connected to a cluster environment (e.g., ElastiCache with 6 shards, 18 nodes). Also includes credential fix from #252 and a dropdown usability improvement.

Bug Fixes

1. Cluster node switching disabled in header dropdown (app-header.tsx)

Root cause: The dropdown iterated over all cluster nodes discovered via CLUSTER SLOTS, but checked allConnections[primaryKey].status === CONNECTED per node. Since only the entry node gets CONNECTED status in the Redux store, all other shard buttons were disabled.

Fix: Removed per-node disabled check. All nodes in the dropdown are now clickable when any node in the cluster is connected. Added isCurrentNode highlighting and cluster-wide effectiveConnected check for the toggle button.

2. Route guard redirecting to /connect on shard switch (useIsConnected.ts)

Root cause: RequireConnectionuseIsConnected checked selectStatus(id) for the specific node in the URL. When navigating to a different shard, that node's status was Not Connected, causing a redirect.

Fix: For cluster routes (when clusterId is present), check if ANY node in the same cluster is connected, not just the specific :id node.

3. Server-side client lookup failure for non-entry nodes (utils.ts, action handlers)

Root cause: clients.get(connectionId) returned undefined for shard nodes that weren't the original entry point, because only the entry node's connectionId is stored in the server's clients map. The GlideClusterClient can route to any node, but the lookup failed.

Fix: Added resolveClient() utility that falls back to finding the cluster client via clusterNodesMap when direct lookup fails. Updated stats.ts, keys.ts, cluster.ts, and command.ts handlers to use it.

4. Crash when navigating to undiscovered shard (app-header.tsx)

Root cause: selectConnectionDetails(id!) returned undefined for nodes that had no entry in valkeyConnection state (e.g., shard 6 never individually connected), causing destructuring to throw.

Fix: Added safe fallback: ?? {} with proper type casting.

5. Credential passing for empty username (from #252) (cluster-node.tsx)

Root cause: primary.username && primary.password short-circuits to false when username is "" (e.g., ElastiCache default user), dropping both credentials and causing NOAUTH errors.

Fix: Gate only on primary.password and default username to primary.username ?? "".

6. Missing CA certificates for TLS (Dockerfile.app)

Fix: Install ca-certificates in the production runtime image for TLS connections to ElastiCache.

UX Enhancement

7. Dropdown triggered by entire badge, not just chevron icon (app-header.tsx)

Problem: The node switcher dropdown could only be opened by clicking the tiny chevron icon — the node name text was not clickable.

Fix: Moved onClick handler to the Badge component so the entire area (text + icon) toggles the dropdown. Added badgeRef to the outside-click handler to prevent mousedown/click event conflicts.

Files Changed

File Change
apps/frontend/src/hooks/useIsConnected.ts Cluster-aware connection check
apps/frontend/src/components/ui/app-header.tsx Dropdown fixes + usability
apps/frontend/src/components/cluster-topology/cluster-node.tsx Credential fix
apps/server/src/utils.ts resolveClient() helper
apps/server/src/actions/stats.ts Use resolveClient
apps/server/src/actions/keys.ts Use resolveClient
apps/server/src/actions/cluster.ts Use resolveClient
apps/server/src/actions/command.ts Use resolveClient
docker/Dockerfile.app Add ca-certificates

Testing

Tested against a live ElastiCache Valkey cluster (6 shards, 2 replicas per shard, TLS enabled):

  • ✅ Connect to cluster via single node
  • ✅ Switch between all 6 shard primaries via header dropdown
  • ✅ Switch to shard with no prior connection history
  • ✅ Dropdown toggles on text click (not just icon)
  • ✅ No console errors on any navigation
  • ✅ Dashboard loads correct per-shard data after switch

- Fix cluster node switching in header dropdown
- Fix RequireConnection guard for cluster routes
- Fix server-side client resolution for non-entry cluster nodes
- Fix credential passing for empty username (PR valkey-io#252)
- Add ca-certificates to Dockerfile for TLS
- Make entire badge clickable for dropdown toggle

Signed-off-by: Alexey Temnikov <[email protected]>
const navigate = useNavigate()
const { id, clusterId } = useParams<{ id: string; clusterId: string }>()
const { host, port, username, alias } = useSelector(selectConnectionDetails(id!))
const connectionDetails = useSelector(selectConnectionDetails(id!)) ?? {} as Partial<ReturnType<ReturnType<typeof selectConnectionDetails>>>
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, this change is unnecessary, let's keep the original (line 24)


// For cluster mode, consider connected if any node in the cluster is connected
const isClusterConnected = clusterId
? Object.values(allConnections ?? {}).some(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's move the nullish coalescing operator higher where (line 35) where allConnections is initialized using useSelector

<li className="flex flex-col gap-1" key={primaryKey}>
<button className="flex items-center cursor-pointer hover:bg-primary/20"
disabled={!nodeIsConnected}
<button className={`flex items-center cursor-pointer hover:bg-primary/20 ${isCurrentNode ? "bg-primary/10" : ""}`}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's use cn util instead of raw templates

const connections = useSelector(selectConnections)

// For cluster routes, consider connected if ANY node in the same cluster is connected
if (clusterId && status !== CONNECTED) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This hook is a convenience util to get connection status by URL params effectively. This change breaks the abstraction boundary — part of the job is done in the selector, another part — in the hook. What effectively this change is doing — is making clusterId an argument for the selector that has only one argument. So the selector now needs to accept two parameters (this is a minimal change surface) or accept an object { id/nodeId, clusterId } (which is a much bigger surface for the change). But in any case, derived state should be still produced in the selector, not hook.

Another option is (also within the selector) is to flatten.filter connections — this way we don't even need clusterId.

variant="bodySm"
>
<Dot className={isConnected ? "text-green-500" : "text-gray-400"} size={45} />
<Dot className={effectiveConnected ? "text-green-500" : "text-gray-400"} size={45} />
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this change the semantic of the Dot? I.e. it used to mean that "you, user, are connected to this node in the cluster" but now it means "you, user, are connected to SOME node in this cluster" — if so, this is not a desired outcome — the users connecting from Electron app will hate it because they cannot even physically connect to all nodes, they actually need to see what instances (of 12 max) they are connected to. Instead of "connected", we can think of it as "have active glide client + ws connections".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants