Commit 3b3a44e

Add Kubernetes Registry proposal with catalog system design
Signed-off-by: Daniele Martinoli <[email protected]>
1 parent a384737 commit 3b3a44e

6 files changed

+1122
-0
lines changed

Lines changed: 210 additions & 0 deletions
@@ -0,0 +1,210 @@
1+
# Kubernetes Registry Implementation for ToolHive Operator
2+
3+
## Problem Statement
4+
5+
The ToolHive operator currently manages individual MCP servers through `MCPServer` CRDs but has no centralized registry mechanism within Kubernetes. This gap makes servers harder to discover, complicates catalog management and upstream compatibility, and adds operational complexity.
6+
7+
## Goals
8+
9+
- **Native Kubernetes Registry**: Implement registry functionality using Custom Resource Definitions
10+
- **Upstream Format Support**: Leverage existing upstream conversion capabilities for ecosystem compatibility
11+
- **Multi-Registry Support**: Support both local registry entries and external registry synchronization
12+
- **Registry Hierarchy**: Support the multi-registry hierarchy defined in the upstream model
13+
- **Application Integration**: Provide REST API for programmatic access to registry data
14+
- **GitOps Compatibility**: Enable declarative registry management through CRD-based operations
15+
16+
## Architecture Overview
17+
18+
The Kubernetes registry implementation extends the operator with the `MCPRegistry` CRD and supporting controllers that work with the existing `MCPServer` CRD to provide a complete registry-to-deployment workflow.
19+
20+
## CRD Design Overview
21+
22+
### MCPRegistry CRD
23+
24+
The `MCPRegistry` CRD represents a registry source and synchronization configuration with these key components:
25+
26+
- **Source Configuration**: Support for ConfigMap, URL, Git, and Registry API sources
27+
- **Format Specification**: Handle both ToolHive and upstream registry formats
28+
- **Sync Policy**: Automatic and manual synchronization with configurable intervals
29+
- **Filtering**: Include/exclude servers based on names, tags, tiers, and transports
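
As a rough illustration of how these components might sit together in one resource, the sketch below adds a filter stanza to a registry definition. The `filter` block and its `include`/`exclude` keys are hypothetical names chosen for this example, as is the resource name `curated-community`; the authoritative schema is the one in the CRD specifications document. The other fields follow the Quick Start example later in this proposal.

```yaml
# Sketch only: the filter field names are assumptions, not the final schema.
apiVersion: toolhive.stacklok.io/v1alpha1
kind: MCPRegistry
metadata:
  name: curated-community          # hypothetical name
  namespace: toolhive-system
spec:
  format: upstream
  source:
    type: url
    url:
      url: "https://registry.modelcontextprotocol.io/servers.json"
  syncPolicy:
    enabled: true
    interval: "1h"
  filter:                          # hypothetical stanza
    include:
      tiers: ["Official"]
      transports: ["stdio", "sse"]
    exclude:
      tags: ["experimental"]
```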
30+
31+
### Job CRDs (Phase 3)
32+
33+
Declarative operation CRDs for GitOps compatibility:
34+
- `MCPRegistryImportJob`: Declarative import operations
35+
- `MCPRegistryExportJob`: Declarative export operations
36+
- `MCPRegistrySyncJob`: Declarative synchronization operations
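
To make the idea of a declarative operation concrete, a one-off synchronization request might look roughly like the manifest below. Only the kind `MCPRegistrySyncJob` comes from this proposal; the API group/version is assumed to match `MCPRegistry`, and the `registryRef` field is a hypothetical placeholder for however the job identifies its target registry.

```yaml
# Sketch only: spec fields are assumptions pending the CRD specifications.
apiVersion: toolhive.stacklok.io/v1alpha1   # assumed to match MCPRegistry
kind: MCPRegistrySyncJob
metadata:
  name: sync-upstream-community-manual      # hypothetical name
  namespace: toolhive-system
spec:
  registryRef:                              # hypothetical field
    name: upstream-community
```

Because the request is itself a resource, it can be committed to a Git repository and applied by the same tooling that manages the registries.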
37+
38+
**Detailed specifications**: See [kubernetes-registry/crd-specifications.md](kubernetes-registry/crd-specifications.md)
39+
40+
## Key Features and Capabilities
41+
42+
### 1. Registry Management
43+
44+
The ToolHive operator provides comprehensive registry management through specialized components:
45+
46+
#### Registry Controller
47+
- **Synchronization**: Automatic and manual synchronization with external registry sources
48+
- **Format Conversion**: Bidirectional conversion between ToolHive and upstream registry formats
49+
- **Filtering**: Include/exclude servers based on configurable criteria (names, tags, tiers, transports)
50+
- **Status Tracking**: Monitor sync status, error conditions, and statistics
51+
- **Server Labeling**: Automatically apply registry relationship labels to discovered servers
52+
53+
#### Registry API Service
54+
- **REST API**: HTTP endpoints for programmatic registry and server discovery
55+
- **Authentication**: Integration with Kubernetes RBAC and service account tokens
56+
- **Filtering**: Query servers by registry, category, transport type, and custom labels
57+
- **Format Support**: Return data in both ToolHive and upstream registry formats
58+
59+
### 2. Registry Sources
60+
61+
The implementation supports multiple registry source types, each of which can supply data in either the ToolHive or the upstream format.
62+
All sources support configurable synchronization policies including automatic sync intervals, retry behavior, and update strategies.
63+
64+
Registry sources can be organized in hierarchies as defined in the [MCP Registry Ecosystem Diagram](https://github.com/modelcontextprotocol/registry/blob/main/docs/ecosystem-diagram.excalidraw.svg), enabling upstream registries to aggregate from multiple sources.
65+
66+
Combined with retaining the ToolHive registry schema, this aggregation approach handles provenance data by extracting it from upstream registry extensions during format conversion.
67+
68+
#### ConfigMap Source
69+
- Store registry data directly in Kubernetes ConfigMaps
70+
- Ideal for small, manually managed registries
71+
- Immediate updates when ConfigMap changes
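
A minimal sketch of a ConfigMap-backed registry follows. The `ConfigMap` object uses the standard Kubernetes API; everything about how `MCPRegistry` points at it (the `configmap` source type value, the `name` and `key` fields, and the `toolhive` format value) is an assumption made for illustration, and the JSON payload is a placeholder rather than a complete registry document.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: team-registry-data         # hypothetical name
  namespace: toolhive-system
data:
  # Placeholder payload; a real registry file carries more fields.
  registry.json: |
    {
      "servers": {
        "filesystem-server": {
          "description": "Example entry",
          "transport": "stdio"
        }
      }
    }
---
# Sketch only: source field names below are assumptions.
apiVersion: toolhive.stacklok.io/v1alpha1
kind: MCPRegistry
metadata:
  name: team-registry              # hypothetical name
  namespace: toolhive-system
spec:
  format: toolhive                 # assumed value for the ToolHive format
  source:
    type: configmap                # assumed source type value
    configmap:                     # hypothetical block
      name: team-registry-data
      key: registry.json
```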
72+
73+
#### URL Source
74+
- Fetch registry data from HTTP/HTTPS endpoints
75+
- Support for authentication via Secret references
76+
- Custom headers for API integration
77+
78+
#### Git Source
79+
- Clone registry data from Git repositories
80+
- Branch and path specification
81+
- Authentication via SSH keys or tokens
82+
- Version tracking and change detection
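
The Git options listed above could map onto a source block along these lines. Every field name under `git` is hypothetical, and the repository URL and Secret name are placeholders; the fragment only shows how branch, path, and credentials might be expressed declaratively.

```yaml
# Sketch only: a fragment of an MCPRegistry spec with assumed field names.
spec:
  source:
    type: git                      # assumed source type value
    git:                           # hypothetical block
      repository: "https://git.example.com/platform/mcp-registry.git"
      branch: main
      path: registry/registry.json
      authSecretRef:               # hypothetical reference to a token or SSH key Secret
        name: git-credentials
```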
83+
84+
#### Registry Source
85+
- Reference another registry's REST API endpoint as a data source
86+
- Enables registry hierarchies and aggregation patterns across clusters
87+
- Supports filtering and transformation of upstream registry data
88+
- Works with any registry implementation that exposes the standard API
89+
- Useful for creating curated subsets or company-specific views of upstream registries
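
A curated, company-specific view of an upstream registry might be declared roughly as follows. The `registry` source type value, the `registry.url` field, and the `filter` stanza are all assumptions made for this sketch, and the endpoint is a placeholder.

```yaml
# Sketch only: field names are assumptions, not the final schema.
apiVersion: toolhive.stacklok.io/v1alpha1
kind: MCPRegistry
metadata:
  name: company-approved           # hypothetical name
  namespace: toolhive-system
spec:
  format: upstream
  source:
    type: registry                 # assumed source type value
    registry:                      # hypothetical block
      url: "https://registry.internal.example.com"
  filter:                          # hypothetical stanza
    include:
      tiers: ["Official"]
```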
90+
91+
### 3. Server-Registry Relationships
92+
93+
#### Automatic Labeling
94+
When servers are deployed from a registry, the controller automatically applies standardized labels during resource creation:
95+
96+
```yaml
labels:
  toolhive.stacklok.io/registry-name: upstream-community
  toolhive.stacklok.io/registry-namespace: toolhive-system
  toolhive.stacklok.io/server-name: filesystem-server
  toolhive.stacklok.io/tier: Official
  toolhive.stacklok.io/category: filesystem
```
104+
105+
These labels enable filtering, grouping, and querying servers by their registry source.
106+
107+
#### Pre-deployed Server Association
108+
Existing MCPServer resources can be associated with registries by applying the standard labels, enabling unified management across manually deployed and registry-synchronized servers.
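
One way to attach these labels to a server that already exists is a merge-patch fragment such as the sketch below. The label keys and values are the ones from the Automatic Labeling example; which labels are actually required for association is not specified here, so the selection shown is an assumption.

```yaml
# Merge-patch fragment: adds registry labels to an existing MCPServer.
metadata:
  labels:
    toolhive.stacklok.io/registry-name: upstream-community
    toolhive.stacklok.io/registry-namespace: toolhive-system
    toolhive.stacklok.io/server-name: filesystem-server
```

Saved to a file, a fragment like this could be applied with `kubectl patch` using `--type merge` and `--patch-file`, assuming the target `MCPServer` is addressable by its resource name.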
109+
110+
## Quick Start Example
111+
112+
```yaml
apiVersion: toolhive.stacklok.io/v1alpha1
kind: MCPRegistry
metadata:
  name: upstream-community
  namespace: toolhive-system
spec:
  displayName: "MCP Community Registry"
  format: upstream
  source:
    type: url
    url:
      url: "https://registry.modelcontextprotocol.io/servers.json"
  syncPolicy:
    enabled: true
    interval: "1h"
```
129+
130+
**Comprehensive examples**: See [kubernetes-registry/usage-examples.md](kubernetes-registry/usage-examples.md)
131+
132+
## Implementation Overview
133+
134+
The implementation follows a phased approach:
135+
136+
1. **Phase 1**: Core Registry CRD and basic synchronization
137+
2. **Phase 2**: External sources, REST API for applications
138+
3. **Phase 3**: CRD-based operations, automatic labeling
139+
4. **Phase 4**: Production features and filtering
140+
5. **Phase 5**: Advanced integration (optional)
141+
142+
**Detailed implementation plan**: See [kubernetes-registry/implementation-plan.md](kubernetes-registry/implementation-plan.md)
143+
144+
## CLI Integration
145+
146+
New registry management commands:
147+
- `thv registry list/add/sync/remove` - Registry lifecycle management
148+
- `thv registry import/export` - Data migration operations
149+
- `thv search/show` - Enhanced server discovery across registries
150+
151+
**Complete CLI reference**: See [kubernetes-registry/usage-examples.md](kubernetes-registry/usage-examples.md)
152+
153+
## Security and Operations
154+
155+
### Security Model
156+
- **RBAC Integration**: Granular permissions for registry operations
157+
- **Source Validation**: URL restrictions and content validation
158+
- **Authentication**: Secure handling of external source credentials
159+
- **Audit Logging**: Comprehensive operation tracking
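
Because registries are ordinary custom resources, granular permissions can be expressed with standard Kubernetes RBAC objects. The sketch below grants read-only access to registries in one namespace; the plural resource name `mcpregistries` and the role name are assumptions, since the generated CRD may use a different plural.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: mcpregistry-viewer         # hypothetical name
  namespace: toolhive-system
rules:
  - apiGroups: ["toolhive.stacklok.io"]
    resources: ["mcpregistries"]   # assumed plural resource name
    verbs: ["get", "list", "watch"]
```

Binding this role to a user or service account with a `RoleBinding` would allow listing and watching registries without permission to modify them.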
160+
161+
### Success Metrics
162+
- **Adoption**: Registry resource creation and server association rates
163+
- **Performance**: <30s sync time, >99% success rate, <100MB memory usage
164+
- **Usability**: Reduced manual configuration complexity
165+
- **Ecosystem**: Upstream registry coverage and format conversion accuracy
166+
167+
**Complete details**: See [kubernetes-registry/implementation-plan.md](kubernetes-registry/implementation-plan.md)
168+
169+
## Future Enhancements
170+
171+
1. **Catalog System** (see [kubernetes-registry/catalog-design.md](kubernetes-registry/catalog-design.md))
   - MCPCatalog CRD for curated server collections
   - Approval workflows and validation pipelines
   - OCI artifact distribution for catalog sharing
   - Role-based catalog access and governance
176+
177+
2. **Advanced Registry Features**
   - Registry federation and cross-cluster synchronization
   - Webhook-based real-time registry updates
   - Registry analytics and usage metrics
   - Content verification and signature validation
182+
183+
3. **Template System** (see [../mcp-server-template-system.md](../mcp-server-template-system.md))
   - Comprehensive template parameter system
   - Template versioning and inheritance
   - Integration with Helm and Kustomize
   - Interactive template wizards and validation
188+
189+
4. **Integration Expansions**
   - GitOps workflow integration with ArgoCD/Flux
   - CI/CD pipeline integration
   - Service mesh integration for advanced networking
   - Multi-cluster registry synchronization
194+
195+
5. **Community Features**
   - Community ratings and reviews for registry entries
   - Automated server discovery from popular repositories
   - Registry contribution workflows and governance
199+
200+
## Conclusion
201+
202+
The Kubernetes Registry implementation provides a cloud-native approach to MCP server management that:
203+
204+
- **Leverages Kubernetes APIs** for native resource management and RBAC
205+
- **Integrates with existing tooling** through standard kubectl and custom CLI commands
206+
- **Supports ecosystem growth** through upstream format compatibility and conversion
207+
- **Enables GitOps workflows** through declarative resource definitions
208+
- **Scales operationally** with automated synchronization and registry-based deployment
209+
210+
This implementation transforms ToolHive into a comprehensive Kubernetes-native platform for MCP server lifecycle management, maintaining backward compatibility while enabling ecosystem integration through upstream format support.
