Conversation
> ## Goals

---

This is great! I wonder, however, if we should add a concept of the 'owner' of a slate. For example, if you have a Prometheus server that's backed by a slate TSDB, simply deleting the S3 buckets will orphan the server. We need to be able to know that there is a service that depends on the slate, where it lives, etc., so that operations maintain the integrity of the system.

---

Makes sense. This gets into liveness, I guess. I was hoping we could go with a model where the catalog only communicates with object storage. Perhaps we could add some kind of explicit fencing marker to each OpenData system, which must be added by the writer. The purpose is to fence writers/readers and signal the catalog that it is safe to delete.

---

To make sure I understand: the marker would be added by the writer to indicate that the slate is no longer being written to? And that makes it safe to delete? If so, I'd presume the readers would also have to write a marker. This is more like a lease system than fencing markers, if that's how you are thinking it would work.
From a user perspective, if you have a service, you want to manage the lifecycle of the service as a whole. For example, I think a useful admin delete operation would deprovision the service and optionally delete the data. Not just delete the data.

---

The marker could be like a poison pill inserted into the SlateDB manifest. It would kill readers and writers. I think the main point is trying to define the communication model: how does the catalog interact with provisioning systems? How does the catalog interact with system readers/writers? The ideal from my perspective is that all communication with the catalog is done through object storage. At its heart, it is a return of our "storage as protocol" idea. For example, the catalog could write provisioning requests as files in object storage. Some kind of k8s service could watch for those files and do the actual provisioning work. Deletion workflows could follow a similar pattern.
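For illustration, the "storage as protocol" provisioning idea could be sketched roughly like this. This is only a sketch: a dict stands in for the object store bucket, and all key layouts and function names are hypothetical.

```python
import json

# In-memory stand-in for an S3-like bucket; the key layout is hypothetical.
object_store = {}

def request_provisioning(slate_name: str, slate_type: str) -> str:
    """Catalog side: write a provisioning request as an object."""
    key = f"catalog/requests/{slate_name}.json"
    object_store[key] = json.dumps(
        {"name": slate_name, "type": slate_type, "op": "provision"}
    ).encode()
    return key

def poll_requests() -> list:
    """Watcher side (e.g. a k8s service): pick up pending requests and act."""
    handled = []
    for key in sorted(k for k in object_store if k.startswith("catalog/requests/")):
        handled.append(json.loads(object_store.pop(key)))
        # a real watcher would provision pods for the slate here
    return handled

request_provisioning("metrics", "timeseries")
print(poll_requests())  # -> [{'name': 'metrics', 'type': 'timeseries', 'op': 'provision'}]
```

Deletion requests could reuse the same mechanism with a different `op` value.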

---

Downstream systems could just be catalog readers, I guess. They might follow changes to the catalog as any SlateDB reader does and act when necessary.

---

I like this line of thinking. So then the catalog needs to hold enough metadata to make those flows possible. I think this is captured under 'define process for registering/deleting slates' in your goals. So I can imagine this type of flow in the longer term:

- A user issues a deprovisioning request for some OpenData database `db0`. This could be through a UI, Terraform, whatever.
- This request is received by some control plane.
- The control plane writes this poison pill into the catalog.
- A k8s operator detects the poison pill and instructs the reader/writer services to shut down. This means it needs a mapping from the slate to the reader and writer pods. Alternatively, the reader and writer pods of `db0` could be catalog readers themselves: they read the catalog and mark themselves for deprovisioning, with the operator simply executing the action. I like the latter approach because it works for essentially any deployment model and doesn't require the operator to maintain additional metadata.

Does that match what you had in mind?
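The flow above could be sketched roughly like this, with a plain dict standing in for the SlateDB-backed catalog; key names are purely illustrative:

```python
# Hypothetical sketch of the deprovisioning flow discussed above.
catalog = {}

def write_poison_pill(db: str) -> None:
    """Control plane: mark the database for deprovisioning in the catalog."""
    catalog[f"slates/{db}/poison"] = True

def pod_should_shut_down(db: str) -> bool:
    """Reader/writer pods of `db` act as catalog readers: they check the marker
    themselves, so the operator needs no slate -> pod mapping of its own."""
    return catalog.get(f"slates/{db}/poison", False)

write_poison_pill("db0")
print(pod_should_shut_down("db0"))      # True: db0 pods mark themselves for shutdown
print(pod_should_shut_down("metrics"))  # False: unaffected slates keep running
```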

---

Yeah, I think that's right. I guess the main point is that downstream systems are just Readers in our SlateDB-backed framing. So provisioning systems would follow the catalog as readers. Perhaps they could even modify the catalog themselves by temporarily assuming the Writer role. Perhaps we do not need long-lived writers at all. If we could get a model like this to work, it would remove a huge amount of complexity: you don't need to run the catalog as a persistent service.

---

I added some text to the RFC about the communication model. I like it. It leans into the advantages of slatedb/object storage.

---

Nice. I like the direction. I think SlateDB transactions will be crucial to making this work.
> This RFC proposes a catalog system for OpenData that serves as a central management plane for OpenData storage systems. The catalog provides a single point to manage metadata about the systems ("slates") a user has installed, including their names, types, and object storage configuration. The catalog itself is implemented as a slate backed by SlateDB, following the same patterns as other OpenData subsystems.
> ## Motivation

---

I had a review of this lined up, but now I'm wondering whether this RFC is over-committed to a dogfooding philosophy. What if instead we tried to design this as k8s-native?
I posed this question to Claude and here's an alternative we came up with:
The current RFC essentially builds a bespoke control plane on top of SlateDB, but if your primary deployment target is Kubernetes, you'd be reinventing machinery that K8s already provides (watches, reconciliation loops, status subresources, RBAC, etc.).
Instead of the catalog being a SlateDB-backed store that components poll, the Kubernetes API server becomes the catalog. Each slate is represented as a Custom Resource, and an operator reconciles desired state to actual state.
Then, using those CRDs, the workflow could look like:
```
$ kubectl get slates
NAME      TYPE        BUCKET                   PHASE
events    log         s3://acme-data/events    Provisioned
metrics   timeseries  s3://acme-data/metrics   Provisioned

$ kubectl apply -f - <<EOF
apiVersion: opendata.io/v1alpha1
kind: Slate
metadata:
  name: orders
spec:
  type: log
  objectStore:
    bucket: s3://acme-data/orders
EOF
slate.opendata.io/orders created

$ kubectl get slate orders -o jsonpath='{.status.phase}'
Provisioned

$ kubectl delete slate orders
slate.opendata.io/orders deleted
```
**Tradeoffs**
| Aspect | SlateDB-backed Catalog | K8s-native CRDs |
|---|---|---|
| Dependency | Only object storage | Requires Kubernetes |
| Discovery | Must know catalog location | Standard K8s API discovery |
| Watches | Polling (or custom mechanism) | Native watch support |
| Auth/RBAC | Custom | K8s RBAC out of the box |
| Tooling | Custom CLI | kubectl, GitOps, Helm, etc. |
| Dogfooding | Uses OpenData's own primitives | External control plane |
| Portability | Runs anywhere with object storage | K8s-only (or needs abstraction) |
The big benefit, then, of using OpenData is that you install just one operator and learn one common language of CRDs, instead of learning a new operator and new CRDs for each of the data systems you deploy in your k8s stack.

---

I also think philosophically it's OK to lean into Kubernetes + object storage as the two primitives we rely on. The 'pitch' in my mind is that those two solve the hardest distributed-systems problems: the former solves elastic compute and the latter solves elastic storage/consistency. Without both, OpenData's vision can't come to fruition.
Another big win with using kubectl as the primary control plane CLI is that the AI agents are really good with it.

---

We should definitely design to be k8s-native, but I don't think a catalog as being discussed here occupies the same place as k8s. Any system will need some storage to figure out what's deployed, where it's deployed, etc. Deployments very often span regions and k8s clusters. The question is: where is that information going to live? We need a catalog for that, which drives the k8s actions in a particular region.

---

I guess I conflated the two. I believe we should start by figuring out the deployment models in a single k8s cluster and work up from there. The CLI as proposed here has a lot of overlap with the type of things that k8s should handle for me if it's all within a single k8s cluster.
I'm not convinced that multi-region/multi-k8s is something we should figure out until we have a solid understanding of the single-k8s, multi-AZ design. A single k8s cluster can span multiple AZs, which is likely where 99% of data systems stop.

---

I agree with the concern. I don't think we want the catalog directly involved in provisioning. At the same time, I'm not too comfortable being super dogmatic about k8s and sticking it at the heart of the system. I took a shot at reframing the catalog in the latest patch. Rather than tracking a target state, the catalog might simply track the current state. It might be aware of active readers/writers in the system. Kubernetes could consult the catalog prior to deprovisioning a resource rather than having the catalog drive deprovisioning itself for example. Not sure if this is enough value to justify the catalog's existence just yet. I suspect we need to let this stew for a while.
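To make that reframing concrete, a pre-deprovision check might look roughly like this. This is a hypothetical sketch: a dict stands in for the catalog, and all key and function names are illustrative.

```python
# The catalog tracks *current* state (active readers and writers per slate);
# Kubernetes consults it before tearing a slate down, rather than being driven by it.
catalog = {
    "slates/db0/writers": {"writer-0"},
    "slates/db0/readers": {"reader-0", "reader-1"},
}

def safe_to_deprovision(slate: str) -> bool:
    """A k8s pre-delete hook could call this before deprovisioning a slate."""
    writers = catalog.get(f"slates/{slate}/writers", set())
    readers = catalog.get(f"slates/{slate}/readers", set())
    return not writers and not readers

print(safe_to_deprovision("db0"))  # False: db0 still has active clients
catalog["slates/db0/writers"].clear()
catalog["slates/db0/readers"].clear()
print(safe_to_deprovision("db0"))  # True: now safe to tear down
```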
> 1. A user creates a Kubernetes Custom Resource specifying a new slate.
> 2. The K8s operator provisions the slate with a catalog reference in its configuration.
> 3. When the slate starts, it assumes the **Writer** role to register itself in the catalog.
> 4. CLI tooling or other components can observe the catalog as **Readers** to discover running slates.

---

I like the new version a lot. I agree this needs time to bake, and I think the crux of what needs to bake is this 4th item: what do the readers actually do with the data in the catalog? I think this, more than anything, is what will inform if and how the catalog co-exists with orchestration systems like k8s.
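Steps 3 and 4 of the quoted flow could be sketched like this, again with a dict as a stand-in for the SlateDB-backed catalog; names are hypothetical:

```python
# Hypothetical sketch of steps 3-4: a slate briefly assumes the Writer role to
# register itself, and tooling reads the catalog as a Reader to discover slates.
catalog = {}

def register_slate(name: str, slate_type: str, bucket: str) -> None:
    """Step 3: the starting slate, acting as the short-lived Writer, records itself."""
    catalog[f"slates/{name}"] = {"type": slate_type, "bucket": bucket}

def discover_slates() -> dict:
    """Step 4: CLI tooling or other components observe the catalog as Readers."""
    return {
        key.removeprefix("slates/"): meta
        for key, meta in catalog.items()
        if key.startswith("slates/")
    }

register_slate("events", "log", "s3://acme-data/events")
print(discover_slates())  # -> {'events': {'type': 'log', 'bucket': 's3://acme-data/events'}}
```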

---

It would be useful to have a catalog for all OpenData systems. This PR attempts to sketch out the potential scope for an initial catalog. We can complete design details once there's general consensus about it.