Conversation
> ## Goals

---

This is great! I wonder, however, if we should add a concept of the 'owner' of a slate. For example, if you have a Prometheus server that's backed by a slate TSDB, simply deleting the S3 buckets will orphan the server. We need to be able to know that there is a service that depends on the slate, where it lives, etc., so that operations maintain the integrity of the system.

---

Makes sense. This gets into liveness, I guess. I was hoping we could go with a model where the catalog only communicates with object storage. Perhaps we could add some kind of explicit fencing marker to each OpenData system, which must be added by the writer. The purpose is to fence writers/readers and signal the catalog that it is safe to delete.

---

To make sure I understand: the marker would be added by the writer to indicate that the slate is no longer being written to? And that makes it safe to delete? If so, I'd presume the readers would also have to write a marker. This is more like a lease system than fencing markers, if that's how you are thinking it would work.
From a user perspective, if you have a service, you want to manage the lifecycle of the service as a whole. For example, I think a useful admin delete operation would deprovision the service and optionally delete the data. Not just delete the data.

---

The marker could be like a poison pill inserted into the SlateDB manifest. It would kill readers and writers. I think the main point is trying to define the communication model: how does the catalog interact with provisioning systems? How does the catalog interact with system readers/writers? The ideal from my perspective is that all communication with the catalog is done through object storage. At its heart, it is a return of our "storage as protocol" idea. For example, the catalog could write provisioning requests as files in object storage. Some kind of k8s service could watch for those files and do the actual provisioning work. Deletion workflows could follow a similar pattern.
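For illustration, the "storage as protocol" provisioning idea could be sketched roughly like this. This is only a sketch: a dict stands in for the object store bucket, and all key layouts and function names are hypothetical.

```python
import json

# In-memory stand-in for an S3-like bucket; the key layout is hypothetical.
object_store = {}

def request_provisioning(slate_name: str, slate_type: str) -> str:
    """Catalog side: write a provisioning request as an object."""
    key = f"catalog/requests/{slate_name}.json"
    object_store[key] = json.dumps(
        {"name": slate_name, "type": slate_type, "op": "provision"}
    ).encode()
    return key

def poll_requests() -> list:
    """Watcher side (e.g. a k8s service): pick up pending requests and act."""
    handled = []
    for key in sorted(k for k in object_store if k.startswith("catalog/requests/")):
        handled.append(json.loads(object_store.pop(key)))
        # a real watcher would provision pods for the slate here
    return handled

request_provisioning("metrics", "timeseries")
print(poll_requests())  # -> [{'name': 'metrics', 'type': 'timeseries', 'op': 'provision'}]
```

Deletion requests could reuse the same mechanism with a different `op` value.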

---

Downstream systems could just be catalog readers, I guess. They might follow changes to the catalog as any SlateDB reader does and act when necessary.

---

I like this line of thinking. So then the catalog needs to hold enough metadata to make those flows possible. I think this is captured under 'define process for registering/deleting slates' in your goals. So I can imagine this type of flow in the longer term:

- A user issues a deprovisioning request for some OpenData database `db0`. This could be through a UI, Terraform, whatever.
- This request is received by some control plane.
- The control plane writes this poison pill into the catalog.
- A k8s operator detects the poison pill and instructs the reader/writer services to shut down. This means it needs a mapping from the slate to the reader and writer pods. Alternatively, the reader and writer pods of `db0` could be catalog readers themselves: they read the catalog and mark themselves for deprovisioning, with the operator simply executing the action. I like the latter approach because it works for essentially any deployment model and doesn't require the operator to maintain additional metadata.

Does that match what you had in mind?
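The flow above could be sketched roughly like this, with a plain dict standing in for the SlateDB-backed catalog; key names are purely illustrative:

```python
# Hypothetical sketch of the deprovisioning flow discussed above.
catalog = {}

def write_poison_pill(db: str) -> None:
    """Control plane: mark the database for deprovisioning in the catalog."""
    catalog[f"slates/{db}/poison"] = True

def pod_should_shut_down(db: str) -> bool:
    """Reader/writer pods of `db` act as catalog readers: they check the marker
    themselves, so the operator needs no slate -> pod mapping of its own."""
    return catalog.get(f"slates/{db}/poison", False)

write_poison_pill("db0")
print(pod_should_shut_down("db0"))      # True: db0 pods mark themselves for shutdown
print(pod_should_shut_down("metrics"))  # False: unaffected slates keep running
```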

---

Yeah, I think that's right. I guess the main point is that downstream systems are just Readers in our SlateDB-backed framing. So provisioning systems would follow the catalog as readers. Perhaps they could even modify the catalog themselves by temporarily assuming the Writer role. Perhaps we do not need long-lived writers at all. If we could get a model like this to work, it would remove a huge amount of complexity: you don't need to run the catalog as a persistent service.

---

I added some text to the RFC about the communication model. I like it. It leans into the advantages of slatedb/object storage.

---

Nice. I like the direction. I think SlateDB transactions will be crucial to making this work.
> This RFC proposes a catalog system for OpenData that serves as a central management plane for OpenData storage systems. The catalog provides a single point to manage metadata about the systems ("slates") a user has installed, including their names, types, and object storage configuration. The catalog itself is implemented as a slate backed by SlateDB, following the same patterns as other OpenData subsystems.
> ## Motivation

---

I had a review of this lined up, but now I'm wondering whether this RFC is over-committed to a dogfooding philosophy. What if instead we tried to design this as k8s-native?
I posed this question to Claude and here's an alternative we came up with:
The current RFC essentially builds a bespoke control plane on top of SlateDB, but if your primary deployment target is Kubernetes, you'd be reinventing machinery that K8s already provides (watches, reconciliation loops, status subresources, RBAC, etc.).
Instead of the catalog being a SlateDB-backed store that components poll, the Kubernetes API server becomes the catalog. Each slate is represented as a Custom Resource, and an operator reconciles desired state to actual state.
Then, using those CRDs, the workflow could look like:
```
$ kubectl get slates
NAME      TYPE        BUCKET                   PHASE
events    log         s3://acme-data/events    Provisioned
metrics   timeseries  s3://acme-data/metrics   Provisioned

$ kubectl apply -f - <<EOF
apiVersion: opendata.io/v1alpha1
kind: Slate
metadata:
  name: orders
spec:
  type: log
  objectStore:
    bucket: s3://acme-data/orders
EOF
slate.opendata.io/orders created

$ kubectl get slate orders -o jsonpath='{.status.phase}'
Provisioned

$ kubectl delete slate orders
slate.opendata.io/orders deleted
```
**Tradeoffs**
| Aspect | SlateDB-backed Catalog | K8s-native CRDs |
|---|---|---|
| Dependency | Only object storage | Requires Kubernetes |
| Discovery | Must know catalog location | Standard K8s API discovery |
| Watches | Polling (or custom mechanism) | Native watch support |
| Auth/RBAC | Custom | K8s RBAC out of the box |
| Tooling | Custom CLI | kubectl, GitOps, Helm, etc. |
| Dogfooding | Uses OpenData's own primitives | External control plane |
| Portability | Runs anywhere with object storage | K8s-only (or needs abstraction) |
The big benefit, then, of using OpenData is that you install just one operator and learn one common language of CRDs, instead of learning a new operator and new CRDs for each of the data systems you deploy in your k8s stack.

---

I also think philosophically it's OK to lean into Kubernetes + object storage as the two primitives we rely on. The 'pitch' in my mind is that those two solve the hardest distributed-systems problems: the former solves elastic compute and the latter solves elastic storage/consistency. Without both, OpenData's vision can't come to fruition.
Another big win with using kubectl as the primary control plane CLI is that the AI agents are really good with it.

---

We should definitely design to be k8s-native, but I don't think a catalog as being discussed here occupies the same place as k8s. Any system will need some storage to figure out what's deployed, where it's deployed, etc. Deployments very often span regions and k8s clusters. The question is: where is that information going to live? We need a catalog for that, which drives the k8s actions in a particular region.

---

I guess I conflated the two. I believe we should start by figuring out the deployment models in a single k8s cluster and work up from there. The CLI as proposed here has a lot of overlap with the type of things that k8s should handle for me if it's all within a single k8s cluster.
I'm not convinced that multi-region/multi-k8s is something we should figure out until we have a solid understanding of the single-k8s, multi-AZ design. A single k8s cluster can span multiple AZs, which is likely where 99% of data systems stop.

---

I agree with the concern. I don't think we want the catalog directly involved in provisioning. At the same time, I'm not too comfortable being super dogmatic about k8s and sticking it at the heart of the system. I took a shot at reframing the catalog in the latest patch. Rather than tracking a target state, the catalog might simply track the current state. It might be aware of active readers/writers in the system. Kubernetes could consult the catalog prior to deprovisioning a resource rather than having the catalog drive deprovisioning itself for example. Not sure if this is enough value to justify the catalog's existence just yet. I suspect we need to let this stew for a while.
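To make that reframing concrete, a pre-deprovision check might look roughly like this. This is a hypothetical sketch: a dict stands in for the catalog, and all key and function names are illustrative.

```python
# The catalog tracks *current* state (active readers and writers per slate);
# Kubernetes consults it before tearing a slate down, rather than being driven by it.
catalog = {
    "slates/db0/writers": {"writer-0"},
    "slates/db0/readers": {"reader-0", "reader-1"},
}

def safe_to_deprovision(slate: str) -> bool:
    """A k8s pre-delete hook could call this before deprovisioning a slate."""
    writers = catalog.get(f"slates/{slate}/writers", set())
    readers = catalog.get(f"slates/{slate}/readers", set())
    return not writers and not readers

print(safe_to_deprovision("db0"))  # False: db0 still has active clients
catalog["slates/db0/writers"].clear()
catalog["slates/db0/readers"].clear()
print(safe_to_deprovision("db0"))  # True: now safe to tear down
```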
> 1. A user creates a Kubernetes Custom Resource specifying a new slate.
> 2. The K8s operator provisions the slate with a catalog reference in its configuration.
> 3. When the slate starts, it assumes the **Writer** role to register itself in the catalog.
> 4. CLI tooling or other components can observe the catalog as **Readers** to discover running slates.

---

I like the new version a lot. I agree this needs time to bake, and I think the crux of what needs to bake is this 4th item: what do the readers actually do with the data in the catalog? I think this, more than anything, is what will inform if and how the catalog co-exists with orchestration systems like k8s.
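Steps 3 and 4 of the quoted flow could be sketched like this, again with a dict as a stand-in for the SlateDB-backed catalog; names are hypothetical:

```python
# Hypothetical sketch of steps 3-4: a slate briefly assumes the Writer role to
# register itself, and tooling reads the catalog as a Reader to discover slates.
catalog = {}

def register_slate(name: str, slate_type: str, bucket: str) -> None:
    """Step 3: the starting slate, acting as the short-lived Writer, records itself."""
    catalog[f"slates/{name}"] = {"type": slate_type, "bucket": bucket}

def discover_slates() -> dict:
    """Step 4: CLI tooling or other components observe the catalog as Readers."""
    return {
        key.removeprefix("slates/"): meta
        for key, meta in catalog.items()
        if key.startswith("slates/")
    }

register_slate("events", "log", "s3://acme-data/events")
print(discover_slates())  # -> {'events': {'type': 'log', 'bucket': 's3://acme-data/events'}}
```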

---

It would be useful to have a catalog for all OpenData systems. This PR attempts to sketch out the potential scope for an initial catalog. We can complete design details once there's general consensus about it.