Problem
Today, configuring embedded seekdb in PowerMem means using the OceanBase config surface (DATABASE_PROVIDER=oceanbase plus OCEANBASE_HOST= left empty, with OCEANBASE_PATH=./seekdb_data, etc.). seekdb and a remote OceanBase cluster share most of the configuration — same SQL surface, same backend class — but they are not 100% identical:
- seekdb is always embedded; it never makes sense for it to take an OCEANBASE_HOST.
- The on-disk data directory (OCEANBASE_PATH today) is a seekdb-only concept; it is meaningless for a remote OceanBase cluster.
- Some toggles have different sensible defaults — e.g. the native hybrid-search SQL extension is available out of the box in current seekdb releases but not in older OceanBase versions, so the recommended default differs.
- Connection pool tuning (OCEANBASE_POOL_RECYCLE, OCEANBASE_POOL_PRE_PING) is meaningful for a remote cluster but is effectively a no-op for the embedded NullPool path.
Forcing both backends through the same OCEANBASE_* env namespace makes the .env ambiguous: a user cannot tell from the keys alone whether the deployment is embedded or remote, and a setting that is valid for one mode (e.g. OCEANBASE_PATH) silently misbehaves under the other.
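To make the ambiguity concrete, here is a minimal sketch. It assumes the mode is inferred purely from whether OCEANBASE_HOST is empty, as described above; `infer_mode` and the address 10.0.0.5 are hypothetical and are not taken from PowerMem.

```python
def infer_mode(env: dict) -> str:
    """Assumed rule: an empty OCEANBASE_HOST means embedded seekdb."""
    host = env.get("OCEANBASE_HOST", "").strip()
    return "embedded" if not host else "remote"


# Two deployments that use exactly the same set of keys.
embedded_env = {
    "DATABASE_PROVIDER": "oceanbase",
    "OCEANBASE_HOST": "",
    "OCEANBASE_PATH": "./seekdb_data",
}
remote_env = {
    "DATABASE_PROVIDER": "oceanbase",
    "OCEANBASE_HOST": "10.0.0.5",       # illustrative address
    "OCEANBASE_PATH": "./seekdb_data",  # meaningless here, yet nothing flags it
}

# The key sets are identical; only the value of one key reveals the mode.
same_keys = embedded_env.keys() == remote_env.keys()
modes = (infer_mode(embedded_env), infer_mode(remote_env))
```

Both dictionaries expose the same keys, so the deployment shape can only be recovered by reading the value of OCEANBASE_HOST.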
Proposal
Expose seekdb as a first-class database provider with its own configuration namespace, fully separated from the OceanBase namespace:
- Register seekdb as a distinct provider name (in addition to oceanbase), even though it can share the backend class with OceanBase internally.
- Introduce a SEEKDB_* env var namespace parallel to OCEANBASE_*:
  - SEEKDB_PATH, SEEKDB_DATABASE, SEEKDB_COLLECTION
  - SEEKDB_INDEX_TYPE, SEEKDB_VECTOR_METRIC_TYPE, SEEKDB_EMBEDDING_MODEL_DIMS
  - SEEKDB_TEXT_FIELD, SEEKDB_VECTOR_FIELD, SEEKDB_PRIMARY_FIELD, SEEKDB_METADATA_FIELD, SEEKDB_VIDX_NAME
  - SEEKDB_POOL_RECYCLE, SEEKDB_POOL_PRE_PING
  - SEEKDB_INCLUDE_SPARSE, SEEKDB_ENABLE_NATIVE_HYBRID
- Hard namespace isolation: when DATABASE_PROVIDER=seekdb, only SEEKDB_* keys are read; when DATABASE_PROVIDER=oceanbase, only OCEANBASE_* keys are read. No cross-namespace fallbacks, no implicit defaulting from the wrong provider's settings.
- Provider-specific defaults that reflect the deployment shape:
  - seekdb: empty host (= embedded), on-disk path, native hybrid SQL extension enabled by default.
  - oceanbase: non-empty host required (default 127.0.0.1 as a hint), OCEANBASE_PATH rejected as a configuration error since it is a seekdb concept.
- Split .env.example documentation so the seekdb and oceanbase blocks are independent, with each variable annotated with its purpose, recommended value, and alternatives. Pure seekdb .env files no longer contain OCEANBASE_* keys at all.
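The isolation and default rules above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: `resolve_config`, `PREFIXES`, and `DEFAULTS` are hypothetical names, and the concrete default values (./seekdb_data, native hybrid off for oceanbase) are guesses based on the examples in this issue.

```python
from typing import Mapping

PREFIXES = {"seekdb": "SEEKDB_", "oceanbase": "OCEANBASE_"}

# Assumed per-provider defaults; the real values belong to the implementation.
DEFAULTS = {
    "seekdb": {"path": "./seekdb_data", "enable_native_hybrid": "true"},
    "oceanbase": {"host": "127.0.0.1", "enable_native_hybrid": "false"},
}


def resolve_config(env: Mapping[str, str]) -> dict:
    """Build a provider config from env, reading only the active namespace."""
    provider = env.get("DATABASE_PROVIDER", "").strip().lower()
    if provider not in PREFIXES:
        raise ValueError(f"unsupported DATABASE_PROVIDER: {provider!r}")

    prefix = PREFIXES[provider]
    config = dict(DEFAULTS[provider])
    # Hard isolation: keys outside the active prefix are never consulted.
    for key, value in env.items():
        if key.startswith(prefix):
            config[key[len(prefix):].lower()] = value

    if provider == "seekdb":
        config["host"] = ""  # seekdb is always embedded
    else:
        if "path" in config:
            raise ValueError(
                "OCEANBASE_PATH is a seekdb setting; "
                "use DATABASE_PROVIDER=seekdb with SEEKDB_PATH instead"
            )
        if not config.get("host"):
            raise ValueError("OCEANBASE_HOST must be non-empty for a remote cluster")
    return config
```

With this shape, a stray OCEANBASE_HOST in a seekdb deployment is simply never read, while OCEANBASE_PATH under the oceanbase provider fails fast with a message that names the correct provider.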
Why this matters
- Clearer ops: operators can tell from the env keys alone which backend a deployment is using, without having to inspect OCEANBASE_HOST to decide whether it's actually a remote cluster or a misnamed embedded setup.
- Fewer foot-guns: OCEANBASE_PATH on a real OceanBase cluster currently misbehaves silently; under the proposal it becomes a loud validation error pointing at the correct provider.
- Different defaults that make sense: embedded seekdb gets the modern defaults (native hybrid on) without changing behaviour for users still pointing at older remote clusters.
- Easier docs: each block of .env.example.full documents exactly one deployment shape, so newcomers can copy a single section instead of having to mentally subtract OceanBase-specific settings.
Acceptance criteria
- seekdb is a registered provider name, independent from oceanbase, sharing the backend class.
- SEEKDB_* env vars are defined and documented for every config knob that makes sense in embedded mode.
- DATABASE_PROVIDER=seekdb ignores OCEANBASE_* env keys; DATABASE_PROVIDER=oceanbase ignores SEEKDB_*.
- DATABASE_PROVIDER=oceanbase requires a non-empty host and rejects OCEANBASE_PATH.
- .env.example / .env.example.full have separate, self-contained seekdb and oceanbase configuration blocks.
A working implementation along these lines is already on the way in PR #945; this issue is to motivate and track the change at the spec level so the design can be discussed independently of the patch.
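The first criterion (seekdb registered under its own provider name while sharing the backend class) might look roughly like the registry below. This is only a sketch: `OceanBaseVectorStore`, `PROVIDER_REGISTRY`, and `create_store` are hypothetical stand-ins, not names confirmed by the PowerMem codebase.

```python
class OceanBaseVectorStore:
    """Hypothetical stand-in for the backend class shared by both providers."""

    def __init__(self, provider: str, config: dict):
        self.provider = provider
        self.config = config


# Two independent provider names that resolve to the same backend class.
PROVIDER_REGISTRY = {
    "oceanbase": OceanBaseVectorStore,
    "seekdb": OceanBaseVectorStore,
}


def create_store(provider: str, config: dict) -> OceanBaseVectorStore:
    """Look up the backend by provider name and instantiate it."""
    try:
        backend_cls = PROVIDER_REGISTRY[provider]
    except KeyError:
        raise ValueError(f"unknown database provider: {provider!r}") from None
    return backend_cls(provider, config)


store = create_store("seekdb", {"path": "./seekdb_data"})
```

Keeping the provider name on the instance lets the shared class branch on embedded versus remote behaviour without inspecting the host value.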