Provide a dedicated seekdb configuration namespace (separate from oceanbase) #946

@happyhust

Problem

Today, configuring embedded seekdb in PowerMem means using the OceanBase config surface (DATABASE_PROVIDER=oceanbase plus OCEANBASE_HOST= left empty, with OCEANBASE_PATH=./seekdb_data, etc.). seekdb and a remote OceanBase cluster share most of the configuration — same SQL surface, same backend class — but they are not 100% identical:

  • seekdb is always embedded; it never makes sense for it to take an OCEANBASE_HOST.
  • The on-disk data directory (OCEANBASE_PATH today) is a seekdb-only concept; it is meaningless for a remote OceanBase cluster.
  • Some toggles have different sensible defaults — e.g. the native hybrid-search SQL extension is available out of the box in current seekdb releases but not in older OceanBase versions, so the recommended default differs.
  • Connection pool tuning (OCEANBASE_POOL_RECYCLE, OCEANBASE_POOL_PRE_PING) is meaningful for a remote cluster but is effectively a no-op for the embedded NullPool path.

Forcing both backends through the same OCEANBASE_* env namespace makes the .env ambiguous: a user cannot tell from the keys alone whether the deployment is embedded or remote, and a setting that is valid for one mode (e.g. OCEANBASE_PATH) silently misbehaves under the other.
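
To make the ambiguity concrete, here is a minimal sketch. It assumes the deployment shape is inferred purely from whether OCEANBASE_HOST is empty, as described above; `infer_mode` and the host address are hypothetical and are not taken from the PowerMem code base.

```python
from typing import Mapping


def infer_mode(env: Mapping[str, str]) -> str:
    """Hypothetical helper: the deployment shape is implied by OCEANBASE_HOST alone."""
    return "remote" if env.get("OCEANBASE_HOST", "").strip() else "embedded"


embedded_env = {
    "DATABASE_PROVIDER": "oceanbase",
    "OCEANBASE_HOST": "",
    "OCEANBASE_PATH": "./seekdb_data",
}
remote_env = {
    "DATABASE_PROVIDER": "oceanbase",
    "OCEANBASE_HOST": "10.0.0.5",       # hypothetical address
    "OCEANBASE_PATH": "./seekdb_data",  # left over from an embedded setup
}

# Both files expose exactly the same keys; only one value tells them apart.
same_keys = embedded_env.keys() == remote_env.keys()
print(same_keys, infer_mode(embedded_env), infer_mode(remote_env))
```

Nothing in the key names distinguishes the two deployments, which is the gap the proposal below is meant to close.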

Proposal

Expose seekdb as a first-class database provider with its own configuration namespace, fully separated from the OceanBase namespace:

  1. Register seekdb as a distinct provider name (in addition to oceanbase), even though it can share the backend class with OceanBase internally.
  2. Introduce a SEEKDB_* env var namespace parallel to OCEANBASE_*:
    • SEEKDB_PATH, SEEKDB_DATABASE, SEEKDB_COLLECTION
    • SEEKDB_INDEX_TYPE, SEEKDB_VECTOR_METRIC_TYPE, SEEKDB_EMBEDDING_MODEL_DIMS
    • SEEKDB_TEXT_FIELD, SEEKDB_VECTOR_FIELD, SEEKDB_PRIMARY_FIELD, SEEKDB_METADATA_FIELD, SEEKDB_VIDX_NAME
    • SEEKDB_POOL_RECYCLE, SEEKDB_POOL_PRE_PING
    • SEEKDB_INCLUDE_SPARSE, SEEKDB_ENABLE_NATIVE_HYBRID
  3. Hard namespace isolation — when DATABASE_PROVIDER=seekdb, only SEEKDB_* keys are read; when DATABASE_PROVIDER=oceanbase, only OCEANBASE_* keys are read. No cross-namespace fallbacks, no implicit defaulting from the wrong provider's settings.
  4. Provider-specific defaults that reflect the deployment shape:
    • seekdb: empty host (= embedded), on-disk path, native hybrid SQL extension enabled by default.
    • oceanbase: non-empty host required (default 127.0.0.1 as a hint), OCEANBASE_PATH rejected as a configuration error since it is a seekdb concept.
  5. Split .env.example documentation so the seekdb and oceanbase blocks are independent, with each variable annotated with its purpose, recommended value, and alternatives. Pure seekdb .env files no longer contain OCEANBASE_* keys at all.
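
Points 3 and 4 could be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation in the PR: `load_db_config`, `ConfigError`, the `DEFAULTS` table, the default seekdb path, and the choice of "false" as the OceanBase default for native hybrid are all hypothetical.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical per-provider defaults; only the knobs called out above are shown.
DEFAULTS = {
    "seekdb": {"PATH": "./seekdb_data", "ENABLE_NATIVE_HYBRID": "true"},
    "oceanbase": {"HOST": "127.0.0.1", "ENABLE_NATIVE_HYBRID": "false"},
}


class ConfigError(ValueError):
    """Raised when the environment does not match the selected provider."""


def load_db_config(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    env = os.environ if env is None else env
    provider = env.get("DATABASE_PROVIDER", "").strip().lower()
    if provider not in DEFAULTS:
        raise ConfigError(f"unknown DATABASE_PROVIDER: {provider!r}")

    # A data directory is a seekdb concept; fail loudly instead of ignoring it.
    if provider == "oceanbase" and "OCEANBASE_PATH" in env:
        raise ConfigError(
            "OCEANBASE_PATH is not valid for a remote cluster; "
            "use DATABASE_PROVIDER=seekdb with SEEKDB_PATH"
        )

    # Hard isolation: only keys carrying the selected provider's prefix are read.
    prefix = provider.upper() + "_"
    config = dict(DEFAULTS[provider])
    for key, value in env.items():
        if key.startswith(prefix):
            config[key[len(prefix):]] = value

    if provider == "seekdb":
        config["HOST"] = ""  # seekdb is always embedded
    elif not config.get("HOST", "").strip():
        raise ConfigError("OCEANBASE_HOST must be non-empty")
    config["PROVIDER"] = provider
    return config


cfg = load_db_config({
    "DATABASE_PROVIDER": "seekdb",
    "SEEKDB_PATH": "./data",
    "OCEANBASE_HOST": "10.0.0.5",  # wrong namespace, so it is never read
})
print(cfg["PATH"], repr(cfg["HOST"]), cfg["ENABLE_NATIVE_HYBRID"])
```

Deriving the prefix from the provider name keeps the isolation rule in one place, so there is no code path through which a key from the other namespace can leak in as a fallback.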

Why this matters

  • Clearer ops: operators can tell from the env keys alone which backend a deployment is using, without having to inspect OCEANBASE_HOST to decide whether it is actually a remote cluster or a misnamed embedded setup.
  • Fewer foot-guns: OCEANBASE_PATH on a real OceanBase cluster is currently accepted silently, which is confusing; under the proposal it becomes a loud validation error pointing at the correct provider.
  • Different defaults that make sense: embedded seekdb gets the modern defaults (native hybrid on) without changing behaviour for users still pointing at older remote clusters.
  • Easier docs: each block of .env.example.full documents exactly one deployment shape, so newcomers can copy a single section instead of having to mentally subtract OceanBase-specific settings.

Acceptance criteria

  • seekdb is a registered provider name, independent from oceanbase, sharing the backend class.
  • SEEKDB_* env vars are defined and documented for every config knob that makes sense in embedded mode.
  • DATABASE_PROVIDER=seekdb ignores OCEANBASE_* env keys; DATABASE_PROVIDER=oceanbase ignores SEEKDB_*.
  • DATABASE_PROVIDER=oceanbase requires a non-empty host and rejects OCEANBASE_PATH.
  • .env.example / .env.example.full have separate, self-contained seekdb and oceanbase configuration blocks.
  • Defaults differ where the deployment shapes differ (e.g. native hybrid).
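
One possible way to check the self-contained-blocks criterion mechanically is a small linter over an env file's text. The sketch below is hypothetical: `parse_env` and `foreign_keys` are illustrative names, and the parser only handles plain KEY=VALUE lines, not quoting or `export` prefixes.

```python
from typing import Dict, List

NAMESPACES = {"seekdb": "SEEKDB_", "oceanbase": "OCEANBASE_"}


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    result: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        result[key.strip()] = value.strip()
    return result


def foreign_keys(text: str) -> List[str]:
    """Return keys that belong to a provider namespace other than the selected one."""
    env = parse_env(text)
    provider = env.get("DATABASE_PROVIDER", "").lower()
    if provider not in NAMESPACES:
        raise ValueError(f"unknown DATABASE_PROVIDER: {provider!r}")
    others = [p for name, p in NAMESPACES.items() if name != provider]
    return sorted(k for k in env if any(k.startswith(p) for p in others))


sample = """\
# embedded deployment
DATABASE_PROVIDER=seekdb
SEEKDB_PATH=./seekdb_data
OCEANBASE_HOST=
"""
print(foreign_keys(sample))  # ['OCEANBASE_HOST']
```

A check like this could run against each provider block of the example files, so a stray OCEANBASE_* key in a seekdb block is caught before it reaches users.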

A working implementation along these lines is already on the way in PR #945; this issue is to motivate and track the change at the spec level so the design can be discussed independently of the patch.
