Logging and observability standardization across decoupled jido_* repos #244

@nshkrdotcom

This follows the current logging / test-noise investigation work and surfaces one ecosystem constraint before implementation spreads further.

Tagging @pcharbon70 and @mikehostetler for direction.

Core constraint

Not every jido_* library depends on jido core directly or transitively.

That means a helper that lives only in jido core is not automatically an ecosystem-wide solution, and adding a dependency on core purely to share a logging helper would be the wrong coupling for some repos.

What seems clear already

  1. Libraries should stay quiet by default.
  2. High-volume internal activity should primarily be represented as telemetry, not human logs.
  3. Expensive log messages should use lazy Logger evaluation.
  4. Libraries should not choose app/runtime concerns such as JSON formatter, log shipping vendor, or error sink.
  5. Test defaults should prefer :warning, with explicit capture for tests that assert info / debug output.
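
A minimal sketch of point 3, assuming a hypothetical hot-path function `LazyLogSketch.run/1` (the module and function names are illustrative, not from any jido_* repo). Passing a zero-arity function to the Logger macros defers building the message until Logger decides to emit it:

```elixir
# Make sure the Logger application is running when this is executed as a script.
{:ok, _} = Application.ensure_all_started(:logger)

defmodule LazyLogSketch do
  require Logger

  # Hypothetical hot-path function; name and arguments are illustrative only.
  def run(params) when is_map(params) do
    Logger.debug(fn ->
      # This body runs only when :debug is enabled, so the inspect cost
      # is not paid at :warning. Only keys are logged, not raw values.
      "run called with keys=#{inspect(Map.keys(params))}"
    end)

    :ok
  end
end

# Quiet default: at :warning the debug message above is never built.
Logger.configure(level: :warning)
:ok = LazyLogSketch.run(%{user_id: 1, token: "t"})
```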

Immediate fixes that do not need to wait

These can move in parallel regardless of the helper/package decision:

  1. lower the log level of noisy per-operation logs in upstream repos that currently emit them at elevated levels by default
  2. stop logging full params/context on hot paths by default
  3. tighten test-time logger defaults and fix tests explicitly where lower-level logs are asserted
  4. standardize lazy logging and safe inspect usage in repos being cleaned up now
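
For item 3, one possible shape of an explicit opt-in test, sketched under assumptions: the test module name is hypothetical, and the `config :logger, level: :warning` fragment is shown only as a comment. As far as I understand `ExUnit.CaptureLog`, capturing does not override the global Logger level, so a test asserting debug output has to raise the level itself and restore it afterwards:

```elixir
# config/test.exs (fragment, assumed): keep the suite quiet unless a test opts in.
#   import Config
#   config :logger, level: :warning

# In a Mix project this call lives in test/test_helper.exs as ExUnit.start().
ExUnit.start(autorun: false)

defmodule QuietLoggingTest do
  # async: false because the test changes the global Logger level.
  use ExUnit.Case, async: false

  import ExUnit.CaptureLog
  require Logger

  setup do
    previous = Logger.level()
    Logger.configure(level: :debug)
    # Restore the quiet default once the test finishes.
    on_exit(fn -> Logger.configure(level: previous) end)
    :ok
  end

  test "debug output is asserted through explicit capture" do
    log = capture_log(fn -> Logger.debug("cache warmed") end)
    assert log =~ "cache warmed"
  end
end
```

Changing the global level is the bluntest option; a per-module level would be narrower, but which one the convention should prefer is an open choice.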

Proposed path

Now

  1. standardize the policy first
  2. define a tiny canonical helper API that repos can implement locally while the API settles
  3. define the telemetry event taxonomy and metadata namespace
  4. define a starter redaction/truncation policy
  5. use a lightweight CI guard first, for example an rg-based check for eager interpolated Logger calls
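
Item 5 mentions an rg-based check; the same heuristic can be sketched in Elixir to show what such a guard would look for. This is a swapped-in illustration, not the proposed tool: `EagerLoggerCheck` is a hypothetical module, and the regex only catches single-line calls whose first argument is an interpolated string literal.

```elixir
defmodule EagerLoggerCheck do
  # Heuristic pattern: a Logger.<level>( call followed by a string literal
  # that contains an interpolation. Multi-line and aliased calls are missed.
  defp pattern do
    ~r/Logger\.(debug|info|notice|warning|error)\(\s*"[^"]*#[{]/
  end

  # Returns {line, line_number} tuples for suspicious lines in one source string.
  def offenders(source) when is_binary(source) do
    source
    |> String.split("\n")
    |> Enum.with_index(1)
    |> Enum.filter(fn {line, _number} -> Regex.match?(pattern(), line) end)
  end

  # Scans files matching a glob; the default glob is an assumption.
  def scan(glob \\ "lib/**/*.ex") do
    for path <- Path.wildcard(glob),
        {line, number} <- offenders(File.read!(path)),
        do: {path, number, String.trim(line)}
  end
end

sample = ~S"""
Logger.info("user #{inspect(user)} logged in")
Logger.info(fn -> "user logged in" end)
"""

# Only the first sample line is flagged.
[{_line, 1}] = EagerLoggerCheck.offenders(sample)
```

A CI step could fail the build when `EagerLoggerCheck.scan/1` returns a non-empty list.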

Suggested local helper API shape:

  • debug(fun, metadata \\ [])
  • info(fun, metadata \\ [])
  • warning(fun, metadata \\ [])
  • error(fun, metadata \\ [])
  • safe_inspect(term, opts \\ [])
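
A rough sketch of what a local module with this shape might look like. `MyLib.Log`, the `:max_length` option and its defaults are assumptions made for illustration; only the function names and arities come from the list above.

```elixir
defmodule MyLib.Log do
  @moduledoc false
  require Logger

  # Each function takes a zero-arity fun so the message is built lazily.
  def debug(fun, metadata \\ []) when is_function(fun, 0), do: Logger.debug(fun, metadata)
  def info(fun, metadata \\ []) when is_function(fun, 0), do: Logger.info(fun, metadata)
  def warning(fun, metadata \\ []) when is_function(fun, 0), do: Logger.warning(fun, metadata)
  def error(fun, metadata \\ []) when is_function(fun, 0), do: Logger.error(fun, metadata)

  # Bounded inspect: limits collection size and truncates the rendered text.
  # The :limit and :max_length defaults are placeholders, not agreed values.
  def safe_inspect(term, opts \\ []) do
    max_length = Keyword.get(opts, :max_length, 200)
    text = inspect(term, limit: Keyword.get(opts, :limit, 20))

    if String.length(text) > max_length do
      String.slice(text, 0, max_length) <> "..."
    else
      text
    end
  rescue
    # A failing Inspect implementation should not crash the caller.
    _error -> "#<uninspectable term>"
  end
end

# The :component metadata key is hypothetical.
MyLib.Log.debug(fn -> "state=#{MyLib.Log.safe_inspect(%{a: 1})}" end, component: :worker)
```

One trade-off to settle in the ADR: plain functions like these report the helper module as the call site in Logger metadata, whereas a macro-based helper would preserve the caller's module and line.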

Suggested starter redaction policy:

  • always redact keys such as api_key, token, secret, password, authorization, credentials
  • truncate long binaries / strings to a bounded length
  • never log raw params or raw context maps by default on hot paths
  • log keys / counts / sizes instead unless a repo explicitly opts into verbose debug output
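
The policy above could be sketched roughly as follows. `RedactionSketch`, the 64-character bound and the `"[REDACTED]"` placeholder are assumptions; the key list is the one from the first bullet. Keyword lists and structs are deliberately not handled here.

```elixir
defmodule RedactionSketch do
  @sensitive_keys ~w(api_key token secret password authorization credentials)
  # Placeholder bound; the real limit is for the policy to decide.
  @max_length 64
  @redacted "[REDACTED]"

  # Plain maps: redact sensitive keys, recurse into the remaining values.
  def redact(%{} = map) when not is_struct(map) do
    Map.new(map, fn {key, value} ->
      if sensitive?(key), do: {key, @redacted}, else: {key, redact(value)}
    end)
  end

  def redact(list) when is_list(list), do: Enum.map(list, &redact/1)

  # Long strings are cut to a bounded length.
  def redact(binary) when is_binary(binary) do
    if String.length(binary) > @max_length do
      String.slice(binary, 0, @max_length) <> "...(truncated)"
    else
      binary
    end
  end

  def redact(other), do: other

  # Hot-path alternative: log the shape of the data, not the data itself.
  def summarize(%{} = map) do
    %{keys: map |> Map.keys() |> Enum.sort(), size: map_size(map)}
  end

  # Matches atom or string keys case-insensitively against the key list.
  defp sensitive?(key) when is_atom(key) or is_binary(key) do
    normalized = key |> to_string() |> String.downcase()
    normalized in @sensitive_keys
  end

  defp sensitive?(_key), do: false
end

RedactionSketch.redact(%{"Authorization" => "Bearer abc", user: %{password: "p", name: "n"}})
RedactionSketch.summarize(%{params: %{}, context: %{}})
```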

Later, if the team wants it

After the API has been proven across a few repos, extract a thin shared infra package such as jido_observe / jido_telemetry if that still looks worthwhile.

That package should stay domain-light and avoid circular dependencies on higher-level Jido libraries.

Questions to unblock the work

  1. Do we want to standardize by written policy first and defer the shared package decision, or do we want to create a thin shared package now?
  2. Is it acceptable for the current cleanup work to use small local helper modules in repos under cleanup, as long as they share the same API and semantics?
  3. Do we want to reserve a jido_* metadata namespace now for cross-repo consistency, even before extraction?
  4. Which repo should own the first-pass ADR / convention doc for logging + telemetry taxonomy?

One related architectural point

Logging cleanup can proceed now, but the ADR should also make an explicit call on async trace propagation across BEAM process boundaries:

  • preserve parent context
  • explicitly propagate it
  • or intentionally drop it at certain boundaries

That should be an explicit contract, not an accidental behavior.
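
To make the difference concrete, here is a small sketch that uses Logger metadata as a stand-in for trace context. The module name and the `:trace_id` key are hypothetical. Logger metadata is per-process, so a spawned task starts without it unless the parent passes it along:

```elixir
defmodule ContextPropagationSketch do
  require Logger

  # Explicit propagation: capture the parent's metadata before spawning
  # and re-apply it inside the child process.
  def async_with_context(fun) when is_function(fun, 0) do
    parent_metadata = Logger.metadata()

    Task.async(fn ->
      Logger.metadata(parent_metadata)
      fun.()
    end)
  end

  # Intentional drop: the child starts with empty Logger metadata.
  def async_without_context(fun) when is_function(fun, 0) do
    Task.async(fun)
  end
end

Logger.metadata(trace_id: "abc123")
read_trace_id = fn -> Logger.metadata()[:trace_id] end

propagated = read_trace_id |> ContextPropagationSketch.async_with_context() |> Task.await()
dropped = read_trace_id |> ContextPropagationSketch.async_without_context() |> Task.await()
# propagated is "abc123", dropped is nil
```

Whichever behavior the ADR picks for a given boundary, wrapping the spawn in a named function like this makes the choice visible at the call site.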

I have working notes/docs locally already and can turn those into a tighter ADR draft once there is direction on the questions above.

Labels: parked (Worth keeping open, not worth acting on now)
