perf: Add vector search stage metrics by franciscojavierarceo · Pull Request #5974 · ogx-ai/ogx

franciscojavierarceo · 2026-05-27T16:07:19Z

What changed

Adds ogx.vector_io.query_stage_duration_seconds for retrieval stage timing.
Adds ogx.vector_io.query_result_count for returned chunk counts.
Emits stage metrics from the shared vector store query path for embedding generation, backend search, and neural reranking.
Adds unit coverage for metric instrument definitions and stage emission from VectorStoreWithIndex.query_chunks.
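To make the shape of the change concrete, here is a minimal sketch of per-stage timing around a query path. Only the two metric names come from this PR. The `InMemoryRecorder` class, the `timed_stage` helper, the `stage` attribute, and the stage label strings are hypothetical stand-ins, not the actual ogx telemetry API or the real `VectorStoreWithIndex.query_chunks` signature.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Metric names taken from the PR description.
STAGE_DURATION = "ogx.vector_io.query_stage_duration_seconds"
RESULT_COUNT = "ogx.vector_io.query_result_count"


class InMemoryRecorder:
    """Hypothetical stand-in for a metrics backend; keeps observations in lists."""

    def __init__(self):
        self.observations = defaultdict(list)

    def record(self, name, value, **attributes):
        self.observations[name].append((value, attributes))


@contextmanager
def timed_stage(recorder, stage):
    """Record the wall-clock duration of one stage, even if the stage raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        recorder.record(STAGE_DURATION, time.perf_counter() - start, stage=stage)


def query_chunks(recorder, query, embed, search, rerank=None):
    """Illustrative query path: embed, search, optionally rerank, then count results."""
    with timed_stage(recorder, "embedding"):
        vector = embed(query)
    with timed_stage(recorder, "backend_search"):
        chunks = search(vector)
    if rerank is not None:
        with timed_stage(recorder, "rerank"):
            chunks = rerank(query, chunks)
    recorder.record(RESULT_COUNT, len(chunks))
    return chunks
```

Timing each stage inside a `finally` block means a failing backend search still reports how long it ran, which is usually when the number matters most.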

Why

The existing router-level retrieval duration tells us that search was slow, but not whether the regression came from embedding, backend lookup, reranking, or unexpectedly large result sets. These stage-level metrics make future RAG/search regressions easier to localize.

Review notes

This draft intentionally includes the same hot-path logging cleanup as #5972 because both changes touch src/ogx/providers/utils/memory/vector_store.py and the logging hook inspects that file. If #5972 lands first, this PR should shrink to the metric changes after updating.

Validation

git diff --check
Commit hooks passed, including ruff, ruff format, mypy, provider codegen, logging checks, and repository policy hooks.
Attempted a targeted test run with uv run pytest tests/unit/telemetry/test_vector_io_metrics.py tests/unit/rag/test_vector_store.py -q, but uv run did not start because the shell's uv is 0.5.29 and the repo requires >=0.7.0.

Add a configurable prefix-to-encoding mapping with sensible defaults for common non-OpenAI model families (llama, mistral, claude, gemma, qwen, phi, deepseek). This is the first step toward supporting compaction with non-OpenAI models.

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

Replace the 2-step resolution (admin config OR tiktoken) with a 5-step chain: per-request extra_body override, admin default, tiktoken built-in, model-family prefix mapping, character-based estimate. Explicit choices fail hard with InvalidParameterError; automatic resolution falls through.

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
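The five-step chain described in this commit message could look roughly like the sketch below. This is an illustration under stated assumptions, not the code in openai_responses.py: the function name, the `Resolution` type, the lookup tables, and the prefix-to-encoding values are all hypothetical, and `InvalidParameterError` is redefined locally as a stand-in for the real exception class.

```python
from dataclasses import dataclass
from typing import Optional


class InvalidParameterError(ValueError):
    """Local stand-in for the error type named in the commit message."""


# Stand-ins for what tiktoken would know; kept static so the sketch has no dependencies.
KNOWN_ENCODINGS = {"cl100k_base", "o200k_base"}
BUILTIN_MODEL_ENCODINGS = {"gpt-4": "cl100k_base", "gpt-4o": "o200k_base"}

# Placeholder values only: the source lists the families, not their encodings.
DEFAULT_PREFIX_ENCODINGS = {
    family: "cl100k_base"
    for family in ("llama", "mistral", "claude", "gemma", "qwen", "phi", "deepseek")
}


@dataclass(frozen=True)
class Resolution:
    encoding: Optional[str]  # None means "use the character-based estimate"
    source: str


def resolve_tokenizer(model, request_override=None, admin_default=None,
                      prefix_map=DEFAULT_PREFIX_ENCODINGS):
    # Steps 1 and 2: explicit choices are validated and fail hard.
    for source, choice in (("request", request_override), ("admin", admin_default)):
        if choice is not None:
            if choice not in KNOWN_ENCODINGS:
                raise InvalidParameterError(f"unknown encoding {choice!r} from {source}")
            return Resolution(choice, source)
    # Step 3: built-in model lookup.
    if model in BUILTIN_MODEL_ENCODINGS:
        return Resolution(BUILTIN_MODEL_ENCODINGS[model], "builtin")
    # Step 4: model-family prefix mapping.
    lowered = model.lower()
    for prefix, encoding in prefix_map.items():
        if lowered.startswith(prefix):
            return Resolution(encoding, "prefix")
    # Step 5: nothing matched, so fall through to a character-based estimate.
    return Resolution(None, "char_estimate")


def estimate_tokens_by_chars(text, chars_per_token=4):
    """Rough fallback: ceiling of len(text) / chars_per_token, at least 1."""
    return max(1, -(-len(text) // chars_per_token))
```

The asymmetry is the point of the design: a caller or admin who names an encoding gets an error if it is wrong, while automatic resolution never raises and degrades to an estimate instead.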

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

# Conflicts:
# docs/docs/providers/responses/inline_builtin.mdx
# src/ogx/providers/inline/responses/builtin/config.py
# src/ogx/providers/inline/responses/builtin/responses/openai_responses.py
# tests/unit/providers/inline/responses/builtin/responses/test_tokenizer_resolution.py

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

franciscojavierarceo added 5 commits May 11, 2026 09:38

Merge remote-tracking branch 'upstream/main'

ca18548

perf: Add vector search stage metrics.

d35b039

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

franciscojavierarceo added the RAG, python, and codex labels May 27, 2026

franciscojavierarceo changed the title ~~[codex] Add vector search stage metrics~~ perf: Add vector search stage metrics May 27, 2026

chore: Rerun CI after title fix.

8255edd

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

franciscojavierarceo wants to merge 6 commits into ogx-ai:main from franciscojavierarceo:codex/vector-search-stage-metrics
