fix(embedder): use native fetch for Ollama + external AbortSignal propagation #389
jlin53882 wants to merge 3 commits into CortexReach:master
Conversation
…works

Root cause: the OpenAI SDK HTTP client does not reliably abort Ollama TCP connections when `AbortController.abort()` fires in Node.js. This causes stalled sockets that hang until the gateway-level 120s timeout.

Fix: add `isOllamaProvider()` to detect `localhost:11434` endpoints, and `embedWithNativeFetch()` using Node.js 18+ native fetch instead of the OpenAI SDK. Native fetch properly closes TCP connections on abort.

Added Test 8 (`testOllamaAbortWithNativeFetch`) to the cjk-recursion-regression test suite. Also added a standalone test (`pr354-standalone.mjs`) and a 30-iteration stress test (`pr354-30iter.mjs`).

Fixes CortexReach#361.
…estOllamaAbort regression
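The endpoint detection described in the commit message could look something like the following sketch. This is hypothetical code based only on the description above; `isOllamaEndpoint` is an illustrative name, not necessarily how the PR's actual `isOllamaProvider()` is implemented:

```typescript
// Hypothetical sketch of detecting a local Ollama endpoint, assuming the
// embedder is configured with a baseURL string. Not the PR's actual code.
function isOllamaEndpoint(baseURL: string): boolean {
  try {
    const url = new URL(baseURL);
    const localHost = url.hostname === "localhost" || url.hostname === "127.0.0.1";
    // Ollama serves its API on port 11434 by default.
    return localHost && url.port === "11434";
  } catch {
    return false; // malformed URL: let the regular SDK path handle it
  }
}

console.log(isOllamaEndpoint("http://localhost:11434/v1")); // true
console.log(isOllamaEndpoint("https://api.openai.com/v1")); // false
```

Keying the check on hostname and port (rather than substring matching on the URL) avoids false positives such as a remote proxy whose path happens to contain `11434`.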
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e8a7bfe1ec
```ts
if (this.isOllamaProvider()) {
  try {
    return await this.embedWithNativeFetch(payload, signal);
```
Preserve timeout behavior on Ollama batch embeddings

Routing all Ollama traffic through `embedWithNativeFetch` here drops the SDK's timeout safety for calls that do not provide a signal. In this commit, `embedBatchQuery`/`embedBatchPassage` still default to `signal === undefined`, so batch Ollama requests now run on raw fetch with no built-in timeout and can hang on stalled sockets instead of failing fast. This is a regression for batch embedding flows (e.g., bulk ingestion paths) that previously relied on SDK-level request timeouts.
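One way the review concern above could be addressed (a sketch, not code from this PR; it assumes Node 20+, where `AbortSignal.timeout()` and `AbortSignal.any()` are available, and `DEFAULT_TIMEOUT_MS`/`effectiveSignal` are illustrative names) is to synthesize a default timeout signal whenever the caller omits one:

```typescript
// Sketch: guarantee a timeout even when the caller passes no signal.
// DEFAULT_TIMEOUT_MS and effectiveSignal are hypothetical names; Node 20+
// is assumed for AbortSignal.timeout() and AbortSignal.any().
const DEFAULT_TIMEOUT_MS = 120_000;

function effectiveSignal(
  external?: AbortSignal,
  timeoutMs: number = DEFAULT_TIMEOUT_MS,
): AbortSignal {
  const timeout = AbortSignal.timeout(timeoutMs);
  // Merge: abort when either the caller aborts or the timeout elapses.
  return external ? AbortSignal.any([external, timeout]) : timeout;
}
```

Calling the fetch path as `embedWithNativeFetch(payload, effectiveSignal(signal))` would then give batch flows without a caller-provided signal a built-in timeout, restoring the fail-fast behavior the reviewer describes.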
Summary
Fixes Issue #361: Ollama embedding requests hang indefinitely because the OpenAI SDK's HTTP client does not reliably abort TCP connections when `AbortController.abort()` fires.

Root Cause

When `AbortController.abort()` fires for an Ollama embedding request, the OpenAI SDK's HTTP client keeps the TCP socket open until the OS-level timeout (~120s). This stalls the entire `before_prompt_build` hook and causes `auto-recall` to time out.

Solution
1. `src/embedder.ts` — Ollama native fetch path

Added `isOllamaProvider()` and `embedWithNativeFetch()`. `embedWithRetry()` routes Ollama requests to `embedWithNativeFetch()` instead of the OpenAI SDK.

2. `src/embedder.ts` — external `AbortSignal` propagation

`withTimeout()` now accepts and merges an optional `externalSignal`. All public APIs (`embedQuery`, `embedPassage`, `embedBatchQuery`, `embedBatchPassage`) now accept `signal?: AbortSignal`.

3. `test/embedder-ollama-abort.test.mjs` — regression test

Verifies that abort actually interrupts a slow server response:
- a mock server on `127.0.0.1:11434` delays its response by 5 seconds
- the `AbortSignal` fires at 2 seconds
- Result:
aborted in 2029ms < 5000ms threshold ✅

Review Feedback Addressed
The reviewer correctly noted that the original `testOllamaAbortWithNativeFetch()` hardcoded `127.0.0.1:11434` instead of using `withServer`'s port, so it always hit "connection refused" without exercising the slow handler. Fixed in this commit.

Testing
```shell
node --test test/embedder-ollama-abort.test.mjs
# ✔ Ollama embedWithNativeFetch aborts slow request within expected time (2029.86ms)
```

Checklist
- [x] `isOllamaProvider()` correctly detects `localhost:11434` / `127.0.0.1:11434`
- [x] `embedWithNativeFetch()` uses native fetch and respects `AbortSignal`
- [x] External `AbortSignal` merges with the internal timeout controller
- [x] All public APIs accept `signal?: AbortSignal`
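The regression test described above can be sketched roughly as follows (assuming Node 18+ native fetch; the port, delay, and abort timings here are illustrative and this is not the PR's actual test code):

```typescript
import { createServer } from "node:http";

// Sketch: a mock "Ollama" server that stalls its response, plus a native-fetch
// request that must abort well before the stall ends.
const server = createServer((_req, res) => {
  // Stall for 1000 ms before answering, simulating a slow embedding call.
  setTimeout(() => {
    if (res.destroyed) return; // client already aborted; nothing to send
    res.writeHead(200, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ data: [] }));
  }, 1000).unref();
});

await new Promise<void>((resolve) => server.listen(0, "127.0.0.1", resolve));
const { port } = server.address() as { port: number };

const controller = new AbortController();
setTimeout(() => controller.abort(), 200); // abort long before the 1000 ms stall

const started = Date.now();
let aborted = false;
try {
  await fetch(`http://127.0.0.1:${port}/api/embed`, {
    method: "POST",
    body: JSON.stringify({ input: "hello" }),
    signal: controller.signal,
  });
} catch (err) {
  // Native fetch rejects with an AbortError once the signal fires.
  aborted = (err as Error).name === "AbortError";
}
const elapsed = Date.now() - started;
server.close();

console.log({ aborted, elapsed }); // expect aborted: true, elapsed well under 1000 ms
```

The key assertion is on elapsed time: if the abort were ignored (the failure mode this PR fixes for the SDK path), the request would only return after the full server-side stall.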