Feature Request: Support configurable local embedding endpoints (e.g., OpenAI-compatible API) to unlock non-English language potential #16

@unfall103-debug

Description

Is your feature request related to a problem? Please describe.
First of all, thank you for creating such an elegant and mathematically rigorous memory system. The Mode B architecture aligns perfectly with the "data sovereignty" philosophy many of us value.

However, for non-English users (e.g., Chinese, Japanese, Arabic), the hardcoded local embedding model (bge-small-en-v1.5 via Transformers.js) becomes a significant bottleneck. While the mathematical layer (Fisher-Rao, Sheaf, Langevin) is language-agnostic, its effectiveness is entirely dependent on the quality of the input vectors. The English-optimized model struggles with non-English semantics, leading to a substantial degradation in retrieval quality, which undermines the otherwise brilliant mathematical core of the system.

Describe the solution you'd like
I propose an elegant and maintainable solution: Open the embedding layer to user configuration.

Instead of hardcoding the Transformers.js model, SuperLocalMemory could respect the embedding configuration in config.json when mode: "b" is active.

Specifically, if a user configures an OpenAI-compatible endpoint (like many of us do for the LLM), the system should use that for embeddings as well. For example:

```json
"embedding": {
  "provider": "openai",
  "model_name": "Qwen3-Embedding",
  "dimension": 1024,
  "api_endpoint": "http://192.168.50.140:8045/v1/embeddings",
  "api_key": "not-needed"
}
```
This approach requires zero additional per-language maintenance from you: you don't need to research, package, or optimize models for every language. You simply provide the interface, and the global community can plug in the best-in-class local models suited to their language and hardware (e.g., BAAI/bge-m3 or intfloat/multilingual-e5).
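
For concreteness, here is a minimal sketch of what the remote path could look like, assuming the endpoint follows the standard OpenAI `/v1/embeddings` request/response shape. The names (`EmbeddingConfig`, `embedRemote`) are hypothetical illustrations, not existing SuperLocalMemory APIs:

```typescript
// Hypothetical sketch: call an OpenAI-compatible /v1/embeddings endpoint.
// Assumes the standard shape: { model, input } in,
// { data: [{ embedding: number[] }, ...] } out.
interface EmbeddingConfig {
  provider: string;     // e.g. "openai"
  model_name: string;   // e.g. "Qwen3-Embedding"
  dimension: number;    // expected vector size, for validation
  api_endpoint: string; // e.g. "http://192.168.50.140:8045/v1/embeddings"
  api_key: string;      // "not-needed" for unauthenticated local servers
}

async function embedRemote(cfg: EmbeddingConfig, texts: string[]): Promise<number[][]> {
  const res = await fetch(cfg.api_endpoint, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${cfg.api_key}`,
    },
    body: JSON.stringify({ model: cfg.model_name, input: texts }),
  });
  if (!res.ok) throw new Error(`Embedding request failed: ${res.status}`);
  const json = await res.json();
  const vectors: number[][] = json.data.map((d: { embedding: number[] }) => d.embedding);
  // Guard against dimension mismatches with the configured vector store.
  if (vectors.some((v) => v.length !== cfg.dimension)) {
    throw new Error(`Expected ${cfg.dimension}-dim vectors from ${cfg.model_name}`);
  }
  return vectors;
}
```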

Describe alternatives you've considered
Modifying config.json (Current Behavior): We've tested this extensively. In Mode B the system ignores the custom `embedding.provider` and `embedding.api_endpoint` settings, falls back to the internal Transformers.js model, and reports `Ollama embedder not available (model=nomic-embed-text). Falling back.` This confirms the configuration is not being honored; a sketch of the selection order we would expect is included after this list.

Switching to Mode C: This leverages a cloud LLM for memory synthesis, which improves fact extraction but does not solve the core embedding problem. Moreover, it requires sending data to the cloud, which conflicts with the "local-first" and "data sovereignty" principles that drew many of us to this project.

Forking and Patching: This is a high-maintenance, unsustainable solution for the end-user.
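
To make the expected behavior unambiguous, here is a hedged sketch of the selection order, building on the `EmbeddingConfig`/`embedRemote` sketch above. `createEmbedder` and `loadLocalTransformersEmbedder` are invented names for illustration, not existing SuperLocalMemory functions:

```typescript
// Hypothetical Mode B selection logic: honor config.embedding when an
// OpenAI-compatible endpoint is configured, and only fall back to the
// built-in model otherwise.
type Embedder = (texts: string[]) => Promise<number[][]>;

// Stand-in for the existing Transformers.js path (bge-small-en-v1.5).
function loadLocalTransformersEmbedder(): Embedder {
  throw new Error("stub: the existing built-in embedder would be returned here");
}

function createEmbedder(cfg?: EmbeddingConfig): Embedder {
  if (cfg?.provider === "openai" && cfg.api_endpoint) {
    return (texts) => embedRemote(cfg, texts);
  }
  return loadLocalTransformersEmbedder();
}
```

The key point is only the ordering: the user's configured endpoint is tried first, and the hardcoded model becomes the fallback rather than the silent default.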

Additional context
This feature request is about unlocking the true potential of your mathematical engine. By making the embedding layer configurable, SuperLocalMemory transforms from a predominantly English-focused tool into a universal, language-agnostic memory operating system. This is a massive value proposition for the global open-source community. It allows users worldwide to fully leverage your brilliant mathematical architecture without compromising on their native language support or data privacy.
