Releases · irfanalidv/ragfallback

04 Apr 12:42

irfanalidv

v2.2.0

1813668

v2.2.0 — Native async retrieval + CacheMonitor Latest

What's new

Native async retrieval

AdaptiveRAGRetriever.aquery_with_fallback() is a real coroutine built on LangChain's ainvoke(). It enables truly concurrent evaluation in GoldenRunner and in production FastAPI backends, and it falls back to a thread pool automatically if the underlying model doesn't implement ainvoke.
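The fallback behaviour described above can be pictured with a small, generic sketch. It is not ragfallback's actual code: `ainvoke_or_thread`, `SyncOnlyModel`, and `AsyncModel` are hypothetical names used only for illustration, and the only assumption is that a model exposes `invoke` and, optionally, an async `ainvoke`.

```python
import asyncio
import inspect


async def ainvoke_or_thread(model, query):
    """Await model.ainvoke(query) if it exists, else run model.invoke in a thread."""
    ainvoke = getattr(model, "ainvoke", None)
    if ainvoke is not None and inspect.iscoroutinefunction(ainvoke):
        return await ainvoke(query)
    # No native coroutine available: keep the event loop free by
    # running the blocking call in the default thread pool.
    return await asyncio.to_thread(model.invoke, query)


class SyncOnlyModel:
    """Hypothetical model with a blocking API only."""

    def invoke(self, query):
        return f"sync:{query}"


class AsyncModel:
    """Hypothetical model with a native coroutine."""

    async def ainvoke(self, query):
        return f"async:{query}"


async def main():
    # Both calls are awaited together, whichever path each one takes.
    return await asyncio.gather(
        ainvoke_or_thread(SyncOnlyModel(), "q1"),
        ainvoke_or_thread(AsyncModel(), "q2"),
    )


results = asyncio.run(main())
print(results)
```

`asyncio.to_thread` requires Python 3.9 or newer, which matches the versions listed under Numbers.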

CacheMonitor

ragfallback.tracking.CacheMonitor wraps any LangChain retriever and tracks hit rate, latency by category (hit vs. miss), TTL expiry, and LRU eviction. It adds no new dependencies (stdlib only). Pass it to GoldenRunner via cache_monitor= and cache stats appear in GoldenReport alongside RAGAS scores and P95 latency.
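To make the tracked quantities concrete, here is a minimal stdlib-only sketch of a TTL + LRU cache that counts hits, misses, expiries, and evictions. It is an illustration under assumptions, not CacheMonitor's implementation: the class name `TinyCacheMonitor`, its constructor parameters, and its attribute names are all hypothetical.

```python
import time
from collections import OrderedDict


class TinyCacheMonitor:
    """Illustrative TTL + LRU cache around a lookup function, with counters."""

    def __init__(self, fetch, max_size=2, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch
        self._max_size = max_size
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = OrderedDict()  # key -> (stored_at, value)
        self.hits = self.misses = self.expired = self.evicted = 0
        self.hit_latency = []
        self.miss_latency = []

    def get(self, key):
        start = time.perf_counter()
        entry = self._entries.get(key)
        if entry is not None and self._clock() - entry[0] >= self._ttl:
            # Entry is too old: drop it and treat the lookup as a miss.
            del self._entries[key]
            self.expired += 1
            entry = None
        if entry is not None:
            self._entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            self.hit_latency.append(time.perf_counter() - start)
            return entry[1]
        value = self._fetch(key)
        self._entries[key] = (self._clock(), value)
        if len(self._entries) > self._max_size:
            self._entries.popitem(last=False)  # evict least recently used
            self.evicted += 1
        self.misses += 1
        self.miss_latency.append(time.perf_counter() - start)
        return value

    @property
    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


fake_now = [0.0]
monitor = TinyCacheMonitor(str.upper, max_size=2, ttl_seconds=10,
                           clock=lambda: fake_now[0])
monitor.get("a")   # miss
monitor.get("a")   # hit
monitor.get("b")   # miss
monitor.get("c")   # miss, evicts "a"
fake_now[0] = 11.0
monitor.get("b")   # expired, refetched as a miss
print(monitor.hit_rate, monitor.expired, monitor.evicted)
```

Keeping hit and miss latencies in separate lists is what allows a per-category latency report later on.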

GoldenRunner upgrade

run_async() now uses the native aquery_with_fallback(), so 75 queries run concurrently instead of serializing through a thread pool.
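The effect of running queries concurrently can be sketched with plain asyncio. The coroutine `fake_aquery` below is a hypothetical stand-in for an async retrieval call (it only sleeps), so this shows the scheduling pattern rather than GoldenRunner's real code.

```python
import asyncio
import time


async def fake_aquery(question):
    """Stand-in for an async retrieval call; sleeps to mimic I/O latency."""
    await asyncio.sleep(0.05)
    return f"answer to {question}"


async def run_all(questions):
    # gather schedules every coroutine on the event loop at once and
    # returns results in the same order as the inputs.
    return await asyncio.gather(*(fake_aquery(q) for q in questions))


questions = [f"q{i}" for i in range(20)]
start = time.perf_counter()
answers = asyncio.run(run_all(questions))
elapsed = time.perf_counter() - start
print(f"{len(answers)} answers in {elapsed:.2f}s")
```

Run one after another, the 20 simulated calls would take about one second in total; awaited together they finish in roughly the time of a single call.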

Install

pip install ragfallback==2.2.0

Numbers

102 unit tests passing (Python 3.9 / 3.10 / 3.11)
CI regression gate green on SQuAD golden dataset
21 new tests added this release

Full changelog

See CHANGELOG.md for complete details.

03 Apr 14:40

irfanalidv

v2.1.0

219933a

v2.1.0 — MLOps evaluation layer (RAGAS + CI regression gate + GoldenRunner)

What's new

ragfallback now ships a complete MLOps evaluation layer — something most RAG libraries don't include at all.

`ragfallback/mlops/` — new package

`GoldenRunner`

Runs your retrieval pipeline against a labeled golden dataset (JSON file or list[dict]), tracks per-sample latency, and computes recall@3, recall@5, and P95 latency across all samples. Fully async via asyncio.gather.
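As a reference point for the P95 figure, the sketch below computes a 95th percentile from per-sample latencies with the nearest-rank method. Which percentile method GoldenRunner uses is not stated in these notes, so the method and the function name `percentile_nearest_rank` are assumptions made for illustration.

```python
import math


def percentile_nearest_rank(values, pct):
    """Return the pct-th percentile using the nearest-rank definition."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Ten made-up per-sample latencies in milliseconds, one of them an outlier.
latencies_ms = [12, 15, 11, 14, 13, 90, 16, 12, 13, 14]
p95 = percentile_nearest_rank(latencies_ms, 95)
p50 = percentile_nearest_rank(latencies_ms, 50)
print(p50, p95)
```

With only ten samples the P95 lands on the single slow request, which is one reason latency gates tend to be noisier than quality gates.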

`RagasHook`

Wraps RAGAS evaluation: faithfulness, answer relevance, context precision, and context recall. Falls back to heuristic scoring if ragas is not installed, logging a warning instead of crashing.
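The optional-dependency pattern described here can be sketched as follows. The heuristic (`token_overlap`) and the function `score_faithfulness` are hypothetical and are not RagasHook's actual scoring; the sketch only shows how a missing package can be detected and replaced by a fallback with a logged warning.

```python
import importlib.util
import logging

logger = logging.getLogger("eval_sketch")


def token_overlap(answer, context):
    """Crude heuristic: share of answer tokens that also appear in the context."""
    answer_tokens = set(answer.lower().split())
    context_tokens = set(context.lower().split())
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & context_tokens) / len(answer_tokens)


def score_faithfulness(answer, context):
    """Use the heuristic when ragas is missing; warn instead of crashing."""
    if importlib.util.find_spec("ragas") is None:
        logger.warning("ragas is not installed; using heuristic scoring")
        return token_overlap(answer, context)
    # With ragas available, the real evaluation would run here; it is
    # left out of this sketch because it needs an LLM backend.
    raise NotImplementedError("plug the ragas evaluation in here")


print(token_overlap("paris is the capital", "the capital of france is paris"))
```

Checking with `importlib.util.find_spec` avoids importing the package just to learn whether it exists.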

`BaselineRegistry`

Stores metric snapshots per dataset in a committed JSON file. compare_or_fail() raises RegressionError if any quality metric drops more than 5%, or P95 latency spikes more than 12% vs the stored baseline.
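A regression check of this kind can be sketched in a few lines. This is not BaselineRegistry's code: the function name `check_against_baseline`, the metric keys, and the choice to measure change relative to the baseline value are assumptions. Only the 5% and 12% thresholds and the `RegressionError` name come from the notes above; the exception class is redefined locally so the sketch runs on its own.

```python
class RegressionError(Exception):
    """Raised when a metric regresses past its allowed threshold."""


def check_against_baseline(current, baseline, quality_threshold=0.05,
                           latency_threshold=0.12, latency_key="p95_latency_ms"):
    """Raise RegressionError if current metrics regress versus the baseline."""
    failures = []
    for name, base in baseline.items():
        if base == 0:
            continue  # relative change is undefined for a zero baseline
        change = (current[name] - base) / base
        if name == latency_key:
            # Latency regresses when it goes up.
            if change > latency_threshold:
                failures.append(f"{name} rose {change:.1%}")
        elif -change > quality_threshold:
            # Quality metrics regress when they go down.
            failures.append(f"{name} dropped {-change:.1%}")
    if failures:
        raise RegressionError("; ".join(failures))


baseline = {"recall_at_5": 0.80, "p95_latency_ms": 100.0}

# Small movements stay inside both thresholds, so nothing is raised.
check_against_baseline({"recall_at_5": 0.78, "p95_latency_ms": 110.0}, baseline)

try:
    check_against_baseline({"recall_at_5": 0.70, "p95_latency_ms": 110.0}, baseline)
except RegressionError as exc:
    print(f"regression detected: {exc}")
```

In a CI job, an uncaught `RegressionError` makes the process exit non-zero, which is how a gate like this can fail a build.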

`QuerySimulator`

Generates adversarial query mixes from any base query set:

short_keyword — first 2 content words only
long_nl — expanded with verbose instruction prefix
ambiguous — proper nouns stripped
out_of_domain — completely unrelated topic injection

simulate_unhappy_paths() produces all 4 types for every input query (4× expansion).
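The four variant types can be approximated with simple string rules, as in the sketch below. The rules here (a tiny stopword list, capitalization as a proxy for proper nouns, a fixed off-topic suffix) are assumptions chosen for illustration, and `simulate_variants` is a hypothetical name; QuerySimulator's real heuristics may differ.

```python
STOPWORDS = {"what", "is", "the", "of", "a", "an", "in", "for", "to",
             "how", "does", "do", "was", "are"}


def short_keyword(query):
    """Keep only the first two content words."""
    words = [w.strip("?.,!") for w in query.split()]
    content = [w for w in words if w and w.lower() not in STOPWORDS]
    return " ".join(content[:2])


def long_nl(query):
    """Prepend a verbose instruction prefix."""
    return "Please read carefully and answer in full detail: " + query


def ambiguous(query):
    """Drop capitalized words after the first one (rough proper-noun proxy)."""
    words = query.split()
    kept = [w for i, w in enumerate(words) if i == 0 or not w[:1].isupper()]
    return " ".join(kept)


def out_of_domain(query):
    """Inject an unrelated topic into the query."""
    return query.rstrip("?.! ") + " and also, what is a good sourdough recipe?"


def simulate_variants(queries):
    """Return (variant_type, text) pairs: four variants per input query."""
    variants = []
    for q in queries:
        variants.extend([
            ("short_keyword", short_keyword(q)),
            ("long_nl", long_nl(q)),
            ("ambiguous", ambiguous(q)),
            ("out_of_domain", out_of_domain(q)),
        ])
    return variants


for kind, text in simulate_variants(["What is the revenue of Acme in 2023?"]):
    print(f"{kind}: {text}")
```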

`MLflowLogger`

Logs all GoldenReport fields as MLflow metrics and params. No-op if mlflow is not installed.

`generate_locustfile(output_path, endpoint)`

Writes a ready-to-run Locust load test file simulating realistic RAG traffic — short keyword (40%), long NL (20%), out-of-domain (10%).

CI regression gate

A new mlops-regression-gate job runs on every push to main:

Builds golden dataset from SQuAD (CC BY-SA 4.0, no API key needed)
Indexes passages in ChromaDB using all-MiniLM-L6-v2 (local, no API key)
Runs GoldenRunner async across 20 samples
Calls compare_or_fail() against committed examples/baselines.json
Exits 0 (pass) or 1 (regression detected)

Bug fixes

recall_at_k now counts distinct relevant docs in top-k so duplicates cannot push recall above 1.0
BaselineRegistry.compare_or_fail accepts a separate latency_threshold parameter (default 0.12) for looser P95 gating in noisy CI environments
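The recall fix can be illustrated with a short sketch. The signature below is an assumption, not the library's exact API; what matters is that set intersection counts each relevant document once, so a duplicate in the top-k cannot inflate the score.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of relevant docs that appear at least once in the top-k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    # Set intersection counts each relevant doc once, even if retrieved twice.
    found = set(retrieved_ids[:k]) & relevant
    return len(found) / len(relevant)


def naive_recall_at_k(retrieved_ids, relevant_ids, k):
    """Counts every matching position, so duplicates can inflate the score."""
    relevant = set(relevant_ids)
    hits = sum(1 for doc_id in retrieved_ids[:k] if doc_id in relevant)
    return hits / len(relevant)


retrieved = ["d1", "d1", "d2", "d9"]
relevant = ["d1", "d2"]
print(naive_recall_at_k(retrieved, relevant, 3))  # 3 matches / 2 relevant
print(recall_at_k(retrieved, relevant, 3))        # 2 distinct / 2 relevant
```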

Install

pip install "ragfallback[mlops]"

python examples/build_golden_dataset.py
python examples/ci_regression_gate.py

Full changelog

See CHANGELOG.md

03 Apr 06:02

irfanalidv

v2.0.2

c8cbafc

v2.0.2 — PyPI Description Fix

ragfallback v2.0.2

What's Changed

PyPI package description updated: the long description on PyPI now accurately reflects what the library does, namely intelligent fallback mechanisms for RAG pipelines, including query variation, confidence scoring, cost tracking, and silent failure prevention.

No API changes. Safe to upgrade from v2.0.1.

Install

pip install ragfallback==2.0.2

Or with open-source extras (recommended):

pip install "ragfallback[huggingface,sentence-transformers,faiss]==2.0.2"

Quick Example

from ragfallback import AdaptiveRAGRetriever

# vector_store, llm, and embeddings are assumed to have been created beforehand.
retriever = AdaptiveRAGRetriever(
    vector_store=vector_store,
    llm=llm,
    embedding_model=embeddings,
    fallback_strategy="query_variations",
    max_attempts=3
)

result = retriever.query_with_fallback(question="What is the revenue?")
print(result.answer)       # The answer
print(result.confidence)   # 0.92
print(result.cost)         # $0.0000 (with open-source stack)

Full Changelog: https://github.com/irfanalidv/ragfallback/blob/main/CHANGELOG.md
PyPI: https://pypi.org/project/ragfallback/2.0.2/
