docs(es-frozen): break-even, HA, RTT context + B-phase measurements #133

Merged: masumi-ryugo merged 3 commits into main from esfrozen-revision on Jun 19, 2026

@masumi-ryugo

Contributor

Summary

Follow-up to #131: hardens the Elasticsearch frozen-tier use-case doc for DD/technical buyers, per a detailed revision spec. Two commits (measurements first, then doc).

Part B — new measurements (all real, raw JSON in results/)

  • B1 RTT injection (rtt-injection.json): per-connection latency proxy (toxiproxy, 0/5/20/50 ms) instead of global tc netem (which would perturb a co-tenant bench). Two-sided result: analytics queries are RTT-invariant (S4 ±1 ms at every RTT); the heavy cold top-N+sort overhead grows with RTT (+7%→+70%) because its sidecar-partial GETs each cost a round-trip.
  • B2 sidecar overhead (sidecar-overhead.json): S4 issues the same backend GET count as passthrough and zero separate .s4index GETs — it folds the index into each data GET as a sidecar-partial range. No extra cold-path round-trip.
  • B3 break-even (breakeven.py + breakeven.json): parameterized (--s4-host-usd-month, --instances). Net-positive from ~11 TB (1 instance) / ~23 TB (HA 2) on standard-default; all codecs net-positive at 500 TB–1 PB even with HA.
  • B4 HA failover (ha-failover.json): 2 stateless S4 instances behind nginx, 7/7 checks pass (cold query, warm query, snapshot PUT all survive killing one instance). Surfaced + documented a real SigV4 gotcha (nginx must preserve Host).
  • B5 recompact concurrency: documented-not-tested (per the repo's own TOCTOU admission).
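
The B3 break-even logic can be sketched as a simple linear model. This is not the contents of breakeven.py; it is a minimal illustration, assuming that monthly savings scale with stored volume and that the only fixed cost is the S4 hosts. The function names and the `storage_usd_per_tb_month` parameter are hypothetical, and the figures in the usage lines are placeholders, not values from results/breakeven.json. Only `--s4-host-usd-month` and `--instances` correspond to flags named in this PR.

```python
def break_even_tb(host_usd_month: float, instances: int,
                  saved_ratio: float, storage_usd_per_tb_month: float) -> float:
    """Stored TB at which monthly storage savings equal monthly S4 host cost."""
    if not 0.0 < saved_ratio < 1.0:
        raise ValueError("saved_ratio must be a fraction between 0 and 1")
    if storage_usd_per_tb_month <= 0 or instances < 1:
        raise ValueError("storage price must be positive and instances >= 1")
    fixed_cost = host_usd_month * instances                 # USD per month for S4 hosts
    saving_per_tb = saved_ratio * storage_usd_per_tb_month  # USD saved per TB per month
    return fixed_cost / saving_per_tb


def net_saving_usd_month(stored_tb: float, host_usd_month: float, instances: int,
                         saved_ratio: float, storage_usd_per_tb_month: float) -> float:
    """Monthly net saving at a given stored volume (negative below break-even)."""
    gross = stored_tb * saved_ratio * storage_usd_per_tb_month
    return gross - host_usd_month * instances


if __name__ == "__main__":
    # Placeholder inputs for illustration only (not measured values).
    for n in (1, 2):
        tb = break_even_tb(host_usd_month=100.0, instances=n,
                           saved_ratio=0.25, storage_usd_per_tb_month=20.0)
        print(f"instances={n}: break-even at {tb:.1f} TB")
```

In a linear model like this, doubling the instance count doubles the break-even volume, which is the same shape as the ~11 TB (1 instance) versus ~23 TB (HA 2) figures reported above.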

Part A — doc changes

Break-even section (real ratios + parameterized host cost, 500 TB–1 PB); new "Availability & HA" section (S4 = read-path SPOF, mitigated by ≥2 stateless instances); recompact↔snapshot consistency caveat (HEAD→PUT TOCTOU, quiet-window); cold-latency reframed (absolute = no-RTT, transferable = relative overhead) with the RTT table; TL;DR reworded to −15–27%; honest Deepfreeze/LogsDB positioning.

All existing honesty preserved (+6.5–9.5% cold-sort, best_compression double-compression, zstd-19 slowloris PARTIAL, 4M-doc round-trip). No fabricated numbers — every figure traces to results/*.json.

🤖 Generated with Claude Code

masumi-ryugo and others added 3 commits June 19, 2026 15:24
…art B)

Follow-up measurements layered non-destructively on top of the existing A-D
phases. All measured locally against MinIO (no AWS billing), 4M docs, S4 v1.2.2.

- B1 phase_b1_rtt.{py,sh}: cold frozen-search latency under injected backend RTT
  (toxiproxy, 0/5/20/50ms one-way, direct vs S4 zstd-3). Finding: analytics
  queries are RTT-invariant (S4 +/-1ms at every RTT); the heavy cold top-N+sort
  is NOT invariant -- S4 overhead grows +7.1% -> +69.8% from 0 -> 50ms because
  its sidecar-partial GETs each cost a backend round-trip. -> results/rtt-injection.json
- B2 phase_b2_sidecar.py: .s4index sidecar cold-path overhead. S4 issues the SAME
  backend GET count as a passthrough baseline (8 vs 8) and ZERO separate
  .s4index-keyed GETs; the sidecar is folded into each data GET as a
  path="sidecar-partial" covering range. -> results/sidecar-overhead.json
- B3 breakeven.py: parameterised break-even model on the measured saved_ratio,
  HA(2) & non-HA(1), 500TB & 1PB net savings (all net-positive). -> results/breakeven.json
- B4 phase_b4_ha.{sh}/phase_b4_ha_failover.py: 2 stateless S4 instances behind an
  nginx LB; kill one -> cold/warm query + snapshot PUT all survive (7/7 PASS).
  -> results/ha-failover.json
- B5 results/recompact-concurrency.json: documented-not-tested (recompact vs ES
  snapshot/_cleanup is unsafe by the tool's own HEAD->PUT TOCTOU admission).
- results/REVISION-NOTES.md: one-page summary of what measured / what's TODO.
- README: section 5 documents the new phases + the nginx-SigV4-Host gotcha.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
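
For readers comparing the B1 figures, the relative-overhead metric can be computed per injected RTT as sketched below. This is an illustrative helper, not code from phase_b1_rtt.py; the function names and the sample latencies are hypothetical, and the layout of results/rtt-injection.json is not assumed here.

```python
def relative_overhead_pct(direct_ms: float, s4_ms: float) -> float:
    """S4 latency overhead relative to the direct baseline, in percent."""
    if direct_ms <= 0:
        raise ValueError("direct_ms must be positive")
    return (s4_ms - direct_ms) / direct_ms * 100.0


def overhead_by_rtt(samples: dict) -> dict:
    """Map each injected RTT (ms) to the overhead measured within that same run.

    `samples` maps rtt_ms -> (direct_ms, s4_ms).
    """
    return {rtt: relative_overhead_pct(direct, s4)
            for rtt, (direct, s4) in sorted(samples.items())}


if __name__ == "__main__":
    # Hypothetical latencies for illustration only (not from results/).
    samples = {0: (100.0, 110.0), 50: (400.0, 600.0)}
    for rtt, pct in overhead_by_rtt(samples).items():
        print(f"RTT {rtt} ms: overhead {pct:+.1f}%")
```

Each percentage is taken against the direct baseline at the same injected RTT, so values at different RTTs describe how the overhead grows within one run and are not comparable to a baseline from another run.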
…Part A)

Reflects the Part B measurements into the use-case doc; preserves all existing
honesty (the +6.5-9.5% cold-sort overhead, best_compression double-compression
caveat, zstd-19 slowloris PARTIAL failure, 4,000,000-doc round-trip).

- A1 Break-even: real saved_ratio + explicit parameterised $/host model, scaled
  to 500TB-1PB; asserts standard-default is net-positive above ~23 TB (HA 2x).
- A2 New "Availability & HA" section: S4 is a read-path hard dependency / SPOF
  for cold frozen search; mitigate with >=2 stateless instances behind an LB or
  multi-value DNS (sidecars in S3 -> stateless). Cites the B4 failover smoke +
  the nginx SigV4 Host-header gotcha. Recommended-config now points at the LB.
- A3 recompact<->snapshot consistency caveat: must NOT run concurrently with ES
  snapshot/_cleanup; HEAD->PUT TOCTOU silently overwrites; --older-than is a
  mitigation not a guarantee; recommend an exclusive quiet window.
- A4 Cold-latency context: absolute 2-4ms are no-RTT local-MinIO; the
  transferable metric is S4's RELATIVE overhead. Adds the B1 RTT table (analytics
  RTT-invariant; top-N+sort overhead GROWS +7.1% -> +69.8% from 0 -> 50ms) and a
  B2 note that the sidecar adds no extra cold-path backend round-trip.
- A5 TL;DR storage reworded to "-15-27% (default max; best_comp/LogsDB -15-22%)";
  -27% framed as zero-migration upside for default-codec clusters.
- A6 Honest Elastic Deepfreeze note (Glacier rotation = cheaper for truly-cold;
  S4 sweet spot = warmish-frozen kept in Standard; complementary not exclusive),
  LogsDB shelf-life caveat, and a restrained "this is ONE use case" scope note.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
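
The nginx SigV4 Host-header gotcha cited in A2 can be illustrated with a configuration fragment. This is a sketch under stated assumptions, not the configuration used in the B4 run: the upstream name, addresses, ports, and failure thresholds are hypothetical. The underlying point is that SigV4 signs the Host header, while nginx by default rewrites Host to the proxied upstream name, so the signature check fails unless the client's original Host is forwarded.

```nginx
# Hypothetical upstream of two stateless S4 instances (addresses are placeholders).
upstream s4_backends {
    # Passive health checking: stop routing to a server after a failed attempt.
    server 10.0.0.11:8080 max_fails=1 fail_timeout=5s;
    server 10.0.0.12:8080 max_fails=1 fail_timeout=5s;
}

server {
    listen 9000;

    # Assumption: snapshot PUTs exceed nginx's default 1m request-body limit.
    client_max_body_size 0;

    location / {
        proxy_pass http://s4_backends;

        # Forward the client's Host header unchanged so the SigV4
        # signature computed by the client still validates.
        proxy_set_header Host $http_host;

        # Retry on the other instance if one is down or unresponsive.
        proxy_next_upstream error timeout;
    }
}
```

Whether passive checks like these are sufficient, or active health checks are needed, depends on the deployment and should be validated in the target environment.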
- Reconcile Result 4 "every block cold" setup with the B2 op-count (analytics
  queries answered without a backend GET in that run).
- Label the RTT-injection table as a separate toxiproxy rerun whose 0ms
  baseline (+7.1%) differs from the headline +9.5% — read columns as the same
  run's overhead growing with RTT, not against the headline.
- Narrow the HA claim to "behind a health-checking load balancer that routes to
  healthy upstreams" (what B4 measured); move multi-value DNS to
  "validate in your environment" (DNS caching / pooled connections / JVM retry
  are environment-specific).
- Clarify the warm-query HA row stayed unaffected because it did not need the
  repository (not a survivor-backed read).
- Label the break-even "1 PB" row as "1 PB (1000 TB)" to disambiguate.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
masumi-ryugo merged commit 785103f into main on Jun 19, 2026
9 checks passed