S4 supports four SSE modes (table below). The Range GET fast-path
introduced in v0.9 #106 partial-fetches only the enclosing encrypted
chunks for a given byte range instead of pulling the full body — but it
only works for SSE-S4 chunked (--sse-chunk-size > 0, S4E6 wire
envelope). The other three modes fall back to the v0.8.12 #120 buffered
path (full decrypt → frame-parse → slice).

| SSE mode | CLI flag | Wire envelope | Range GET fast-path? |
|---|---|---|---|
| SSE-S4 chunked (default since v0.8 #52) | `--sse-s4-key <path>` + `--sse-chunk-size 1048576` (default) | S4E6 | ✅ partial-fetch via v3 sidecar |
| SSE-S4 buffered (back-compat) | `--sse-s4-key <path>` + `--sse-chunk-size 0` | S4E2 | ❌ buffered fallback |
| SSE-C (customer-provided key) | per-request `x-amz-server-side-encryption-customer-*` headers | S4E3 | ❌ buffered fallback |
| SSE-KMS (envelope, per-object DEK) | `--kms-local-dir <dir>` (or `--features aws-kms`) | S4E4 | ❌ buffered fallback |
| Multipart with any SSE | (any of the above on a multipart PUT) | per-part S4Ex | ❌ no sidecar emitted (v0.8.16 #151) |

Why only chunked SSE-S4? Non-chunked envelopes (S4E2 / S4E3 /
S4E4) wrap the entire body under one AES-256-GCM authentication tag.
AEAD decrypt is only defined over the full ciphertext + AAD + tag
triple (there is no "verify just the prefix" mode), so partial
plaintext cannot be exposed without fetching and tag-verifying the
whole body. This is the AEAD security contract, not a deferred
optimization. The S4E6 chunked envelope (v0.8 #52, refined in
v0.8.1 #57) explicitly slices the plaintext into fixed-size chunks
and emits one tag per chunk with a nonce derived from a per-PUT
salt + chunk index, which is what makes chunk-aligned partial
decrypt well-defined. The full per-mode walkthrough lives in
security/sse-partial-fetch-constraint.md.
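
The chunk-aligned arithmetic behind the fast-path can be sketched in a few lines of Python. This is an illustration only, not S4's implementation: `ChunkPlan`, `plan_range`, and `decrypted_upper_bound` are hypothetical names, the range is inclusive as in an HTTP `Range` header, and the actual S4E6 byte layout (header, salt, tag placement) is left out because it is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkPlan:
    first_chunk: int  # index of the first chunk that must be fetched and decrypted
    last_chunk: int   # index of the last chunk that must be fetched and decrypted
    skip: int         # plaintext bytes to drop from the front of the first chunk
    length: int       # plaintext bytes to return to the client


def plan_range(start: int, end: int, chunk_size: int) -> ChunkPlan:
    """Map an inclusive plaintext byte range onto its enclosing chunks."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive (0 means buffered mode)")
    if start < 0 or end < start:
        raise ValueError("invalid byte range")
    first = start // chunk_size
    last = end // chunk_size
    return ChunkPlan(first, last, start - first * chunk_size, end - start + 1)


def decrypted_upper_bound(plan: ChunkPlan, chunk_size: int) -> int:
    """Upper bound on decrypted plaintext (the final chunk may be shorter)."""
    return (plan.last_chunk - plan.first_chunk + 1) * chunk_size
```

With the 1 MiB default, `plan_range(1_500_000, 2_200_000, 1_048_576)` touches chunks 1 and 2 only, so at most 2 MiB is decrypted regardless of the object's total size.
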

Operator recommendation: for Range-GET-heavy workloads on large objects (parquet / ORC footer reads, video segment seeks, log-line slice reads) where SSE is required, keep the data on SSE-S4 chunked to retain the fast-path. The 1 MiB default chunk size matches the typical parquet row-group read pattern; smaller chunks give finer-grained partial fetch at the cost of more tag overhead, while larger chunks store fewer tag bytes but decrypt more unneeded data per Range GET.

```bash
s4-server \
  --sse-s4-key /etc/s4/sse.key \
  --sse-chunk-size 1048576 \
  ...
```

If SSE-KMS or SSE-C is required by your key-management posture,
either accept the buffered Range GET cost or restructure the data
into smaller objects so the buffered fetch is bounded. Chunked-KMS
(provisional S4E7) and chunked-SSE-C (provisional S4E8)
envelopes are v0.11+ roadmap candidates, not promised features.

- `/health`: liveness probe, always 200 OK
- `/ready`: readiness probe, runs `ListBuckets` against the backend
- `/metrics`: Prometheus text format (`s4_requests_total{op,codec,result}`, `s4_bytes_in_total`, `s4_bytes_out_total`, `s4_request_latency_seconds`, `s4_policy_denials_total{action,bucket}`)
- Structured JSON logs (`--log-format json`) with per-request fields: `op`, `bucket`, `key`, `codec`, `bytes_in`, `bytes_out`, `ratio`, `latency_ms`, `ok`
- OpenTelemetry traces (`--otlp-endpoint http://collector:4317`): each PUT/GET is emitted as an `s4.put_object` / `s4.get_object` span with semantic attributes; export to Jaeger / Tempo / Grafana / AWS X-Ray.
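
As a rough illustration of consuming the `/metrics` output, the Python sketch below sums one counter by a chosen label. The helper name `sum_by_label` is hypothetical, and the sample payload is made up for the example (the label values `get`, `put`, `zstd`, `ok`, and `err` are assumptions, not captured from a running server). The parser handles only simple lines; label values containing `}` or escaped quotes are out of scope.

```python
import re
from collections import defaultdict

# One exposition line: name, optional {labels}, value, optional timestamp.
_SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>[^}]*)\})?'
    r'\s+(?P<value>\S+)(?:\s+-?\d+)?$'
)
_LABEL = re.compile(r'(\w+)="([^"]*)"')

# Made-up values for illustration only; not real server output.
SAMPLE = """\
# TYPE s4_requests_total counter
s4_requests_total{op="get",codec="zstd",result="ok"} 10
s4_requests_total{op="put",codec="zstd",result="ok"} 5
s4_requests_total{op="get",codec="zstd",result="err"} 1
s4_bytes_in_total 2048
"""


def sum_by_label(text: str, metric: str, label: str) -> dict:
    """Sum every sample of `metric`, grouped by the value of `label`."""
    totals = defaultdict(float)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        m = _SAMPLE_LINE.match(line)
        if m is None or m.group("name") != metric:
            continue
        labels = dict(_LABEL.findall(m.group("labels") or ""))
        totals[labels.get(label, "")] += float(m.group("value"))
    return dict(totals)
```

Calling `sum_by_label(SAMPLE, "s4_requests_total", "result")` groups the request counter by outcome, which is the kind of aggregation an error-rate alert would start from.
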
- Native HTTPS / TLS (v0.2): `--tls-cert` / `--tls-key` for direct termination via `tokio-rustls` + `ring`; ALPN advertises `h2` then `http/1.1`. No reverse-proxy required for HTTPS deployments.
- Bucket policy enforcement at the gateway (v0.2): `--policy <path>` accepts an AWS-style bucket policy JSON; every PUT / GET / DELETE / List / Copy / UploadPartCopy is evaluated with explicit Deny > explicit Allow > implicit Deny semantics (matches AWS). Subset: `Effect`, `Action` (e.g. `s3:GetObject` / `s3:*`), `Resource` with glob, `Principal` (SigV4 access-key match). Denials are bumped on `s4_policy_denials_total{action,bucket}`.
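
The evaluation order described above (explicit Deny > explicit Allow > implicit Deny) can be sketched as follows. This is a simplified Python illustration, not the gateway's code: the `evaluate` function and its return strings are hypothetical, `Principal` matching is omitted, and `fnmatchcase` stands in for whatever glob matcher S4 actually uses (it is case-sensitive and also interprets `?` and `[...]`).

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """Policy fields may be a single string or a list of strings."""
    return value if isinstance(value, list) else [value]


def _matches(stmt: dict, action: str, resource: str) -> bool:
    return any(fnmatchcase(action, p) for p in _as_list(stmt["Action"])) and any(
        fnmatchcase(resource, p) for p in _as_list(stmt["Resource"])
    )


def evaluate(policy: dict, action: str, resource: str) -> str:
    """Return 'Allow', 'ExplicitDeny', or 'ImplicitDeny' for one request."""
    allowed = False
    for stmt in policy.get("Statement", []):
        if not _matches(stmt, action, resource):
            continue
        if stmt["Effect"] == "Deny":
            return "ExplicitDeny"  # a matching Deny always wins
        if stmt["Effect"] == "Allow":
            allowed = True         # remember, but keep scanning for a Deny
    return "Allow" if allowed else "ImplicitDeny"


# Example policy (bucket and prefix names are invented for the sketch).
POLICY = {
    "Statement": [
        {"Effect": "Allow", "Action": "s3:*", "Resource": "arn:aws:s3:::logs/*"},
        {"Effect": "Deny", "Action": "s3:DeleteObject",
         "Resource": "arn:aws:s3:::logs/audit/*"},
    ]
}
```

Note that statement order does not matter: the Deny on `logs/audit/*` overrides the broader Allow even though it appears second, and any request that matches no statement falls through to the implicit deny.
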
- CRC32C stored per-object (single PUT) or per-frame (multipart), verified on GET
- `copy_object` S4-aware: source's `s4-*` metadata is preserved across `MetadataDirective: REPLACE` (prevents silent corruption of the destination)
- Zstd decompression bomb hardening: `Decoder + take(manifest.original_size + 1024)` caps the decode at the manifest's declared size (+ a small overshoot margin) so a zero-size manifest paired with a high-ratio frame surfaces as a typed `Io("bomb detected")` instead of unbounded RAM growth. The cap is still bound by the manifest claim itself: a 5 GiB manifest is honored up to 5 GiB, so operators must additionally enforce a per-request memory ceiling at the listener (`--max-body-bytes` / a future per-frame cap) for adversarial uploads.
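
The capped-decode idea can be illustrated in Python. This sketch makes several assumptions: it uses the standard-library `zlib` as a stand-in for zstd, `bounded_decompress` and `BombDetected` are hypothetical names, and treating "more output than declared" as the failure condition is a guess at the behavior rather than a description of S4's Rust code.

```python
import zlib

OVERSHOOT_MARGIN = 1024  # mirrors the "+ 1024" margin mentioned above


class BombDetected(OSError):
    """Raised when the stream decodes to more bytes than the manifest declared."""


def bounded_decompress(compressed: bytes, declared_size: int) -> bytes:
    """Decode at most declared_size + margin bytes, never the whole stream."""
    if declared_size < 0:
        raise ValueError("declared_size must be non-negative")
    cap = declared_size + OVERSHOOT_MARGIN  # always > 0, so the limit is active
    decoder = zlib.decompressobj()
    # max_length bounds the output buffer; surplus input stays in unconsumed_tail.
    out = decoder.decompress(compressed, cap)
    if len(out) > declared_size:
        raise BombDetected("bomb detected")
    return out
```

Because the output buffer never grows past `declared_size + 1024`, a tiny frame that would expand to gigabytes fails after roughly 1 KiB when the manifest claims zero bytes. As the text notes, the bound is only as trustworthy as the declared size itself.
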
- Each compressed object is stored as `<key>` + `<key>.s4index` sidecar. S3 lifecycle rules must move both files together; a split pair breaks Range GET (sidecar in IA + main in Glacier ⇒ `InvalidObjectState`).
- Recommended: `"Filter": {}` (whole bucket) or a `Filter.Prefix` rule that covers both `foo/...` and `foo/....s4index`. Avoid size- or suffix-scoped filters that catch one but not the other.
- See storage-class-transitions.md for two example lifecycle JSONs (IA-after-30d and prefix→Glacier-after-60d), the anti-pattern walkthrough, and a `head-object` drift-audit recipe.
- v1.2: a `transition` rule in an `s4 maintain` policy automates the same change from the S4 side, with the sidecar guaranteed to accompany its main object; see `s4 maintain`.
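
A split pair can be detected by comparing the storage class of each object with that of its sidecar. The Python sketch below is a hypothetical helper, not the recipe from storage-class-transitions.md: it assumes the caller has already collected a `key -> storage class` mapping (for example from `head-object` calls) and has normalized a missing storage class to `"STANDARD"`.

```python
SIDECAR_SUFFIX = ".s4index"


def find_split_pairs(storage_classes: dict) -> list:
    """Return (key, main_class, sidecar_class) for every drifted pair."""
    split = []
    for key, main_class in sorted(storage_classes.items()):
        if key.endswith(SIDECAR_SUFFIX):
            continue  # sidecars are checked via their main object
        sidecar_class = storage_classes.get(key + SIDECAR_SUFFIX)
        if sidecar_class is not None and sidecar_class != main_class:
            split.append((key, main_class, sidecar_class))
    return split


# Example listing (keys and classes are invented for the sketch).
LISTING = {
    "foo/a": "GLACIER",
    "foo/a.s4index": "STANDARD_IA",
    "foo/b": "STANDARD",
    "foo/b.s4index": "STANDARD",
}
```

Here `find_split_pairs(LISTING)` reports only `foo/a`, whose main object moved to Glacier while its sidecar stayed in IA, which is the combination that leads to `InvalidObjectState` on Range GET.
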
- `s4 parquet-recompact <bucket>/<prefix>` reads cold Parquet objects and re-encodes their column chunks to zstd, writing back a native Parquet (pyarrow / Spark / Trino / DuckDB read it directly; no S4 in the read path). It is an offline rewrite (like `s4 recompact`), not the transparent gateway.
- Build-time feature, off by default (keeps the Arrow tree out of the default build, the same shape as `--features aws-kms`): build with `cargo install s4-server --features parquet-recompact`.
- Safety: dry-run by default; `--execute` additionally requires `--allow-lossy-physical-rewrite`. Each object is value-verified (per row group, bounded memory, Parquet physical-schema-tree compared) before the in-place overwrite: structural drift is a conservative skip, a decoded-value mismatch is a hard failure (downgradable with `--tolerate-value-mismatch`), a corrupt footer is a hard failure; it never overwrites with unverified data. Already-zstd objects are detected from the footer and skipped (idempotent). Objects under SSE / Object-Lock / `Expires` / archive tier / sort-order / bloom-filter metadata are skipped, not silently rewritten. The PUT is conditional (`If-Match` + pre-PUT re-HEAD of ETag / Last-Modified / version-id); run on cold/quiescent prefixes (`--older-than`).
- Measured −36.6% over snappy / −51.7% over uncompressed in a local benchmark; see the cold-Parquet use case.