feat: allow custom lookup paths for nonlinear ops and fix Python GIL deadlock #1023

Open
changshenhan wants to merge 9 commits into zkonduit:main from changshenhan:pr-custom-lookup

@changshenhan changshenhan commented Mar 9, 2026

Add optional custom lookup table for Sigmoid (PWL from JSON)

Problem

EZKL implements Sigmoid (and other nonlinearities) via fixed built‑in lookup tables. There is currently no way for users to:

  • Plug in a custom approximation (e.g. a different piecewise‑linear fit, non‑uniform segments, or externally calibrated breakpoints).
  • Integrate EZKL with existing PWL pipelines/tools.

This makes it harder to:

  • Experiment with accuracy vs. circuit‑size trade‑offs on real data distributions.
  • Share and reproduce explicit approximation schemes (e.g. a particular PWL fit used across systems).

Approach

Add an optional run arg:

pub struct RunArgs {
    // ...
    pub custom_lookup_path: Option<String>,
}

When custom_lookup_path is set:

  • Every ONNX Sigmoid node is implemented as

    LookupOp::Custom { scale, path }

    instead of

    LookupOp::Sigmoid { scale }
  • The file at path is a JSON file of the form below (the // annotations give the array lengths and are not part of the file):

    {
      "breakpoints": [...], // length n+1
      "slopes":      [...], // length n
      "intercepts":  [...]  // length n
    }

    defining a piecewise‑linear map. The lookup table is filled by evaluating this PWL over the configured lookup_range.

  • The same Halo2 lookup constraint machinery as for LookupOp::Sigmoid is used; only the table contents are user‑defined.

When custom_lookup_path is None or unset, behavior is unchanged from current main (fully backward compatible).
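
The table-filling step described above can be sketched in Python. This is a minimal illustration, not the PR's Rust implementation. The helper names eval_pwl and fill_table are hypothetical, scale is assumed to be a plain multiplier between integer and float values, lookup_range is assumed to be inclusive at both ends, and Python's round stands in for whatever rounding the circuit actually uses.

```python
from bisect import bisect_right


def eval_pwl(params, x):
    """Evaluate the piecewise-linear map at a float x inside the breakpoints."""
    bps = params["breakpoints"]
    if not bps[0] <= x <= bps[-1]:
        raise ValueError(f"x={x} lies outside the breakpoints [{bps[0]}, {bps[-1]}]")
    # Segment i covers [bps[i], bps[i + 1]]; the clamp makes x == bps[-1]
    # fall into the last segment instead of indexing past the end.
    i = min(bisect_right(bps, x) - 1, len(params["slopes"]) - 1)
    return params["slopes"][i] * x + params["intercepts"][i]


def fill_table(params, lookup_range, scale):
    """Map every integer input q in lookup_range to an integer output.

    Assumption: q represents the float q / scale, and outputs are scaled
    back by the same factor and rounded.
    """
    lo, hi = lookup_range
    return {q: round(eval_pwl(params, q / scale) * scale) for q in range(lo, hi + 1)}


# Toy two-segment map (|x| on [-1, 1]), used only to exercise the helpers.
toy = {"breakpoints": [-1.0, 0.0, 1.0], "slopes": [-1.0, 1.0], "intercepts": [0.0, 0.0]}
table = fill_table(toy, (-2, 2), 2.0)
```

With a real Sigmoid fit, only the three arrays change; the table is still one output per integer input in the range.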

Design choices

| Choice | Reason |
| --- | --- |
| JSON file for PWL | Easy to generate from Python/other languages; no Rust recompile required; matches common PWL representations (breakpoints + slope/intercept per segment). |
| Global cache (lazy_static + Mutex) for loaded PWL | During prove, layout runs multiple times and can run on worker threads. Re-reading the file on every pass is redundant and can cause hangs with a thread-local cache. A single global cache ensures one load per path and safe reuse across passes/threads. |
| Only Sigmoid → Custom for now | Keeps the PR tightly scoped; the same mechanism could later be extended to other nonlinear ops if desired. |
| Python: release GIL during prove | prove() calls into halo2, which uses rayon for FFT/commit. If the main thread holds the GIL while blocking on rayon, deadlocks can occur. Wrapping the prove call in py.allow_threads(...) releases the GIL so that workers can run; this was necessary when using the custom lookup path from Python. |
| Input range enforced within PWL breakpoints | If the circuit’s lookup range (in float) extends outside the user’s breakpoints, we return a clear error instead of extrapolating, to avoid unsound behavior. |
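
The "one load per path, shared across passes and threads" idea behind the global cache can be illustrated with a Python analogue. The PR itself does this in Rust with lazy_static and a Mutex; the names _PWL_CACHE and load_pwl_cached below are hypothetical and exist only for this sketch.

```python
import json
import threading

# Process-wide cache keyed by file path, guarded by a single lock
# (a rough analogue of a lazy_static Mutex<HashMap<..>> in Rust).
_PWL_CACHE = {}
_PWL_CACHE_LOCK = threading.Lock()


def load_pwl_cached(path):
    """Return the parsed PWL parameters for path, reading the file at most once."""
    with _PWL_CACHE_LOCK:
        params = _PWL_CACHE.get(path)
        if params is None:
            with open(path, "r", encoding="utf-8") as f:
                params = json.load(f)
            _PWL_CACHE[path] = params
        return params
```

Because the lock is held for the whole check-then-load step, concurrent callers with the same path all receive the same parsed object, and later layout passes never touch the file again.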

Compatibility with existing LookupOp

  • Constraint system: No structural change. Custom uses the same table layout and lookup argument as LookupOp::Sigmoid; only the way table cells are filled differs (user PWL vs. built‑in σ(x)).

  • Serialization: LookupOp gains a new variant

    Custom { scale, path }

    Existing configs that do not reference Custom are unaffected.

  • RunArgs: custom_lookup_path is an Option<String> with #[serde(default)], so existing JSON configs and CLI usage remain valid.
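
The backward-compatibility argument can be made concrete with a small Python model of the behaviour described above. It is only a sketch: RunArgsSketch, run_args_from_json and sigmoid_op are hypothetical stand-ins for the Rust RunArgs struct, its serde deserialization, and the ONNX "Sigmoid" branch, and the tuples stand in for the LookupOp variants.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunArgsSketch:
    # Mirrors an Option<String> field with a default of None.
    custom_lookup_path: Optional[str] = None


def run_args_from_json(text):
    """Parse a config, tolerating files written before the new field existed."""
    raw = json.loads(text)
    return RunArgsSketch(custom_lookup_path=raw.get("custom_lookup_path"))


def sigmoid_op(run_args, scale):
    """Choose the lookup op for a Sigmoid node (tuples stand in for enum variants)."""
    if run_args.custom_lookup_path is not None:
        return ("Custom", scale, run_args.custom_lookup_path)
    return ("Sigmoid", scale)


old_config = run_args_from_json('{"input_scale": 7}')
new_config = run_args_from_json('{"custom_lookup_path": "pwl.json"}')
```

An old config without the key yields the built-in Sigmoid op, and only a config that sets the path switches to the custom variant.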

Files touched

  • src/lib.rs: add custom_lookup_path to RunArgs, defaulting to None.
  • src/bindings/python.rs:
    • add custom_lookup_path to PyRunArgs, and pass it through in conversions;
    • wrap prove() in py.allow_threads(...) to release the GIL while halo2/rayon runs.
  • src/circuit/ops/lookup.rs:
    • add LookupOp::Custom { scale, path };
    • add PwlParams (breakpoints/slopes/intercepts), JSON load + global cache;
    • implement f() for Custom by evaluating the PWL map over the integer‑scaled input;
    • validate breakpoints strictly increasing and input range within breakpoints; clearer errors and log::warn! for soundness.
  • src/graph/utilities.rs: in the ONNX "Sigmoid" branch, if run_args.custom_lookup_path is Some(path), emit LookupOp::Custom { scale, path }, otherwise keep the existing LookupOp::Sigmoid { scale }.
  • src/circuit/table.rs: rely on nonlinearity.f() to produce table contents; for Custom this means evaluating the user PWL. Optional logs for custom lookup.
  • src/pfsys/mod.rs: add an optional heartbeat during create_proof and fix an mpsc::channel::<()>() type issue.
  • examples/notebooks/custom_lookup_demo.ipynb: new demo notebook with generate_pwl_json (uniform / curvature / quantile / custom), full pipeline, production tip.
  • examples/pwl_sigmoid_example.json: example PWL file.
  • docs/custom_lookup_table.md: JSON schema, step‑by‑step guide, caveats (including input range and margin).
  • README.md: short “Custom lookup table” line and link to the doc.
  • tests/py_integration_tests.rs: register custom_lookup_demo.ipynb.
  • tests/integration_tests.rs: add mock_custom_lookup_1l_sigmoid and optional custom_lookup_path in gen_circuit_settings_and_witness.

Accuracy and performance (clarified)

To avoid confusion: the main benefit of this PR is numerical accuracy and flexibility, not a dramatic asymptotic speedup.

For a small Conv+ReLU+Sigmoid toy model with num_rows ≈ 4,385:

  • With a conservative logrows = 17, the circuit needs a larger SRS and takes longer to prove.
  • With the more natural choice logrows = ceil(log2(num_rows)) = 13, both the built‑in Sigmoid lookup and the custom PWL lookup prove in about 1.1 s on the same machine.
  • In that “balanced” configuration, the difference is almost entirely in accuracy, not in proving cost.
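
The logrows arithmetic above can be checked with a few lines of Python. The helper name min_logrows is hypothetical, and the sketch covers only the ceil(log2(num_rows)) rule quoted here; it ignores any extra rows a real circuit may need (for example for lookup tables or blinding).

```python
def min_logrows(num_rows):
    """Smallest k with 2**k >= num_rows, i.e. ceil(log2(num_rows)), in exact integer arithmetic."""
    if num_rows < 1:
        raise ValueError("num_rows must be positive")
    return (num_rows - 1).bit_length()


natural = min_logrows(4385)        # 13, since 2**12 = 4096 < 4385 <= 8192 = 2**13
conservative = 17
row_ratio = 2**conservative // 2**natural  # 16 times as many rows at logrows = 17
```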

Single‑Sigmoid input level (real input distribution, same circuit/logrows):

  • Custom PWL table (pwl_params.json built from real data.json by non‑uniform quantiles):

    • max |approx − σ(x)| ≈ 5.4e‑11
    • mean |approx − σ(x)| ≈ 5.6e‑12
  • A quantized “default” lookup model (simulating input/output quantized at 1/128):

    • max |approx − σ(x)| ≈ 4.7e‑3
    • mean |approx − σ(x)| ≈ 1.9e‑3

Full model output level (Conv+ReLU+Sigmoid branch, same circuit/logrows):

  • With the custom PWL lookup:

    • average absolute error ≈ 4.8e‑5
    • max absolute error ≈ 1.8e‑4
  • With the default lookup (quantized model):

    • average absolute error ≈ 1.8e‑3
    • max absolute error ≈ 4.6e‑3

So, for this toy model and data, the custom PWL reduces end‑to‑end model error by roughly 25–40× (between one and two orders of magnitude), at essentially the same proving cost when logrows is chosen based on num_rows.
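
A comparison of this kind can be sketched as follows. It is an assumption-laden illustration and will not reproduce the numbers above: it uses a uniform chord-based PWL on [-8, 8] rather than the quantile-based table, a synthetic grid of inputs rather than the real data distribution, and a simple "round input and output to multiples of 1/128" model for the default lookup. All function names are hypothetical.

```python
import math
from bisect import bisect_right


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def chord_pwl(lo, hi, n):
    """Uniform breakpoints; each segment is the chord of sigmoid between its endpoints."""
    bps = [lo + (hi - lo) * i / n for i in range(n + 1)]
    slopes, intercepts = [], []
    for a, b in zip(bps, bps[1:]):
        m = (sigmoid(b) - sigmoid(a)) / (b - a)
        slopes.append(m)
        intercepts.append(sigmoid(a) - m * a)
    return bps, slopes, intercepts


def pwl_sigmoid(pwl, x):
    bps, slopes, intercepts = pwl
    i = min(max(bisect_right(bps, x) - 1, 0), len(slopes) - 1)
    return slopes[i] * x + intercepts[i]


def quantized_sigmoid(x, step=1.0 / 128):
    """Assumed baseline: quantize the input, apply sigmoid, quantize the output."""
    xq = round(x / step) * step
    return round(sigmoid(xq) / step) * step


def abs_errors(approx, xs):
    """Return (max, mean) of |approx(x) - sigmoid(x)| over the sample points."""
    errs = [abs(approx(x) - sigmoid(x)) for x in xs]
    return max(errs), sum(errs) / len(errs)


xs = [-6.0 + 12.0 * i / 9999 for i in range(10000)]  # synthetic inputs
pwl = chord_pwl(-8.0, 8.0, 1024)
pwl_max, pwl_mean = abs_errors(lambda x: pwl_sigmoid(pwl, x), xs)
q_max, q_mean = abs_errors(quantized_sigmoid, xs)
print(f"PWL:       max={pwl_max:.3e} mean={pwl_mean:.3e}")
print(f"quantized: max={q_max:.3e} mean={q_mean:.3e}")
```

Swapping xs for samples drawn from the model's real Sigmoid inputs turns this into the single-Sigmoid measurement described above.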

Testing

  • Rust: Added mock_custom_lookup_1l_sigmoid in tests/integration_tests.rs, which runs the 1l_sigmoid example with a PWL file via --custom-lookup-path, then calibrate → compile → gen_witness → mock.
  • Python: Registered custom_lookup_demo.ipynb in tests/py_integration_tests.rs so the demo notebook is executed with the rest of the notebook suite.
  • Verified locally: full pipeline with custom_lookup_path set (gen_settings → compile → gen_witness → setup → prove → verify) succeeds; with custom_lookup_path = None, behavior matches upstream (Sigmoid uses built‑in lookup).
  • Additional experiments in a separate repo compare built‑in vs custom PWL vs true σ(x) on real input distributions and accuracy at both single‑Sigmoid and full‑model level.

Documentation

  • README: Added a short “Custom lookup table” sentence and link to the doc.
  • New doc: docs/custom_lookup_table.md with the JSON schema, step‑by‑step guide, caveats (input must be within breakpoints, production/margin tip), and usage for Python/CLI.
  • Notebook: custom_lookup_demo.ipynb documents how to produce the table (generate_pwl_json with uniform, curvature, quantile, or custom breakpoints) and the full EZKL pipeline.
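
For readers who want a feel for the quantile option before opening the notebook, here is a minimal sketch of producing a PWL file in the schema above. It is not the notebook's generate_pwl_json helper: the function names quantile_breakpoints and sketch_pwl_json are hypothetical, and fitting each segment as a chord of σ(x) is an assumption made for simplicity.

```python
import json
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def quantile_breakpoints(samples, n_segments, lo, hi):
    """Endpoints lo/hi plus interior breakpoints at empirical quantiles of the samples.

    Duplicates are dropped so the result is strictly increasing, which means
    fewer than n_segments segments can come back for heavily repeated data.
    """
    inside = sorted(s for s in samples if lo < s < hi)
    interior = []
    if inside:
        interior = [inside[(k * len(inside)) // n_segments] for k in range(1, n_segments)]
    return sorted({lo, hi, *interior})


def sketch_pwl_json(breakpoints):
    """Serialize breakpoints/slopes/intercepts, one chord of sigmoid per segment."""
    slopes, intercepts = [], []
    for a, b in zip(breakpoints, breakpoints[1:]):
        m = (sigmoid(b) - sigmoid(a)) / (b - a)
        slopes.append(m)
        intercepts.append(sigmoid(a) - m * a)
    return json.dumps(
        {"breakpoints": breakpoints, "slopes": slopes, "intercepts": intercepts}
    )


# Synthetic stand-in for observed Sigmoid inputs.
samples = [i / 10 for i in range(-30, 31)]
pwl_text = sketch_pwl_json(quantile_breakpoints(samples, 8, -4.0, 4.0))
```

Quantile spacing puts more segments where the inputs actually fall, which is the motivation for the data-driven option.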

Review feedback addressed

  • Example in examples/notebooks: custom_lookup_demo.ipynb with full pipeline and generate_pwl_json.
  • Py integration test: custom_lookup_demo.ipynb registered in tests/py_integration_tests.rs.
  • Rust test: mock_custom_lookup_1l_sigmoid in tests/integration_tests.rs.
  • Usability (“how to produce the table”): Doc, notebook helper, and production tip; breakpoints can be uniform, curvature‑based, quantile‑based, or custom.
  • Error handling and soundness: Input range check (error if lookup range outside PWL breakpoints), strictly increasing breakpoints validation, clearer errors with path, and log::warn! on first load.
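
The two checks named in the last item can be sketched in Python. This models the described behaviour only; the function name validate_pwl, the error wording, and the assumption that the float range is lookup_range divided by scale are all illustrative rather than taken from the Rust code.

```python
def validate_pwl(params, lookup_range, scale, path="<pwl.json>"):
    """Raise ValueError if the PWL is malformed or does not cover the lookup range."""
    bps = params["breakpoints"]
    slopes = params["slopes"]
    intercepts = params["intercepts"]
    n = len(slopes)
    if n == 0 or len(bps) != n + 1 or len(intercepts) != n:
        raise ValueError(f"{path}: expected n+1 breakpoints, n slopes, n intercepts")
    if any(b <= a for a, b in zip(bps, bps[1:])):
        raise ValueError(f"{path}: breakpoints must be strictly increasing")
    # Assumed convention: integer lookup_range maps to floats via division by scale.
    lo, hi = lookup_range[0] / scale, lookup_range[1] / scale
    if lo < bps[0] or hi > bps[-1]:
        raise ValueError(
            f"{path}: lookup range [{lo}, {hi}] exceeds breakpoints "
            f"[{bps[0]}, {bps[-1]}]; extend the breakpoints or shrink lookup_range"
        )


ok = {"breakpoints": [-8.0, 0.0, 8.0], "slopes": [0.06, 0.06], "intercepts": [0.5, 0.5]}
validate_pwl(ok, (-1024, 1024), 128.0)  # covers [-8, 8] exactly, so no error
```

Rejecting an uncovered range up front is what prevents silent extrapolation outside the fitted segments.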

@changshenhan changshenhan marked this pull request as draft March 13, 2026 16:06
@changshenhan changshenhan marked this pull request as ready for review March 13, 2026 16:09
@changshenhan changshenhan changed the title from "Add optional custom lookup table for Sigmoid (PWL from JSON)" to "feat: allow custom lookup paths for nonlinear ops and fix Python GIL deadlock" Mar 13, 2026
@changshenhan
Author

Hi @jasonmorton @alexander-camuto @JSeam2,

Sorry for the ping, but I wanted to share some compelling end-to-end benchmark results I just finished, which might be helpful for the review of this PR.

In a full Conv+ReLU+Sigmoid forward pass, using the custom 1024-segment PWL proposed in this PR:

  • Precision: Reduced the Mean Absolute Error (MAE) from $1.8 \times 10^{-3}$ (default lookup) to $4.8 \times 10^{-5}$ (this PR), a ~40x improvement in end-to-end accuracy.
  • Efficiency: Achieved this near double-precision result with zero additional proof overhead (keeping logrows=13 and similar proving time).
  • Stability: Confirmed that the py.allow_threads fix effectively eliminates the GIL deadlocks we previously encountered during high-concurrency proving.

I believe this significantly enhances ezkl's reliability for high-precision financial or medical use cases. I’d love to hear your thoughts on the design or if any further adjustments are needed to align with the upstream roadmap.

Thanks for your time and for maintaining such a great framework!

@JSeam2
Collaborator

JSeam2 commented Mar 13, 2026

I have triggered some initial tests; this looks interesting. For this to be mergeable we need a few more things:

  1. An example in examples/notebooks, with the test added to py_integration_tests
  2. An additional Rust test to check the functionality

Some caveats about usability:

  1. How would a user produce this lookup table?
  2. Additional error handling, and feedback to the user regarding soundness issues

@changshenhan
Author

Thanks for the encouraging feedback. I really appreciate the guidance on making this more robust for the community. I've addressed your points with a focus on usability, soundness, and seamless integration:


1. Usability & documentation

  • Python utility: I've added a flexible generate_pwl_json helper in examples/notebooks/custom_lookup_demo.ipynb. It supports uniform, curvature-based, and quantile-based (data-driven) spacing so users can easily reach that ~10⁻¹¹ precision.
  • Step-by-step guide: Created docs/custom_lookup_table.md with the JSON schema and best practices, including a tip on using safety margins for lookup_range.
  • Reference repo: For a more complex real-world case, I've prepared a full experiment (Conv + ReLU + Sigmoid) here: ezkl-custom-lookup-experiment.

2. Soundness & error handling

  • Input range check: Following your suggestion, the circuit now validates that the lookup range stays within the PWL breakpoints. If it exceeds the range, it returns a clear error suggesting breakpoint extension or lookup_range adjustment.
  • Internal validation: In src/circuit/ops/lookup.rs, breakpoints must be strictly increasing; added log::warn! and file-path context in errors to help with debugging.

3. Testing & integration

  • Notebook integration: The demo notebook is registered in tests/py_integration_tests.rs so it stays in sync with the test suite.
  • Rust mock test: Added mock_custom_lookup_1l_sigmoid in tests/integration_tests.rs to cover the full calibrate → compile → witness → mock pipeline.

I've included examples/pwl_sigmoid_example.json as a reference for these tests. All changes are backward compatible.

Please let me know if there are any other areas to refine—looking forward to your thoughts!

Move secret usage into dedicated GitHub Environments and replace runner-superfluous actions with native script steps, so `zizmor .` exits cleanly without relying on a repo-level suppression config.

Made-with: Cursor
@changshenhan
Author

Hi @JSeam2,

I've successfully addressed all the static analysis findings reported by zizmor.

What’s been updated:

Hardened Security: Added environment declarations for jobs accessing secrets to resolve secrets-outside-env warnings.

Refactored Workflows: Replaced several redundant third-party actions with native GitHub Runner commands (e.g., using gh CLI and rustup) to fix superfluous-actions and reduce supply chain surface.

Cleaned up: Removed the zizmor.yml override since the workflows are now natively compliant with the audit.

The CI for Static Analysis is now passing with a clean exit code. Ready for your further review!
