Releases: richardposner/RuleMonkey

RuleMonkey 3.2.0

18 May 06:02

Added

  • Parameter sweeps: parameter_scan and bifurcate.
    RuleMonkeySimulator gains two methods, parameter_scan(ScanSpec, seed)
    and bifurcate(ScanSpec, seed), the RuleMonkey equivalents of
    BioNetGen's parameter_scan and bifurcate actions. A sweep runs the
    model at each value of one parameter (an explicit value list, or a
    linear / geometric min/max/n_points range) and records the
    endpoint observable and global-function values, matching BNG's
    extraction of the last .gdat row per run. parameter_scan with
    reset_conc=false and bifurcate carry molecular state over between
    points; bifurcate runs the forward and backward sweeps as one
    continuous trajectory so a bistable model surfaces hysteresis. New
    ScanSpec / ScanResult / BifurcateResult types in types.hpp. A
    new rm_scan command-line tool exposes both modes and writes the
    result in tab-separated .scan format on stdout (function columns
    gated behind --print-functions, mirroring #7; see
    docs/scan_format.md). Closes #8. SSA
    trajectories and existing output are unchanged; the header-only ABI
    change means consumers must rebuild against the new headers.
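
As a rough illustration of the linear / geometric min/max/n_points ranges described above, the sketch below expands such a range into the values a sweep would visit. `scan_points` and its parameters are hypothetical names chosen for this example; they are not part of the RuleMonkey API, and the real ScanSpec may handle endpoints differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of RuleMonkey: expand a min/max/n_points
// range into the list of parameter values a sweep would visit.
std::vector<double> scan_points(double min, double max, std::size_t n_points,
                                bool geometric) {
  assert(n_points >= 1);
  assert(!geometric || (min > 0.0 && max > 0.0));  // ratios need positive bounds
  std::vector<double> values;
  values.reserve(n_points);
  if (n_points == 1) {
    values.push_back(min);
    return values;
  }
  for (std::size_t i = 0; i < n_points; ++i) {
    const double frac =
        static_cast<double>(i) / static_cast<double>(n_points - 1);
    if (geometric) {
      // Equal ratios between neighbours: interpolate in log space.
      values.push_back(min * std::pow(max / min, frac));
    } else {
      // Equal differences between neighbours.
      values.push_back(min + frac * (max - min));
    }
  }
  return values;
}
```

Under these assumptions, a geometric range from 1 to 100 with three points visits roughly 1, 10, and 100, while the linear range with the same bounds visits 1, 50.5, and 100.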

  • Global-function values in the public API. rulemonkey::Result
    now carries function_names and function_data alongside
    observable_names / observable_data, populated at every output time
    point; function_data is column-major (function_data[fn_idx][t_idx])
    and parallel to observable_data. RuleMonkeySimulator gains
    function_names() (XML declaration order, captured at construction)
    and get_function_values() (live-session readback, mirroring
    get_observable_values()). These expose the BNGL begin functions
    entries — the derived quantities models commonly use as their
    measured/fitted outputs (e.g. Clusters() = monomer + dimer + …) —
    which the engine already evaluates internally for rate laws. Only
    global (non-local) functions are surfaced; local functions evaluate
    per-molecule and have no single global value, so function_names may
    be shorter than the model's full begin functions block. The API
    surface is unconditional. Closes #7. SSA
    trajectories and observable output are unchanged; the header-only ABI
    change means consumers must rebuild against the new headers.
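
To make the column-major layout concrete, here is a minimal sketch of reading one function's time series out of a Result-like structure. `ResultSketch` is a stand-in defined only for this example: the field names follow the entry above, but the element types (strings and doubles) and both helper functions are assumptions, not the library's declarations.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for the function-related fields of rulemonkey::Result.
// Field names follow the release note; the element types are assumed.
struct ResultSketch {
  std::vector<std::string> function_names;
  std::vector<std::vector<double>> function_data;  // [fn_idx][t_idx]
};

// Hypothetical helper: the full time series of one named global function.
const std::vector<double>& series_for(const ResultSketch& r,
                                      const std::string& name) {
  for (std::size_t fn_idx = 0; fn_idx < r.function_names.size(); ++fn_idx) {
    if (r.function_names[fn_idx] == name) return r.function_data.at(fn_idx);
  }
  throw std::out_of_range("no global function named " + name);
}

// Hypothetical helper: the function's value at the last output time point.
double endpoint_value(const ResultSketch& r, const std::string& name) {
  const std::vector<double>& series = series_for(r, name);
  if (series.empty()) throw std::out_of_range("no output time points");
  return series.back();
}
```

Because the outer index is the function and the inner index is the time point, one function's whole trajectory is a single contiguous vector, which suits fitting code that consumes one output at a time.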

  • rm_driver --print-functions. A new opt-in flag that appends the
    model's global-function values as trailing .gdat columns (after the
    observables). Off by default, mirroring BNGL's print_functions=>1:
    the default .gdat stays observables-only and byte-identical to what
    earlier RM versions emitted. The flag governs only rm_driver's text
    output — the in-process Result API exposes the values regardless.

  • tests/cpp/function_values_test.cpp — regression test for the new
    function surface: function_names() declaration order, the
    column-major shape of Result::function_data, per-sample algebraic
    consistency with the observables each function derives from (covering
    nested function-of-function settle order), live get_function_values()
    readback against get_observable_values(), the no-session throw, and
    the empty-not-absent function surface of a model with no functions.

  • Cooperative cancellation hook on run() / simulate() / step_to().
    Each of the three public entry points now accepts an optional
    rulemonkey::CancelCallback (a std::function<bool()>) that the SSA
    event loop polls roughly every 1024 events; returning false raises
    rulemonkey::Cancelled (a std::runtime_error subclass) at a safe
    between-event point. Empty callbacks disable polling and pay no
    per-event overhead. This unblocks the BNGsim timeout kwarg for the
    RuleMonkey backend (closes #3); the prior
    workaround of wrapping each evaluation in a subprocess can now go
    away. Source-compatible — existing callers see only the defaulted
    parameter — but mangled-name ABI changes, so consumers must rebuild
    against the new headers.
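
The polling contract described above can be sketched with a toy loop. Everything here is a stand-in written for illustration: `CancelCallback` and `Cancelled` reuse the names from this entry but are declared locally, and `run_events` is a hypothetical function, not the engine's event loop. The sketch assumes the callback returns true to keep going and false to cancel.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <stdexcept>

// Local stand-ins that mirror the names in the release note; RuleMonkey's
// real declarations may differ.
using CancelCallback = std::function<bool()>;

struct Cancelled : std::runtime_error {
  Cancelled() : std::runtime_error("simulation cancelled") {}
};

// Toy event loop: "fires" n_events events and polls the callback once per
// poll_interval events. Returns the number of events fired.
std::uint64_t run_events(std::uint64_t n_events,
                         const CancelCallback& keep_going = {},
                         std::uint64_t poll_interval = 1024) {
  assert(poll_interval > 0);
  const bool polling = static_cast<bool>(keep_going);  // empty: never poll
  if (polling && !keep_going()) throw Cancelled();     // cancelled on entry
  std::uint64_t fired = 0;
  while (fired < n_events) {
    ++fired;  // one SSA event would fire here
    if (polling && fired % poll_interval == 0 && !keep_going()) {
      throw Cancelled();  // raised between events, never mid-event
    }
  }
  return fired;
}
```

A timeout fits this shape naturally: the caller would pass a callback that compares a steady clock against a deadline and returns false once the deadline has passed.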

  • tests/cpp/cancellation_test.cpp — regression test for the four
    behavioral contracts the new hook adds: pre-cancelled callback throws
    on entry, Cancelled inherits std::runtime_error, mid-session
    simulate() cancellation leaves the session live with
    current_time() strictly inside the requested window and is
    recoverable via destroy_session() + re-initialize(), and an
    always-true callback produces a bit-identical trajectory to the
    no-callback path.

  • Species enumeration, canonical complex labeling, and .species
    output.
    RuleMonkey can now enumerate the distinct chemical species
    in the live pool by graph isomorphism. A new DIY canonical-labeling
    core (cpp/rulemonkey/canonical.{hpp,cpp} — 1-WL color refinement
    plus individualization–refinement for symmetric residue such as rings
    and homo-oligomers; no nauty/bliss, preserving the cleanroom property)
    assigns each complex a canonical normalized-BNGL label.
    RuleMonkeySimulator gains enumerate_species() (returns SpeciesRow
    records — a new type in types.hpp), write_species_file(path)
    (BNG-format .species output, live species only, NFsim -ss parity —
    see docs/species_format.md),
    species_count(canonical_species), and total_complex_count(). A new
    rm_driver --species <path> flag writes the .species file from the
    command line. A cached-incremental labeling mode (per-complex cached
    label with dirty-bit invalidation in the structural mutators) is built
    and validated by a Debug/ASan-build invariant: the cached label must
    equal a from-scratch recompute, gated by the
    RULEMONKEY_CANONICAL_CACHE_SELFCHECK compile definition. The mode is
    awaiting its downstream consumer. Closes #9 §2. New
    ctest cases canonical_test, species_enumeration_test. Header-only
    ABI change — consumers must rebuild against the new headers.
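
For readers unfamiliar with 1-WL color refinement, the following is a generic textbook-style sketch of the technique on a plain undirected graph. It is not the code in canonical.{hpp,cpp}: the `Graph` alias, `refine_colors`, and the integer color encoding are assumptions made for this example, and the individualization–refinement stage is not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <utility>
#include <vector>

using Graph = std::vector<std::vector<int>>;  // undirected adjacency lists
// A node's signature: its own color plus the sorted colors of its neighbours.
using Signature = std::pair<int, std::vector<int>>;

// 1-WL color refinement. `colors` holds the initial node colors (for example
// one integer per molecule type). Returns the stable coloring.
std::vector<int> refine_colors(const Graph& g, std::vector<int> colors) {
  assert(g.size() == colors.size());
  std::size_t n_classes = std::set<int>(colors.begin(), colors.end()).size();
  for (;;) {
    std::vector<Signature> sigs(g.size());
    for (std::size_t v = 0; v < g.size(); ++v) {
      std::vector<int> nb;
      for (int u : g[v]) nb.push_back(colors[static_cast<std::size_t>(u)]);
      std::sort(nb.begin(), nb.end());
      sigs[v] = Signature(colors[v], std::move(nb));
    }
    // Number the distinct signatures in sorted order, so the new colors
    // depend on structure rather than on how the nodes happen to be indexed.
    std::map<Signature, int> ids;
    for (const Signature& s : sigs) ids.emplace(s, 0);
    int next = 0;
    for (auto& entry : ids) entry.second = next++;
    for (std::size_t v = 0; v < g.size(); ++v) colors[v] = ids.at(sigs[v]);
    if (ids.size() == n_classes) return colors;  // partition stopped splitting
    n_classes = ids.size();
  }
}
```

On a three-node chain with identical initial colors, refinement separates the middle node from the two ends. On a three-node ring every node keeps the same color, which is the kind of symmetric residue that the individualization–refinement step mentioned above is there to resolve.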

  • Session API: live expression evaluation and pattern-keyed species
    methods.
    On an active session, RuleMonkeySimulator gains
    evaluate_expression(expr, extra) — compiles and evaluates an
    arbitrary BNGL expression against the live session (parameters,
    observables, global functions, and time()/t; an optional extra
    map shadows those names on clash) — and four pattern-keyed species
    methods, get_species_count / add_species / remove_species /
    set_species_count, each taking a BNGL species-pattern string. A new
    runtime BNGL species-pattern parser (cpp/rulemonkey/pattern_parser.{hpp,cpp})
    backs the latter four: it accepts exact, fully-specified, connected
    species (every component listed, stateful components with a concrete
    ~state, numeric bonds) and rejects partial patterns (!+ / !? /
    omitted components). get_species_count canonicalizes the parsed
    species and reuses the species_count lookup above;
    add_/remove_/set_ resync all rule propensities after the
    structural change. Closes #9 §1 (and, with §2 above and §3, which
    needed no work, issue #9 in full). New
    ctest cases evaluate_expression_test, pattern_parser_test,
    species_methods_test. Header-only ABI change — consumers must
    rebuild against the new headers.
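
One of the rejection rules above (numeric bonds only, no !+ / !? wildcards) can be approximated by a small string check. `bonds_fully_specified` is a hypothetical helper written for this example and is not RuleMonkey's parser: it relies on the BNGL convention that each numeric bond label appears once per endpoint, and it does not check for omitted components, missing ~state values, or connectivity.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical pre-check, not RuleMonkey's parser: true when every bond in
// `pattern` carries a numeric label and each label appears exactly twice
// (one occurrence per bond endpoint). Wildcards such as !+ and !? fail.
bool bonds_fully_specified(const std::string& pattern) {
  std::map<std::string, int> endpoint_count;
  for (std::size_t i = 0; i < pattern.size(); ++i) {
    if (pattern[i] != '!') continue;
    std::string label;
    std::size_t j = i + 1;
    while (j < pattern.size() &&
           std::isdigit(static_cast<unsigned char>(pattern[j]))) {
      label += pattern[j++];
    }
    if (label.empty()) return false;  // "!+", "!?", or a dangling '!'
    ++endpoint_count[label];
    i = j - 1;  // resume scanning after the label
  }
  for (const auto& entry : endpoint_count) {
    if (entry.second != 2) return false;  // half-open or over-used label
  }
  return true;
}
```

A caller might run a check like this before handing a string to get_species_count or add_species, so that an obviously partial pattern fails early with a clearer message.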

Changed

  • Expression evaluator: hand-rolled parser replaced with ExprTk. The
    BNGL rate-law / function / parameter math evaluator (expr_eval) is now
    ExprTk, via the bngsim::ExprTkEvaluator wrapper RuleMonkey shares
    with its BNGsim
    integration host. All four expression consumers — global functions,
    rate-law ASTs, the simulator parameter cascade, and local functions —
    moved at once; the hand-rolled recursive-descent parser and AstNode
    tree-walker are gone. Expression evaluation is ~16–30% faster per call
    on function-rate models (no effect on mass-action); SSA trajectories
    are bit-identical to 3.1.x. Closes #6. No public
    API or header change. Build note: ExprTk is vendored under
    third_party/ and compiled only in a standalone build — a CMake gate
    (if(TARGET bngsim::expression)) links the host's copy inside a BNGsim
    build instead. scripts/vendor_exprtk.py --check guards the vendored
    copy against drift from its pinned BNGsim commit.

  • CMake vendoring defaults. The minimum CMake version is now 3.20.
    RULEMONKEY_BUILD_TESTS and RULEMONKEY_BUILD_CLI default to
    PROJECT_IS_TOP_LEVEL, RULEMONKEY_WARNINGS_AS_ERRORS defaults off
    when RuleMonkey is added as a subdirectory, and tests no longer depend
    on CMAKE_SOURCE_DIR.

  • Local-function rate laws: redundant per-molecule observable
    re-evaluation eliminated.
    On models with local-function rate laws,
    evaluate_local_rate recomputed each rule's local observables from
    scratch (count_embeddings_*) for every affected molecule on every
    event — up to ~75% of wall time on local-function-heavy models.
    evaluate_observable_on now routes tracked Molecules-type
    observables through the per-molecule obs_mol_contrib table that the
    species-observable incremental machinery already maintains and
    refreshes before the propensity recompute each event: per-molecule
    scope becomes a table read, complex-wide scope a sum over the complex
    — no embedding counts. A from-scratch recompute remains as a
    bounds-checked fallback, and a Debug/ASan-build invariant (gated by
    the RULEMONKEY_LOCAL_OBS_SELFCHECK compile definition) cross-checks
    the fast path against it. Wall-time reductions: isingspin_localfcn
    71%, ANx 20%, AN 16%, t3 9%. Cl...

RuleMonkey 3.1.2

02 May 19:55
0be4cb6

[3.1.2] — 2026-05-02

Added

  • docs/internals.md — engine-internals reading guide for
    contributors about to modify cpp/rulemonkey/engine.cpp. Covers
    the SSA event loop, the three pattern-matching layers
    (count_embeddings_single, count_multi_mol_fast,
    count_2mol_1bond_fc), complex tracking on bind/unbind, propensity
    computation and incremental_update, the 2-mol/1-bond fast-path
    specialization, fire_rule's OpType switch, and the five
    select_reactants paths. Cites engine.cpp line ranges as anchors.

  • "Adding a new profile" recipe in engine_profile.hpp. Five
    mechanical steps to wire a gate, struct, member, increment site,
    and report function for a new hot path. Existing per-profile gate
    comments and field-level documentation were already strong; the
    missing piece was a contributor recipe.

  • tests/cpp/error_paths_test.cpp — pins down that the
    documented public-API error surfaces throw std::runtime_error
    (not std::exception, not silent failure) for: missing XML file,
    malformed XML, unknown set_param name, and the four mutators
    that reject calls while a session is active (set_param,
    clear_param_overrides, set_molecule_limit,
    set_block_same_complex_binding). Previously these paths
    existed in simulator.cpp but were only exercised indirectly by
    the corpus parity tests.

  • harness/perf_diff.py — diffs per-model wall-time between two
    feature_coverage_report.md files. Sorts by absolute Δ%, flags
    changes beyond ±15% as SLOWER / FASTER, and marks models present on
    only one side as NEW / GONE. The companion .github/workflows/perf-diff.yml
    runs the full feature_coverage benchmark on both PR base and HEAD
    on the same runner (controls hardware variance) and uploads the
    diff as an artifact. Not a hard gate — shared GitHub runners are
    noisy enough that single-model deltas of 30%+ come from
    neighbour-VM contention rather than real regressions.
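
The ±15% flagging rule in the perf_diff.py entry can be restated as a few lines of arithmetic. This is a sketch only: the real script is Python, and the Δ% formula (change relative to the base wall time), the inclusive threshold, and the names `delta_percent` and `flag_for` are all assumptions made for this example.

```cpp
#include <cassert>
#include <string>

// Hypothetical restatement of the flagging rule; the formula and boundary
// handling are assumptions, not taken from harness/perf_diff.py.
double delta_percent(double base_seconds, double head_seconds) {
  assert(base_seconds > 0.0);
  return (head_seconds - base_seconds) / base_seconds * 100.0;
}

std::string flag_for(double base_seconds, double head_seconds,
                     double threshold_percent = 15.0) {
  const double delta = delta_percent(base_seconds, head_seconds);
  if (delta >= threshold_percent) return "SLOWER";
  if (delta <= -threshold_percent) return "FASTER";
  return "";  // inside the band: not flagged
}
```

Under these assumptions a model that goes from 10 s to 12 s is flagged SLOWER, and one that goes from 10 s to 10.5 s is left unflagged.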

Changed

  • -Werror is on by default for the in-tree build, gated by
    RULEMONKEY_WARNINGS_AS_ERRORS=ON. Default ON so a stray warning
    shows up on the developer's machine before it lands in CI.
    Downstream consumers building RM as a subdirectory or against an
    installed package can opt out with
    -DRULEMONKEY_WARNINGS_AS_ERRORS=OFF if their toolchain flags
    things ours does not. Verified clean against AppleClang 17;
    CI exercises Linux clang and gcc.

  • CI asan job is now a Linux + macOS matrix. Same code, same
    compiler family (clang), but different stdlib (libstdc++ vs
    libc++) and different sanitizer-runtime image — exactly the
    divergence that hides UB on one platform and reveals it on the
    other. The CI step sets
    UBSAN_OPTIONS=print_stacktrace=1:halt_on_error=1 and
    ASAN_OPTIONS=detect_leaks=0 to keep diagnostic output uniform
    across platforms.