Releases: Automattic/newspack-nodes

v0.18.3

17 Jun 11:21

Fixed

  • The Topology dashboard resolves a Topic's {partition} template instead of rendering the literal token. The graph view recognized only the <partition> (angle) token, so a multi-partition Topic vertex (firehose.p{partition} — e.g. the aggregator's hub fan-in) showed the literal token with "No segments" rather than grouping into one firehose entity with its concrete per-partition rows. It now matches both the <partition> and {partition} tokens.
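
As a rough illustration of the token fix above, the sketch below resolves a template written with either spelling and groups concrete partition names under it. The helper names (`templateMatcher`, `groupPartitions`) and the sample directory names are hypothetical; they are not taken from the newspack-nodes source.

```javascript
// Hypothetical helpers, for illustration only.
// Matches either placeholder spelling: <partition> or {partition}.
const PARTITION_TOKEN = /<partition>|\{partition\}/;

// Escape regex metacharacters in a literal string.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Turn a template such as "firehose.p{partition}" into a matcher that
// captures the concrete partition number. Returns null when the template
// does not contain exactly one partition token.
function templateMatcher(template) {
  const parts = template.split(PARTITION_TOKEN);
  if (parts.length !== 2) {
    return null;
  }
  return new RegExp(
    '^' + escapeRegExp(parts[0]) + '(\\d+)' + escapeRegExp(parts[1]) + '$'
  );
}

// Collect the concrete names that resolve from the template, so they can
// be shown as per-partition rows of one entity.
function groupPartitions(template, names) {
  const matcher = templateMatcher(template);
  if (!matcher) {
    return [];
  }
  return names
    .map((name) => matcher.exec(name))
    .filter(Boolean)
    .map((match) => ({ name: match[0], partition: Number(match[1]) }));
}
```

With this shape, `firehose.p<partition>` and `firehose.p{partition}` produce the same matcher, which is the behavior the fix describes.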

Removed

  • Dead Topology_Registry::basenames_for() (and its test) — superseded by the layout-agnostic resolved_resource_dirs() (which the GC and dashboard already use). It had no production caller; the <partition>-only, Partition-only regex was a stale parser.

v0.18.2

17 Jun 10:36

Changed

  • Command_Interpreter_Node now logs unauthorized commands instead of failing silently. A rejected command emits a rate-limited WARNING: unauthorized: <verb> - TM_COMMAND from: <path> payload: ... audit line (via drop_message). This surfaces cross-session REPL/IPC traffic — e.g. a cmd issued in one pivoted session reaching another session's interpreter, where the auth gate correctly refuses it. Rejection behavior is unchanged (still refused with a TM_COMMAND|TM_ERROR reply); this only adds the audit log.
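
One plausible way to rate-limit an audit line like the one above is a per-key time window. The sketch below is only an illustration: `makeRateLimitedLogger`, the window length, and the choice of key are assumptions, and the real `drop_message` behavior is not described in these notes.

```javascript
// Hypothetical rate limiter: emits at most one line per key per window.
// `now` is injectable so the behavior can be exercised without waiting.
function makeRateLimitedLogger(windowMs, sink, now = Date.now) {
  const lastLogged = new Map(); // key -> timestamp of the last emitted line
  return function log(key, line) {
    const t = now();
    const previous = lastLogged.get(key);
    if (previous !== undefined && t - previous < windowMs) {
      return false; // suppressed: this key already logged inside the window
    }
    lastLogged.set(key, t);
    sink(line);
    return true;
  };
}

// Usage with a fake clock and an in-memory sink.
let clock = 0;
const lines = [];
const warn = makeRateLimitedLogger(1000, (line) => lines.push(line), () => clock);
warn('ls', 'WARNING: unauthorized: ls'); // emitted
clock = 500;
warn('ls', 'WARNING: unauthorized: ls'); // suppressed, still inside the window
clock = 1500;
warn('ls', 'WARNING: unauthorized: ls'); // emitted again
```

The rejection itself is independent of the logger: a suppressed log line does not change whether the command is refused.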

v0.18.1

17 Jun 10:10

Fixed

  • wp nodes cli no longer dumps other sessions' IPC (including browser traffic) into a pivoted REPL. The reply path read the shared worker output partition with a Consumer targeting _output, and Consumer::forward_line() unconditionally overwrites each emitted message's TO with its target — so every session's replies were rewritten to _output and the Dumper's per-PID to_filter could no longer drop the ones addressed elsewhere. The output-IPC consumer now sinks through a plain Node, which stamps TO from its target only when TO is empty (soft route), so each reply keeps its own TO and other sessions are filtered out. A regression test pins the Consumer (unconditional) vs Node (soft) TO behavior.
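
The difference between the two stamping behaviors can be shown with a small model. This is a sketch under assumptions: messages are plain objects with a `to` field, and `stampHard`, `stampSoft`, and `toFilter` are hypothetical stand-ins for the Consumer, Node, and Dumper behavior described above.

```javascript
// Unconditional stamping (the Consumer behavior described above):
// every message's TO is overwritten with the target.
function stampHard(message, target) {
  return { ...message, to: target };
}

// Soft routing (the plain Node behavior described above):
// TO is filled from the target only when it is empty.
function stampSoft(message, target) {
  return message.to ? message : { ...message, to: target };
}

// A per-session filter can only work if each reply keeps its own TO.
function toFilter(messages, mine) {
  return messages.filter((message) => message.to === mine);
}

const replies = [
  { to: 'session-a', value: 'reply for a' },
  { to: 'session-b', value: 'reply for b' },
  { to: '', value: 'unaddressed' },
];

const hard = replies.map((m) => stampHard(m, '_output'));
const soft = replies.map((m) => stampSoft(m, '_output'));
```

After hard stamping every reply is addressed to `_output`, so a filter on `session-a` has nothing left to distinguish. After soft stamping only the unaddressed message is rewritten.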

Security

  • Pinned the @babel/core (≤7.29.0) and js-yaml (≤4.1.1) transitive dev dependencies to patched versions (^7.29.6 / ^4.2.0) via overrides, clearing the Dependabot advisories that had no auto-fix PR.

Changed

  • Release CI now builds on Node 24 / npm 11 (was Node 20 / npm 10), matching contributors' toolchain so committed lockfiles stay in sync.

v0.18.0

17 Jun 07:49

Added

  • Consumer gains an opt-in line_mode (config verb set_line_mode) that forwards one line per poll instead of a whole read-block. A Consumer whose sink does heavy per-message work (e.g. an LLM enrich) would otherwise drain a full 64 KB block in one fire(), freezing the worker's heartbeat through the burst; line mode spreads it across drain cycles so the worker stays live. Enable it in a topology before the first poll: cmd <consumer>:config set_line_mode. Default (batch) behavior is unchanged — poll_active still pipelines a read every tick; line mode reads only once the buffer is dry of complete lines. Internally drain_buffer() is one offset-scanning pass capped by line_mode (1 line) or unbounded (batch), a single O(n) scan that also fixes a latent batch-mode cursor drift on an empty (\n\n) line (the old rtrim+explode dropped the empty line's byte, mis-aligning the next read), and forward_line() is the per-line emit seam.
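
The single-pass scan described above can be sketched as follows. This is a JavaScript approximation, not the PHP `drain_buffer()`: the name `drainBuffer`, the return shape, and the use of string characters instead of bytes are all assumptions made for illustration.

```javascript
// Scan `buffer` for complete "\n"-terminated lines and emit at most
// `maxLines` of them: Infinity models batch mode, 1 models line mode.
// `consumed` is how far the cursor should advance. It counts every
// terminator, including the one that ends an empty line.
// Note: this counts string characters; a byte-accurate version would scan
// a byte buffer instead.
function drainBuffer(buffer, maxLines = Infinity) {
  const lines = [];
  let offset = 0;
  while (lines.length < maxLines) {
    const newline = buffer.indexOf('\n', offset);
    if (newline === -1) {
      break; // only a partial line remains; wait for the next read
    }
    lines.push(buffer.slice(offset, newline)); // an empty line is kept as ''
    offset = newline + 1;
  }
  return { lines, consumed: offset, rest: buffer.slice(offset) };
}

const batch = drainBuffer('a\n\nb\npart');
const single = drainBuffer('a\n\nb\n', 1);
```

Because the offset advances past each terminator as it is found, an empty line costs exactly one position, which avoids the kind of cursor drift a trim-and-split approach can introduce.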

Changed

  • Tail is line-only and overrides just the forward_line() emit seam, reusing Consumer's buffer/cursor scan. Tail's duplicated drain loop is gone — it customizes only how a line is emitted (raw TM_BYTESTREAM bytes vs an unpacked Message). The unused block-buffered / binary buffer_modes (set by no .tsl — only tests) are removed, and a Tail now supports line_mode for free. The inherited overflow guard logs the real runtime node class (Tail: / Consumer:).

Changed

  • arguments() parsing is centralized in Schema_Reflection::parse_schema_args() — a missing token takes the arg's schema default, or throws if the arg is required. This retires ADR-11's empty-string short-circuit (the recurring footgun where every config-bearing arguments() override had to mirror if ( '' === $args ) return; or derive filesystem-root junk like /p0). The parser now records the raw string into $this->arguments (so dump_config() still round-trips) and is the single source of truth for defaults. Behavior change: a bare make_node <Type> <name> of a node with a required arg (Partition dir, Consumer source_dir, Topic dir_template, Hook hook_name) now throws at construction (fail fast and loud) instead of yielding an unconfigured node that derives garbage. Topic's num_partitions moved from required to default 1 (belt-and-suspenders behind the usual <config:num_partitions> token). ADR-11 and its AGENTS.md row are updated to match.

  • Service CIs are no longer draggable in the topology console palette, but keep their inspector verb buttons. Service CIs are mounted into the request graph (never make_node'd), so dragging one from the palette only mints a stray duplicate. The palette now skips the Service category, while the class stays in the Classes_CI catalog — so selecting a mounted Service CI in the canvas still renders its command/request buttons in the inspector (which resolves verbs via catalog.find( shell_name ), independent of the palette). This is the palette-only hide that category: 'Hidden' couldn't give (Hidden drops the class from the catalog entirely, killing the inspector buttons too).
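
To make the centralized argument parsing from the first entry in this list concrete, here is a minimal positional parser with the same default-or-throw rule. It is a sketch only: `parseArgsSketch` and the schema shape (`name`, `required`, `default`) are hypothetical and do not reproduce `Schema_Reflection::parse_schema_args()`.

```javascript
// Walk the schema positionally. A present token is used as-is; a missing
// token takes the schema default, or throws when the arg is required.
// The raw string is returned alongside so a config dump could round-trip it.
function parseArgsSketch(schema, raw) {
  const trimmed = raw.trim();
  const tokens = trimmed === '' ? [] : trimmed.split(/\s+/);
  const parsed = {};
  schema.forEach((arg, index) => {
    if (index < tokens.length) {
      parsed[arg.name] = tokens[index];
    } else if (arg.required) {
      throw new Error(`missing required argument: ${arg.name}`);
    } else {
      parsed[arg.name] = arg.default;
    }
  });
  return { raw, parsed };
}

// Assumed schema, loosely modelled on the Topic example in the note:
// dir_template is required, num_partitions defaults to 1.
const topicSchema = [
  { name: 'dir_template', required: true },
  { name: 'num_partitions', required: false, default: 1 },
];

const topic = parseArgsSketch(topicSchema, 'firehose.p{partition}');
```

With this rule an empty argument string needs no special case: it simply yields zero tokens, so a required arg fails at construction.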

Added

  • The build kit externalizes @wordpress/blocks and @wordpress/block-library. Added both to the kit's WP_EXTERNALS allow-list (global: window.wp.blocks / window.wp.blockLibrary, handles wp-blocks / wp-block-library), so a dashboard bundle that uses the editor's block APIs (e.g. pasteHandler, registerCoreBlocks) reuses WordPress's own enqueued registry instead of bundling a duplicate. Additive — only affects bundles that import those packages.

  • The DevTools hub tabs are deep-linkable, and Raw Logs can deep-link a selected log. The hub page slug is now newspack-nodes-hub (was newspack-nodes-topology; Admin::TOPOLOGY_MENU_SLUG → Admin::HUB_MENU_SLUG), and each hub tab has a distinct URL: ?page=newspack-nodes-hub&tab=console | &tab=topologies | &tab=raw-logs. DevtoolsTabHost gained an opt-in syncUrl prop (the hub passes it; the floating overlay and every other consumer stay URL-free): it reads the initial tab from ?tab=<slug> and mirrors the active tab back via history.replaceState (no reload, no back-button spam), preserving sibling params. Tab descriptors gained an optional slug (defaults to the tab id). Raw Logs reads ?log=<name> once on its first non-empty catalog to seed the selection (never clobbering a later user pick) and writes ?log= when you choose a log. The Topology Manager's console links now point at ?page=newspack-nodes-hub&tab=console&topology=<name>. No backward-compat redirect for the old slug. Each tab declares the query param it owns (Console topology, Raw Logs log); switching tabs drops the other tabs' params, so the URL only ever carries the active tab's.
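
The URL rule for hub tabs (keep sibling params, carry only the active tab's owned param) can be sketched with `URLSearchParams`. The function `tabUrlSearch`, the `OWNED_PARAMS` map, and the sample values `main` and `jobs` are hypothetical; only the param names and tab slugs come from the note.

```javascript
// Assumed map of tab slug -> the query param that tab owns.
const OWNED_PARAMS = { console: 'topology', 'raw-logs': 'log' };

// Build the query string for the newly active tab. Sibling params such as
// `page` are preserved; params owned by other tabs are dropped.
function tabUrlSearch(currentSearch, activeTab, ownedValue) {
  const params = new URLSearchParams(currentSearch);
  params.set('tab', activeTab);
  for (const [tab, param] of Object.entries(OWNED_PARAMS)) {
    if (tab !== activeTab) {
      params.delete(param);
    }
  }
  const own = OWNED_PARAMS[activeTab];
  if (own && ownedValue !== undefined) {
    params.set(own, ownedValue);
  }
  return '?' + params.toString();
}

const next = tabUrlSearch(
  '?page=newspack-nodes-hub&tab=console&topology=main',
  'raw-logs',
  'jobs'
);
// In a browser the result would be mirrored without a reload or a new
// history entry: history.replaceState(null, '', next);
```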

Fixed

  • Topology Manager headings align, and the redundant per-partition "stalled" pill is gone. The per-partition stalled pill duplicated both the partition badge and the right-hand health rollup (three "stalled"s); it's removed, leaving the health rollup as the single stalled indicator. The per-partition cluster and the health field now have fixed-width slots so the badges line up across rows.

  • Node::set_state() no longer emits a PHP 8.4 deprecation. Its $payload parameter was implicitly nullable (string $payload = null); it's now explicitly ?string $payload = null.

  • The debug overlay behaves when you switch DevTools hub tabs. Each hub tab now persists its OWN overlay canvas layout — the node-position key is per-tab (…:hub:<tab>) instead of one shared …:hub:local key that loaded one tab's positions as garbage anchors on every other tab (this happened even with the overlay closed during the switch). And the overlay's graph follows the active tab on a switch instead of collapsing to _output or freezing on the previous tab. The destructive auto-resync (a resetGraph() on the mount bump that tore down the freshly-mounted tab's nodes) is removed; the build-delegated mount bump is kept but now only makes the overlay rebuild its _metadata poll on the fresh backbone — the previous code left that poll hitchhiking the torn-down _router, so the graph froze on the old tab (clicking Reset Graph + Reset Layout was the manual workaround). The window position stays global (only node positions are per-tab); the manual chips still work.

  • Switching to the Console tab with the debug overlay open no longer white-screens. It threw node name collision: _output already registered: the Console registers a _output node in its mount effect, while the overlay (gated off the Console tab, but via a one-commit-late useEffect) still held its own render-registered _output. DevtoolsTabHost's tab button now reports the tab change to the host SYNCHRONOUSLY (in the click handler, not only the follow-up effect), so React batches it: the overlay unmounts in the SAME commit the Console mounts, and React runs the overlay's _output-removal (passive cleanup) before the Console's _output-registration (passive effect) — they never coexist.

  • Saving the Nodes Runtime settings no longer deactivates every topology. The active-topology set (newspack_nodes_topologies) was a registered settings-group option that the settings page never rendered (the Topology Manager is the activation UI). WordPress's options.php sanitizes every registered option in the group from $_POST on Save, and an absent one ran through sanitize_topologies(null) → [], wiping the active set on every Save. The field is now overlay-only (ui: false): still loaded + autoloaded for the per-request config overlay, but outside the settings group and the reset surface, so Save (and Reset) can't touch it. The activate/deactivate verbs + Supervisor::check_config keep the conflict protection the form sanitizer used to add, so the now-obsolete Admin::sanitize_topologies is removed.

  • Deactivating ALL topologies at once now stops every running worker instead of leaving them up for ~10 more minutes. When the operator deactivates the whole fleet (e.g. toggling every topology off in the Topology Manager), check_config() short-circuited on the empty active set before reconcile_lock_dirs() — the established stop path — so no worker ever got a restart flag and each self-respawned until its lock aged out. The supervisor now drains every worker lock dir (*.p<N>.lock.d, leaving supervisor.lock.d untouched) before exiting when it previously had an active fleet. Cold start (nothing was ever active) still exits quietly and drops no flags.

  • The Topology Manager's supervisor restart button matches the per-topology restart buttons. It shared the worker-tree's worker-restart-btn style (smaller, muted, different shape) while sitting directly above the per-topology controls; it now uses the same nodes-tm__restart style, and the now-unused .worker-restart-btn rule is removed.

  • Topology names get a fixed-width label column so their status pills line up. The per-topology heading laid the name and pills in a flex row with no reserved label width, so the P0/P1/badge pills started at a different x on every row (name length drove it); a min-width on .nodes-tm__name aligns them into a consistent column.

  • The example-ai-newsletter demo now reads its own example-scored log, isolated from a real plugin. Its topology wrote the scored partition/consumer to the bare scored.p* path and Insights_CI_Demo read the same — but that path is substrate-global, so the demo dashboard showed whatever real plugin's scored log happened to be populated (e.g. the product newspack-ai-newsletter's), not its own deterministic data. The demo's scored partition, consumer offsets, and the demo CI's glob are now namespaced to example-scored.p* (its digest.md was already namespaced), so the example is self-contained.

  • **SSE ingress rejects malformed typel...
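
For the fleet-wide deactivation fix in this list, the selection of worker lock dirs can be sketched as a name filter. This assumes the `*.p<N>.lock.d` pattern quoted above maps to the regex below; `workerLockDirs` and the sample names are hypothetical, and what draining a lock dir actually does is not modelled.

```javascript
// Worker lock dirs end in ".p<N>.lock.d"; supervisor.lock.d does not match
// because it has no ".p<N>" segment.
const WORKER_LOCK = /\.p\d+\.lock\.d$/;

function workerLockDirs(names) {
  return names.filter((name) => WORKER_LOCK.test(name));
}

const selected = workerLockDirs([
  'main.p0.lock.d',
  'main.p12.lock.d',
  'supervisor.lock.d',
  'main.p0.log',
]);
```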

v0.17.0

12 Jun 17:29

Added

  • ESLint rule banning the deprecated isSmall Button prop. react/forbid-component-props rejects isSmall (deprecated in WP 6.2 for size="small") at the JSX-attribute level. This repo has no current usage, but it owns the canonical shared JS the sibling plugins inline via @newspack-nodes/shared; the same rule now guards all three repos so the family can't drift back.

  • Live-mode targets editor in the topology console + debug overlay. The edit-mode targets UI (target chips with a clearable × + "+ add target…" dropdown) now renders in the Inspector's live/view branch too, so an operator can disconnect a running node's targets — previously only live connect (via canvas drag) was possible. The chip × dispatches a runtime disconnect_node <node> <target>; the dropdown reuses the existing connect_node. Edits mutate the running node's target only — no .tsl persistence (that stays edit-mode's job). The editor reads the node's full uncollapsed targets (not the headOf-collapsed / registration-polluted graph edges) so a path target disconnects by its exact value, and is hidden for reserved nodes. New useGraphHandlers.onRemoveEdge (mirrors onConnect); no new runtime verbs.

  • send_struct shell builtin (PHP Shell_Node + JS ShellNode). send_struct <path> <json> sends a TM_STRUCT message whose VALUE is the parsed JSON to <path> — the structured-data counterpart to send (which sends a TM_BYTESTREAM string). Single-quote the JSON so the quote-aware tokenizer keeps it as one token (send_struct echo '{ "foo": 23, "bar": 42 }'), exactly as Tachikoma's send_hash did. Invalid JSON surfaces the decoder's error and sends nothing (the builtin runs in the Shell, before the command-interpreter's central catch, so it reports the error itself). Renamed from Tachikoma's send_hash/TM_STORABLE to send_struct/TM_STRUCT to match our wire vocabulary.
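
The quoting advice for `send_struct` can be illustrated with a small tokenizer. This is a sketch, not the Shell's tokenizer: `tokenize` and `parseSendStruct` are hypothetical, only single quotes are handled, and the resulting object stands in for a TM_STRUCT message whose real layout is not shown here.

```javascript
// Minimal quote-aware tokenizer: whitespace separates tokens, and a
// single-quoted span is kept as one token with the quotes stripped.
function tokenize(line) {
  const tokens = [];
  const pattern = /'([^']*)'|(\S+)/g;
  let match;
  while ((match = pattern.exec(line)) !== null) {
    tokens.push(match[1] !== undefined ? match[1] : match[2]);
  }
  return tokens;
}

// Parse "send_struct <path> <json>". Invalid JSON makes JSON.parse throw,
// so nothing is built and the caller can report the decoder's error.
function parseSendStruct(line) {
  const [verb, path, json] = tokenize(line);
  if (verb !== 'send_struct' || path === undefined || json === undefined) {
    throw new Error('usage: send_struct <path> <json>');
  }
  return { path, value: JSON.parse(json) };
}

const parsedStruct = parseSendStruct(`send_struct echo '{ "foo": 23, "bar": 42 }'`);
```

Without the single quotes the JSON would split on its spaces into several tokens, which is why the note recommends quoting it.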

Changed

  • JS base Node arguments is the trivial Tachikoma getter/setter, mirroring PHP. The base set arguments now only stores the raw string — it no longer auto-walks nodeSchema().arguments onto declared properties. The positional walk moved to an exported parseSchemaArgs( node, args ) helper (the JS analog of PHP's Schema_Reflection::parse_schema_args), which a node opts into from its own setter override; SseConnectorNode now does so explicitly. This matches PHP Node::arguments() (trivial) + the Schema_Reflection trait, closing a JS/PHP fidelity gap. An empty arguments string is now a no-op (PHP-faithful) rather than applying schema defaults; Timer_Node is unaffected (its interval_ms comes from setTimer(), not the walk).
  • Hook_Node passes the message VALUE to the WordPress hook, not the whole positional envelope. do_action() / apply_filters() now receive message[VALUE] (the payload) instead of the 7-field message array, so hook callbacks work with the data directly. In filter mode the apply_filters() return is always adopted as the new VALUE, and TYPE is set from its shape: a list-array return marks the message TM_STRUCT, anything else TM_BYTESTREAM (the prior "drop a non-list return and forward the previous message with a warning" behavior is gone). Hook_Node also forwards via parent::fill() now, so it participates in the uniform target→TO stamping contract. Callbacks registered on a Hook_Node's hook must take/return the payload, not a [TYPE, …, VALUE] array.
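
The filter-mode rule for Hook_Node (adopt the return as VALUE, derive TYPE from its shape) might look like this in JavaScript. It is a loose analog: `applyFilterToMessage` is hypothetical, the string type tags are placeholders for the runtime's numeric constants, and `Array.isArray` is used as a stand-in for PHP's list-array check.

```javascript
// Placeholder type tags; the runtime's real TM_* values are numeric and
// are not listed in these notes.
const TM_BYTESTREAM = 'TM_BYTESTREAM';
const TM_STRUCT = 'TM_STRUCT';

// The callback sees only the payload, and whatever it returns becomes the
// new VALUE. TYPE follows the shape of that return.
function applyFilterToMessage(message, filter) {
  const value = filter(message.value);
  return {
    ...message,
    value,
    type: Array.isArray(value) ? TM_STRUCT : TM_BYTESTREAM,
  };
}

const original = { type: TM_BYTESTREAM, to: 'next', value: 'a,b,c' };
const asList = applyFilterToMessage(original, (payload) => payload.split(','));
const asText = applyFilterToMessage(original, (payload) => payload.toUpperCase());
```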

Fixed

  • SseConnectorNode drops frames from a closed (or reopened-past) stream. The msg/heartbeat listeners now bail when their EventSource is no longer the connector's current one, so a late frame delivered after close() — a teardown race, or a test double that retains listeners — never reaches the torn-down sink. Without the guard the forwarded frame hit a sink-less node and fill() threw (Node.fill requires a wired sink); graph teardown is now race-safe.
  • JS runtime missing TM_NOREPLY reply-control flag (PHP/JS fidelity drift). The browser substrate's message.js lacked TM_NOREPLY = 512 and the JS ShellNode / CommandInterpreterNode had no want_reply machinery — so the JS side could never suppress a reply the way the PHP runtime does for script/topology-load commands. Ported the PHP behavior: message.js now exports TM_NOREPLY = 512; ShellNode gains a wantReply() accessor (default true) and a stampNoreply() step that ORs TM_NOREPLY onto built TM_COMMAND messages (in parse() and sendCommand()) when reply is unwanted; CommandInterpreterNode._respond() now suppresses the routed reply for a TM_NOREPLY command, surfacing only an error to stderr — mirroring PHP Command_Interpreter_Node::interpret().
  • Hook_Node::fill() counted each message twice. The switch to parent::fill() (which increments counter) left a redundant local ++$this->counter; removed so the node counts each message once.
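
The TM_NOREPLY flag handling in this list reduces to two bit operations. In the sketch below only `TM_NOREPLY = 512` comes from the note; the `TM_COMMAND` value and the helper names `withNoreply` and `shouldReply` are assumptions for illustration.

```javascript
const TM_NOREPLY = 512; // value stated in the release note
const TM_COMMAND = 8; // ASSUMED placeholder bit, not taken from the source

// OR the flag onto the type when the sender does not want a reply.
function withNoreply(type, wantReply) {
  return wantReply ? type : type | TM_NOREPLY;
}

// The interpreter side: reply only when the flag is absent.
function shouldReply(type) {
  return (type & TM_NOREPLY) === 0;
}

const quiet = withNoreply(TM_COMMAND, false);
const chatty = withNoreply(TM_COMMAND, true);
```

Because the flag is a separate bit, stamping it leaves the command bit intact, so the receiver can still recognize the message as a command.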

v0.16.2

12 Jun 09:06

Changed

  • SseConnectorNode (_sse) now tracks stream liveness, and the Raw Logs dashboard reads it for "Xs ago". The connector — the one node that sees every inbound frame — stamps a public lastEventTime on each msg AND on the server's idle heartbeat event (snooped, not routed, so the topology-console transcript is unaffected), and clears it on close(). The Raw Logs viewer sources its "Xs ago" staleness from _sse.lastEventTime instead of row arrivals, so an idle-but-healthy stream resets the counter on each heartbeat instead of climbing like a dead connection; a real drop (no heartbeats) leaves it frozen and "ago" climbs as the intended warning. Sibling application dashboards (event-logger-nodes) read the same _sse.lastEventTime.
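
The liveness rule above (stamp on messages and heartbeats, clear on close) can be modelled with a tiny tracker. `LivenessTracker` and its method names are hypothetical; only the `lastEventTime` property name comes from the note, and the injected clock exists so the behavior can be checked without waiting.

```javascript
class LivenessTracker {
  constructor(now = Date.now) {
    this.now = now;
    this.lastEventTime = null;
  }

  // Both real frames and idle heartbeats prove the stream is alive.
  onMessage() {
    this.lastEventTime = this.now();
  }

  onHeartbeat() {
    this.lastEventTime = this.now();
  }

  close() {
    this.lastEventTime = null;
  }

  // Whole seconds since the last sign of life, or null when unknown.
  secondsAgo() {
    if (this.lastEventTime === null) {
      return null;
    }
    return Math.floor((this.now() - this.lastEventTime) / 1000);
  }
}

let fakeTime = 0;
const tracker = new LivenessTracker(() => fakeTime);
tracker.onMessage();
fakeTime = 9000; // nine quiet seconds
const beforeHeartbeat = tracker.secondsAgo();
tracker.onHeartbeat();
const afterHeartbeat = tracker.secondsAgo();
```

An idle but healthy stream keeps resetting the counter through heartbeats, while a dropped one stops stamping and the value keeps climbing.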

v0.16.1

12 Jun 08:03

Changed

  • Job_Worker_Node dispatches on the entry-level k field, not type. jobs.log / jobintake.log entries carry the job kind (job | remote_job) under k, the same firehose category field Job_Intake writes verbatim, Job_Router carries through, and the hub's Stream_Merger rewrites job → remote_job, so the executor reads k to pick the local vs. remote handler map. This makes the kind field uniform end-to-end (firehose category → jobintake → jobs.log → worker) with no rename at any hop, and fixes jobintake-sourced jobs being silently dropped when a topology wires jobintake:consumer straight to jobs:partition (bypassing the router, e.g. event-logger-nodes' combined topology). Any plugin that hand-builds a job entry must key the kind as k, not type.
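
A dispatch keyed on `k` could be sketched like this. Everything except the `k` field and its two kinds (`job`, `remote_job`) is an assumption: `dispatchJob`, the `name` field, the handler names, and the choice to throw on an unknown kind are illustrative only.

```javascript
const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Pick the handler map from the entry's `k` field, then the handler from a
// hypothetical `name` field. Throwing on an unknown kind is an assumption
// made here so a mis-keyed entry is visible instead of silently dropped.
function dispatchJob(entry, handlers) {
  if (!hasOwn(handlers, entry.k)) {
    throw new Error(`unknown job kind: ${String(entry.k)}`);
  }
  const map = handlers[entry.k];
  if (!hasOwn(map, entry.name)) {
    throw new Error(`no ${entry.k} handler for: ${String(entry.name)}`);
  }
  return map[entry.name](entry);
}

const jobHandlers = {
  job: { warm: () => 'ran locally' },
  remote_job: { warm: () => 'ran remotely' },
};

const localResult = dispatchJob({ k: 'job', name: 'warm' }, jobHandlers);
const remoteResult = dispatchJob({ k: 'remote_job', name: 'warm' }, jobHandlers);
```

An entry hand-built with `type: 'job'` and no `k` would fail the first check here, which mirrors the note's warning to plugin authors.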

v0.16.0

12 Jun 07:02

Added

  • Partition_Node::void_warranty() — lifts the 4 KB PIPE_BUF write cap WITHOUT acquiring the per-partition exclusivity lock; the no-lock sibling of allow_large_writes(). The caller asserts it is the partition's sole writer (e.g. a worker that already owns the topology lock); concurrent writers + this = silent torn-write corruption, so the name is deliberately alarming. dump_config round-trips the distinct verb. The lock-acquiring allow_large_writes() remains for cross-process write targets.
  • Consumer_Node snapshots a named node's state into the offsetlog (Tachikoma's cache_type=snapshot). set_snapshot_node() — a :config verb — co-commits {offset, cache} as ONE record on each checkpoint, taking the node's duck-typed save_state(), and restores it via restore_state() on respawn. So a stateful node like the request-builder resumes its in-flight cache aligned with the resumed read offset — no separate state file, no offset/cache drift. The offsetlog is void_warranty'd (the worker is its sole writer); a missing snapshot node logs loudly rather than dropping state silently.
  • Topology_Registry::find_conflicts() / write_set() — detect when two enabled topologies would write the same file (a data partition or a Consumer offsetlog) and corrupt it. The write-set is parsed from make_node Partition / make_node Topic paths (both append to the log at their path arg) + make_node Consumer offsetlog args; conflicts are reported as topology pairs + the shared resource.
  • Write-conflict enforcement at both gates. Admin::sanitize_topologies() rejects the whole topology selection when the enabled set has a write-conflict — it raises a settings error naming the conflicting pairs and keeps the previously-saved set rather than persisting a config that corrupts its own logs. The supervisor re-checks the active set in check_config() and refuses to spawn any worker (exits loudly; the cron retries each minute) when the set conflicts — a second line of defense for a config-FILE override that bypasses the admin UI. Together these replace the cross-process protection the per-partition allow_large_writes lock used to provide, letting worker-output partitions drop the lock for void_warranty.
  • Registration edges on the topology canvas. dump_metadata now emits a per-node registrations field — the node-name register() listeners, keyed by event — via the new Node::registered_listeners() accessor, and parseMetadata turns each into a dashed, informational canvas edge from emitter to listener (SchematicCanvas is-registration: dotted sage, event-name hover tooltip, no edit-mode hit-target). Event wiring that was previously invisible is now drawn. Edges appear only between two visible nodes: a registration whose emitter is hidden scaffolding (e.g. every Timer's TIMER subscription to _router) draws nothing, since parseMetadata already skips scaffolding. Both producers — PHP cmd_dump_metadata and JS dumpMetadataPayload — emit the field only when non-empty, keeping them byte-identical (PHP [] vs JS {}).
  • register / unregister REPL verbs (both the worker-tier and in-browser interpreters) — Tachikoma's register <source name> <target name> <event> / unregister <source name> <target name> <event>. register wires target as a node-name listener for event on source (source->register( event, target )); unregister drops it. The means to create the registration edges live from the console — the siblings of connect_node / disconnect_node. Validation matches the reference (source + target must exist for register; unregister skips the target check so a vanished target can still be cleared), and registering an event the source hasn't declared surfaces as a command error.
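
The write-conflict detection introduced in this list boils down to finding resources with more than one writer. The sketch below assumes the write-sets have already been extracted; `findConflicts` here is a hypothetical JavaScript analog, and the topology and path names are invented for the example.

```javascript
// writeSets: Map of topology name -> array of paths that topology writes.
// Returns one record per conflicting topology pair and shared resource.
function findConflicts(writeSets) {
  const writers = new Map(); // path -> topologies that write it
  for (const [topology, paths] of writeSets) {
    for (const path of new Set(paths)) {
      if (!writers.has(path)) {
        writers.set(path, []);
      }
      writers.get(path).push(topology);
    }
  }
  const conflicts = [];
  for (const [resource, topologies] of writers) {
    for (let i = 0; i < topologies.length; i++) {
      for (let j = i + 1; j < topologies.length; j++) {
        conflicts.push({ pair: [topologies[i], topologies[j]], resource });
      }
    }
  }
  return conflicts;
}

const conflicts = findConflicts(
  new Map([
    ['alpha', ['events.p0', 'alpha.offsetlog']],
    ['beta', ['events.p0', 'beta.offsetlog']],
    ['gamma', ['metrics.p0']],
  ])
);
```

A gate built on this would refuse the whole selection when the result is non-empty, as both the admin and supervisor checks described above do.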

Changed

  • Timer scheduling state moved onto Timer_Node itself. A timer now carries its own interval_ms / oneshot / next_fire / active / fire_count as public properties, and Event_Framework::set_timer() takes just the node (set_timer( Timer_Node $node )) — the framework reads the cadence off the node and stamps the first fire. Timer_Node::set_key() is folded into a key() getter/setter, and is_active() / fire_count() are dropped in favor of the public $active / $fire_count. Partition's heartbeat timer arms via arguments() + key() to match. The Router-hitchhike preconditions stay fail-loud: no-arg set_timer() throws if the timer is unnamed or no _router is present (now covered by tests asserting each message).
  • SSE drain test seam is the check_slot Closure, not set_test_mode() / set_test_iterations(). Those helpers — and the bounded-iteration counter inside run_stream_loop() — are removed; tests bound the loop by assigning a counting closure to the production slot-liveness seam SSE_Out_Node::$check_slot. Production behavior is unchanged: connection_aborted() plus the real slot check still terminate the stream.
  • Consumer_Node reads one block per poll (drain-then-read), not all segments at once. poll() now dispatches through a $poll_cb function pointer (Tachikoma's $self->{fill}): each tick drains the already-buffered block, then reads at most one READ_BLOCK_BYTES (64 KB) block via get_batch(), then yields the event loop — mirroring Tachikoma fire() + Partition::process_get, so a worker draining a backlog can't monopolize the drain loop. publish_position() is throttled to once per PUBLISH_INTERVAL_S (1 s) instead of every tick. The internal $line_remainder became $buffer (read-ahead). The removed public MAX_POLL_BYTES constant is replaced by READ_BLOCK_BYTES; Reqgrep_Command (event-logger-nodes) carries its own READ_CHUNK_BYTES instead.
  • Classes_CI_Node extends Service_CI_Node like the other service interpreters. Its single list verb is now declared in node_schema() carrying its handler, and the base constructor derives the dispatch table — replacing the bespoke __construct() + command_table(). Behavior is unchanged (same catalog output); it just stops being the one service CI that hand-rolled its own command table.
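
The test seam described for the SSE drain loop (bounding the loop through the slot-liveness check itself) can be illustrated in a few lines. `runStreamLoop` and `countingSlot` below are hypothetical JavaScript stand-ins, and the production terminators such as `connection_aborted()` are not modelled.

```javascript
// The loop keeps going for as long as the liveness check says the slot is
// still held. It has no iteration counter of its own.
function runStreamLoop(stream) {
  let ticks = 0;
  while (stream.checkSlot()) {
    stream.tick();
    ticks += 1;
  }
  return ticks;
}

// A test bounds the loop by swapping in a closure that reports "alive" a
// fixed number of times and then "gone".
function countingSlot(limit) {
  let calls = 0;
  return () => calls++ < limit;
}

const tickCount = runStreamLoop({ checkSlot: countingSlot(3), tick() {} });
```

The appeal of this design is that tests exercise the same exit path production uses, instead of a test-only mode flag.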

Fixed

  • Consumer snapshot restore is order-independent — a forward-referenced snapshot node no longer discards its cache on every restart. set_snapshot_node() now only records the name; the offsetlog read and restore_state() are deferred to the first poll() (a new poll_init phase) which runs inside the drain loop, after the whole topology graph is built. A per-node-serialized topology that emits set_snapshot_node request-builder before make_node request-builder previously logged "snapshot node missing or has no restore_state(); discarding restored cache" and dropped the request-builder's in-flight LRU cache on every worker recycle; the restore now finds the by-then-built node and resumes its state aligned with the resumed read offset. arguments() no longer does offsetlog I/O at construction; the IPC-input Consumer seeks next_offset('end') at build and poll_init overrides with the durable checkpoint when one exists (resume wins).

  • Topology console Connect↔Disconnect toggle flips again (and works for the in-browser JS tee). dump_metadata now stamps a reserved _header.pwd carrying the requesting session's reply pivot — the reverse_cwd, i.e. the inbound message FROM, which is the exact target a connect_node with no target stores on a Tee. Both tiers emit it: the worker's cmd_dump_metadata from the envelope FROM (full snapshot only), the in-browser dumpMetadataPayload as the bare _output. parseMetadata exposes it as graph.pwd (and preserves each node's FULL targets, since canvas edges head-collapse every session's pivot to one shared _repl), and the Inspector toggles on an exact node.targets.includes(parsed.pwd). Previously the matcher reconstructed a hardcoded path whose middle segment (_http) the v0.15.0 _Node rename changed to _output, so it never matched and the button was stuck on "Connect". Two further mismatches are reconciled: the pwd arrives ending in the POLLING node's reply segment (…/_metadata, since the canvas polls FROM _metadata) but a tail target ends in the shell's _output, so canonicalReplyPivot() forces the final segment to _output (used by both the matcher and the optimistic patch); and dumpMetadataPayload now reports the SHELL name (Tee, stripping _Node) like the worker does, so the in-browser tee's Connect button doesn't vanish on the next poll under a TeeNode class.

  • Live-graph gestures reflect immediately instead of waiting out the ~5s metadata poll. The Tee tail/disconnect optimistic patches were wrong — tail replaced the whole fan-out array with the string _output, disconnect cleared ALL of a Tee's edges. They now append/remove only this session's pwd in the array (preserving the Tee's other edges), the Trace button optimistically patches debug_state so it flips at once, and augmentWithVirtualEdges preserves the graph's pwd so the toggle survives verb-arg edge synthesis.

  • Timers initialize next_fire when armed. Event_Framework::set_timer() now stamps next_fire = now + interval_ms/1000. Without it a timer fired immediately on the first drain pass and the next-wait calculation went negative, busy-looping instead of sleeping the interval (an idle SSE stream emitted no heartbeats). Partition's heartbeat interval is computed with intdiv() so a non-÷3 stale_timeout can't produce a non-integer string the Timer argument validator would reject.

  • Node-name event dispatch delivers directly instead of re-routing through _router. notify()'s node-name path resolved the listener via Core::node( $listener ) and then ALSO stamped TO = listener, so the TM_INFO was re-routed through _router on top of the direct fill(). Across an SSE pivot that re-route lands on an endpoint where neither the listener n...
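
The timer fix in this list can be checked with simple arithmetic. The sketch assumes times are in seconds and intervals in milliseconds, as the `now + interval_ms/1000` formula suggests; `armTimer`, `secondsUntilFire`, and `heartbeatInterval` are hypothetical names, and dividing the stale timeout by 3 is inferred from the note's "non-÷3" wording.

```javascript
// Stamp the first fire time when the timer is armed.
function armTimer(timer, nowSeconds) {
  timer.next_fire = nowSeconds + timer.interval_ms / 1000;
  timer.active = true;
  return timer;
}

// How long the loop may sleep before this timer is due (never negative).
function secondsUntilFire(timer, nowSeconds) {
  return Math.max(0, timer.next_fire - nowSeconds);
}

// Integer division, in the spirit of PHP's intdiv(): the result stays an
// integer even when the timeout is not a multiple of 3.
function heartbeatInterval(staleTimeout) {
  return Math.trunc(staleTimeout / 3);
}

const timer = armTimer({ interval_ms: 5000 }, 100);
const wait = secondsUntilFire(timer, 100);
```

With `next_fire` stamped at arm time, the first wait is the full interval instead of zero, so the loop sleeps instead of spinning.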

v0.15.1

11 Jun 13:35

Added

  • Supervisor cron diagnostics now cover late schedule_event vetoes too. WordPress reports those as schedule_event_false, but by the time a late filter sees the failure the event object has been replaced by a falsy value; Nodes now remembers the supervisor event at the start of the schedule_event chain and logs the callback chain if a later callback vetoes it.

Changed

  • Substrate runtime wiring is no longer built at plugin-file scope. The node-class namespaces (for make_node), the <config:…> token namespace, the stock-topology dir, and the shared Core::$memd handle moved out of newspack-nodes.php file scope into the idempotent Bootstrap::ensure_runtime_wired(), called lazily from the entry points that actually use the node graph / cache: rest_api_init (priority 1, before route callbacks), the admin and WP-CLI blocks, and the supervisor cron tick. A plain frontend page view touches none of these, so it no longer autoloads the Config System / Command_Interpreter_Node / Topology_Registry or opens a \Memcached connection it never uses — cutting substrate plugin-load from ~1.6ms to ~0.24ms (the per-request hot path the v0.13.0 Config System had regressed). get_topology_catalog() self-wires, so the catalog can't be read partially built.
  • Removed the internal newspack_nodes/enable_supervisor filter (renamed from newspack_nodes/enable_logging), which only ever existed to support tests — no config field, no production caller. The supervisor enable gate is now the Bootstrap::$supervisor_enabled_override test seam with the same default-on behavior; is_logging_enabled() is renamed is_supervisor_enabled(). Construction of the Supervisor is injectable via the new Bootstrap::$supervisor_factory test seam.
  • Raise the declared PHP floor to 8.2, matching production syntax already used by the substrate (File_Writer trait constants, plus PHP 8.1 array_is_list() calls), and align the bundled example/plugin-writing guide with that floor.
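
The lazy, idempotent wiring described in the first entry of this list follows a common run-once pattern. The sketch below shows that pattern only: `once` and the entry-point comments are hypothetical and do not reflect how `Bootstrap::ensure_runtime_wired()` is implemented.

```javascript
// Wrap an expensive setup function so it runs at most once, no matter how
// many entry points call it. The flag is set before running, which also
// guards against re-entrant calls; a real version would need to decide
// what happens if the setup throws.
function once(wire) {
  let wired = false;
  return function ensure() {
    if (wired) {
      return false; // already wired: nothing to do
    }
    wired = true;
    wire();
    return true;
  };
}

let wireCount = 0;
const ensureRuntimeWired = once(() => {
  wireCount += 1; // stands in for registering namespaces, opening caches, etc.
});

ensureRuntimeWired(); // e.g. from a REST entry point
ensureRuntimeWired(); // e.g. from a later cron tick in the same process
```

A code path that never calls the wrapper pays nothing, which is the effect the note describes for plain frontend page views.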

v0.15.0

10 Jun 08:20

Security

  • dump_node now redacts credentials for every node, not just the one that remembered to. Node::dump_node() reflects every property, so any node holding a secret printed it raw to the REPL (dump_node my_node) and into logs — a credential-disclosure vector. Redaction was bolted onto a single Remote_Source_Node::dump_node() override (it scrubbed auth_password/auth_token); every other node was unprotected. The base now redacts any non-empty property whose name reads as a credential (password, passwd, secret, token, credential, api_key, private_key — deliberately not bare auth, so auth_username and authorize survive), replacing the value with [REDACTED]. The bespoke Remote_Source_Node override is removed (the base covers its fields). Empty secrets stay visible as '' so an operator can tell a credential is unset.
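
A name-based redaction pass like the one described above might be sketched as follows. The word list and the `[REDACTED]` marker come from the note; the function name `redact`, the substring matching, and the case-insensitive flag are assumptions.

```javascript
// Property names that read as credentials. Bare "auth" is deliberately
// absent, so auth_username and authorize are left alone.
const CREDENTIAL_NAME = /password|passwd|secret|token|credential|api_key|private_key/i;

// Replace non-empty credential-like values with a marker. Empty values
// stay visible so an operator can tell that a credential is unset.
function redact(props) {
  const out = {};
  for (const [name, value] of Object.entries(props)) {
    const isEmpty = value === '' || value === null || value === undefined;
    out[name] = CREDENTIAL_NAME.test(name) && !isEmpty ? '[REDACTED]' : value;
  }
  return out;
}

const dumped = redact({
  auth_username: 'alice',
  auth_password: 'hunter2',
  auth_token: '',
  authorize: true,
  api_key: 'abc123',
});
```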

Changed

  • Node sheds two god-object concerns into opt-in traits; arguments() is now the trivial Tachikoma getter/setter. Base Node::arguments() no longer walks node_schema()['arguments'] — it just stores and returns the raw string (if ( @_ ) { $self->{arguments} = shift }), matching the reference. The schema-reflection machinery (the positional-arg walk + the {name}:config interpreter auto-wire) moved into a new Schema_Reflection trait, used only by the ~11 nodes that actually parse args or auto-wire a :config sibling. The fail-loud durable-write primitive (write_all() + the write_failures counter + MAX_WRITE_ATTEMPTS) moved into a new File_Writer trait, used only by Log and Partition. A logic node (Tee, Echo, Hook, Router…) no longer inherits a schema walker, an interpreter auto-wirer, or an fwrite retry loop it never uses. Arg-parsing nodes call $this->parse_schema_args( $args ) from their own arguments() override; the nodes with :config command handlers (Partition, Request_Builder, Stream_Merger, Flame_Builder, plus the bundled newspack-ai-newsletter example nodes) call $this->auto_wire_interpreter() in their ctor. Single-arg / no-:config nodes (Timer, Cache_Warmer_Tick, Health_Check_Tick, Job_Router) parse inline and carry neither trait. Behavior is unchanged; dump_config round-trips identically. write_failures now appears in dump_node for file-writing nodes only (not every node).
  • as_string() and has_value() moved from Node to Core. They're generic scalar/presence helpers used well beyond Node (Message-field coercion, Perl-length()-style presence checks); they live on Core now. Callers use Core::as_string() / Core::has_value(). Another step toward a lean Node.
  • Nodes now resolves the @newspack-nodes/{shared,debug-overlay} aliases in its own build/jest/eslint, and dogfoods them. The three @newspack-nodes/* aliases are the public JS consumption surface, but only runtime was wired in nodes' own build.mjs/jest.config.js; shared and debug-overlay existed only in consumer builds, so nodes' own dashboards reached into src/shared/ via relative paths (../../shared/hooks/...) that a third party reading nodes as the reference can't copy. Both aliases are now wired into build.mjs (esbuild), jest.config.js (moduleNameMapper), and .eslintrc.js (core-modules + no-unresolved ignore), all pointing at nodes' own canonical src/ (it is the home, so there is no sibling fallback), and the 10 cross-package shared/ imports were rewritten to @newspack-nodes/shared/… so nodes imports shared code through the exact path consumers use.
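The trait split in the first Changed entry can be pictured with the sketch below. All class and trait names carry a `_Sketch` suffix because they are illustrative stand-ins, not the plugin's real classes. The `argument_names()` hook and the `path` / `max_size` argument names are assumptions, and the real `parse_schema_args()` walks `node_schema()['arguments']` rather than a flat name list.

```php
<?php
// Illustrative stand-ins only; not the plugin's actual implementation.
trait Schema_Reflection_Sketch {
	/** Hypothetical hook: the positional argument names, in order. */
	abstract protected function argument_names(): array;

	/** Map a whitespace-separated argument string onto the positional names. */
	protected function parse_schema_args( string $args ): array {
		$values = preg_split( '/\s+/', trim( $args ), -1, PREG_SPLIT_NO_EMPTY );
		$parsed = array();
		foreach ( $this->argument_names() as $i => $name ) {
			$parsed[ $name ] = $values[ $i ] ?? null;
		}
		return $parsed;
	}
}

class Node_Sketch {
	protected $arguments = '';

	/** Trivial getter/setter: store the raw string when given, always return it. */
	public function arguments( ...$args ) {
		if ( $args ) {
			$this->arguments = (string) $args[0];
		}
		return $this->arguments;
	}
}

// A logic node carries no schema walker at all.
class Echo_Sketch extends Node_Sketch {}

// An arg-parsing node opts in and parses from its own arguments() override.
class Partition_Sketch extends Node_Sketch {
	use Schema_Reflection_Sketch;

	public $parsed = array();

	protected function argument_names(): array {
		return array( 'path', 'max_size' ); // assumed names, for illustration
	}

	public function arguments( ...$args ) {
		if ( $args ) {
			$this->parsed = $this->parse_schema_args( (string) $args[0] );
		}
		return parent::arguments( ...$args );
	}
}
```

Under this shape the base getter/setter still round-trips the raw string for every node, which is consistent with the entry's note that dump_config output is unchanged.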

Fixed

  • Parallel test runs no longer delete each other's live temp dirs (a real coverage-suite flake). make_temp_dir() used sys_get_temp_dir() . '/' . $prefix . uniqid() — bare uniqid() is microtime-based and collides across the parallel processes run-coverage spawns (nodes + ELN + pyrobase at once), and the nodes and ELN suites both defaulted to the newspack-nodes-test- prefix (ELN inherits the helper) and both rm -rf /tmp/newspack-nodes-test-* in their run-coverage.sh — so each suite deleted the other's in-flight Partition segment dirs mid-run (surfacing as e.g. rotate_segment adopting segment 0 instead of 1). Now: make_temp_dir() is PID + more-entropy unique; ELN defaults to its own newspack-event-logger-nodes-test- prefix and purges only that; and the base TestCase::tearDown() auto-removes every dir it handed out (test-only; a temp dir is only temporary if someone deletes it).
  • Failed log writes are no longer silently swallowed (disk full = data loss with no signal). Log_Node::fill() ignored its fwrite() return value entirely, and Partition_Node had a separate loop_fwrite() that did retry/return on a stall but whose callers just dropped the batch with only a rate-limited print: two divergent write paths, one of which lost firehose lines on a full disk with zero trace. Both now route through one shared Node::write_all() primitive (the substrate's single durable-write seam): it retries short writes up to MAX_WRITE_ATTEMPTS, and on a genuine stall (ENOSPC / broken pipe) increments a new write_failures counter (surfaced in dump_node for every node) and emits one loud, rate-limited line naming the path. The happy path is unchanged (a single fwrite, no extra work). Log_Node also no longer advances its rotation size counter or triggers a rotate on a write that didn't land, and the companion-index write is covered by the same primitive. Consumers (event-logger-nodes, pyrobase, nuclear-gyrobase) inherit the fix through Log_Manager → Log/Partition.
  • Lock steal is now atomic and no longer sleeps in request scope. Lock_Node::acquire()'s orphan-takeover path had two flaws. (1) It judged an orphan (lock dir with no heartbeat — owner crashed between mkdir and the heartbeat write) by calling sleep(ORPHAN_GRACE_S) inside acquire(), which runs in request scope (SSE / CLI), stalling those requests a full second. It now judges by the dir's own mtime (time() - filemtime(dir) >= ORPHAN_GRACE_S) with no blocking — a real orphan's dir mtime stays at its creation time because nothing is written into the dir before the heartbeat. (2) The takeover itself was force_release_at() then mkdir() — two racers could both pass the staleness check, each delete the other's freshly-written heartbeat, and both believe they hold the lock (a reachable partition double-writer). Takeover now goes through steal_atomically(), which rename()s the dir aside (directory rename is atomic, so exactly one racer wins) before recreating it; the single-holder guarantee rests on mkdir-on-existing failing. Companion: Supervisor::reconcile_lock_dirs() now reaps leaked *.lock.d.stealing.* scratch dirs older than STALE_TIMEOUT (a process killed in the two-syscall steal window would otherwise leak one with nothing to clean it up).
  • Consumer recovers when its cursor segment is wiped and recreated smaller. A full retention sweep deletes every segment of a log and the Partition writer restarts numbering at 0 — but the durable offsetlog survives, so a Consumer restoring its checkpoint could land with cursor_off far past EOF of the recreated segment. poll() only handled the deleted-segment case (cursor id missing from the segment list), so the consumer waited forever for the file to grow back past the stale offset; in production this silently wedged jobs:consumer (evTemplate and remote-manager jobs piled up in jobs.log, unexecuted) while the firehose consumer recovered only because its cursor id happened to no longer exist. Cursor recovery now lives in one shared normalize_cursor(): a missing cursor segment rewinds to the oldest segment, and a cursor (plus pending partial line) past the EOF of a recreated segment rewinds to offset 0. Companion fix: GET_LAG computed lag from the raw cursor, so both recovery cases read as bytes_behind=0 / caught_up=true — masking a wedged consumer as healthy; it now normalizes first and reports the replay poll() will actually do.
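The two cursor-recovery cases in the Consumer entry above, and the reason GET_LAG must normalize before measuring, can be sketched as follows. This is a hedged illustration, not the plugin's code: the function signatures, the `$segments` shape (segment id => size in bytes), and the choice of offset 0 when rewinding to the oldest segment are assumptions, and the pending partial line is left out for brevity.

```php
<?php
/**
 * Hypothetical cursor normalization.
 *
 * @param array $segments   Map of segment id => current size in bytes.
 * @param int   $cursor_id  Segment id restored from the checkpoint.
 * @param int   $cursor_off Byte offset restored from the checkpoint.
 * @return array [ segment id, offset ] that a poll could actually read from.
 */
function normalize_cursor( array $segments, int $cursor_id, int $cursor_off ): array {
	if ( empty( $segments ) ) {
		return array( $cursor_id, $cursor_off ); // Nothing on disk yet; keep waiting.
	}
	if ( ! array_key_exists( $cursor_id, $segments ) ) {
		// Cursor segment was deleted: rewind to the oldest surviving segment.
		return array( min( array_keys( $segments ) ), 0 );
	}
	if ( $cursor_off > $segments[ $cursor_id ] ) {
		// Segment was wiped and recreated smaller: the old offset is past EOF.
		return array( $cursor_id, 0 );
	}
	return array( $cursor_id, $cursor_off );
}

/** Lag computed from the normalized cursor, so a stale cursor is not reported as caught up. */
function bytes_behind( array $segments, int $cursor_id, int $cursor_off ): int {
	list( $id, $off ) = normalize_cursor( $segments, $cursor_id, $cursor_off );
	$lag = 0;
	foreach ( $segments as $seg_id => $size ) {
		if ( $seg_id === $id ) {
			$lag += $size - $off; // Unread remainder of the cursor segment.
		} elseif ( $seg_id > $id ) {
			$lag += $size; // Every later segment is entirely unread.
		}
	}
	return $lag;
}
```

For example, a checkpoint of segment 0 at offset 5000 against a recreated 100-byte segment 0 normalizes to offset 0, so the reported lag covers the whole 100 bytes that will be replayed rather than reading as caught up.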