chore(release): version packages #2401
Merged
This PR was opened by the Changesets release GitHub action. When you're ready to do a release, you can merge this and the packages will be published to npm automatically. If you're not ready to do a release yet, that's fine, whenever you add more changesets to main, this PR will be updated.
Releases
nexus-agents@2.71.0
Minor Changes
#2404
8aeabe8 Thanks @williamzujkowski! - Add improvement_review MCP tool (PR 2 of epic #2402). Replaces the deleted self-development engine with a focused, threshold-gated observability-driven loop.
What it does: reads existing observability primitives (OutcomeStore, fitness-audit) and surfaces patterns that cross documented thresholds as candidate signals. When fileIssues=true, files candidate GitHub issues via gh issue create (rate-limited to 5 per run, deduped against open issues by signal key). Never auto-merges.
Detectors:
- detectCliPerformanceFloor: CLI × category success rate < 60% with ≥ minSampleSize observations (default 5)
- detectFailureCategoryConcentration: single failure category > 50% of failures with ≥ 10 failures
- detectFitnessSignals: fitness score below floor (default 90) AND/OR critical fitness findings
Safety:
- gh issue create invoked via execFile (no shell, so safe against command injection from errorMessage content)
- execFile with literal-phrase search of signal key in body
Inputs: lookbackDays (default 7), fileIssues (default false → return signals only), minSampleSize (default 5), fitnessFloor (default 90).
Outputs: { window, totalOutcomes, signals[], issuesFiled[], issuesSkipped[] }.
Skill count unchanged at 26. MCP tool count: 37 → 38. New file: src/mcp/tools/improvement-review.ts (~430 LOC) + improvement-review.test.ts (18 unit tests for the threshold detectors). Wired into mcp/index.ts, mcp/tools/index.ts, cli-server-tools.ts, and tool-annotations.ts.
Closes the build half of epic #2402. Replaces the unwired engine deleted in PR #2403 (~7,700 LOC). Net code delta: −7,000 LOC.
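For readers who want a feel for the threshold gating, here is a minimal sketch of the first two detectors as described above. It is not the code in improvement-review.ts: the Outcome and Signal shapes, the function names cliPerformanceFloor and failureConcentration, and the signal key format are all assumptions made for illustration. Only the thresholds (60% success floor with a minimum sample size of 5, and more than 50% of at least 10 failures) come from the notes.

```typescript
// Hypothetical outcome record: the real OutcomeStore schema is not shown in these notes.
interface Outcome {
  cli: string;
  category: string;
  success: boolean;
  failureCategory?: string;
}

// Hypothetical signal shape; the key is meant to be stable so it can be used for dedup.
interface Signal {
  key: string;
  detail: string;
}

// Flags CLI x category pairs whose success rate is below `floor`,
// considering only pairs with at least `minSampleSize` observations.
function cliPerformanceFloor(outcomes: Outcome[], minSampleSize = 5, floor = 0.6): Signal[] {
  const buckets = new Map<string, { total: number; ok: number }>();
  outcomes.forEach((o) => {
    const key = `${o.cli}:${o.category}`;
    const bucket = buckets.get(key) ?? { total: 0, ok: 0 };
    bucket.total += 1;
    if (o.success) bucket.ok += 1;
    buckets.set(key, bucket);
  });
  const signals: Signal[] = [];
  buckets.forEach((bucket, key) => {
    if (bucket.total >= minSampleSize && bucket.ok / bucket.total < floor) {
      signals.push({ key: `cli-floor:${key}`, detail: `${bucket.ok}/${bucket.total} succeeded` });
    }
  });
  return signals;
}

// Flags a failure category that accounts for more than `maxShare` of all
// failures, but only once at least `minFailures` failures were observed.
function failureConcentration(outcomes: Outcome[], minFailures = 10, maxShare = 0.5): Signal[] {
  const failures = outcomes.filter((o) => !o.success);
  if (failures.length < minFailures) return [];
  const counts = new Map<string, number>();
  failures.forEach((f) => {
    const category = f.failureCategory ?? "unknown";
    counts.set(category, (counts.get(category) ?? 0) + 1);
  });
  const signals: Signal[] = [];
  counts.forEach((count, category) => {
    if (count / failures.length > maxShare) {
      signals.push({
        key: `failure-concentration:${category}`,
        detail: `${count}/${failures.length} failures`,
      });
    }
  });
  return signals;
}
```

Both functions return an empty list when the sample is too small, which is the point of the gating: a handful of bad runs should not produce a candidate issue.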
#2511
65b7398 Thanks @williamzujkowski! - Deprecate the unused sandbox executor surface (#2499). The OS-level sandbox executors in packages/nexus-agents/src/security/sandbox/ (DenoSandboxExecutor, DockerSandboxExecutor, createSandboxExecutor, getSandboxExecutor / getSandboxExecutorOrNull, policyToDenoFlags, collectPolicyConfigurationWarnings) carry @deprecated JSDoc tags pointing at #2499. Behaviour is unchanged in this release: the symbols still work, just emit IDE/lint deprecation warnings.
The supported sandbox surface remains the validation primitives (validateCommand, validateArgs, SandboxPolicy types, DEVELOPMENT_POLICY, READONLY_POLICY) consumed by cli/sandbox-exec.ts for command-allowlist gating. Those are NOT deprecated.
Why: the executor classes have no production callers. The product direction (epic #2500) is "compatible with running inside a host-provided sandbox" (Codex sandbox, Claude Code sandbox, OpenCode's docker template, locked-down CI), not "ship our own sandbox runtime." Carrying ~600 lines of unreachable executor code makes the module look more capable than it is and tempts new contributors to extend a layer that doesn't run.
Migration: most consumers are internal (this repo). The deprecated symbols are still exported but should not be the basis of new work. External consumers using createSandboxExecutor should plan to migrate to either (a) host-provided sandbox boundaries, or (b) the validation primitives directly.
Removal: tracked separately. After this minor release ships, a follow-up issue will delete the executor classes + their tests in a single PR.
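As a rough illustration of what "still works, just emits deprecation warnings" means in practice, the sketch below tags a function with @deprecated. LegacyExecutor and createLegacyExecutor are hypothetical stand-ins, not the real nexus-agents symbols or signatures.

```typescript
// Hypothetical stand-in for an executor; not the real nexus-agents types.
interface LegacyExecutor {
  run(command: string): string;
}

/**
 * @deprecated Unused executor surface, see #2499. Prefer a host-provided
 * sandbox boundary or the validation primitives.
 */
function createLegacyExecutor(): LegacyExecutor {
  return { run: (command: string) => `ran: ${command}` };
}

// Runtime behaviour is unchanged. Editors and lint rules that understand
// the JSDoc tag flag this call site as deprecated.
const executor = createLegacyExecutor();
console.log(executor.run("echo hi"));
```

The tag is purely advisory: nothing fails at build time or runtime, which is why removal is tracked as a separate step.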
#2521
2a284d8 Thanks @williamzujkowski! - Extract the SWE-bench harness from packages/nexus-agents/src/swe-bench/ to its own repo: nexus-eval-swebench. Per the harness-extraction policy (epic #2514, originally #1960). Closes #2515.
What changed:
- packages/nexus-agents/src/swe-bench/ (~101 files, ~11,594 LOC of runtime + tests) is deleted.
- packages/nexus-agents/src/exports/swe-bench.ts and the corresponding re-export from index.ts are removed: SWEBenchRunner, EvaluationHarness, SWEBenchInstance, SWEBenchPrediction, SWEBenchVariant, SWEBenchConfig, etc. are no longer exported from nexus-agents.
- packages/nexus-agents/src/cli/swe-bench-command.ts is deleted.
- nexus-agents swe-bench CLI subcommand is preserved as a deprecation shim for one minor release: it prints a migration message pointing at npx nexus-eval-swebench and exits with code 3 (INVALID_ARGS). Removed in the next minor.
- packages/nexus-agents/src/swe-bench/mcp-config.ts (used by pipeline/expert-bridge.ts to spawn child Claude CLI sessions with MCP access) is relocated to packages/nexus-agents/src/cli-adapters/child-mcp-config.ts, since the helper is generic CLI-spawn infrastructure, not benchmark-specific.
Migration:
Note that nexus-eval-swebench v0.2 is a clean-room rewrite: it does NOT re-export the legacy SWEBenchRunner API. The new adapter takes any IModelAdapter and produces SweBenchPrediction directly. See the v0.2 README for the new shape.
Why: keeps the published nexus-agents bundle lean. The SWE-bench harness was ~11,594 LOC of evaluation-only code that consumers running orchestration / MCP tools never needed at runtime. The harness-extraction policy concentrates benchmark code in dedicated nexus-eval-* repos so they can evolve independently. Per discussion in #2515, no breaking-change concern: the only consumers of the legacy nexus-agents/swe-bench exports were the eval repo itself (now self-contained) and the in-tree CLI subcommand (now a shim).
#2520
c3f1a7e Thanks @williamzujkowski! - Extract Atbench (agent-trajectory safety benchmark, originally #1981) from packages/nexus-agents/src/benchmarks/atbench/ to its own repo: nexus-eval-atbench. Per the harness-extraction policy (epic #2514, originally #1960).
Behaviour changes:
- packages/nexus-agents/src/benchmarks/atbench/ directory is deleted, so import { ATBenchAdapter } from 'nexus-agents/benchmarks/atbench' no longer works. Migrate to import { ATBenchAdapter } from 'nexus-eval-atbench'.
- packages/nexus-agents/src/cli/atbench-command.ts is deleted.
- nexus-agents atbench CLI subcommand is preserved as a deprecation shim for one minor release: it prints a migration message pointing at npx nexus-eval-atbench and exits with code 3 (INVALID_ARGS). The shim is removed in the next minor.
Migration:
The eval repo is published on npm as nexus-eval-atbench and declares a peer dependency on nexus-agents >= 2.33.1.
Why: keeps the published nexus-agents bundle lean. Atbench was ~1,328 LOC of benchmark-only code that consumers running orchestration / MCP tools never need at runtime. The harness-extraction policy concentrates benchmark code in dedicated nexus-eval-* repos so they can evolve independently.
No public-API breakage: atbench was never exposed via nexus-agents's top-level exports/, only via the deep import path above. Operators using the CLI subcommand get the shim's migration message; library consumers using the deep import get a build error pointing at the new package.
Patch Changes
#2400
cb7e5d0 Thanks @williamzujkowski! - Tiers 2 + 3 of epic #2398: enhance ui-ux-design skill with patterns from Apache-2.0-licensed nexu-io/open-design:
Tier 2 - Brand extraction protocol (5 steps with explicit safety guards per security voter):
- … http://, file://, ftp://, protocol-relative).
- … rules/untrusted-input.md
- … grep -hoiE patterns for hex codes, font families, spacing scale
- … brand-spec.md: path-traversal guard (cwd subtree only)
Tier 3 - 9-section DESIGN.md schema - portable design-system structure adopted from Open Design as the canonical brand-spec format. Sections: Visual theme / Color palette / Typography / Component stylings / Layout / Depth & elevation / Dos and don'ts / Responsive strategy / Agent prompt guide. Cross-tool portable (Open Design, Claude Design, future nexus-agents UI tooling).
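The Tier 2 steps mention a path-traversal guard for brand-spec.md that keeps writes inside the cwd subtree. One common way to express such a guard, sketched here with Node's path module, is shown below. The function name resolveInside is hypothetical, the skill's actual check is not reproduced in these notes, and this version does not resolve symlinks.

```typescript
import * as path from "node:path";

// Returns the absolute path for `target` if it stays inside `root`,
// otherwise throws. `root` would typically be process.cwd().
// Assumption: lexical check only, symlinks are not followed.
function resolveInside(root: string, target: string): string {
  const base = path.resolve(root);
  const resolved = path.resolve(base, target);
  const rel = path.relative(base, resolved);
  const escapes = rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  if (escapes) {
    throw new Error(`refusing to write outside ${base}: ${target}`);
  }
  return resolved;
}
```

Comparing the relative path against ".." rather than doing a string prefix match on the absolute path avoids accepting sibling directories such as /work/project-evil when the root is /work/project.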
Tier 2.5 (bundled) — 8-dimension brief input format — structured brief schema (palette / accent / typography / display / layout / mood / density / exclude) with default-resolution rules and "don't silently default" discipline.
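To make the "don't silently default" discipline concrete, here is a small sketch of a brief resolver that fills missing dimensions but reports which ones it filled. The eight field names come from the list above; the value types, the placeholder defaults, and the names DesignBrief, BRIEF_DEFAULTS, and resolveBrief are assumptions for illustration, not the skill's actual schema.

```typescript
// The eight dimensions named in the changelog; value types are assumptions.
interface DesignBrief {
  palette: string;
  accent: string;
  typography: string;
  display: string;
  layout: string;
  mood: string;
  density: string;
  exclude: string[];
}

// Placeholder defaults, for illustration only.
const BRIEF_DEFAULTS: DesignBrief = {
  palette: "neutral",
  accent: "blue",
  typography: "sans-serif",
  display: "sans-serif",
  layout: "single-column",
  mood: "calm",
  density: "comfortable",
  exclude: [],
};

interface ResolvedBrief {
  brief: DesignBrief;
  // Dimensions that fell back to a default, so the caller can surface them.
  defaulted: (keyof DesignBrief)[];
}

function resolveBrief(input: Partial<DesignBrief>): ResolvedBrief {
  const keys = Object.keys(BRIEF_DEFAULTS) as (keyof DesignBrief)[];
  const defaulted = keys.filter((key) => input[key] === undefined);
  const brief: DesignBrief = {
    palette: input.palette ?? BRIEF_DEFAULTS.palette,
    accent: input.accent ?? BRIEF_DEFAULTS.accent,
    typography: input.typography ?? BRIEF_DEFAULTS.typography,
    display: input.display ?? BRIEF_DEFAULTS.display,
    layout: input.layout ?? BRIEF_DEFAULTS.layout,
    mood: input.mood ?? BRIEF_DEFAULTS.mood,
    density: input.density ?? BRIEF_DEFAULTS.density,
    exclude: input.exclude ?? BRIEF_DEFAULTS.exclude,
  };
  return { brief, defaulted };
}
```

Returning the defaulted list alongside the resolved brief lets the caller tell the user which choices were made on their behalf instead of hiding them.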
License: Apache-2.0 attribution in section quotes. Pure-patch — additive only, no API change.
Tier 4 (P0/P1/P2 standardization) skipped after audit: severity language across skills is already domain-appropriate (critical/high/medium/low for security per CVSS, P1/P2 for issue priority). No drift; no convergence needed.
#2403
bd70f9d Thanks @williamzujkowski! - Delete dead src/workflows/self-development/ engine (PR 1 of epic #2402).
The engine (~7,700 LOC source + tests) was authored before our observability primitives existed (OutcomeStore, weather_report, LinUCB, fitness-audit). By the time those landed, no consumer had wired up to invoke its runner: package.json, .github/workflows/, and CLI dispatch all bypass it. Six months of unwired existence + an in-place replacement (the improvement_review MCP tool from PR 2 of #2402, plus the manual dogfooding-issues skill) make this a clean Tier-A internal-only removal per deprecation-and-migration.
Removed:
- src/workflows/self-development/ (58 files: engine, phases, audit-trail, github-client shim, git-client, docker-sandbox, notifications incl. WebhookNotificationHandler, etc.)
- scripts/run-self-dev.ts runner
- workflows/templates/self-development.yaml
- docs/archive/workflows/self-dev-{phases,execution,operations,validation}.md
Updated:
- docs/workflows/SELF_DEVELOPMENT_WORKFLOW.md rewritten as a historical pointer to epic #2402
- src/scm/{github-provider,index}.ts, src/exports/scm.ts, src/cli-adapters/cli-to-model-adapter.ts, src/security/sandbox/default-policies.ts, docs/architecture/UNTRUSTED_INPUT_HARDENING.md
Public API: unchanged (the module had zero src/exports/* reach).
Verified locally: pnpm typecheck clean, pnpm lint clean, pnpm vitest run: 25,811 pass / 16 skipped (was 26,386; 575 tests deleted along with the dead engine).