feat: convert doc tools to MCP resources and skill references#12
feat: convert doc tools to MCP resources and skill references#12kingpanther13 wants to merge 32 commits into
Conversation
Remove ha_get_dashboard_guide, ha_get_card_documentation,
ha_get_domain_docs, and ha_config_info from tools/list. Static
content moves to skill reference files (skills repo v1.2.0),
GitHub-fetching capabilities become ha://docs/ resource templates.
- Add ha://docs/cards/{card_type} and ha://docs/domains/{domain}
resource URI templates in server.py
- Update skills submodule to include dashboard, card, domain, and
config reference files
- Delete tools_config_info.py, dashboard_guide.md, card_types.json
- Default ENABLE_SKILLS_AS_TOOLS to true so clients without resource
support retain access to documentation
- Add instructions note for resource-capable clients to disable
skills-as-tools
- Update all cross-references in tool descriptions, error suggestions,
docs, and tests
Closes homeassistant-ai#766
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly refactors the way documentation and skills are integrated into the Home Assistant MCP server, moving towards a more standardized and robust resource-based model. It streamlines access to Home Assistant documentation, enhances the stability of OAuth authentication through state persistence, and improves the security of Python sandbox expressions. These changes aim to provide a more consistent and reliable experience for agents interacting with the Home Assistant ecosystem. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Ignored Files
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This is an excellent pull request that significantly improves the project's architecture by converting documentation tools into MCP resources and skills, aligning with modern MCP best practices. The removal of several tools and static files cleans up the codebase nicely.
Beyond the main feature, I'm impressed with the other high-quality changes included:
- The robust implementation of OAuth state persistence is a major enhancement for server reliability.
- The security hardening in the Python sandbox to prevent chained calls is a critical improvement.
- The refactoring of the HACS and filesystem tool availability checks to correctly handle the API response format and improve error handling is a great fix.
The test suite has been thoroughly updated to cover all new features and fixes, which is fantastic to see. I've only found a couple of minor points for discussion. Great work!
- Remove unused asyncio import from tools_utility.py (ruff F401) - Rewrite test_resources.py to test skills-vendor reference files instead of deleted dashboard_guide.md and card_types.json Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
/gemini review the original review you did was based on an erroneous PR with extra stuff in it. I've fixed it and now the actual PR is here; please repeat your review as if you never did the last one, I dismissed all of your previous items in case they weren't accurate. |
There was a problem hiding this comment.
Code Review
This pull request is a significant and well-executed refactoring that moves documentation-providing tools into MCP resources and skills. This aligns well with MCP best practices by reducing the number of tools and centralizing documentation. The changes are consistently applied across the codebase, including updates to documentation, error messages, and tests. However, I've found a critical issue in the new resource implementation in server.py that includes a path traversal vulnerability, insufficient error handling, and missing test coverage. Addressing this is important for security and robustness.
- Add _sanitize_slug() to reject path separators (/, \, ..) in card_type and domain parameters - Add explicit 404 handling with helpful error messages pointing to skill reference files for valid values Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The ha://docs/ resource templates introduced a new pattern (server-side GitHub proxying) that diverges from the existing skills architecture where all resources are local file reads. The skill reference files already contain card/domain lists and GitHub URL patterns — the existing skills framework handles this use case adequately. - Remove _register_doc_resources() from server.py - Update all cross-references to point to ha_get_skill_home_assistant_best_practices instead - Remove ha://docs/ mentions from server instructions, AGENTS.md, README.md, DOCS.md, and UAT stories Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
/gemini review I just did a significant refactor, review again |
There was a problem hiding this comment.
Code Review
This pull request is a significant architectural improvement, refactoring several documentation-providing tools into MCP resources and skills. The changes are consistently applied throughout the codebase, including documentation, error messages, and tests. The removal of old tools and static files is clean. I've found one area for improvement in the unit tests to ensure packaging rules are more robustly verified.
- Make test_pyproject_includes_resources check for the specific resources/skills-vendor/**/* pattern instead of generic resources/ - Remove resources/*.md and resources/*.json globs from pyproject.toml since dashboard_guide.md and card_types.json were deleted Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…assistant-ai#727) * feat: Add CategorizedSearchTransform for search-based tool discovery Introduces an opt-in search transform (ENABLE_TOOL_SEARCH=true) that replaces the full 89+ tool catalog with a unified BM25 search tool and three categorized call proxies (read/write/delete), dramatically reducing idle context token usage for LLMs. Key design: - Unified search_tools searches ALL tools, results include annotations - call_read_tool (readOnlyHint) — clients can auto-approve - call_write_tool (destructiveHint) — requires confirmation - call_delete_tool (destructiveHint) — requires confirmation - Critical tools pinned via always_visible for individual permission gating: ha_restart, ha_reload_core, ha_backup_create/restore, ha_get_overview, ha_report_issue - ResourcesAsTools (list_resources/read_resource) auto-pinned when enable_skills_as_tools is enabled - Server instructions updated with tool discovery workflow - ha_get_overview includes tool discovery hint when active Subclasses FastMCP's BM25SearchTransform (~200 lines), leveraging native catalog bypass, auth filtering, and search infrastructure. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: resolve CI failures, bugs, and review feedback for search transform - Fix Ruff RUF012: annotate _PINNED_TOOLS with ClassVar[list[str]] - Fix runtime crash: move Context import out of TYPE_CHECKING (Pydantic needs it at runtime for Tool.from_function type resolution) - Fix critical bug: override get_tool() so categorized proxy tools (ha_call_read_tool, ha_call_write_tool, ha_call_delete_tool) are resolvable when the LLM calls them - Fix _categorize_tool: require destructiveHint AND name pattern for delete category (was matching on name pattern alone) - Fix search tool missing readOnlyHint annotation - Rename all tool references to ha_* prefix (ha_search_tools, ha_call_read_tool, etc.) per naming convention - Add DEFAULT_PINNED_TOOLS constant; use in tools_search.py (DRY) - Add lazy cache rebuild with _cache_built flag - Add enable_tool_search toggle to dev add-on config UI - Add 20 unit tests for CategorizedSearchTransform Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: enrich ha_get_state description for BM25 search discoverability Add domain keywords (lights, switches, sensors, etc.) and natural language terms (status, current) so BM25 can match common queries like "lights status" to this tool. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address Gemini review round 2 — structured errors, DRY constants - Replace ValueError with ToolError + create_error_response per style guide structured error requirements - Deduplicate _PINNED_TOOLS: use DEFAULT_PINNED_TOOLS as single source - Extract proxy descriptions to module constants shared between transform_tools() and get_tool() for consistency Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: correct ha_get_overview docstring that said "standard (default)" The parameter default is "minimal" but the docstring said "Use 'standard' (default)" — causing LLMs to always pick standard and blow out context with the full entity list. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: improve proxy guidance, sequential calls, and double-wrap guard - Enrich ha_search_tools description with full bootstrap workflow, calling example, anti-nesting warning, and sequential call guidance - Add param structure and sequential call hints to all proxy descriptions - Add server-side double-wrap unwrapping in categorized_call() so LLMs that accidentally nest name/arguments still succeed - Update ha_get_overview tool_discovery hint with param and sequential info - Append skills hint to search tool description when skills_as_tools is on - Auto-enable enable_skills when enable_skills_as_tools is set (dependency) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: soften resource claim in instructions, tighten double-wrap guard - Replace hard "your client supports resources" claim with soft "should be able to" and explicit fallback path to tools - Add catalog membership check to double-wrap detection so it only unwraps when the inner name is a known tool in the catalog - Auto-enable enable_skills when enable_skills_as_tools is set Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: pre-pin skill guidance tools for forward compatibility with homeassistant-ai#732 When skills-as-tools is enabled, pin any skill guidance tool names stored in _skill_tool_names (registered by homeassistant-ai#732) so they remain visible in list_tools() alongside other pinned tools. Uses getattr with empty default so this is a no-op until homeassistant-ai#732 merges. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: add SearchKeywordsTransform to improve BM25 ranking for common queries Adds a lightweight FastMCP Transform that appends extra keywords to specific tool descriptions so BM25 ranks them higher for natural language queries. Only active when enable_tool_search is on — original tool docstrings are unchanged. Fixes three BM25 ranking issues identified during search quality testing: - "find entities" now ranks ha_search_entities above ha_deep_search - "get/read automation" now ranks ha_config_get_automation above set - "create helper" now ranks ha_config_set_helper above remove_helper To adjust rankings for additional tools, add an entry to the _SEARCH_KEYWORDS dict in server.py — no other files need changing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: remove unnecessary string quotes from type annotations (ruff UP037) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: make direct tool calls explicit and demote ha_deep_search in BM25 Update server instructions and search tool description to clarify that discovered tools can be called directly by name (preferred) rather than requiring proxy usage. Extend SearchKeywordsTransform with description override support to narrow ha_deep_search's BM25 footprint, and boost 8 additional competing tools to outrank it for common queries. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: reduce search transform max_results from 10 to 5 Matches FastMCP's BM25SearchTransform default of 5 to reduce per-search context cost. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: pin ha_search_entities as always-visible tool Entity search is a high-frequency tool that shouldn't require a tool search call to discover. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: pin ha_config_get_automation and ha_config_set_automation High-frequency automation tools shouldn't require a search call to discover. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: kingpanther13 <kingpanther13@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
homeassistant-ai#802) * chore: migrate from pre-commit to lefthook for parallel hook execution Replace pre-commit with lefthook to enable parallel execution of mypy and unit tests during git commits, reducing wall-clock time from ~40s to ~20s. Hook execution order: - Group 1 (parallel): ruff --fix + uv lock - Group 2 (parallel): mypy + unit tests lefthook is a compiled Go binary that uses goroutines for concurrency, avoiding the bash job control limitations in sandboxed environments. - Add lefthook.yml with two-stage parallel hook config - Replace pre-commit>=4.0.0 with lefthook>=1.10.0 in dev deps - Remove .pre-commit-config.yaml - Update CONTRIBUTING.md with new install command and migration guide - Update .dockerignore * fix: run both mypy invocations regardless of first exit code
Take upstream's expanded resource access instructions for the ENABLE_SKILLS_AS_TOOLS code path in server.py. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Address Gemini review: the prompt asks to look up card type documentation, which now routes through the skills framework. Add ha_get_skill_home_assistant_best_practices and read_resource to the expected tools list. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The merge with upstream/master reset the submodule pointer to 308c4d4 which predates the reference files added in skills#25. Restore to d0a72ea which includes dashboard-guide.md, dashboard-cards.md, and domain-docs.md needed by the tests and this PR's functionality. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…omeassistant-ai#807) Bumps the github-actions group with 1 update: [renovatebot/github-action](https://github.com/renovatebot/github-action). Updates `renovatebot/github-action` from 46.1.5 to 46.1.6 - [Release notes](https://github.com/renovatebot/github-action/releases) - [Changelog](https://github.com/renovatebot/github-action/blob/main/CHANGELOG.md) - [Commits](renovatebot/github-action@v46.1.5...v46.1.6) --- updated-dependencies: - dependency-name: renovatebot/github-action dependency-version: 46.1.6 dependency-type: direct:production update-type: version-update:semver-patch dependency-group: github-actions ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…ssistant-ai#808) Bumps the dev-dependencies group with 2 updates: [pytest-cov](https://github.com/pytest-dev/pytest-cov) and [ruff](https://github.com/astral-sh/ruff). Updates `pytest-cov` from 7.0.0 to 7.1.0 - [Changelog](https://github.com/pytest-dev/pytest-cov/blob/master/CHANGELOG.rst) - [Commits](pytest-dev/pytest-cov@v7.0.0...v7.1.0) Updates `ruff` from 0.15.6 to 0.15.7 - [Release notes](https://github.com/astral-sh/ruff/releases) - [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md) - [Commits](astral-sh/ruff@0.15.6...0.15.7) --- updated-dependencies: - dependency-name: pytest-cov dependency-version: 7.1.0 dependency-type: direct:development update-type: version-update:semver-minor dependency-group: dev-dependencies - dependency-name: ruff dependency-version: 0.15.7 dependency-type: direct:development update-type: version-update:semver-patch dependency-group: dev-dependencies ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [openai](https://github.com/openai/openai-python) from 2.28.0 to 2.29.0. - [Release notes](https://github.com/openai/openai-python/releases) - [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md) - [Commits](openai/openai-python@v2.28.0...v2.29.0) --- updated-dependencies: - dependency-name: openai dependency-version: 2.29.0 dependency-type: direct:development update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…ant-ai#810) Bumps [testcontainers](https://github.com/testcontainers/testcontainers-python) from 4.14.1 to 4.14.2. - [Release notes](https://github.com/testcontainers/testcontainers-python/releases) - [Changelog](https://github.com/testcontainers/testcontainers-python/blob/main/CHANGELOG.md) - [Commits](testcontainers/testcontainers-python@testcontainers-v4.14.1...testcontainers-v4.14.2) --- updated-dependencies: - dependency-name: testcontainers dependency-version: 4.14.2 dependency-type: direct:development update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* feat: upgrade to Python 3.14 for Home Assistant 2026.3 compatibility Home Assistant 2026.3 ships with Python 3.14. Update all Python version references across the project: - pyproject.toml: requires-python, classifier, ruff target-version - .python-version: 3.13 → 3.14 - Dockerfiles (3): builder and runtime base images - CI workflows (8): PYTHON_VERSION env vars, python-version inputs, container image tags - uv.lock: requires-python marker - Issue template: placeholder version Docker image digests removed (Renovate will pin them). uv.lock header updated; full regeneration via `uv lock` recommended in CI before merge. Closes homeassistant-ai#699 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: add build deps for Python 3.14 and restore image digest pins - Dockerfiles: add gcc + libffi-dev to builder stage for cffi compilation (no prebuilt 3.14 wheels yet), restore SHA256 digest pins for both builder and runtime images - CI (pr.yml): switch lint and unit-tests from slim container to runner-based setup (matching e2e-validation pattern) — ubuntu runners have gcc/rust needed for source builds Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: remove quoted type annotations for Python 3.14 (UP037) Python 3.14 implements PEP 649 (deferred evaluation of annotations), making quoted forward references unnecessary. Ruff UP037 now flags these when target-version is py314. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: add libc-dev to Docker builder stage for cffi compilation gcc needs C standard library headers (assert.h, etc.) to compile cffi. The --no-install-recommends flag skipped libc-dev which is only "recommended" by gcc, not a hard dependency. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: add make and autoconf for jq source build in Docker The jq Python package builds jq 1.8.1 from source on Python 3.14 (no prebuilt wheels) and requires make/autoconf for its configure step. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * chore: add temp workflow to regenerate uv.lock for Python 3.14 The lockfile was generated for Python 3.13 and only contains cp313 wheel entries. Python 3.14 can't use those, causing Docker builds to fall back to source compilation. This workflow regenerates the lockfile with cp314 wheel entries. Also reverts unnecessary build-tool additions to Dockerfiles — the proper fix is the lockfile, not adding gcc/Rust to the builder. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: update uv.lock wheel entries from cp313 to cp314 The lockfile was generated for Python 3.13 and only contained cp313 wheel entries. When uv sync --locked runs on Python 3.14, it can't use cp313 wheels and falls back to building from source (sdist), which requires gcc, Rust, etc. Updated 203 wheel entries to cp314 using PyPI JSON API. Remaining 12 cp313 entries are macOS/Windows/i686 platforms where the filename pattern differs — these don't affect linux x86_64 Docker builds. Also removes the temporary lockfile regeneration workflow. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: update Python version reference in documentation issue template The documentation.yml issue template still referenced Python 3.13+ in placeholder examples. Updated to 3.14+ to match the rest of the PR. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: kingpanther13 <kingpanther13@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…g empty data (homeassistant-ai#812) When get_states() failed, asyncio.gather with return_exceptions=True silently swallowed the exception, causing ha_get_overview to return success=true with total_entities=0 — indistinguishable from an empty HA instance. Changes: - Raise entities fetch exception immediately so the outer error handler converts it to a proper ToolError with actionable suggestions - Log services fetch failures at warning level and include partial=true + warnings[] in the response so callers can distinguish a real zero from a fetch failure - Move inline `import random` to module level Fixes homeassistant-ai#811
…ssistant-ai#816) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
…ate annotations (homeassistant-ai#817) The CI container images in pr.yml were stuck at uv 0.9.30 while Dockerfiles are at 0.11.0, because Renovate had no annotations to detect them. Added `# renovate: datasource=docker depName=...` comments so future uv bumps are applied automatically. Co-authored-by: kingpanther13 <kingpanther13@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
PR homeassistant-ai#700 removed the jq dependency from upstream. The previous merge resolution incorrectly kept jq imports and jq test classes that no longer exist in upstream/master. - Remove jq try/import block and _JQ_UNAVAILABLE_ERROR from tools_config_dashboards.py - Replace TestJqTransformAndFindCard with TestFindCard (drop all jq_transform tests, keep find_card tests) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…sistant-ai#805) * docs: clarify Claude.ai "Couldn't reach MCP server" is normal, annotate stateless session logs Claude.ai shows a "Couldn't reach the MCP server" error during its initial connection handshake even though the server connects successfully afterward. This causes confusion for users who believe they cannot connect. Additionally, the MCP SDK logs "Terminating session: None" on every request in stateless mode, which looks alarming but is routine cleanup. - Add setup note in claude-ai.md explaining the error is expected - Add FAQ entries for both the connection error and the session log message - Add log filter to annotate "Terminating session: None" with "(Normal — stateless mode)" in both main and addon entry points Closes homeassistant-ai#783 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: rename filter to public, add unit tests, fix ruff import order - Rename _StatelessSessionLogFilter to StatelessSessionLogFilter since it is imported cross-module (addon start.py) - Add unit tests for the log filter covering annotation, real sessions, other loggers, and unrelated messages - Fixes ruff isort violation in start.py Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Add detailed Cloudflare AI crawler instructions to landing page and docs Expand the browser landing page with step-by-step instructions for disabling Cloudflare's "Block AI training bots" setting, the most common connection issue for Cloudflare users. Add the same instructions with screenshot to FAQ, cloudflared setup guide, and addon DOCS. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Clarify landing page: tell user to copy URL from address bar Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Landing page: clarify to paste full URL with private key into MCP settings Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Add tip to ask Claude directly to verify MCP connection Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Add Cloudflare AI crawler warning and Claude connection tips to setup wizard The setup wizard is a single-page app in setup.astro — the markdown content files don't render in the wizard UI. Add the AI crawler blocking warning directly to the Cloudflare Tunnel instructions block, and add connection verification steps to the Claude.ai config section. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Setup wizard: render UI client steps as instruction blocks, add verify tip - Claude.ai, ChatGPT, Raycast, Open WebUI steps now render as formatted HTML instruction blocks instead of inside a code box - Hide the config code box + copy button for UI-based clients - Add "Verify Your Server" instruction block for all HTTP-based setups telling users to visit the URL in their browser to confirm it works Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Fix: render instructions after config generation so UI steps appear The instructionsEl.innerHTML was being set before UI client steps were pushed to the array. Move the render call to after config generation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Move Cloudflare screenshot to repo, replace all GitHub attachment URLs Gemini flagged that user-uploaded GitHub assets can have URLs change or be removed. Moved the screenshot to site/public/images/ and updated all references to use the permanent GitHub Pages URL. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: kingpanther13 <kingpanther13@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nt-ai#818) Bumps [smol-toml](https://github.com/squirrelchat/smol-toml) from 1.6.0 to 1.6.1. - [Release notes](https://github.com/squirrelchat/smol-toml/releases) - [Commits](squirrelchat/smol-toml@v1.6.0...v1.6.1) --- updated-dependencies: - dependency-name: smol-toml dependency-version: 1.6.1 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…#819) Bumps [requests](https://github.com/psf/requests) from 2.32.5 to 2.33.0. - [Release notes](https://github.com/psf/requests/releases) - [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md) - [Commits](psf/requests@v2.32.5...v2.33.0) --- updated-dependencies: - dependency-name: requests dependency-version: 2.33.0 dependency-type: direct:development ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…#820) Bumps [yaml](https://github.com/eemeli/yaml) from 2.8.2 to 2.8.3. - [Release notes](https://github.com/eemeli/yaml/releases) - [Commits](eemeli/yaml@v2.8.2...v2.8.3) --- updated-dependencies: - dependency-name: yaml dependency-version: 2.8.3 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps and [picomatch](https://github.com/micromatch/picomatch). These dependencies needed to be updated together. Updates `picomatch` from 4.0.3 to 4.0.4 - [Release notes](https://github.com/micromatch/picomatch/releases) - [Changelog](https://github.com/micromatch/picomatch/blob/master/CHANGELOG.md) - [Commits](micromatch/picomatch@4.0.3...4.0.4) Updates `picomatch` from 2.3.1 to 2.3.2 - [Release notes](https://github.com/micromatch/picomatch/releases) - [Changelog](https://github.com/micromatch/picomatch/blob/master/CHANGELOG.md) - [Commits](micromatch/picomatch@4.0.3...4.0.4) --- updated-dependencies: - dependency-name: picomatch dependency-version: 4.0.4 dependency-type: indirect - dependency-name: picomatch dependency-version: 2.3.2 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…t-ai#826) Bumps [astro](https://github.com/withastro/astro/tree/HEAD/packages/astro) from 5.16.11 to 5.18.1. - [Release notes](https://github.com/withastro/astro/releases) - [Changelog](https://github.com/withastro/astro/blob/astro@5.18.1/packages/astro/CHANGELOG.md) - [Commits](https://github.com/withastro/astro/commits/astro@5.18.1/packages/astro) --- updated-dependencies: - dependency-name: astro dependency-version: 5.18.1 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Skills-as-tools replaces several removed documentation tools, so the default is intentionally True to preserve documentation access for all clients. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… deep search (homeassistant-ai#814) * fix: add badge search, exact match mode, and dashboard search to deep search (homeassistant-ai#801) - ha_dashboard_find_card now searches view-level badges (views[n].badges), catching entity references in badge chips that were previously missed during rename operations - ha_deep_search gains exact_match parameter (default: True) that uses substring matching instead of fuzzy scoring, eliminating false positives when searching for known entity IDs - ha_deep_search gains 'dashboard' search type that fetches and searches all storage-mode dashboard configurations for entity references Closes homeassistant-ai#801 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: extract _score_deep_match to deduplicate exact_match scoring logic Addresses Gemini Code Assist review — the scoring logic for exact_match was duplicated across automations, scripts, and helpers. Extracted into a single _score_deep_match helper method. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add exact_match parameter to ha_search_entities and ha_get_integration Per maintainer discussion in homeassistant-ai#801, fuzzy matching should be disabled by default across all search tools. This adds exact_match=True (default) to: - ha_search_entities: routes directly to substring matching, skipping fuzzy search entirely. Set exact_match=False for typo-tolerant search. - ha_get_integration: skips fuzzy scoring branch when filtering by query, keeping only exact substring matches. Set exact_match=False for fuzzy. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: update E2E tests for exact_match default in ha_search_entities Tests that specifically verify fuzzy search behavior now pass exact_match=False, since the default changed to True. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: also search sections-view header cards in ha_dashboard_find_card Addresses feedback from @Patch76 on homeassistant-ai#801 — views[n].header.card in sections views accepts a card (typically Markdown with Jinja2 templates) that can contain entity references. Like badges, it's a sibling of views[n].cards and was missed by card-focused search. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address sergeykad review — error handling, bool coercion, tests - Dashboard search now raises on failure instead of silently returning empty results (fixes false-confidence problem) - Dropped `or False` from `coerce_bool_param(..., default=True)` calls that undermined the True default (3 locations) - Fixed potential None.lower() crash in ha_get_integration query filter - Badge search now triggers on card_type="badge" (not just entity_id) - Added unit tests for badge search, header card search, strategy dashboards - Added E2E tests for exact_match default, fuzzy opt-in, dashboard search type, and search_entities exact_match default Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: kingpanther13 <kingpanther13@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…in permissions (homeassistant-ai#828) * Potential fix for code scanning alert no. 31: Workflow does not contain permissions Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> * ci: add workflow-level permissions to test-installer-scripts.yml Addresses code scanning alerts homeassistant-ai#23, homeassistant-ai#26, homeassistant-ai#27 by adding explicit permissions: contents: read at the workflow level, matching the fix already applied to semver-release.yml for alerts homeassistant-ai#29-31. All three jobs (test-macos, test-windows, check-demo-env) only need read access to checkout code and run scripts. --------- Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> Co-authored-by: Sergey <sergey@example.com>
…#830) Bumps [cryptography](https://github.com/pyca/cryptography) from 46.0.5 to 46.0.6. - [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst) - [Commits](pyca/cryptography@46.0.5...46.0.6) --- updated-dependencies: - dependency-name: cryptography dependency-version: 46.0.6 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…ed) (homeassistant-ai#829) Add migration guide to OAUTH.md with security context (SSRF and XSS), add troubleshooting entries to FAQ.md and site FAQ, update README banner to link to the migration guide instead of the GitHub issue. Closes homeassistant-ai#749
Resolved conflict in tools_config_dashboards.py — kept the PR's deletion of ha_get_dashboard_guide and ha_get_card_documentation (converted to skill references). All upstream changes from homeassistant-ai#814 (badge search, header card search, exact_match, formatting) are preserved via auto-merge. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The HMAC secret is regenerated on every restart, so all pre-existing tokens are already invalidated. The only scenario for an unsigned token reaching a running server is an attacker crafting a plain base64 blob to bypass HMAC verification. Remove the path entirely. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…t infrastructure Outcome of going through all 60 findings from Gemini Code Assist + pr-review-toolkit (code-reviewer, pr-test-analyzer, silent-failure- hunter, comment-analyzer). 3 wrong (skipped: PATH-resolved node binary mis-flagged as hardcoded; project-relative esbuild path mis-flagged as hardcoded; theoretical FakeBroadcastChannel constructor-throw). Remaining real items addressed: Harness: - vm.runInContext replaces window.eval / indirect eval to clear the "no eval()" style-guide flag (Gemini #1, #2). - Timer-callback and broadcast-listener throws now record into the errors list instead of being silently swallowed (homeassistant-ai#30, homeassistant-ai#32). - SAFETY_CAP exhaustion records a clear "runaway setInterval" error instead of breaking silently (#4, homeassistant-ai#31). - Non-navigation jsdomErrors route to errors (not console) so tests asserting `not result.errors` catch them (homeassistant-ai#38). - Transpile failure short-circuits init eval to avoid cascading syntax errors from un-transpiled TS (homeassistant-ai#34). - FakeBroadcastChannel.postMessage now delivers to peer same-name channels in the same context per spec (#5). - Time-faked surface documented accurately (Date.now / setTimeout / setInterval only; new Date / performance.now still wall-time) (homeassistant-ai#46). - New broadcastChannelUnavailable param simulates the `typeof BroadcastChannel === 'undefined'` browsing context so the production null-guard branch is exercised (homeassistant-ai#15). - Dead comments and rot-prone duplications removed (homeassistant-ai#47, homeassistant-ai#49, homeassistant-ai#51, homeassistant-ai#56, homeassistant-ai#57, homeassistant-ai#58, homeassistant-ai#59, homeassistant-ai#66). extract_astro_vars.mjs: - vm.runInContext replaces (0, eval) (Gemini #2). - Multi-line `import { a, b } from 'x';` now stripped robustly (#7). - Eval errors wrapped with the source path for actionable failures (homeassistant-ai#35). _js_harness.py: - Wrong test file name and workflow path in docstring fixed (homeassistant-ai#41, homeassistant-ai#42). - _strip_astro_frontmatter raises ValueError when frontmatter opens but never closes (homeassistant-ai#36). - discover_script_surfaces raises when site/src/ is missing instead of silently producing partial results (homeassistant-ai#37). - extract_script_body accepts source_label for actionable errors (homeassistant-ai#40). - Astro `<script lang="js">` is no longer mis-tagged as TypeScript (#9). - Inert chr(92) Windows backslash replace removed (homeassistant-ai#14). - Field docstrings on ScriptSurface trimmed to the one that earns its keep (homeassistant-ai#52). - _PY_RENDERERS registry refactor + accurate enumeration comment (homeassistant-ai#45). test_settings_ui_js_behavior.py: - Rot-bait PR/issue numbers removed from module docstring (homeassistant-ai#43). - _TOP_LEVEL_ELEMENT_IDS + import-time drift check replaces the "refresh this manually" comment (homeassistant-ai#55). - _assert_clean_init helper called at the top of every test so init failures surface as init errors, not as misleading "side effect didn't fire" failures (homeassistant-ai#33). - 4xx restartBtn assertion now reads disabled state via JS and snaps to body.dataset instead of OR-shortcircuiting against a wiped DOM (homeassistant-ai#27). - New test_script_boots_without_broadcastchannel_global covers the null-guard branch (homeassistant-ai#15). - Assertion-restating comments removed (homeassistant-ai#60). test_astro_setup_js_behavior.py: - Rot-bait homeassistant-ai#1422 reference removed from module docstring (homeassistant-ai#44). - _section_has_hidden_class replaces fragile substring slicing (#6). - test_initial_state_only_client_section_visible now asserts on the promised visibility, not just absence of errors (homeassistant-ai#26). - Per-client smoke now captures config-output text AND instructions HTML into body.dataset and asserts on non-empty content, catching a typo that drops the whole per-client branch (homeassistant-ai#21). test_astro_tools_js_behavior.py: - _card_class helper replaces ±200-char substring slicing (#12). - test_design_mode_toggle now asserts design-only elements lose 'hidden' class, not just the button label flip (homeassistant-ai#25). - New tests cover .filter-btn / .cat-btn / .size-filter-btn / group-category|file|none / sort-alpha / expand-all wiring (homeassistant-ai#22, homeassistant-ai#23, homeassistant-ai#24) — the adjacent coverage gaps issue homeassistant-ai#1422 didn't name but that fit the harness's same regression-class. test_consent_form_js_behavior.py: - _build_form_dom docstring fixed (said "three", listed four) (homeassistant-ai#54). test_rendered_scripts_parse.py: - Missing-dependency skip flips to fail when CI=true so a workflow drift that drops the install step doesn't silently lose parse coverage (homeassistant-ai#29). - Subsumed-test-class reference removed from module docstring. AGENTS.md: - "60s probe windows take milliseconds" wording fixed; time-faked surface documented (homeassistant-ai#13). - Per-surface module naming guidance updated; reflects actual files (#10). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ge for every rendered <script> (homeassistant-ai#1425) * test(internal): JSDOM behaviour harness + auto-discovery parse coverage for every rendered <script> Closes homeassistant-ai#1422. Adds a JSDOM harness (tests/js/harness.mjs + tests/src/unit/_js_harness.py) that drives real rendered <script> bodies through stubbed fetch / BroadcastChannel / virtual timers / DOM and reports observed side effects. A discovery walker auto-picks-up every <script> surface in the repo (src/ha_mcp/settings_ui.py, src/ha_mcp/auth/consent_form.py, every site/src/**/*.astro) so parse coverage extends as new UI surfaces ship — no registration needed. Behavioural coverage landed for the surfaces named in homeassistant-ai#1422: * settings_ui — restartInProgress concurrency guard, 4xx-suppress-reload branch, 5xx fall-through, instance_id-flip probe, BroadcastChannel restart-required + restart-initiated listeners, saveFeatureFlag JSON-parse fallback. * setup.astro — state-machine progression (local / network / remote), plus a parametrised per-client smoke that drives the wizard to config generation for every id in the real clientsData array. * tools.astro — search/filter pipeline + design-mode toggle (TypeScript; esbuild strips types in the harness before eval). * Layout.astro — copy-button idempotency across re-init. * consent_form — submit handler disable + spinner state. The legacy TestRenderedHTMLJsSyntax in test_settings_ui.py is removed — the auto-discovery parse test in test_rendered_scripts_parse.py subsumes it (and extends to the four other surfaces it never covered). CI: unit-tests job in pr.yml installs nodejs + jsdom + esbuild via apt-get / npm ci. Local devs without tests/js/node_modules/ get clean skips, matching the original parse guard's behaviour. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(test): skip Astro frontmatter in script extraction; drain microtasks before clock advance CI surfaced two harness bugs the local smoke-tests didn't catch: 1. `extract_script_body` and the discovery walker greedily matched the first `<script>` substring in the source, which in setup.astro is actually a frontmatter comment: `// below in the <script> block keyed off the entry's id.` That made the "script body" start mid-frontmatter and the extracted text wasn't valid JS — esbuild and JSDOM both rejected it with "Unexpected identifier 'keyed'". Fix: strip the `--- ... ---` Astro frontmatter block before searching for `<script>` tags. Plain .py and .html sources have no frontmatter and pass through unchanged. 2. `clock.advance(settleMs)` returned immediately when no timers were yet scheduled, but the script under test often awaits a chain of stubbed-fetch promises BEFORE hitting its first `setTimeout`. With only one microtask drain between eval and advance, those promises hadn't resolved yet, so no timers existed, advance was a no-op, and the script stayed suspended — `restartAddon`'s POST to /api/settings/restart never fired and the `alert(msg)` in the 4xx branch never ran. Fix: drain microtasks aggressively at the start of advance() so pending promises get to schedule their timers, and drain again when the timer queue temporarily empties (a promise resolution may queue new timers). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(internal): expand initial DOM fixtures to cover every top-level addEventListener target CI surfaced this via the test_5xx test (the only one whose assertion included the harness errors list): the settings_ui script aborts during init at `document.getElementById('backupRefresh').addEventListener(...)` because the test DOM is missing the backup table / modal markup. With init aborted, the invoke step never runs — `restartAddon` is never called, POSTs never fire, `alert()` never runs, and all three restart- flow tests silently fail. The setup.astro tests had the same shape: `generateConfig` queries `config-section` (distinct from `section-config`) to show/hide the inner code block. Without it, the proxy click handler in the remote- flow test threw and the `document.body.dataset.beforeProxy` assignment never landed. Fixes: - settings_ui MIN_DOM now includes backupBulkDelete, backupDomain, backupEntity, backupList, backupRefresh, backupState, featuresBody, modalBackdrop / modalBody / modalClose / modalTitle. Set built from `grep -h "document.getElementById" settings_ui.py` so future top-level handlers will surface as the same pattern. - setup.astro DOM now includes config-section alongside section-config. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(test): run JSDOM eval at global scope; capture body attrs in dom snapshot Two harness bugs the prior CI rounds didn't surface until init-stage crashes were resolved: 1. Wrapping the rendered script in an `async () => { ... }()` IIFE confined top-level `function` declarations to the IIFE scope. `function restartAddon() {...}` never landed on `window`, so `invoke: "window.restartAddon();"` threw `is not a function`. A real browser hoists inline-script function decls to the global window — match that by running prelude + script body at global scope and keeping the IIFE for `invoke` alone (so awaits inside invoke still work). 2. `document.body.innerHTML` returns body's children but not body's own attrs, so tests that wrote `document.body.dataset.foo = 'bar'` as a side-channel for in-page state had no way to assert on it — `result.dom` came back without the attr. Serialise `document.documentElement.outerHTML` instead so html/head/body tags and their own attributes round-trip. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(internal): sequence /api/settings/info responses for 5xx restart probe The test_5xx flow hits the info endpoint three times — loadTools init, restartAddon's pre-POST baseline capture, and _probeAddonRestarted after the POST. The old single-response fixture returned the SAME instance_id every time, so the probe never saw the flip and looped until timeout, leaving reloads=0. Adds a `responses: [...]` shape to the harness fetch_map: each match on a URL pattern advances a per-pattern counter; the last entry sticks after exhaustion (matches "the addon came back online and stays online"). Test now provides baseline → baseline → flipped so the probe terminates with restarted=true and the reload fires. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * ci(test): cache apt downloads and node_modules for the unit-tests job The unit-tests "Install git and Node.js" step was 44 s — almost all of it network download of the nodejs / npm .deb. The "Install JS test dependencies" step is 1 s when node_modules is fresh but can grow as deps change. - Cache /var/cache/apt/archives keyed on a stable string (apt package set rarely changes). Disable docker-clean and set Keep-Downloaded- Packages so the cached .debs survive install for the next run. apt install still runs (unpacks from local cache, ~3-5 s) but skips the network leg. - Cache tests/js/node_modules keyed on package-lock.json so dep bumps invalidate cleanly. `npm ci` short-circuits when the tree matches. Expected first-cold-cache run: unchanged (~45 s install). Cache hits: ~5 s for both steps combined. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(internal): address Gemini + pr-review-toolkit findings on JS test infrastructure Outcome of going through all 60 findings from Gemini Code Assist + pr-review-toolkit (code-reviewer, pr-test-analyzer, silent-failure- hunter, comment-analyzer). 3 wrong (skipped: PATH-resolved node binary mis-flagged as hardcoded; project-relative esbuild path mis-flagged as hardcoded; theoretical FakeBroadcastChannel constructor-throw). Remaining real items addressed: Harness: - vm.runInContext replaces window.eval / indirect eval to clear the "no eval()" style-guide flag (Gemini #1, #2). - Timer-callback and broadcast-listener throws now record into the errors list instead of being silently swallowed (homeassistant-ai#30, homeassistant-ai#32). - SAFETY_CAP exhaustion records a clear "runaway setInterval" error instead of breaking silently (#4, homeassistant-ai#31). - Non-navigation jsdomErrors route to errors (not console) so tests asserting `not result.errors` catch them (homeassistant-ai#38). - Transpile failure short-circuits init eval to avoid cascading syntax errors from un-transpiled TS (homeassistant-ai#34). - FakeBroadcastChannel.postMessage now delivers to peer same-name channels in the same context per spec (#5). - Time-faked surface documented accurately (Date.now / setTimeout / setInterval only; new Date / performance.now still wall-time) (homeassistant-ai#46). - New broadcastChannelUnavailable param simulates the `typeof BroadcastChannel === 'undefined'` browsing context so the production null-guard branch is exercised (homeassistant-ai#15). - Dead comments and rot-prone duplications removed (homeassistant-ai#47, homeassistant-ai#49, homeassistant-ai#51, homeassistant-ai#56, homeassistant-ai#57, homeassistant-ai#58, homeassistant-ai#59, homeassistant-ai#66). extract_astro_vars.mjs: - vm.runInContext replaces (0, eval) (Gemini #2). - Multi-line `import { a, b } from 'x';` now stripped robustly (#7). - Eval errors wrapped with the source path for actionable failures (homeassistant-ai#35). _js_harness.py: - Wrong test file name and workflow path in docstring fixed (homeassistant-ai#41, homeassistant-ai#42). - _strip_astro_frontmatter raises ValueError when frontmatter opens but never closes (homeassistant-ai#36). - discover_script_surfaces raises when site/src/ is missing instead of silently producing partial results (homeassistant-ai#37). - extract_script_body accepts source_label for actionable errors (homeassistant-ai#40). - Astro `<script lang="js">` is no longer mis-tagged as TypeScript (#9). - Inert chr(92) Windows backslash replace removed (homeassistant-ai#14). - Field docstrings on ScriptSurface trimmed to the one that earns its keep (homeassistant-ai#52). - _PY_RENDERERS registry refactor + accurate enumeration comment (homeassistant-ai#45). test_settings_ui_js_behavior.py: - Rot-bait PR/issue numbers removed from module docstring (homeassistant-ai#43). - _TOP_LEVEL_ELEMENT_IDS + import-time drift check replaces the "refresh this manually" comment (homeassistant-ai#55). - _assert_clean_init helper called at the top of every test so init failures surface as init errors, not as misleading "side effect didn't fire" failures (homeassistant-ai#33). - 4xx restartBtn assertion now reads disabled state via JS and snaps to body.dataset instead of OR-shortcircuiting against a wiped DOM (homeassistant-ai#27). - New test_script_boots_without_broadcastchannel_global covers the null-guard branch (homeassistant-ai#15). - Assertion-restating comments removed (homeassistant-ai#60). test_astro_setup_js_behavior.py: - Rot-bait homeassistant-ai#1422 reference removed from module docstring (homeassistant-ai#44). - _section_has_hidden_class replaces fragile substring slicing (#6). - test_initial_state_only_client_section_visible now asserts on the promised visibility, not just absence of errors (homeassistant-ai#26). - Per-client smoke now captures config-output text AND instructions HTML into body.dataset and asserts on non-empty content, catching a typo that drops the whole per-client branch (homeassistant-ai#21). test_astro_tools_js_behavior.py: - _card_class helper replaces ±200-char substring slicing (#12). - test_design_mode_toggle now asserts design-only elements lose 'hidden' class, not just the button label flip (homeassistant-ai#25). - New tests cover .filter-btn / .cat-btn / .size-filter-btn / group-category|file|none / sort-alpha / expand-all wiring (homeassistant-ai#22, homeassistant-ai#23, homeassistant-ai#24) — the adjacent coverage gaps issue homeassistant-ai#1422 didn't name but that fit the harness's same regression-class. test_consent_form_js_behavior.py: - _build_form_dom docstring fixed (said "three", listed four) (homeassistant-ai#54). test_rendered_scripts_parse.py: - Missing-dependency skip flips to fail when CI=true so a workflow drift that drops the install step doesn't silently lose parse coverage (homeassistant-ai#29). - Subsumed-test-class reference removed from module docstring. AGENTS.md: - "60s probe windows take milliseconds" wording fixed; time-faked surface documented (homeassistant-ai#13). - Per-surface module naming guidance updated; reflects actual files (#10). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(test): seed data-transports on wizard tiles; add NODE/ESBUILD env overrides; FakeBroadcastChannel ctor guard CI surfaced a real test-fixture bug exposed by the new jsdomError → errors routing: setup.astro's connection-click handler reads `card.dataset.transports` via JSON.parse, but the wizard DOM stubs were emitting `<button data-client="...">` without the matching `data-transports` attribute. JSON.parse(undefined) threw "undefined is not valid JSON" on the jsdomError channel, which the previous silent-handling code dropped — now correctly surfaced as a test failure. Fix: serialise the real `transports` array from the clientsData entry onto each tile. Also addressing the items previously marked deferred / skipped during the Gemini + pr-review-toolkit triage: - NODE_BINARY env override (Gemini #3): _node_binary() helper checks the env var before falling back to PATH-resolved `node`. Default unchanged. - ESBUILD_BINARY env override (Gemini #8): _esbuild_binary() returns the env-var path when set, else the project-local install. Default unchanged so the lockfile-pinned install stays the reproducible default. - FakeBroadcastChannel constructor guard (sf-hunter #L1): wraps the `new` in try/catch and records construction failures into errors before re-raising. - Trim TestWizardStateMachine class docstring (comment-analyzer homeassistant-ai#50). - Tighten the info-call enumeration comment in the 5xx test to describe the harness's "last entry sticks" semantics rather than pinning a specific call count (comment-analyzer homeassistant-ai#48). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(test): stub layout-dependent JSDOM APIs (scrollIntoView, scrollTo, matchMedia) The new timer-callback-error routing surfaced a real JSDOM limitation: `section.scrollIntoView()` (called from the wizard's `scrollToSection` helper inside a setTimeout) is not implemented in JSDOM. Every per-client setup-flow test failed with ``timer callback: TypeError: section.scrollIntoView is not a function`` — production behaviour is fine, but the harness's noise filter wasn't distinguishing real script bugs from JSDOM-missing-API noise. Adds a defensive no-op stub for scrollIntoView (Element + HTMLElement prototypes), scrollTo on window, and matchMedia — the three most common layout-dependent APIs production UI scripts touch. Future rendered scripts that lean on other layout APIs (IntersectionObserver, etc.) can extend the list when needed. Also relaxes the per-client smoke's bare `assert not result.errors` to rely on `_assert_clean_init` (init/transpile/invoke/jsdom errors) plus the content-shape assertion. Timer-callback errors from missing JSDOM APIs are noise; the content-shape check still catches the regression class the test is named for. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(test): seed .tool-chevron on tool-card fixture for tools.astro expand-all test The expand-all handler queries `card.querySelector('.tool-chevron')!` (TypeScript non-null assertion). The runtime `!` doesn't actually check; chevron is null in the test DOM and `chevron.classList.add(...)` throws. Production cards include the chevron; our fixture didn't. Add it alongside `.tool-details` in `_build_tools_dom`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: kingpanther13 <kingpanther13@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…d findings (homeassistant-ai#1164) Round-3 review pass landed 13 verified findings; the cosmetic "persisted-value visible in UI after F5" item was explicitly skipped per user direction (UI shows post-gate value when master off; user accepts this since beta tools are actually disabled at runtime — the preserve-across-master-cycle UX is at the data layer, not the visual). Source fixes: - **Save button copy is now source-blind** — the previous "your feature-flag toggles already saved on click" claim only held when ``saveFeatureFlag`` raised ``restartNotice``; tool-config pin saves, backup-config saves, and cross-tab ``restart-required`` broadcasts also raise it. New copy: "a restart is pending. Click Restart above to apply your prior changes." (#2) - **Stale F.37 test docstring** describing the deleted server-side cascade rewritten. (#3) - **Beta-gate INFO log noise** — cascade-clear removal meant the gate could fire its "forcing %s=False" line every Settings rebuild, spamming addon logs once a user had truthy sub-flags persisted. Dedup via ``_BETA_GATE_LOGGED`` set per process, cleared on ``_reset_global_settings``. (#9) - **Lazy-lock docstring** updated to reflect Python 3.13 semantics (``asyncio.Lock()`` no longer takes a loop arg; the lazy pattern still serves test fixtures and single-loop deployment, with the invariant documented). (#10) - **Addon-mode carve-out comment** clarified to distinguish dev (master in schema) from stable (master web-UI-only). (homeassistant-ai#14) - **probe-div null branch** in F.37 now writes ``data-error`` so a failing test points at "selector missed" vs "value flipped" unambiguously. (homeassistant-ai#17) Tests added: - ``test_translations_cover_every_schema_key`` — parity check that every ``schema:`` key has a non-empty translation ``name`` and ``description``. Parameterised across stable + dev addons. Pins the class of silent gap that this PR's ``advanced_debug_logging`` fix addressed. (#4) - ``test_save_features_acquires_override_file_lock`` + ``test_save_advanced_acquires_override_file_lock`` — counting-lock wrapper asserts ``async with _get_override_file_lock()`` runs exactly once in each file-mode write path. Pin against a regression that silently bypasses concurrent-save serialisation. (#5) - ``test_dual_save_buttons_mirror_disabled_and_status_on_post_failure`` — exercises the 500-response branch of the dual-save mirror so a regression that broke ``_setAdvSaveStatus``/``_setAdvSaveDisabled`` for error paths only would still fail. (#6) - ``test_save_features_master_on_restores_subflag_values_in_addon_mode`` — addon-mode round-trip mirror of the existing standalone restore test; asserts the Supervisor merge-and-post call carries only the master flip-on and never zeroes out sub-flag values. (#7) - ``test_save_button_nothing_to_save_when_no_dirty_and_no_restart`` + ``test_save_button_restart_pending_hint_when_dirty_empty_but_restart_showing`` — both branches of the empty-dirty Save click are exercised; the restart-pending branch asserts the copy is source-blind. (#8) Deferred per user direction: - #1 (file-vs-Settings visual after F5): user accepts the current behavior (UI shows post-gate value; runtime tools actually disabled when master off; data-layer preserve still works end-to-end). - #11/#12/homeassistant-ai#13 (code-simplifier helper extractions): skipped as complicated to implement without behavior risk. - homeassistant-ai#15 (pre-homeassistant-ai#1164 users with already-cleared sub-flags): release-note concern, not a code change. - homeassistant-ai#16 (lock fragility under future thread-pool dispatch): speculative future-risk; not actionable today.
…ant-ai#1164) (homeassistant-ai#1431) * feat(policy): scaffold policy package for per-tool approval (#966) * feat(policy): add Predicate/Rule/Policy data models (#966) * feat(addon): expose enable_per_tool_approval option (#966) * feat(config): add enable_per_tool_approval setting (#966) * feat(policy): atomic load/save for tool_policy.json (#966) * feat(addon): wire enable_per_tool_approval through start.py + docs (#966) * feat(policy): args-hash + remember-cache for approval queue (#966) * feat(policy): predicate evaluator (eq/in/regex/exists/...) (#966) * feat(policy): pending entries with TTL, decisions, and event signalling (#966) * feat(policy): PolicyMiddleware happy-path branches (#966) * test(policy): cover block/deny/timeout/recall/remember branches (#966) * feat(policy): /api/policy/* Starlette handlers (#966) * fix(policy): wrap ValidationError, scope contains op, regex doc, test bind (#966) * feat(policy): Policies tab in web UI + sidecar route wiring (#966) * feat(policy): register PolicyMiddleware on the FastMCP server (#966) * fix(policy): return 400 on malformed approve/deny bodies (#966) * feat(toolsearch): unpin yaml-edit and code-mode tools, gated by approval middleware (#966) * chore: ruff format + lint cleanup for policy package (#966) - ruff format reflow on PR-touched files (case statements split to two lines, function signatures, line continuations). - UP042: Verdict now inherits from StrEnum instead of (str, Enum). - E402: hoist `import anyio` to the top of test_approval_queue.py. - I001: sort imports in test_evaluator.py and test_model.py. No behavior change. * fix(policy): mypy narrowing for evaluator comparisons (#966) `Predicate.value` is `Any | None` and `extract_path` returns `Any`, so `val == pv`, `val > pv`, etc. inherit `Any` and trip the project's `warn_return_any` mypy setting on functions declared `-> bool`. Wrap the comparison branches in `bool(...)` to make the narrowing explicit. Also guard the `regex` branch with `isinstance(pv, str)` so `re.search` receives a definite `str` instead of `Any | None`; a non-string regex value now returns False instead of raising TypeError at evaluation time, which is the only sensible behavior for a malformed pattern. No change to any test's expected outcome. * docs: credit @L1AD and PolicyLayer for #966 inspiration * docs(addon): fix wrong YAML example in enable_per_tool_approval section (#966) * fix(policy): CI green + critical bugs from reviewer cycle (#966) - expires_in_seconds: use time-remaining not total TTL - middleware: fail-closed on corrupt policy load (was crashing all gated calls) - handlers: 500-with-corrupt-flag on get_config when policy invalid - approval URL: wire secret_prefix via lazy getattr so HTTP-standalone emits a usable absolute path - approve/deny: return bool, 409 on already-decided - Predicate: field validators for op/value compatibility (Gemini) - persistence: explicit UTF-8 encoding (Gemini) - middleware: reuse tools/helpers safe_progress - handlers comment: fix wrong "next call" claim - test_middleware: unwrap ExceptionGroup for pytest.raises (CI fix) - test_stdio_settings_sidecar: include new policy_* handler keys (CI fix) * refactor(policy): drop default_action + tighten Rule.tool_name (#966) - Policy schema simplified: no default_action field. System is always "allow unless a rule matches; rule = require approval". The previous default_action='require_approval' option was a bricked-config trap since rules can't grant allow-overrides. - Rule.tool_name now rejects empty string; wildcard '*' documented in the docstring. - Evaluator simplified to match. - Tests updated/added. * refactor(policy): encapsulate decision state + clean naming (#966) - PendingApproval.decide() encapsulates the decision/event coupling; property guards read-only access to decision. - __post_init__ validates expires_at > created_at. - ApprovalQueue docstring spells out single-process scope and restart-loses-tokens semantics. - Rename args_preview -> args throughout (it was always the full unmodified args; "preview" was misleading). - Remove on_policy_change dead parameter from build_policy_handlers. * fix(toolsearch): default-pinned tools should be user-unpinnable (#966) - Computed pinned set now respects tool_config.json — explicit "enabled" state removes from defaults. - Remove ha_restart / ha_reload_core from DEFAULT_PINNED_TOOLS (recovery actions, low frequency, low value in default LLM tool surface). - Add ha_manage_backup to MANDATORY_TOOLS (operational essential). - Server now declares _settings_secret_prefix on __init__ for pyright. * refactor(policy): rename feature to "Tool Security Policies" (#966) User-facing rename: addon config option, env var, Settings attribute, UI tab label, addon DOCS sections, translations, server method. Internal Python naming (policy/ package, tool_policy.json, /api/policy/* routes, class names like PolicyMiddleware/ApprovalQueue) unchanged for less churn. * docs: small comment polish for policy review nits (#966) * feat(ui): per-tool security-gated toggle in Tools tab (#966) * test(policy): integration + timing-isolation coverage gaps (#966) - test_server_policy_wiring: assert _apply_tool_security_policies attaches middleware + approval_queue when enabled, neither when disabled. - test_settings_ui_handler_selection: parametrize the live-vs-stub branch in build_settings_handlers (sidecar / no server / no queue / live). - test_middleware wait-loop timing: assert event-wake exit, not polling. - test_middleware multi-rule precedence: first-match wins for remember_minutes. * feat(ui): rewrite Tool Security Policies tab — per-tool cards + predicate editor (#966) * feat(config): expose enable_tool_security_policies as a feature flag (#966) Wires the new addon-config toggle into FEATURE_FLAG_FIELDS so it appears in the Server Settings tab and rides PR #1420's _save_feature_flags + _supervisor_merge_and_post_options + _schedule_supervisor_self_restart flow when toggled in addon mode. * fix(policy): address all verified review findings + CI failures (#966) CI fixes: - ruff: drop dead noqa: SLF001 suppression - unit test: test_missing_path_never_matches_except_exists uses op-compatible values so Predicate field_validator doesn't reject Review findings: - FEATURE_META entry for enable_tool_security_policies (toggle now renders in Server Settings tab) - Approval URL points to /settings?tab=tool-security-policies (was POST-only /api/policy/approve which 405'd on browser open) - policyDecide surfaces network errors + 409 current_decision - Policy gains version field for optimistic concurrency; PUT 409s on version mismatch; client surfaces 'reload before saving' - _apply_tool_security_policies failure logs spell out security impact (TOOL SECURITY GATING IS NOT ACTIVE) and include data_dir/env-var context - Validator rejects value on op='exists'; gt/lt TypeError degrades to False - ToolVisibilityResult -> UserToolStateOverrides, fields are frozenset, disjointness asserted - PendingApproval.event private; expose async wait() - _SupervisorOptionsError gains transport()/validation() classmethods encoding kind->status_code pairing - Wiring test binds queue identity; handler-selection covers all 3 live routes - Comment + doc polish (audit-trail claim, e2e docstring path, internal Task references) * fix(policy): CI green + real e2e test for the approval flow (#966) - ruff format: tests/src/unit/test_settings_ui.py - test_save_and_roundtrip: account for save_policy version bump - test_serialized_shape_is_stable: include 'version' in expected keys - test_addon_save_returns_500_when_server_is_none: guard server._settings_secret_prefix assignment with None check (regression from #4's secret-prefix wiring) - tests/src/e2e/policy/test_approval_flow.py: real e2e exercising block -> approve -> re-call cycle with strict args-binding rejection on mutated args. Skip-stub replaced with real test driving the live middleware via mcp_client + /api/policy/* HTTP. * fix(ui): broken quote escaping in predicate-form placeholder breaks JS parse (#966) The Python source `'placeholder=\\'\"lock\"...\\\\'>'` rendered as JS `'placeholder='\"lock\"..'>'` — the single quote inside the HTML attribute value closed the outer JS string literal, and subsequent tokens (\"lock\", 'or', '[', ...) broke parsing. With a syntax error in the inline <script>, the browser stopped executing — Tools tab stuck on 'Loading...', tabs unclickable. Switched to a JS-safe double-quoted attribute with " for the embedded double quotes in the placeholder hint. * fix(ui): gated toggle reads addon-config flag, not Policy.enabled (#966) The per-tool 'security gated' toggle was grayed out even when the user had enable_tool_security_policies turned ON in the addon config + the Server Settings tab toggle, because the JS was reading Policy.enabled (the file field) instead of the addon-config feature flag — which is the single source of truth for whether the middleware is active. loadPolicyState now reads enable_tool_security_policies from /api/settings/features (same place renderFeatureFlags consumes from). * fix(policy): Policy.extra=ignore so old persisted files load cleanly (#966) Persisted tool_policy.json files from an earlier revision of this PR carry default_action (since dropped) and rejected with ValidationError on load — surfacing as 'Could not load policy: 500' when the user clicked the per-tool gated toggle. Predicate/Rule keep extra=forbid (typo catching at construction). * fix(policy): drop Policy.enabled — addon-config flag is the sole switch (#966) The middleware's server-side gate was checking `policy.enabled` (a file field with no UI surface), so it returned ALLOW on every call regardless of rules. The addon-config flag (`enable_tool_security_policies`) was supposed to be the only switch — and the middleware is only registered when that flag is true — so the inner `policy.enabled` check was both redundant and broken. Remove the field, remove both server-side checks (middleware + evaluator), update tests, and refresh the JS comment that referred to it. * fix(policy): drop approve_url, instruct LLM to send user to settings page (#966) The relative-path approve_url doesn't resolve cleanly through cloudflared or other reverse-proxy deployment modes — the LLM can't safely hand it to the user. The user already knows where the Tool Security Policies tab is (they set the rule from it), and that page lists all pending approvals, so a per-request URL is unnecessary noise. - Drop approve_url from USER_APPROVAL_REQUIRED context; keep `token` so a caller could correlate but the user doesn't need to act on it. - Update message + progress text to instruct the LLM to tell the user to open the settings UI Tool Security Policies tab. - Drop the now-unused approval_url_builder param + the _settings_secret_prefix plumbing in server.py / settings_ui.py. Also fix the failing test_defaults (asserted dropped Policy.enabled field) and the e2e test PUT body that still carried `"enabled": True`. * feat(policy): schema-driven condition builder for write/destructive tools (#966) The previous "Add predicate" UX required users to type both the dotted arg path (e.g. `args.domain`) and the value as JSON. Two problems: 1. They need to know what fields each tool takes. 2. They need to know what values are legal (which HA domains exist, which entities, etc.). Replace the free-text path input with a dropdown sourced from the tool's JSON schema, and replace the free-text value input with a (multi-)select sourced from HA when the path has a known value source (domain, service, entity_id today; trivially extensible). Free-text is still available via an "(other — type a path)" escape hatch and as the automatic fallback for ops that don't pair with a registry (regex / contains / gt / lt). Server: - New `/api/policy/tool-schema?name=...` returns `{paths: [...], value_sources: {path: source_key}}`. Read-only tools return empty paths so the UI falls back to free-text (gating those is low-value but still permitted manually). - New `/api/policy/value-source?source=...` resolves a source key to a live list of choices. In-process 30s TTL cache avoids hammering HA when the user explores paths. - value_sources.py registry maps (tool_name, arg_path) → source_key for the common write/destructive surface (call_service, set_entity, set_integration_enabled, get_history, etc.). New mappings are one dict entry plus, if a new source key, one fetcher. - Both endpoints mount in addon + secret-prefix routes. Sidecar serves 503 stubs (no FastMCP registry / HA client in that process). UI: rename user-facing "predicate" → "condition" (CS jargon → SQL/JIRA terminology users actually recognise; internal Pydantic class stays `Predicate` so the wire format is unchanged). Form fetches the schema lazily on first open, caches it on the card, refetches value choices when path/op changes. Includes test_schema_handlers.py covering: missing-name 400, sidecar 503, unknown-tool 404, read-only empty-paths, write-tool paths + registry, JSON-schema enum passthrough, value-source 400 paths, both HA-services payload shapes, domain filtering for entities/services, and upstream-fetch 502 mapping. * test: include new policy handler keys in sidecar all-keys assertion (#966) * fix(ui): clearer condition-builder labels, optional value, bareword input (#966) User feedback on the new form was: 1. "args.foo" path placeholder is gibberish; no real label on path/value 2. value box should not be mandatory for ops where backend allows None 3. typing `lock` into the value box errored with "Invalid JSON" — every normal-looking input has to be quoted 4. for ha_call_service `data` was the only arg without an obvious meaning Changes: - Real `<label>`s on the form rows: "Argument:", "Match when:", "Value:". - Op dropdown shows friendly text ("is present (any value)", "equals", "is one of", "matches regex", etc.); wire values unchanged. - Hint line under the value row reflects the current op so users know whether a value is required and roughly what shape it should take. - Value is now OPTIONAL for ops where the backend accepts a missing field (exists, eq, neq, contains). Submitting an empty value omits the `value` key from the predicate entirely. - Bareword inputs auto-coerce: `lock` → `"lock"`, `lock,alarm` → list, `42` → number, `true` → bool. Falls back to a clearer error if even the smart-coercion can't make JSON. - Path dropdown options now carry the schema `description` as a `title` tooltip, so `data` reads as "Service data dict" on hover instead of being a mystery. - Schema-declared enums render as a value dropdown automatically (no registry entry needed) when the path's JSON-schema has `enum`. Also fix /tmp/extract_js.py — naive paren-counter broke once form strings started containing parens; switch to ast.parse so future edits don't silently break the harness. * feat(policy): wildcard path "args.*" + clearer empty-value semantics (#966) User asked for a catch-all "any argument equals X" condition and called out that the previous "Leave blank to gate on null" hint was nonsensical — a blank value should mean "any value" to a normal user, not "match the null literal". Backend: - Refactor evaluator: `extract_path` → `iter_path_values`, which yields every value the dotted path resolves to. A `*` segment fans out across the current node (dict values for dicts, items for lists). A path like `args.*` thus yields every top-level arg; `args.config.*` yields every leaf of the config sub-dict. - `match_predicate` rewrites to "ANY matching value satisfies the op", which collapses to the previous single-value semantics for non-wildcard paths. So `path=args.*, op=eq, value="lock"` gates whenever any arg of the tool call equals "lock". UI: - New "(any argument)" option at the top of the path dropdown, fills `args.*` and carries a tooltip explaining the semantic. - VALUE_OPTIONAL_OPS shrinks to just `exists` — blank value is no longer silently accepted for eq/neq/contains. Instead, the value-required error fires, and the hint text under the field tells the user to switch op to `is present` if they wanted "any value". - Hint copy revised across all ops so the "what happens with this op + blank value" question has a clear answer at each step. Tests: - New TestIterPathValues covering top-level, nested, missing, and the three wildcard shapes (dict values, list items, empty). - New TestWildcardPredicate covering eq/in/exists/regex matching via `args.*` plus an end-to-end evaluate() test. - Existing tests still pass with the refactored matcher; signatures of the public functions are unchanged. * fix(ui): default condition path to '(any argument)'; relabel error (#966) - Drop the '(pick an argument)' placeholder; default the path dropdown to '(any argument)' so the form is immediately submittable. - 'path is required' error reads 'argument is required' if it ever fires (it won't on the happy path now). * feat(policy): auto-save conditions + surface matched_rule in approval error (#966) UI: - Drop the manual "Save changes" button on each rule card. Conditions now PUT to disk the moment the user clicks "Save condition", clicks the × on a condition row, or edits the remember-minutes field (debounced 500ms). The only feedback is a small "Saving…" / "Saved." status line next to the card. - Removed the now-dead .policy-save-rule CSS and the markDirty helper. Server: - USER_APPROVAL_REQUIRED error context now carries `matched_rule` with the rule's tool_name + when[]. Lets the user (and the LLM) tell at a glance which rule fired, instead of guessing whether their condition saved correctly. * feat(policy): case-insensitive string comparison in all ops (#966) Security gates shouldn't fire differently based on whether the LLM capitalised its argument — 'Lock' and 'LOCK' and 'lock' are the same operationally. eq/neq/in/not_in/contains lower-case both sides before comparing when both are strings; regex uses re.IGNORECASE. Non-string types pass through unchanged so int(1) != str('1') still holds. * fix(policy): mypy bool cast + broaden e2e coverage (#966) mypy: bool(_ci(val) == _ci(pv)) — _ci returns Any (passes non-strings through unchanged), so eq/neq comparisons need an explicit bool wrap. Tests: previous e2e only covered the happy block→approve→re-call path. Add four more cases against the live testcontainer: - wildcard `args.*` gates when any arg matches the value - wildcard `args.*` passes through when no arg matches - case-insensitive matching (rule 'lock' gates caller 'LOCK') - deny → middleware raises USER_DENIED, tool never runs - remember_minutes>0: second call within the window skips the queue * refactor(policy): address review-cycle findings (#966) Gemini (6 unresolved threads): - Migrate POLICY_LOAD_FAILED / USER_DENIED / USER_APPROVAL_REQUIRED off manual `ToolError(json.dumps(...))` onto the canonical `raise_tool_error(create_error_response(...))` pattern. Added the three error codes to ErrorCode enum. - Hoist sync `load_policy()` off the event loop via `anyio.to_thread.run_sync` in the middleware's policy provider call. - Add justification comment on `local_provider._list_tools()` (same rationale that's already documented in `settings_ui.py`'s tool enumerator: public `list_tools()` filters disabled tools but operators may still want to author gating rules for them). Code-reviewer findings: - ApprovalQueue TOCTOU: two concurrent `on_call_tool` coroutines with identical (tool, args_hash) could both miss `find()` and create duplicate pending entries; approving one would leave the other waiter blocked. Introduce `find_or_create(...)` serialised behind an `anyio.Lock`; middleware now uses it. - ApprovalQueue had no pending-entries cap → memory exhaustion under an LLM retry-loop with mutated args. Add `PENDING_CAP = 1000` with FIFO eviction of oldest entries when the cap is hit. Silent-failure-hunter findings: - handlers.py: `get_tool_schema` and `get_value_source` now `logger.exception` before returning 500/502 so FastMCP version bumps or HA outages leave a traceable signal instead of opaque client errors. - value_sources fetchers `logger.warning` on unexpected HA response shapes (would otherwise silently return empty dropdowns). - value_sources cache no longer stores empty results — a transient HA glitch returning [] would otherwise pin the dropdown blank for 30s. PR-test-analyzer findings (the critical one): - test_persistence.py's `test_save_and_roundtrip` passed `Policy(enabled=True, ...)` for a field that no longer exists; `extra="ignore"` silently dropped it so the test was a no-op assertion. Replace with real round-tripped fields (wait_seconds / approval_ttl_minutes / remember_minutes) and add an explicit `test_load_drops_unknown_fields` exercising the extra="ignore" back-compat contract with a JSON file carrying `default_action` + `enabled`. - Add wildcard scalar/None tests (`args.x.*` against scalar yields nothing; doesn't crash). - Add ApprovalQueue tests: concurrent `find_or_create` shares one pending entry; `create` evicts oldest at PENDING_CAP. - Add handler tests: sidecar value-source returns 503, tool-schema 500 on `_list_tools` exception, value-source cache key separates per params, `_extract_arg_paths` skips malformed property entries. Comment-analyzer findings: - Grammar fix in handlers.py `_is_write_or_destructive` docstring. - model.py docstring "older version of this PR" → "older builds". - Strip the `(#966)` / `(issue #966)` parentheticals from module docstrings, settings_ui CSS/HTML/comments — git blame and the commit message carry the link. * fix(policy): UI surface fetch failures + middleware reissues swept pending (#966) - Middleware: after _wait_for_decision returns without a verdict, check whether the pending entry was swept (TTL elapsed during the wait). If so, create a fresh entry before raising USER_APPROVAL_REQUIRED so the LLM isn't told to re-call against a dead token. - UI: policyLoadConfig now surfaces fetch failures in a visible error banner instead of silently rendering blank — picks up the policy_file_corrupt:true repair hint from the server's 500 response. - UI: loadValueChoices records the failure (lastValueSourceError) so renderHint can show it under the value row. The dropdown still downgrades to free-text, but the user can now tell a transient HA outage from "no value source registered for this path". - UI: renderValueControl uses an autoincrement seq so rapid path/op edits don't let an earlier slow fetch's DOM mutation land after a newer one's (similar to the autoSave pattern). * fix(policy): logger.info on silent decide-False; debug log on gt/lt type-mismatch; strengthen event-wake test (#966) - ApprovalQueue.approve/deny: emit logger.info when the call returns False (unknown token or already decided) — was silent. Helps debug the case where the middleware's consume_and_maybe_remember races with an out-of-band decide. - Evaluator gt/lt TypeError fallback now logs at debug so a user whose 'battery_level < 20' rule never fires can see that the arg came in as a string and tighten the rule. - test_event_wakes_waiter now measures elapsed wait time and asserts < 200ms, ruling out a hidden poll-loop impl that would still pass the previous decision-only check. * style: ruff format evaluator.py for CI's 0.15.13 (#966) Local ruff 0.15.7 didn't wrap the multi-arg logger.debug call; CI's ruff 0.15.13 does. Upgrading local toolchain to match. * feat(policy): clear remember-cache on save, clearer disabled-state UX, mirror master toggle (#966) Three things: 1. Remember-cache invalidation on policy save (B2). ApprovalQueue.clear_remember_cache() drops every remembered approval; put_config calls it after a successful save. Without this, tightening a rule was silently bypassed by any in-flight remembered approvals until their window expired. 2. Better 503 / "unavailable" messaging (B8 + the broader issue). The stub handler's 503 message used to read "Live approvals unavailable in this mode (sidecar)" even when the real cause was the feature being turned off in addon config — the user had no way to tell from the UI. Updated to call out all three causes (feature off, sidecar, ImportError) and point at the addon log. The pending-list JS now checks policyState.enabled first and shows "Tool Security Policies is turned off" when that's the actual reason, falling back to the server's 503 message otherwise. 3. Mirror the master toggle onto the Tool Security Policies tab. Was only exposed in Server Settings before — users on the Policies tab had to navigate away to find the on/off switch. New checkbox at the top of the tab posts to the same /api/settings/features endpoint, so the two surfaces are live mirrors of the same addon-config flag. * test(policy): fix JS-harness drift guard + lock policy-tab behaviour (#966) The merged-in JSDOM behaviour test (#1425) failed collection because its hardcoded _TOP_LEVEL_ELEMENT_IDS list didn't yet know about the policy-tab handlers this PR adds (policy-master-toggle, policy-save-global-btn). Add them, plus matching DOM stubs in _build_min_dom so the init pass doesn't throw on the addEventListener calls. While the file is open, add three behavioural tests that pin the new condition-builder UX wiring: - Master toggle change POSTs to /api/settings/features with the enable_tool_security_policies flag (so the on-tab toggle stays a true mirror of the Server-Settings checkbox). - /api/policy/pending 503 renders "Tool Security Policies is turned off" when the addon flag is off (avoids the old misleading "sidecar / unavailable" copy). - /api/policy/pending 503 propagates the server's addon-log message verbatim when the flag IS on but the queue is unreachable (so users know where to look for ImportError details). The parse-coverage path catches syntax breaks already; these tests catch behavioural regressions on top of it. * refactor(policy): address 2nd-round review findings (#966) Verified all 23 findings from the 2nd pr-review-toolkit pass against the code; fixed 22 (skipping #13 — the JSDOM seq-cancel race test is high-effort to author reliably and the production guard is small enough that bench-level review catches regressions). ## Correctness / silent-failure - ApprovalQueue PENDING_CAP eviction now sorts by `(decision == "pending", created_at)` so resolved entries evict first. When a still-pending entry MUST be evicted (cap full, no resolved to drop), `.set()` its event so any waiter in `_wait_for_decision` wakes immediately instead of blocking the full wait_seconds against a row that no longer exists. - Middleware: log INFO with old + new token on the reissue-after-sweep branch so operators can correlate "approval row keeps reappearing" with the actual cause. - Middleware: scope `clear_remember_cache` to "rules actually changed" — editing only wait_seconds / approval_ttl_minutes no longer blows away in-flight remembered approvals. - Policy: model_validator requires `wait_seconds < approval_ttl_minutes * 60` so the middleware can't repeatedly issue fresh pending entries because the wait outlasted the TTL. - value_sources: cache key uses `urllib.parse.urlencode` so a future param value containing `=`/`&` can't collide with another key. - ApprovalQueue: `approve`/`deny` on unknown token now logs WARNING (security-gating endpoint, suggests UI bug or token probing). Already-decided stays INFO (legitimate race). ## UI - settings_ui.policyState gains an `enabledKnown` tri-state bit so downstream branches (`policyLoadPending`'s "feature off" copy, master-toggle revert) don't false-confidently route to the "disabled" message when the features fetch actually failed. - Master-toggle change handler reverts the checkbox on save failure AND syncs from `policyState.enabled` after a successful load — no more "UI says on, server says off" drift. - `policyLoadConfig` appends "(response body unparseable)" when the 500 body isn't JSON (e.g. HTML error page from a misrouted sidecar), so the operator sees more than just "HTTP 500". - `policyLoadPending` surfaces fetch errors inline ("Lost contact with server, retrying") instead of silently freezing the list. - `fetchToolSchema` records `lastValueSourceError` on failure so the hint banner explains why the value dropdown silently downgraded to free text. ## Tests - test_handlers: `test_put_config_clears_remember_cache_when_rules_change` + `test_put_config_preserves_remember_cache_when_only_timing_changes` lock in the scoped invalidation. - test_approval_queue: `test_create_evicts_resolved_entries_before_pending`, `test_evicting_pending_wakes_its_waiter`, `test_create_after_sweep_still_evicts_when_pending_fills_cap` cover the new eviction rules. The strengthened `test_find_or_create_lock_blocks_concurrent_create_under_real_race` inserts a yield point inside the lock body so the lock actually matters to the assertion (the previous test would pass even without the lock under anyio's cooperative scheduler). - test_middleware: `test_swept_pending_during_wait_is_reissued_with_fresh_token` exercises the previously-untested reissue branch. - test_schema_handlers: `test_value_source_empty_result_not_cached` proves the empty-result no-cache guard actually triggers a refetch. - test_settings_ui_js_behavior: master-toggle test now JSON-parses the POST body and structurally asserts `flags.enable_tool_security_policies is True` instead of loose substring matching. ## Comments - approval_queue.py PENDING_CAP docstring now matches implementation (was promising "resolved first" before the implementation actually did it). - evaluator.py: gt-branch comment example uses ">" not "<". - handlers.py: dropped cross-reference to settings_ui that would rot. - middleware.py: trimmed "fast on warm disk" speculation. - value_sources.py: trimmed "(WebSocket reconnect, auth lapse)" speculation in the empty-cache comment. - settings_ui.py: four comments still saying "predicates" updated to "conditions" to match user-facing terminology. * fix(ui): blank value on eq/in/etc coerces to op=exists (#966) User expects 'leave value blank to gate on the argument's mere presence regardless of value' to work across the equality-ish ops, not just op=exists. Earlier I had the form reject blank value for anything other than exists with a 'value is required' error. Now: for eq / neq / in / not_in / contains / exists, leaving the value blank silently coerces the predicate to op=exists on save. The condition row then reads as 'args.* exists' which is the right description of what's stored. Ops that genuinely need a value (regex / gt / lt) still raise 'value is required for op=...'. The hint text under each op updated to call out 'Leave blank to gate on any value' where it applies. * docs(addon): drop beta tag from Tool Security Policies (#966) The feature is stable enough to ship as a default-supported addon config option, not a beta. Also corrects two pieces of doc drift that landed here originally: - 'approval URL' wording → 'tell the user to open the Tool Security Policies tab' (the URL field was dropped earlier in this PR) - 'predicates' → 'conditions' (matches the user-facing terminology the UI now uses) Touches both prod and dev addon directories (config.yaml-driven UI text + the rendered DOCS.md). * docs(beta): describe 3-path enabling (dev addon, stable + web UI, env vars) (#1164) * feat(config): advanced settings registry + beta master toggle field (#1164) - Add ``ADVANCED_SETTINGS_FIELDS`` registry (21 fields across connection, search, operations, diagnostics, tools_surface, beta_codemode sections) - Add ``_ADVANCED_SETTINGS_BOUNDS`` and ``_ADVANCED_SETTINGS_CHOICES`` dicts for UI/POST validation - Add ``BETA_FEATURE_FIELDS`` tuple for master-gate enforcement - Add ``enable_beta_features`` Settings field (alias ENABLE_BETA_FEATURES, default False) as the master beta toggle - Update ``FEATURE_FLAG_FIELDS``: add ``enable_beta_features`` at front, ``enable_code_mode`` at end; reorder for UI grouping - Extend ``BACKUP_OVERRIDE_FIELDS`` from 3 to 5 entries (add ``auto_backup_dir`` and ``auto_backup_calendar_lookahead_days``) - Extend ``_apply_backup_overrides`` to handle ``str`` type; widen ``coerced`` annotation to ``bool | int | str``; add bounds check for ``auto_backup_calendar_lookahead_days`` (1..365) - Add coverage gate test asserting every Settings env alias is registered in one of the three panel registries (or in the explicit ALLOWLIST) - Add ``test_enable_beta_features_default_false`` * refactor(config): code-review fixups — docstring clarity, stricter bool reject, null-byte guard (#1164) * feat(settings-ui): per-tool env-pin for DISABLED_TOOLS / PINNED_TOOLS (#1164 addendum) Add env_pinned_tools() and effective_tool_config() helpers so tools listed in DISABLED_TOOLS / PINNED_TOOLS env vars stay read-only at runtime even after tool_config.json has been written. The _get_tools GET handler now includes env_pinned metadata per tool entry; _save_tools rejects incoming flips of env-pinned tools with HTTP 409. Server.py startup path updated to use effective_tool_config() so env pins apply at boot. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(config): master beta gate + advanced overrides apply (#1164) - Rewrite _apply_feature_flag_overrides: lift addon-mode short-circuit for beta fields (enable_beta_features + BETA_FEATURE_FIELDS), add master gate that forces all 5 beta sub-flags to False when master is off regardless of env/file state - Update get_feature_flag_origin: beta fields never return "addon" — they follow standalone precedence in either mode - Add _apply_advanced_overrides: reads feature_flags.json for all editable ADVANCED_SETTINGS_FIELDS entries; skips display-only fields; validates types, bounds, and choices before setattr - Wire _apply_advanced_overrides into get_global_settings (runs after feature-flag + backup passes) - Add 18 unit tests covering master gate semantics, addon-mode carve-out, advanced-override int/str/float/display-only/out-of-bounds/invalid- choice cases, and backward-compat (pre-master override files) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor(settings-ui): code-review fixups — drop dup env_pinned, settings param, msg + tests (#1164) Address code-quality review feedback on 32d67b4d: - Drop the redundant per-tool-entry `env_pinned` field from the GET /api/settings/tools response. The top-level `env_pinned` map is the single source of truth; UI does O(1) lookups against it. - `effective_tool_config()` now accepts an optional `settings` parameter (mirrors `load_tool_config()`); restores dependency injection at the server.py startup callsite (`self.settings`). - 409 rejection message uses comma-joined names instead of Python list repr for better human readability; structured `context.rejected` remains for programmatic access. - Add `test_get_tools_includes_env_pinned_map` test covering the new GET response field. - Symmetrize `get_data_dir.cache_clear()` calls at both ends of each tmp_path test so cross-test cache pollution can't leak in. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor(config): code-review fixups — docstring accuracy, top-level import, null-byte test (#1164) - Correct _apply_advanced_overrides docstring: most advanced fields ARE in the addon config.yaml schema (backup_hint, verify_ssl, enabled_tool_modules, etc.) and are handled correctly via the env- var-wins check because start.py exports them. Only code_mode_* and mcp_server_version are file-only in either mode. - Move `from typing import Any` from function body to module-level imports (stdlib typing — no need to defer). - Add test_advanced_override_str_field_with_null_byte_rejected to cover the previously-unexercised null-byte reject branch. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(settings-ui): render env-pinned tool rows as read-only + update addon translations (#1164) - Add `toolEnvPinned` module-level map; populate from `data.env_pinned` in `loadTools()` - Tool rows for env-pinned tools get `.env-pinned` class, all inputs disabled, and a `feature-locked-note` banner naming the env var (DISABLED_TOOLS or PINNED_TOOLS) - Group master toggle excludes env-pinned tools from bulk enable/disable - Update `pinNotice` copy to describe the env-pinned lock-until-unset semantics - Update `homeassistant-addon-dev/translations/en.yaml` descriptions for `disabled_tools` and `pinned_tools` to reflect that they are operator-level locks, not seed-only values - Add JSDOM behavioural tests for env-pinned disabled and pinned tool rows * feat(settings-ui): /api/settings/advanced GET+POST handlers (#1164) * feat(settings-ui): render advanced settings sections in Server Settings tab (#1164) * feat(settings-ui): beta master toggle + nested sub-row gating + 409 rejection (#1164) * feat(settings-ui): nest code-mode sub-numerics under enable_code_mode (#1164) * test(addon): assert start.py auto-enables master beta in dev addon mode + stable schema absence (#1164) * feat(addon): start.py auto-enables ENABLE_BETA_FEATURES=true when dev addon options carry beta keys (#1164) * fix(settings-ui): post-CI/post-review fixes — backup-config str+range, addon-mode gate skip, e2e env, JSDOM ids, stale comments (#1164) * fix(config): beta-sub-flag origin returns addon in dev mode (env var presence as signal) (#1164) * fix(tests): addon-mode save tests use non-beta flag matching new origin semantics (#1164) * refactor(settings-ui): loadAdvancedSettings error parity + atomic-write helper reuse + beta_sub_flags via API (#1164) * refactor(config): narrower setattr errors + hasattr precheck + Gemini coerced-decl nit (#1164) * refactor(settings-ui): hoist override-file read + display-only warning + missing test coverage (#1164) * test(settings-ui): section-uniqueness + registry-disjoint + tighter env-pin assertions + accurate docstring (#1164) * refactor(addon): extract maybe_auto_enable_beta_master helper + real unit tests (#1164) * test(settings-ui): JSDOM coverage for advanced sections + master live-render; verify file untouched on 400 (#1164) * refactor(config): NamedTuple registries + import-time validator; finish silent-failure + JSDOM nesting tests (#1164) * fix(tests): restore standalone-mode assertion + add adv section containers (#1164) `replace_all` from an earlier sweep had pivoted the standalone-mode save assertion onto a beta sub-flag, so the master-gate guard now rejected the request and the test failed. Restore the non-beta `enable_tool_search` flag here — the assertion is about the unified save-contract shape, not the beta path. JSDOM `TestAdvancedSectionRender` tests were failing because MIN_DOM lacked the five `adv*` section containers; `renderAdvancedSection` would silently no-op (getElementById returned null) and the assertions fired against an empty body. Add `advConnection`, `advSearch`, `advOperations`, `advToolsSurface`, `advDiagnostics` to `_TOP_LEVEL_ELEMENT_IDS` so `_build_min_dom` emits `<div>` containers for them. * fix(tests): adopt master beta gate + new advanced handler keys (#1164) Three failures surfaced after rebasing onto upstream/master: 1. ``test_returns_all_handler_keys`` expected the pre-#1164 handler set. Add ``get_advanced_settings`` / ``save_advanced_settings`` to the expected keys. 2. ``test_tools_filesystem.TestFeatureFlag::test_enabled_with_*`` broke because the master beta gate now forces every beta sub-flag False at runtime when ``ENABLE_BETA_FEATURES`` is unset. Set both env vars together in the enabling tests so they exercise the sub-flag bool parsing in isolation, and add an explicit test for the gated behavior so a future regression in ``_apply_feature_flag_overrides`` surfaces here too. 3. ``test_yaml_config_tool.enable_flag`` fixture sets ``ENABLE_YAML_CONFIG_EDITING`` but didn't set the master, so the cached settings landed with the sub-flag forced False — would have broken next-up after the filesystem tests. Set ``ENABLE_BETA_FEATURES`` alongside. * fix(policy): contains operator case-insensitive on list-membership branch Pre-fix: case "contains": if isinstance(val, str) and isinstance(pv, str): return pv.lower() in val.lower() # CI return isinstance(val, (list, tuple, set)) and pv in val # case-SENSITIVE The string-in-string branch was already case-insensitive (matching the ``_ci``-equivalent treatment that ``eq`` / ``in`` / ``not_in`` apply), but the list-membership branch fell through to Python's default ``in`` operator. A rule listing ``["light.kitchen"]`` would not fire on an LLM passing ``["Light.Kitchen"]`` — silent gate failure. Bring it in line with the other string-op branches via per-element ``_ci``, which passes non-string entries through unchanged so mixed-type collections still get natural equality semantics. Caught by Gemini Code Assist on #1431; addressing inline rather than opening a follow-up because the policy module is now in master (#1421) and any reviewer running the suite would see the case-sensitivity asymmetry in the existing TestCaseInsensitive class. * feat(settings-ui): addon-aware locked banner, beta-at-bottom, danger warning, dual save buttons, fork-dev stable copy (#1164) Five user-feedback fixes against the Server Settings UI: 1. Locked-banner copy adapts in addon mode. The standalone "Set via env var X — unset it to edit here." copy is misleading in HA addon mode where the operator has no env-var surface (start.py writes the env vars from /data/options.json; Supervisor writes the rest). Endpoints now return is_addon; the JS helper envLockedNoteHtml swaps in addon-aware copy that points users at the addon Configuration tab. Master beta gets an extra hint explaining the auto-enable rule. 2. Beta block rendered into a dedicated bottom-of-panel betaBody container instead of featuresBody. The dangerous block sits last so users see safer settings first; a "Beta features (dangerous)" header in warning color marks the boundary. 3. enable_beta_features help-text rewritten to lead with an explicit danger warning (permanent damage to HA, no warranty, take a backup, own risk). Mirrored as a blockquote at the top of the dev addon's beta-options section in DOCS.md so the addon UI also surfaces the risk. 4. Save button redesigned — primary-CTA styling (bigger, accent background, hover state), duplicated at the top of the panel so a user scrolling either end can hit save, and paired with a prominent two-step note explaining that Save → Restart are both required for changes to take effect. 5. New scripts/fork-dev/copy-stable.sh + restore-dev.sh let maintainers whose only HA test path is the fork-dev addon flip homeassistant-addon-dev/ between dev-flavor and a stable-mirroring "stable test" flavor. Round-trip clean — copy → restore restores the index exactly. JSDOM behavioural coverage added for each of (1)-(4); MIN_DOM updated with the new top-row + beta-section element ids. * fix(tests): JSDOM beta-block tests assert against production HTML / fixed regex (#1164) Three CI failures from the previous commit: - ``test_beta_section_header_present_with_danger_styling`` and ``test_two_step_save_note_present`` asserted on static ``panel-server`` markup that lives in the rendered HTML template, not in any JS-populated container. MIN_DOM doesn't replicate the full panel-server children (by design — it's a minimal handler stub), so the assertions never found their target strings. Switch both tests to assert directly against ``_SETTINGS_HTML``; the presence of the markup at the template level is the property we actually want to lock down. - ``test_beta_rows_render_into_betaBody`` used a regex ``<div id="betaBody">(.*?)</div>\s*<div`` whose ``</div>\s*<div`` boundary matched the very first nested ``</div><div`` *inside* the master row (after the ``.feature-name`` close tag), so ``bb_content`` only captured the header of the master row. Anchor the boundary on ``</div>\s*</body>`` instead — non-greedy capture between the ``betaBody`` open tag and the body close still gives the full container content. Added a fallback path that asserts on class markers + non-leakage to featuresBody if the regex still misses. * feat(settings): fix stuck-master bug, master in dev schema, cascade-clear, drop connection panel, sync backup_hint+verify_ssl (#1164) Six fixes against the Server Settings + addon Configuration surfaces: 1. ``maybe_auto_enable_beta_master`` now requires ``config.get(key) is True`` instead of ``key in config``. The bare presence check fired the moment HA Supervisor merged the dev addon's schema defaults into options.json, locking the master "on" with origin=env on every fresh dev install even when all 5 sub-flags were False. New unit tests cover the truthy / all-false / one-of-many / non-bool-truthy permutations so a future regression on the semantic fails loudly. 2. Master ``enable_beta_features`` moved into the dev addon Configuration tab (schema + options + translations + DOCS.md). Defaults ON in dev — beta tools are the channel's purpose, so a fresh install lights them up without the user round-tripping to the web UI. Stable's schema is unchanged; the standalone web UI master path remains the gate there. start.py writes ENABLE_BETA_FEATURES from options.json only when the key is present; ``get_feature_flag_origin`` now treats the master like the sub-flags (env-var presence in addon mode → origin=addon). ``maybe_auto_enable_beta_master`` is kept as a one-cycle legacy bridge for installs whose options.json pre-dates the master key. 3. Beta sub-flags also default ON in the dev addon options block. 4. Master-off cascade clear: flipping ``enable_beta_features=False`` in ``_save_feature_flags`` now also writes False for every truthy beta sub-flag in the same save. Without this, sub-flags stayed True in the override file and resumed the moment the master was flipped back on — UX bug the user reported as "having to turn off every toggle individually." JS mirrors the cascade so sub-rows visually de-toggle on master-off without a page reload. 5. "Connection (display only)" section removed from the Server Settings panel. The read-only HOMEASSISTANT_URL / TOKEN / SUPERVISOR_TOKEN fields just wasted space — operators already see them in addon logs and configuration. Registry entries kept in ADVANCED_SETTINGS_FIELDS so the API still returns them (env-pin debugging, future surfaces). ``verify_ssl`` moved from ``connection`` to ``operations`` so it still renders in the panel. 6. ``backup_hint`` and ``verify_ssl`` now sync between addon Configuration and the web UI like feature flags do. New ``ADDON_SYNCED_ADVANCED_FIELDS`` set drives the origin helper (returns ``'addon'`` in addon mode for these) and the save handler (routes their writes through Supervisor ``/addons/self/options`` instead of the override file). Cross-surface gate visibility note appended to every beta sub-flag description in ``translations/en.yaml`` (and a top-of-options note on the master) so addon Configuration users know the web UI master gates everything. Locked-banner addon-mode copy from the earlier commit was already in place; tests in this commit re-target fixtures from the removed connection section to ``search``. * fix(addon-dev): beta sub-flags default OFF — only the master defaults ON (#1164) Mis-read of the user's intent in the previous commit. The intended shape for the dev addon's fresh-install defaults is: enable_beta_features: true ← gate unlocked enable_yaml_config_editing: false ← user opts in enable_filesystem_tools: false ← user opts in enable_custom_component_integration: false ← user opts in enable_code_mode: false ← user opts in enable_lite_docstrings: false ← user opts in The previous commit defaulted every sub-flag to true alongside the master, which would have shipped every beta tool live on a fresh dev install — including filesystem writes and the YAML config editor. Each sub-flag mutates the user's HA system, so they remain opt-in even on the dev channel; the master being on just means the gate is open. * revert(scope): drop scripts/fork-dev/ — maintainer tooling, wrong repo (#1164) Pushed these in 69b7a235 as part of task #17. They're personal-fork test tooling for the fork-dev addon workflow — they have no business on master. Removing from the PR; they're still in this branch's git history if anyone needs to fish them back out. * fix(addon+ui): #1431 review pass — restore MCP_HOST, gate sub-flag env writes, sane save (#1164) Round-2 review pass addressed 14 verified findings: **Bugs**: - **MCP_HOST regression** restored. The PR's earlier merge of upstream/master dropped the `bind_host = os.getenv("MCP_HOST", "0.0.0.0")` block introduced by #1434/#1436. `mcp.run(host=...)` now goes through `bind_host` again. - **Beta sub-flag env vars** are now written only when the matching key is present in `/data/options.json`. start.py was writing ENABLE_YAML_CONFIG_EDITING=false (etc.) unconditionally on stable addon, marking those fields origin='addon' in the web UI; the user's save then POSTed to Supervisor which rejected because the keys are not in stable's schema. - **Env-pinned tool save 409**: `_save_tools` now accepts re-sends whose state matches the env-pinned value, rejecting only true mismatches. The JS `saveConfig` POSTs the entire `toolStates` map including env-pinned rows; without this fix every save with `DISABLED_TOOLS` / `PINNED_TOOLS` non-empty would 409. - **Cascade-clear** now reads the persisted override file directly via `_read_feature_flag_override_file()` instead of `get_global_settings()` (whose master gate had already forced sub-flags to False, hiding stale-true overrides). Also force-False sub-flags that are explicitly True in the same payload as master=false, so `{master:false, sub:true}` no longer lands an inconsistent persisted state. - **Master beta-gate check** is now applied uniformly (no more "skip in addon mode" carve-out). The skip existed because the legacy auto-enable wrote ENABLE_BETA_FEATURES from sub-flag presence; now start.py writes the master from its own options key, so the gate is sound to apply in both modes. - **Dev-upgrade silent-disable warning**: start.py logs when master=false but a sub-flag is true in options.json, so an operator who toggled the master off in Configuration after a pre-#1164 dev install sees why their previously-enabled beta tools went away. - **Mixed-batch advanced save** is now split client-side. The server-side guard that returned 500 stays as a defense, but the UI no longer triggers it — `saveAdvancedSettings` partitions `_advancedDirty` into addon-routed and file-routed batches. **Logging / defensive code**: - Supervisor failure in `_save_advanced_settings` now logs before returning, matching the sibling `_save_feature_flags` / `_save_backup_config` handlers. - The three `assert sup_err is not None` sites that would crash under `python -O` now explicitly return INTERNAL_ERROR (covers the addon-route paths in feature flags and advanced settings). - `_apply_advanced_overrides` narrows `except Exception` to `(ValueError, TypeError)`, matching the parallel `_apply_feature_flag_overrides` exception shape. - `maybe_auto_enable_beta_master` now logs which sub-flag(s) triggered the legacy bridge when it fires, with a removal- candidate note in the docstring. **Docs / UX**: - Docstring drift fixed in `get_feature_flag_origin`, `_get_advanced_settings`, `_save_advanced_settings`. - `envLockedNoteHtml` master copy rewritten — origin='env' on the master is now only the legacy-bridge path, not the default dev-addon path. - "Bottom" save button now actually at the bottom of `panel-server` (below the beta block and its code-mode sub-numerics). Second two-step save note duplicated near the bottom row so users editing dangerous beta toggles also see it. Findings deferred to follow-up (legit but bigger than this pass): - A.2 concurrent-save read-modify-write race (needs an `asyncio.Lock` around the override-file path; functionality is safe today because the runtime gate hides the persisted-state inconsistency). - A.4 master-flip via addon Configuration tab → no cascade (the runtime gate + new log warning cover the observable surface). - F.* missing tests for new behaviours — adding in a follow-up commit. * test(settings): cover #1431 review pass fixes (#1164) - ``test_env_pinned_noop_resend_does_not_409`` — JS saveConfig POSTs the entire toolStates map; env-pinned no-op resend must be accepted. - ``test_env_pinned_value_mismatch_still_409s`` — pin true flips still rejected. - ``test_save_features_cascade_clears_subflag_even_when_payload_says_true`` — in-payload {master:false, sub:true} → 409 instead of inconsistent persisted state. - ``test_save_features_cascade_reads_override_file_not_post_gate_settings`` — cascade reads the file directly so it catches stale-true sub-flag overrides hidden by the master gate on the resolved Settings. - ``test_stable_addon_does_not_declare_enable_beta_features`` and ``test_dev_addon_declares_enable_beta_features_master_in_schema`` — lock the schema asymmetry that makes the dev/stable channel distinction work. - ``test_dev_addon_defaults_every_beta_subflag_to_false`` — sub-flags remain opt-in even on dev. * fix(settings): serialise override-file RMW + cover addon-synced advanced save (#1164) A.2 (concurrent-save race): both ``_save_feature_flags`` and ``_save_advanced_settings`` touch the same ``feature_flags.json`` override file. Two near-simultaneous requests could interleave their read/merge/write and clobber each other's persisted state. The runtime master gate hid the inconsistency for beta sub-flags, but other field combinations (advanced + feature-flag in the same window) would have lost state silently. Wrap the RMW window in an ``asyncio.Lock`` (lazy-initialised under the live event loop so module import doesn't bind to no loop). Both handlers acquire the same lock, so saves serialise correctly. F.35 + F.40 (test coverage): - ``test_save_advanced_addon_synced_routes_through_supervisor`` — ``backup_hint`` / ``verify_ssl`` saves in addon mode call ``_supervisor_merge_and_post_options``, return ``mode='addon'``, do NOT write the override file. - ``test_save_advanced_addon_synced_supervisor_4xx_surfaces_validation_failed`` — Supervisor schema rejection surfaces as ``CONFIG_VALIDATION_FAILED`` with the Supervisor status code preserved, not a generic 502. - ``test_origin_for_addon_synced_field_is_addon_in_addon_mode`` — pins the origin matrix: ``backup_hint`` / ``verify_ssl`` come back ``origin='addon, editable'`` in addon mode; non-synced env-pinned fields stay ``origin='env, locked'``. * fix(settings): A.8 preserve sub-flag visual + cover remaining deferred test gaps (#1164) A.8 — JS master-off no longer flips sub-flag checkboxes visually. Previously the cascade-clear set ``_lastFeatureFlags[sub].value = false`` on master-off so the re-render painted sub-rows unchecked. That fought the user's mental model — they expressed intent on individual sub-flags, master-off shouldn't visually wipe that context. Now sub-rows stay checked + dimmed + disabled after the master flips off; the server-side cascade still clears the values on disk, so a refresh shows the cleared state, and a failed save leaves the visible checked state matching the actual on-disk state. F.37 — ``test_master_off_click_dims_subrow_live_without_clobbering_value`` dispatches a real change event on the master input and asserts the sub-row goes dimmed + disabled but keeps its checked attribute. F.38 — three cross-mode origin permutations for the master: - dev-addon env-set → 'addon' - standalone env-set → 'env' - addon mode + file override (no env) → 'file' F.42 — ``test_dual_save_buttons_mirror_disabled_and_status_state`` probes both ``advSaveStatus`` / ``advSaveStatusTop`` text and both buttons' disabled state via a hidden probe div, asserts the mirror holds at save completion. * fix(tests): probe JSDOM properties via probe div; assert mid-save mirror (#1164) Two test bugs in 25b45ab7's new assertions: 1. ``test_master_off_click_dims_subrow_live_without_clobbering_value`` asserted on the literal string ``"checked"`` in the serialised DOM. ``input.checked`` is a DOM property (not an HTML attribute), so JSDOM's serialiser doesn't emit it. The .checked state IS true at the property level — just invisible to the regex. Probe via a hidden ``__sub_state_probe`` div that reads the .checked / .disabled properties and writes them to data-* attributes. 2. ``test_dual_save_buttons_mirror_disabled_and_status_state`` probed AFTER the full save+reload chain, by which point loadAdvancedSettings() had blanked both status text els. Restructure the test to probe SYNCHRONOUSLY after ``click()``, while saveAdvancedSettings is mid-flight (status="Saving…", both buttons disabled). That's the actual mirror invariant we wanted to lock — the helpers ``_setAdvSaveDisabled(true)`` and ``_setAdvSaveStatus('Saving…')`` run synchronously before the first await. * feat(settings): drop sub-flag cascade-clear; restore advanced_debug_logging translation; clearer Save-button copy (#1164) Three user-reported fixes from stable-addon testing: 1. Stable add-on's ``translations/en.yaml`` was missing the ``advanced_debug_logging`` description — the schema declares the toggle in ``config.yaml:49`` but no translation ever shipped, so the addon Configuration UI showed an unlabelled checkbox. Add the missing entry (same wording as dev's translation). 2. Big "Save advanced settings" button used to say "Nothing to save." after the user toggled a feature flag (beta master, Tool Search, etc.). Feature-flag toggles auto-save on click via ``saveFeatureFlag`` — they never enter ``_advancedDirty``, so the advanced-save button sees nothing to do. When the restart banner is already showing (recent feature-flag save), surface that explicitly: "No advanced changes to save — your feature-flag toggles already saved on click. Click Restart above to apply them." Falls back to the original "Nothing to save." copy when there's no pending restart. 3. Drop the master-off cascade-clear behaviour entirely. The runtime master gate in ``_apply_feature_flag_overrides`` already forces every beta sub-flag to False whenever the master is off, so the tools stay disabled at runtime regardless of file state. Leaving the sub-flag values in the override file means toggling the master off → on restores the user's prior sub-flag selections automatically; the previous cascade-clear forced users to re-check each sub-flag after every master cycle, which was the wrong UX trade for an opt-in beta surface. The master-gate check is unchanged — it still rejects payloads that try to enable a sub-flag while the effective master is off, so the "sub true while master false in same payload" inconsistency still can't land. The cascade-clear was a separate (now-removed) defence. Tests updated: - ``test_save_features_master_off_preserves_subflag_values`` — was ``test_save_features_cascade_clears_subflags_when_master_off``; asserts the new "preserve" semantics. - ``test_save_features_master_on_restores_runtime_subflag_values`` — new round-trip test for master off → on restoring sub-flags. - ``test_save_features_payload_master_false_sub_true_rejected_by_gate`` — renamed; asserts gate rejection still covers the inconsistent payload. - ``test_save_features_cascade_reads_override_file_not_post_gate_settings`` — deleted (no cascade, no reason to test cascade's read path). - ``test_save_features_master_off_applied_dict_contains_only_master`` — new no-cascade pin; ``applied`` carries only the master flip. * fix(settings): #1431 review pass — address 13 verified findings (#1164) Round-3 review pass landed 13 verified findings; the cosmetic "persisted-value visible in UI after F5" item was explicitly skipped per user direction (UI shows post-gate value when master off; user accepts this since beta tools are actually disabled at runtime — the preserve-across-master-cycle UX is at the data layer, not the visual). Source fixes: - **Save button copy is now source-blind** — the previous "your feature-flag toggles already saved on click" claim only held when ``saveFeatureFlag`` raised ``restartNotice``; tool-config pin saves, backup-config saves, and cross-tab ``restart-required`` broadcasts also raise it. New copy: "a restart is pending. Click Restart above to apply your prior changes." (#2) - **Stale F.37 test docstring** describing the deleted server-side cascade rewritten. (#3) - **Beta-gate INFO log noise** — cascade-clear removal meant the gate could fire its "forcing %s=False" line every Settings rebuild, spamming addon logs once a user had truthy sub-flags persisted. Dedup via ``_BETA_GATE_LOGGED`` set per process, cleared on ``_reset_global_settings``. (#9) - **Lazy-lock docstring** updated to reflect Python 3.13 semantics (``asyncio.Lock()`` no longer takes a loop arg; the lazy pattern still serves test fixtures and single-loop deployment, with the invariant documented). (#10) - **Addon-mode carve-out comment** clarified to distinguish dev (master in schema) from stable (master web-UI-only). (#14) - **probe-div null branch** in F.37 now writes ``data-error`` so a failing test points at "selector missed" vs "value flipped" unambiguously. (#17) Tests added: - ``test_translations_cover_every_schema_key`` — parity check that every ``schema:`` key has a non-empty translation ``name`` and ``description``. Parameterised across stable + dev addons. Pins the class of silent gap that this PR's ``advanced_debug_logging`` fix addressed. (#4) - ``test_save_features_acquires_override_file_lock`` + ``test_save_advanced_acquires_override_file_lock`` — counting-lock wrapper asserts ``async with _get_override_file_lock()`` runs exactly once in each file-mode write path. Pin against a regression that silently bypasses concurrent-save serialisation. (#5) - ``test_dual_save_buttons_mirror_disabled_and_status_on_post_failure`` — exercises the 500-response branch of the dual-save mirror so a regression that broke ``_setAdvSaveStatus``/``_setAdvSaveDisabled`` for error paths only would still fail. (#6) - ``test_save_features_master_on_restores_subflag_values_in_addon_mode`` — addon-mode round-trip mirror of the existing standalone restore test; asserts the Supervisor merge-and-post call carries only the master flip-on and never zeroes out sub-flag values. (#7) - ``test_save_button_nothing_to_save_when_no_dirty_and_no_restart`` + ``test_save_button_restart_pending_hint_when_dirty_empty_but_restart_showing`` — both branches of the empty-dirty Save click are exercised; the restart-pending branch asserts the copy is source-blind. (#8) Deferred per user direction: - #1 (file-vs-Settings visual after F5): user accepts the current behavior (UI shows post-gate value; runtime tools actually disabled when master off; data-layer preserve still works end-to-end). - #11/#12/#13 (code-simplifier helper extractions): skipped as complicated to implement without behavior risk. - #15 (pre-#1164 users with already-cleared sub-flags): release-note concern, not a code change. - #16 (lock fragility under future thread-pool dispatch): speculative future-risk; not actionable today. * feat(settings): version footer + Patch76 review feedback Adds the running ha-mcp version to the settings UI footer (issue #1466). ``info.version`` flows from ``/api/settings/info`` → ``HA_MCP_BUILD_VERSION`` env var on addon builds (set by both stable and dev Dockerfiles) → package metadata fallback. Empty on older deployments without the field. Addresses Patch76's review (no blockers, all bundleable): - ``OverrideField`` folds the structurally-identical ``FeatureFlagField`` and ``BackupOverrideField`` into one NamedTuple; aliases preserve readable construction sites. ``AdvancedField`` keeps its own type since it c…
Summary
Implements #766 — converts 4 documentation-only tools into MCP resources and skill reference files, removing them from
tools/list.ha_get_dashboard_guideskill://home-assistant-best-practices/references/dashboard-guide.mdha_get_card_documentation(list mode)skill://home-assistant-best-practices/references/dashboard-cards.mdha_get_card_documentation(fetch mode)ha://docs/cards/{card_type}ha_get_domain_docsha://docs/domains/{domain}ha_config_infohome-assistant-best-practicesSKILL.mdserver.py— fetch official HA docs from GitHub on demandENABLE_SKILLS_AS_TOOLSdefaults totrueso clients without resource support retain documentation access. Server instructions tell resource-capable clients they can set it tofalse.dashboard_guide.md,card_types.json,tools_config_info.pytools/listTest plan
TestDashboardDocumentationToolsremoved (tested removed tools)ha://docs/cards/lightandha://docs/domains/lightresources return contentha_get_skill_home_assistant_best_practiceslists new reference files (dashboard-guide.md, dashboard-cards.md, domain-docs.md)ENABLE_SKILLS_AS_TOOLS=trueexposes resources as tools via ResourcesAsTools transformCloses homeassistant-ai#766
🤖 Generated with Claude Code