Skip to content

feat: convert doc tools to MCP resources and skill references#12

Open
kingpanther13 wants to merge 32 commits into
masterfrom
feat/docs-tools-to-resources
Open

feat: convert doc tools to MCP resources and skill references#12
kingpanther13 wants to merge 32 commits into
masterfrom
feat/docs-tools-to-resources

Conversation

@kingpanther13

Copy link
Copy Markdown
Owner

Summary

Implements #766 — converts 4 documentation-only tools into MCP resources and skill reference files, removing them from tools/list.

Removed tool Replacement
ha_get_dashboard_guide Skill reference: skill://home-assistant-best-practices/references/dashboard-guide.md
ha_get_card_documentation (list mode) Skill reference: skill://home-assistant-best-practices/references/dashboard-cards.md
ha_get_card_documentation (fetch mode) Resource template: ha://docs/cards/{card_type}
ha_get_domain_docs Resource template: ha://docs/domains/{domain}
ha_config_info Skill anti-pattern rows in home-assistant-best-practices SKILL.md
  • Resource templates registered in server.py — fetch official HA docs from GitHub on demand
  • Skills submodule updated to v1.2.0 (companion: skills#25)
  • ENABLE_SKILLS_AS_TOOLS defaults to true so clients without resource support retain documentation access. Server instructions tell resource-capable clients they can set it to false.
  • Cross-references updated across 10+ files: tool descriptions, error suggestions, docs, AGENTS.md, tests
  • Static files deleted: dashboard_guide.md, card_types.json, tools_config_info.py
  • Net -1086 lines, 4 fewer tools on tools/list

Test plan

  • All modified Python files compile
  • No remaining references to removed tool names (except explanatory comment in server.py)
  • E2E test class TestDashboardDocumentationTools removed (tested removed tools)
  • UAT stories updated to reference resources instead of removed tools
  • E2E test with real HA instance: verify ha://docs/cards/light and ha://docs/domains/light resources return content
  • Verify ha_get_skill_home_assistant_best_practices lists new reference files (dashboard-guide.md, dashboard-cards.md, domain-docs.md)
  • Verify ENABLE_SKILLS_AS_TOOLS=true exposes resources as tools via ResourcesAsTools transform

Closes homeassistant-ai#766

🤖 Generated with Claude Code

Remove ha_get_dashboard_guide, ha_get_card_documentation,
ha_get_domain_docs, and ha_config_info from tools/list. Static
content moves to skill reference files (skills repo v1.2.0),
GitHub-fetching capabilities become ha://docs/ resource templates.

- Add ha://docs/cards/{card_type} and ha://docs/domains/{domain}
  resource URI templates in server.py
- Update skills submodule to include dashboard, card, domain, and
  config reference files
- Delete tools_config_info.py, dashboard_guide.md, card_types.json
- Default ENABLE_SKILLS_AS_TOOLS to true so clients without resource
  support retain access to documentation
- Add instructions note for resource-capable clients to disable
  skills-as-tools
- Update all cross-references in tool descriptions, error suggestions,
  docs, and tests

Closes homeassistant-ai#766

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@gemini-code-assist

Copy link
Copy Markdown

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the way documentation and skills are integrated into the Home Assistant MCP server, moving towards a more standardized and robust resource-based model. It streamlines access to Home Assistant documentation, enhances the stability of OAuth authentication through state persistence, and improves the security of Python sandbox expressions. These changes aim to provide a more consistent and reliable experience for agents interacting with the Home Assistant ecosystem.

Highlights

  • Documentation Access Refactor: Converted four documentation-only tools (ha_get_dashboard_guide, ha_get_card_documentation, ha_get_domain_docs, ha_config_info) into MCP resources and skill references, removing them from the direct tool list. Documentation is now primarily accessed via ha://docs/cards/{card_type} and ha://docs/domains/{domain} resource templates, or specific skill references.
  • OAuth State Persistence: Implemented persistence for OAuth state (clients, refresh tokens, and their mappings) to disk, ensuring that active sessions can survive server restarts. This includes atomic writes and graceful handling of missing or corrupt state files.
  • Enhanced Python Sandbox Validation: Strengthened the Python sandbox expression validation to explicitly block calls on subscript results (e.g., config['fn']()) and chained calls (e.g., config.get('fn')()), improving security and predictability.
  • Improved Custom Component Error Handling: Refactored error handling for missing custom components (like HACS or ha_mcp_tools), now raising a ToolError with a specific COMPONENT_NOT_INSTALLED error code and helpful suggestions, rather than returning a generic error dictionary.
  • Default ENABLE_SKILLS_AS_TOOLS to True: Changed the default value of the ENABLE_SKILLS_AS_TOOLS setting from false to true to ensure broader access to documentation and skills for clients that do not natively support MCP resources.
  • New Browser Landing Page: Added a new browser landing page that provides a human-readable message and setup instructions when a GET request is made to the MCP endpoint, instead of a bare 'Method Not Allowed' error.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Ignored Files
  • Ignored by pattern: .github/workflows/** (4)
    • .github/workflows/_update-addon-config.yml
    • .github/workflows/pr.yml
    • .github/workflows/renovate.yml
    • .github/workflows/semver-release.yml
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This is an excellent pull request that significantly improves the project's architecture by converting documentation tools into MCP resources and skills, aligning with modern MCP best practices. The removal of several tools and static files cleans up the codebase nicely.

Beyond the main feature, I'm impressed with the other high-quality changes included:

  • The robust implementation of OAuth state persistence is a major enhancement for server reliability.
  • The security hardening in the Python sandbox to prevent chained calls is a critical improvement.
  • The refactoring of the HACS and filesystem tool availability checks to correctly handle the API response format and improve error handling is a great fix.

The test suite has been thoroughly updated to cover all new features and fixes, which is fantastic to see. I've only found a couple of minor points for discussion. Great work!

Comment thread src/ha_mcp/utils/python_sandbox.py
Comment thread CHANGELOG.md
Comment thread homeassistant-addon/CHANGELOG.md
- Remove unused asyncio import from tools_utility.py (ruff F401)
- Rewrite test_resources.py to test skills-vendor reference files
  instead of deleted dashboard_guide.md and card_types.json

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@kingpanther13

Copy link
Copy Markdown
Owner Author

/gemini review

the original review you did was based on an erroneous PR with extra stuff in it. I've fixed it and now the actual PR is here; please repeat your review as if you never did the last one, I dismissed all of your previous items in case they weren't accurate.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request is a significant and well-executed refactoring that moves documentation-providing tools into MCP resources and skills. This aligns well with MCP best practices by reducing the number of tools and centralizing documentation. The changes are consistently applied across the codebase, including updates to documentation, error messages, and tests. However, I've found a critical issue in the new resource implementation in server.py that includes a path traversal vulnerability, insufficient error handling, and missing test coverage. Addressing this is important for security and robustness.

Comment thread src/ha_mcp/server.py Outdated
kingpanther13 and others added 2 commits March 21, 2026 15:03
- Add _sanitize_slug() to reject path separators (/, \, ..) in
  card_type and domain parameters
- Add explicit 404 handling with helpful error messages pointing
  to skill reference files for valid values

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The ha://docs/ resource templates introduced a new pattern (server-side
GitHub proxying) that diverges from the existing skills architecture
where all resources are local file reads. The skill reference files
already contain card/domain lists and GitHub URL patterns — the
existing skills framework handles this use case adequately.

- Remove _register_doc_resources() from server.py
- Update all cross-references to point to
  ha_get_skill_home_assistant_best_practices instead
- Remove ha://docs/ mentions from server instructions, AGENTS.md,
  README.md, DOCS.md, and UAT stories

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@kingpanther13

Copy link
Copy Markdown
Owner Author

/gemini review

I just did a significant refactor, review again

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request is a significant architectural improvement, refactoring several documentation-providing tools into MCP resources and skills. The changes are consistently applied throughout the codebase, including documentation, error messages, and tests. The removal of old tools and static files is clean. I've found one area for improvement in the unit tests to ensure packaging rules are more robustly verified.

Comment thread tests/src/unit/test_resources.py Outdated
kingpanther13 and others added 2 commits March 21, 2026 15:49
- Make test_pyproject_includes_resources check for the specific
  resources/skills-vendor/**/* pattern instead of generic resources/
- Remove resources/*.md and resources/*.json globs from pyproject.toml
  since dashboard_guide.md and card_types.json were deleted

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…assistant-ai#727)

* feat: Add CategorizedSearchTransform for search-based tool discovery

Introduces an opt-in search transform (ENABLE_TOOL_SEARCH=true) that
replaces the full 89+ tool catalog with a unified BM25 search tool
and three categorized call proxies (read/write/delete), dramatically
reducing idle context token usage for LLMs.

Key design:
- Unified search_tools searches ALL tools, results include annotations
- call_read_tool (readOnlyHint) — clients can auto-approve
- call_write_tool (destructiveHint) — requires confirmation
- call_delete_tool (destructiveHint) — requires confirmation
- Critical tools pinned via always_visible for individual permission
  gating: ha_restart, ha_reload_core, ha_backup_create/restore,
  ha_get_overview, ha_report_issue
- ResourcesAsTools (list_resources/read_resource) auto-pinned when
  enable_skills_as_tools is enabled
- Server instructions updated with tool discovery workflow
- ha_get_overview includes tool discovery hint when active

Subclasses FastMCP's BM25SearchTransform (~200 lines), leveraging
native catalog bypass, auth filtering, and search infrastructure.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: resolve CI failures, bugs, and review feedback for search transform

- Fix Ruff RUF012: annotate _PINNED_TOOLS with ClassVar[list[str]]
- Fix runtime crash: move Context import out of TYPE_CHECKING (Pydantic
  needs it at runtime for Tool.from_function type resolution)
- Fix critical bug: override get_tool() so categorized proxy tools
  (ha_call_read_tool, ha_call_write_tool, ha_call_delete_tool) are
  resolvable when the LLM calls them
- Fix _categorize_tool: require destructiveHint AND name pattern for
  delete category (was matching on name pattern alone)
- Fix search tool missing readOnlyHint annotation
- Rename all tool references to ha_* prefix (ha_search_tools,
  ha_call_read_tool, etc.) per naming convention
- Add DEFAULT_PINNED_TOOLS constant; use in tools_search.py (DRY)
- Add lazy cache rebuild with _cache_built flag
- Add enable_tool_search toggle to dev add-on config UI
- Add 20 unit tests for CategorizedSearchTransform

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: enrich ha_get_state description for BM25 search discoverability

Add domain keywords (lights, switches, sensors, etc.) and natural
language terms (status, current) so BM25 can match common queries
like "lights status" to this tool.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address Gemini review round 2 — structured errors, DRY constants

- Replace ValueError with ToolError + create_error_response per style
  guide structured error requirements
- Deduplicate _PINNED_TOOLS: use DEFAULT_PINNED_TOOLS as single source
- Extract proxy descriptions to module constants shared between
  transform_tools() and get_tool() for consistency

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: correct ha_get_overview docstring that said "standard (default)"

The parameter default is "minimal" but the docstring said
"Use 'standard' (default)" — causing LLMs to always pick standard
and blow out context with the full entity list.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: improve proxy guidance, sequential calls, and double-wrap guard

- Enrich ha_search_tools description with full bootstrap workflow,
  calling example, anti-nesting warning, and sequential call guidance
- Add param structure and sequential call hints to all proxy descriptions
- Add server-side double-wrap unwrapping in categorized_call() so
  LLMs that accidentally nest name/arguments still succeed
- Update ha_get_overview tool_discovery hint with param and sequential info
- Append skills hint to search tool description when skills_as_tools is on
- Auto-enable enable_skills when enable_skills_as_tools is set (dependency)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: soften resource claim in instructions, tighten double-wrap guard

- Replace hard "your client supports resources" claim with soft
  "should be able to" and explicit fallback path to tools
- Add catalog membership check to double-wrap detection so it only
  unwraps when the inner name is a known tool in the catalog
- Auto-enable enable_skills when enable_skills_as_tools is set

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: pre-pin skill guidance tools for forward compatibility with homeassistant-ai#732

When skills-as-tools is enabled, pin any skill guidance tool names
stored in _skill_tool_names (registered by homeassistant-ai#732) so they remain
visible in list_tools() alongside other pinned tools. Uses getattr
with empty default so this is a no-op until homeassistant-ai#732 merges.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add SearchKeywordsTransform to improve BM25 ranking for common queries

Adds a lightweight FastMCP Transform that appends extra keywords to
specific tool descriptions so BM25 ranks them higher for natural
language queries.  Only active when enable_tool_search is on — original
tool docstrings are unchanged.

Fixes three BM25 ranking issues identified during search quality testing:
- "find entities" now ranks ha_search_entities above ha_deep_search
- "get/read automation" now ranks ha_config_get_automation above set
- "create helper" now ranks ha_config_set_helper above remove_helper

To adjust rankings for additional tools, add an entry to the
_SEARCH_KEYWORDS dict in server.py — no other files need changing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: remove unnecessary string quotes from type annotations (ruff UP037)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: make direct tool calls explicit and demote ha_deep_search in BM25

Update server instructions and search tool description to clarify that
discovered tools can be called directly by name (preferred) rather than
requiring proxy usage. Extend SearchKeywordsTransform with description
override support to narrow ha_deep_search's BM25 footprint, and boost
8 additional competing tools to outrank it for common queries.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: reduce search transform max_results from 10 to 5

Matches FastMCP's BM25SearchTransform default of 5 to reduce
per-search context cost.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: pin ha_search_entities as always-visible tool

Entity search is a high-frequency tool that shouldn't require
a tool search call to discover.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: pin ha_config_get_automation and ha_config_set_automation

High-frequency automation tools shouldn't require a search call
to discover.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: kingpanther13 <kingpanther13@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
sergeykad and others added 17 commits March 22, 2026 07:40
homeassistant-ai#802)

* chore: migrate from pre-commit to lefthook for parallel hook execution

Replace pre-commit with lefthook to enable parallel execution of mypy and
unit tests during git commits, reducing wall-clock time from ~40s to ~20s.

Hook execution order:
- Group 1 (parallel): ruff --fix + uv lock
- Group 2 (parallel): mypy + unit tests

lefthook is a compiled Go binary that uses goroutines for concurrency,
avoiding the bash job control limitations in sandboxed environments.

- Add lefthook.yml with two-stage parallel hook config
- Replace pre-commit>=4.0.0 with lefthook>=1.10.0 in dev deps
- Remove .pre-commit-config.yaml
- Update CONTRIBUTING.md with new install command and migration guide
- Update .dockerignore

* fix: run both mypy invocations regardless of first exit code
Take upstream's expanded resource access instructions for
the ENABLE_SKILLS_AS_TOOLS code path in server.py.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Address Gemini review: the prompt asks to look up card type
documentation, which now routes through the skills framework.
Add ha_get_skill_home_assistant_best_practices and read_resource
to the expected tools list.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The merge with upstream/master reset the submodule pointer to 308c4d4
which predates the reference files added in skills#25. Restore to
d0a72ea which includes dashboard-guide.md, dashboard-cards.md, and
domain-docs.md needed by the tests and this PR's functionality.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…omeassistant-ai#807)

Bumps the github-actions group with 1 update: [renovatebot/github-action](https://github.com/renovatebot/github-action).


Updates `renovatebot/github-action` from 46.1.5 to 46.1.6
- [Release notes](https://github.com/renovatebot/github-action/releases)
- [Changelog](https://github.com/renovatebot/github-action/blob/main/CHANGELOG.md)
- [Commits](renovatebot/github-action@v46.1.5...v46.1.6)

---
updated-dependencies:
- dependency-name: renovatebot/github-action
  dependency-version: 46.1.6
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: github-actions
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…ssistant-ai#808)

Bumps the dev-dependencies group with 2 updates: [pytest-cov](https://github.com/pytest-dev/pytest-cov) and [ruff](https://github.com/astral-sh/ruff).


Updates `pytest-cov` from 7.0.0 to 7.1.0
- [Changelog](https://github.com/pytest-dev/pytest-cov/blob/master/CHANGELOG.rst)
- [Commits](pytest-dev/pytest-cov@v7.0.0...v7.1.0)

Updates `ruff` from 0.15.6 to 0.15.7
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](astral-sh/ruff@0.15.6...0.15.7)

---
updated-dependencies:
- dependency-name: pytest-cov
  dependency-version: 7.1.0
  dependency-type: direct:development
  update-type: version-update:semver-minor
  dependency-group: dev-dependencies
- dependency-name: ruff
  dependency-version: 0.15.7
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: dev-dependencies
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [openai](https://github.com/openai/openai-python) from 2.28.0 to 2.29.0.
- [Release notes](https://github.com/openai/openai-python/releases)
- [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md)
- [Commits](openai/openai-python@v2.28.0...v2.29.0)

---
updated-dependencies:
- dependency-name: openai
  dependency-version: 2.29.0
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…ant-ai#810)

Bumps [testcontainers](https://github.com/testcontainers/testcontainers-python) from 4.14.1 to 4.14.2.
- [Release notes](https://github.com/testcontainers/testcontainers-python/releases)
- [Changelog](https://github.com/testcontainers/testcontainers-python/blob/main/CHANGELOG.md)
- [Commits](testcontainers/testcontainers-python@testcontainers-v4.14.1...testcontainers-v4.14.2)

---
updated-dependencies:
- dependency-name: testcontainers
  dependency-version: 4.14.2
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* feat: upgrade to Python 3.14 for Home Assistant 2026.3 compatibility

Home Assistant 2026.3 ships with Python 3.14. Update all Python version
references across the project:

- pyproject.toml: requires-python, classifier, ruff target-version
- .python-version: 3.13 → 3.14
- Dockerfiles (3): builder and runtime base images
- CI workflows (8): PYTHON_VERSION env vars, python-version inputs,
  container image tags
- uv.lock: requires-python marker
- Issue template: placeholder version

Docker image digests removed (Renovate will pin them).
uv.lock header updated; full regeneration via `uv lock` recommended
in CI before merge.

Closes homeassistant-ai#699

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add build deps for Python 3.14 and restore image digest pins

- Dockerfiles: add gcc + libffi-dev to builder stage for cffi
  compilation (no prebuilt 3.14 wheels yet), restore SHA256 digest
  pins for both builder and runtime images
- CI (pr.yml): switch lint and unit-tests from slim container to
  runner-based setup (matching e2e-validation pattern) — ubuntu
  runners have gcc/rust needed for source builds

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: remove quoted type annotations for Python 3.14 (UP037)

Python 3.14 implements PEP 649 (deferred evaluation of annotations),
making quoted forward references unnecessary. Ruff UP037 now flags these
when target-version is py314.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add libc-dev to Docker builder stage for cffi compilation

gcc needs C standard library headers (assert.h, etc.) to compile cffi.
The --no-install-recommends flag skipped libc-dev which is only
"recommended" by gcc, not a hard dependency.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add make and autoconf for jq source build in Docker

The jq Python package builds jq 1.8.1 from source on Python 3.14
(no prebuilt wheels) and requires make/autoconf for its configure step.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: add temp workflow to regenerate uv.lock for Python 3.14

The lockfile was generated for Python 3.13 and only contains cp313
wheel entries. Python 3.14 can't use those, causing Docker builds to
fall back to source compilation. This workflow regenerates the lockfile
with cp314 wheel entries.

Also reverts unnecessary build-tool additions to Dockerfiles — the
proper fix is the lockfile, not adding gcc/Rust to the builder.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: update uv.lock wheel entries from cp313 to cp314

The lockfile was generated for Python 3.13 and only contained cp313
wheel entries. When uv sync --locked runs on Python 3.14, it can't
use cp313 wheels and falls back to building from source (sdist),
which requires gcc, Rust, etc.

Updated 203 wheel entries to cp314 using PyPI JSON API. Remaining 12
cp313 entries are macOS/Windows/i686 platforms where the filename
pattern differs — these don't affect linux x86_64 Docker builds.

Also removes the temporary lockfile regeneration workflow.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: update Python version reference in documentation issue template

The documentation.yml issue template still referenced Python 3.13+ in
placeholder examples. Updated to 3.14+ to match the rest of the PR.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: kingpanther13 <kingpanther13@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…g empty data (homeassistant-ai#812)

When get_states() failed, asyncio.gather with return_exceptions=True silently
swallowed the exception, causing ha_get_overview to return success=true with
total_entities=0 — indistinguishable from an empty HA instance.

Changes:
- Raise entities fetch exception immediately so the outer error handler converts
  it to a proper ToolError with actionable suggestions
- Log services fetch failures at warning level and include partial=true +
  warnings[] in the response so callers can distinguish a real zero from a
  fetch failure
- Move inline `import random` to module level

Fixes homeassistant-ai#811
…ssistant-ai#816)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
…ate annotations (homeassistant-ai#817)

The CI container images in pr.yml were stuck at uv 0.9.30 while
Dockerfiles are at 0.11.0, because Renovate had no annotations to
detect them. Added `# renovate: datasource=docker depName=...`
comments so future uv bumps are applied automatically.

Co-authored-by: kingpanther13 <kingpanther13@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
PR homeassistant-ai#700 removed the jq dependency from upstream. The previous merge
resolution incorrectly kept jq imports and jq test classes that no
longer exist in upstream/master.

- Remove jq try/import block and _JQ_UNAVAILABLE_ERROR from
  tools_config_dashboards.py
- Replace TestJqTransformAndFindCard with TestFindCard (drop all
  jq_transform tests, keep find_card tests)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…sistant-ai#805)

* docs: clarify Claude.ai "Couldn't reach MCP server" is normal, annotate stateless session logs

Claude.ai shows a "Couldn't reach the MCP server" error during its initial
connection handshake even though the server connects successfully afterward.
This causes confusion for users who believe they cannot connect. Additionally,
the MCP SDK logs "Terminating session: None" on every request in stateless
mode, which looks alarming but is routine cleanup.

- Add setup note in claude-ai.md explaining the error is expected
- Add FAQ entries for both the connection error and the session log message
- Add log filter to annotate "Terminating session: None" with
  "(Normal — stateless mode)" in both main and addon entry points

Closes homeassistant-ai#783

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: rename filter to public, add unit tests, fix ruff import order

- Rename _StatelessSessionLogFilter to StatelessSessionLogFilter since
  it is imported cross-module (addon start.py)
- Add unit tests for the log filter covering annotation, real sessions,
  other loggers, and unrelated messages
- Fixes ruff isort violation in start.py

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Add detailed Cloudflare AI crawler instructions to landing page and docs

Expand the browser landing page with step-by-step instructions for
disabling Cloudflare's "Block AI training bots" setting, the most
common connection issue for Cloudflare users. Add the same instructions
with screenshot to FAQ, cloudflared setup guide, and addon DOCS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Clarify landing page: tell user to copy URL from address bar

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Landing page: clarify to paste full URL with private key into MCP settings

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Add tip to ask Claude directly to verify MCP connection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Add Cloudflare AI crawler warning and Claude connection tips to setup wizard

The setup wizard is a single-page app in setup.astro — the markdown
content files don't render in the wizard UI. Add the AI crawler
blocking warning directly to the Cloudflare Tunnel instructions block,
and add connection verification steps to the Claude.ai config section.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Setup wizard: render UI client steps as instruction blocks, add verify tip

- Claude.ai, ChatGPT, Raycast, Open WebUI steps now render as
  formatted HTML instruction blocks instead of inside a code box
- Hide the config code box + copy button for UI-based clients
- Add "Verify Your Server" instruction block for all HTTP-based setups
  telling users to visit the URL in their browser to confirm it works

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Fix: render instructions after config generation so UI steps appear

The instructionsEl.innerHTML was being set before UI client steps were
pushed to the array. Move the render call to after config generation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Move Cloudflare screenshot to repo, replace all GitHub attachment URLs

Gemini flagged that user-uploaded GitHub assets can have URLs change
or be removed. Moved the screenshot to site/public/images/ and updated
all references to use the permanent GitHub Pages URL.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: kingpanther13 <kingpanther13@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nt-ai#818)

Bumps [smol-toml](https://github.com/squirrelchat/smol-toml) from 1.6.0 to 1.6.1.
- [Release notes](https://github.com/squirrelchat/smol-toml/releases)
- [Commits](squirrelchat/smol-toml@v1.6.0...v1.6.1)

---
updated-dependencies:
- dependency-name: smol-toml
  dependency-version: 1.6.1
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…#819)

Bumps [requests](https://github.com/psf/requests) from 2.32.5 to 2.33.0.
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](psf/requests@v2.32.5...v2.33.0)

---
updated-dependencies:
- dependency-name: requests
  dependency-version: 2.33.0
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
dependabot Bot and others added 9 commits March 25, 2026 23:34
…#820)

Bumps [yaml](https://github.com/eemeli/yaml) from 2.8.2 to 2.8.3.
- [Release notes](https://github.com/eemeli/yaml/releases)
- [Commits](eemeli/yaml@v2.8.2...v2.8.3)

---
updated-dependencies:
- dependency-name: yaml
  dependency-version: 2.8.3
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps  and [picomatch](https://github.com/micromatch/picomatch). These dependencies needed to be updated together.

Updates `picomatch` from 4.0.3 to 4.0.4
- [Release notes](https://github.com/micromatch/picomatch/releases)
- [Changelog](https://github.com/micromatch/picomatch/blob/master/CHANGELOG.md)
- [Commits](micromatch/picomatch@4.0.3...4.0.4)

Updates `picomatch` from 2.3.1 to 2.3.2
- [Release notes](https://github.com/micromatch/picomatch/releases)
- [Changelog](https://github.com/micromatch/picomatch/blob/master/CHANGELOG.md)
- [Commits](micromatch/picomatch@4.0.3...4.0.4)

---
updated-dependencies:
- dependency-name: picomatch
  dependency-version: 4.0.4
  dependency-type: indirect
- dependency-name: picomatch
  dependency-version: 2.3.2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…t-ai#826)

Bumps [astro](https://github.com/withastro/astro/tree/HEAD/packages/astro) from 5.16.11 to 5.18.1.
- [Release notes](https://github.com/withastro/astro/releases)
- [Changelog](https://github.com/withastro/astro/blob/astro@5.18.1/packages/astro/CHANGELOG.md)
- [Commits](https://github.com/withastro/astro/commits/astro@5.18.1/packages/astro)

---
updated-dependencies:
- dependency-name: astro
  dependency-version: 5.18.1
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Skills-as-tools replaces several removed documentation tools, so
the default is intentionally True to preserve documentation access
for all clients.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… deep search (homeassistant-ai#814)

* fix: add badge search, exact match mode, and dashboard search to deep search (homeassistant-ai#801)

- ha_dashboard_find_card now searches view-level badges (views[n].badges),
  catching entity references in badge chips that were previously missed
  during rename operations

- ha_deep_search gains exact_match parameter (default: True) that uses
  substring matching instead of fuzzy scoring, eliminating false positives
  when searching for known entity IDs

- ha_deep_search gains 'dashboard' search type that fetches and searches
  all storage-mode dashboard configurations for entity references

Closes homeassistant-ai#801

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: extract _score_deep_match to deduplicate exact_match scoring logic

Addresses Gemini Code Assist review — the scoring logic for exact_match
was duplicated across automations, scripts, and helpers. Extracted into
a single _score_deep_match helper method.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add exact_match parameter to ha_search_entities and ha_get_integration

Per maintainer discussion in homeassistant-ai#801, fuzzy matching should be disabled by
default across all search tools. This adds exact_match=True (default) to:

- ha_search_entities: routes directly to substring matching, skipping
  fuzzy search entirely. Set exact_match=False for typo-tolerant search.

- ha_get_integration: skips fuzzy scoring branch when filtering by query,
  keeping only exact substring matches. Set exact_match=False for fuzzy.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: update E2E tests for exact_match default in ha_search_entities

Tests that specifically verify fuzzy search behavior now pass
exact_match=False, since the default changed to True.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: also search sections-view header cards in ha_dashboard_find_card

Addresses feedback from @Patch76 on homeassistant-ai#801 — views[n].header.card in
sections views accepts a card (typically Markdown with Jinja2 templates)
that can contain entity references. Like badges, it's a sibling of
views[n].cards and was missed by card-focused search.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address sergeykad review — error handling, bool coercion, tests

- Dashboard search now raises on failure instead of silently returning
  empty results (fixes false-confidence problem)
- Dropped `or False` from `coerce_bool_param(..., default=True)` calls
  that undermined the True default (3 locations)
- Fixed potential None.lower() crash in ha_get_integration query filter
- Badge search now triggers on card_type="badge" (not just entity_id)
- Added unit tests for badge search, header card search, strategy dashboards
- Added E2E tests for exact_match default, fuzzy opt-in, dashboard search
  type, and search_entities exact_match default

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: kingpanther13 <kingpanther13@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…in permissions (homeassistant-ai#828)

* Potential fix for code scanning alert no. 31: Workflow does not contain permissions

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* ci: add workflow-level permissions to test-installer-scripts.yml

Addresses code scanning alerts homeassistant-ai#23, homeassistant-ai#26, homeassistant-ai#27 by adding explicit
permissions: contents: read at the workflow level, matching the
fix already applied to semver-release.yml for alerts homeassistant-ai#29-31.

All three jobs (test-macos, test-windows, check-demo-env) only
need read access to checkout code and run scripts.

---------

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Co-authored-by: Sergey <sergey@example.com>
…#830)

Bumps [cryptography](https://github.com/pyca/cryptography) from 46.0.5 to 46.0.6.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](pyca/cryptography@46.0.5...46.0.6)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-version: 46.0.6
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…ed) (homeassistant-ai#829)

Add migration guide to OAUTH.md with security context (SSRF and XSS),
add troubleshooting entries to FAQ.md and site FAQ, update README banner
to link to the migration guide instead of the GitHub issue.

Closes homeassistant-ai#749
Resolved conflict in tools_config_dashboards.py — kept the PR's
deletion of ha_get_dashboard_guide and ha_get_card_documentation
(converted to skill references). All upstream changes from homeassistant-ai#814
(badge search, header card search, exact_match, formatting) are
preserved via auto-merge.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
kingpanther13 added a commit that referenced this pull request Apr 7, 2026
The HMAC secret is regenerated on every restart, so all pre-existing
tokens are already invalidated. The only scenario for an unsigned
token reaching a running server is an attacker crafting a plain
base64 blob to bypass HMAC verification. Remove the path entirely.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
kingpanther13 added a commit that referenced this pull request May 23, 2026
…t infrastructure

Outcome of going through all 60 findings from Gemini Code Assist +
pr-review-toolkit (code-reviewer, pr-test-analyzer, silent-failure-
hunter, comment-analyzer). 3 wrong (skipped: PATH-resolved node binary
mis-flagged as hardcoded; project-relative esbuild path mis-flagged
as hardcoded; theoretical FakeBroadcastChannel constructor-throw).
Remaining real items addressed:

Harness:
- vm.runInContext replaces window.eval / indirect eval to clear the
  "no eval()" style-guide flag (Gemini #1, #2).
- Timer-callback and broadcast-listener throws now record into the
  errors list instead of being silently swallowed (homeassistant-ai#30, homeassistant-ai#32).
- SAFETY_CAP exhaustion records a clear "runaway setInterval" error
  instead of breaking silently (#4, homeassistant-ai#31).
- Non-navigation jsdomErrors route to errors (not console) so tests
  asserting `not result.errors` catch them (homeassistant-ai#38).
- Transpile failure short-circuits init eval to avoid cascading
  syntax errors from un-transpiled TS (homeassistant-ai#34).
- FakeBroadcastChannel.postMessage now delivers to peer same-name
  channels in the same context per spec (#5).
- Time-faked surface documented accurately (Date.now / setTimeout /
  setInterval only; new Date / performance.now still wall-time) (homeassistant-ai#46).
- New broadcastChannelUnavailable param simulates the
  `typeof BroadcastChannel === 'undefined'` browsing context so the
  production null-guard branch is exercised (homeassistant-ai#15).
- Dead comments and rot-prone duplications removed (homeassistant-ai#47, homeassistant-ai#49, homeassistant-ai#51,
  homeassistant-ai#56, homeassistant-ai#57, homeassistant-ai#58, homeassistant-ai#59, homeassistant-ai#66).

extract_astro_vars.mjs:
- vm.runInContext replaces (0, eval) (Gemini #2).
- Multi-line `import { a, b } from 'x';` now stripped robustly (#7).
- Eval errors wrapped with the source path for actionable failures (homeassistant-ai#35).

_js_harness.py:
- Wrong test file name and workflow path in docstring fixed (homeassistant-ai#41, homeassistant-ai#42).
- _strip_astro_frontmatter raises ValueError when frontmatter opens but
  never closes (homeassistant-ai#36).
- discover_script_surfaces raises when site/src/ is missing instead of
  silently producing partial results (homeassistant-ai#37).
- extract_script_body accepts source_label for actionable errors (homeassistant-ai#40).
- Astro `<script lang="js">` is no longer mis-tagged as TypeScript (#9).
- Inert chr(92) Windows backslash replace removed (homeassistant-ai#14).
- Field docstrings on ScriptSurface trimmed to the one that earns its
  keep (homeassistant-ai#52).
- _PY_RENDERERS registry refactor + accurate enumeration comment (homeassistant-ai#45).

test_settings_ui_js_behavior.py:
- Rot-bait PR/issue numbers removed from module docstring (homeassistant-ai#43).
- _TOP_LEVEL_ELEMENT_IDS + import-time drift check replaces the
  "refresh this manually" comment (homeassistant-ai#55).
- _assert_clean_init helper called at the top of every test so init
  failures surface as init errors, not as misleading
  "side effect didn't fire" failures (homeassistant-ai#33).
- 4xx restartBtn assertion now reads disabled state via JS and snaps
  to body.dataset instead of OR-shortcircuiting against a wiped DOM (homeassistant-ai#27).
- New test_script_boots_without_broadcastchannel_global covers the
  null-guard branch (homeassistant-ai#15).
- Assertion-restating comments removed (homeassistant-ai#60).

test_astro_setup_js_behavior.py:
- Rot-bait homeassistant-ai#1422 reference removed from module docstring (homeassistant-ai#44).
- _section_has_hidden_class replaces fragile substring slicing (#6).
- test_initial_state_only_client_section_visible now asserts on the
  promised visibility, not just absence of errors (homeassistant-ai#26).
- Per-client smoke now captures config-output text AND instructions
  HTML into body.dataset and asserts on non-empty content, catching a
  typo that drops the whole per-client branch (homeassistant-ai#21).

test_astro_tools_js_behavior.py:
- _card_class helper replaces ±200-char substring slicing (#12).
- test_design_mode_toggle now asserts design-only elements lose
  'hidden' class, not just the button label flip (homeassistant-ai#25).
- New tests cover .filter-btn / .cat-btn / .size-filter-btn /
  group-category|file|none / sort-alpha / expand-all wiring (homeassistant-ai#22, homeassistant-ai#23,
  homeassistant-ai#24) — the adjacent coverage gaps issue homeassistant-ai#1422 didn't name but that
  fit the harness's same regression-class.

test_consent_form_js_behavior.py:
- _build_form_dom docstring fixed (said "three", listed four) (homeassistant-ai#54).

test_rendered_scripts_parse.py:
- Missing-dependency skip flips to fail when CI=true so a workflow
  drift that drops the install step doesn't silently lose parse
  coverage (homeassistant-ai#29).
- Subsumed-test-class reference removed from module docstring.

AGENTS.md:
- "60s probe windows take milliseconds" wording fixed; time-faked
  surface documented (homeassistant-ai#13).
- Per-surface module naming guidance updated; reflects actual files (#10).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
kingpanther13 added a commit that referenced this pull request May 24, 2026
…ge for every rendered <script> (homeassistant-ai#1425)

* test(internal): JSDOM behaviour harness + auto-discovery parse coverage for every rendered <script>

Closes homeassistant-ai#1422.

Adds a JSDOM harness (tests/js/harness.mjs + tests/src/unit/_js_harness.py)
that drives real rendered <script> bodies through stubbed fetch /
BroadcastChannel / virtual timers / DOM and reports observed side
effects. A discovery walker auto-picks-up every <script> surface in
the repo (src/ha_mcp/settings_ui.py, src/ha_mcp/auth/consent_form.py,
every site/src/**/*.astro) so parse coverage extends as new UI surfaces
ship — no registration needed.

Behavioural coverage landed for the surfaces named in homeassistant-ai#1422:

* settings_ui — restartInProgress concurrency guard, 4xx-suppress-reload
  branch, 5xx fall-through, instance_id-flip probe, BroadcastChannel
  restart-required + restart-initiated listeners, saveFeatureFlag
  JSON-parse fallback.
* setup.astro — state-machine progression (local / network / remote),
  plus a parametrised per-client smoke that drives the wizard to
  config generation for every id in the real clientsData array.
* tools.astro — search/filter pipeline + design-mode toggle (TypeScript;
  esbuild strips types in the harness before eval).
* Layout.astro — copy-button idempotency across re-init.
* consent_form — submit handler disable + spinner state.

The legacy TestRenderedHTMLJsSyntax in test_settings_ui.py is removed —
the auto-discovery parse test in test_rendered_scripts_parse.py
subsumes it (and extends to the four other surfaces it never covered).

CI: unit-tests job in pr.yml installs nodejs + jsdom + esbuild via
apt-get / npm ci. Local devs without tests/js/node_modules/ get clean
skips, matching the original parse guard's behaviour.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(test): skip Astro frontmatter in script extraction; drain microtasks before clock advance

CI surfaced two harness bugs the local smoke-tests didn't catch:

1. `extract_script_body` and the discovery walker greedily matched the
   first `<script>` substring in the source, which in setup.astro is
   actually a frontmatter comment: `// below in the <script> block keyed
   off the entry's id.` That made the "script body" start mid-frontmatter
   and the extracted text wasn't valid JS — esbuild and JSDOM both
   rejected it with "Unexpected identifier 'keyed'".

   Fix: strip the `--- ... ---` Astro frontmatter block before searching
   for `<script>` tags. Plain .py and .html sources have no frontmatter
   and pass through unchanged.

2. `clock.advance(settleMs)` returned immediately when no timers were
   yet scheduled, but the script under test often awaits a chain of
   stubbed-fetch promises BEFORE hitting its first `setTimeout`. With
   only one microtask drain between eval and advance, those promises
   hadn't resolved yet, so no timers existed, advance was a no-op, and
   the script stayed suspended — `restartAddon`'s POST to
   /api/settings/restart never fired and the `alert(msg)` in the 4xx
   branch never ran.

   Fix: drain microtasks aggressively at the start of advance() so
   pending promises get to schedule their timers, and drain again when
   the timer queue temporarily empties (a promise resolution may queue
   new timers).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(internal): expand initial DOM fixtures to cover every top-level addEventListener target

CI surfaced this via the test_5xx test (the only one whose assertion
included the harness errors list): the settings_ui script aborts during
init at `document.getElementById('backupRefresh').addEventListener(...)`
because the test DOM is missing the backup table / modal markup. With
init aborted, the invoke step never runs — `restartAddon` is never
called, POSTs never fire, `alert()` never runs, and all three restart-
flow tests silently fail.

The setup.astro tests had the same shape: `generateConfig` queries
`config-section` (distinct from `section-config`) to show/hide the
inner code block. Without it, the proxy click handler in the remote-
flow test threw and the `document.body.dataset.beforeProxy` assignment
never landed.

Fixes:
- settings_ui MIN_DOM now includes backupBulkDelete, backupDomain,
  backupEntity, backupList, backupRefresh, backupState, featuresBody,
  modalBackdrop / modalBody / modalClose / modalTitle. Set built from
  `grep -h "document.getElementById" settings_ui.py` so future top-level
  handlers will surface as the same pattern.
- setup.astro DOM now includes config-section alongside section-config.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(test): run JSDOM eval at global scope; capture body attrs in dom snapshot

Two harness bugs the prior CI rounds didn't surface until init-stage
crashes were resolved:

1. Wrapping the rendered script in an `async () => { ... }()` IIFE
   confined top-level `function` declarations to the IIFE scope.
   `function restartAddon() {...}` never landed on `window`, so
   `invoke: "window.restartAddon();"` threw `is not a function`. A
   real browser hoists inline-script function decls to the global
   window — match that by running prelude + script body at global
   scope and keeping the IIFE for `invoke` alone (so awaits inside
   invoke still work).

2. `document.body.innerHTML` returns body's children but not body's
   own attrs, so tests that wrote `document.body.dataset.foo = 'bar'`
   as a side-channel for in-page state had no way to assert on it —
   `result.dom` came back without the attr. Serialise
   `document.documentElement.outerHTML` instead so html/head/body
   tags and their own attributes round-trip.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(internal): sequence /api/settings/info responses for 5xx restart probe

The test_5xx flow hits the info endpoint three times — loadTools init,
restartAddon's pre-POST baseline capture, and _probeAddonRestarted
after the POST. The old single-response fixture returned the SAME
instance_id every time, so the probe never saw the flip and looped
until timeout, leaving reloads=0.

Adds a `responses: [...]` shape to the harness fetch_map: each match
on a URL pattern advances a per-pattern counter; the last entry sticks
after exhaustion (matches "the addon came back online and stays
online"). Test now provides baseline → baseline → flipped so the
probe terminates with restarted=true and the reload fires.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(test): cache apt downloads and node_modules for the unit-tests job

The unit-tests "Install git and Node.js" step was 44 s — almost all of
it network download of the nodejs / npm .deb. The "Install JS test
dependencies" step is 1 s when node_modules is fresh but can grow as
deps change.

- Cache /var/cache/apt/archives keyed on a stable string (apt package
  set rarely changes). Disable docker-clean and set Keep-Downloaded-
  Packages so the cached .debs survive install for the next run. apt
  install still runs (unpacks from local cache, ~3-5 s) but skips the
  network leg.
- Cache tests/js/node_modules keyed on package-lock.json so dep bumps
  invalidate cleanly. `npm ci` short-circuits when the tree matches.

Expected first-cold-cache run: unchanged (~45 s install). Cache hits:
~5 s for both steps combined.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(internal): address Gemini + pr-review-toolkit findings on JS test infrastructure

Outcome of going through all 60 findings from Gemini Code Assist +
pr-review-toolkit (code-reviewer, pr-test-analyzer, silent-failure-
hunter, comment-analyzer). 3 wrong (skipped: PATH-resolved node binary
mis-flagged as hardcoded; project-relative esbuild path mis-flagged
as hardcoded; theoretical FakeBroadcastChannel constructor-throw).
Remaining real items addressed:

Harness:
- vm.runInContext replaces window.eval / indirect eval to clear the
  "no eval()" style-guide flag (Gemini #1, #2).
- Timer-callback and broadcast-listener throws now record into the
  errors list instead of being silently swallowed (homeassistant-ai#30, homeassistant-ai#32).
- SAFETY_CAP exhaustion records a clear "runaway setInterval" error
  instead of breaking silently (#4, homeassistant-ai#31).
- Non-navigation jsdomErrors route to errors (not console) so tests
  asserting `not result.errors` catch them (homeassistant-ai#38).
- Transpile failure short-circuits init eval to avoid cascading
  syntax errors from un-transpiled TS (homeassistant-ai#34).
- FakeBroadcastChannel.postMessage now delivers to peer same-name
  channels in the same context per spec (#5).
- Time-faked surface documented accurately (Date.now / setTimeout /
  setInterval only; new Date / performance.now still wall-time) (homeassistant-ai#46).
- New broadcastChannelUnavailable param simulates the
  `typeof BroadcastChannel === 'undefined'` browsing context so the
  production null-guard branch is exercised (homeassistant-ai#15).
- Dead comments and rot-prone duplications removed (homeassistant-ai#47, homeassistant-ai#49, homeassistant-ai#51,
  homeassistant-ai#56, homeassistant-ai#57, homeassistant-ai#58, homeassistant-ai#59, homeassistant-ai#66).

extract_astro_vars.mjs:
- vm.runInContext replaces (0, eval) (Gemini #2).
- Multi-line `import { a, b } from 'x';` now stripped robustly (#7).
- Eval errors wrapped with the source path for actionable failures (homeassistant-ai#35).

_js_harness.py:
- Wrong test file name and workflow path in docstring fixed (homeassistant-ai#41, homeassistant-ai#42).
- _strip_astro_frontmatter raises ValueError when frontmatter opens but
  never closes (homeassistant-ai#36).
- discover_script_surfaces raises when site/src/ is missing instead of
  silently producing partial results (homeassistant-ai#37).
- extract_script_body accepts source_label for actionable errors (homeassistant-ai#40).
- Astro `<script lang="js">` is no longer mis-tagged as TypeScript (#9).
- Inert chr(92) Windows backslash replace removed (homeassistant-ai#14).
- Field docstrings on ScriptSurface trimmed to the one that earns its
  keep (homeassistant-ai#52).
- _PY_RENDERERS registry refactor + accurate enumeration comment (homeassistant-ai#45).

test_settings_ui_js_behavior.py:
- Rot-bait PR/issue numbers removed from module docstring (homeassistant-ai#43).
- _TOP_LEVEL_ELEMENT_IDS + import-time drift check replaces the
  "refresh this manually" comment (homeassistant-ai#55).
- _assert_clean_init helper called at the top of every test so init
  failures surface as init errors, not as misleading
  "side effect didn't fire" failures (homeassistant-ai#33).
- 4xx restartBtn assertion now reads disabled state via JS and snaps
  to body.dataset instead of OR-shortcircuiting against a wiped DOM (homeassistant-ai#27).
- New test_script_boots_without_broadcastchannel_global covers the
  null-guard branch (homeassistant-ai#15).
- Assertion-restating comments removed (homeassistant-ai#60).

test_astro_setup_js_behavior.py:
- Rot-bait homeassistant-ai#1422 reference removed from module docstring (homeassistant-ai#44).
- _section_has_hidden_class replaces fragile substring slicing (#6).
- test_initial_state_only_client_section_visible now asserts on the
  promised visibility, not just absence of errors (homeassistant-ai#26).
- Per-client smoke now captures config-output text AND instructions
  HTML into body.dataset and asserts on non-empty content, catching a
  typo that drops the whole per-client branch (homeassistant-ai#21).

test_astro_tools_js_behavior.py:
- _card_class helper replaces ±200-char substring slicing (#12).
- test_design_mode_toggle now asserts design-only elements lose
  'hidden' class, not just the button label flip (homeassistant-ai#25).
- New tests cover .filter-btn / .cat-btn / .size-filter-btn /
  group-category|file|none / sort-alpha / expand-all wiring (homeassistant-ai#22, homeassistant-ai#23,
  homeassistant-ai#24) — the adjacent coverage gaps issue homeassistant-ai#1422 didn't name but that
  fit the harness's same regression-class.

test_consent_form_js_behavior.py:
- _build_form_dom docstring fixed (said "three", listed four) (homeassistant-ai#54).

test_rendered_scripts_parse.py:
- Missing-dependency skip flips to fail when CI=true so a workflow
  drift that drops the install step doesn't silently lose parse
  coverage (homeassistant-ai#29).
- Subsumed-test-class reference removed from module docstring.

AGENTS.md:
- "60s probe windows take milliseconds" wording fixed; time-faked
  surface documented (homeassistant-ai#13).
- Per-surface module naming guidance updated; reflects actual files (#10).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(test): seed data-transports on wizard tiles; add NODE/ESBUILD env overrides; FakeBroadcastChannel ctor guard

CI surfaced a real test-fixture bug exposed by the new jsdomError →
errors routing: setup.astro's connection-click handler reads
`card.dataset.transports` via JSON.parse, but the wizard DOM stubs
were emitting `<button data-client="...">` without the matching
`data-transports` attribute. JSON.parse(undefined) threw "undefined is
not valid JSON" on the jsdomError channel, which the previous
silent-handling code dropped — now correctly surfaced as a test
failure. Fix: serialise the real `transports` array from the
clientsData entry onto each tile.

Also addressing the items previously marked deferred / skipped during
the Gemini + pr-review-toolkit triage:

- NODE_BINARY env override (Gemini #3): _node_binary() helper checks
  the env var before falling back to PATH-resolved `node`. Default
  unchanged.
- ESBUILD_BINARY env override (Gemini #8): _esbuild_binary() returns
  the env-var path when set, else the project-local install. Default
  unchanged so the lockfile-pinned install stays the reproducible
  default.
- FakeBroadcastChannel constructor guard (sf-hunter #L1): wraps the
  `new` in try/catch and records construction failures into errors
  before re-raising.
- Trim TestWizardStateMachine class docstring (comment-analyzer homeassistant-ai#50).
- Tighten the info-call enumeration comment in the 5xx test to
  describe the harness's "last entry sticks" semantics rather than
  pinning a specific call count (comment-analyzer homeassistant-ai#48).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(test): stub layout-dependent JSDOM APIs (scrollIntoView, scrollTo, matchMedia)

The new timer-callback-error routing surfaced a real JSDOM limitation:
`section.scrollIntoView()` (called from the wizard's `scrollToSection`
helper inside a setTimeout) is not implemented in JSDOM. Every
per-client setup-flow test failed with
``timer callback: TypeError: section.scrollIntoView is not a function``
— production behaviour is fine, but the harness's noise filter wasn't
distinguishing real script bugs from JSDOM-missing-API noise.

Adds a defensive no-op stub for scrollIntoView (Element + HTMLElement
prototypes), scrollTo on window, and matchMedia — the three most
common layout-dependent APIs production UI scripts touch. Future
rendered scripts that lean on other layout APIs (IntersectionObserver,
etc.) can extend the list when needed.

Also relaxes the per-client smoke's bare `assert not result.errors` to
rely on `_assert_clean_init` (init/transpile/invoke/jsdom errors) plus
the content-shape assertion. Timer-callback errors from missing JSDOM
APIs are noise; the content-shape check still catches the regression
class the test is named for.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(test): seed .tool-chevron on tool-card fixture for tools.astro expand-all test

The expand-all handler queries `card.querySelector('.tool-chevron')!`
(TypeScript non-null assertion). The runtime `!` doesn't actually
check; chevron is null in the test DOM and `chevron.classList.add(...)`
throws. Production cards include the chevron; our fixture didn't.
Add it alongside `.tool-details` in `_build_tools_dom`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: kingpanther13 <kingpanther13@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
kingpanther13 added a commit that referenced this pull request May 25, 2026
…d findings (homeassistant-ai#1164)

Round-3 review pass landed 13 verified findings; the cosmetic
"persisted-value visible in UI after F5" item was explicitly skipped
per user direction (UI shows post-gate value when master off; user
accepts this since beta tools are actually disabled at runtime — the
preserve-across-master-cycle UX is at the data layer, not the visual).

Source fixes:

- **Save button copy is now source-blind** — the previous "your
  feature-flag toggles already saved on click" claim only held when
  ``saveFeatureFlag`` raised ``restartNotice``; tool-config pin saves,
  backup-config saves, and cross-tab ``restart-required`` broadcasts
  also raise it. New copy: "a restart is pending. Click Restart above
  to apply your prior changes." (#2)
- **Stale F.37 test docstring** describing the deleted server-side
  cascade rewritten. (#3)
- **Beta-gate INFO log noise** — cascade-clear removal meant the gate
  could fire its "forcing %s=False" line every Settings rebuild,
  spamming addon logs once a user had truthy sub-flags persisted.
  Dedup via ``_BETA_GATE_LOGGED`` set per process, cleared on
  ``_reset_global_settings``. (#9)
- **Lazy-lock docstring** updated to reflect Python 3.13 semantics
  (``asyncio.Lock()`` no longer takes a loop arg; the lazy pattern
  still serves test fixtures and single-loop deployment, with the
  invariant documented). (#10)
- **Addon-mode carve-out comment** clarified to distinguish dev
  (master in schema) from stable (master web-UI-only). (homeassistant-ai#14)
- **probe-div null branch** in F.37 now writes ``data-error`` so a
  failing test points at "selector missed" vs "value flipped"
  unambiguously. (homeassistant-ai#17)

Tests added:

- ``test_translations_cover_every_schema_key`` — parity check that
  every ``schema:`` key has a non-empty translation ``name`` and
  ``description``. Parameterised across stable + dev addons. Pins
  the class of silent gap that this PR's ``advanced_debug_logging``
  fix addressed. (#4)
- ``test_save_features_acquires_override_file_lock`` +
  ``test_save_advanced_acquires_override_file_lock`` — counting-lock
  wrapper asserts ``async with _get_override_file_lock()`` runs
  exactly once in each file-mode write path. Pin against a regression
  that silently bypasses concurrent-save serialisation. (#5)
- ``test_dual_save_buttons_mirror_disabled_and_status_on_post_failure``
  — exercises the 500-response branch of the dual-save mirror so a
  regression that broke ``_setAdvSaveStatus``/``_setAdvSaveDisabled``
  for error paths only would still fail. (#6)
- ``test_save_features_master_on_restores_subflag_values_in_addon_mode``
  — addon-mode round-trip mirror of the existing standalone restore
  test; asserts the Supervisor merge-and-post call carries only the
  master flip-on and never zeroes out sub-flag values. (#7)
- ``test_save_button_nothing_to_save_when_no_dirty_and_no_restart`` +
  ``test_save_button_restart_pending_hint_when_dirty_empty_but_restart_showing``
  — both branches of the empty-dirty Save click are exercised; the
  restart-pending branch asserts the copy is source-blind. (#8)

Deferred per user direction:
- #1 (file-vs-Settings visual after F5): user accepts the current
  behavior (UI shows post-gate value; runtime tools actually
  disabled when master off; data-layer preserve still works
  end-to-end).
- #11/#12/homeassistant-ai#13 (code-simplifier helper extractions): skipped as
  complicated to implement without behavior risk.
- homeassistant-ai#15 (pre-homeassistant-ai#1164 users with already-cleared sub-flags): release-note
  concern, not a code change.
- homeassistant-ai#16 (lock fragility under future thread-pool dispatch):
  speculative future-risk; not actionable today.
kingpanther13 added a commit that referenced this pull request May 29, 2026
…ant-ai#1164) (homeassistant-ai#1431)

* feat(policy): scaffold policy package for per-tool approval (#966)

* feat(policy): add Predicate/Rule/Policy data models (#966)

* feat(addon): expose enable_per_tool_approval option (#966)

* feat(config): add enable_per_tool_approval setting (#966)

* feat(policy): atomic load/save for tool_policy.json (#966)

* feat(addon): wire enable_per_tool_approval through start.py + docs (#966)

* feat(policy): args-hash + remember-cache for approval queue (#966)

* feat(policy): predicate evaluator (eq/in/regex/exists/...) (#966)

* feat(policy): pending entries with TTL, decisions, and event signalling (#966)

* feat(policy): PolicyMiddleware happy-path branches (#966)

* test(policy): cover block/deny/timeout/recall/remember branches (#966)

* feat(policy): /api/policy/* Starlette handlers (#966)

* fix(policy): wrap ValidationError, scope contains op, regex doc, test bind (#966)

* feat(policy): Policies tab in web UI + sidecar route wiring (#966)

* feat(policy): register PolicyMiddleware on the FastMCP server (#966)

* fix(policy): return 400 on malformed approve/deny bodies (#966)

* feat(toolsearch): unpin yaml-edit and code-mode tools, gated by approval middleware (#966)

* chore: ruff format + lint cleanup for policy package (#966)

- ruff format reflow on PR-touched files (case statements split to two
  lines, function signatures, line continuations).
- UP042: Verdict now inherits from StrEnum instead of (str, Enum).
- E402: hoist `import anyio` to the top of test_approval_queue.py.
- I001: sort imports in test_evaluator.py and test_model.py.

No behavior change.

* fix(policy): mypy narrowing for evaluator comparisons (#966)

`Predicate.value` is `Any | None` and `extract_path` returns `Any`, so
`val == pv`, `val > pv`, etc. inherit `Any` and trip the project's
`warn_return_any` mypy setting on functions declared `-> bool`. Wrap
the comparison branches in `bool(...)` to make the narrowing explicit.

Also guard the `regex` branch with `isinstance(pv, str)` so `re.search`
receives a definite `str` instead of `Any | None`; a non-string regex
value now returns False instead of raising TypeError at evaluation
time, which is the only sensible behavior for a malformed pattern.

No change to any test's expected outcome.

* docs: credit @L1AD and PolicyLayer for #966 inspiration

* docs(addon): fix wrong YAML example in enable_per_tool_approval section (#966)

* fix(policy): CI green + critical bugs from reviewer cycle (#966)

- expires_in_seconds: use time-remaining not total TTL
- middleware: fail-closed on corrupt policy load (was crashing all gated calls)
- handlers: 500-with-corrupt-flag on get_config when policy invalid
- approval URL: wire secret_prefix via lazy getattr so HTTP-standalone
  emits a usable absolute path
- approve/deny: return bool, 409 on already-decided
- Predicate: field validators for op/value compatibility (Gemini)
- persistence: explicit UTF-8 encoding (Gemini)
- middleware: reuse tools/helpers safe_progress
- handlers comment: fix wrong "next call" claim
- test_middleware: unwrap ExceptionGroup for pytest.raises (CI fix)
- test_stdio_settings_sidecar: include new policy_* handler keys (CI fix)

* refactor(policy): drop default_action + tighten Rule.tool_name (#966)

- Policy schema simplified: no default_action field. System is always
  "allow unless a rule matches; rule = require approval". The previous
  default_action='require_approval' option was a bricked-config trap
  since rules can't grant allow-overrides.
- Rule.tool_name now rejects empty string; wildcard '*' documented in
  the docstring.
- Evaluator simplified to match.
- Tests updated/added.

* refactor(policy): encapsulate decision state + clean naming (#966)

- PendingApproval.decide() encapsulates the decision/event coupling;
  property guards read-only access to decision.
- __post_init__ validates expires_at > created_at.
- ApprovalQueue docstring spells out single-process scope and
  restart-loses-tokens semantics.
- Rename args_preview -> args throughout (it was always the full
  unmodified args; "preview" was misleading).
- Remove on_policy_change dead parameter from build_policy_handlers.

* fix(toolsearch): default-pinned tools should be user-unpinnable (#966)

- Computed pinned set now respects tool_config.json — explicit "enabled"
  state removes from defaults.
- Remove ha_restart / ha_reload_core from DEFAULT_PINNED_TOOLS (recovery
  actions, low frequency, low value in default LLM tool surface).
- Add ha_manage_backup to MANDATORY_TOOLS (operational essential).
- Server now declares _settings_secret_prefix on __init__ for pyright.

* refactor(policy): rename feature to "Tool Security Policies" (#966)

User-facing rename: addon config option, env var, Settings attribute,
UI tab label, addon DOCS sections, translations, server method.
Internal Python naming (policy/ package, tool_policy.json, /api/policy/*
routes, class names like PolicyMiddleware/ApprovalQueue) unchanged for
less churn.

* docs: small comment polish for policy review nits (#966)

* feat(ui): per-tool security-gated toggle in Tools tab (#966)

* test(policy): integration + timing-isolation coverage gaps (#966)

- test_server_policy_wiring: assert _apply_tool_security_policies attaches
  middleware + approval_queue when enabled, neither when disabled.
- test_settings_ui_handler_selection: parametrize the live-vs-stub branch
  in build_settings_handlers (sidecar / no server / no queue / live).
- test_middleware wait-loop timing: assert event-wake exit, not polling.
- test_middleware multi-rule precedence: first-match wins for remember_minutes.

* feat(ui): rewrite Tool Security Policies tab — per-tool cards + predicate editor (#966)

* feat(config): expose enable_tool_security_policies as a feature flag (#966)

Wires the new addon-config toggle into FEATURE_FLAG_FIELDS so it appears
in the Server Settings tab and rides PR #1420's _save_feature_flags +
_supervisor_merge_and_post_options + _schedule_supervisor_self_restart
flow when toggled in addon mode.

* fix(policy): address all verified review findings + CI failures (#966)

CI fixes:
- ruff: drop dead noqa: SLF001 suppression
- unit test: test_missing_path_never_matches_except_exists uses
  op-compatible values so Predicate field_validator doesn't reject

Review findings:
- FEATURE_META entry for enable_tool_security_policies (toggle now
  renders in Server Settings tab)
- Approval URL points to /settings?tab=tool-security-policies (was
  POST-only /api/policy/approve which 405'd on browser open)
- policyDecide surfaces network errors + 409 current_decision
- Policy gains version field for optimistic concurrency; PUT 409s
  on version mismatch; client surfaces 'reload before saving'
- _apply_tool_security_policies failure logs spell out security
  impact (TOOL SECURITY GATING IS NOT ACTIVE) and include
  data_dir/env-var context
- Validator rejects value on op='exists'; gt/lt TypeError degrades
  to False
- ToolVisibilityResult -> UserToolStateOverrides, fields are
  frozenset, disjointness asserted
- PendingApproval.event private; expose async wait()
- _SupervisorOptionsError gains transport()/validation() classmethods
  encoding kind->status_code pairing
- Wiring test binds queue identity; handler-selection covers all 3
  live routes
- Comment + doc polish (audit-trail claim, e2e docstring path,
  internal Task references)

* fix(policy): CI green + real e2e test for the approval flow (#966)

- ruff format: tests/src/unit/test_settings_ui.py
- test_save_and_roundtrip: account for save_policy version bump
- test_serialized_shape_is_stable: include 'version' in expected keys
- test_addon_save_returns_500_when_server_is_none: guard
  server._settings_secret_prefix assignment with None check
  (regression from #4's secret-prefix wiring)
- tests/src/e2e/policy/test_approval_flow.py: real e2e exercising
  block -> approve -> re-call cycle with strict args-binding rejection
  on mutated args. Skip-stub replaced with real test driving the live
  middleware via mcp_client + /api/policy/* HTTP.

* fix(ui): broken quote escaping in predicate-form placeholder breaks JS parse (#966)

The Python source `'placeholder=\\'\"lock\"...\\\\'>'` rendered as JS
`'placeholder='\"lock\"..'>'` — the single quote inside the HTML attribute
value closed the outer JS string literal, and subsequent tokens (\"lock\",
'or', '[', ...) broke parsing. With a syntax error in the inline <script>,
the browser stopped executing — Tools tab stuck on 'Loading...', tabs
unclickable.

Switched to a JS-safe double-quoted attribute with &quot; for the
embedded double quotes in the placeholder hint.

* fix(ui): gated toggle reads addon-config flag, not Policy.enabled (#966)

The per-tool 'security gated' toggle was grayed out even when the user
had enable_tool_security_policies turned ON in the addon config + the
Server Settings tab toggle, because the JS was reading Policy.enabled
(the file field) instead of the addon-config feature flag — which is
the single source of truth for whether the middleware is active.

loadPolicyState now reads enable_tool_security_policies from
/api/settings/features (same place renderFeatureFlags consumes from).

* fix(policy): Policy.extra=ignore so old persisted files load cleanly (#966)

Persisted tool_policy.json files from an earlier revision of this PR
carry default_action (since dropped) and rejected with ValidationError
on load — surfacing as 'Could not load policy: 500' when the user
clicked the per-tool gated toggle.

Predicate/Rule keep extra=forbid (typo catching at construction).

* fix(policy): drop Policy.enabled — addon-config flag is the sole switch (#966)

The middleware's server-side gate was checking `policy.enabled` (a
file field with no UI surface), so it returned ALLOW on every call
regardless of rules. The addon-config flag
(`enable_tool_security_policies`) was supposed to be the only switch
— and the middleware is only registered when that flag is true — so
the inner `policy.enabled` check was both redundant and broken.

Remove the field, remove both server-side checks (middleware +
evaluator), update tests, and refresh the JS comment that referred
to it.

* fix(policy): drop approve_url, instruct LLM to send user to settings page (#966)

The relative-path approve_url doesn't resolve cleanly through cloudflared
or other reverse-proxy deployment modes — the LLM can't safely hand it
to the user. The user already knows where the Tool Security Policies
tab is (they set the rule from it), and that page lists all pending
approvals, so a per-request URL is unnecessary noise.

- Drop approve_url from USER_APPROVAL_REQUIRED context; keep `token`
  so a caller could correlate but the user doesn't need to act on it.
- Update message + progress text to instruct the LLM to tell the user
  to open the settings UI Tool Security Policies tab.
- Drop the now-unused approval_url_builder param + the
  _settings_secret_prefix plumbing in server.py / settings_ui.py.

Also fix the failing test_defaults (asserted dropped Policy.enabled field)
and the e2e test PUT body that still carried `"enabled": True`.

* feat(policy): schema-driven condition builder for write/destructive tools (#966)

The previous "Add predicate" UX required users to type both the dotted
arg path (e.g. `args.domain`) and the value as JSON. Two problems:
  1. They need to know what fields each tool takes.
  2. They need to know what values are legal (which HA domains exist,
     which entities, etc.).

Replace the free-text path input with a dropdown sourced from the tool's
JSON schema, and replace the free-text value input with a (multi-)select
sourced from HA when the path has a known value source (domain, service,
entity_id today; trivially extensible). Free-text is still available via
an "(other — type a path)" escape hatch and as the automatic fallback
for ops that don't pair with a registry (regex / contains / gt / lt).

Server:
- New `/api/policy/tool-schema?name=...` returns
  `{paths: [...], value_sources: {path: source_key}}`. Read-only tools
  return empty paths so the UI falls back to free-text (gating those is
  low-value but still permitted manually).
- New `/api/policy/value-source?source=...` resolves a source key to a
  live list of choices. In-process 30s TTL cache avoids hammering HA
  when the user explores paths.
- value_sources.py registry maps (tool_name, arg_path) → source_key for
  the common write/destructive surface (call_service, set_entity,
  set_integration_enabled, get_history, etc.). New mappings are one
  dict entry plus, if a new source key, one fetcher.
- Both endpoints mount in addon + secret-prefix routes. Sidecar serves
  503 stubs (no FastMCP registry / HA client in that process).

UI: rename user-facing "predicate" → "condition" (CS jargon → SQL/JIRA
terminology users actually recognise; internal Pydantic class stays
`Predicate` so the wire format is unchanged). Form fetches the schema
lazily on first open, caches it on the card, refetches value choices
when path/op changes.

Includes test_schema_handlers.py covering: missing-name 400, sidecar
503, unknown-tool 404, read-only empty-paths, write-tool paths +
registry, JSON-schema enum passthrough, value-source 400 paths, both
HA-services payload shapes, domain filtering for entities/services,
and upstream-fetch 502 mapping.

* test: include new policy handler keys in sidecar all-keys assertion (#966)

* fix(ui): clearer condition-builder labels, optional value, bareword input (#966)

User feedback on the new form was:
  1. "args.foo" path placeholder is gibberish; no real label on path/value
  2. value box should not be mandatory for ops where backend allows None
  3. typing `lock` into the value box errored with "Invalid JSON" — every
     normal-looking input has to be quoted
  4. for ha_call_service `data` was the only arg without an obvious meaning

Changes:
- Real `<label>`s on the form rows: "Argument:", "Match when:", "Value:".
- Op dropdown shows friendly text ("is present (any value)", "equals",
  "is one of", "matches regex", etc.); wire values unchanged.
- Hint line under the value row reflects the current op so users know
  whether a value is required and roughly what shape it should take.
- Value is now OPTIONAL for ops where the backend accepts a missing
  field (exists, eq, neq, contains). Submitting an empty value omits
  the `value` key from the predicate entirely.
- Bareword inputs auto-coerce: `lock` → `"lock"`, `lock,alarm` → list,
  `42` → number, `true` → bool. Falls back to a clearer error if even
  the smart-coercion can't make JSON.
- Path dropdown options now carry the schema `description` as a
  `title` tooltip, so `data` reads as "Service data dict" on hover
  instead of being a mystery.
- Schema-declared enums render as a value dropdown automatically (no
  registry entry needed) when the path's JSON-schema has `enum`.

Also fix /tmp/extract_js.py — naive paren-counter broke once form
strings started containing parens; switch to ast.parse so future
edits don't silently break the harness.

* feat(policy): wildcard path "args.*" + clearer empty-value semantics (#966)

User asked for a catch-all "any argument equals X" condition and called
out that the previous "Leave blank to gate on null" hint was nonsensical
— a blank value should mean "any value" to a normal user, not "match
the null literal".

Backend:
- Refactor evaluator: `extract_path` → `iter_path_values`, which yields
  every value the dotted path resolves to. A `*` segment fans out across
  the current node (dict values for dicts, items for lists). A path like
  `args.*` thus yields every top-level arg; `args.config.*` yields every
  leaf of the config sub-dict.
- `match_predicate` rewrites to "ANY matching value satisfies the op",
  which collapses to the previous single-value semantics for non-wildcard
  paths. So `path=args.*, op=eq, value="lock"` gates whenever any arg of
  the tool call equals "lock".

UI:
- New "(any argument)" option at the top of the path dropdown, fills
  `args.*` and carries a tooltip explaining the semantic.
- VALUE_OPTIONAL_OPS shrinks to just `exists` — blank value is no longer
  silently accepted for eq/neq/contains. Instead, the value-required
  error fires, and the hint text under the field tells the user to
  switch op to `is present` if they wanted "any value".
- Hint copy revised across all ops so the "what happens with this op +
  blank value" question has a clear answer at each step.

Tests:
- New TestIterPathValues covering top-level, nested, missing, and the
  three wildcard shapes (dict values, list items, empty).
- New TestWildcardPredicate covering eq/in/exists/regex matching via
  `args.*` plus an end-to-end evaluate() test.
- Existing tests still pass with the refactored matcher; signatures of
  the public functions are unchanged.

* fix(ui): default condition path to '(any argument)'; relabel error (#966)

- Drop the '(pick an argument)' placeholder; default the path dropdown
  to '(any argument)' so the form is immediately submittable.
- 'path is required' error reads 'argument is required' if it ever
  fires (it won't on the happy path now).

* feat(policy): auto-save conditions + surface matched_rule in approval error (#966)

UI:
- Drop the manual "Save changes" button on each rule card. Conditions
  now PUT to disk the moment the user clicks "Save condition", clicks
  the × on a condition row, or edits the remember-minutes field
  (debounced 500ms). The only feedback is a small "Saving…" / "Saved."
  status line next to the card.
- Removed the now-dead .policy-save-rule CSS and the markDirty helper.

Server:
- USER_APPROVAL_REQUIRED error context now carries `matched_rule` with
  the rule's tool_name + when[]. Lets the user (and the LLM) tell at a
  glance which rule fired, instead of guessing whether their condition
  saved correctly.

* feat(policy): case-insensitive string comparison in all ops (#966)

Security gates shouldn't fire differently based on whether the LLM
capitalised its argument — 'Lock' and 'LOCK' and 'lock' are the same
operationally. eq/neq/in/not_in/contains lower-case both sides before
comparing when both are strings; regex uses re.IGNORECASE. Non-string
types pass through unchanged so int(1) != str('1') still holds.

* fix(policy): mypy bool cast + broaden e2e coverage (#966)

mypy: bool(_ci(val) == _ci(pv)) — _ci returns Any (passes non-strings
through unchanged), so eq/neq comparisons need an explicit bool wrap.

Tests: previous e2e only covered the happy block→approve→re-call path.
Add four more cases against the live testcontainer:
- wildcard `args.*` gates when any arg matches the value
- wildcard `args.*` passes through when no arg matches
- case-insensitive matching (rule 'lock' gates caller 'LOCK')
- deny → middleware raises USER_DENIED, tool never runs
- remember_minutes>0: second call within the window skips the queue

* refactor(policy): address review-cycle findings (#966)

Gemini (6 unresolved threads):
- Migrate POLICY_LOAD_FAILED / USER_DENIED / USER_APPROVAL_REQUIRED off
  manual `ToolError(json.dumps(...))` onto the canonical
  `raise_tool_error(create_error_response(...))` pattern. Added the
  three error codes to ErrorCode enum.
- Hoist sync `load_policy()` off the event loop via
  `anyio.to_thread.run_sync` in the middleware's policy provider call.
- Add justification comment on `local_provider._list_tools()` (same
  rationale that's already documented in `settings_ui.py`'s tool
  enumerator: public `list_tools()` filters disabled tools but
  operators may still want to author gating rules for them).

Code-reviewer findings:
- ApprovalQueue TOCTOU: two concurrent `on_call_tool` coroutines with
  identical (tool, args_hash) could both miss `find()` and create
  duplicate pending entries; approving one would leave the other
  waiter blocked. Introduce `find_or_create(...)` serialised behind
  an `anyio.Lock`; middleware now uses it.
- ApprovalQueue had no pending-entries cap → memory exhaustion under
  an LLM retry-loop with mutated args. Add `PENDING_CAP = 1000` with
  FIFO eviction of oldest entries when the cap is hit.

Silent-failure-hunter findings:
- handlers.py: `get_tool_schema` and `get_value_source` now
  `logger.exception` before returning 500/502 so FastMCP version
  bumps or HA outages leave a traceable signal instead of opaque
  client errors.
- value_sources fetchers `logger.warning` on unexpected HA response
  shapes (would otherwise silently return empty dropdowns).
- value_sources cache no longer stores empty results — a transient HA
  glitch returning [] would otherwise pin the dropdown blank for 30s.

PR-test-analyzer findings (the critical one):
- test_persistence.py's `test_save_and_roundtrip` passed `Policy(enabled=True, ...)`
  for a field that no longer exists; `extra="ignore"` silently dropped it
  so the test was a no-op assertion. Replace with real round-tripped
  fields (wait_seconds / approval_ttl_minutes / remember_minutes) and
  add an explicit `test_load_drops_unknown_fields` exercising the
  extra="ignore" back-compat contract with a JSON file carrying
  `default_action` + `enabled`.
- Add wildcard scalar/None tests (`args.x.*` against scalar yields
  nothing; doesn't crash).
- Add ApprovalQueue tests: concurrent `find_or_create` shares one
  pending entry; `create` evicts oldest at PENDING_CAP.
- Add handler tests: sidecar value-source returns 503, tool-schema 500
  on `_list_tools` exception, value-source cache key separates per
  params, `_extract_arg_paths` skips malformed property entries.

Comment-analyzer findings:
- Grammar fix in handlers.py `_is_write_or_destructive` docstring.
- model.py docstring "older version of this PR" → "older builds".
- Strip the `(#966)` / `(issue #966)` parentheticals from module
  docstrings, settings_ui CSS/HTML/comments — git blame and the
  commit message carry the link.

* fix(policy): UI surface fetch failures + middleware reissues swept pending (#966)

- Middleware: after _wait_for_decision returns without a verdict, check
  whether the pending entry was swept (TTL elapsed during the wait). If
  so, create a fresh entry before raising USER_APPROVAL_REQUIRED so the
  LLM isn't told to re-call against a dead token.

- UI: policyLoadConfig now surfaces fetch failures in a visible error
  banner instead of silently rendering blank — picks up the
  policy_file_corrupt:true repair hint from the server's 500 response.

- UI: loadValueChoices records the failure (lastValueSourceError) so
  renderHint can show it under the value row. The dropdown still
  downgrades to free-text, but the user can now tell a transient HA
  outage from "no value source registered for this path".

- UI: renderValueControl uses an autoincrement seq so rapid path/op
  edits don't let an earlier slow fetch's DOM mutation land after a
  newer one's (similar to the autoSave pattern).

* fix(policy): logger.info on silent decide-False; debug log on gt/lt type-mismatch; strengthen event-wake test (#966)

- ApprovalQueue.approve/deny: emit logger.info when the call returns
  False (unknown token or already decided) — was silent. Helps debug
  the case where the middleware's consume_and_maybe_remember races
  with an out-of-band decide.

- Evaluator gt/lt TypeError fallback now logs at debug so a user
  whose 'battery_level < 20' rule never fires can see that the arg
  came in as a string and tighten the rule.

- test_event_wakes_waiter now measures elapsed wait time and asserts
  < 200ms, ruling out a hidden poll-loop impl that would still pass
  the previous decision-only check.

* style: ruff format evaluator.py for CI's 0.15.13 (#966)

Local ruff 0.15.7 didn't wrap the multi-arg logger.debug call;
CI's ruff 0.15.13 does. Upgrading local toolchain to match.

* feat(policy): clear remember-cache on save, clearer disabled-state UX, mirror master toggle (#966)

Three things:

1. Remember-cache invalidation on policy save (B2).
   ApprovalQueue.clear_remember_cache() drops every remembered
   approval; put_config calls it after a successful save. Without
   this, tightening a rule was silently bypassed by any in-flight
   remembered approvals until their window expired.

2. Better 503 / "unavailable" messaging (B8 + the broader issue).
   The stub handler's 503 message used to read "Live approvals
   unavailable in this mode (sidecar)" even when the real cause was
   the feature being turned off in addon config — the user had no
   way to tell from the UI. Updated to call out all three causes
   (feature off, sidecar, ImportError) and point at the addon log.
   The pending-list JS now checks policyState.enabled first and
   shows "Tool Security Policies is turned off" when that's the
   actual reason, falling back to the server's 503 message
   otherwise.

3. Mirror the master toggle onto the Tool Security Policies tab.
   Was only exposed in Server Settings before — users on the
   Policies tab had to navigate away to find the on/off switch.
   New checkbox at the top of the tab posts to the same
   /api/settings/features endpoint, so the two surfaces are live
   mirrors of the same addon-config flag.

* test(policy): fix JS-harness drift guard + lock policy-tab behaviour (#966)

The merged-in JSDOM behaviour test (#1425) failed collection because
its hardcoded _TOP_LEVEL_ELEMENT_IDS list didn't yet know about the
policy-tab handlers this PR adds (policy-master-toggle, policy-save-global-btn).
Add them, plus matching DOM stubs in _build_min_dom so the init pass
doesn't throw on the addEventListener calls.

While the file is open, add three behavioural tests that pin the new
condition-builder UX wiring:

- Master toggle change POSTs to /api/settings/features with the
  enable_tool_security_policies flag (so the on-tab toggle stays a
  true mirror of the Server-Settings checkbox).
- /api/policy/pending 503 renders "Tool Security Policies is turned
  off" when the addon flag is off (avoids the old misleading
  "sidecar / unavailable" copy).
- /api/policy/pending 503 propagates the server's addon-log message
  verbatim when the flag IS on but the queue is unreachable (so users
  know where to look for ImportError details).

The parse-coverage path catches syntax breaks already; these tests
catch behavioural regressions on top of it.

* refactor(policy): address 2nd-round review findings (#966)

Verified all 23 findings from the 2nd pr-review-toolkit pass against
the code; fixed 22 (skipping #13 — the JSDOM seq-cancel race test is
high-effort to author reliably and the production guard is small
enough that bench-level review catches regressions).

## Correctness / silent-failure

- ApprovalQueue PENDING_CAP eviction now sorts by
  `(decision == "pending", created_at)` so resolved entries evict
  first. When a still-pending entry MUST be evicted (cap full, no
  resolved to drop), `.set()` its event so any waiter in
  `_wait_for_decision` wakes immediately instead of blocking the
  full wait_seconds against a row that no longer exists.
- Middleware: log INFO with old + new token on the
  reissue-after-sweep branch so operators can correlate
  "approval row keeps reappearing" with the actual cause.
- Middleware: scope `clear_remember_cache` to "rules actually
  changed" — editing only wait_seconds / approval_ttl_minutes no
  longer blows away in-flight remembered approvals.
- Policy: model_validator requires `wait_seconds < approval_ttl_minutes * 60`
  so the middleware can't repeatedly issue fresh pending entries
  because the wait outlasted the TTL.
- value_sources: cache key uses `urllib.parse.urlencode` so a future
  param value containing `=`/`&` can't collide with another key.
- ApprovalQueue: `approve`/`deny` on unknown token now logs WARNING
  (security-gating endpoint, suggests UI bug or token probing).
  Already-decided stays INFO (legitimate race).

## UI

- settings_ui.policyState gains an `enabledKnown` tri-state bit so
  downstream branches (`policyLoadPending`'s "feature off" copy,
  master-toggle revert) don't false-confidently route to the
  "disabled" message when the features fetch actually failed.
- Master-toggle change handler reverts the checkbox on save failure
  AND syncs from `policyState.enabled` after a successful load —
  no more "UI says on, server says off" drift.
- `policyLoadConfig` appends "(response body unparseable)" when the
  500 body isn't JSON (e.g. HTML error page from a misrouted sidecar),
  so the operator sees more than just "HTTP 500".
- `policyLoadPending` surfaces fetch errors inline ("Lost contact
  with server, retrying") instead of silently freezing the list.
- `fetchToolSchema` records `lastValueSourceError` on failure so the
  hint banner explains why the value dropdown silently downgraded
  to free text.

## Tests

- test_handlers: `test_put_config_clears_remember_cache_when_rules_change`
  + `test_put_config_preserves_remember_cache_when_only_timing_changes`
  lock in the scoped invalidation.
- test_approval_queue: `test_create_evicts_resolved_entries_before_pending`,
  `test_evicting_pending_wakes_its_waiter`,
  `test_create_after_sweep_still_evicts_when_pending_fills_cap`
  cover the new eviction rules. The strengthened
  `test_find_or_create_lock_blocks_concurrent_create_under_real_race`
  inserts a yield point inside the lock body so the lock actually
  matters to the assertion (the previous test would pass even
  without the lock under anyio's cooperative scheduler).
- test_middleware: `test_swept_pending_during_wait_is_reissued_with_fresh_token`
  exercises the previously-untested reissue branch.
- test_schema_handlers: `test_value_source_empty_result_not_cached`
  proves the empty-result no-cache guard actually triggers a refetch.
- test_settings_ui_js_behavior: master-toggle test now JSON-parses
  the POST body and structurally asserts
  `flags.enable_tool_security_policies is True` instead of loose
  substring matching.

## Comments

- approval_queue.py PENDING_CAP docstring now matches implementation
  (was promising "resolved first" before the implementation actually
  did it).
- evaluator.py: gt-branch comment example uses ">" not "<".
- handlers.py: dropped cross-reference to settings_ui that would rot.
- middleware.py: trimmed "fast on warm disk" speculation.
- value_sources.py: trimmed "(WebSocket reconnect, auth lapse)"
  speculation in the empty-cache comment.
- settings_ui.py: four comments still saying "predicates" updated
  to "conditions" to match user-facing terminology.

* fix(ui): blank value on eq/in/etc coerces to op=exists (#966)

User expects 'leave value blank to gate on the argument's mere
presence regardless of value' to work across the equality-ish ops,
not just op=exists. Earlier I had the form reject blank value for
anything other than exists with a 'value is required' error.

Now: for eq / neq / in / not_in / contains / exists, leaving the
value blank silently coerces the predicate to op=exists on save.
The condition row then reads as 'args.* exists' which is the right
description of what's stored.

Ops that genuinely need a value (regex / gt / lt) still raise
'value is required for op=...'. The hint text under each op
updated to call out 'Leave blank to gate on any value' where it
applies.

* docs(addon): drop beta tag from Tool Security Policies (#966)

The feature is stable enough to ship as a default-supported addon
config option, not a beta. Also corrects two pieces of doc drift
that landed here originally:

- 'approval URL' wording → 'tell the user to open the Tool Security
  Policies tab' (the URL field was dropped earlier in this PR)
- 'predicates' → 'conditions' (matches the user-facing terminology
  the UI now uses)

Touches both prod and dev addon directories (config.yaml-driven UI
text + the rendered DOCS.md).

* docs(beta): describe 3-path enabling (dev addon, stable + web UI, env vars) (#1164)

* feat(config): advanced settings registry + beta master toggle field (#1164)

- Add ``ADVANCED_SETTINGS_FIELDS`` registry (21 fields across connection,
  search, operations, diagnostics, tools_surface, beta_codemode sections)
- Add ``_ADVANCED_SETTINGS_BOUNDS`` and ``_ADVANCED_SETTINGS_CHOICES``
  dicts for UI/POST validation
- Add ``BETA_FEATURE_FIELDS`` tuple for master-gate enforcement
- Add ``enable_beta_features`` Settings field (alias ENABLE_BETA_FEATURES,
  default False) as the master beta toggle
- Update ``FEATURE_FLAG_FIELDS``: add ``enable_beta_features`` at front,
  ``enable_code_mode`` at end; reorder for UI grouping
- Extend ``BACKUP_OVERRIDE_FIELDS`` from 3 to 5 entries (add
  ``auto_backup_dir`` and ``auto_backup_calendar_lookahead_days``)
- Extend ``_apply_backup_overrides`` to handle ``str`` type; widen
  ``coerced`` annotation to ``bool | int | str``; add bounds check for
  ``auto_backup_calendar_lookahead_days`` (1..365)
- Add coverage gate test asserting every Settings env alias is registered
  in one of the three panel registries (or in the explicit ALLOWLIST)
- Add ``test_enable_beta_features_default_false``

* refactor(config): code-review fixups — docstring clarity, stricter bool reject, null-byte guard (#1164)

* feat(settings-ui): per-tool env-pin for DISABLED_TOOLS / PINNED_TOOLS (#1164 addendum)

Add env_pinned_tools() and effective_tool_config() helpers so tools listed
in DISABLED_TOOLS / PINNED_TOOLS env vars stay read-only at runtime even
after tool_config.json has been written. The _get_tools GET handler now
includes env_pinned metadata per tool entry; _save_tools rejects incoming
flips of env-pinned tools with HTTP 409. Server.py startup path updated to
use effective_tool_config() so env pins apply at boot.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(config): master beta gate + advanced overrides apply (#1164)

- Rewrite _apply_feature_flag_overrides: lift addon-mode short-circuit
  for beta fields (enable_beta_features + BETA_FEATURE_FIELDS), add
  master gate that forces all 5 beta sub-flags to False when master is
  off regardless of env/file state
- Update get_feature_flag_origin: beta fields never return "addon" —
  they follow standalone precedence in either mode
- Add _apply_advanced_overrides: reads feature_flags.json for all editable
  ADVANCED_SETTINGS_FIELDS entries; skips display-only fields; validates
  types, bounds, and choices before setattr
- Wire _apply_advanced_overrides into get_global_settings (runs after
  feature-flag + backup passes)
- Add 18 unit tests covering master gate semantics, addon-mode carve-out,
  advanced-override int/str/float/display-only/out-of-bounds/invalid-
  choice cases, and backward-compat (pre-master override files)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(settings-ui): code-review fixups — drop dup env_pinned, settings param, msg + tests (#1164)

Address code-quality review feedback on 32d67b4d:

- Drop the redundant per-tool-entry `env_pinned` field from the GET
  /api/settings/tools response. The top-level `env_pinned` map is the
  single source of truth; UI does O(1) lookups against it.
- `effective_tool_config()` now accepts an optional `settings`
  parameter (mirrors `load_tool_config()`); restores dependency
  injection at the server.py startup callsite (`self.settings`).
- 409 rejection message uses comma-joined names instead of Python list
  repr for better human readability; structured `context.rejected`
  remains for programmatic access.
- Add `test_get_tools_includes_env_pinned_map` test covering the new
  GET response field.
- Symmetrize `get_data_dir.cache_clear()` calls at both ends of each
  tmp_path test so cross-test cache pollution can't leak in.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(config): code-review fixups — docstring accuracy, top-level import, null-byte test (#1164)

- Correct _apply_advanced_overrides docstring: most advanced fields
  ARE in the addon config.yaml schema (backup_hint, verify_ssl,
  enabled_tool_modules, etc.) and are handled correctly via the env-
  var-wins check because start.py exports them. Only code_mode_* and
  mcp_server_version are file-only in either mode.
- Move `from typing import Any` from function body to module-level
  imports (stdlib typing — no need to defer).
- Add test_advanced_override_str_field_with_null_byte_rejected to
  cover the previously-unexercised null-byte reject branch.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(settings-ui): render env-pinned tool rows as read-only + update addon translations (#1164)

- Add `toolEnvPinned` module-level map; populate from `data.env_pinned` in `loadTools()`
- Tool rows for env-pinned tools get `.env-pinned` class, all inputs disabled, and
  a `feature-locked-note` banner naming the env var (DISABLED_TOOLS or PINNED_TOOLS)
- Group master toggle excludes env-pinned tools from bulk enable/disable
- Update `pinNotice` copy to describe the env-pinned lock-until-unset semantics
- Update `homeassistant-addon-dev/translations/en.yaml` descriptions for
  `disabled_tools` and `pinned_tools` to reflect that they are operator-level
  locks, not seed-only values
- Add JSDOM behavioural tests for env-pinned disabled and pinned tool rows

* feat(settings-ui): /api/settings/advanced GET+POST handlers (#1164)

* feat(settings-ui): render advanced settings sections in Server Settings tab (#1164)

* feat(settings-ui): beta master toggle + nested sub-row gating + 409 rejection (#1164)

* feat(settings-ui): nest code-mode sub-numerics under enable_code_mode (#1164)

* test(addon): assert start.py auto-enables master beta in dev addon mode + stable schema absence (#1164)

* feat(addon): start.py auto-enables ENABLE_BETA_FEATURES=true when dev addon options carry beta keys (#1164)

* fix(settings-ui): post-CI/post-review fixes — backup-config str+range, addon-mode gate skip, e2e env, JSDOM ids, stale comments (#1164)

* fix(config): beta-sub-flag origin returns addon in dev mode (env var presence as signal) (#1164)

* fix(tests): addon-mode save tests use non-beta flag matching new origin semantics (#1164)

* refactor(settings-ui): loadAdvancedSettings error parity + atomic-write helper reuse + beta_sub_flags via API (#1164)

* refactor(config): narrower setattr errors + hasattr precheck + Gemini coerced-decl nit (#1164)

* refactor(settings-ui): hoist override-file read + display-only warning + missing test coverage (#1164)

* test(settings-ui): section-uniqueness + registry-disjoint + tighter env-pin assertions + accurate docstring (#1164)

* refactor(addon): extract maybe_auto_enable_beta_master helper + real unit tests (#1164)

* test(settings-ui): JSDOM coverage for advanced sections + master live-render; verify file untouched on 400 (#1164)

* refactor(config): NamedTuple registries + import-time validator; finish silent-failure + JSDOM nesting tests (#1164)

* fix(tests): restore standalone-mode assertion + add adv section containers (#1164)

`replace_all` from an earlier sweep had pivoted the standalone-mode save
assertion onto a beta sub-flag, so the master-gate guard now rejected
the request and the test failed. Restore the non-beta `enable_tool_search`
flag here — the assertion is about the unified save-contract shape, not
the beta path.

JSDOM `TestAdvancedSectionRender` tests were failing because MIN_DOM
lacked the five `adv*` section containers; `renderAdvancedSection` would
silently no-op (getElementById returned null) and the assertions fired
against an empty body. Add `advConnection`, `advSearch`, `advOperations`,
`advToolsSurface`, `advDiagnostics` to `_TOP_LEVEL_ELEMENT_IDS` so
`_build_min_dom` emits `<div>` containers for them.

* fix(tests): adopt master beta gate + new advanced handler keys (#1164)

Three failures surfaced after rebasing onto upstream/master:

1. ``test_returns_all_handler_keys`` expected the pre-#1164 handler
   set. Add ``get_advanced_settings`` / ``save_advanced_settings``
   to the expected keys.

2. ``test_tools_filesystem.TestFeatureFlag::test_enabled_with_*``
   broke because the master beta gate now forces every beta sub-flag
   False at runtime when ``ENABLE_BETA_FEATURES`` is unset. Set both
   env vars together in the enabling tests so they exercise the
   sub-flag bool parsing in isolation, and add an explicit test for
   the gated behavior so a future regression in
   ``_apply_feature_flag_overrides`` surfaces here too.

3. ``test_yaml_config_tool.enable_flag`` fixture sets
   ``ENABLE_YAML_CONFIG_EDITING`` but didn't set the master, so the
   cached settings landed with the sub-flag forced False — would have
   broken next-up after the filesystem tests. Set
   ``ENABLE_BETA_FEATURES`` alongside.

* fix(policy): contains operator case-insensitive on list-membership branch

Pre-fix:
  case "contains":
      if isinstance(val, str) and isinstance(pv, str):
          return pv.lower() in val.lower()           # CI
      return isinstance(val, (list, tuple, set)) and pv in val  # case-SENSITIVE

The string-in-string branch was already case-insensitive (matching the
``_ci``-equivalent treatment that ``eq`` / ``in`` / ``not_in`` apply),
but the list-membership branch fell through to Python's default ``in``
operator. A rule listing ``["light.kitchen"]`` would not fire on an
LLM passing ``["Light.Kitchen"]`` — silent gate failure.

Bring it in line with the other string-op branches via per-element
``_ci``, which passes non-string entries through unchanged so
mixed-type collections still get natural equality semantics.

Caught by Gemini Code Assist on #1431; addressing inline rather than
opening a follow-up because the policy module is now in master
(#1421) and any reviewer running the suite would see the
case-sensitivity asymmetry in the existing TestCaseInsensitive class.

* feat(settings-ui): addon-aware locked banner, beta-at-bottom, danger warning, dual save buttons, fork-dev stable copy (#1164)

Five user-feedback fixes against the Server Settings UI:

1. Locked-banner copy adapts in addon mode. The standalone "Set via
   env var X — unset it to edit here." copy is misleading in HA
   addon mode where the operator has no env-var surface (start.py
   writes the env vars from /data/options.json; Supervisor writes
   the rest). Endpoints now return is_addon; the JS helper
   envLockedNoteHtml swaps in addon-aware copy that points users at
   the addon Configuration tab. Master beta gets an extra hint
   explaining the auto-enable rule.

2. Beta block rendered into a dedicated bottom-of-panel betaBody
   container instead of featuresBody. The dangerous block sits last
   so users see safer settings first; a "Beta features (dangerous)"
   header in warning color marks the boundary.

3. enable_beta_features help-text rewritten to lead with an explicit
   danger warning (permanent damage to HA, no warranty, take a
   backup, own risk). Mirrored as a blockquote at the top of the
   dev addon's beta-options section in DOCS.md so the addon UI also
   surfaces the risk.

4. Save button redesigned — primary-CTA styling (bigger, accent
   background, hover state), duplicated at the top of the panel so a
   user scrolling either end can hit save, and paired with a
   prominent two-step note explaining that Save → Restart are both
   required for changes to take effect.

5. New scripts/fork-dev/copy-stable.sh + restore-dev.sh let
   maintainers whose only HA test path is the fork-dev addon flip
   homeassistant-addon-dev/ between dev-flavor and a stable-mirroring
   "stable test" flavor. Round-trip clean — copy → restore restores
   the index exactly.

JSDOM behavioural coverage added for each of (1)-(4); MIN_DOM updated
with the new top-row + beta-section element ids.

* fix(tests): JSDOM beta-block tests assert against production HTML / fixed regex (#1164)

Three CI failures from the previous commit:

- ``test_beta_section_header_present_with_danger_styling`` and
  ``test_two_step_save_note_present`` asserted on static
  ``panel-server`` markup that lives in the rendered HTML template,
  not in any JS-populated container. MIN_DOM doesn't replicate the
  full panel-server children (by design — it's a minimal handler
  stub), so the assertions never found their target strings. Switch
  both tests to assert directly against ``_SETTINGS_HTML``; the
  presence of the markup at the template level is the property we
  actually want to lock down.

- ``test_beta_rows_render_into_betaBody`` used a regex
  ``<div id="betaBody">(.*?)</div>\s*<div`` whose ``</div>\s*<div``
  boundary matched the very first nested ``</div><div`` *inside* the
  master row (after the ``.feature-name`` close tag), so ``bb_content``
  only captured the header of the master row. Anchor the boundary on
  ``</div>\s*</body>`` instead — non-greedy capture between the
  ``betaBody`` open tag and the body close still gives the full
  container content. Added a fallback path that asserts on class
  markers + non-leakage to featuresBody if the regex still misses.

* feat(settings): fix stuck-master bug, master in dev schema, cascade-clear, drop connection panel, sync backup_hint+verify_ssl (#1164)

Six fixes against the Server Settings + addon Configuration surfaces:

1. ``maybe_auto_enable_beta_master`` now requires ``config.get(key) is
   True`` instead of ``key in config``. The bare presence check fired
   the moment HA Supervisor merged the dev addon's schema defaults
   into options.json, locking the master "on" with origin=env on
   every fresh dev install even when all 5 sub-flags were False.
   New unit tests cover the truthy / all-false / one-of-many /
   non-bool-truthy permutations so a future regression on the
   semantic fails loudly.

2. Master ``enable_beta_features`` moved into the dev addon
   Configuration tab (schema + options + translations + DOCS.md).
   Defaults ON in dev — beta tools are the channel's purpose, so a
   fresh install lights them up without the user round-tripping
   to the web UI. Stable's schema is unchanged; the standalone web
   UI master path remains the gate there. start.py writes
   ENABLE_BETA_FEATURES from options.json only when the key is
   present; ``get_feature_flag_origin`` now treats the master like
   the sub-flags (env-var presence in addon mode → origin=addon).
   ``maybe_auto_enable_beta_master`` is kept as a one-cycle legacy
   bridge for installs whose options.json pre-dates the master key.

3. Beta sub-flags also default ON in the dev addon options block.

4. Master-off cascade clear: flipping ``enable_beta_features=False``
   in ``_save_feature_flags`` now also writes False for every truthy
   beta sub-flag in the same save. Without this, sub-flags stayed
   True in the override file and resumed the moment the master was
   flipped back on — UX bug the user reported as "having to turn off
   every toggle individually." JS mirrors the cascade so sub-rows
   visually de-toggle on master-off without a page reload.

5. "Connection (display only)" section removed from the Server
   Settings panel. The read-only HOMEASSISTANT_URL / TOKEN /
   SUPERVISOR_TOKEN fields just wasted space — operators already see
   them in addon logs and configuration. Registry entries kept in
   ADVANCED_SETTINGS_FIELDS so the API still returns them (env-pin
   debugging, future surfaces). ``verify_ssl`` moved from
   ``connection`` to ``operations`` so it still renders in the panel.

6. ``backup_hint`` and ``verify_ssl`` now sync between addon
   Configuration and the web UI like feature flags do. New
   ``ADDON_SYNCED_ADVANCED_FIELDS`` set drives the origin helper
   (returns ``'addon'`` in addon mode for these) and the save
   handler (routes their writes through Supervisor
   ``/addons/self/options`` instead of the override file).

Cross-surface gate visibility note appended to every beta sub-flag
description in ``translations/en.yaml`` (and a top-of-options note
on the master) so addon Configuration users know the web UI master
gates everything. Locked-banner addon-mode copy from the earlier
commit was already in place; tests in this commit re-target
fixtures from the removed connection section to ``search``.

* fix(addon-dev): beta sub-flags default OFF — only the master defaults ON (#1164)

Mis-read of the user's intent in the previous commit. The intended
shape for the dev addon's fresh-install defaults is:

  enable_beta_features:                   true   ← gate unlocked
  enable_yaml_config_editing:             false  ← user opts in
  enable_filesystem_tools:                false  ← user opts in
  enable_custom_component_integration:    false  ← user opts in
  enable_code_mode:                       false  ← user opts in
  enable_lite_docstrings:                 false  ← user opts in

The previous commit defaulted every sub-flag to true alongside the
master, which would have shipped every beta tool live on a fresh
dev install — including filesystem writes and the YAML config
editor. Each sub-flag mutates the user's HA system, so they remain
opt-in even on the dev channel; the master being on just means the
gate is open.

* revert(scope): drop scripts/fork-dev/ — maintainer tooling, wrong repo (#1164)

Pushed these in 69b7a235 as part of task #17. They're personal-fork
test tooling for the fork-dev addon workflow — they have no business
on master. Removing from the PR; they're still in this branch's
git history if anyone needs to fish them back out.

* fix(addon+ui): #1431 review pass — restore MCP_HOST, gate sub-flag env writes, sane save (#1164)

Round-2 review pass addressed 14 verified findings:

**Bugs**:
- **MCP_HOST regression** restored. The PR's earlier merge of
  upstream/master dropped the `bind_host = os.getenv("MCP_HOST",
  "0.0.0.0")` block introduced by #1434/#1436. `mcp.run(host=...)`
  now goes through `bind_host` again.
- **Beta sub-flag env vars** are now written only when the matching
  key is present in `/data/options.json`. start.py was writing
  ENABLE_YAML_CONFIG_EDITING=false (etc.) unconditionally on stable
  addon, marking those fields origin='addon' in the web UI; the
  user's save then POSTed to Supervisor which rejected because the
  keys are not in stable's schema.
- **Env-pinned tool save 409**: `_save_tools` now accepts re-sends
  whose state matches the env-pinned value, rejecting only true
  mismatches. The JS `saveConfig` POSTs the entire `toolStates` map
  including env-pinned rows; without this fix every save with
  `DISABLED_TOOLS` / `PINNED_TOOLS` non-empty would 409.
- **Cascade-clear** now reads the persisted override file directly
  via `_read_feature_flag_override_file()` instead of
  `get_global_settings()` (whose master gate had already forced
  sub-flags to False, hiding stale-true overrides). Also force-False
  sub-flags that are explicitly True in the same payload as
  master=false, so `{master:false, sub:true}` no longer lands an
  inconsistent persisted state.
- **Master beta-gate check** is now applied uniformly (no more
  "skip in addon mode" carve-out). The skip existed because the
  legacy auto-enable wrote ENABLE_BETA_FEATURES from sub-flag
  presence; now start.py writes the master from its own options
  key, so the gate is sound to apply in both modes.
- **Dev-upgrade silent-disable warning**: start.py logs when
  master=false but a sub-flag is true in options.json, so an
  operator who toggled the master off in Configuration after a
  pre-#1164 dev install sees why their previously-enabled beta
  tools went away.
- **Mixed-batch advanced save** is now split client-side. The
  server-side guard that returned 500 stays as a defense, but the
  UI no longer triggers it — `saveAdvancedSettings` partitions
  `_advancedDirty` into addon-routed and file-routed batches.

**Logging / defensive code**:
- Supervisor failure in `_save_advanced_settings` now logs before
  returning, matching the sibling `_save_feature_flags` /
  `_save_backup_config` handlers.
- The three `assert sup_err is not None` sites that would crash
  under `python -O` now explicitly return INTERNAL_ERROR (covers
  the addon-route paths in feature flags and advanced settings).
- `_apply_advanced_overrides` narrows `except Exception` to
  `(ValueError, TypeError)`, matching the parallel
  `_apply_feature_flag_overrides` exception shape.
- `maybe_auto_enable_beta_master` now logs which sub-flag(s)
  triggered the legacy bridge when it fires, with a removal-
  candidate note in the docstring.

**Docs / UX**:
- Docstring drift fixed in `get_feature_flag_origin`,
  `_get_advanced_settings`, `_save_advanced_settings`.
- `envLockedNoteHtml` master copy rewritten — origin='env' on the
  master is now only the legacy-bridge path, not the default
  dev-addon path.
- "Bottom" save button now actually at the bottom of `panel-server`
  (below the beta block and its code-mode sub-numerics). Second
  two-step save note duplicated near the bottom row so users
  editing dangerous beta toggles also see it.

Findings deferred to follow-up (legit but bigger than this pass):
- A.2 concurrent-save read-modify-write race (needs an
  `asyncio.Lock` around the override-file path; functionality is
  safe today because the runtime gate hides the persisted-state
  inconsistency).
- A.4 master-flip via addon Configuration tab → no cascade (the
  runtime gate + new log warning cover the observable surface).
- F.* missing tests for new behaviours — adding in a follow-up
  commit.

* test(settings): cover #1431 review pass fixes (#1164)

- ``test_env_pinned_noop_resend_does_not_409`` — JS saveConfig POSTs
  the entire toolStates map; env-pinned no-op resend must be accepted.
- ``test_env_pinned_value_mismatch_still_409s`` — pin true flips still
  rejected.
- ``test_save_features_cascade_clears_subflag_even_when_payload_says_true``
  — in-payload {master:false, sub:true} → 409 instead of inconsistent
  persisted state.
- ``test_save_features_cascade_reads_override_file_not_post_gate_settings``
  — cascade reads the file directly so it catches stale-true sub-flag
  overrides hidden by the master gate on the resolved Settings.
- ``test_stable_addon_does_not_declare_enable_beta_features`` and
  ``test_dev_addon_declares_enable_beta_features_master_in_schema`` —
  lock the schema asymmetry that makes the dev/stable channel
  distinction work.
- ``test_dev_addon_defaults_every_beta_subflag_to_false`` — sub-flags
  remain opt-in even on dev.

* fix(settings): serialise override-file RMW + cover addon-synced advanced save (#1164)

A.2 (concurrent-save race): both ``_save_feature_flags`` and
``_save_advanced_settings`` touch the same ``feature_flags.json``
override file. Two near-simultaneous requests could interleave their
read/merge/write and clobber each other's persisted state. The
runtime master gate hid the inconsistency for beta sub-flags, but
other field combinations (advanced + feature-flag in the same
window) would have lost state silently.

Wrap the RMW window in an ``asyncio.Lock`` (lazy-initialised under
the live event loop so module import doesn't bind to no loop). Both
handlers acquire the same lock, so saves serialise correctly.

F.35 + F.40 (test coverage):
- ``test_save_advanced_addon_synced_routes_through_supervisor`` —
  ``backup_hint`` / ``verify_ssl`` saves in addon mode call
  ``_supervisor_merge_and_post_options``, return ``mode='addon'``,
  do NOT write the override file.
- ``test_save_advanced_addon_synced_supervisor_4xx_surfaces_validation_failed``
  — Supervisor schema rejection surfaces as
  ``CONFIG_VALIDATION_FAILED`` with the Supervisor status code
  preserved, not a generic 502.
- ``test_origin_for_addon_synced_field_is_addon_in_addon_mode`` —
  pins the origin matrix: ``backup_hint`` / ``verify_ssl`` come back
  ``origin='addon, editable'`` in addon mode; non-synced env-pinned
  fields stay ``origin='env, locked'``.

* fix(settings): A.8 preserve sub-flag visual + cover remaining deferred test gaps (#1164)

A.8 — JS master-off no longer flips sub-flag checkboxes visually.
Previously the cascade-clear set ``_lastFeatureFlags[sub].value =
false`` on master-off so the re-render painted sub-rows unchecked.
That fought the user's mental model — they expressed intent on
individual sub-flags, master-off shouldn't visually wipe that
context. Now sub-rows stay checked + dimmed + disabled after the
master flips off; the server-side cascade still clears the values
on disk, so a refresh shows the cleared state, and a failed save
leaves the visible checked state matching the actual on-disk state.

F.37 — ``test_master_off_click_dims_subrow_live_without_clobbering_value``
dispatches a real change event on the master input and asserts the
sub-row goes dimmed + disabled but keeps its checked attribute.

F.38 — three cross-mode origin permutations for the master:
  - dev-addon env-set → 'addon'
  - standalone env-set → 'env'
  - addon mode + file override (no env) → 'file'

F.42 — ``test_dual_save_buttons_mirror_disabled_and_status_state``
probes both ``advSaveStatus`` / ``advSaveStatusTop`` text and both
buttons' disabled state via a hidden probe div, asserts the mirror
holds at save completion.

* fix(tests): probe JSDOM properties via probe div; assert mid-save mirror (#1164)

Two test bugs in 25b45ab7's new assertions:

1. ``test_master_off_click_dims_subrow_live_without_clobbering_value``
   asserted on the literal string ``"checked"`` in the serialised
   DOM. ``input.checked`` is a DOM property (not an HTML attribute),
   so JSDOM's serialiser doesn't emit it. The .checked state IS true
   at the property level — just invisible to the regex. Probe via a
   hidden ``__sub_state_probe`` div that reads the .checked / .disabled
   properties and writes them to data-* attributes.

2. ``test_dual_save_buttons_mirror_disabled_and_status_state`` probed
   AFTER the full save+reload chain, by which point
   loadAdvancedSettings() had blanked both status text els.
   Restructure the test to probe SYNCHRONOUSLY after ``click()``,
   while saveAdvancedSettings is mid-flight (status="Saving…",
   both buttons disabled). That's the actual mirror invariant we
   wanted to lock — the helpers ``_setAdvSaveDisabled(true)`` and
   ``_setAdvSaveStatus('Saving…')`` run synchronously before the
   first await.

* feat(settings): drop sub-flag cascade-clear; restore advanced_debug_logging translation; clearer Save-button copy (#1164)

Three user-reported fixes from stable-addon testing:

1. Stable add-on's ``translations/en.yaml`` was missing the
   ``advanced_debug_logging`` description — the schema declares the
   toggle in ``config.yaml:49`` but no translation ever shipped, so
   the addon Configuration UI showed an unlabelled checkbox. Add the
   missing entry (same wording as dev's translation).

2. Big "Save advanced settings" button used to say "Nothing to save."
   after the user toggled a feature flag (beta master, Tool Search,
   etc.). Feature-flag toggles auto-save on click via
   ``saveFeatureFlag`` — they never enter ``_advancedDirty``, so the
   advanced-save button sees nothing to do. When the restart banner
   is already showing (recent feature-flag save), surface that
   explicitly: "No advanced changes to save — your feature-flag
   toggles already saved on click. Click Restart above to apply
   them." Falls back to the original "Nothing to save." copy when
   there's no pending restart.

3. Drop the master-off cascade-clear behaviour entirely. The runtime
   master gate in ``_apply_feature_flag_overrides`` already forces
   every beta sub-flag to False whenever the master is off, so the
   tools stay disabled at runtime regardless of file state. Leaving
   the sub-flag values in the override file means toggling the
   master off → on restores the user's prior sub-flag selections
   automatically; the previous cascade-clear forced users to
   re-check each sub-flag after every master cycle, which was the
   wrong UX trade for an opt-in beta surface.

The master-gate check is unchanged — it still rejects payloads that
try to enable a sub-flag while the effective master is off, so the
"sub true while master false in same payload" inconsistency still
can't land. The cascade-clear was a separate (now-removed) defence.

Tests updated:
- ``test_save_features_master_off_preserves_subflag_values`` — was
  ``test_save_features_cascade_clears_subflags_when_master_off``;
  asserts the new "preserve" semantics.
- ``test_save_features_master_on_restores_runtime_subflag_values``
  — new round-trip test for master off → on restoring sub-flags.
- ``test_save_features_payload_master_false_sub_true_rejected_by_gate``
  — renamed; asserts gate rejection still covers the inconsistent
  payload.
- ``test_save_features_cascade_reads_override_file_not_post_gate_settings``
  — deleted (no cascade, no reason to test cascade's read path).
- ``test_save_features_master_off_applied_dict_contains_only_master``
  — new no-cascade pin; ``applied`` carries only the master flip.

* fix(settings): #1431 review pass — address 13 verified findings (#1164)

Round-3 review pass landed 13 verified findings; the cosmetic
"persisted-value visible in UI after F5" item was explicitly skipped
per user direction (UI shows post-gate value when master off; user
accepts this since beta tools are actually disabled at runtime — the
preserve-across-master-cycle UX is at the data layer, not the visual).

Source fixes:

- **Save button copy is now source-blind** — the previous "your
  feature-flag toggles already saved on click" claim only held when
  ``saveFeatureFlag`` raised ``restartNotice``; tool-config pin saves,
  backup-config saves, and cross-tab ``restart-required`` broadcasts
  also raise it. New copy: "a restart is pending. Click Restart above
  to apply your prior changes." (#2)
- **Stale F.37 test docstring** describing the deleted server-side
  cascade rewritten. (#3)
- **Beta-gate INFO log noise** — cascade-clear removal meant the gate
  could fire its "forcing %s=False" line every Settings rebuild,
  spamming addon logs once a user had truthy sub-flags persisted.
  Dedup via ``_BETA_GATE_LOGGED`` set per process, cleared on
  ``_reset_global_settings``. (#9)
- **Lazy-lock docstring** updated to reflect Python 3.13 semantics
  (``asyncio.Lock()`` no longer takes a loop arg; the lazy pattern
  still serves test fixtures and single-loop deployment, with the
  invariant documented). (#10)
- **Addon-mode carve-out comment** clarified to distinguish dev
  (master in schema) from stable (master web-UI-only). (#14)
- **probe-div null branch** in F.37 now writes ``data-error`` so a
  failing test points at "selector missed" vs "value flipped"
  unambiguously. (#17)

Tests added:

- ``test_translations_cover_every_schema_key`` — parity check that
  every ``schema:`` key has a non-empty translation ``name`` and
  ``description``. Parameterised across stable + dev addons. Pins
  the class of silent gap that this PR's ``advanced_debug_logging``
  fix addressed. (#4)
- ``test_save_features_acquires_override_file_lock`` +
  ``test_save_advanced_acquires_override_file_lock`` — counting-lock
  wrapper asserts ``async with _get_override_file_lock()`` runs
  exactly once in each file-mode write path. Pin against a regression
  that silently bypasses concurrent-save serialisation. (#5)
- ``test_dual_save_buttons_mirror_disabled_and_status_on_post_failure``
  — exercises the 500-response branch of the dual-save mirror so a
  regression that broke ``_setAdvSaveStatus``/``_setAdvSaveDisabled``
  for error paths only would still fail. (#6)
- ``test_save_features_master_on_restores_subflag_values_in_addon_mode``
  — addon-mode round-trip mirror of the existing standalone restore
  test; asserts the Supervisor merge-and-post call carries only the
  master flip-on and never zeroes out sub-flag values. (#7)
- ``test_save_button_nothing_to_save_when_no_dirty_and_no_restart`` +
  ``test_save_button_restart_pending_hint_when_dirty_empty_but_restart_showing``
  — both branches of the empty-dirty Save click are exercised; the
  restart-pending branch asserts the copy is source-blind. (#8)

Deferred per user direction:
- #1 (file-vs-Settings visual after F5): user accepts the current
  behavior (UI shows post-gate value; runtime tools actually
  disabled when master off; data-layer preserve still works
  end-to-end).
- #11/#12/#13 (code-simplifier helper extractions): skipped as
  complicated to implement without behavior risk.
- #15 (pre-#1164 users with already-cleared sub-flags): release-note
  concern, not a code change.
- #16 (lock fragility under future thread-pool dispatch):
  speculative future-risk; not actionable today.

* feat(settings): version footer + Patch76 review feedback

Adds the running ha-mcp version to the settings UI footer (issue
#1466). ``info.version`` flows from ``/api/settings/info`` →
``HA_MCP_BUILD_VERSION`` env var on addon builds (set by both
stable and dev Dockerfiles) → package metadata fallback. Empty on
older deployments without the field.

Addresses Patch76's review (no blockers, all bundleable):

- ``OverrideField`` folds the structurally-identical
  ``FeatureFlagField`` and ``BackupOverrideField`` into one
  NamedTuple; aliases preserve readable construction sites.
  ``AdvancedField`` keeps its own type since it c…
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants