Add opt-in observed URL domain audit pipeline for tool responses and write sinks #7987
Merged
Copilot AI changed the title from "[WIP] Add support for safe-outputs URL domain surfacing" to "Add opt-in observed URL domain audit pipeline for tool responses and write sinks" on Jun 23, 2026
Contributor
Pull request overview
This PR adds an opt-in URL domain audit pipeline to the MCP Gateway so operators can see which URL domains were encountered at runtime (from tool responses and write-sink bodies) and persist them as a log-directory artifact (observed-url-domains.json) for downstream safe-outputs consumption.
Changes:
- Adds `--url-domain-audit` / `MCP_GATEWAY_URL_DOMAIN_AUDIT`, wires it into runtime config, and gates auditing via a global logger toggle.
- Introduces an `observed-url-domains.json` logger that records unique domains per source key (server ID and `write-sink`) using atomic file updates.
- Implements domain extraction from tool response payloads (middleware) and write-sink request bodies (guard) using a shared `internal/urlutil` helper, with unit/integration test coverage.
| File | Description |
|---|---|
| internal/urlutil/domains.go | New shared helper to extract URL hostnames from strings and recursively-walked JSON-like values. |
| internal/middleware/jqschema.go | Hooks audit into tool response handling and infers server ID from server___tool names. |
| internal/middleware/jqschema_test.go | Adds unit tests for URL domain extraction helpers. |
| internal/middleware/jqschema_integration_test.go | Adds integration test ensuring observed-url-domains.json is written when audit is enabled. |
| internal/logger/registry.go | Registers the observed-domains logger in gateway init/close sequences. |
| internal/logger/observed_url_domains_logger.go | New logger implementation that persists observed domains per server ID to JSON. |
| internal/logger/init_test.go | Updates registry/init expectations to include the new logger and file. |
| internal/logger/global_helpers.go | Extends the generic logger constraint to include the new logger type. |
| internal/guard/write_sink.go | Audits URL domains found in write-sink tool args/body when enabled. |
| internal/guard/write_sink_test.go | Adds tests validating write-sink auditing writes to observed-url-domains.json. |
| internal/config/config_core.go | Adds a runtime-only Gateway.URLDomainAudit flag (not loaded from config files). |
| internal/cmd/root.go | Wires the CLI/env flag into runtime config and enables auditing in the logger package. |
| internal/cmd/flags_test.go | Ensures --url-domain-audit flag is registered. |
| internal/cmd/flags_logging.go | Registers the new CLI flag with env-default behavior. |
| docs/ENVIRONMENT_VARIABLES.md | Documents MCP_GATEWAY_URL_DOMAIN_AUDIT and the output artifact file. |
Copilot's findings
- Files reviewed: 15/15 changed files
- Comments generated: 3
Comment on lines +60 to +70
    for _, match := range matches {
        parsed, err := url.Parse(match)
        if err != nil {
            continue
        }
        host := strings.ToLower(parsed.Hostname())
        if host == "" {
            continue
        }
        domainSet[host] = struct{}{}
    }
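For context, here is a self-contained sketch of how a loop like the one quoted above might sit inside a complete helper. The later reply mentions tests for upper-case schemes and trailing punctuation, so the sketch handles both. The regular expression, the punctuation cutset, and the name `extractDomains` are assumptions for illustration, not the code in `internal/urlutil/domains.go`.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
	"sort"
	"strings"
)

// urlPattern is a hypothetical matcher; (?i) lets it accept HTTP:// and HTTPS://.
var urlPattern = regexp.MustCompile(`(?i)\bhttps?://[^\s"'<>]+`)

// extractDomains returns the sorted, de-duplicated, lower-cased hostnames
// of every http(s) URL found in text.
func extractDomains(text string) []string {
	domainSet := make(map[string]struct{})
	for _, match := range urlPattern.FindAllString(text, -1) {
		// Drop sentence punctuation that the greedy pattern swallowed.
		match = strings.TrimRight(match, ".,;:!?)]}")
		parsed, err := url.Parse(match)
		if err != nil {
			continue
		}
		host := strings.ToLower(parsed.Hostname())
		if host == "" {
			continue
		}
		domainSet[host] = struct{}{}
	}
	domains := make([]string, 0, len(domainSet))
	for d := range domainSet {
		domains = append(domains, d)
	}
	sort.Strings(domains)
	return domains
}

func main() {
	text := "See HTTPS://Example.COM/a, and (http://docs.example.org/x)."
	fmt.Println(extractDomains(text)) // prints [docs.example.org example.com]
}
```

`url.Parse` lower-cases the scheme itself, but not the host, which is why the explicit `strings.ToLower` on `Hostname()` is still needed for de-duplication.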
Comment on lines 54 to 58
      // closableLogger is a constraint for types that have a Close method.
      // This is satisfied by *FileLogger, *JSONLLogger, *MarkdownLogger, *ServerFileLogger, and *ToolsLogger.
      type closableLogger interface {
    -     *FileLogger | *JSONLLogger | *MarkdownLogger | *ServerFileLogger | *ToolsLogger
    +     *FileLogger | *JSONLLogger | *MarkdownLogger | *ServerFileLogger | *ToolsLogger | *ObservedURLDomainsLogger
          Close() error
Collaborator
@copilot address review feedback
…from URLs, and update closableLogger comment
Contributor
Author
Addressed all three issues in the latest commit:
Also added test cases covering the upper-case scheme and trailing-punctuation scenarios in |
This was referenced Jun 24, 2026
Safe-outputs currently redacts URLs for domains not explicitly allowlisted, which makes open-web research output unusable and provides no runtime visibility into what domains were actually encountered. This change adds an opt-in audit path in the gateway to surface observed domains as a session artifact that downstream safe-outputs logic can consume.
Config + runtime toggle
- Adds `--url-domain-audit` and `MCP_GATEWAY_URL_DOMAIN_AUDIT`.
- Wires the flag into runtime config (`Gateway.URLDomainAudit`) and logger audit enablement.

Observed domain artifact
- Writes `observed-url-domains.json` in the log directory.

Middleware domain extraction (tool responses)
- The `jqschema` middleware, when audit mode is enabled, extracts URL hostnames from response payload string leaves and records them under the backend server ID inferred from `server___tool`.

Write-sink body audit (safe-output writes)
- `WriteSinkGuard.LabelResource`, when audit mode is enabled, extracts URL hostnames from tool args/body, emits debug audit logs, and records domains to the same observed-domains artifact.

Shared URL extraction utility
- Adds `internal/urlutil` for recursive value-walk + hostname extraction to avoid duplicated parsing logic across middleware and guard paths.

Docs
- Documents `MCP_GATEWAY_URL_DOMAIN_AUDIT` in the environment variable reference.
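
The recursive value-walk over string leaves and the server ID inference from `server___tool` names described above could look roughly like the following sketch. It assumes payloads decoded by `encoding/json` into `any` values and a triple-underscore separator. The function names `collectStrings` and `serverIDFromToolName` are hypothetical, not the identifiers used in the PR.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// collectStrings walks a decoded JSON-like value and returns every string leaf.
// Map keys are ignored; only values are inspected. Order is not guaranteed
// because Go map iteration order is randomized.
func collectStrings(v any) []string {
	switch val := v.(type) {
	case string:
		return []string{val}
	case []any:
		var out []string
		for _, item := range val {
			out = append(out, collectStrings(item)...)
		}
		return out
	case map[string]any:
		var out []string
		for _, item := range val {
			out = append(out, collectStrings(item)...)
		}
		return out
	default:
		// Numbers, booleans, and nil carry no URL text.
		return nil
	}
}

// serverIDFromToolName splits a "server___tool" name at the first "___"
// and reports whether a non-empty server ID was found.
func serverIDFromToolName(name string) (string, bool) {
	id, _, found := strings.Cut(name, "___")
	if !found || id == "" {
		return "", false
	}
	return id, true
}

func main() {
	var payload any
	raw := `{"items":[{"link":"https://example.com/a","score":3},"see http://docs.example.org"]}`
	if err := json.Unmarshal([]byte(raw), &payload); err != nil {
		panic(err)
	}
	id, ok := serverIDFromToolName("github___search_code")
	fmt.Println(id, ok, len(collectStrings(payload)))
}
```

Each collected leaf would then be passed to the hostname extraction step and recorded under the inferred server ID, or under `write-sink` for the guard path.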