
feat: unified JSONL log file between proxy and gateway#2350

Merged
lpcox merged 9 commits into main from feat/unified-jsonl-logs
Mar 23, 2026

@lpcox lpcox commented Mar 22, 2026

Collaborator

Summary

Unify the JSONL RPC message logs so both the DIFC proxy and MCP gateway append to the same rpc-messages.jsonl file, creating a single chronological timeline of all RPC messages and DIFC filtering events.

Problem

Previously the proxy wrote proxy-rpc.jsonl to /tmp/gh-aw/proxy-logs/ and the gateway wrote rpc-messages.jsonl to /tmp/gh-aw/mcp-logs/. This made it harder to get a complete picture of DIFC events across both components.

Changes

Code

  • internal/cmd/proxy.go: Rename JSONL output from proxy-rpc.jsonl → rpc-messages.jsonl

Workflow

  • repo-assist.lock.yml: Mount /tmp/gh-aw/mcp-logs into the proxy container and point --log-dir there
  • Update proxy.log reference path to match new log directory
  • Enable local build for testing

Result

Both proxy and gateway append to /tmp/gh-aw/mcp-logs/rpc-messages.jsonl:

  • Proxy writes first (pre-agent gh calls with DIFC filtering)
  • Gateway appends later (agent MCP tool calls with DIFC filtering)
  • Single file = unified timeline for analysis

Change proxy JSONL filename from proxy-rpc.jsonl to rpc-messages.jsonl
so both the DIFC proxy and MCP gateway append to the same file,
creating a single unified timeline of all RPC messages and DIFC events.

Changes:
- internal/cmd/proxy.go: rename proxy-rpc.jsonl → rpc-messages.jsonl
- repo-assist.lock.yml: mount gateway log dir into proxy container,
  point proxy --log-dir to /tmp/gh-aw/mcp-logs/, update proxy.log
  reference path
- Enable local build for testing

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 22, 2026 22:58

Copilot AI left a comment

Contributor


Pull request overview

This PR aims to unify DIFC-related JSONL RPC message logging so the proxy and gateway write into a single rpc-messages.jsonl, enabling a single chronological timeline for analysis across both components.

Changes:

  • Rename the proxy’s JSONL output file from proxy-rpc.jsonl to rpc-messages.jsonl.
  • Update the repo-assist lock workflow to mount /tmp/gh-aw/mcp-logs into the proxy container and point --log-dir there.
  • Switch the workflow to build and run a ghcr.io/github/gh-aw-mcpg:local image (proxy + gateway) for debugging.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| internal/cmd/proxy.go | Renames the proxy JSONL log filename to align with the gateway’s rpc-messages.jsonl. |
| .github/workflows/repo-assist.lock.yml | Adjusts mounts and --log-dir for unified log location; enables local image build and uses :local tags. |
Comments suppressed due to low confidence (1)

.github/workflows/repo-assist.lock.yml:867

  • MCP_GATEWAY_DOCKER_COMMAND is updated to run ghcr.io/github/gh-aw-mcpg:local instead of a pinned version. In a lock workflow this reduces reproducibility and can fail if the local image build step is skipped/changed; consider using the released tag by default and switching to :local only under an explicit debug flag.
          export GH_AW_ENGINE="copilot"
          export MCP_GATEWAY_DOCKER_COMMAND='docker run -i --rm --network host -v /var/run/docker.sock:/var/run/docker.sock -e MCP_GATEWAY_PORT -e MCP_GATEWAY_DOMAIN -e MCP_GATEWAY_API_KEY -e MCP_GATEWAY_PAYLOAD_DIR -e MCP_GATEWAY_PAYLOAD_SIZE_THRESHOLD -e DEBUG -e MCP_GATEWAY_LOG_DIR -e GH_AW_MCP_LOG_DIR -e GH_AW_SAFE_OUTPUTS -e GH_AW_SAFE_OUTPUTS_CONFIG_PATH -e GH_AW_SAFE_OUTPUTS_TOOLS_PATH -e GH_AW_ASSETS_BRANCH -e GH_AW_ASSETS_MAX_SIZE_KB -e GH_AW_ASSETS_ALLOWED_EXTS -e DEFAULT_BRANCH -e GITHUB_MCP_SERVER_TOKEN -e GITHUB_MCP_GUARD_MIN_INTEGRITY -e GITHUB_MCP_GUARD_REPOS -e GITHUB_REPOSITORY -e GITHUB_SERVER_URL -e GITHUB_SHA -e GITHUB_WORKSPACE -e GITHUB_TOKEN -e GITHUB_RUN_ID -e GITHUB_RUN_NUMBER -e GITHUB_RUN_ATTEMPT -e GITHUB_JOB -e GITHUB_ACTION -e GITHUB_EVENT_NAME -e GITHUB_EVENT_PATH -e GITHUB_ACTOR -e GITHUB_ACTOR_ID -e GITHUB_TRIGGERING_ACTOR -e GITHUB_WORKFLOW -e GITHUB_WORKFLOW_REF -e GITHUB_WORKFLOW_SHA -e GITHUB_REF -e GITHUB_REF_NAME -e GITHUB_REF_TYPE -e GITHUB_HEAD_REF -e GITHUB_BASE_REF -e GH_AW_SAFE_OUTPUTS_PORT -e GH_AW_SAFE_OUTPUTS_API_KEY -v /tmp/gh-aw/mcp-payloads:/tmp/gh-aw/mcp-payloads:rw -v /opt:/opt:ro -v /tmp:/tmp:rw -v '"${GITHUB_WORKSPACE}"':'"${GITHUB_WORKSPACE}"':rw ghcr.io/github/gh-aw-mcpg:local'


Comment thread internal/cmd/proxy.go
log.Printf("Warning: Failed to initialize file logger: %v", err)
}
- if err := logger.InitJSONLLogger(proxyLogDir, "proxy-rpc.jsonl"); err != nil {
+ if err := logger.InitJSONLLogger(proxyLogDir, "rpc-messages.jsonl"); err != nil {

Copilot AI Mar 22, 2026


Changing the proxy JSONL filename to match the gateway means two different processes/containers will append to the same JSONL file concurrently. The current JSONL logger uses json.Encoder.Encode (which can perform multiple writes) plus per-entry Sync(), so interleaving writes across processes can produce corrupted/invalid JSONL and the extra fsync contention can significantly slow proxy/gateway logging. Consider implementing cross-process safe append (e.g., marshal to bytes and do a single Write per entry, plus an OS-level file lock), or keep separate files and merge by timestamp during analysis.
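The cross-process safe append suggested above could look roughly like the following. This is a minimal sketch, not the project's logger: `appendJSONL`, `countValidLines`, and `demo` are hypothetical names, the entry shape is invented for illustration, and `syscall.Flock` is assumed to be available (Linux/macOS). Each entry is marshaled into one newline-terminated buffer and written with a single `Write` on an `O_APPEND` descriptor while an exclusive advisory lock is held.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// appendJSONL (hypothetical helper) appends one JSON object as a single line.
// The whole line is built in memory first so it reaches the file in one Write.
func appendJSONL(path string, entry any) error {
	line, err := json.Marshal(entry)
	if err != nil {
		return err
	}
	line = append(line, '\n')

	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		return err
	}
	defer f.Close()

	// Advisory exclusive lock; other cooperating writers block here.
	if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX); err != nil {
		return err
	}
	defer syscall.Flock(int(f.Fd()), syscall.LOCK_UN)

	_, err = f.Write(line)
	return err
}

// countValidLines returns how many non-empty lines in path are valid JSON.
func countValidLines(path string) (int, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return 0, err
	}
	n := 0
	for _, line := range bytes.Split(data, []byte("\n")) {
		if len(line) > 0 && json.Valid(line) {
			n++
		}
	}
	return n, nil
}

// demo writes one entry per (simulated) component and counts valid lines.
func demo() (int, error) {
	dir, err := os.MkdirTemp("", "jsonl-demo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "rpc-messages.jsonl")
	for _, source := range []string{"proxy", "gateway"} {
		// The "source" field is an illustrative assumption, not the real schema.
		if err := appendJSONL(path, map[string]string{"source": source}); err != nil {
			return 0, err
		}
	}
	return countValidLines(path)
}

func main() {
	n, err := demo()
	fmt.Println(n, err)
}
```

The lock only helps if every writer (proxy and gateway) takes it, and it does not address the fsync cost mentioned in the comment.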

Suggested change
- if err := logger.InitJSONLLogger(proxyLogDir, "rpc-messages.jsonl"); err != nil {
+ if err := logger.InitJSONLLogger(proxyLogDir, "proxy-rpc-messages.jsonl"); err != nil {

Comment on lines +402 to +406
- name: Build local MCPG container (debugging only)
run: |
rustup target add wasm32-wasip1
cd guards/github-guard/rust-guard && ./build.sh && cd ../../..
docker build . -t ghcr.io/github/gh-aw-mcpg:local

Copilot AI Mar 22, 2026


This step is labeled "debugging only" and the preceding comment says to uncomment it for local testing, but it is now enabled unconditionally. Building the image (rustup target add + wasm build + docker build) will slow down every run and can introduce failures unrelated to the workflow’s purpose; consider gating this behind a workflow_dispatch input/env flag or reverting it to commented/conditional execution.

Suggested change
- - name: Build local MCPG container (debugging only)
- run: |
- rustup target add wasm32-wasip1
- cd guards/github-guard/rust-guard && ./build.sh && cd ../../..
- docker build . -t ghcr.io/github/gh-aw-mcpg:local
+ # - name: Build local MCPG container (debugging only)
+ # run: |
+ # rustup target add wasm32-wasip1
+ # cd guards/github-guard/rust-guard && ./build.sh && cd ../../..
+ # docker build . -t ghcr.io/github/gh-aw-mcpg:local

Comment on lines 417 to 426
  docker run -d --name awmg-proxy --network host \
  -e GH_TOKEN \
  -e DEBUG='*' \
  -v "$PROXY_LOG_DIR:$PROXY_LOG_DIR" \
- ghcr.io/github/gh-aw-mcpg:v0.1.26 proxy \
+ -v "$MCP_LOG_DIR:$MCP_LOG_DIR" \
+ ghcr.io/github/gh-aw-mcpg:local proxy \
  --policy "$POLICY" \
  --listen 0.0.0.0:18443 \
- --log-dir "$PROXY_LOG_DIR" \
+ --log-dir "$MCP_LOG_DIR" \
  --tls --tls-dir "$PROXY_LOG_DIR/proxy-tls" \

Copilot AI Mar 22, 2026


The lock workflow switches the proxy container from a pinned released image (v0.1.26) to ghcr.io/github/gh-aw-mcpg:local. Unless this workflow is explicitly intended for debugging only, this breaks reproducibility and makes the run depend on a successful local build; consider keeping the pinned version and only using :local when the debug build step is explicitly enabled.

This issue also appears on line 866 of the same file.

lpcox and others added 8 commits March 22, 2026 16:22
Replace stubBackendCaller with restBackendCaller that translates
guard CallTool requests into GitHub REST API calls. This enables
the WASM guard to perform backend enrichment (author_association,
repo visibility, trusted bot detection) when labeling proxy responses.

Previously all items defaulted to 'none' integrity because the
stub caller failed all enrichment requests. Now the proxy can
correctly identify github-actions[bot] as a trusted bot and
assign appropriate integrity levels.

Supported tool translations:
- pull_request_read → GET /repos/{owner}/{repo}/pulls/{number}
- issue_read → GET /repos/{owner}/{repo}/issues/{number}
- search_repositories → GET /search/repositories?q=...

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
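
The tool translations listed in the commit above can be pictured as a small lookup from tool name to REST path. The sketch below is illustrative only: `restPath` is a hypothetical function, and the argument keys (`owner`, `repo`, `number`, `query`) are assumptions, not the actual restBackendCaller signature.

```go
package main

import (
	"fmt"
	"net/url"
)

// restPath (hypothetical) maps a guard tool call to the GitHub REST path
// named in the commit message. Argument keys are assumed for illustration.
func restPath(tool string, args map[string]string) (string, error) {
	switch tool {
	case "pull_request_read":
		return fmt.Sprintf("/repos/%s/%s/pulls/%s",
			args["owner"], args["repo"], args["number"]), nil
	case "issue_read":
		return fmt.Sprintf("/repos/%s/%s/issues/%s",
			args["owner"], args["repo"], args["number"]), nil
	case "search_repositories":
		return "/search/repositories?q=" + url.QueryEscape(args["query"]), nil
	default:
		// Unknown tools fail explicitly instead of silently succeeding.
		return "", fmt.Errorf("unsupported tool %q", tool)
	}
}

func main() {
	p, err := restPath("pull_request_read", map[string]string{
		"owner": "github", "repo": "gh-aw-mcpg", "number": "2350",
	})
	fmt.Println(p, err)
}
```
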
The proxy's handleWithDIFC was not storing tool_args in the request
context before calling LabelResponse. This meant the WASM guard's
label_response function received null tool_args, causing
extract_repo_info() to return empty strings. Without repo info,
per-item enrichment (pr_integrity/issue_integrity) was skipped,
defaulting all items to 'none' integrity and filtering everything.

Fix: Store args in context using guard.SetRequestStateInContext()
before the LabelResponse call, mirroring what the gateway does in
unified.go:891-892.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…agent calls

Two changes to fix proxy DIFC filtering of legitimate items:

1. Add --trusted-bots flag to proxy CLI and pass trusted bots to the
   guard via BuildLabelAgentPayload during LabelAgent initialization.
   This enables the guard's built-in trusted bot bypass (dependabot[bot],
   github-actions[bot], etc.) to assign writer integrity instead of
   falling back to author_association which returns CONTRIBUTOR with
   limited-scope tokens.

2. Update repo-assist workflow to use GITHUB_MCP_SERVER_TOKEN (the same
   stronger token used by the MCP server) for the proxy container and
   pre-agent gh CLI steps. The previous github.token lacks org membership
   visibility, causing the REST API to return author_association=CONTRIBUTOR
   instead of MEMBER for org members.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Enrichment REST calls (author_association, repo visibility) now use
the proxy's configured GitHubToken instead of the client's auth header.
The client's GITHUB_TOKEN (installation token) lacks read:org scope,
causing author_association to return CONTRIBUTOR instead of MEMBER.
The server token (from GH_AW_GITHUB_MCP_SERVER_TOKEN) may have the
necessary org visibility for correct enrichment results.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The proxy was initialized with trustedBots=0 because the --trusted-bots
CLI flag was never passed in the workflow. This adds the standard
first-party bots (github-actions[bot], dependabot[bot], copilot) so
the guard can bypass author_association checks for bot-authored items.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
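
As a rough illustration of the trusted-bot bypass described in the commits above, the sketch below parses a bot list and checks an author login against it. The comma-separated flag format and the names `parseTrustedBots` and `isTrustedBot` are assumptions made for this example; the real guard's matching rules may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// parseTrustedBots (hypothetical) turns a comma-separated flag value into a
// lookup set. The comma-separated format is an assumption.
func parseTrustedBots(flagValue string) map[string]bool {
	set := make(map[string]bool)
	for _, name := range strings.Split(flagValue, ",") {
		name = strings.ToLower(strings.TrimSpace(name))
		if name != "" {
			set[name] = true
		}
	}
	return set
}

// isTrustedBot reports whether login is in the trusted set (case-insensitive).
func isTrustedBot(trusted map[string]bool, login string) bool {
	return trusted[strings.ToLower(login)]
}

func main() {
	trusted := parseTrustedBots("github-actions[bot], dependabot[bot], copilot")
	fmt.Println(len(trusted), isTrustedBot(trusted, "dependabot[bot]"), isTrustedBot(trusted, "octocat"))
}
```

With an empty set (the trustedBots=0 case), every lookup returns false, so the guard would fall back to author_association for all items.
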
Without the 'author' field in --json, the GraphQL query doesn't
request author { login }, so the DIFC guard can't identify trusted
bots (github-actions, dependabot, etc.) and all items get 'none'
integrity, causing everything to be filtered.

Adding 'author' to both gh issue list and gh pr list ensures the
GraphQL response includes author.login for trusted bot detection.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…queries

The DIFC proxy now rewrites GraphQL queries for issue/PR collections
to include author{login} and authorAssociation fields before forwarding
to GitHub. This ensures the guard always has the data it needs for
trusted-bot detection and integrity labeling, regardless of what fields
the caller (gh CLI) originally requested.

This eliminates the need for per-item REST enrichment round-trips and
means callers don't need to know what fields the guard requires.

Reverts the --json author addition in the workflow since the proxy
now handles it transparently.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
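
To make the query rewriting in the commit above concrete, here is a deliberately naive sketch that adds the two fields to the first `nodes { ... }` selection when they are missing. It is string-based and only meant to show the idea; `injectAuthorFields` is a hypothetical name, and a real rewriter would presumably parse the GraphQL document rather than pattern-match it.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	nodesOpen  = regexp.MustCompile(`nodes\s*\{`)
	authorOpen = regexp.MustCompile(`author\s*\{`)
)

// injectAuthorFields (hypothetical) inserts authorAssociation and
// author { login } right after the first "nodes {" if they are absent.
// Limitations: first nodes block only, and presence is checked textually.
func injectAuthorFields(query string) string {
	loc := nodesOpen.FindStringIndex(query)
	if loc == nil {
		return query // not a collection query we recognize; leave untouched
	}
	var extra []string
	if !strings.Contains(query, "authorAssociation") {
		extra = append(extra, "authorAssociation")
	}
	if !authorOpen.MatchString(query) {
		extra = append(extra, "author { login }")
	}
	if len(extra) == 0 {
		return query
	}
	return query[:loc[1]] + " " + strings.Join(extra, " ") + query[loc[1]:]
}

func main() {
	fmt.Println(injectAuthorFields("{ issues { nodes { number } } }"))
}
```

Rewriting is idempotent in this sketch: a query that already selects both fields is returned unchanged.
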
Replace switch/case with if/else if so break exits the for loop
directly instead of only the switch statement.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
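
The Go behavior behind this last commit: an unlabeled `break` inside a `switch` terminates only the switch, so an enclosing `for` loop keeps running. The sketch below uses hypothetical function names and a made-up "stop" sentinel to contrast the two forms.

```go
package main

import "fmt"

// countWithSwitch shows the pitfall: break leaves the switch, not the loop,
// so every item is still counted.
func countWithSwitch(items []string) int {
	n := 0
	for _, it := range items {
		switch it {
		case "stop":
			break // exits only the switch statement
		}
		n++
	}
	return n
}

// countWithIf shows the fix: with if/else if, break exits the for loop.
func countWithIf(items []string) int {
	n := 0
	for _, it := range items {
		if it == "stop" {
			break // exits the for loop
		}
		n++
	}
	return n
}

func main() {
	items := []string{"a", "stop", "b"}
	fmt.Println(countWithSwitch(items), countWithIf(items)) // prints "3 1"
}
```

A labeled break (`break loop`) would be another way to exit the loop from inside a switch.
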
@lpcox lpcox merged commit 06c856a into main Mar 23, 2026
13 checks passed
@lpcox lpcox deleted the feat/unified-jsonl-logs branch March 23, 2026 04:46