Skip to content

fix(firewall): skip non-Squid diagnostic lines in generate_usage_activity_summary#41429

Merged
pelikhan merged 2 commits into
mainfrom
copilot/cli-tools-test-fix-timestamp-validation
Jun 25, 2026
Merged

fix(firewall): skip non-Squid diagnostic lines in generate_usage_activity_summary#41429
pelikhan merged 2 commits into
mainfrom
copilot/cli-tools-test-fix-timestamp-validation

Conversation

Copilot AI commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

Squid writes diagnostic/startup messages (WARNING:, DNS, Accepting, etc.) into access log files. These lines have ≥8 whitespace-separated fields, bypassing the existing length guard, causing their third field to be misidentified as a domain name and inflating firewall stats.

Changes

  • generate_usage_activity_summary.cjs — add a numeric Unix timestamp guard in parseFirewallLogs() after the field-count check, matching the pattern already used in parse_firewall_logs.cjs:

    if (!/^\d+(\.\d+)?$/.test(parts[0])) {
      continue;
    }

    Genuine Squid access log entries always begin with a decimal epoch timestamp (e.g. 1761332530.474); diagnostic lines do not.

  • generate_usage_activity_summary.test.cjs (new) — tests covering: diagnostic keywords not recorded as domains, all-diagnostic log returns null, valid entries still counted correctly.

…enerate_usage_activity_summary

Squid diagnostic/startup messages (WARNING:, DNS, Accepting, etc.)
can appear in access log files with 8+ whitespace-separated fields,
bypassing the existing `parts.length < 8` guard. Their third field was
then mistakenly treated as a domain name.

Add a `/^\d+(\.\d+)?$/` numeric timestamp check — identical to the
guard already present in parse_firewall_logs.cjs — so that only genuine
Squid access log entries are counted.

Closes #41398

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix timestamp validation in generate_usage_activity_summary.cjs fix(firewall): skip non-Squid diagnostic lines in generate_usage_activity_summary Jun 25, 2026
Copilot AI requested a review from pelikhan June 25, 2026 12:30
@pelikhan pelikhan marked this pull request as ready for review June 25, 2026 12:37
Copilot AI review requested due to automatic review settings June 25, 2026 12:37

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens the firewall usage-summary generator against Squid diagnostic/startup lines that can appear in access logs, preventing those lines from being mis-parsed as real requests and inflating domain stats.

Changes:

  • Added a numeric Unix timestamp guard in parseFirewallLogs() so only real Squid access log entries are counted.
  • Added a new Vitest unit test file covering diagnostic-line skipping, all-diagnostic logs returning null, and valid allow/block counting.
Show a summary per file
File Description
actions/setup/js/generate_usage_activity_summary.cjs Skip non-access-log diagnostic lines by requiring a numeric epoch timestamp in the first field.
actions/setup/js/generate_usage_activity_summary.test.cjs New unit tests validating the new guard and preserving correct parsing for real entries.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 2/2 changed files
  • Comments generated: 1

Comment on lines +7 to +8
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);
@github-actions

github-actions Bot commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅

@github-actions

github-actions Bot commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

Design Decision Gate 🏗️ completed the design decision gate check.

No ADR enforcement needed: PR #41429 does not have the 'implementation' label and has 0 new lines of code in business logic directories (threshold: 100).

@github-actions

github-actions Bot commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

Test Quality Sentinel completed test quality analysis.

Test Quality Sentinel analysis for PR #41429 was already completed in the prior workflow invocation (same run ID 28170581267). Comment posted (90/100, Excellent) and APPROVE review submitted. Both safe-output limits (add_comment: 1/1, submit_pull_request_review: 1/1) are exhausted — confirming all actions were applied.

@github-actions

github-actions Bot commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

PR Code Quality Reviewer completed the code quality review.

@github-actions github-actions Bot mentioned this pull request Jun 25, 2026
@github-actions

Copy link
Copy Markdown
Contributor

🧪 Test Quality Sentinel Report

Test Quality Score: 90/100 — Excellent

Analyzed 3 test(s): 3 design, 0 implementation, 0 guideline violation(s).

📊 Metrics & Test Classification (3 tests analyzed)
Metric Value
New/modified tests analyzed 3
✅ Design tests (behavioral contracts) 3 (100%)
⚠️ Implementation tests (low value) 0 (0%)
Tests with error/edge cases 3 (100%)
Duplicate test clusters 0
Test inflation detected YES — 84 test lines vs 6 production lines (14:1); contextually expected for a brand-new test file establishing first coverage for parseFirewallLogs
🚨 Coding-guideline violations 0
Test File Classification Issues Detected
skips Squid diagnostic lines (WARNING:, DNS, Accepting)... generate_usage_activity_summary.test.cjs:29 ✅ Design
returns null when only non-Squid diagnostic lines are present generate_usage_activity_summary.test.cjs:52 ✅ Design
counts valid Squid access log entries correctly generate_usage_activity_summary.test.cjs:67 ✅ Design

Go: 0 (*_test.go); JavaScript: 3 (*.test.cjs). No other languages detected.

Verdict

Check passed. 0% implementation tests (threshold: 30%). All three tests verify observable behavioral contracts: the timestamp-guard fix is directly exercised by test 1 (diagnostic lines skipped), test 2 covers the boundary case (null on all-diagnostic input), and test 3 confirms correct allowed/blocked counting. No mocking of business logic — tests use real filesystem I/O with isolated temp directories. Inflation flag is noted but is a natural artifact of establishing first-ever test coverage for this module alongside a small production fix.

🧪 Test quality analysis by Test Quality Sentinel · 81 AIC · ⌖ 9.7 AIC · ⊞ 8.4K ·

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Test Quality Sentinel: 90/100. Test quality is excellent — 0% of new tests are implementation tests (threshold: 30%). All 3 tests verify behavioral contracts with edge-case coverage; no guideline violations detected.

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skills-Based Review 🧠

Applied /tdd and /diagnose — approving with minor test-quality suggestions.

📋 Key Themes & Highlights

Key Themes

  • Test assertions don't fully lock in the regression: the three not.toContain("WARNING:") etc. calls in test 1 are always-true regardless of the fix; the real guard is total_requests === 1.
  • Small dead-code smell: unused __filename/__dirname imports in the test file.
  • Minor flakiness risk: Date.now() for temp-dir uniqueness; prefer fs.mkdtempSync.
  • Coverage gap: requests_by_domain is never asserted across any test case.

Positive Highlights

  • ✅ Fix is minimal, surgical, and mirrors the identical guard already in parse_firewall_logs.cjs — good consistency.
  • ✅ Three focused tests cover the three scenarios that matter: all-diagnostic, mixed, and valid-only.
  • beforeEach/afterEach lifecycle with isolated temp directories — correct approach.
  • ✅ The core regression is caught by total_requests === 1, even if surrounding assertions can be tightened.

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · 78 AIC · ⌖ 7.18 AIC · ⊞ 6.5K

// Diagnostic keywords must not appear as domain names
expect(result.allowed_domains).not.toContain("WARNING:");
expect(result.allowed_domains).not.toContain("DNS");
expect(result.allowed_domains).not.toContain("Accepting");

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] The not.toContain("WARNING:") / not.toContain("DNS") / not.toContain("Accepting") assertions never fail, even without the fix.

In every diagnostic line used in this test, parts[2] (the domain field) resolves to api.github.com:443 — not the keyword at parts[0]. These three assertions were always-true before the patch too. The real regression guard is total_requests === 1 on line 44.

💡 Suggested replacement

Replace the three not.toContain calls with assertions that would actually fail if the timestamp guard were removed:

// The 3 diagnostic lines must not inflate the request count
expect(result.allowed_requests).toBe(1);
expect(result.blocked_requests).toBe(0);
// allowed_domains should have exactly one entry (not four)
expect(result.allowed_domains).toHaveLength(1);

import { fileURLToPath } from "url";

const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] __filename and __dirname are declared but never referenced in this file — dead code.

createRequire(import.meta.url) doesn't need them, and no other line in the test uses them. Other test files that use the same createRequire pattern (e.g. action_input_utils.test.cjs) omit these declarations.

💡 Suggested change

Remove lines 5, 7, and 8:

-import { fileURLToPath } from "url";
-
-const __filename = fileURLToPath(import.meta.url);
-const __dirname = path.dirname(__filename);

let squidLogDir;

beforeEach(() => {
squidLogDir = path.join("/tmp/gh-aw", `squid-logs-unit-test-${Date.now()}`);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] Date.now() for unique temp-dir names is theoretically flaky: two test files running at the same millisecond would collide on the directory name.

💡 Prefer fs.mkdtempSync
squidLogDir = fs.mkdtempSync("/tmp/gh-aw/squid-logs-unit-test-");

mkdtempSync atomically creates a uniquely-suffixed directory, which is both safer and more idiomatic for temp-dir creation.

expect(result.allowed_requests).toBe(1);
expect(result.blocked_requests).toBe(1);
expect(result.allowed_domains).toContain("api.github.com:443");
expect(result.blocked_domains).toContain("blocked.example.com:443");

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] requests_by_domain is populated by the function but never asserted in any test case. It's the per-domain breakdown that callers may rely on.

💡 Suggested addition to the third test
expect(result.requests_by_domain["api.github.com:443"]).toEqual({ allowed: 1, blocked: 0 });
expect(result.requests_by_domain["blocked.example.com:443"]).toEqual({ allowed: 0, blocked: 1 });

This locks in the per-domain shape alongside the aggregate counters.

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

REQUEST_CHANGES — fix is correct, tests need hardening

The timestamp guard in parseFirewallLogs() is the right fix and correctly implements the invariant described in the PR. Two issues need to be addressed before merge:

  1. [HIGH] Test isolation: the tests write into /tmp/gh-aw/squid-logs-unit-test-*/ and parseFirewallLogs() scans all /tmp/gh-aw/squid-logs-* paths — any residual directory (stale CI artifacts, parallel jobs) will contaminate exact-count assertions. The function needs an injectable logPaths parameter so tests can scope to a controlled path.

  2. [MEDIUM] Regex is under-strict: /^\d+(\.\d+)?$/ allows integer-only first fields while the PR explicitly states the invariant is a decimal epoch. Tighten to /^\d+\.\d+$/ to match the stated format and make the intent unambiguous.

🔎 Code quality review by PR Code Quality Reviewer · 81.1 AIC · ⌖ 7.66 AIC · ⊞ 5.2K


fs.writeFileSync(path.join(squidLogDir, "access.log"), logContent);

const result = parseFirewallLogs();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test assertions on total_requests will fail non-deterministically if any /tmp/gh-aw/squid-logs-* directory contains stale or concurrent log files.

💡 Details and suggested fix

parseFirewallLogs() unconditionally scans every file matching /tmp/gh-aw/squid-logs-*/**/*.log. The unit tests work around this by writing into a directory whose name matches that glob (squid-logs-unit-test-${Date.now()}). This design has two failure modes:

  1. Residual production data: If any agent job leaves behind a squid-logs-* directory — even from a previous failed run — the function picks up those log lines and inflates total_requests past the expected count, causing toBe(1) / toBe(2) to fail spuriously.

  2. No injection point: The function has hardcoded paths with no way to accept alternate directories, so tests cannot be truly isolated without depending on global filesystem state.

The root fix is to give parseFirewallLogs an optional logPaths parameter defaulting to the production paths, so tests can pass controlled inputs:

// source
function parseFirewallLogs(logPaths) {
  const firewallPaths = logPaths ?? [
    "/tmp/gh-aw/sandbox/firewall/logs/**/*.log",
    // ...
  ];
}

// test
const result = parseFirewallLogs([path.join(squidLogDir, "**/*.log")]);

With this change the tests stop depending on ambient global state and the exact-count assertions become reliable.


// Skip non-Squid diagnostic lines (WARNING:, DNS, Accepting, etc.) by
// validating that the first field is a numeric Unix timestamp.
if (!/^\d+(\.\d+)?$/.test(parts[0])) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The timestamp regex accepts integer-only first fields (e.g. 12345) while the PR description explicitly states Squid timestamps are always decimal (e.g. 1761332530.474).

💡 Details

The current regex /^\d+(\.\d+)?$/ matches both:

  • 1761332530.474 ✓ (valid Squid timestamp)
  • 12345 ✓ (integer — would also pass, contrary to stated invariant)

If the codebase invariant is that Squid access logs always emit a decimal timestamp, the guard should enforce that:

if (!/^\d+\.\d+$/.test(parts[0])) {
  continue;
}

The tighter regex removes the optional group and requires the decimal point, preventing any non-Squid line whose first token happens to be a bare integer from being processed. It also makes the guard self-documenting: the code now asserts the expected format rather than a looser approximation of it.

This is low risk in practice — bare integer first tokens in diagnostic lines are unlikely — but aligning the regex with the stated invariant closes a small gap and matches the description in the PR body.

@pelikhan

Copy link
Copy Markdown
Collaborator

@copilot run pr-finisher skill

@pelikhan pelikhan merged commit b36c128 into main Jun 25, 2026
62 of 63 checks passed
@pelikhan pelikhan deleted the copilot/cli-tools-test-fix-timestamp-validation branch June 25, 2026 13:21
Copilot stopped work on behalf of pelikhan due to an error June 25, 2026 13:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants