fix(firewall): skip non-Squid diagnostic lines in generate_usage_activity_summary #41429
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,84 @@ | ||
| import { afterEach, beforeEach, describe, expect, it } from "vitest"; | ||
| import fs from "fs"; | ||
| import path from "path"; | ||
| import { createRequire } from "module"; | ||
| import { fileURLToPath } from "url"; | ||
|
|
||
| const __filename = fileURLToPath(import.meta.url); | ||
| const __dirname = path.dirname(__filename); | ||
|
Comment on lines +7 to +8 (Contributor):
[/tdd] 💡 Suggested change: remove lines 5, 7, and 8:
-import { fileURLToPath } from "url";
-
-const __filename = fileURLToPath(import.meta.url);
-const __dirname = path.dirname(__filename); |
||
|
|
||
| const req = createRequire(import.meta.url); | ||
| const { parseFirewallLogs } = req("./generate_usage_activity_summary.cjs"); | ||
|
|
||
| describe("generate_usage_activity_summary.cjs", () => { | ||
| /** Unique directory for each test to avoid cross-test interference */ | ||
| let squidLogDir; | ||
|
|
||
| beforeEach(() => { | ||
| squidLogDir = path.join("/tmp/gh-aw", `squid-logs-unit-test-${Date.now()}`); | ||
|
Contributor:
[/tdd] 💡 Prefer
|
||
| fs.mkdirSync(squidLogDir, { recursive: true }); | ||
| }); | ||
|
|
||
| afterEach(() => { | ||
| if (fs.existsSync(squidLogDir)) { | ||
| fs.rmSync(squidLogDir, { recursive: true, force: true }); | ||
| } | ||
| }); | ||
|
|
||
| describe("parseFirewallLogs", () => { | ||
| it("skips Squid diagnostic lines (WARNING:, DNS, Accepting) and does not treat them as domain names", () => { | ||
| const logContent = [ | ||
| // Squid startup/diagnostic messages that should be skipped | ||
| 'WARNING: 172.30.0.20:35288 api.github.com:443 140.82.112.22:443 1.1 CONNECT 200 TCP_TUNNEL:HIER_DIRECT api.github.com:443 "-"', | ||
| 'DNS 172.30.0.20:35288 api.github.com:443 140.82.112.22:443 1.1 CONNECT 200 TCP_TUNNEL:HIER_DIRECT api.github.com:443 "-"', | ||
| 'Accepting 172.30.0.20:35288 api.github.com:443 140.82.112.22:443 1.1 CONNECT 200 TCP_TUNNEL:HIER_DIRECT api.github.com:443 "-"', | ||
| // A valid access log entry that should be counted | ||
| '1761332530.474 172.30.0.20:35288 api.github.com:443 140.82.112.22:443 1.1 CONNECT 200 TCP_TUNNEL:HIER_DIRECT api.github.com:443 "-"', | ||
| ].join("\n"); | ||
|
|
||
| fs.writeFileSync(path.join(squidLogDir, "access.log"), logContent); | ||
|
|
||
| const result = parseFirewallLogs(); | ||
|
Contributor:
Test assertions on
💡 Details and suggested fix
The root fix is to give `parseFirewallLogs` an optional `logPaths` parameter:
// source
function parseFirewallLogs(logPaths) {
  const firewallPaths = logPaths ?? [
    "/tmp/gh-aw/sandbox/firewall/logs/**/*.log",
    // ...
  ];
}
// test
const result = parseFirewallLogs([path.join(squidLogDir, "**/*.log")]);
With this change the tests stop depending on ambient global state and the exact-count assertions become reliable. |
||
|
|
||
| expect(result).not.toBeNull(); | ||
| expect(result.total_requests).toBe(1); | ||
| expect(result.allowed_domains).toContain("api.github.com:443"); | ||
| // Diagnostic keywords must not appear as domain names | ||
| expect(result.allowed_domains).not.toContain("WARNING:"); | ||
| expect(result.allowed_domains).not.toContain("DNS"); | ||
| expect(result.allowed_domains).not.toContain("Accepting"); | ||
|
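Contributor:
[/tdd] The In every diagnostic line used in this test,
💡 Suggested replacement: replace the three `not.toContain` assertions with:
// The 3 diagnostic lines must not inflate the request count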
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [/tdd] The In every diagnostic line used in this test, 💡 Suggested replacementReplace the three // The 3 diagnostic lines must not inflate the request count
expect(result.allowed_requests).toBe(1);
expect(result.blocked_requests).toBe(0);
// allowed_domains should have exactly one entry (not four)
expect(result.allowed_domains).toHaveLength(1); |
||
| }); | ||
|
|
||
| it("returns null when only non-Squid diagnostic lines are present", () => { | ||
| const logContent = [ | ||
| 'WARNING: 172.30.0.20:35288 api.github.com:443 140.82.112.22:443 1.1 CONNECT 200 TCP_TUNNEL:HIER_DIRECT api.github.com:443 "-"', | ||
| "DNS resolver ready - some extra fields here to pass length check x y z", | ||
| "Accepting connections on port 3128 x y z", | ||
| ].join("\n"); | ||
|
|
||
| fs.writeFileSync(path.join(squidLogDir, "access.log"), logContent); | ||
|
|
||
| const result = parseFirewallLogs(); | ||
|
|
||
| expect(result).toBeNull(); | ||
| }); | ||
|
|
||
| it("counts valid Squid access log entries correctly", () => { | ||
| const logContent = [ | ||
| '1761332530.474 172.30.0.20:35288 api.github.com:443 140.82.112.22:443 1.1 CONNECT 200 TCP_TUNNEL:HIER_DIRECT api.github.com:443 "-"', | ||
| '1761332531.000 172.30.0.20:35289 blocked.example.com:443 1.2.3.4:443 1.1 CONNECT 403 NONE_NONE:HIER_NONE blocked.example.com:443 "-"', | ||
| ].join("\n"); | ||
|
|
||
| fs.writeFileSync(path.join(squidLogDir, "access.log"), logContent); | ||
|
|
||
| const result = parseFirewallLogs(); | ||
|
|
||
| expect(result).not.toBeNull(); | ||
| expect(result.total_requests).toBe(2); | ||
| expect(result.allowed_requests).toBe(1); | ||
| expect(result.blocked_requests).toBe(1); | ||
| expect(result.allowed_domains).toContain("api.github.com:443"); | ||
| expect(result.blocked_domains).toContain("blocked.example.com:443"); | ||
|
Contributor:
[/tdd] 💡 Suggested addition to the third test:
expect(result.requests_by_domain["api.github.com:443"]).toEqual({ allowed: 1, blocked: 0 });
expect(result.requests_by_domain["blocked.example.com:443"]).toEqual({ allowed: 0, blocked: 1 });
This locks in the per-domain shape alongside the aggregate counters. |
||
| }); | ||
| }); | ||
| }); | ||
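As an illustration of what the three tests above pin down, here is a minimal sketch of a line-level summarizer. It is not the code from the PR: `summarizeAccessLog` is a hypothetical name, the field positions are inferred from the sample log lines in the tests, and treating only status `200` as allowed is a simplifying assumption. Taking the lines as an argument also keeps the sketch independent of whatever log files happen to exist on disk.

```javascript
// Hypothetical stand-in for the parser under test; not the PR's implementation.
// Field layout inferred from the sample lines in the tests:
//   timestamp client domain dest protocol method status decision url user-agent
const TIMESTAMP = /^\d+(\.\d+)?$/; // guard as quoted in the review

function summarizeAccessLog(lines) {
  const byDomain = new Map();
  let allowed = 0;
  let blocked = 0;

  for (const line of lines) {
    const fields = line.trim().split(/\s+/);
    // Skip short lines and diagnostic lines whose first field is not a timestamp.
    if (fields.length < 10 || !TIMESTAMP.test(fields[0])) continue;

    const domain = fields[2];
    // Assumption for this sketch: only HTTP status 200 counts as allowed.
    const isAllowed = fields[6] === "200";

    const entry = byDomain.get(domain) ?? { allowed: 0, blocked: 0 };
    if (isAllowed) {
      entry.allowed += 1;
      allowed += 1;
    } else {
      entry.blocked += 1;
      blocked += 1;
    }
    byDomain.set(domain, entry);
  }

  // Mirror the tests: no valid entries means there is nothing to summarize.
  if (allowed + blocked === 0) return null;

  const domainsWith = (key) =>
    [...byDomain].filter(([, counts]) => counts[key] > 0).map(([name]) => name);

  return {
    total_requests: allowed + blocked,
    allowed_requests: allowed,
    blocked_requests: blocked,
    allowed_domains: domainsWith("allowed"),
    blocked_domains: domainsWith("blocked"),
    requests_by_domain: Object.fromEntries(byDomain),
  };
}

const summary = summarizeAccessLog([
  "Accepting connections on port 3128 x y z",
  '1761332530.474 172.30.0.20:35288 api.github.com:443 140.82.112.22:443 1.1 CONNECT 200 TCP_TUNNEL:HIER_DIRECT api.github.com:443 "-"',
  '1761332531.000 172.30.0.20:35289 blocked.example.com:443 1.2.3.4:443 1.1 CONNECT 403 NONE_NONE:HIER_NONE blocked.example.com:443 "-"',
]);
console.log(summary);
```

Because the function never touches the file system, exact-count assertions on its result depend only on the lines passed in.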
The timestamp regex accepts integer-only first fields (e.g. `12345`), while the PR description explicitly states that Squid timestamps are always decimal (e.g. `1761332530.474`).

💡 Details

The current regex `/^\d+(\.\d+)?$/` matches both:

- `1761332530.474` ✓ (valid Squid timestamp)
- `12345` ✓ (integer; would also pass, contrary to the stated invariant)

If the codebase invariant is that Squid access logs always emit a decimal timestamp, the guard should enforce that.

A tighter regex that removes the optional group and requires the decimal point prevents any non-Squid line whose first token happens to be a bare integer from being processed. It also makes the guard self-documenting: the code asserts the expected format rather than a looser approximation of it.

This is low risk in practice, since bare integer first tokens in diagnostic lines are unlikely, but aligning the regex with the stated invariant closes a small gap and matches the description in the PR body.
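
The difference between the two guards can be checked directly. The sketch below assumes the tightened pattern is `/^\d+\.\d+$/`; that exact form is an assumption based on the description above (optional group removed, decimal point required), and the constant names are illustrative only.

```javascript
// Guard quoted in the review: the fractional part is optional.
const LOOSE = /^\d+(\.\d+)?$/;
// Assumed tightened guard: digits on both sides of a required decimal point.
const STRICT = /^\d+\.\d+$/;

// First fields taken from the sample log lines, plus the bare integer case.
const firstFields = ["1761332530.474", "12345", "WARNING:", "DNS", "Accepting"];

for (const field of firstFields) {
  console.log(field.padEnd(16), "loose:", LOOSE.test(field), "strict:", STRICT.test(field));
}
```

Of these inputs, only `12345` is classified differently: the loose pattern accepts it and the strict one rejects it. The diagnostic keywords fail both patterns, and the decimal timestamp passes both.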