Commit 3e35624

Add automated security review workflow for KB documentation
This workflow automatically scans all PRs that modify Knowledge Base files (docs/kb/**/*.md, docs/kb/**/*.mdx) for potential customer data leakage.

Detection targets:
- Customer hostnames, FQDNs, and domains
- IP addresses and MAC addresses from customer infrastructure
- Customer email addresses and usernames
- Company/organization names
- Customer-specific file paths, URLs, and registry keys
- License keys, tokens, GUIDs, and SSH fingerprints
- Log snippets with identifiable data

The workflow uses Claude AI to analyze PR diffs and posts actionable feedback directly to the PR, including specific line numbers and remediation guidance.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
1 parent 3b9b6b8 commit 3e35624

1 file changed, +165 -0 lines changed

@@ -0,0 +1,165 @@
+name: KB Security Review - Customer Data Leakage Detection
+
+on:
+  pull_request:
+    types: [opened, synchronize, reopened]
+    paths:
+      - "docs/kb/**/*.md"
+      - "docs/kb/**/*.mdx"
+
+jobs:
+  kb-security-review:
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read
+      pull-requests: write
+      issues: read
+      id-token: write
+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 0 # Full history for comprehensive diff analysis
+
+      - name: Run Claude KB Security Review
+        id: claude-security-review
+        uses: anthropics/claude-code-action@v1
+        with:
+          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
+          prompt: |
+            REPO: ${{ github.repository }}
+            PR NUMBER: ${{ github.event.pull_request.number }}
+
+            **SECURITY REVIEW: Knowledge Base Customer Data Leakage Detection**
+
+            You are performing a security review of documentation changes to detect potential customer data leakage.
+
+            ## Your Task
+
+            1. Use `gh pr diff ${{ github.event.pull_request.number }}` to get all changes in this PR
+            2. Focus ONLY on changes to files in `docs/kb/` directory
+            3. Analyze the diff for potential customer-identifying, environment-specific, or proprietary information
+
+            ## What to Flag
+
+            Identify and flag ANY of the following types of sensitive data in the ADDED lines (+):
+
+            ### High Priority - Customer Infrastructure
+            - **Hostnames, FQDNs, or domains** that are NOT:
+              - Netwrix domains (netwrix.com, stealthbits.com, anixis.com)
+              - Microsoft/vendor domains (microsoft.com, azure.com, office365.com, github.com, etc.)
+              - Generic documentation examples (example.com, contoso.com, fabrikam.com, northwind.com)
+            - **IP addresses** that appear to be real customer infrastructure (not obviously generic like 192.0.2.x)
+            - **MAC addresses**
+            - **Server names or computer names** that look customer-specific (not generic like "server1", "dc01")
+
+            ### High Priority - Identifiable Information
+            - **Email addresses** that are NOT:
+              - Netwrix employees (@netwrix.com)
+              - Generic examples ([email protected], [email protected])
+            - **Usernames or account names** that appear customer-specific (not generic like "testuser", "john.doe")
+            - **Company or organization names** that are NOT part of Netwrix products/brands
+            - **Customer-specific Active Directory structures** (OU paths with non-generic naming)
+
+            ### Medium Priority - System Details
+            - **File paths** that reference real customer systems or contain customer-specific naming
+            - **URLs** pointing to customer infrastructure
+            - **Registry keys** with customer-specific values or paths
+            - **Database names** or connection strings with customer-specific information
+
+            ### Medium Priority - Credentials & Keys
+            - **License keys, serial numbers, or activation codes**
+            - **API tokens, access tokens, or credentials**
+            - **GUIDs or UUIDs** that appear in security contexts (credential IDs, API keys)
+            - **SSH fingerprints or cryptographic keys**
+            - **Certificate thumbprints or serial numbers** from real certificates
+
+            ### Medium Priority - Log Output
+            - **Log snippets or error messages** containing:
+              - Customer hostnames, domains, or IP addresses
+              - Customer usernames or email addresses
+              - Customer-specific paths or identifiers
+              - Real timestamps that could identify customer activity patterns
+
+            ## What NOT to Flag (False Positives)
+
+            - Netwrix product domains and infrastructure
+            - Microsoft example domains (contoso.com, fabrikam.com, northwind.com, tailspintoys.com)
+            - Generic placeholders like "example.com", "domain.com", "company.com"
+            - RFC 5737 documentation IP addresses (192.0.2.x, 198.51.100.x, 203.0.113.x)
+            - Generic server names (server1, dc01, web-server, etc.)
+            - Generic usernames (admin, testuser, john.doe, jane.smith)
+            - Placeholder GUIDs in obvious example contexts
+            - localhost, 127.0.0.1, or other loopback addresses
+            - Private IP ranges in obviously generic examples (10.0.0.1, 192.168.1.1)
+
+            ## Output Format
+
+            If you find ANY potential customer data leakage:
+
+            1. Use `gh pr comment` to post a review comment with the following structure:
+
+            ```markdown
+            ## ⚠️ KB Security Review: Potential Customer Data Leakage Detected
+
+            This PR contains changes to Knowledge Base files that may include customer-identifying or environment-specific information that should be reviewed and potentially redacted.
+
+            ### Findings
+
+            #### 📁 File: `path/to/file.md`
+
+            **Line X:** [Brief description of what type of data was found]
+            - **Action Required:** [Specific, actionable guidance on what to review/replace]
+            - **Suggestion:** [Generic replacement if applicable]
+
+            ---
+
+            ### Review Checklist
+
+            Before merging this PR, please verify:
+            - [ ] All hostnames and domains are either Netwrix-owned, well-known vendors, or generic examples
+            - [ ] No customer-specific email addresses or usernames are present
+            - [ ] IP addresses are either RFC 5737 documentation IPs or clearly generic examples
+            - [ ] File paths and URLs do not reference real customer systems
+            - [ ] Log snippets have been sanitized of customer-identifying information
+            - [ ] No license keys, tokens, or credentials are exposed
+
+            ### Need Help?
+
+            - Replace customer domains with: `example.com`, `contoso.com`, `fabrikam.com`
+            - Replace customer IPs with: `192.0.2.1`, `198.51.100.1`, `203.0.113.1`
+            - Replace customer servers with: `server01`, `dc01`, `web-server01`
+            - Replace customer accounts with: `testuser`, `serviceaccount`, `domain\admin`
+            - Replace GUIDs with: `<credential-id>`, `<guid>`, or obviously fake ones
+            ```
+
+            2. Keep findings GENERAL and ACTIONABLE - never quote the actual sensitive data in your review
+            3. Focus on WHAT needs review, not on explaining WHY the data is sensitive
+            4. Group findings by file for clarity
+            5. Provide specific line numbers or sections to review
+
+            If NO customer data leakage is found:
+
+            1. Use `gh pr comment` to post:
+
+            ```markdown
+            ## ✅ KB Security Review: No Customer Data Leakage Detected
+
+            This PR has been reviewed for potential customer data leakage in Knowledge Base files. No customer-identifying, environment-specific, or proprietary information was detected in the changes.
+
+            The documentation changes appear to use appropriate generic examples and do not expose customer infrastructure or identifiable information.
+            ```
+
+            ## Important Guidelines
+
+            - Be thorough but practical - focus on real risks, not theoretical ones
+            - Prioritize HIGH and MEDIUM severity findings
+            - When in doubt about whether something is customer-specific, FLAG IT for human review
+            - Provide actionable guidance, not just identification
+            - Keep the tone professional and helpful, not accusatory
+            - Remember: The goal is to protect customer privacy and maintain documentation quality
+
+            Now perform the security review and post your findings.
+
+          claude_args: '--allowed-tools "Bash(gh pr diff:*),Bash(gh pr comment:*),Bash(gh pr view:*),Bash(gh pr list:*)"'
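
Review note: the IP-address rules in the prompt's "What NOT to Flag" list are among the few checks here that could also be done deterministically. The following is a minimal Python sketch of such a pre-screen, assuming it would run separately from the workflow in this commit. The names `DOC_NETS`, `is_allowlisted`, and `candidate_ips` are hypothetical and do not appear in the commit. The sketch exempts only loopback addresses and the RFC 5737 documentation ranges; private ranges are deliberately left as candidates, because the prompt accepts them only in obviously generic examples, which remains a judgment call.

```python
import ipaddress
import re

# RFC 5737 documentation ranges, as listed in the prompt's "What NOT to Flag" section.
DOC_NETS = [
    ipaddress.ip_network(net)
    for net in ("192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24")
]

# Loose dotted-quad matcher; invalid matches are rejected by ipaddress below.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def is_allowlisted(addr: str) -> bool:
    """Return True for loopback or RFC 5737 documentation addresses."""
    ip = ipaddress.ip_address(addr)  # raises ValueError on invalid input
    return ip.is_loopback or any(ip in net for net in DOC_NETS)


def candidate_ips(added_line: str) -> list[str]:
    """Return IPv4 addresses on an added diff line that still need review."""
    found = []
    for token in IPV4_RE.findall(added_line):
        try:
            if not is_allowlisted(token):
                found.append(token)
        except ValueError:
            # Not a valid IPv4 address (for example an out-of-range octet).
            continue
    return found
```

A pre-screen like this would only narrow what the reviewer has to look at. It does not cover hostnames, usernames, or the other categories in the prompt.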
