Add URL security scanning to prevent inappropriate content #13

Open
Pjv93 wants to merge 45 commits into aws-samples:main from Pjv93:main
Pjv93 (Contributor) commented Sep 15, 2025

Summary

Adds automated URL content scanning with automatic commit reverting to detect and block inappropriate links that could be added via URL hijacking or malicious commits.

Problem Solved

  • Prevents inappropriate content from being added to workshop repositories
  • Protects against URL hijacking attacks where legitimate URLs become compromised over time
  • Automatically removes malicious content by reverting commits
  • Provides ongoing monitoring with monthly scans

Features Added

  • Real-time URL scanning on every commit/PR
  • 🔄 Automatic commit reverting when inappropriate content is detected
  • Monthly repository-wide URL security scans
  • Contextual keyword detection to reduce false positives
  • Slack notification support for security alerts and revert actions
  • Protection against both immediate threats and time-delayed attacks
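As a rough illustration of the contextual keyword detection listed above, here is a minimal sketch. The PR text does not show the real word lists or matching logic, so `BLOCKED_KEYWORDS`, `ALLOWED_CONTEXTS`, and `find_violations` are hypothetical names, and the approach (masking benign phrases before matching) is only one plausible way to reduce false positives.

```python
import re
from typing import List

# Hypothetical word lists; the PR does not show the real ones.
BLOCKED_KEYWORDS = {"porn", "adult"}
# Assumed phrases in which a blocked keyword is treated as benign.
ALLOWED_CONTEXTS = ("adult education", "adult learning", "adult learner")


def find_violations(text: str) -> List[str]:
    """Return blocked keywords that appear outside an allowed context."""
    lowered = text.lower()
    # Mask benign phrases first so their keywords are not counted.
    for phrase in ALLOWED_CONTEXTS:
        lowered = lowered.replace(phrase, " ")
    # Split into alphabetic words and intersect with the block list.
    words = set(re.findall(r"[a-z]+", lowered))
    return sorted(words & BLOCKED_KEYWORDS)
```

With these assumed lists, a page about "adult education" passes, while the same keyword outside that phrase is still reported.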

How Auto-Revert Works

  1. Push occurs → Action scans URLs → If violations detected → Automatically reverts commit
  2. Creates descriptive revert commit explaining the security violation
  3. Sends enhanced Slack notification indicating revert action
  4. Repository stays clean and protected automatically
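The steps above can be sketched roughly as follows. This is not the workflow code from the PR: `scan_and_revert`, `is_violation`, `run`, and `notify` are hypothetical names, the checker and the Slack sender are passed in as plain callables, and the sketch uses `git revert --no-edit` (git's default revert message) rather than a custom one.

```python
import subprocess
from typing import Callable, Iterable, List


def _run_git(cmd: List[str]) -> None:
    # Raise if git exits with a non-zero status.
    subprocess.run(cmd, check=True)


def scan_and_revert(
    commit_sha: str,
    urls: Iterable[str],
    is_violation: Callable[[str], bool],
    run: Callable[[List[str]], object] = _run_git,
    notify: Callable[[str], object] = print,
) -> List[str]:
    """Revert commit_sha if any URL is flagged; return the flagged URLs."""
    flagged = [u for u in urls if is_violation(u)]
    if flagged:
        # --no-edit keeps git's default "Revert ..." commit message.
        run(["git", "revert", "--no-edit", commit_sha])
        # Stand-in for the Slack alert; the real payload is not shown in the PR.
        notify(f"Reverted {commit_sha}: {len(flagged)} flagged URL(s)")
    return flagged
```

Passing `run` and `notify` as parameters keeps the decision logic testable without touching a real repository or webhook.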

Testing

  • ✅ Successfully blocks URLs with inappropriate content
  • ✅ Allows legitimate educational content (e.g., 'adult education')
  • Auto-revert functionality tested and working
  • ✅ Slack notifications working for both alerts and reverts
  • ✅ Monthly scanning tested

Security Benefits

  • Immediate protection: Malicious content is automatically removed within minutes
  • Self-healing repositories: No manual intervention needed for security violations
  • Team awareness: Slack alerts keep teams informed of security actions
  • Audit trail: Clear revert commits show what was blocked and why

This ensures all modernization workshops created from this template have automatic security protection with immediate remediation of inappropriate content.

PJV and others added 30 commits September 15, 2025 12:44
- Automatically reverts commits containing inappropriate URLs
- Creates descriptive revert commit message
- Enhanced Slack notifications for revert actions
- Provides immediate protection against malicious content
- Detects http://, https://, protocol-relative (//), and domain-only URLs
- Scans all file types (not just .md/.html) for comprehensive coverage
- Parallel processing for improved performance with 140+ repositories
- Smart filtering to avoid false positives
- Maintains auto-revert and Slack notification functionality
- Two-layer security: Google Safe Browsing + content analysis
- Detects malware, phishing, and inappropriate content
- Fast batch checking with Google's threat database
- Fallback to content analysis for non-malware violations
- Maintains auto-revert and Slack notification functionality
- Free tier: 10,000 requests/day (perfect for 140+ repositories)
- Remove incorrect -m flag usage
- Use --message flag for proper commit message formatting
- Ensures auto-revert functionality works correctly
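The commit notes above mention detecting http://, https://, protocol-relative (//), and domain-only URLs. A minimal sketch of such an extractor is shown below. The pattern, the short TLD list for bare domains, and the name `extract_urls` are assumptions for illustration, not the expressions used in the PR.

```python
import re
from typing import List

URL_PATTERN = re.compile(
    r"""
    (?:https?:)?//[\w.-]+\.[a-z]{2,}[^\s<>"')\]]*    # http://, https://, or protocol-relative //
    |
    \b(?:[a-z0-9-]+\.)+(?:com|org|net|io|dev)\b      # bare domain with an assumed TLD list
    (?:/[^\s<>"')\]]*)?                              # optional path
    """,
    re.IGNORECASE | re.VERBOSE,
)


def extract_urls(text: str) -> List[str]:
    """Return unique URL-like strings in order of first appearance."""
    found = (m.group(0).rstrip(".,;:") for m in URL_PATTERN.finditer(text))
    # dict.fromkeys removes duplicates while preserving order.
    return list(dict.fromkeys(found))
```

Requiring a dotted host after `//` is one simple way to avoid treating code comments such as `// note` as protocol-relative URLs.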
PJV and others added 15 commits September 15, 2025 15:36
- Check URLs for inappropriate keywords before attempting to access content
- Blocks URLs like 'badsite.com/porn-content' immediately
- Maintains educational context filtering
- Provides faster detection without network requests
- Use single-line commit message to avoid git parsing issues
- Maintains essential information about security violation
- Ensures auto-revert functionality works properly
✅ URL pattern detection: Blocks inappropriate keywords in URLs
✅ Content analysis: Scans accessible URL content
✅ Google Safe Browsing: API integration ready
✅ Auto-revert: Automatically removes malicious commits
✅ Slack notifications: Alerts team of security violations
✅ Comprehensive URL extraction: Handles all URL formats
- Show detailed logs of what content is being retrieved
- Increase content analysis from 5KB to 50KB
- Display response status, content length, title, and preview
- Better error handling - don't mark failed requests as clean
- More verbose output to debug security scanning issues
- Use simple -m flag instead of --message
- Avoid special characters that cause git parsing issues
- Ensures auto-revert works reliably
- Remove custom message flags that cause parsing issues
- Use git's default revert message format
- Should fix auto-revert functionality
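Two of the commit notes above (raising the analysis window to 50KB and not marking failed requests as clean) can be illustrated with a small sketch. `ScanResult`, `scan_url`, `fetch`, and `is_bad` are hypothetical names; the fetcher is injected so the example makes no assumptions about which HTTP client the PR uses.

```python
from enum import Enum
from typing import Callable

MAX_BYTES = 50 * 1024  # analysis cap mentioned in the commit notes


class ScanResult(Enum):
    CLEAN = "clean"
    VIOLATION = "violation"
    UNKNOWN = "unknown"  # fetch failed, so the URL is not reported as clean


def scan_url(
    url: str,
    fetch: Callable[[str], bytes],
    is_bad: Callable[[str], bool],
) -> ScanResult:
    """Classify a URL from (at most) the first MAX_BYTES of its content."""
    try:
        body = fetch(url)
    except OSError:
        # Network errors surface as UNKNOWN rather than CLEAN.
        return ScanResult.UNKNOWN
    text = body[:MAX_BYTES].decode("utf-8", errors="replace")
    return ScanResult.VIOLATION if is_bad(text) else ScanResult.CLEAN
```

A caller can then decide separately how to treat `UNKNOWN` results, for example by retrying or flagging them for review.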