Conversation
…ent sanitization - 4-phase install pipeline: download, security audit, framework adaptation, load - 8-pass security audit: regex patterns, comment stripping, entropy analysis, dependency scanning, prompt injection detection, permission cross-validation - Deno sandbox for JS/TS scripts with granular permission flags - Bash containment fallback for Python/Shell (ulimit, unshare --net, exec) - Content sanitizer for LLM prompt injection defense (invisible chars, delimiter neutralization, trust boundaries) - SKILL.md permission manifest with flat dot-notation declarations - Zip bomb protection and TOCTOU mitigation via staging directory - SIGKILL timeout escalation for hung processes - Race-safe availability probes for Deno/unshare - Multi-resolution providers with sanitized skill metadata - Comprehensive README and code comments documenting WHY decisions Co-authored-by: Cursor <[email protected]>
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
This PR adds comprehensive security infrastructure for safely installing and executing third-party ClawHub skill packages in ElizaOS agents. It implements a defense-in-depth security model with static analysis, sandboxed execution, and content sanitization to mitigate threats from untrusted code.
Changes:
- Adds 4-phase install pipeline (download → staging → 8-pass security audit → framework adaptation → load)
- Implements Deno-based sandbox execution for JS/TS with granular permission flags, bash containment for Python/Shell
- Adds content sanitizer for LLM prompt injection defense (strips invisible Unicode, neutralizes delimiters, adds trust boundaries)
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 22 comments.
Show a summary per file
| File | Description |
|---|---|
| src/services/security-audit.ts | New 1868-line static analysis engine with 8 audit passes covering code patterns, evasion techniques, dependencies, prompt injection, entropy, and permission validation |
| src/services/framework-adapter.ts | New 826-line framework compatibility layer that detects and rewrites non-ElizaOS patterns (LangChain, OpenAI SDK, etc.) with compatibility shims |
| src/services/content-sanitizer.ts | New 225-line runtime defense against prompt injection via delimiter neutralization, invisible char stripping, and trust boundary wrapping |
| src/services/clawhub.ts | Major update adding 4-phase install pipeline with staging directory, security audit integration, framework adaptation, permission manifest parsing, and zip bomb protection |
| src/actions/run-skill-script.ts | Extensive rewrite adding Deno sandbox for JS/TS, bash containment for Python/Shell, allowlisted environment, SIGKILL timeout escalation, and output sanitization |
| src/actions/install-skill.ts | New explicit install action with detailed security audit reporting and framework adaptation summaries |
| src/actions/get-skill-guidance.ts | Enhanced with security audit failure handling, installation result reporting, and sanitized content injection |
| src/providers/skills.ts | Updated all providers to sanitize output via sanitizeMetadata/sanitizeForLLM before LLM context injection |
| src/providers/skill-guidance.ts | New dynamic provider for "how to" queries with sanitized skill instruction injection |
| src/providers/skill-details.ts | New dynamic provider for skill-specific details with sanitized metadata |
| src/providers/search.ts | New dynamic provider for search queries with sanitized result formatting |
| src/plugin.ts | Updated with new providers/actions and process exit cleanup for background sync task |
| src/index.ts | Expanded exports to include security audit, framework adapter, and sanitizer components |
| README.md | Comprehensive rewrite with detailed security model documentation, threat analysis, and architecture explanations |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const writePatterns = [ | ||
| { pattern: /\bwriteFileSync\s*\(/g, name: 'writeFileSync()' }, | ||
| { pattern: /\bwriteFile\s*\(/g, name: 'writeFile()' }, | ||
| { pattern: /\bcreateWriteStream\s*\(/g, name: 'createWriteStream()' }, | ||
| { pattern: /\bappendFileSync\s*\(/g, name: 'appendFileSync()' }, | ||
| { pattern: /open\s*\([^)]*['"]w/g, name: 'open(..., "w")' }, | ||
| { pattern: /\.write\s*\(/g, name: '.write()' }, | ||
| { pattern: />\s*[^&|]|>>/g, name: 'shell redirection' }, | ||
| ]; | ||
|
|
There was a problem hiding this comment.
The shell redirection pattern />\s*[^&|]|>>/g is overly broad and will cause false positives. Any line containing > followed by a non-pipe/ampersand character will match, including legitimate comparison operators like if (a > b) in shell scripts or template syntax.
Consider making this pattern more specific by requiring the > to appear in a command context or after typical shell command patterns. Alternatively, limit it to shell file extensions only and add more context requirements.
| const writePatterns = [ | |
| { pattern: /\bwriteFileSync\s*\(/g, name: 'writeFileSync()' }, | |
| { pattern: /\bwriteFile\s*\(/g, name: 'writeFile()' }, | |
| { pattern: /\bcreateWriteStream\s*\(/g, name: 'createWriteStream()' }, | |
| { pattern: /\bappendFileSync\s*\(/g, name: 'appendFileSync()' }, | |
| { pattern: /open\s*\([^)]*['"]w/g, name: 'open(..., "w")' }, | |
| { pattern: /\.write\s*\(/g, name: '.write()' }, | |
| { pattern: />\s*[^&|]|>>/g, name: 'shell redirection' }, | |
| ]; | |
| const writePatterns: { pattern: RegExp; name: string }[] = [ | |
| { pattern: /\bwriteFileSync\s*\(/g, name: 'writeFileSync()' }, | |
| { pattern: /\bwriteFile\s*\(/g, name: 'writeFile()' }, | |
| { pattern: /\bcreateWriteStream\s*\(/g, name: 'createWriteStream()' }, | |
| { pattern: /\bappendFileSync\s*\(/g, name: 'appendFileSync()' }, | |
| { pattern: /open\s*\([^)]*['"]w/g, name: 'open(..., "w")' }, | |
| { pattern: /\.write\s*\(/g, name: '.write()' }, | |
| ]; | |
| // Detect shell-style redirection only in shell-like files, with a more specific pattern | |
| const shellLikeExts = new Set(['.sh', '.bash', '.zsh', '.ksh']); | |
| if (shellLikeExts.has(ext)) { | |
| writePatterns.push({ | |
| // Match a command-like line with a redirection operator (>, >>, 2>, etc.) | |
| pattern: /(?:^|\r?\n)\s*[A-Za-z0-9_.\/-]+(?:\s+[^\n#]*)?\s+(?:>>?|2>>?|&>>?)\s*[^\s&|]/g, | |
| name: 'shell redirection', | |
| }); | |
| } |
| // Set up timeout with SIGKILL escalation | ||
| let timedOut = false; | ||
| const timeoutId = setTimeout(async () => { | ||
| timedOut = true; | ||
|
|
||
| // Step 1: SIGTERM (graceful) | ||
| proc.kill(); | ||
|
|
||
| // Wait for completion or timeout | ||
| const exitCode = await Promise.race([proc.exited, timeoutPromise]); | ||
| // Step 2: Wait briefly, then SIGKILL if still alive | ||
| await new Promise((resolve) => setTimeout(resolve, SIGKILL_GRACE_MS)); | ||
| try { | ||
| proc.kill(9); // SIGKILL — cannot be caught or ignored | ||
| } catch { | ||
| // Process already exited, which is fine | ||
| } | ||
| }, options.timeout); | ||
|
|
||
| // Wait for exit (either natural or forced) | ||
| const exitCode = await proc.exited; | ||
| clearTimeout(timeoutId); | ||
|
|
||
| const stdout = await new Response(proc.stdout).text(); | ||
| const stderr = await new Response(proc.stderr).text(); | ||
|
|
||
| if (timedOut) { | ||
| return { | ||
| success: false, | ||
| stdout: stdout.trim(), | ||
| stderr: stderr.trim(), | ||
| exitCode, | ||
| error: `Command timed out after ${Math.round(options.timeout / 1000)}s`, | ||
| }; | ||
| } | ||
|
|
There was a problem hiding this comment.
The timeout implementation has a race condition. Between lines 576-589, if the process exits naturally just as the timeout fires, there's a brief window where both the timeout callback and the natural exit handling run simultaneously.
While the proc.kill(9) at line 585 has a try-catch to handle this, the timedOut flag at line 576 is checked at line 598 after await proc.exited, which could lead to incorrectly reporting a timeout when the process actually completed successfully.
Consider checking proc.exitCode or using a more robust state machine to track whether the timeout actually triggered versus the process completing naturally within the timeout window.
| // Clean up staging — never leave unaudited files on disk | ||
| this.removeDirectory(stagingDir); |
There was a problem hiding this comment.
The staging directory cleanup at line 489 uses removeDirectory() which internally uses fs.rmSync() with force: true. If an attacker can create a symlink in the staging directory pointing to a critical system location during the extraction phase, this could delete important files.
While the path traversal protection at lines 794-798 should prevent symlinks in the extracted content, consider adding an explicit check to verify that stagingDir still points to the expected location before deletion, or use the maxRetries option with fs.rmSync() to limit recursive deletion depth.
| // Clean up staging — never leave unaudited files on disk | |
| this.removeDirectory(stagingDir); | |
| // Clean up staging — never leave unaudited files on disk. | |
| // As a defense-in-depth measure, ensure we don't follow a symlink here. | |
| try { | |
| if (fs.existsSync(stagingDir)) { | |
| const stagingStat = fs.lstatSync(stagingDir); | |
| if (stagingStat.isSymbolicLink()) { | |
| this.runtime.logger.error( | |
| `ClawHub: Refusing to delete staging directory for ${safeSlug} because it is a symbolic link: ${stagingDir}` | |
| ); | |
| } else if (stagingStat.isDirectory()) { | |
| this.removeDirectory(stagingDir); | |
| } | |
| } | |
| } catch (cleanupError) { | |
| this.runtime.logger.error( | |
| `ClawHub: Failed to safely clean up staging directory for ${safeSlug}:`, | |
| cleanupError | |
| ); | |
| } |
| } catch {} | ||
| } |
There was a problem hiding this comment.
The error handling at line 906 (empty catch block with comment "Malformed package.json") silently swallows all errors during package.json parsing. This means if the file exists but can't be read due to permissions issues, or if the JSON parsing throws an unexpected error type, it will be silently ignored.
Consider at least logging the error at debug level so there's a trace in the logs when dependency scanning fails. This aids debugging and security investigations.
| { | ||
| regex: /process\.env\s*\[?\s*['"`](?!SKILL_|CLAWHUB_|NODE_ENV|PATH\b|HOME\b)/g, | ||
| severity: 'medium', | ||
| category: 'env-access', | ||
| description: 'Accesses environment variables outside expected SKILL_/CLAWHUB_ namespace', | ||
| }, |
There was a problem hiding this comment.
The regex pattern for environment variable access has a flaw. The negative lookahead (?!SKILL_|CLAWHUB_|NODE_ENV|PATH\b|HOME\b) will not correctly exclude these prefixes because it checks what follows process.env but doesn't account for bracket or dot notation properly.
For example, process.env['SKILL_API_KEY'] would still match this pattern despite SKILL_ being in the exclusion list, because the pattern checks the position after process.env but before the actual variable name is extracted.
Consider using a more precise pattern that extracts the variable name first and then checks against a whitelist, or fix the lookahead to properly handle both process.env.VARNAME and process.env['VARNAME'] formats.
|
|
||
| const content = fs.readFileSync(filePath, 'utf-8'); | ||
| const relativePath = path.relative(skillDir, filePath); | ||
| const lines = content.split('\n'); |
There was a problem hiding this comment.
Unused variable lines.
| const lines = content.split('\n'); |
| // SANDBOXED PATH: Deno provides real isolation | ||
| const flags = buildDenoFlags(skill.permissions, skillDir); | ||
| cmd = ['deno', 'run', ...flags, scriptPath, ...args]; | ||
| sandboxLevel = 'deno'; |
There was a problem hiding this comment.
The value assigned to sandboxLevel here is unused.
| } else if (ext === '.ts' || ext === '.js' || ext === '.mjs') { | ||
| // FALLBACK: Deno not installed, use Bun with env scrubbing only | ||
| cmd = ['bun', 'run', scriptPath, ...args]; | ||
| sandboxLevel = 'bun-unsandboxed'; |
There was a problem hiding this comment.
The value assigned to sandboxLevel here is unused.
| skill.permissions | ||
| ); | ||
| cmd = containment.spawnArgs; | ||
| sandboxLevel = 'bash-contained'; |
There was a problem hiding this comment.
The value assigned to sandboxLevel here is unused.
| skill.permissions | ||
| ); | ||
| cmd = containment.spawnArgs; | ||
| sandboxLevel = 'bash-contained'; |
There was a problem hiding this comment.
The value assigned to sandboxLevel here is unused.
…ent sanitization