We actively support the following versions of docprocessor with security updates:
| Version | Supported |
|---|---|
| 1.x.x | ✅ |
| < 1.0 | ❌ |
We take the security of docprocessor seriously. If you have discovered a security vulnerability, please report it to us privately.
Please do not report security vulnerabilities through public GitHub issues.
Instead, please report them via one of the following methods:
- Email: Send details to [email protected]
- GitHub Security Advisories: Use the private vulnerability reporting feature
Please include as much of the following information as possible:
- Type of vulnerability (e.g., injection, XSS, authentication bypass)
- Full paths of source file(s) related to the manifestation of the issue
- The location of the affected source code (tag/branch/commit or direct URL)
- Any special configuration required to reproduce the issue
- Step-by-step instructions to reproduce the issue
- Proof-of-concept or exploit code (if possible)
- Impact of the issue, including how an attacker might exploit it
- We will acknowledge receipt of your vulnerability report within 48 hours
- We will send a more detailed response within 7 days indicating the next steps
- We will keep you informed about the progress toward a fix
- We may ask for additional information or guidance during the process
- We request that you give us reasonable time to address the vulnerability before public disclosure
- We will credit you in the security advisory unless you prefer to remain anonymous
- Once a fix is available, we will:
  - Release a patched version
  - Publish a security advisory
  - Credit the reporter (if permission granted)
  - Update the CHANGELOG.md
When using docprocessor, we recommend:
- Validate file paths: Always validate and sanitize file paths from user input
- Limit file sizes: Set reasonable limits on document sizes to prevent DoS
- File type verification: Verify file types match expected formats
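
The three recommendations above can be combined into a single check. This is a minimal sketch, assuming uploads live under one base directory and Python 3.9+ (for `Path.is_relative_to`). The names `validate_upload`, `MAX_BYTES`, and `MAGIC_PREFIXES` are illustrative and not part of docprocessor, and the 50 MiB limit is an arbitrary placeholder.

```python
from pathlib import Path

# Hypothetical limit, chosen for illustration only.
MAX_BYTES = 50 * 1024 * 1024  # 50 MiB

# Leading bytes ("magic numbers") for formats that have a fixed signature.
# Plain-text formats such as .txt and .md have none, so they are not listed.
MAGIC_PREFIXES = {
    ".pdf": b"%PDF-",
    ".docx": b"PK\x03\x04",  # .docx is a ZIP container
}


def validate_upload(user_input: str, base_dir: Path, max_bytes: int = MAX_BYTES) -> Path:
    """Return a resolved path inside base_dir, or raise ValueError."""
    base = base_dir.resolve()
    candidate = (base / user_input).resolve()

    # Reject paths that escape the base directory (e.g. "../../etc/passwd").
    if not candidate.is_relative_to(base):
        raise ValueError("Path escapes the upload directory")
    if not candidate.is_file():
        raise ValueError("Invalid file path")

    # Enforce the size limit before reading any file content.
    if candidate.stat().st_size > max_bytes:
        raise ValueError("File too large")

    # Where the format has a signature, check that content matches the extension.
    expected = MAGIC_PREFIXES.get(candidate.suffix.lower())
    if expected is not None:
        with candidate.open("rb") as fh:
            if fh.read(len(expected)) != expected:
                raise ValueError("File content does not match its extension")

    return candidate
```

Resolving the path before the containment check matters: comparing unresolved strings would let `..` segments or symlinks slip past the base directory test.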
```python
from pathlib import Path

# Good: Validate file path
file_path = Path(user_input).resolve()
if not file_path.is_file():
    raise ValueError("Invalid file path")

# Good: Check file extension
allowed_extensions = {'.pdf', '.txt', '.docx', '.md'}
if file_path.suffix.lower() not in allowed_extensions:
    raise ValueError("Unsupported file type")
```

- API Key Protection: Never commit API keys to version control
- Use environment variables: Store sensitive credentials in environment variables
- Implement rate limiting: Protect against abuse of LLM APIs
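
The policy does not specify a rate-limiting mechanism, so the following is only one possible approach: a generic in-process token bucket placed in front of LLM calls. `TokenBucket` is a hypothetical helper, not a docprocessor class, and the rate and capacity values are placeholders.

```python
import threading
import time


class TokenBucket:
    """Allow `rate` calls per second on average, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self._tokens = float(capacity)
        self._updated = time.monotonic()
        self._lock = threading.Lock()

    def try_acquire(self) -> bool:
        """Take one token if available; return False when the caller should back off."""
        with self._lock:
            now = time.monotonic()
            # Refill in proportion to elapsed time, capped at the bucket size.
            elapsed = now - self._updated
            self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
            self._updated = now
            if self._tokens >= 1:
                self._tokens -= 1
                return True
            return False


# Usage: check the limiter before each outbound LLM request.
limiter = TokenBucket(rate=1.0, capacity=5)
if not limiter.try_acquire():
    raise RuntimeError("Rate limit exceeded; retry later")
```

An in-process limiter only covers a single worker. Deployments with several processes would need a shared store or a limit enforced at the gateway.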
```python
import os

# Good: Use environment variables
api_key = os.getenv("OPENAI_API_KEY")
if not api_key:
    raise ValueError("API key not configured")
```

- Use strong API keys: Generate secure random API keys
- Network isolation: Run Meilisearch behind a firewall
- Enable authentication: Always require authentication in production
- Use HTTPS: Encrypt data in transit
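
Two of the points above (strong keys and HTTPS) can be checked at startup. This is a sketch using only the standard library; `generate_master_key` and `check_search_config` are hypothetical helpers, and the 16-character minimum is an assumed threshold that should be adjusted to your own policy.

```python
import secrets
from urllib.parse import urlparse


def generate_master_key(num_bytes: int = 32) -> str:
    """Return a URL-safe random key drawn from the OS CSPRNG."""
    return secrets.token_urlsafe(num_bytes)


def check_search_config(url: str, api_key: str, min_key_length: int = 16) -> None:
    """Raise ValueError if the settings look unsafe for production."""
    parsed = urlparse(url)
    if parsed.scheme != "https":
        raise ValueError("Meilisearch URL must use HTTPS in production")
    if not parsed.hostname:
        raise ValueError("Meilisearch URL has no host")
    if len(api_key) < min_key_length:
        raise ValueError("API key is missing or too short")
```

Generate the key once, store it in a secret manager or environment variable, and run the check before constructing the indexer so that a misconfigured deployment fails early.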
```python
import os

# Good: Use environment-specific configuration
indexer = MeiliSearchIndexer(
    url=os.getenv("MEILISEARCH_URL"),
    api_key=os.getenv("MEILISEARCH_API_KEY"),
    index_prefix=os.getenv("ENV_PREFIX", "prod_")
)
```

- OCR limits: Be aware that OCR processing can be resource-intensive
- Temporary file cleanup: Ensure temporary files are cleaned up
- Memory management: Monitor memory usage with large documents
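
For temporary file cleanup, one option is to keep every intermediate file inside a scratch directory that is removed automatically. The sketch below uses the standard library's `tempfile.TemporaryDirectory`; `with_scratch_dir` is a hypothetical wrapper, and it assumes your processing step can be told where to write its intermediate files.

```python
import tempfile
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def with_scratch_dir(work: Callable[[Path], T]) -> T:
    """Run work(scratch_dir), then delete the directory, even if work raises."""
    with tempfile.TemporaryDirectory(prefix="docproc_") as tmp:
        return work(Path(tmp))


def count_pages(scratch: Path) -> int:
    # Stand-in for a real step that writes intermediate files (e.g. page images).
    for i in range(3):
        (scratch / f"page_{i}.bin").write_bytes(b"\x00")
    return len(list(scratch.iterdir()))


pages = with_scratch_dir(count_pages)
```

Because the directory is deleted when the `with` block exits, the return value must not hold references to paths inside it.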
```python
# Good: Use try/finally so cleanup always runs
from pathlib import Path

def process_safely(file_path: Path):
    try:
        processor = DocumentProcessor()
        result = processor.process(file_path)
        return result
    finally:
        # Clean up any temporary files if needed
        pass
```

docprocessor depends on several third-party libraries. We:
- Regularly update dependencies to patch known vulnerabilities
- Use Dependabot to monitor for security updates
- Pin dependency versions in production deployments

Image processing (OCR):

- Malicious images: Be cautious with user-uploaded images
- Resource exhaustion: Large images can consume significant memory
- Tesseract vulnerabilities: Keep Tesseract OCR updated

PDF processing:

- Malformed PDFs: PDFs can contain malicious content
- Embedded scripts: Be aware of JavaScript in PDFs
- File size attacks: Extremely large PDFs can cause DoS

Built-in safeguards:

- File type validation: Checks file extensions before processing
- Error handling: Proper exception handling prevents information leakage
- Logging: Security-relevant events are logged (without sensitive data)
- Input sanitization: File paths and parameters are validated
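
The PDF risks listed above (embedded scripts and oversized files) can be pre-screened before a file reaches the processor. The following is a rough heuristic sketch, not a docprocessor feature: `prescreen_pdf` and `SUSPICIOUS_TOKENS` are hypothetical names, the size limit is a placeholder, and a raw byte scan will miss tokens hidden in compressed object streams or written with `#xx` name escapes.

```python
from pathlib import Path

# Name tokens that indicate scripting or auto-run actions in a PDF.
SUSPICIOUS_TOKENS = (b"/JavaScript", b"/JS", b"/OpenAction", b"/Launch")


def prescreen_pdf(path: Path, max_bytes: int = 50 * 1024 * 1024) -> list[str]:
    """Return a list of findings; an empty list means nothing was flagged."""
    findings: list[str] = []

    # Check the size first so oversized files are never read into memory.
    if path.stat().st_size > max_bytes:
        findings.append("file exceeds size limit")
        return findings

    data = path.read_bytes()
    if not data.startswith(b"%PDF-"):
        findings.append("missing %PDF- header")
    for token in SUSPICIOUS_TOKENS:
        if token in data:
            findings.append(f"contains {token.decode('ascii')}")
    return findings
```

A clean result from this check is not proof of safety. Treat it as a cheap first filter, and run the actual parsing in a resource-limited environment.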
For security and privacy reasons, docprocessor:
- Does NOT transmit your documents to external services (except your configured LLM)
- Does NOT store documents or API keys
- Does NOT log document content
- Does NOT include analytics or telemetry
Security updates are released as soon as possible after a vulnerability is confirmed. Updates are published:
- As patch releases (e.g., 1.0.1 → 1.0.2)
- With a security advisory on GitHub
- With a note in CHANGELOG.md
- Via GitHub notifications to repository watchers
To receive security updates:
- Watch the repository on GitHub
- Subscribe to release notifications
- Follow @KnowledgeInnov (if available)
We thank the following security researchers for responsibly disclosing vulnerabilities:
- (None yet - be the first!)
If you have questions about this security policy, please contact:
- Email: [email protected]
- GitHub Discussions: Security category
Last Updated: 2025-10-22 Version: 1.0