🐍 Code Explainer

An LLM-powered system that generates human-readable explanations of Python code, with retrieval-augmented generation, security validation, and monitoring built in.

🚀 Quick Start • 📖 Documentation • Tutorial • 🔧 Installation • 💡 Examples • 🤝 Contributing • 💬 Discussions

✨ Features

Device portability and intelligent explanations:

Unified DeviceManager selects the best device automatically (CUDA > MPS > CPU) with safe fallbacks
Precision control via CODE_EXPLAINER_PRECISION (fp32, fp16, bf16, 8bit)
Optional IntelligentExplanationGenerator for adaptive, audience-aware explanations
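
The selection order and precision setting above can be sketched in a few lines. This is a minimal illustration, not the project's DeviceManager: the helper names `pick_device` and `resolve_precision` are hypothetical, the availability flags are passed in rather than probed from a framework, and the `fp32` default is an assumption.

```python
import os

# Hypothetical helper names; the real DeviceManager API may differ.
VALID_PRECISIONS = ("fp32", "fp16", "bf16", "8bit")


def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return the preferred device name using the CUDA > MPS > CPU order."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"  # safe fallback when no accelerator is present


def resolve_precision(env=None, default: str = "fp32") -> str:
    """Read CODE_EXPLAINER_PRECISION, falling back to `default` if unset or invalid."""
    env = os.environ if env is None else env
    value = env.get("CODE_EXPLAINER_PRECISION", default).strip().lower()
    return value if value in VALID_PRECISIONS else default


print(pick_device(cuda_available=False, mps_available=True))  # mps
print(resolve_precision({"CODE_EXPLAINER_PRECISION": "BF16"}))  # bf16
```

Passing the availability flags as arguments keeps the ordering logic testable on machines without a GPU.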

🧠 Core AI Capabilities

Advanced AI Models: Fine-tuned CodeT5, CodeBERT, and GPT models for accurate explanations
Enhanced RAG: Retrieval-Augmented Generation with FAISS, BM25, and hybrid search
Cross-Encoder Reranking: Improved relevance with sentence-transformers rerankers
MMR Diversity: Maximal Marginal Relevance for diverse code examples
Multi-Agent Analysis: Collaborative explanations from specialized agents
Symbolic Analysis: Property-based testing and complexity analysis
Batch Processing: Efficient batch explanation with memory optimization and progress tracking
Async Processing: Non-blocking explanation generation for better responsiveness
Performance Monitoring: Real-time memory usage, GPU stats, and performance metrics
Model Optimization: Quantization support (4-bit/8-bit), gradient checkpointing, and inference optimizations
Security Features: Input validation, rate limiting, and security auditing
API v2 Endpoints: Enhanced REST API with performance monitoring, security validation, and model optimization
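
To make the MMR Diversity item concrete, here is a small pure-Python sketch of Maximal Marginal Relevance over toy 2-D embeddings. The function names, the vectors, and the `lambda_` default are illustrative assumptions; the project's retriever may compute similarity and trade-offs differently.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(query, candidates, k=2, lambda_=0.5):
    """Return indices of k candidates chosen by Maximal Marginal Relevance."""
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = cosine(query, candidates[i])
            # Penalize similarity to anything already selected.
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_ * relevance - (1 - lambda_) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 1.0]
candidates = [[1.0, 0.9], [1.0, 0.8], [0.2, 1.0]]
print(mmr(query, candidates, k=2))               # [0, 2]
print(mmr(query, candidates, k=2, lambda_=1.0))  # [0, 1]
```

With `lambda_=1.0` the ranking reduces to plain relevance; lowering it favors the less redundant third candidate over the near-duplicate second one.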

🎯 Smart Analysis & Prompting

Multiple Strategies: vanilla, ast_augmented, retrieval_augmented, execution_trace, and enhanced_rag
Code Understanding: Support for functions, classes, algorithms, and data structures
Complexity Analysis: Automatic time/space complexity detection
Error Pattern Recognition: Common bug identification and debugging suggestions
Intelligent Augmentation: Automatic function name and recursion hints for robustness
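
The function-name and recursion hints mentioned above can be derived with the standard `ast` module. The sketch below is an assumption about how such hints might be collected (the helper `collect_hints` is hypothetical) and only detects direct recursion through a plain name call.

```python
import ast


def collect_hints(source: str) -> dict:
    """Map each function name in `source` to whether it calls itself."""
    tree = ast.parse(source)
    hints = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # A call whose target is a bare name equal to the function's own name.
            calls_itself = any(
                isinstance(inner, ast.Call)
                and isinstance(inner.func, ast.Name)
                and inner.func.id == node.name
                for inner in ast.walk(node)
            )
            hints[node.name] = calls_itself
    return hints


code = """
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

def add(a, b):
    return a + b
"""
print(collect_hints(code))  # {'fibonacci': True, 'add': False}
```

Hints like these could be prepended to a prompt so the model is told up front which functions are recursive.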

🌐 Production-Ready Interfaces

REST API: FastAPI with Prometheus metrics, rate limiting, and health checks
Web UI: Streamlit and Gradio interfaces for interactive exploration
CLI Tools: Comprehensive command-line interface with rich output
Python SDK: Direct integration for developers

Compatibility Notes

Small compatibility shims have been added to improve testability and backwards compatibility with prior consumer code. Notable shims include:

CodeExplainer.explain_code_with_symbolic(...) — convenience method that returns combined symbolic + textual explanations.
CodeExplainerTrainer accepts a config_path parameter for older callers.
clear_model_cache and get_model_cache_info are exported at the package level for convenience when managing cached model artifacts.

🔒 Security & Safety

Code Redaction: Automatic PII and credential detection and redaction
Security Validation: AST-based dangerous pattern detection
Safe Execution: Sandboxed code execution with resource limits
Input Validation: Comprehensive request validation and sanitization
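
As an illustration of AST-based dangerous pattern detection, the following sketch walks a parsed module and flags calls on a small deny-list. The deny-lists and the function name `find_dangerous_calls` are assumptions for this example, not the rule set shipped with the project.

```python
import ast

# Illustrative deny-lists; an assumption, not the project's actual rule set.
DANGEROUS_NAMES = frozenset({"eval", "exec", "compile", "__import__"})
DANGEROUS_ATTRS = frozenset({("os", "system"), ("subprocess", "Popen")})


def find_dangerous_calls(source: str) -> list:
    """Return sorted (line, name) pairs for calls matching the deny-lists."""
    issues = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Name) and func.id in DANGEROUS_NAMES:
            issues.append((node.lineno, func.id))
        elif (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and (func.value.id, func.attr) in DANGEROUS_ATTRS
        ):
            issues.append((node.lineno, f"{func.value.id}.{func.attr}"))
    return sorted(issues)


sample = "import os\nos.system('ls')\nx = eval('1 + 1')\n"
print(find_dangerous_calls(sample))  # [(2, 'os.system'), (3, 'eval')]
```

Because the code is only parsed and never executed, this kind of check is safe to run on untrusted input; it will not catch aliased imports or dynamically built names.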

📊 Monitoring & Observability

Prometheus Metrics: API performance, error rates, and P95/P99 latencies
Grafana Dashboard: Pre-built monitoring dashboards
Structured Logging: JSON logging with request IDs and tracing
Health Checks: Comprehensive service health monitoring
Performance Monitoring: Memory usage, GPU utilization, and cache statistics
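
Structured JSON logging with request IDs can be approximated with the standard library alone. The formatter and logger names below are hypothetical; this is a sketch of the idea, not the project's StructuredLogger.

```python
import io
import json
import logging
import uuid


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # request_id is attached per call through the `extra` argument.
            "request_id": getattr(record, "request_id", None),
        }
        return json.dumps(payload)


def make_logger(stream) -> logging.Logger:
    """Build a logger that writes JSON lines to `stream`."""
    logger = logging.getLogger("code_explainer.sketch")
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


buffer = io.StringIO()
log = make_logger(buffer)
log.info("explanation generated", extra={"request_id": str(uuid.uuid4())})
print(buffer.getvalue().strip())
```

Each line is a self-contained JSON object, so log aggregators can filter by `request_id` without custom parsing.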

🗄️ Advanced Caching System

Multi-Level Caching: Explanation cache, embedding cache, and advanced cache with strategies
Cache Strategies: LRU, LFU, FIFO, Size-based, and Adaptive eviction policies
Persistence: Disk-backed caching with TTL and compression
Invalidation: Tag-based, time-based, version-based, and content-based invalidation
Cache Metrics: Hit rates, access times, and eviction statistics
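
The sketch below combines two of the ideas listed above, LRU eviction and TTL expiry, and tracks hit and miss counts. The class name `LruTtlCache` and its parameters are assumptions made for illustration; the project's cache classes are likely richer (persistence, compression, tags).

```python
import time
from collections import OrderedDict


class LruTtlCache:
    """Small in-memory cache with LRU eviction and per-entry TTL."""

    def __init__(self, max_items=2, ttl_seconds=60.0, clock=time.monotonic):
        self.max_items = max_items
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self.hits = 0
        self.misses = 0
        self._items = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None or entry[0] <= self.clock():
            self._items.pop(key, None)  # drop expired entries lazily
            self.misses += 1
            return None
        self._items.move_to_end(key)  # mark as most recently used
        self.hits += 1
        return entry[1]

    def put(self, key, value):
        self._items[key] = (self.clock() + self.ttl_seconds, value)
        self._items.move_to_end(key)
        while len(self._items) > self.max_items:
            self._items.popitem(last=False)  # evict least recently used


cache = LruTtlCache(max_items=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" becomes most recently used
cache.put("c", 3)  # evicts "b"
print(cache.get("b"), cache.get("a"), cache.hits, cache.misses)  # None 1 2 1
```

The injectable `clock` makes expiry deterministic in tests; a hit rate follows as `hits / (hits + misses)`.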

🧪 Advanced Evaluation & Testing

Traditional Metrics: BLEU, ROUGE-L, BERTScore, CodeBLEU for quantitative assessment
LLM-as-a-Judge: Multi-judge consensus evaluation with GPT-4 and Claude
Preference Learning: Pairwise comparisons and Bradley-Terry ranking
Contamination Detection: Comprehensive data leakage detection (exact, n-gram, semantic)
Robustness Testing: Adversarial testing with 7 transformation types
Comprehensive CLI: Full evaluation pipeline with detailed reporting
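
One of the contamination checks named above, n-gram overlap, can be sketched as follows. Whitespace tokenization, the trigram size, and the function names are assumptions; the project's detector may tokenize and threshold differently.

```python
def ngrams(text: str, n: int = 3) -> set:
    """Return the set of whitespace-token n-grams in `text`."""
    tokens = text.split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def ngram_overlap(train_text: str, test_text: str, n: int = 3) -> float:
    """Fraction of the test sample's n-grams that also occur in the train sample."""
    test_grams = ngrams(test_text, n)
    if not test_grams:
        return 0.0
    return len(test_grams & ngrams(train_text, n)) / len(test_grams)


train = "def add(a, b): return a + b"
test = "def add(a, b): return a - b"
print(round(ngram_overlap(train, test), 2))  # 0.6
```

A high overlap between a test sample and any training sample suggests leakage; exact-match and semantic checks cover the cases this token-level measure misses.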

🔮 Continuous Integration & Deployment

Quality Assurance: Automated testing with pytest, coverage, and type checking
Release Automation: Automated releases with changelogs and semantic versioning
Pre-commit Hooks: Code formatting, linting, and security checks
Multi-environment Testing: Testing across Python 3.9, 3.10, 3.11, 3.12
Setup Validation: Automated configuration and environment validation

🎯 Developer Experience

mkdocs Documentation: Comprehensive documentation site with examples
Development Containers: VS Code devcontainer for instant setup
Makefile Automation: Common tasks simplified with make commands
nbstripout: Clean notebook commits without outputs

🚀 Quick Start

Installation

# Install Poetry (if not already installed)
curl -sSL https://install.python-poetry.org | python3 -

# Install from source with Poetry
git clone https://github.com/rajatsainju2025/code-explainer.git
cd code-explainer
poetry install

# For RAG features (optional)
poetry install --with rag

# For development
poetry install --with dev

# For all optional dependencies
poetry install --with rag,metrics,monitoring,dev

Alternative: Install from PyPI

# Basic installation
pip install code-explainer

# With RAG features
pip install code-explainer[rag]

# With all optional features
pip install code-explainer[all]

Basic Usage

from code_explainer import CodeExplainer

# Initialize the explainer
explainer = CodeExplainer()

# Explain some code
code = """
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)
"""

explanation = explainer.explain_code(code)
print(explanation)

Web Interface

# Start the FastAPI server
python -m code_explainer.cli_commands.main serve

# Or use Streamlit (if available)
# streamlit run streamlit_app.py

# Or use Gradio (if available)
# python -c "import gradio as gr; gr.Interface(...)"

CLI Usage

# Explain a file
python -m code_explainer.cli_commands.main explain --file examples/fibonacci.py

# Use different strategies
python -m code_explainer.cli_commands.main explain --file mycode.py --prompt-strategy vanilla
python -m code_explainer.cli_commands.main explain --file mycode.py --prompt-strategy ast_augmented
python -m code_explainer.cli_commands.main explain --file mycode.py --prompt-strategy retrieval_augmented

# Run evaluations
python -m code_explainer.cli_commands.main eval --dataset humaneval --model codet5-small

# Research-driven evaluation (contamination, dynamic, multi-agent, adversarial)
python -c "
from code_explainer.research_evaluation_orchestrator import ResearchEvaluationOrchestrator
# 'model' and 'dataset' are placeholders: load your own model and dataset before this call
orchestrator = ResearchEvaluationOrchestrator()
results = orchestrator.run_evaluation(model, dataset)
print(results)
"

# Check security
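python -c "
from code_explainer.security import CodeSecurityValidator
# Read the code to check from a file (mycode.py is an example path)
user_code = open('mycode.py').read()
validator = CodeSecurityValidator()
is_safe, issues = validator.validate_code(user_code)
print('Safe:', is_safe, 'Issues:', issues)
"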

# Run golden tests
python -m code_explainer.cli_commands.main test

For a 15-minute walkthrough, see the Zero to Results tutorial: docs/tutorials/zero_to_results.md

📊 Performance & Benchmarks

Metric	CodeT5-Small	CodeT5-Base	GPT-3.5-Turbo	Our Enhanced RAG
BLEU-4	0.42	0.48	0.55	0.61
ROUGE-L	0.38	0.44	0.52	0.58
BERTScore	0.71	0.76	0.82	0.85
CodeBLEU	0.35	0.41	0.48	0.54
Human Rating	3.2/5	3.6/5	4.1/5	4.4/5

Benchmarked on HumanEval and MBPP datasets with human evaluators.

🧪 Advanced Evaluation Framework

Our evaluation system implements state-of-the-art assessment methods following open evaluation best practices:

Traditional Metrics

# Comprehensive traditional metrics
code-explainer evaluate \
  --test-data test.jsonl \
  --predictions predictions.jsonl \
  --metrics bleu rouge bertscore codebleu

LLM-as-a-Judge Evaluation

# Multi-judge consensus evaluation
code-explainer eval-llm-judge \
  --test-data test.jsonl \
  --predictions predictions.jsonl \
  --judges gpt-4 claude-3-sonnet \
  --criteria accuracy clarity completeness

Contamination Detection

# Detect data leakage between train/test
code-explainer eval-contamination \
  --train-data train.jsonl \
  --test-data test.jsonl \
  --methods exact ngram substring semantic

Robustness Testing

# Test model robustness under adversarial conditions
code-explainer eval-robustness \
  --test-data test.jsonl \
  --model-path ./results \
  --test-types typo case whitespace punctuation \
  --severity-levels 0.05 0.1 0.2

Preference-Based Evaluation

# Compare models using pairwise preferences
code-explainer eval-preference \
  --test-data test.jsonl \
  --predictions-a model_a.jsonl \
  --predictions-b model_b.jsonl \
  --use-bradley-terry
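
For intuition about what a Bradley-Terry ranking computes from pairwise preferences, here is a minimal sketch using the standard iterative (minorization-maximization) update. The function name, the iteration count, and the win counts are hypothetical; the CLI's own implementation may differ.

```python
def bradley_terry(wins, iterations=100):
    """Estimate Bradley-Terry strengths from wins[(winner, loser)] = count."""
    models = sorted({m for pair in wins for m in pair})
    strength = {m: 1.0 for m in models}
    for _ in range(iterations):
        updated = {}
        for i in models:
            total_wins = sum(c for (w, _l), c in wins.items() if w == i)
            denom = 0.0
            for j in models:
                if j == i:
                    continue
                games = wins.get((i, j), 0) + wins.get((j, i), 0)
                denom += games / (strength[i] + strength[j])
            updated[i] = total_wins / denom if denom else strength[i]
        # Normalize so the strengths sum to 1.
        norm = sum(updated.values())
        strength = {m: s / norm for m, s in updated.items()}
    return strength


# Hypothetical preference counts: model_a preferred in 7 of 10 comparisons.
wins = {("model_a", "model_b"): 7, ("model_b", "model_a"): 3}
scores = bradley_terry(wins)
print({m: round(s, 2) for m, s in scores.items()})  # {'model_a': 0.7, 'model_b': 0.3}
```

With two models the strengths reduce to the observed win fractions; the model becomes informative once three or more systems are compared on overlapping pairs.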

📖 See our Advanced Evaluation Tutorial for comprehensive examples and best practices.

🏗️ Architecture

graph TB
    A[Code Input] --> B[Security Validation]
    B --> C[AST Analysis]
    C --> D[Strategy Selection]

    D --> E1[Vanilla LLM]
    D --> E2[AST-Augmented]
    D --> E3[Enhanced RAG]
    D --> E4[Multi-Agent]

    E3 --> F[Vector Store]
    E3 --> G[BM25 Index]
    E3 --> H[Cross-Encoder Reranker]

    E1 --> I[Response Synthesis]
    E2 --> I
    E3 --> I
    E4 --> I

    I --> J[Quality Validation]
    J --> K[Security Redaction]
    K --> L[Final Explanation]

🔧 Configuration

The system is highly configurable through YAML files:

# configs/custom.yaml
model:
  name: "microsoft/CodeGPT-small-py"
  max_length: 512
  temperature: 0.7

training:
  num_train_epochs: 100
  per_device_train_batch_size: 8
  learning_rate: 5e-5

prompt:
  template: "Explain this Python code:\n```python\n{code}\n```\nExplanation:"
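
To show how the `prompt.template` value above is meant to be filled, the sketch below substitutes a snippet into the `{code}` placeholder. The helper `build_prompt` is hypothetical, and the template is reproduced in Python rather than loaded from the YAML file.

```python
# The fence marker is built programmatically so this sketch stays easy to embed.
FENCE = "`" * 3
TEMPLATE = (
    "Explain this Python code:\n"
    + FENCE + "python\n{code}\n" + FENCE
    + "\nExplanation:"
)


def build_prompt(code: str, template: str = TEMPLATE) -> str:
    """Substitute the code snippet into the {code} placeholder."""
    # str.replace leaves any other braces in a custom template untouched.
    return template.replace("{code}", code.strip())


print(build_prompt("def add(a, b):\n    return a + b"))
```

A custom template only needs to keep the `{code}` placeholder; everything around it is free text.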

📦 Model Presets

Use ready-made presets to switch models quickly:

Preset	Arch	Base Model	Config	Train	Evaluate
DistilGPT-2 (default)	causal	distilgpt2	`configs/default.yaml`	`cx-train -c configs/default.yaml`	`code-explainer eval -c configs/default.yaml`
CodeT5 Small	seq2seq	Salesforce/codet5-small	`configs/codet5-small.yaml`	`cx-train -c configs/codet5-small.yaml`	`code-explainer eval -c configs/codet5-small.yaml`
CodeT5 Base	seq2seq	Salesforce/codet5-base	`configs/codet5-base.yaml`	`cx-train -c configs/codet5-base.yaml`	`code-explainer eval -c configs/codet5-base.yaml`
CodeGPT Small (CodeBERT family)	causal	microsoft/CodeGPT-small-py	`configs/codebert-base.yaml`	`cx-train -c configs/codebert-base.yaml`	`code-explainer eval -c configs/codebert-base.yaml`
StarCoderBase 1B	causal	bigcode/starcoderbase-1b	`configs/starcoderbase-1b.yaml`	`cx-train -c configs/starcoderbase-1b.yaml`	`code-explainer eval -c configs/starcoderbase-1b.yaml`
StarCoder2 Instruct	causal	bigcode/starcoder2-3b	`configs/starcoder2-instruct.yaml`	`cx-train -c configs/starcoder2-instruct.yaml`	`code-explainer eval -c configs/starcoder2-instruct.yaml`
CodeLlama Instruct	causal	codellama/CodeLlama-7b-Instruct-hf	`configs/codellama-instruct.yaml`	`cx-train -c configs/codellama-instruct.yaml`	`code-explainer eval -c configs/codellama-instruct.yaml`

Data paths in each config default to the tiny examples in data/. Override any path via CLI flags (e.g., --data for training or --test-file for eval).

📖 Documentation

Training Your Own Model

from code_explainer import CodeExplainerTrainer

# Initialize trainer with custom config
trainer = CodeExplainerTrainer("configs/custom.yaml")

# Train on custom dataset
trainer.train(data_path="data/my_dataset.json")

Advanced Usage

# Batch processing (Python; reuses the explainer instance from Basic Usage)
codes = ["print('hello')", "x = [1,2,3]", "def add(a,b): return a+b"]
explanations = explainer.explain_code_batch(codes)

# Prompt strategy from the CLI: pass --prompt-strategy (see CLI Usage)
# Prompt strategy from the API:
# POST /explain {"code": "...", "strategy": "ast_augmented"}

# A/B compare strategies (shell)
python scripts/ab_compare_strategies.py --config configs/default.yaml --max-samples 5 \
  --strategies vanilla ast_augmented retrieval_augmented

🧩 Prompt Strategies

See docs/strategies.md for details on: vanilla | ast_augmented | retrieval_augmented | execution_trace, including safety notes and examples.

💡 Examples

See quick-start examples in examples/ (training, evaluation, and serving with presets). Start here:

examples/README.md
examples/preset_switching.md
examples/eval_report_template.md

Contribute examples/data: see the discussion “Call for community samples (tiny datasets)” in the Discussions tab.

📝 Example Explanations

Input:

class BankAccount:
    def __init__(self, balance=0):
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount
        return self.balance

Output:

This code defines a BankAccount class that represents a simple bank account. The __init__ method initializes the account with an optional starting balance (defaulting to 0). The deposit method adds money to the account and returns the new balance.

🛠️ Development

git clone https://github.com/rajatsainju2025/code-explainer.git
cd code-explainer

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install in development mode
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install

Additional tools:

Makefile targets: install, format, lint, type, precommit, test, clean
Devcontainer: .devcontainer/devcontainer.json for a ready-made VS Code container

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=code_explainer --cov-report=html

# Run specific test
pytest tests/test_model.py::test_explain_code

For scope, speed, and coverage goals, see the testing strategy discussion: .github/DISCUSSIONS.md.

Planning & Roadmap

Plan review: docs/plan_review.md
Roadmap: NEXT_PHASE_ROADMAP.md
Reimagination: REIMAGINE.md

Code Quality

# Format code
black src/ tests/

# Sort imports
isort src/ tests/

# Type checking
mypy src/

# Build image
docker build -t code-explainer .

# Run web interface
# Run training
docker run -v $(pwd)/data:/app/data code-explainer train --data /app/data/train.json

📈 Roadmap

Multi-language Support: JavaScript, Java, C++, etc.
Advanced Models: Integration with CodeT5, CodeBERT, StarCoder
VS Code Extension: Direct integration with development environment
API Service: RESTful API for integration with other tools
Performance Optimization: Model quantization and optimization
Enterprise Features: Authentication, usage tracking, custom deployments

📅 20-Push Improvement Plan Status

Current Status: Push 15/20 Complete ✅

✅ Completed Pushes (15/20)

Push 1-5: Initial setup and core improvements
Push 6-10: Advanced caching and batch processing
Push 11: Logging enhancements
Push 12: Performance optimizations (quantization, gradient checkpointing, memory monitoring)
Push 13: Security enhancements (rate limiting, input validation, security auditing)
Push 14: API improvements (v2 endpoints for health, performance, security validation)
Push 15: Testing expansions (comprehensive integration tests for all new features)

🔄 Remaining Pushes (16-20)

Push 16: Documentation updates
Push 17: CI/CD enhancements
Push 18: Performance benchmarking
Push 19: Production deployment
Push 20: Final integration and release

Track progress in IMPLEMENTATION_STATUS.md

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details. When your change is ready, push to your branch (git push origin feature/amazing-feature).

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Hugging Face for the amazing Transformers library
OpenAI for GPT model architecture inspiration
The open-source community for various tools and libraries
Author: Rajat Sainju
Email: your.email@example.com
GitHub: @rajatsainju2025
Project Link: https://github.com/rajatsainju2025/code-explainer

💬 Join the community

Start here: #4
General Q&A and ideas: Discussions tab

API (FastAPI)

Run the FastAPI server (example):

uvicorn code_explainer.api.server:app --host 0.0.0.0 --port 8000

Endpoints:

GET /health → {"status": "ok"}
GET /version → {"version": "<version string>"}
GET /strategies → list of supported strategies
POST /explain {code: str, strategy?: str} → {explanation: str}

More: see docs/api.md and docs/strategies.md.
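
A client call against the POST /explain endpoint could look like the sketch below, using only the standard library. It assumes the server from the uvicorn command above is listening on localhost port 8000; the helper `build_explain_request` is hypothetical, and the request is built but not sent so the example runs without a server.

```python
import json
import urllib.request

# Assumes the server from the uvicorn command above is listening locally.
BASE_URL = "http://localhost:8000"


def build_explain_request(code, strategy=None, base_url=BASE_URL):
    """Build (but do not send) a POST /explain request."""
    payload = {"code": code}
    if strategy is not None:
        payload["strategy"] = strategy  # optional field per the endpoint list
    return urllib.request.Request(
        url=f"{base_url}/explain",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_explain_request("print('hello')", strategy="ast_augmented")
print(request.get_method(), request.full_url)  # POST http://localhost:8000/explain

# With a running server, the call itself would look like:
# with urllib.request.urlopen(request, timeout=30) as response:
#     print(json.loads(response.read())["explanation"])
```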

⚡ Performance Optimizations (v2.4.0)

Code Explainer includes comprehensive performance optimizations for production deployments:

JSON Serialization

orjson Integration: 3-10x faster JSON serialization across Redis caching, security, retrieval, config loading, and data governance modules
Shared Utilities: Centralized json_loads/json_dumps in utils/hashing.py for consistency

Memory Efficiency

__slots__ Everywhere: 20-30% memory reduction on CacheStats, CacheConfig, RetrievalConfig, CodeExplainerException, DatabaseConfig, DataGovernanceConfig, StructuredLogger
frozenset Lookups: O(1) validation for strategies, languages, and retrieval methods
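
The two memory-efficiency techniques above can be illustrated together. The `CacheStats` class here is a simplified stand-in for the project's class of the same name, and the strategy names are taken from the feature list; field names and the validation helper are assumptions.

```python
# Strategy names are taken from the feature list; the class is a simplified stand-in.
VALID_STRATEGIES = frozenset({
    "vanilla", "ast_augmented", "retrieval_augmented",
    "execution_trace", "enhanced_rag",
})


class CacheStats:
    """Counter object without a per-instance __dict__."""

    __slots__ = ("hits", "misses")

    def __init__(self):
        self.hits = 0
        self.misses = 0

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def validate_strategy(name: str) -> str:
    """Average-case O(1) membership check against the frozenset."""
    if name not in VALID_STRATEGIES:
        raise ValueError(f"unknown strategy: {name!r}")
    return name


stats = CacheStats()
stats.hits, stats.misses = 3, 1
print(stats.hit_rate, hasattr(stats, "__dict__"))  # 0.75 False
print(validate_strategy("vanilla"))  # vanilla
```

Declaring `__slots__` removes the per-instance dictionary, which is where the memory saving comes from; the trade-off is that attributes outside the declared slots cannot be added later.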

Timing & Caching

perf_counter() Timing: Sub-millisecond precision for API latency measurements
@lru_cache Confidence: Cached multi-agent confidence computations
AST Caching: Bounded cache for parsed syntax trees in symbolic analyzer
Precompiled Regex: Input sanitization patterns compiled once at module load
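
The timing and caching techniques listed above map onto standard-library tools. The sketch below is illustrative only: the regex pattern, the toy `confidence` function, and the `timed` helper are assumptions rather than the project's code.

```python
import re
import time
from functools import lru_cache

# Compiled once at import time; the pattern itself is only an illustration.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def sanitize(text: str) -> str:
    """Strip control characters other than tab, newline, and carriage return."""
    return _CONTROL_CHARS.sub("", text)


@lru_cache(maxsize=128)
def confidence(agreeing: int, total: int) -> float:
    """Toy confidence score; repeated arguments are served from the cache."""
    return agreeing / total if total else 0.0


def timed(func, *args):
    """Return (result, elapsed_seconds) measured with time.perf_counter()."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


print(repr(sanitize("x = 1\x00\n")))  # 'x = 1\n'
confidence(3, 4)
confidence(3, 4)  # second call is a cache hit
print(confidence.cache_info().hits)  # 1
result, elapsed = timed(confidence, 3, 4)
print(result, elapsed >= 0.0)  # 0.75 True
```

`time.perf_counter()` is monotonic and high resolution, which makes it a better fit for latency measurement than `time.time()`.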

Code Quality

Named Constants: ONE_HOUR, TWO_HOURS, ONE_DAY for TTL; _MIN_CODE_LENGTH_FOR_CACHE, etc. for symbolic analyzer
Type Safety: Optional[T] annotations throughout error handling and logging
__all__ Exports: Explicit public API in validation and cache modules

Run benchmarks to validate:

python scripts/benchmark_hashing.py
python benchmarks/benchmark_inference.py
