This document outlines the 12-week implementation roadmap for transforming the FreeCAD LLM automation system into a production-ready, multi-agent architecture with advanced 3D understanding, real-time collaboration, and industrial compliance.
Target Metrics:
- Response Time: <5s per design iteration
- Concurrent Users: 100+
- Success Rate: >85% on complex CAD tasks
- FEA Integration: Automated stress/thermal analysis
| Component | Technology | Purpose |
|---|---|---|
| Orchestration | Docker, Kubernetes | Container management, scaling |
| State Management | Redis (TTL Cache), Redis Streams | Real-time state, audit trails |
| CAD Engine | FreeCAD (Headless), Pivy | 3D modeling execution |
| Distributed Compute | Ray | Parallel agent execution |
| Vector Store | Milvus/FAISS | Geometric embeddings |
| Message Bus | LangGraph/AutoGen | Agent communication |
| API Gateway | FastAPI | RESTful interface |
1. Planner Agent
- Technology: Chain-of-Thought (CoT) reasoning, RAG with ISO/DIN standards
- Vector DB: Milvus for design pattern retrieval
- FreeCAD Interaction: Queries DOM tree for state-aware task decomposition
- Output: Task dependency graph (NetworkX JSON format)
2. Generator Agent
- Technology: CodeLlama (Fine-tuned with LoRA), Llama3
- Specialization: PartDesign body workflow (Sketch → Pad → Pocket)
- FreeCAD API: Direct Python scripting with constraint enforcement
- Error Handling: Dynamic indexing for Sketcher constraints
3. Validator Agent
- Technology:
  - Geometric: CalculiX (FEA), Gmsh (meshing)
  - Visual: GPT-4V multimodal critique
- Geometric Analysis: Open3D for manifold checks
- FreeCAD Interaction: Exports STL/STEP for external validation
- Output: Pass/fail + detailed feedback JSON
4. Orchestrator Agent
- Technology: LangGraph state machine, exponential backoff
- Responsibility:
  - Recompute monitoring
  - Topological naming error recovery
  - Loop convergence (max 10 iterations)
  - Human-in-the-loop triggers
User Prompt
↓
[Planner Agent] → Task Graph JSON
↓
[Generator Agent] → FreeCAD Python Script
↓
[Headless FreeCAD Execution] → DOM Update
↓
[State Encoder]
├─→ BERT (FreeCAD XML)
├─→ PointNet++ (Mesh geometry)
└─→ GraphSAGE (B-Rep features)
↓
[Vector DB Storage] + [Redis Cache]
↓
[Validator Agent]
├─→ Geometric (manifold, tolerances)
├─→ Physical (FEA simulation)
└─→ Visual (GPT-4V critique)
↓
[Orchestrator Decision]
├─→ Success → Export (STEP/IGES)
├─→ Minor Issues → Refine (loop)
└─→ Critical Failure → Human Review
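The three-way Orchestrator decision at the end of the pipeline can be sketched as a small routing function. This is a minimal sketch, not the project's implementation: the `passed` and `severity` fields and their values are assumptions, and the iteration cap comes from the "max 10 iterations" rule stated for the Orchestrator.

```python
from typing import TypedDict

MAX_ITERATIONS = 10  # loop convergence limit stated for the Orchestrator


class ValidationResult(TypedDict):
    passed: bool
    severity: str  # hypothetical values: "none", "minor", "critical"


def should_continue(validation: ValidationResult, iteration: int) -> str:
    """Map a validation result to one of the three pipeline branches."""
    if validation["passed"]:
        return "success"
    if validation["severity"] == "minor" and iteration < MAX_ITERATIONS:
        return "refine"
    # Critical failures, or minor issues that did not converge in time.
    return "human_review"
```

Routing non-converging refinements to human review keeps the loop bounded even when the Validator keeps reporting only minor issues.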
Hierarchical DOM Encoding:
{
  "document": {
    "bodies": [
      {
        "id": "Body001",
        "features": [
          {"type": "Sketch", "constraints": [...], "geometry": [...]},
          {"type": "Pad", "length": 50, "direction": "Z"}
        ]
      }
    ],
    "embeddings": {
      "text": [...],      # BERT 768-dim
      "geometry": [...],  # PointNet 1024-dim
      "topology": [...]   # GraphSAGE 256-dim
    }
  }
}

Storage:
- Hot State: Redis (TTL 1hr)
- Long-term: Milvus (vector search)
- Audit Trail: Redis Streams (immutable log)
Objectives:
- Eliminate GUI dependencies in production
- Implement robust error handling for topological naming
Tasks:
- Xvfb Integration (tools/gui/headless_manager.py)
  - Virtual display wrapper for FreeCAD.Gui
  - Fallback to console-only mode
  - Process lifecycle management
- Recompute Error Handling (core/freecad/document_manager.py)
  - try/except wrapper for doc.recompute()
  - Exponential backoff (3 attempts)
  - State rollback on failure
  - Detailed error logging with DOM snapshot
- Testing:
  - 50 complex models with circular references
  - Memory leak profiling (target: <100MB growth/hour)
Deliverables:
- Headless FreeCAD runs in Docker without a physical X11 display (Xvfb or console-only mode)
- 95% recompute success rate on test dataset
- CI/CD pipeline with headless tests
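The recompute error handling described above (three attempts, exponential backoff, rollback on failure) can be sketched as a generic retry helper. This is a sketch under assumptions: `recompute` and `rollback` are hypothetical callables supplied by the caller, and `recompute` is assumed to raise on failure. FreeCAD can also flag failed objects without raising, so a real wrapper would likely need to inspect document state as well.

```python
import time
from typing import Callable


def recompute_with_backoff(
    recompute: Callable[[], None],
    rollback: Callable[[], None],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry `recompute`, rolling back state after each failure.

    Returns True on success and False once all attempts are exhausted.
    """
    for attempt in range(attempts):
        try:
            recompute()
            return True
        except Exception as exc:  # error types vary; narrow this in practice
            print(f"recompute attempt {attempt + 1} failed: {exc}")
            rollback()
            if attempt < attempts - 1:
                # Delay doubles each time: base_delay, 2 * base_delay, ...
                sleep(base_delay * 2 ** attempt)
    return False
```

Passing `sleep` as a parameter keeps the helper testable without real delays.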
Objectives:
- Transition from monolithic to multi-agent architecture
- Implement inter-agent communication
Tasks:
- Agent Refactoring (core/agents/)
  agents/
  ├── planner.py        # CoT + RAG
  ├── generator.py      # CodeLlama wrapper
  ├── validator.py      # Multi-modal validation
  └── orchestrator.py   # LangGraph state machine
- LangGraph Integration
  - Define state schema (TypedDict)
  - Implement conditional edges (validation pass/fail)
  - Configure message passing protocol
- Redis Message Bus (core/messaging/redis_bus.py)
  - Pub/Sub channels per agent
  - Request/response pattern with correlation IDs
  - Dead letter queue for failures
Deliverables:
- 4 independent agents with clean interfaces
- LangGraph workflow executes simple box model
- Redis integration tests (concurrent agents)
Planner Agent:
- RAG Implementation (agents/planner/rag_engine.py)
  - Vector DB: Milvus collection for ISO/DIN standards
  - Embeddings: BERT-base on technical documentation
  - Retrieval: Top-k=5 relevant design patterns per query
  - Chunking Strategy: 512 tokens with 50-token overlap
- Task Graph Generation (agents/planner/task_decomposer.py)
  - NetworkX directed acyclic graph (DAG)
  - Dependencies: geometric (parent/child), logical (sequence)
  - JSON serialization for Generator consumption
  - Cycle detection + topological sort validation
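The task graph validation step (cycle detection plus topological sort, then JSON serialization) can be illustrated with the standard library alone. Note the substitution: this sketch uses `graphlib` from the Python standard library as a stand-in for NetworkX, and the task names are hypothetical.

```python
import json
from graphlib import CycleError, TopologicalSorter

# Hypothetical task graph: each task maps to the tasks it depends on.
task_graph = {
    "sketch_base": [],
    "pad_base": ["sketch_base"],
    "sketch_hole": ["pad_base"],
    "pocket_hole": ["sketch_hole"],
}


def execution_order(graph: dict[str, list[str]]) -> list[str]:
    """Return tasks in dependency order; raise ValueError on a cycle."""
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError(f"task graph contains a cycle: {exc.args[1]}") from exc


order = execution_order(task_graph)
# Serialized form that a Generator could consume.
payload = json.dumps({"tasks": task_graph, "order": order})
```

Rejecting cyclic graphs before serialization means the Generator never receives a plan it cannot sequence.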
Generator Agent:
- PartDesign Workflow Enforcement (agents/generator/body_workflow.py)
  - Template library: Box, Cylinder, Loft, Sweep
  - Constraint validation (Sketch closure, redundancy)
  - Dynamic variable indexing (geo_id mapping)
  - Attachment offset calculations
- Code Generation (agents/generator/llm_wrapper.py)
  - Model: CodeLlama-13B (4-bit quantization)
  - Prompting: Few-shot examples from curated dataset
  - Output Parsing: AST validation before execution
  - Safety: Whitelist FreeCAD API calls only
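The "AST validation before execution" and whitelist steps could look roughly like the check below. It is a sketch under assumptions: the allowed module set and the forbidden built-ins are illustrative choices, and it combines an import allow-list with a call deny-list, which is weaker than a full API whitelist. A static check like this complements sandboxed execution and does not replace it.

```python
import ast

ALLOWED_MODULES = {"FreeCAD", "Part", "Sketcher", "PartDesign"}  # assumed whitelist
FORBIDDEN_CALLS = {"eval", "exec", "open", "__import__", "compile"}


def check_script(source: str) -> list[str]:
    """Return a list of policy violations; an empty list means the script passes."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg} (line {exc.lineno})"]

    problems = []
    for node in ast.walk(tree):
        # Collect the top-level module name of every import statement.
        if isinstance(node, ast.Import):
            names = [alias.name.split(".")[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [(node.module or "").split(".")[0]]
        else:
            names = []
        problems += [f"import of '{n}' not allowed" for n in names if n not in ALLOWED_MODULES]

        # Flag direct calls to dangerous built-ins.
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in FORBIDDEN_CALLS:
                problems.append(f"call to '{node.func.id}' not allowed")
    return problems
```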
Testing:
- 100 natural language prompts → executable scripts
- Target: 70% success without refinement
Deliverables:
- Planner retrieves relevant design standards
- Generator produces valid PartDesign scripts
- Task graph correctly sequences dependencies
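The Planner's chunking strategy (512 tokens with a 50-token overlap) amounts to a sliding window with a stride of 462 tokens. A minimal sketch, assuming the text is already tokenized into a list (the tokenizer itself is out of scope here):

```python
def chunk_tokens(tokens: list[str], size: int = 512, overlap: int = 50) -> list[list[str]]:
    """Split a token list into windows of `size` that share `overlap` tokens."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reaches the end
    return chunks
```

With these defaults a 1,000-token document yields three chunks of 512, 512, and 76 tokens.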
Validator Agent:
- Geometric Validation (agents/validator/geometry_checker.py)
  - OCC.Core.ShapeFix for manifold repair
  - Tolerance checks (1e-6 default)
  - Self-intersection detection (BRepAlgoAPI_Check)
  - Volume/surface area sanity bounds
- FEA Integration (agents/validator/fea_runner.py)
  - Gmsh auto-meshing (tetrahedral, element size: auto)
  - CalculiX static analysis
  - Material library: Steel, Aluminum, ABS plastic
  - Stress threshold warnings (von Mises > yield/2)
- Visual Critique (agents/validator/vision_validator.py)
  - STL export → rendered PNG (matplotlib/vtk)
  - GPT-4V prompt: "Identify geometric anomalies"
  - Confidence scoring on feedback
3D Machine Learning:
- Geometry Embeddings (ml/geometry_encoder.py)
  - Mesh → Point Cloud (Open3D, 2048 points)
  - PointNet++ encoder (PyTorch)
  - Output: 1024-dim vector per model
  - Training: Contrastive learning on ShapeNet
- Feature Graph Embeddings (ml/graph_encoder.py)
  - B-Rep topology → NetworkX graph
  - Node features: face area, edge curvature
  - GraphSAGE (256-dim output)
  - Use case: Similar design retrieval
Deliverables:
- Validator catches 90%+ geometric errors
- FEA runs automatically on generated parts
- Point cloud embeddings cluster similar shapes
Ray Deployment:
- Agent Actors (core/distributed/ray_agents.py)
  @ray.remote
  class PlannerActor:
      def __init__(self):
          ...  # Load models/DBs in actor init

      async def plan(self, prompt: str) -> TaskGraph:
          ...  # Stateful processing
- Kubernetes Configuration (k8s/ray-cluster.yaml)
  - Head node: 8 CPU, 16GB RAM
  - Worker nodes: 4x (16 CPU, 32GB RAM each)
  - Auto-scaling: 2-10 workers based on queue depth
- Resource Management
  - CPU-only for Planner/Orchestrator
  - GPU (T4) for Generator/Validator LLMs
  - Shared Redis for state synchronization
Testing:
- 50 concurrent design tasks
- Measure: latency (P95), throughput, resource utilization
Deliverables:
- Ray cluster deployed on K8s
- Agents scale horizontally under load
- Shared state consistency verified
WebSocket Server (api/websocket_handler.py)
- FastAPI WebSocket endpoint
- Redis Pub/Sub bridge
- Events: state_update, validation_result, error
- Client: Three.js for 3D preview

Dashboard Features (frontend/dashboard/)
- 3D Viewer:
  - Three.js + STL loader
  - Real-time model updates via WebSocket
  - Camera controls, exploded views
- State Inspector:
  - Live DOM tree visualization
  - Feature history timeline
  - Validation logs stream
- Metrics:
  - Agent status (idle/busy)
  - Loop iteration count
  - Performance graphs (Plotly)
Optimization:
- Redis Caching (core/cache/primitive_cache.py)
  - Pre-compute common primitives (ISO bolts, gears)
  - Cache key: parameter hash
  - TTL: 24 hours
  - Hit rate target: >60%
- Profiling (tools/performance/profiler.py)
  - cProfile integration
  - Line-by-line timing (line_profiler)
  - Memory snapshots (tracemalloc)
  - Target: <5s per ReAct loop
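The "parameter hash" cache key for pre-computed primitives can be made deterministic by hashing a canonical JSON form of the parameters. This is a sketch only: the `primitive:` key prefix and the function name are assumptions, not names from the codebase.

```python
import hashlib
import json


def primitive_cache_key(kind: str, params: dict) -> str:
    """Build a deterministic cache key from a primitive type and its parameters."""
    # sort_keys makes the key independent of dict insertion order.
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"primitive:{kind}:{digest}"


CACHE_TTL_SECONDS = 24 * 60 * 60  # the 24-hour TTL from the list above
```

With redis-py, such a key would typically be written with `client.set(key, value, ex=CACHE_TTL_SECONDS)` so that the entry expires on its own.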
Deliverables:
- Real-time 3D preview in browser
- Dashboard shows live agent activity
- Performance optimized to <5s/loop
Export Pipeline (core/export/cad_exporter.py)
- STEP AP214 (automotive standard)
- IGES 5.3 (legacy CAM compatibility)
- STL (high-resolution for 3D printing)
- Metadata injection: creation date, agent version

GD&T Validation (agents/validator/gdt_checker.py)
- Parse TechDraw annotations
- Validate: flatness, perpendicularity, concentricity
- Tolerance stack-up analysis
- Report: ISO 1101 compliance

Audit Trail (core/audit/stream_logger.py)
- Redis Streams for immutable logs
- Entry schema:
  {
    "timestamp": ISO8601,
    "agent": "generator",
    "action": "script_execution",
    "input_hash": sha256,
    "output_state": {...},
    "user_id": UUID
  }
- Retention: 90 days (compliance requirement)

Deliverables:
- Multi-format export tested on 100 models
- GD&T validation on technical drawings
- Audit trail queryable via API
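An audit entry following the schema above could be assembled as shown below. This is a sketch with assumed names (`make_audit_entry` is hypothetical). Because Redis Stream entries are flat field/value pairs, the nested `output_state` is stored as a JSON string here.

```python
import hashlib
import json
import uuid
from datetime import datetime, timezone


def make_audit_entry(agent: str, action: str, script: str,
                     output_state: dict, user_id: uuid.UUID) -> dict[str, str]:
    """Build a flat string-to-string entry matching the audit schema."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent": agent,
        "action": action,
        "input_hash": hashlib.sha256(script.encode("utf-8")).hexdigest(),
        # Stream fields are flat, so nested state is stored as a JSON string.
        "output_state": json.dumps(output_state, sort_keys=True),
        "user_id": str(user_id),
    }
```

With redis-py, an entry like this would be appended with `client.xadd(stream_name, entry)`, where the stream name is a deployment choice.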
Security Hardening:
- Container Security (docker/Dockerfile.production)
  - Non-root user (freecad:1000)
  - Read-only root filesystem
  - Dropped capabilities (CAP_SYS_ADMIN)
  - Network policies (K8s)
- Authentication (api/auth/oauth_handler.py)
  - OAuth 2.0 / OpenID Connect
  - JWT tokens (15min expiry)
  - Role-based access control (RBAC)
  - API rate limiting (100 req/min per user)
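The per-user rate limit (100 requests per minute) can be illustrated with an in-process sliding-window limiter. This is a sketch, not the planned implementation: the class name is hypothetical, and a multi-replica deployment would need shared state (for example in Redis) instead of process memory.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per user within a rolling `window` seconds."""

    def __init__(self, limit: int = 100, window: float = 60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._hits: dict[str, deque] = defaultdict(deque)

    def allow(self, user_id: str) -> bool:
        now = self.clock()
        hits = self._hits[user_id]
        # Drop timestamps that have left the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

An API handler would call `limiter.allow(user_id)` per request and return HTTP 429 when it yields `False`.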
Load Testing (tests/load/locust_scenarios.py)
- Tool: Locust
- Scenarios:
  1. 100 concurrent simple designs (boxes)
  2. 50 concurrent complex assemblies
  3. Spike test: 0→200 users in 1min
- Success Criteria:
  - P95 latency < 10s
  - Error rate < 1%
  - No memory leaks over 1hr

LLM Fine-Tuning:
- Dataset Curation (ml/training/dataset.py)
  - 1,000 expert-reviewed (prompt, script, model) triplets
  - Sources: FreeCAD forum, GitHub, internal review
  - Validation: 80/10/10 train/val/test split
- LoRA Training (ml/training/lora_finetuner.py)
  - Base model: Llama3-8B
  - LoRA rank: 16, alpha: 32
  - Training: 4x A100 GPUs, 24 hours
  - Evaluation: BLEU score on code generation
  - Target: >0.6 BLEU improvement
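The 80/10/10 split of the curated triplets can be made reproducible with a seeded shuffle. A minimal sketch; the function name and the seed default are assumptions:

```python
import random


def split_dataset(items: list, seed: int = 0) -> tuple[list, list, list]:
    """Shuffle deterministically and split 80/10/10 into train/val/test."""
    shuffled = items[:]
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * 0.8)
    n_val = int(len(shuffled) * 0.1)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )
```

For the 1,000 triplets named above this gives 800 training, 100 validation, and 100 test examples.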
Deliverables:
- OAuth authentication enforced
- Load tests pass at 100 concurrent users
- Fine-tuned model outperforms base by 30%+
Model: BERT-base-uncased
Input: FreeCAD XML + feature descriptions
Preprocessing: Tokenization → [CLS] token pooling
Dimensionality: 768
Use Case: Semantic design search

Model: PointNet++ / MeshCNN
Input: STL mesh → 2048 point cloud
Training: ShapeNet (55 categories)
Dimensionality: 1024
Use Case: Shape similarity, retrieval

Model: GraphSAGE
Input: B-Rep feature graph
Node Features: [face_area, edge_curvature, vertex_valence]
Edge Features: [adjacency_type, angle]
Dimensionality: 256
Use Case: Structural pattern matching

FreeCAD Model
↓
[Gmsh] → Tetrahedral Mesh (auto element size)
↓
[CalculiX INP File]
├─→ Material: E=200GPa, ν=0.3 (steel)
├─→ Boundary: Fixed faces detection
└─→ Load: Pressure/force vectors
↓
[Static Analysis] → von Mises stress field
↓
[Topology Optimization] (PyMOO + SIMP)
↓
[Optimized Geometry] → Back to FreeCAD
Automated Failure Modes:
- Max stress > 0.8 × Yield strength → Warning
- Displacement > 10% of part size → Reject
- Safety factor < 1.5 → Request design review
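The three automated failure rules can be expressed as a single check over an FEA result. This is a sketch under assumptions: the safety factor is taken as yield strength divided by maximum von Mises stress, "part size" is taken as a single characteristic length, and the outcome labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FeaResult:
    max_von_mises: float     # MPa
    max_displacement: float  # mm
    part_size: float         # mm, characteristic length (assumption)
    yield_strength: float    # MPa


def assess(result: FeaResult) -> list[str]:
    """Apply the three thresholds and return the triggered outcomes."""
    outcomes = []
    if result.max_von_mises > 0.8 * result.yield_strength:
        outcomes.append("warning")
    if result.max_displacement > 0.10 * result.part_size:
        outcomes.append("reject")
    # Safety factor defined here as yield / max stress (assumption).
    if result.max_von_mises > 0:
        safety_factor = result.yield_strength / result.max_von_mises
        if safety_factor < 1.5:
            outcomes.append("design_review")
    return outcomes
```

Under that definition a safety factor below 1.5 corresponds to stress above roughly 67% of yield, so the design-review rule triggers before the 80% warning does.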
from OCC.Core.ShapeFix import ShapeFix_Shape


def auto_repair_geometry(shape):
    """
    Repair common CAD errors:
    - Non-manifold edges
    - Invalid face orientations
    - Small edges/faces (< tolerance)
    """
    fixer = ShapeFix_Shape(shape)
    fixer.SetPrecision(1e-6)
    fixer.SetMaxTolerance(1e-3)
    fixer.Perform()
    return fixer.Shape()

# k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: freecad-orchestrator
spec:
  replicas: 3
  # apps/v1 requires a selector that matches the pod template labels
  selector:
    matchLabels:
      app: freecad-orchestrator
  template:
    metadata:
      labels:
        app: freecad-orchestrator
    spec:
      containers:
        - name: orchestrator
          image: freecad-llm:v1.0
          resources:
            requests:
              cpu: "2"
              memory: "4Gi"
            limits:
              cpu: "4"
              memory: "8Gi"
          env:
            - name: REDIS_URL
              valueFrom:
                secretKeyRef:
                  name: redis-secret
                  key: url
---
# Ray cluster for distributed agents
# (pod templates for the head and worker groups are omitted here)
apiVersion: ray.io/v1alpha1
kind: RayCluster
metadata:
  name: freecad-ray-cluster
spec:
  rayVersion: '2.9.0'
  headGroupSpec:
    serviceType: ClusterIP
    rayStartParams:
      dashboard-host: '0.0.0.0'
  workerGroupSpecs:
    - replicas: 4
      minReplicas: 2
      maxReplicas: 10
      rayStartParams:
        num-cpus: "16"

| Component | Technology | Metrics |
|---|---|---|
| Metrics | Prometheus | Agent latency, queue depth, error rate |
| Logging | ELK Stack | Structured JSON logs, full-text search |
| Tracing | Jaeger | Distributed request tracing |
| Alerting | PagerDuty | On-call rotation for critical failures |
┌─────────────────┐
│ E2E Tests │ 10 critical user journeys
│ (Selenium) │
├─────────────────┤
│ Integration │ 50 agent interaction tests
│ Tests (pytest) │
├─────────────────┤
│ Unit Tests │ 500+ function-level tests
│ (pytest) │ Target: 80% coverage
└─────────────────┘
Complexity Tiers:
- Tier 1 (Simple): Primitives (box, cylinder, sphere) - 50 models
- Tier 2 (Moderate): Brackets, flanges, simple assemblies - 100 models
- Tier 3 (Complex): Gearboxes, engines, organic shapes - 50 models
Success Criteria:
- Tier 1: 95% success rate
- Tier 2: 85% success rate
- Tier 3: 70% success rate
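Checking benchmark results against the per-tier targets is a one-line comparison per tier. A minimal sketch, assuming results arrive as a list of pass/fail booleans per tier (the data layout is an assumption):

```python
TIER_TARGETS = {1: 0.95, 2: 0.85, 3: 0.70}


def tiers_passing(results: dict[int, list[bool]]) -> dict[int, bool]:
    """Compare each tier's observed success rate with its target."""
    return {
        tier: sum(outcomes) / len(outcomes) >= TIER_TARGETS[tier]
        for tier, outcomes in results.items()
    }
```

With the stated dataset sizes, Tier 1 needs at least 48 of 50 models to succeed, Tier 2 at least 85 of 100, and Tier 3 at least 35 of 50.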
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Topological naming errors | High | High | Robust recompute handling, state rollback |
| LLM hallucination (invalid code) | High | Medium | AST validation, sandbox execution |
| FEA solver instability | Medium | Medium | Mesh quality checks, fallback to simpler analysis |
| K8s scaling delays | Low | Medium | Pre-warmed worker pool, predictive scaling |
| Data privacy (model leakage) | Low | High | On-premise deployment option, data encryption |
- Iteration Speed: <5s per ReAct loop (P95)
- Throughput: 100 concurrent designs
- Uptime: 99.5% SLA
- Success Rate: >85% on validation dataset
- Geometric Accuracy: <0.1mm deviation from spec
- FEA Validation: 90% of stress predictions within 15% of manual analysis
- User Adoption: 500 active users (6 months post-launch)
- Time Savings: 60% reduction in CAD scripting time
- Cost: <$0.50 per design iteration (compute cost)
- Assembly Intelligence: Multi-body constraint reasoning, kinematic simulation
- Generative Design: Lattice structures, topology optimization loop
- Sheet Metal Module: Unfold/bend sequence planning
- CAM Integration: Automatic toolpath generation for CNC
- Multi-CAD Support: SolidWorks/CATIA import via STEP translation
- Collaborative Editing: Operational transform for concurrent users
- 2x Backend Engineers (Python, Ray, K8s)
- 1x ML Engineer (PyTorch, NLP)
- 1x CAD Domain Expert (FreeCAD, mechanical engineering)
- 1x DevOps Engineer (Docker, CI/CD)
- 1x QA Engineer (Test automation)
- Compute: 4x GPU nodes (T4), 8x CPU nodes (16 cores each)
- Storage: 500GB SSD (Redis), 2TB HDD (logs, models)
- Bandwidth: 10Gbps internal, 1Gbps external
- Development: $300k (salaries, 12 weeks)
- Infrastructure: $2k/month (cloud compute)
- LLM API Costs: $500/month (GPT-4V + fine-tuning)
- Total Year 1: ~$350k
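As a quick arithmetic check on the budget lines above, the itemized Year 1 costs can be summed directly. The only assumption is that both monthly figures run for all 12 months.

```python
development = 300_000        # salaries, 12 weeks
infrastructure = 2_000 * 12  # cloud compute, per month
llm_api = 500 * 12           # GPT-4V + fine-tuning, per month

itemized_total = development + infrastructure + llm_api
print(itemized_total)  # 330000
```

The itemized sum is $330k, so the stated total of about $350k sits roughly $20k above the listed line items.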
from langgraph.graph import StateGraph, END
from typing import TypedDict


class DesignState(TypedDict):
    prompt: str
    task_graph: dict
    script: str
    model_state: dict
    validation_result: dict
    iteration: int


def build_workflow():
    # planner_agent, generator_agent, freecad_executor, validator_agent and
    # should_continue are assumed to be defined elsewhere in the codebase.
    workflow = StateGraph(DesignState)
    workflow.add_node("planner", planner_agent)
    workflow.add_node("generator", generator_agent)
    workflow.add_node("executor", freecad_executor)
    workflow.add_node("validator", validator_agent)
    # "human_node" must be registered before it can be a routing target;
    # human_review_node is a placeholder name for that handler.
    workflow.add_node("human_node", human_review_node)
    workflow.set_entry_point("planner")
    workflow.add_edge("planner", "generator")
    workflow.add_edge("generator", "executor")
    workflow.add_edge("executor", "validator")
    workflow.add_conditional_edges(
        "validator",
        should_continue,
        {
            "refine": "generator",
            "success": END,
            "human_review": "human_node",
        },
    )
    workflow.add_edge("human_node", END)
    return workflow.compile()


import ray
@ray.remote(num_cpus=2, num_gpus=0.5)
class GeneratorActor:
    def __init__(self):
        # load_codellama is assumed to be defined elsewhere
        self.model = load_codellama()

    async def generate(self, task_graph):
        script = await self.model.generate(task_graph)
        return validate_and_clean(script)


# Parallel execution: one task per actor (zip stops at the shorter sequence)
actors = [GeneratorActor.remote() for _ in range(4)]
results = ray.get([
    actor.generate.remote(task)
    for actor, task in zip(actors, task_batch)
])

version: '3.8'
services:
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data
  freecad:
    build:
      context: .
      dockerfile: docker/Dockerfile.freecad
    environment:
      - DISPLAY=:99
      - REDIS_URL=redis://redis:6379
    volumes:
      - ./outputs:/app/outputs
    depends_on:
      - redis
  orchestrator:
    build:
      context: .
      dockerfile: docker/Dockerfile.app
    ports:
      - "8000:8000"
    environment:
      - REDIS_URL=redis://redis:6379
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    depends_on:
      - redis
      - freecad
volumes:
  redis_data:

- Version: 1.0
- Date: January 1, 2026
- Authors: AI Design Engineering Team
- Status: Implementation Ready
- Week 0 (Pre-Implementation):
  - Secure infrastructure budget approval
  - Finalize team assignments
  - Set up development environments
  - Kickoff meeting with stakeholders
- Week 1 (Day 1):
  - Create feature branches
  - Initialize CI/CD pipelines
  - First standup meeting
  - Begin headless FreeCAD refactoring
Let's build the future of AI-assisted CAD design! 🚀