LLM Module

Multi-provider Large-Language-Model integration for GNN models: parse, interpret, summarise, and annotate Active Inference specifications through Ollama (local) and, when API keys are present, OpenAI / OpenRouter / Perplexity / Anthropic.

Module Structure

src/llm/
├── __init__.py                    # Module initialization and exports
├── README.md                      # This documentation
├── analyzer.py                    # LLM analysis system
├── llm_operations.py             # Core LLM operations
├── llm_processor.py              # LLM processing system
├── mcp.py                        # Model Context Protocol integration
├── prompts.py                    # LLM prompt templates
└── providers/                    # LLM provider implementations
    ├── __init__.py              # Provider initialization
    ├── base_provider.py         # Base provider interface
    ├── openai_provider.py       # OpenAI provider
    ├── openrouter_provider.py   # OpenRouter provider
    ├── perplexity_provider.py   # Perplexity provider
    └── ollama_provider.py       # Ollama (local) provider

LLM Processing Architecture

graph TB
    subgraph "Input Processing"
        GNNFile[GNN Files]
        Processor[processor.py]
        ProviderSelect[Provider Selection]
    end
    
    subgraph "LLM Providers"
        OpenAI[OpenAI Provider]
        Anthropic[Anthropic Provider]
        Ollama[Ollama Provider]
        OpenRouter[OpenRouter Provider]
    end
    
    subgraph "Analysis Components"
        Analyzer[analyzer.py]
        Generator[generator.py]
        Prompts[prompts.py]
    end
    
    subgraph "Output Generation"
        Analysis[Analysis Results]
        Insights[Model Insights]
        Documentation[Generated Docs]
        Summary[LLM Summary]
    end
    
    GNNFile --> Processor
    Processor --> ProviderSelect
    
    ProviderSelect -->|API Key Available| OpenAI
    ProviderSelect -->|API Key Available| Anthropic
    ProviderSelect -->|Local Recovery| Ollama
    ProviderSelect -->|Alternative| OpenRouter
    
    OpenAI --> Analyzer
    Anthropic --> Analyzer
    Ollama --> Analyzer
    OpenRouter --> Analyzer
    
    Analyzer --> Generator
    Generator --> Prompts
    
    Analyzer --> Analysis
    Generator --> Insights
    Generator --> Documentation
    Generator --> Summary

Provider Selection Flow

flowchart TD
    Start[Start LLM Processing] --> CheckKeys{API Keys<br/>Available?}
    
    CheckKeys -->|OpenAI Key| UseOpenAI[Use OpenAI]
    CheckKeys -->|Anthropic Key| UseAnthropic[Use Anthropic]
    CheckKeys -->|No Keys| CheckOllama{Ollama<br/>Available?}
    
    CheckOllama -->|Yes| UseOllama[Use Ollama]
    CheckOllama -->|No| Recovery[Recovery Analysis]
    
    UseOpenAI --> Process[Process with LLM]
    UseAnthropic --> Process
    UseOllama --> Process
    Recovery --> Process
    
    Process --> Results[Analysis Results]
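
The flowchart can be read as a small decision function. The following is a minimal sketch, not the module's actual selection code: `choose_provider` is a hypothetical name, `ANTHROPIC_API_KEY` is an assumed variable name (only `OPENAI_API_KEY` appears in this README), and checking OpenAI before Anthropic is an assumption, since the flowchart does not rank the two.

```python
from typing import Mapping


def choose_provider(env: Mapping[str, str], ollama_available: bool) -> str:
    """Return a provider name following the selection flowchart (hypothetical helper)."""
    # A cloud provider is chosen when its key is set and non-empty.
    if env.get("OPENAI_API_KEY"):
        return "openai"
    if env.get("ANTHROPIC_API_KEY"):  # assumed variable name
        return "anthropic"
    # No cloud keys: use a local Ollama server if one is reachable.
    if ollama_available:
        return "ollama"
    # Nothing available: the flowchart's "Recovery Analysis" branch.
    return "recovery"
```

Passing the environment as a mapping (for example `choose_provider(os.environ, ollama_available=True)`) keeps the function easy to test without touching real credentials.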

Module Integration Flow

flowchart LR
    subgraph "Pipeline Step 13"
        Step13[13_llm.py Orchestrator]
    end
    
    subgraph "LLM Module"
        Processor[processor.py]
        Analyzer[analyzer.py]
        Generator[generator.py]
        Providers[providers/]
    end
    
    subgraph "Downstream Steps"
        Step16[Step 16: Analysis]
        Step20[Step 20: Website]
        Step23[Step 23: Report]
    end
    
    Step13 --> Processor
    Processor --> Analyzer
    Processor --> Generator
    Processor --> Providers
    
    Processor -->|LLM Insights| Step16
    Processor -->|LLM Summaries| Step20
    Processor -->|LLM Analysis| Step23

Core Components

LLM Analysis System (`analyzer.py`)

`analyze_gnn_model_with_llm(gnn_content: str, model_name: str, analysis_type: str = "comprehensive") -> Dict[str, Any]`

Runs an LLM analysis of a single GNN model; `analysis_type` selects the focus of the analysis.

Analysis Types:

comprehensive: Full model analysis and interpretation
structural: Model structure and architecture analysis
semantic: Semantic meaning and behavior analysis
performance: Performance characteristics analysis
optimization: Optimization suggestions and improvements

Returns:

Dictionary containing comprehensive analysis results
Model interpretation and insights
Performance recommendations
Optimization suggestions

`interpret_model_behavior(gnn_content: str, model_name: str) -> Dict[str, Any]`

Interprets model behavior using LLM analysis.

Features:

Behavioral pattern analysis
Dynamic behavior interpretation
Interaction pattern identification
Performance characteristic analysis

`generate_model_documentation(gnn_content: str, model_name: str) -> str`

Generates comprehensive model documentation using LLM.

Content:

Model overview and purpose
Component descriptions
Usage instructions
Performance characteristics
Optimization recommendations

LLM Operations (`llm_operations.py`)

`process_llm_analysis(target_dir: Path, output_dir: Path, verbose: bool = False) -> bool`

Main function for processing LLM analysis of GNN models.

Features:

Multi-provider LLM analysis
Comprehensive model interpretation
Documentation generation
Performance optimization suggestions

`extract_model_insights(gnn_content: str) -> List[Dict[str, Any]]`

Extracts insights from GNN models using LLM analysis.

Insights:

Model complexity analysis
Performance characteristics
Optimization opportunities
Best practices recommendations

`generate_optimization_suggestions(gnn_content: str) -> List[Dict[str, Any]]`

Generates optimization suggestions using LLM analysis.

Suggestions:

Performance improvements
Structural optimizations
Parameter tuning recommendations
Best practices implementation

LLM Provider System (`providers/`)

BaseProvider (`base_provider.py`)

Base interface for LLM providers.

Key Methods:

analyze_model(content: str, analysis_type: str) -> Dict[str, Any]
generate_insights(content: str) -> List[Dict[str, Any]]
optimize_model(content: str) -> Dict[str, Any]
document_model(content: str) -> str
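
As an illustration of how the four methods above could be expressed as an interface, here is a minimal sketch using Python's `abc` module. `BaseProviderSketch` and `EchoProvider` are hypothetical names, and the real `base_provider.py` may differ (it may include constructors, async methods, or extra parameters). The stand-in provider makes no LLM call and only returns simple values derived from the input.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class BaseProviderSketch(ABC):
    """Hypothetical interface mirroring the key methods listed above."""

    @abstractmethod
    def analyze_model(self, content: str, analysis_type: str) -> Dict[str, Any]: ...

    @abstractmethod
    def generate_insights(self, content: str) -> List[Dict[str, Any]]: ...

    @abstractmethod
    def optimize_model(self, content: str) -> Dict[str, Any]: ...

    @abstractmethod
    def document_model(self, content: str) -> str: ...


class EchoProvider(BaseProviderSketch):
    """Offline stand-in for tests; it never contacts an LLM."""

    def analyze_model(self, content: str, analysis_type: str) -> Dict[str, Any]:
        return {"analysis_type": analysis_type, "length": len(content)}

    def generate_insights(self, content: str) -> List[Dict[str, Any]]:
        return [{"insight": "line_count", "value": len(content.splitlines())}]

    def optimize_model(self, content: str) -> Dict[str, Any]:
        return {"suggestions": []}

    def document_model(self, content: str) -> str:
        return f"Model with {len(content.splitlines())} lines"
```

Because every method is abstract, instantiating `BaseProviderSketch` directly raises `TypeError`, which catches incomplete provider implementations early.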

OpenAIProvider (`openai_provider.py`)

OpenAI GPT model integration.

Features:

GPT-4 and GPT-3.5-turbo support
Advanced model analysis
Comprehensive documentation generation
Performance optimization suggestions

OpenRouterProvider (`openrouter_provider.py`)

OpenRouter multi-provider integration.

Features:

Multiple LLM provider access
Cost optimization
Provider selection based on task
Recovery mechanisms

PerplexityProvider (`perplexity_provider.py`)

Perplexity AI integration.

Features:

Real-time information access
Current best practices
Research integration
Performance benchmarking

LLM Processing System (`llm_processor.py`)

`process_llm_request(content: str, request_type: str, provider: str = "auto") -> Dict[str, Any]`

Processes LLM requests with automatic provider selection.

Request Types:

analysis: Model analysis and interpretation
optimization: Performance optimization suggestions
documentation: Model documentation generation
insights: Model insights and recommendations
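
One plausible way to route the four request types is a dispatch table keyed by request type. The sketch below assumes each type maps onto the `BaseProvider` method with the matching purpose; `REQUEST_DISPATCH` and `dispatch_request` are hypothetical names, and the actual `process_llm_request` may be organised differently.

```python
from typing import Any, Dict

# Request type -> name of the provider method assumed to serve it.
REQUEST_DISPATCH: Dict[str, str] = {
    "analysis": "analyze_model",
    "optimization": "optimize_model",
    "documentation": "document_model",
    "insights": "generate_insights",
}


def dispatch_request(provider: Any, content: str, request_type: str,
                     analysis_type: str = "comprehensive") -> Any:
    """Call the provider method that matches request_type (hypothetical helper)."""
    try:
        method_name = REQUEST_DISPATCH[request_type]
    except KeyError:
        raise ValueError(f"Unknown request type: {request_type!r}") from None
    method = getattr(provider, method_name)
    # Only analyze_model takes the extra analysis_type argument.
    if method_name == "analyze_model":
        return method(content, analysis_type)
    return method(content)
```

Rejecting unknown request types with `ValueError` surfaces typos immediately instead of silently falling through to a default.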

`select_optimal_provider(request_type: str, content_length: int) -> str`

Selects the optimal LLM provider based on request type and content.

Selection Criteria:

Request type requirements
Content complexity
Cost considerations
Performance requirements

Usage Examples

Basic LLM Analysis

from llm import analyze_gnn_model_with_llm

# Analyze GNN model with LLM
analysis = analyze_gnn_model_with_llm(
    gnn_content=gnn_content,
    model_name="my_model",
    analysis_type="comprehensive"
)

print(f"Model complexity: {analysis['complexity']}")
print(f"Performance score: {analysis['performance_score']}")
print(f"Optimization suggestions: {len(analysis['optimizations'])}")

Model Interpretation

from llm import interpret_model_behavior

# Interpret model behavior
interpretation = interpret_model_behavior(
    gnn_content=gnn_content,
    model_name="my_model"
)

print(f"Behavioral patterns: {interpretation['patterns']}")
print(f"Dynamic characteristics: {interpretation['dynamics']}")
print(f"Interaction analysis: {interpretation['interactions']}")

Documentation Generation

from llm import generate_model_documentation

# Generate comprehensive documentation
documentation = generate_model_documentation(
    gnn_content=gnn_content,
    model_name="my_model"
)

print("Generated documentation:")
print(documentation)

Provider-Specific Analysis

from llm.providers import OpenAIProvider, PerplexityProvider

# Use specific providers
openai_provider = OpenAIProvider()
perplexity_provider = PerplexityProvider()

# OpenAI analysis
openai_analysis = openai_provider.analyze_model(gnn_content, "comprehensive")

# Perplexity analysis with current research
perplexity_analysis = perplexity_provider.analyze_model(gnn_content, "research")

Optimization Suggestions

from llm import generate_optimization_suggestions

# Generate optimization suggestions
suggestions = generate_optimization_suggestions(gnn_content)

for suggestion in suggestions:
    print(f"Optimization: {suggestion['type']}")
    print(f"Description: {suggestion['description']}")
    print(f"Expected improvement: {suggestion['improvement']}")

LLM Analysis Pipeline

graph TD
    Input[GNN Model] --> Prep[Content Preparation]
    Prep --> Selector{Provider<br/>Selector}
    
    Selector -->|Auto/Manual| OpenAI[OpenAI Provider]
    Selector -->|Auto/Manual| Perplexity[Perplexity Provider]
    Selector -->|Auto/Manual| OpenRouter[OpenRouter Provider]
    Selector -->|Auto/Manual| Ollama[Ollama Provider]
    
    OpenAI --> Analysis[LLM Analysis]
    Perplexity --> Analysis
    OpenRouter --> Analysis
    Ollama --> Analysis
    
    Analysis --> Insights[Insight Extraction]
    Analysis --> Opt[Optimization Suggestions]
    Analysis --> Doc[Documentation Gen]
    
    Insights --> Report[Final Report]
    Opt --> Report
    Doc --> Report

1. Content Preparation

# Prepare GNN content for LLM analysis
prepared_content = prepare_content_for_llm(gnn_content)
analysis_context = create_analysis_context(model_name, analysis_type)

2. Provider Selection

# Select optimal LLM provider
provider = select_optimal_provider(request_type, len(prepared_content))
llm_provider = initialize_provider(provider)

3. LLM Analysis

# Perform LLM analysis
analysis_result = llm_provider.analyze_model(prepared_content, analysis_type)
insights = llm_provider.generate_insights(prepared_content)

4. Result Processing

# Process and validate results
processed_results = process_llm_results(analysis_result, insights)
validated_results = validate_analysis_results(processed_results)

5. Documentation Generation

# Generate comprehensive documentation
documentation = generate_comprehensive_documentation(validated_results)
optimization_report = generate_optimization_report(validated_results)

Integration with Pipeline

Pipeline Step 13: LLM Processing

# Called from 13_llm.py
def process_llm(target_dir, output_dir, verbose=False, **kwargs):
    # Perform LLM analysis of GNN models
    analysis_results = analyze_gnn_models_with_llm(target_dir, verbose)
    
    # Generate insights and recommendations
    insights = generate_model_insights(analysis_results)
    
    # Create comprehensive documentation
    documentation = generate_llm_documentation(analysis_results, insights)
    
    return True

Output Structure

output/13_llm_output/
├── model_analysis.json            # LLM analysis results
├── model_insights.json            # Model insights and recommendations
├── optimization_suggestions.json  # Optimization suggestions
├── model_documentation.md         # Generated documentation
├── performance_analysis.json      # Performance analysis
├── behavioral_analysis.json       # Behavioral analysis
└── llm_summary.md                # LLM processing summary

LLM Providers

OpenAI Provider

Models: GPT-4, GPT-3.5-turbo
Strengths: Advanced reasoning, comprehensive analysis
Use Cases: Complex model analysis, detailed documentation
Cost: Higher cost for advanced models

OpenRouter Provider

Models: Multiple providers (Anthropic, Google, etc.)
Strengths: Provider selection, cost optimization
Use Cases: Cost-effective analysis, provider comparison
Cost: Variable based on provider selection

Perplexity Provider

Models: Real-time information access
Strengths: Current research integration, live data
Use Cases: Research-based analysis, current best practices
Cost: Moderate cost with research benefits

Analysis Types

Comprehensive Analysis

Model Structure: Complete model architecture analysis
Performance Characteristics: Performance evaluation and benchmarking
Optimization Opportunities: Identification of improvement areas
Best Practices: Implementation of current best practices
Documentation: Comprehensive model documentation

Structural Analysis

Component Analysis: Analysis of individual model components
Relationship Mapping: Mapping of component relationships
Dependency Analysis: Analysis of component dependencies
Complexity Assessment: Assessment of model complexity

Semantic Analysis

Meaning Interpretation: Interpretation of model semantics
Behavioral Analysis: Analysis of model behavior patterns
Interaction Analysis: Analysis of component interactions
Purpose Understanding: Understanding of model purpose and goals

Performance Analysis

Efficiency Assessment: Assessment of model efficiency
Resource Usage: Analysis of resource utilization
Scalability Analysis: Analysis of scalability characteristics
Optimization Recommendations: Specific optimization suggestions

Configuration Options

LLM Settings

# LLM configuration
config = {
    'default_provider': 'auto',     # Default LLM provider
    'analysis_depth': 'comprehensive', # Analysis depth level
    'include_research': True,       # Include current research
    'optimization_focus': True,     # Focus on optimization
    'documentation_style': 'technical', # Documentation style
    'cost_optimization': True       # Enable cost optimization
}

Provider-Specific Settings

# Provider-specific configuration
provider_config = {
    'openai': {
        'model': 'gpt-4',
        'temperature': 0.1,
        'max_tokens': 4000
    },
    'perplexity': {
        'include_research': True,
        'current_best_practices': True
    },
    'openrouter': {
        'cost_optimization': True,
        'provider_selection': 'auto'
    }
}

Error Handling

LLM Analysis Failures

# Handle LLM analysis failures gracefully
try:
    analysis = analyze_gnn_model_with_llm(content, model_name)
except LLMAnalysisError as e:
    logger.error(f"LLM analysis failed: {e}")
    # Provide recovery analysis or error reporting

Provider Failures

# Handle provider failures gracefully
try:
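    # select_optimal_provider returns a provider name, not a provider object
    provider_name = select_optimal_provider(request_type, content_length)
    provider = initialize_provider(provider_name)
    result = provider.analyze_model(content, analysis_type)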
except ProviderError as e:
    logger.warning(f"Provider failed: {e}")
    # Fall back to alternative provider
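
The "fall back to alternative provider" step could look like the following sketch, which tries providers in order and returns the first success. `analyze_with_fallback` is a hypothetical helper, and the `ProviderError` class here is a local stand-in so the example runs on its own; the module's real exception may live elsewhere and carry more detail.

```python
from typing import Any, Callable, Dict, List, Sequence, Tuple


class ProviderError(Exception):
    """Local stand-in for the module's provider exception."""


def analyze_with_fallback(
    providers: Sequence[Tuple[str, Callable[[str], Dict[str, Any]]]],
    content: str,
) -> Dict[str, Any]:
    """Try each (name, analyze) pair in order; return the first successful result."""
    errors: List[str] = []
    for name, analyze in providers:
        try:
            result = analyze(content)
        except ProviderError as exc:
            # Remember why this provider failed, then move to the next one.
            errors.append(f"{name}: {exc}")
            continue
        return {"provider": name, "result": result}
    # Every provider failed (or none was given): report all collected errors.
    raise ProviderError("all providers failed: " + "; ".join(errors))
```

Recording which provider produced the result makes it easier to compare output quality and cost later.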

Rate Limiting Issues

# Handle rate limiting issues
try:
    result = process_llm_request(content, request_type)
except RateLimitError as e:
    logger.warning(f"Rate limit exceeded: {e}")
    # Implement retry with backoff
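
A retry with exponential backoff can be sketched as below. `retry_with_backoff` is a hypothetical helper, `RateLimitError` is a local stand-in for whatever exception the module or provider SDK raises, and the default delays (1 s, 2 s, 4 s) are illustrative rather than tuned values.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Local stand-in for a provider's rate-limit exception."""


def retry_with_backoff(call: Callable[[], T], max_attempts: int = 4,
                       base_delay: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Run call(), retrying on RateLimitError with doubling delays."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Wait base_delay, then 2x, then 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so tests can record delays instead of waiting; in production, adding random jitter to each delay is a common refinement.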

Performance Optimization

Caching Strategies

Analysis Cache: Cache LLM analysis results
Provider Cache: Cache provider responses
Documentation Cache: Cache generated documentation
Insight Cache: Cache model insights
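
An analysis cache along these lines could key results by a hash of the provider, analysis type, and model content, so identical requests are served without a second LLM call. This is an in-memory sketch under that assumption; `AnalysisCache` is a hypothetical class and does not describe how the module's own caching is implemented.

```python
import hashlib
import json
from typing import Any, Callable, Dict


class AnalysisCache:
    """In-memory cache keyed by a hash of (provider, analysis type, content)."""

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def make_key(provider: str, analysis_type: str, content: str) -> str:
        # JSON-encode the parts so field boundaries cannot collide.
        payload = json.dumps([provider, analysis_type, content])
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_or_compute(self, provider: str, analysis_type: str, content: str,
                       compute: Callable[[str], Any]) -> Any:
        key = self.make_key(provider, analysis_type, content)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Cache miss: run the (expensive) computation once and remember it.
        self.misses += 1
        value = compute(content)
        self._store[key] = value
        return value
```

Including the provider and analysis type in the key prevents a cached structural analysis from being returned for a semantic request on the same file.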

Cost Optimization

Provider Selection: Select cost-effective providers
Request Batching: Batch multiple requests
Response Caching: Cache responses to avoid repeated requests
Token Optimization: Optimize token usage

Performance Monitoring

Response Time: Monitor LLM response times
Cost Tracking: Track API costs
Quality Metrics: Monitor analysis quality
Provider Performance: Track provider performance

Testing and Validation

Unit Tests

# Test individual LLM functions
def test_llm_analysis():
    analysis = analyze_gnn_model_with_llm(test_content, "test_model")
    assert 'complexity' in analysis
    assert 'performance_score' in analysis
    assert 'optimizations' in analysis

Integration Tests

# Test complete LLM pipeline
def test_llm_pipeline():
    success = process_llm_analysis(test_dir, output_dir)
    assert success
    # Verify LLM outputs
    llm_files = list(output_dir.glob("**/*"))
    assert len(llm_files) > 0

Provider Tests

# Test different providers
def test_provider_selection():
    providers = ['openai', 'perplexity', 'openrouter']
    for provider in providers:
        result = test_provider(provider, test_content)
        assert result['success']

Dependencies

Required Dependencies

ollama: Local LLM client (recommended for default runs)
requests: HTTP requests for API calls
json: JSON data handling (Python standard library; nothing to install)
pathlib: Path handling (Python standard library; nothing to install)

Optional Dependencies

openai: OpenAI API integration
perplexity: Perplexity AI integration
openrouter: OpenRouter integration
tiktoken: Token counting for OpenAI
asyncio: Asynchronous processing (Python standard library)

Performance Metrics

Processing Times

Small Models (< 100 variables): 5-30 seconds
Medium Models (100-1000 variables): 30-120 seconds
Large Models (> 1000 variables): 120-600 seconds

Memory Usage

Base Memory: ~50MB
Per Analysis: ~10-50MB depending on complexity
Peak Memory: 2-3x base usage during analysis

Cost Metrics

OpenAI GPT-4: ~$0.03-0.06 per 1K tokens
OpenAI GPT-3.5: ~$0.002 per 1K tokens
Perplexity: ~$0.01-0.02 per 1K tokens
OpenRouter: Variable based on provider
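
To turn the per-1K-token figures above into a per-request estimate, multiply the token count by the price and divide by 1000. The sketch below copies the approximate prices from this list (they may be out of date); `PRICE_PER_1K` and `estimate_cost` are hypothetical names, and OpenRouter is omitted because its price depends on the routed provider.

```python
from typing import Dict, Tuple

# (low, high) USD per 1K tokens, copied from the approximate figures above.
PRICE_PER_1K: Dict[str, Tuple[float, float]] = {
    "gpt-4": (0.03, 0.06),
    "gpt-3.5": (0.002, 0.002),
    "perplexity": (0.01, 0.02),
}


def estimate_cost(model: str, tokens: int) -> Tuple[float, float]:
    """Return the (low, high) estimated cost in USD for a token count."""
    low, high = PRICE_PER_1K[model]
    thousands = tokens / 1000
    return (thousands * low, thousands * high)
```

For example, a 4,000-token GPT-4 request falls between roughly $0.12 and $0.24 under these figures.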

Troubleshooting

Common Issues

1. API Rate Limiting

Error: Rate limit exceeded for OpenAI API
Solution: Retry with exponential backoff, or switch to an alternative provider

2. Token Limit Exceeded

Error: Token limit exceeded for model
Solution: Truncate the content or use a model with a higher token limit
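
Truncation can be sketched without a tokenizer by assuming roughly four characters per token, which is only a rule of thumb for English text; for exact counts with OpenAI models, the optional `tiktoken` dependency would be the better choice. `truncate_to_token_budget` is a hypothetical helper that cuts on line boundaries so GNN sections are not split mid-line.

```python
def truncate_to_token_budget(content: str, max_tokens: int,
                             chars_per_token: int = 4) -> str:
    """Keep whole leading lines that fit an approximate token budget."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    budget = max_tokens * chars_per_token  # approximate character budget
    if len(content) <= budget:
        return content
    kept = []
    used = 0
    for line in content.splitlines(keepends=True):
        if used + len(line) > budget:
            break
        kept.append(line)
        used += len(line)
    # If even the first line is too long, fall back to a hard character cut.
    if not kept:
        return content[:budget]
    return "".join(kept)
```

Dropping trailing lines keeps the beginning of the specification intact; a different strategy would be needed if the most important sections sit at the end of the file.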

3. Provider Failures

Error: Provider service unavailable
Solution: Fall back to an alternative provider or add retry logic

4. Analysis Quality Issues

Error: Poor analysis quality or irrelevant results
Solution: Adjust the prompts or use a different provider with a larger context window

Debug Mode

# Enable debug mode for detailed LLM information
analysis = analyze_gnn_model_with_llm(content, model_name, debug=True, verbose=True)

Configuration for Fast Local Runs (Ollama)

Set these environment variables to use small, fast models locally:

OLLAMA_MODEL=smollm2:135m-instruct-q4_K_S
OLLAMA_MAX_TOKENS=256
OLLAMA_TIMEOUT=60

The default tag is also defined in code as `llm.defaults.DEFAULT_OLLAMA_MODEL`. Override it with `OLLAMA_MODEL` or with the `llm.model` key in `input/config.yaml`.

`process_llm` passes the selected tag to every structured and custom prompt via `get_response(..., model_name=...)`, and into per-file summarization when Ollama is available. Summary tasks prefer Ollama over cloud providers when it is registered. If OpenAI returns quota errors, unset `OPENAI_API_KEY` for local-only runs.

You can also point to a different host:

OLLAMA_HOST=http://127.0.0.1:11434

Common Ollama tags: smollm2:135m-instruct-q4_K_S, gemma3:4b, tinyllama, and larger llama2 variants — see https://ollama.com/library
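
As an illustration of how these variables might be read, the sketch below collects them into one settings object. `OllamaSettings` and `load_ollama_settings` are hypothetical names; the fallback values simply reuse the examples shown in this section and are not necessarily the module's real defaults (the actual default tag lives in `llm.defaults.DEFAULT_OLLAMA_MODEL`), and the timeout is assumed to be in seconds.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class OllamaSettings:
    model: str
    max_tokens: int
    timeout: int  # assumed to be seconds
    host: str


def load_ollama_settings(env: Mapping[str, str]) -> OllamaSettings:
    """Read the OLLAMA_* variables, using this section's examples as fallbacks."""
    return OllamaSettings(
        model=env.get("OLLAMA_MODEL", "smollm2:135m-instruct-q4_K_S"),
        max_tokens=int(env.get("OLLAMA_MAX_TOKENS", "256")),
        timeout=int(env.get("OLLAMA_TIMEOUT", "60")),
        host=env.get("OLLAMA_HOST", "http://127.0.0.1:11434"),
    )
```

Calling `load_ollama_settings(os.environ)` at startup gives one place to validate the values; a non-numeric `OLLAMA_MAX_TOKENS` raises `ValueError` from `int()` right away rather than failing mid-request.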

Future Enhancements

Planned Features

Multi-Modal Analysis: Support for image and audio analysis
Real-time Analysis: Live analysis during model development
Collaborative Analysis: Multi-LLM collaborative analysis
Custom Training: Custom LLM training for domain-specific analysis

Performance Improvements

Async Processing: Asynchronous LLM request processing
Batch Processing: Batch processing of multiple models
Advanced Caching: Advanced caching strategies for analysis results
Cost Optimization: Advanced cost optimization algorithms

Summary

The module analyses GNN models with an LLM of the caller's choosing, prefers local Ollama when no cloud keys are set, and writes per-model summary / explanation / optimisation artifacts into output/13_llm_output/ for downstream consumption by steps 16, 20, and 23.

License and Citation

This module is part of the GeneralizedNotationNotation project. See the main repository for license and citation information.

Documentation

README: Module Overview
AGENTS: Agentic Workflows
SPEC: Architectural Specification
SKILL: Capability API
