Infrastructure RAG

❗❗NOTE❗❗❗

🤖 AI-Driven Development: This is an experimental project built through human-AI collaboration using GitHub Copilot and Claude 3.5 Sonnet. The code architecture, implementation decisions, and problem-solving approaches were developed through iterative AI-assisted development.

Overview

Real-time infrastructure analysis using RAG (Retrieval Augmented Generation). Analyze Prometheus metrics using vector similarity search and LLM to provide insights, anomaly detection, and recommendations.

Architecture

graph LR
    subgraph Data Collection
        Prometheus --> CollectMetrics[Collect Metrics]
    end

    subgraph RAG Service
        CollectMetrics --> SearchHistorical[Search Historical]
        SearchHistorical --> VectorStore[Vector Store/Qdrant]
        VectorStore --> Analysis[Similar Patterns]
        Analysis --> TensorZero[TensorZero Gateway]
        TensorZero --> LLMProviders[LLM Providers]
        SearchHistorical --> StoreNew[Store New Metrics]
        StoreNew --> VectorStore
    end

    subgraph LLM Providers
        Claude[Claude] 
        Ollama[Ollama]
    end

    subgraph API Layer
        LLMProviders --> APIEndpoints[FastAPI Service]
        APIEndpoints --> Response[JSON Response]
    end

Features

Prometheus metrics collection and analysis
Vector similarity search for historical patterns
Flexible LLM support through TensorZero:
- Claude
- Local LLMs via Ollama
- Others: https://www.tensorzero.com/docs/gateway/guides/providers/openai-compatible
REST API with FastAPI
Structured responses with validation
Async support
Comprehensive test coverage

Prerequisites

Python 3.11+
Prometheus
Node Exporter

Installation

# Clone repository
git clone https://github.com/awcodify/ragnarock.git
cd ragnarock

# Create virtual environment and install dependencies
make setup

# Configure environment
cp .env.example .env

Run Dependencies

For development purposes, we already create docker-compose for all dependencies. You can simply run:

docker-compose up -d

If you want to run local LLM, we recommend using ollama:

Install ollama. Find the docs in: https://github.com/ollama/ollama
Run ollama:

# pull model
ollama pull llama3.2

# serve ollama api
OLLAMA_HOST=localhost:1090 ollama serve

Configuration

Required environment variables in .env:

PROMETHEUS_URL=http://localhost:9090

TensorZero Configuration

The project uses TensorZero for unified LLM interface. Configuration is defined in tensorzero/tensorzero.toml:

ragnarock/tensorzero/tensorzero.toml

Lines 1 to 25 in 6c14140

    
           [models.llama3_2] 
        
           routing = ["ollama"] 
        
           [models.llama3_2.providers.ollama] 
        
           type = "openai" 
        
           api_base = "http://host.docker.internal:1090/v1/" 
        
           model_name = "llama3.2" 
        
           api_key_location = "none" 
        
           [models.claude-3-5-sonnet-latest] 
        
           routing = ["anthropic"] 
        
           [models.claude-3-5-sonnet-latest.providers.anthropic] 
        
           type = "anthropic" 
        
           model_name = "claude-3-5-haiku-20241022" 
        
           [functions.generate_analysis] 
        
           type = "chat" 
        
           user_schema = "functions/generate_analysis/user_schema.json" 
        
           [functions.generate_analysis.variants.llama3_2] 
        
           type = "chat_completion" 
        
           model = "llama3_2" 
        
           system_template = "functions/generate_analysis/system_template.minijinja" 
        
           user_template = "functions/generate_analysis/user_template.minijinja"

Usage

Start Server

Run api by using this command:

make run

# or
# uvicorn src.api.main:app --reload

If server run, you can do analyze by using this endpoint:

curl -X POST http://localhost:8000/analyze \
  -H "Content-Type: application/json" \
  -d '{"query": "analyze current system health"}'

Example Response

{
  "current_metrics": {
    "cpu_usage": {
      "value": 0.05119298245614405,
      "timestamp": 1731501942
    },
    "memory_usage": {
      "value": 1494454272.0,
      "timestamp": 1731501942
    },
    "disk_usage": {
      "value": 12325642240.0,
      "timestamp": 1731501942
    },
    "network_receive": {
      "value": 0.0,
      "timestamp": 1731501942
    },
    "network_transmit": {
      "value": 0.0,
      "timestamp": 1731501942
    }
  },
  "similar_patterns": {
    "cpu_usage": {
      "value": 0.04961368692149644,
      "timestamp": 1731501905
    },
    "disk_usage": {
      "value": 12325597184.0,
      "timestamp": 1731501905
    },
    "memory_usage": {
      "value": 1498570752.0,
      "timestamp": 1731501905
    },
    "network_receive": {
      "value": 0.0,
      "timestamp": 1731501905
    },
    "network_transmit": {
      "value": 0.0,
      "timestamp": 1731501905
    }
  },
  "analysis": {
    "summary": "[ContentBlock(text=\"Based on the provided metrics, I can analyze the current system health as follows:\\n\\n1. CPU Usage:\\n   - The current CPU usage value is 0.05119298245614405 (approximately 5.12%).\\n   - This is a relatively low CPU utilization, suggesting that the system is not experiencing significant CPU load at the moment.\\n\\n2. Memory Usage:\\n   - The current memory usage value is 1494454272.0 bytes (approximately 1.39 GB).\\n   - Without additional context about the system's total memory capacity, it's difficult to determine if this memory usage is reasonable or concerning. However, it can serve as a baseline for comparison with historical data.\\n\\n3. Disk Usage:\\n   - The current disk usage value is 12325642240.0 bytes (approximately 11.48 GB).\\n   - Similar to memory usage, the significance of this value depends on the total disk capacity and the expected disk usage for the system's workload.\\n\\n4. Network Usage:\\n   - Both network_receive and network_transmit values are 0.0, indicating no network traffic at the time of measurement.\\n   - This could be expected if the system is not actively communicating over the network or if the measurement interval is too short to capture network activity.\\n\\nOverall, based on the provided metrics, the system appears to be in a relatively healthy state with low CPU usage and no significant network activity at the time of measurement. However, without additional context about the system's resources (e.g., total memory, disk capacity) and expected workload, it's difficult to make a definitive assessment of the system's health.\\n\\nIt's recommended to compare these current metrics with historical data and established baselines or thresholds to identify any potential issues or deviations from normal behavior.\", type='text')]",
    "historical_comparison": "Analysis based on historical data",
    "anomalies": [],
    "recommendations": [],
    "risk_level": "low"
  }
}

Development

# Setup project
make setup

# Run tests
make test

# Clean project
make clean

Name		Name	Last commit message	Last commit date
Latest commit History 43 Commits
src		src
tensorzero		tensorzero
tests		tests
.env.example		.env.example
.flake8		.flake8
.gitignore		.gitignore
Dockerfile		Dockerfile
Makefile		Makefile
README.md		README.md
docker-compose.yml		docker-compose.yml
project.toml		project.toml
prometheus.yml		prometheus.yml
requirements.txt		requirements.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Infrastructure RAG

Overview

Architecture

Features

Prerequisites

Installation

Run Dependencies

Configuration

TensorZero Configuration

Usage

Start Server

Example Response

Development

About

Releases

Packages

Languages

	[models.llama3_2]
	routing = ["ollama"]

	[models.llama3_2.providers.ollama]
	type = "openai"
	api_base = "http://host.docker.internal:1090/v1/"
	model_name = "llama3.2"
	api_key_location = "none"

	[models.claude-3-5-sonnet-latest]
	routing = ["anthropic"]

	[models.claude-3-5-sonnet-latest.providers.anthropic]
	type = "anthropic"
	model_name = "claude-3-5-haiku-20241022"

	[functions.generate_analysis]
	type = "chat"
	user_schema = "functions/generate_analysis/user_schema.json"

	[functions.generate_analysis.variants.llama3_2]
	type = "chat_completion"
	model = "llama3_2"
	system_template = "functions/generate_analysis/system_template.minijinja"
	user_template = "functions/generate_analysis/user_template.minijinja"

awcodify/ragnarock

Folders and files

Latest commit

History

Repository files navigation

Infrastructure RAG

Overview

Architecture

Features

Prerequisites

Installation

Run Dependencies

Configuration

TensorZero Configuration

Usage

Start Server

Example Response

Development

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages