
Infrastructure RAG

❗❗ NOTE ❗❗

🤖 AI-Driven Development: This is an experimental project built through human-AI collaboration using GitHub Copilot and Claude 3.5 Sonnet. The code architecture, implementation decisions, and problem-solving approaches were developed through iterative AI-assisted development.


Overview

Real-time infrastructure analysis using RAG (Retrieval Augmented Generation). Analyze Prometheus metrics using vector similarity search and LLM to provide insights, anomaly detection, and recommendations.

Architecture

graph LR
    subgraph Data Collection
        Prometheus --> CollectMetrics[Collect Metrics]
    end

    subgraph RAG Service
        CollectMetrics --> SearchHistorical[Search Historical]
        SearchHistorical --> VectorStore[Vector Store/Qdrant]
        VectorStore --> Analysis[Similar Patterns]
        Analysis --> TensorZero[TensorZero Gateway]
        TensorZero --> LLMProviders[LLM Providers]
        SearchHistorical --> StoreNew[Store New Metrics]
        StoreNew --> VectorStore
    end

    subgraph LLM Providers
        Claude[Claude] 
        Ollama[Ollama]
    end

    subgraph API Layer
        LLMProviders --> APIEndpoints[FastAPI Service]
        APIEndpoints --> Response[JSON Response]
    end
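As a rough sketch of the RAG loop in the diagram above (function and key names here are hypothetical, not the project's actual modules), the service embeds the current metric snapshot, searches the vector store for the most similar historical snapshot, and hands both to the LLM:

```python
import math

def embed(metrics: dict) -> list:
    # Toy embedding: a fixed-order vector of metric values.
    # (The real service presumably uses a proper embedding model
    # before storing vectors in Qdrant.)
    keys = ["cpu_usage", "memory_usage", "disk_usage"]
    return [metrics.get(k, 0.0) for k in keys]

def cosine_similarity(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def most_similar(current: dict, history: list) -> dict:
    # Stand-in for the Qdrant similarity search in the diagram:
    # return the stored snapshot whose vector is closest to the current one.
    cur_vec = embed(current)
    return max(history, key=lambda h: cosine_similarity(cur_vec, embed(h)))
```

The result of `most_similar` corresponds to the `similar_patterns` block in the example response below; the pair (current, most similar historical) is what gets sent to the LLM for analysis.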

Features

  • Real-time collection of Prometheus metrics (CPU, memory, disk, network)
  • Vector similarity search over historical metrics using Qdrant
  • LLM analysis through the TensorZero gateway (Claude or local Ollama)
  • Anomaly detection, recommendations, and a risk-level assessment per query

Prerequisites

  • Python 3.11+
  • Prometheus
  • Node Exporter

Installation

# Clone repository
git clone https://github.com/awcodify/ragnarock.git
cd ragnarock

# Create virtual environment and install dependencies
make setup

# Configure environment
cp .env.example .env

Run Dependencies

For development, a docker-compose file is provided with all dependencies. Simply run:

docker-compose up -d

If you want to run a local LLM, we recommend Ollama:

  1. Install Ollama (see the docs at https://github.com/ollama/ollama).
  2. Run Ollama:
# pull model
ollama pull llama3.2

# serve ollama api
OLLAMA_HOST=localhost:1090 ollama serve

Configuration

Required environment variables in .env:

PROMETHEUS_URL=http://localhost:9090
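For illustration (a sketch, not code from this repo), a collector would read `PROMETHEUS_URL` and hit Prometheus's standard instant-query endpoint, `GET /api/v1/query?query=<promql>`:

```python
import os
from urllib.parse import urlencode

def instant_query_url(promql: str, base: str = "") -> str:
    # Fall back to the PROMETHEUS_URL env var, then the default from .env.example.
    base = base or os.environ.get("PROMETHEUS_URL", "http://localhost:9090")
    # Prometheus's instant-query API: GET /api/v1/query?query=<promql>
    return f"{base}/api/v1/query?{urlencode({'query': promql})}"
```

For example, `instant_query_url("up")` yields `http://localhost:9090/api/v1/query?query=up`, which any HTTP client can fetch to get the current sample values as JSON.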

TensorZero Configuration

The project uses TensorZero as a unified interface to the LLM providers. Configuration is defined in tensorzero/tensorzero.toml:

[models.llama3_2]
routing = ["ollama"]

[models.llama3_2.providers.ollama]
type = "openai"
api_base = "http://host.docker.internal:1090/v1/"
model_name = "llama3.2"
api_key_location = "none"

[models.claude-3-5-sonnet-latest]
routing = ["anthropic"]

[models.claude-3-5-sonnet-latest.providers.anthropic]
type = "anthropic"
model_name = "claude-3-5-haiku-20241022"

[functions.generate_analysis]
type = "chat"
user_schema = "functions/generate_analysis/user_schema.json"

[functions.generate_analysis.variants.llama3_2]
type = "chat_completion"
model = "llama3_2"
system_template = "functions/generate_analysis/system_template.minijinja"
user_template = "functions/generate_analysis/user_template.minijinja"
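With this config, the service can invoke `generate_analysis` through the TensorZero gateway's HTTP inference API. The payload shape below (function_name plus input.messages) and the default gateway port 3000 are assumptions based on TensorZero's documented /inference endpoint, not taken from this repo; verify against the gateway docs:

```python
import json

def analysis_payload(metrics: dict) -> dict:
    # Build a request for the generate_analysis function defined in
    # tensorzero.toml. The metrics dict is passed as the user message;
    # the real service presumably formats it per user_schema.json.
    return {
        "function_name": "generate_analysis",
        "input": {
            "messages": [
                {"role": "user", "content": json.dumps(metrics)},
            ]
        },
    }

if __name__ == "__main__":
    import urllib.request
    req = urllib.request.Request(
        "http://localhost:3000/inference",  # assumed default gateway address
        data=json.dumps(analysis_payload({"cpu_usage": 0.05})).encode(),
        headers={"Content-Type": "application/json"},
    )
    print(urllib.request.urlopen(req).read().decode())
```

TensorZero then routes the call to whichever variant is selected (here the `llama3_2` variant backed by Ollama), applying the system and user minijinja templates.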

Usage

Start Server

Run the API with:

make run

# or
# uvicorn src.api.main:app --reload 

Once the server is running, request an analysis via this endpoint:

curl -X POST http://localhost:8000/analyze \
  -H "Content-Type: application/json" \
  -d '{"query": "analyze current system health"}'

Example Response

{
  "current_metrics": {
    "cpu_usage": {
      "value": 0.05119298245614405,
      "timestamp": 1731501942
    },
    "memory_usage": {
      "value": 1494454272.0,
      "timestamp": 1731501942
    },
    "disk_usage": {
      "value": 12325642240.0,
      "timestamp": 1731501942
    },
    "network_receive": {
      "value": 0.0,
      "timestamp": 1731501942
    },
    "network_transmit": {
      "value": 0.0,
      "timestamp": 1731501942
    }
  },
  "similar_patterns": {
    "cpu_usage": {
      "value": 0.04961368692149644,
      "timestamp": 1731501905
    },
    "disk_usage": {
      "value": 12325597184.0,
      "timestamp": 1731501905
    },
    "memory_usage": {
      "value": 1498570752.0,
      "timestamp": 1731501905
    },
    "network_receive": {
      "value": 0.0,
      "timestamp": 1731501905
    },
    "network_transmit": {
      "value": 0.0,
      "timestamp": 1731501905
    }
  },
  "analysis": {
    "summary": "[ContentBlock(text=\"Based on the provided metrics, I can analyze the current system health as follows:\\n\\n1. CPU Usage:\\n   - The current CPU usage value is 0.05119298245614405 (approximately 5.12%).\\n   - This is a relatively low CPU utilization, suggesting that the system is not experiencing significant CPU load at the moment.\\n\\n2. Memory Usage:\\n   - The current memory usage value is 1494454272.0 bytes (approximately 1.39 GB).\\n   - Without additional context about the system's total memory capacity, it's difficult to determine if this memory usage is reasonable or concerning. However, it can serve as a baseline for comparison with historical data.\\n\\n3. Disk Usage:\\n   - The current disk usage value is 12325642240.0 bytes (approximately 11.48 GB).\\n   - Similar to memory usage, the significance of this value depends on the total disk capacity and the expected disk usage for the system's workload.\\n\\n4. Network Usage:\\n   - Both network_receive and network_transmit values are 0.0, indicating no network traffic at the time of measurement.\\n   - This could be expected if the system is not actively communicating over the network or if the measurement interval is too short to capture network activity.\\n\\nOverall, based on the provided metrics, the system appears to be in a relatively healthy state with low CPU usage and no significant network activity at the time of measurement. However, without additional context about the system's resources (e.g., total memory, disk capacity) and expected workload, it's difficult to make a definitive assessment of the system's health.\\n\\nIt's recommended to compare these current metrics with historical data and established baselines or thresholds to identify any potential issues or deviations from normal behavior.\", type='text')]",
    "historical_comparison": "Analysis based on historical data",
    "anomalies": [],
    "recommendations": [],
    "risk_level": "low"
  }
}
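To consume the response programmatically (a sketch; the field names follow the example above, and `summarize` is a hypothetical helper, not part of this project):

```python
def summarize(resp: dict) -> str:
    # Condense an /analyze response into a one-line health summary.
    analysis = resp["analysis"]
    anomalies = analysis.get("anomalies", [])
    cpu = resp["current_metrics"]["cpu_usage"]["value"]
    return f"risk={analysis['risk_level']} anomalies={len(anomalies)} cpu={cpu:.2%}"
```

Running this on the example response yields a line like `risk=low anomalies=0 cpu=5.12%`, handy for alerting or dashboards that only need the headline verdict.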

Development

# Setup project
make setup

# Run tests
make test

# Clean project
make clean
