An AI agent system for debugging infrastructure issues. This project consists of a Leader Agent that orchestrates debugging operations and a Debugger Agent that performs the actual diagnostics.
The system uses a two-agent architecture:
- Leader Agent: Receives user queries, understands the issue, and constructs prompts for the Debugger Agent.
- Debugger Agent: Specialized in infrastructure debugging, performs diagnostics, and suggests solutions.
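The hand-off between the two agents can be pictured with a minimal sketch. The class names and the `handle` and `diagnose` methods below are hypothetical and only illustrate the flow. They are not the project's actual API, and the diagnosis is stubbed rather than produced by a model.

```python
from dataclasses import dataclass


@dataclass
class DebuggerAgent:
    """Hypothetical stand-in for the agent that runs diagnostics."""

    def diagnose(self, prompt: str) -> str:
        # A real agent would call a model and tools here; this stub only echoes.
        return f"Diagnosis for: {prompt}"


@dataclass
class LeaderAgent:
    """Hypothetical stand-in for the orchestrating agent."""

    debugger: DebuggerAgent

    def handle(self, query: str) -> str:
        # Wrap the user query in a prompt and delegate to the debugger agent.
        prompt = f"Investigate the following issue: {query}"
        return self.debugger.diagnose(prompt)


leader = LeaderAgent(debugger=DebuggerAgent())
print(leader.handle("pods keep crashing with OOMKilled"))
```

Keeping the debugger behind a single method call means the leader only needs to know how to phrase a prompt, not how diagnostics are run.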
- Python 3.8+
- Poetry for dependency management
- Clone the repository:
  git clone https://github.com/yourusername/debugger.git
  cd debugger
- Install dependencies using Poetry:
  poetry install
- Create a .env file with your API keys:
  OPENAI_API_KEY=your_openai_api_key
  OPENAI_ORGANIZATION=your_org_id # Optional
  DEFAULT_MODEL=gpt-4o # Optional, defaults to gpt-4o
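As a rough sketch of how these settings might be read at startup, the snippet below pulls the three variables from the process environment using only the standard library. `Settings` and `load_settings` are hypothetical names. The sketch assumes the `.env` file has already been loaded into the environment (for example by a dotenv loader), and the project's actual loading code may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the values defined in the .env file."""

    api_key: str
    organization: Optional[str]
    model: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the settings from a mapping, applying the documented default model."""
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        # The API key is the only variable not marked as optional.
        raise RuntimeError("OPENAI_API_KEY is required")
    return Settings(
        api_key=api_key,
        organization=env.get("OPENAI_ORGANIZATION") or None,
        model=env.get("DEFAULT_MODEL") or "gpt-4o",
    )
```

Passing the mapping as a parameter keeps the function easy to exercise with a plain dictionary instead of the real environment.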
Run the debugger agent system:
# Interactive mode
poetry run python main.py
# With a direct query
poetry run python main.py --query "My Kubernetes pods keep crashing with OOMKilled"
# Specify a different model
poetry run python main.py --model "gpt-4-turbo"
Run tests using pytest:
make test
Format and lint the code:
make lint
make format
License: MIT
You are a leader agent that instructs the debugger agent to debug issues in the infrastructure. Use the tools provided to help you perform the steps below.
- Receive the query and understand the issue. The query can be a statement that describes the issue.
- Construct a prompt with the context required to debug the issue. The context can be obtained from tools/RAG datastores.
- Send the prompt to the debugger agent and get the response.
- If the response from the debugger agent is not clear, return a response saying you were unable to debug the issue.
- If the response from the debugger agent is clear, return a response containing the debugger agent's response.
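The leader steps above can be sketched as a small control flow. Everything here is an illustrative assumption: `lead`, `build_prompt`, and `is_clear` are hypothetical names, the context retriever and the debugger are passed in as plain callables, and the clarity check is a placeholder heuristic rather than how the real agent judges a response.

```python
from typing import Callable, List

UNABLE_TO_DEBUG = "I was unable to debug the issue."


def build_prompt(query: str, context: List[str]) -> str:
    """Combine the user query with any retrieved context snippets."""
    lines = [f"Issue: {query}"]
    if context:
        lines.append("Context:")
        lines.extend(f"- {item}" for item in context)
    return "\n".join(lines)


def is_clear(response: str) -> bool:
    # Placeholder heuristic (assumption): treat any non-blank response as clear.
    return bool(response and response.strip())


def lead(
    query: str,
    retrieve_context: Callable[[str], List[str]],
    debugger: Callable[[str], str],
) -> str:
    """Build a prompt, delegate to the debugger, and report the outcome."""
    context = retrieve_context(query)
    prompt = build_prompt(query, context)
    response = debugger(prompt)
    if not is_clear(response):
        return UNABLE_TO_DEBUG
    return response
```

Injecting the retriever and the debugger as callables keeps the flow testable without a model or a datastore.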
You are a debugger agent that debugs issues in the infrastructure. Use the tools provided to help you perform the steps below.
- Receive the prompt and understand the issue.
- Construct the steps or actions to debug the issue.
- Execute the steps or actions to identify the issue using appropriate tools.
- If the necessary tool isn't available, look for alternative ways to identify the issue with the existing tools.
- If the issue is not identified, return a response saying you were unable to identify the issue.
- If the issue is identified, construct a prompt describing the issue and the potential steps to resolve it.
- Send the prompt to the fixer agent and get the response.
- If the response from the fixer agent is not clear, return a response saying you were unable to fix the issue.
- If the response from the fixer agent is clear, return a response containing the fixer agent's response.
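One way to picture the debugger steps, including the fallback to alternative tools, is the sketch below. The tool registry, the `plan` format (a list of steps, each listing tool names in order of preference), and the names `run_step` and `debug` are all hypothetical. Real tools would query the infrastructure rather than return canned strings.

```python
from typing import Callable, Dict, List, Optional

Tool = Callable[[str], Optional[str]]

UNABLE_TO_IDENTIFY = "I was unable to identify the issue."
UNABLE_TO_FIX = "I was unable to fix the issue."


def run_step(tools: Dict[str, Tool], preferred: List[str], target: str) -> Optional[str]:
    """Try each tool name in order, skipping any that are not registered."""
    for name in preferred:
        tool = tools.get(name)
        if tool is None:
            continue  # Tool unavailable: fall through to the next alternative.
        finding = tool(target)
        if finding:
            return finding
    return None


def debug(
    target: str,
    tools: Dict[str, Tool],
    plan: List[List[str]],
    fixer: Callable[[str], str],
) -> str:
    """Run the planned steps, then hand any findings to the fixer."""
    findings = []
    for preferred in plan:
        finding = run_step(tools, preferred, target)
        if finding:
            findings.append(finding)
    if not findings:
        return UNABLE_TO_IDENTIFY
    fix_prompt = "Issue: " + "; ".join(findings) + "\nSuggest and apply a fix."
    response = fixer(fix_prompt)
    # Placeholder clarity check (assumption): a blank response counts as unclear.
    if not response or not response.strip():
        return UNABLE_TO_FIX
    return response
```

For example, a plan of `[["describe", "logs"]]` with only a `logs` tool registered skips the missing `describe` tool and still produces a finding.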
You are a fixer agent that performs actions to fix a particular issue in the infrastructure. Use the tools provided to help you perform the steps below.
- Receive the prompt and understand the steps to fix the issue.
- Execute the steps or actions to fix the issue using appropriate tools.
- If the necessary tool isn't available, look for alternative ways to perform the action with the existing tools.
- If the issue is not fixed, return a response saying you were unable to fix the issue.
- If the issue is fixed, return a response saying the issue is fixed.
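The fixer steps above reduce to running each remediation step and reporting success or failure. In the sketch below, the `fix` function, the `actions` registry, and the convention that an action returns `True` on success are assumptions made for illustration and are not taken from the project.

```python
from typing import Callable, Dict, List, Tuple

Action = Callable[[], bool]

FIXED = "The issue is fixed."
UNABLE_TO_FIX = "I was unable to fix the issue."


def fix(steps: List[Tuple[str, ...]], actions: Dict[str, Action]) -> str:
    """Run every step; each step lists action names in order of preference."""
    if not steps:
        return UNABLE_TO_FIX  # Nothing to execute, so nothing was fixed.
    for alternatives in steps:
        succeeded = False
        for name in alternatives:
            action = actions.get(name)
            if action is None:
                continue  # Action unavailable: try the next alternative.
            if action():
                succeeded = True
                break
        if not succeeded:
            return UNABLE_TO_FIX
    return FIXED
```

Stopping at the first step that cannot be completed avoids reporting a fix when only part of the remediation ran.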