This repository contains tools for solving Abstract Reasoning Corpus (ARC) tasks using LangGraph-based AI agents.
Purpose: This is the main script for running a multi-solution LangGraph agent on ARC tasks. It uses multiple AI models to generate, refine, and fuse solution strategies through iterative reasoning, code generation, and execution.
What it does:
- Runs AI agents to solve ARC challenges using configurable language models (GPT-4, Gemini, Llama, Qwen, etc.)
- Supports both single task testing and batch processing with parallel workers
- Generates multiple solution attempts per task through iterative refinement and fusion
- Stores results in structured output directories under `output/output_agent/`
- Supports resuming previous runs and evaluating existing solutions
- Optional features include RAG hints, visual cues, and parallel evaluation
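Batch mode with parallel workers can be pictured as a worker pool mapping over task IDs. The sketch below is hypothetical and not taken from the script: `solve_task` is a placeholder for the real generate/refine/fuse pipeline, and the task IDs are invented examples.

```python
# Hypothetical sketch of batch-mode parallelism; `solve_task` and the
# task-ID list stand in for the script's actual implementation.
from concurrent.futures import ThreadPoolExecutor

NUM_WORKERS = 4  # mirrors the NUM_WORKERS config variable


def solve_task(task_id: str) -> dict:
    # Placeholder for the LangGraph generate/refine/fuse pipeline.
    return {"task_id": task_id, "solved": False}


def run_batch(task_ids: list[str]) -> list[dict]:
    # Each worker handles one task at a time; results keep input order.
    with ThreadPoolExecutor(max_workers=NUM_WORKERS) as pool:
        return list(pool.map(solve_task, task_ids))


if __name__ == "__main__":
    results = run_batch(["task_a", "task_b", "task_c"])
    print(len(results))  # one result per task
```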
How to run:
```bash
# Basic usage - runs with default configuration
python run_langgraph_agent.py
```

The script is configured via variables at the top of the file.

Key configuration variables (edit in the script):
- `REASONING_MODEL`: Model for reasoning and reflection (e.g., "gemini-2.5-flash", "gpt-4o-mini")
- `CODING_MODEL`: Model for code generation
- `MODE`: "single" for one task, "batch" for multiple tasks
- `NUM_TASKS`: Number of tasks to process in batch mode
- `NUM_WORKERS`: Number of parallel workers for batch processing
- `RESUME_RUN`: Resume a previous run (set to a run ID or "latest")
- `EVALUATE_ONLY`: Only evaluate existing solutions without generating new ones
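Taken together, the variables above might look like this at the top of `run_langgraph_agent.py`. The names match the README; the specific values are illustrative examples, not the script's defaults.

```python
# Hypothetical config block; names follow the README, values are examples.
REASONING_MODEL = "gemini-2.5-flash"  # reasoning & reflection
CODING_MODEL = "gpt-4o-mini"          # code generation
MODE = "batch"                        # "single" or "batch"
NUM_TASKS = 50                        # tasks to process in batch mode
NUM_WORKERS = 4                       # parallel workers
RESUME_RUN = None                     # run ID, "latest", or None
EVALUATE_ONLY = False                 # skip generation, only evaluate
```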
Purpose: A Flask-based web application for visualizing and inspecting ARC task solutions generated by the agent.
What it does:
- Provides a web interface to browse all runs stored in `output/output_agent/`
- Displays run summaries with accuracy statistics
- Shows detailed visualizations of input/output grids for each task
- Highlights differences between expected and predicted outputs
- Allows inspection of individual solution attempts and their reasoning traces
How to run:
```bash
# Install Flask if not already installed
pip install flask

# Start the visualizer
python arc_visualizer.py
```

Then open your browser to http://localhost:5000.

Features:
- Browse all completed runs sorted by date (newest first)
- View task-level results with color-coded grid visualizations
- Compare expected vs. predicted outputs with diff highlighting
- Inspect solution code and reasoning for each attempt
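The diff highlighting described above boils down to comparing two grids cell by cell. A minimal sketch, assuming grids are lists of lists of ints (the visualizer's actual implementation may differ):

```python
# Hypothetical grid-diff helper: returns the set of (row, col) cells
# where the predicted grid disagrees with the expected one.
def grid_diff(expected, predicted):
    """Return positions where the two grids differ.

    A size mismatch marks every expected cell as a difference, since the
    prediction cannot match a grid of the wrong shape.
    """
    if (len(expected) != len(predicted)
            or any(len(a) != len(b) for a, b in zip(expected, predicted))):
        return {(r, c) for r, row in enumerate(expected)
                for c in range(len(row))}
    return {(r, c)
            for r, (erow, prow) in enumerate(zip(expected, predicted))
            for c, (e, p) in enumerate(zip(erow, prow))
            if e != p}
```

The returned coordinates can then be used to color mismatched cells in the rendered output grid.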
Install dependencies:

```bash
pip install -r requirements.txt
```

Key dependencies include:
- langchain & langgraph for agent orchestration
- openai, anthropic, google-generativeai for LLM providers
- flask for visualization
- qdrant-client for optional RAG features
- Run the agent using `run_langgraph_agent.py` to generate solutions
- Visualize results using `arc_visualizer.py` to inspect and analyze the outputs
- Iterate by adjusting configuration parameters and re-running
Results are saved to `output/output_agent/<timestamp>/`:
- Each task has its own subdirectory with detailed JSON output
- `params.json`: Run configuration parameters
- `training_task_ids.txt` / `evaluation_task_ids.txt`: Lists of processed tasks
- Individual task folders contain solution attempts, code, and reasoning traces
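Reading a run's outputs back programmatically follows directly from the layout above. This sketch builds a sample run directory in a temp folder so it is self-contained; the `load_run` helper and the task-folder name are hypothetical illustrations, not part of the repository.

```python
# Hypothetical sketch of reading a run's output directory, following the
# layout described above (params.json plus one subdirectory per task).
import json
import tempfile
from pathlib import Path


def load_run(run_dir: Path) -> dict:
    """Load run parameters and the list of per-task subdirectories."""
    params = json.loads((run_dir / "params.json").read_text())
    task_dirs = sorted(p.name for p in run_dir.iterdir() if p.is_dir())
    return {"params": params, "tasks": task_dirs}


# Build a minimal example run directory.
root = Path(tempfile.mkdtemp())
(root / "params.json").write_text(json.dumps({"MODE": "batch"}))
(root / "0a1b2c3d").mkdir()  # one task subdirectory

summary = load_run(root)
print(summary["params"]["MODE"])  # batch
print(summary["tasks"])           # ['0a1b2c3d']
```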