runpod-skill

A Claude Code skill for deploying and managing GPU/CPU infrastructure on RunPod.

Built for any AI agent that needs to programmatically launch pods, manage storage, and control compute resources — not tied to a specific workflow or use case.

What it covers

  • Pod lifecycle — create, list, start, stop, restart, reset, update, terminate
  • GPU selection — full GPU table with IDs, VRAM, and fallback chains by use case
  • Spot instances — podRentInterruptable via GraphQL with bid pricing
  • Network volumes — persistent storage independent of pods (CRUD + attach)
  • Templates — reusable pod configurations via REST API
  • Connectivity — SSH, HTTP proxy, TCP ports with timeout warnings
  • Billing — cost queries for pods, endpoints, and volumes
  • CLI — runpodctl reference for common operations
  • Error handling — retry patterns for GPU stock-outs, zero-GPU restarts, volume mismatches
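The retry pattern for GPU stock-outs, combined with a fallback chain, might look roughly like the sketch below. It is an illustration under stated assumptions, not the skill's actual code: `OutOfStock`, `create_with_fallback`, and the `create` callable are hypothetical names, and the caller is assumed to wrap whatever API call it uses so that a stock-out surfaces as `OutOfStock`.

```python
import time
from typing import Callable, Optional, Sequence


class OutOfStock(Exception):
    """Hypothetical error: the caller's create function raises it on a stock-out."""


def create_with_fallback(
    create: Callable[[str], dict],
    gpu_chain: Sequence[str],
    rounds: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Try each GPU ID in order; back off exponentially between full passes."""
    if not gpu_chain:
        raise ValueError("gpu_chain must contain at least one GPU ID")
    last_error: Optional[Exception] = None
    for attempt in range(rounds):
        for gpu_id in gpu_chain:
            try:
                # `create` is supplied by the caller (assumed to issue the API call).
                return create(gpu_id)
            except OutOfStock as exc:
                last_error = exc  # remember the failure, try the next fallback
        if attempt < rounds - 1:
            # Wait 1x, 2x, 4x ... base_delay before walking the chain again.
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("no GPU in the fallback chain was available") from last_error
```

Passing `sleep` as a parameter keeps the function easy to exercise without real delays; in normal use the default `time.sleep` applies.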

What it gets right that others get wrong

| Common mistake | This skill |
|---|---|
| Wrong REST URL (api.runpod.io) | Correct: rest.runpod.io/v1 |
| region: "us-west-1" (doesn't exist) | Correct: dataCenterIds: ["US-TX-3"] |
| env format same for both APIs | Documents the difference: REST = object, GraphQL = array of {key, value} |
| gpuTypeId vs gpuTypeIds confusion | Flags singular (GraphQL) vs plural array (REST) |
| Container disk is persistent (it isn't) | Explicit ephemeral/persistent/network storage model |
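The env and GPU-field differences between the two APIs can be made concrete with a small sketch. The helper names `rest_pod_fields` and `graphql_pod_fields` are hypothetical, the GPU ID in the usage note is only a placeholder, and the dictionaries contain just the fields named above; a real request would need further fields that are not shown here.

```python
from typing import Dict, List, Sequence


def rest_pod_fields(
    gpu_ids: Sequence[str],
    env: Dict[str, str],
    data_center_ids: Sequence[str],
) -> Dict[str, object]:
    """Fields in the REST shape: plural gpuTypeIds array, env as a plain object."""
    if not gpu_ids:
        raise ValueError("at least one GPU type ID is required")
    return {
        "gpuTypeIds": list(gpu_ids),
        "dataCenterIds": list(data_center_ids),  # e.g. ["US-TX-3"], not a region string
        "env": dict(env),
    }


def graphql_pod_fields(gpu_id: str, env: Dict[str, str]) -> Dict[str, object]:
    """Fields in the GraphQL shape: singular gpuTypeId, env as [{key, value}, ...]."""
    if not gpu_id:
        raise ValueError("a GPU type ID is required")
    env_array: List[Dict[str, str]] = [
        {"key": key, "value": value} for key, value in env.items()
    ]
    return {"gpuTypeId": gpu_id, "env": env_array}
```

For example, `graphql_pod_fields("SOME-GPU-ID", {"MODE": "train"})` yields an `env` of `[{"key": "MODE", "value": "train"}]`, while the REST helper passes the same mapping through unchanged.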

Install

Copy the SKILL.md file into your Claude Code skills directory:

# If you have a skills directory configured
mkdir -p ~/.claude/skills/runpod
cp SKILL.md ~/.claude/skills/runpod/SKILL.md

# Or place it alongside your project
mkdir -p .claude/skills/runpod
cp SKILL.md .claude/skills/runpod/SKILL.md

The skill activates when prompts mention RunPod, GPU cloud, launching pods, or related terms.

Prerequisites

  • A RunPod account with API access
  • API key set as RUNPOD_API_KEY environment variable
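As a minimal sketch of how an agent might pick up the key, the helper below reads `RUNPOD_API_KEY` and fails early when it is missing. The function name `auth_headers` is hypothetical, and sending the key as a Bearer token over HTTPS is an assumption to verify against the RunPod API documentation.

```python
import os
from typing import Dict, Mapping

# Base URL as given in this README; the https scheme is assumed.
REST_BASE_URL = "https://rest.runpod.io/v1"


def auth_headers(environ: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build request headers from RUNPOD_API_KEY, raising if it is unset or empty."""
    key = environ.get("RUNPOD_API_KEY", "").strip()
    if not key:
        raise RuntimeError("RUNPOD_API_KEY is not set")
    return {
        # Assumption: the API accepts the key as a Bearer token.
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Checking the variable up front turns a missing key into a clear local error instead of an authentication failure from the server.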

API coverage

The skill standardizes on the REST API (rest.runpod.io/v1) for all CRUD operations and documents GraphQL (api.runpod.io/graphql) for three things REST can't do:

  1. GPU availability queries with stockStatus filtering
  2. Runtime metrics (GPU utilization, container CPU/memory)
  3. Spot instance deployment (podRentInterruptable)
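The split described above amounts to a simple routing rule, sketched below. The operation labels (`gpu_availability`, `runtime_metrics`, `spot_deploy`) and the function `endpoint_for` are hypothetical names chosen for illustration; only the two hosts come from this README, and the https scheme is assumed.

```python
REST_BASE_URL = "https://rest.runpod.io/v1"
GRAPHQL_URL = "https://api.runpod.io/graphql"

# Hypothetical labels for the three operations documented as GraphQL-only.
GRAPHQL_ONLY = frozenset({"gpu_availability", "runtime_metrics", "spot_deploy"})


def endpoint_for(operation: str) -> str:
    """Return the GraphQL URL for the three GraphQL-only operations, else REST."""
    return GRAPHQL_URL if operation in GRAPHQL_ONLY else REST_BASE_URL
```

Anything outside the three listed cases, such as creating or listing pods, falls through to the REST base URL.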

License

MIT
