daft-cli is a simple launcher for spinning up and managing Ray clusters for Daft.
Getting started with Daft in a local environment is easy. However, getting started with Daft in a cloud environment is substantially more difficult. So much more difficult, in fact, that users end up spending more time setting up their environment than actually playing with our query engine.
Daft CLI aims to solve this problem by providing a simple CLI tool to remove all of this unnecessary heavy-lifting.
What Daft CLI is capable of:
- Spinning up clusters (Provisioned mode only)
- Listing all available clusters as well as their statuses (Provisioned mode only)
- Submitting jobs to a cluster (Both Provisioned and BYOC modes)
- Connecting to the cluster (Provisioned mode only)
- Spinning down clusters (Provisioned mode only)
- Creating configuration files (Both modes)
- Running raw SQL statements (BYOC mode only)
Daft CLI supports two modes of operation:
- Provisioned: Automatically provisions and manages Ray clusters in AWS
- BYOC (Bring Your Own Cluster): Connects to existing Ray clusters in Kubernetes
Command Group | Command | Provisioned | BYOC |
---|---|---|---|
cluster | up | ✅ | ❌ |
cluster | down | ✅ | ❌ |
cluster | kill | ✅ | ❌ |
cluster | list | ✅ | ❌ |
cluster | connect | ✅ | ❌ |
cluster | ssh | ✅ | ❌ |
job | submit | ✅ | ✅ |
job | sql | ✅ | ❌ |
job | status | ✅ | ❌ |
job | logs | ✅ | ❌ |
config | init | ✅ | ✅ |
config | check | ✅ | ❌ |
config | export | ✅ | ❌ |
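For scripting around the CLI, the support matrix above can be written down as a small lookup. This is only a transcription of the table into Python; the names `SUPPORT` and `is_supported` are hypothetical and are not part of Daft CLI.

```python
# Support matrix transcribed from the table above.
# Keys are (command group, command); values are the modes that support it.
BOTH = {"provisioned", "byoc"}
PROVISIONED_ONLY = {"provisioned"}

SUPPORT = {
    ("cluster", "up"): PROVISIONED_ONLY,
    ("cluster", "down"): PROVISIONED_ONLY,
    ("cluster", "kill"): PROVISIONED_ONLY,
    ("cluster", "list"): PROVISIONED_ONLY,
    ("cluster", "connect"): PROVISIONED_ONLY,
    ("cluster", "ssh"): PROVISIONED_ONLY,
    ("job", "submit"): BOTH,
    ("job", "sql"): PROVISIONED_ONLY,
    ("job", "status"): PROVISIONED_ONLY,
    ("job", "logs"): PROVISIONED_ONLY,
    ("config", "init"): BOTH,
    ("config", "check"): PROVISIONED_ONLY,
    ("config", "export"): PROVISIONED_ONLY,
}


def is_supported(group: str, command: str, mode: str) -> bool:
    """Return True if the table lists `command` as supported in `mode`."""
    return mode in SUPPORT.get((group, command), set())
```

A wrapper script could call `is_supported("job", "submit", "byoc")` before invoking the CLI, so that an unsupported command fails early with a clear message.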
You'll need a Python package manager installed. We recommend using uv for all things Python.
For Provisioned mode:
- A valid AWS account with the necessary IAM role to spin up EC2 instances. This IAM role can either be created by you (assuming you have the appropriate permissions) or by your administrator.
- The AWS CLI installed and configured on your machine.
- Log in using the AWS CLI.
For BYOC mode:
- A Kubernetes cluster with Ray already deployed
  - Can be local (minikube/kind), cloud-managed (EKS/GKE/AKS), or on-premise
  - See our BYOC setup guides for detailed instructions
- Ray cluster running in your Kubernetes cluster
  - Must be installed and configured using Helm
  - See provider-specific guides for installation steps
- Daft installed on the Ray cluster
- kubectl installed and configured with the correct context
- Appropriate permissions to access the namespace where Ray is deployed
To enable SSH access and port forwarding for provisioned clusters, you need to:
- Create an SSH key pair (if you don't already have one):

      # Generate a new key pair
      ssh-keygen -t rsa -b 2048 -f ~/.ssh/daft-key
      # This will create:
      #   ~/.ssh/daft-key (private key)
      #   ~/.ssh/daft-key.pub (public key)

- Import the public key to AWS:

      # Import the public key to AWS
      aws ec2 import-key-pair \
        --key-name "daft-key" \
        --public-key-material fileb://~/.ssh/daft-key.pub

- Set proper permissions on your private key:

      chmod 600 ~/.ssh/daft-key

- Update your daft configuration to use this key:

      [setup.provisioned]
      # ... other config ...
      ssh-private-key = "~/.ssh/daft-key" # Path to your private key
      ssh-user = "ubuntu" # User depends on the AMI (ubuntu for Ubuntu AMIs)

Notes:
- The key name in AWS must match the name of your key file (without the extension)
- The private key must be readable only by you (hence the chmod 600)
- Different AMIs use different default users:
  - Ubuntu AMIs: use "ubuntu"
  - Amazon Linux AMIs: use "ec2-user"
  - Make sure this matches your ssh-user configuration
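The permission requirement in the notes above can be checked programmatically before launching a cluster. The following is a minimal sketch for POSIX systems; the helper name `key_is_private` is hypothetical and not part of Daft CLI.

```python
import os
import stat
from pathlib import Path


def key_is_private(key_path: str) -> bool:
    """Return True if the file grants no permissions to group or others.

    Hypothetical helper for illustration; a key set with `chmod 600` passes.
    """
    mode = os.stat(Path(key_path).expanduser()).st_mode
    # S_IRWXG and S_IRWXO cover the read, write and execute bits
    # for group and others respectively.
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```

For example, `key_is_private("~/.ssh/daft-key")` should return `True` once the `chmod 600` step has been applied to that file.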
Using uv:
# create project
mkdir my-project
cd my-project
# initialize project and setup virtual env
uv init
uv venv
source .venv/bin/activate
# install launcher
uv pip install daft-launcher
Daft CLI is driven primarily by a configuration file.
By default, Daft CLI looks in your current working directory ($CWD) for a file named .daft.toml.
You can override this behaviour by specifying a custom configuration file.
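The lookup rule described above can be sketched in Python as follows. This only illustrates the documented behaviour (an explicit path wins, otherwise `.daft.toml` in the current directory); it is not Daft CLI's actual implementation, and the function name `resolve_config_path` is hypothetical.

```python
from pathlib import Path
from typing import Optional

DEFAULT_CONFIG_NAME = ".daft.toml"  # default file name stated in this README


def resolve_config_path(custom_path: Optional[str] = None,
                        cwd: Optional[Path] = None) -> Path:
    """Return the custom path if one is given, else <cwd>/.daft.toml.

    Hypothetical helper for illustration; not part of Daft CLI.
    """
    if custom_path is not None:
        # An explicitly specified configuration file takes precedence.
        return Path(custom_path).expanduser()
    base = cwd if cwd is not None else Path.cwd()
    return base / DEFAULT_CONFIG_NAME
```

Calling `resolve_config_path()` with no arguments yields the default location, while `resolve_config_path("my-config.toml")` returns the custom file.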
# Initialize a new provisioned mode configuration
daft config init --provider provisioned
# or use the default provider (provisioned)
daft config init
# Cluster management
daft provisioned up
daft provisioned list
daft provisioned connect
daft provisioned ssh
daft provisioned down
daft provisioned kill
# Job management (works in both modes)
daft job submit example-job
daft job status example-job
daft job logs example-job
# Configuration management
daft config check
daft config export
# Initialize a new BYOC mode configuration
daft config init --provider byoc
You can specify a custom configuration file path with the -c flag:
daft -c my-config.toml job submit example-job
Example Provisioned mode configuration:
[setup]
name = "my-daft-cluster"
version = "0.1.0"
provider = "provisioned"
dependencies = [] # Optional additional Python packages to install
[setup.provisioned]
region = "us-west-2"
number-of-workers = 4
ssh-user = "ubuntu"
ssh-private-key = "~/.ssh/daft-key"
instance-type = "i3.2xlarge"
image-id = "ami-04dd23e62ed049936"
iam-instance-profile-name = "YourInstanceProfileName" # Optional
[run]
pre-setup-commands = []
post-setup-commands = []
[[job]]
name = "example-job"
command = "python my_script.py"
working-dir = "~/my_project"
Example BYOC mode configuration:
[setup]
name = "my-daft-cluster"
version = "0.1.0"
provider = "byoc"
dependencies = [] # Optional additional Python packages to install
[setup.byoc]
namespace = "default" # Optional, defaults to "default"
[[job]]
name = "example-job"
command = "python my_script.py"
working-dir = "~/my_project"