60 changes: 58 additions & 2 deletions skills/harbor-cli/SKILL.md
@@ -19,6 +19,7 @@ For complete flag tables with types and defaults for every command, read `refere
| `harbor jobs summarize` | AI-powered failure summaries for a job |
| `harbor trials start` | Run a single trial (debugging) |
| `harbor trials summarize` | AI-powered summary of a single trial |
| `harbor download` | Download a task or dataset (auto-detects type) |
| `harbor datasets list` | List available datasets |
| `harbor datasets download` | Download a dataset |
| `harbor adapters init` | Scaffold a new benchmark adapter |
@@ -192,13 +193,49 @@ Flags `--registry-url` and `--registry-path` are mutually exclusive. Default: Ha
```bash
harbor datasets download [email protected]
harbor datasets download [email protected] -o ./my-tasks --overwrite
harbor datasets download org/my-dataset@latest --export # export layout
harbor datasets download org/my-dataset@latest -o ./out --cache # force cache layout
```

Two download modes (see also `harbor download`):
- **Cache mode** (default when `--output-dir` is not given, or with `--cache`): content-addressable layout under `~/.cache/harbor/tasks`.
- **Export mode** (default when `--output-dir` is given, or with `--export`): human-readable layout at `<output-dir>/<dataset-name>/<task-name>/`.
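
The mode selection rules above can be sketched as a small shell function. This only models the documented behavior and is not Harbor's implementation; `resolve_mode` and its two arguments are hypothetical names chosen for illustration.

```bash
# Hypothetical helper: models the documented mode rules, not Harbor's code.
# $1: forced mode ("export", "cache", or "" when neither flag is passed)
# $2: value of --output-dir ("" when the flag is omitted)
resolve_mode() {
  forced="$1"
  output_dir="$2"
  if [ -n "$forced" ]; then
    echo "$forced"      # --export / --cache override the auto-detection
  elif [ -n "$output_dir" ]; then
    echo "export"       # an explicit output dir defaults to export mode
  else
    echo "cache"        # no output dir: cache mode
  fi
}

resolve_mode "" ""             # no flags
resolve_mode "" "./my-tasks"   # -o ./my-tasks
resolve_mode "cache" "./out"   # -o ./out --cache
```

The sketch takes a single "forced mode" argument, so it deliberately says nothing about what happens when both `--export` and `--cache` are passed; the source does not specify that case.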

| Flag | Short | Description |
|------|-------|-------------|
| `DATASET` | | Name or `name@version` (positional) |
| `--output-dir` | `-o` | Download dir (default: `~/.cache/harbor/tasks`) |
| `--output-dir` | `-o` | Download dir. Cache mode default: `~/.cache/harbor/tasks`. Export mode default: current dir |
| `--overwrite` | | Re-download even if cached |
| `--export` | | Force export mode |
| `--cache` | | Force cache mode |
| `--registry-url` | | Legacy registry.json URL |
| `--registry-path` | | Path to legacy registry.json file |

## harbor download

Generic top-level download command. Auto-detects whether the argument is a task or dataset, then delegates to the appropriate downloader. Equivalent to calling `harbor tasks download` or `harbor datasets download` directly.

```bash
harbor download org/my-task@latest
harbor download [email protected]
harbor download org/my-dataset@latest -o ./datasets
harbor download org/my-dataset@latest --export
harbor download org/my-dataset@latest -o ./out --cache
```

**Download modes:**
- **Cache mode** (default when `--output-dir` is not given, or with `--cache`): content-addressable layout under `~/.cache/harbor/tasks`. Tasks are deduplicated across datasets.
- **Export mode** (default when `--output-dir` is given, or with `--export`): human-readable layout. Datasets produce `<output-dir>/<dataset-name>/<task-name>/`; tasks produce `<output-dir>/<task-name>/`.
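
As a rough illustration of the two export layouts, the function below builds the target path for a dataset task and for a standalone task. `export_path` and the sample names (`my-dataset`, `hello-world`) are hypothetical and exist only for this sketch; how Harbor derives `<dataset-name>` from a spec like `org/my-dataset@latest` is not covered here.

```bash
# Hypothetical helper: builds the export-mode path described above.
# $1: kind ("dataset" or "task")
# $2: output dir
# $3: task name
# $4: dataset name (datasets only)
export_path() {
  case "$1" in
    dataset) echo "$2/$4/$3" ;;   # <output-dir>/<dataset-name>/<task-name>
    task)    echo "$2/$3" ;;      # <output-dir>/<task-name>
    *)       echo "unknown kind: $1" >&2; return 1 ;;
  esac
}

export_path dataset ./out hello-world my-dataset
export_path task ./out hello-world
```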

| Flag | Short | Description |
|------|-------|-------------|
| `NAME` | | Task or dataset `org/name@ref` or `name@version` (positional) |
| `--output-dir` | `-o` | Download directory |
| `--overwrite` | | Overwrite cached items |
| `--export` | | Force export mode |
| `--cache` | | Force cache mode |
| `--registry-url` | | Legacy registry.json URL (for legacy datasets) |
| `--registry-path` | | Path to legacy registry.json file |

## harbor adapters

@@ -219,6 +256,24 @@ harbor adapters review -p adapters/my-benchmark --skip-ai -o report.md

## harbor tasks

### harbor tasks download

Download a single task from the Harbor package registry.

```bash
harbor tasks download org/my-task@latest
harbor tasks download org/my-task@3 -o ./my-tasks
harbor tasks download org/my-task@latest --export
```

| Flag | Short | Description |
|------|-------|-------------|
| `NAME` | | Task as `org/name@ref` (positional; `@ref` defaults to `@latest`) |
| `--output-dir` | `-o` | Download dir (defaults to `~/.cache/harbor/tasks` in cache mode, or current dir in export mode) |
| `--overwrite` | | Overwrite cached task |
| `--export` | | Force export mode: `<output-dir>/<task-name>/` |
| `--cache` | | Force cache mode: content-addressable layout |
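
The `@ref` default in the `NAME` row can be illustrated with plain shell parameter expansion. This is a sketch of the documented rule only; `parse_task_ref` is a hypothetical helper, and Harbor's own parsing may handle edge cases differently.

```bash
# Hypothetical helper: splits "org/name@ref" and applies the documented
# default of "latest" when no @ref is given.
parse_task_ref() {
  spec="$1"
  case "$spec" in
    *@*) name="${spec%@*}"; ref="${spec##*@}" ;;   # explicit ref after the "@"
    *)   name="$spec";      ref="latest" ;;        # no "@": default ref
  esac
  echo "$name $ref"
}

parse_task_ref org/my-task@3
parse_task_ref org/my-task
```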

### harbor tasks init

```bash
@@ -454,7 +509,8 @@ harbor view ./trials

```bash
harbor datasets list
harbor datasets download [email protected]
harbor datasets download [email protected] # cache mode (default)
harbor download [email protected] --export # export to ./terminal-bench/
harbor run -d [email protected] -a claude-code -m anthropic/claude-sonnet-4-1 -n 8
harbor view ./jobs
harbor traces export -p ./jobs/my-job --push --repo my-org/traces
44 changes: 39 additions & 5 deletions skills/harbor-cli/references/flags.md
@@ -9,8 +9,10 @@ Detailed flags for every Harbor CLI command. This file is the authoritative refe
- [harbor jobs summarize](#harbor-jobs-summarize)
- [harbor trials start](#harbor-trials-start)
- [harbor trials summarize](#harbor-trials-summarize)
- [harbor download](#harbor-download)
- [harbor datasets list](#harbor-datasets-list)
- [harbor datasets download](#harbor-datasets-download)
- [harbor tasks download](#harbor-tasks-download)
- [harbor tasks init](#harbor-tasks-init)
- [harbor tasks check](#harbor-tasks-check)
- [harbor tasks start-env](#harbor-tasks-start-env)
@@ -211,26 +213,58 @@ Shares many flags with `harbor jobs start` but operates on a single task. Key di

---

## harbor download

Top-level command that auto-detects whether the argument is a task or dataset. Same `--export` / `--cache` mode logic as `harbor datasets download` and `harbor tasks download`.

| Flag | Short | Type | Default | Description |
|------|-------|------|---------|-------------|
| `NAME` | | str | | Task (`org/name@ref`) or dataset (`name@version`) (positional arg) |
| `--output-dir` | `-o` | Path | see description | Cache: `~/.cache/harbor/tasks`. Export: current dir |
| `--overwrite` | | bool | `false` | Overwrite cached items |
| `--export` | | bool | `false` | Force export mode |
| `--cache` | | bool | `false` | Force cache mode |
| `--registry-url` | | str | | Legacy registry.json URL (for legacy datasets) |
| `--registry-path` | | Path | | Path to legacy registry.json file |

---

## harbor datasets list

| Flag | Type | Description |
|------|------|-------------|
| `--registry-url` | str | URL of remote `registry.json` |
| `--registry-path` | Path | Path to local `registry.json` |
| `--registry-url` | str | URL of remote `registry.json` (legacy) |
| `--registry-path` | Path | Path to local `registry.json` (legacy) |

`--registry-url` and `--registry-path` are mutually exclusive. Default: Harbor's public registry.
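
A wrapper script could check the mutual-exclusion rule before invoking the CLI. The sketch below is a hypothetical pre-flight check; `check_registry_flags` and its error message are assumptions, and Harbor's own validation and wording may differ.

```bash
# Hypothetical pre-flight check; Harbor's own error handling may differ.
# $1: value of --registry-url  ("" when omitted)
# $2: value of --registry-path ("" when omitted)
check_registry_flags() {
  if [ -n "$1" ] && [ -n "$2" ]; then
    echo "error: --registry-url and --registry-path are mutually exclusive" >&2
    return 1
  fi
  if [ -n "$1" ]; then
    echo "url"
  elif [ -n "$2" ]; then
    echo "path"
  else
    echo "default"   # neither flag: Harbor's public registry
  fi
}

check_registry_flags "" ""
check_registry_flags "" "./registry.json"
```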

---

## harbor datasets download

Two modes: **cache** (content-addressable, `~/.cache/harbor/tasks`) and **export** (human-readable, `<output-dir>/<dataset-name>/<task-name>/`). Export mode is the default when `--output-dir` is given; cache mode is the default otherwise. `--export` and `--cache` override the auto-detection.

| Flag | Short | Type | Default | Description |
|------|-------|------|---------|-------------|
| `DATASET` | | str | | Dataset as `name` or `name@version` (positional arg) |
| `--output-dir` | `-o` | Path | `~/.cache/harbor/tasks` | Download directory |
| `--output-dir` | `-o` | Path | see description | Cache: `~/.cache/harbor/tasks`. Export: current dir |
| `--overwrite` | | bool | `false` | Re-download even if cached |
| `--registry-url` | | str | | Remote registry URL |
| `--registry-path` | | Path | | Local registry path |
| `--export` | | bool | `false` | Force export mode |
| `--cache` | | bool | `false` | Force cache mode |
| `--registry-url` | | str | | Legacy registry.json URL |
| `--registry-path` | | Path | | Path to legacy registry.json file |

---

## harbor tasks download

| Flag | Short | Type | Default | Description |
|------|-------|------|---------|-------------|
| `NAME` | | str | | Task as `org/name@ref` (positional arg; `@ref` defaults to `@latest`) |
| `--output-dir` | `-o` | Path | see description | Cache: `~/.cache/harbor/tasks`. Export: current dir |
| `--overwrite` | | bool | `false` | Overwrite cached task |
| `--export` | | bool | `false` | Force export mode: `<output-dir>/<task-name>/` |
| `--cache` | | bool | `false` | Force cache mode: content-addressable layout |

---
