Mesh

Mesh includes zip, the distributed inference engine that powers its serving runtime.

Mesh is a distributed network for sharing model execution across machines on a local network, with a control plane coordinating device registration, ring membership, job dispatch, status, and accounting.

The core idea is simple:

workers on the same LAN contribute compute
workers join a model ring for the model they serve
jobs are dispatched through the control plane
tensors move directly between workers on the dataplane
results and credits are recorded durably by the control plane

Mesh has one production execution path. There is no mock or synthetic executor in this repo.

zip

zip is the inference engine embedded in Mesh and is being maintained as a separate open-source sibling project alongside this repo.

zip owns:

explicit serving sessions
explicit prefill and decode phases
backend abstraction for provider-specific execution
checkpoint-backed KV handoff
runtime decode queue and microbatch planning primitives
tensor-plane transport and execution-facing types

Mesh still owns the broader product shell around zip:

durable control-plane scheduling and accounting
worker CLI and process lifecycle
relay, UI, and operator workflows

Operations

Operator runbooks for the production engine live in runbooks/README.md.

How It Works

Mesh is split into two layers:

local worker mesh:
- agents run on each device
- devices discover peers, join pools, and participate in a model ring
- workers load real shard artifacts from disk
- workers exchange tensor data directly over the dataplane
control plane:
- registers devices
- stores network, ring, job, and ledger state
- assigns distributed jobs to the active ring
- exposes topology, status, and accounting APIs

For constrained networks, Mesh can also use a relay for peer connectivity, but the intended fast path is direct local-network connectivity.

Functionality

local-network compute sharing across multiple workers
explicit model-ring membership and shard ownership
distributed inference job submission and tracking
direct tensor transport between workers
durable control-plane state for jobs, topology, and ledger events
explicit execution providers:
- cpu
- metal
- cuda
pool creation and LAN peer discovery
credit accounting tied to real worker participation

CLI Surface

Mesh ships one grouped CLI:

mesh device
- initialize device identity
- start the agent
- inspect local device status
mesh resource
- lock, unlock, and inspect committed resources
mesh ring
- join a model ring
- leave a ring
- inspect ring status, topology, and shard assignment
mesh job
- submit a distributed inference job
- fetch job status
- watch a job
- inspect local runtime stats
mesh ledger
- inspect summary and event history for the current network
mesh pool
- create pools
- join pools
- list pools and peers
- inspect LAN discovery state
mesh doctor
- verify local setup and control-plane reachability
mesh ui
- launch the local UI

Execution Providers

Mesh now exposes one execution architecture with explicit provider selection underneath it:

cpu: baseline runtime for broad compatibility, including Intel Macs and CPU-only Linux machines
metal: native Apple path for Apple Silicon workers
cuda: native Linux/NVIDIA path for datacenter and workstation GPUs

Provider choice is part of node configuration and capability reporting. Nodes advertise the providers they can actually run, the control plane stores that inventory, and the agent binds the tensor backend to the selected provider at startup. There is no silent provider fallback path.

Default provider selection is simple:

prefer metal when available
otherwise prefer cuda when available
otherwise use cpu
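
The preference order above can be sketched as a small function. This is an illustrative sketch in Python, not the agent's actual code: the names PROVIDER_PREFERENCE and default_provider are hypothetical, and the error raised when no provider matches is an assumption, since the README does not describe that case.

```python
from typing import Iterable

# Preference order taken from the list above: metal, then cuda, then cpu.
PROVIDER_PREFERENCE = ("metal", "cuda", "cpu")


def default_provider(available: Iterable[str]) -> str:
    """Return the first provider in preference order that the node can run."""
    usable = set(available)
    for name in PROVIDER_PREFERENCE:
        if name in usable:
            return name
    # Assumption: the README does not say what happens when nothing matches.
    raise ValueError("node advertises no known execution provider")


# A Linux/NVIDIA box that can run both cuda and cpu resolves to cuda.
print(default_provider(["cpu", "cuda"]))
```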

To pin a node to a provider, set it in ~/.meshnet/device.toml:

[execution]
preferred_provider = "cpu"

This is useful for:

running Intel Macs as CPU workers on a LAN mesh
forcing CPU parity checks on an Apple Silicon machine
forcing a known GPU backend during bring-up and debugging

Install

git clone https://github.com/saint0x/mesh.git
cd mesh
./install.sh

This installs:

mesh
mesh-control-plane
mesh-relay

Quick Start

Start infrastructure:

mesh-relay
mesh-control-plane

Start worker 1:

mesh device init --network-id demo --name "Worker 1"
mesh ring join --model-id tinyllama-1.1b
mesh device start

Start worker 2:

export MESHNET_HOME=~/.meshnet-worker2
mesh device init --network-id demo --name "Worker 2"
mesh ring join --model-id tinyllama-1.1b
mesh device start
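
Worker 2 uses MESHNET_HOME so that a second device identity can live next to the first on one machine. The sketch below shows one plausible way such an override resolves. It assumes the default home is ~/.meshnet when the variable is unset, which matches the paths used elsewhere in this README but is not confirmed by it; mesh_home is a hypothetical helper, not part of the CLI.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def mesh_home(env: Optional[Mapping[str, str]] = None) -> Path:
    """Resolve the Mesh home directory, honoring a MESHNET_HOME override."""
    env = os.environ if env is None else env
    override = env.get("MESHNET_HOME")
    if override:
        return Path(override).expanduser()
    # Assumption: ~/.meshnet is the default when no override is set.
    return Path.home() / ".meshnet"


# Worker 1 uses the default home; worker 2 points at its own directory.
print(mesh_home({}))
print(mesh_home({"MESHNET_HOME": "/tmp/meshnet-worker2"}))
```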

Submit inference:

mesh job run --prompt "hello from mesh" --max-tokens 16 --model-id tinyllama-1.1b

Useful checks:

mesh doctor
mesh ring status
mesh ring topology
mesh ring shard
mesh pool list
mesh pool peers --pool-id <POOL_ID>
mesh ledger summary
mesh ledger events
mesh ui

mesh doctor now treats repo-local control-plane.db and mesh_control_plane.db files as ambiguous artifacts. The authoritative control-plane database path is ~/.meshnet/control-plane.db.
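
To see which files that check is concerned with, a repo checkout can be scanned for the two file names mentioned above. This is a minimal sketch of the idea in Python, not the logic mesh doctor runs; find_repo_local_dbs is a hypothetical helper.

```python
import tempfile
from pathlib import Path

# File names the README says mesh doctor treats as ambiguous when repo-local.
AMBIGUOUS_DB_NAMES = {"control-plane.db", "mesh_control_plane.db"}


def find_repo_local_dbs(repo_root: Path) -> list[Path]:
    """Return repo-local database files with an ambiguous name, sorted by path."""
    return sorted(
        path
        for path in repo_root.rglob("*.db")
        if path.is_file() and path.name in AMBIGUOUS_DB_NAMES
    )


# Demo against a throwaway directory standing in for a repo checkout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "control-plane").mkdir()
    (root / "control-plane" / "mesh_control_plane.db").touch()
    (root / "notes.db").touch()  # unrelated database, not reported
    for stray in find_repo_local_dbs(root):
        print(stray.relative_to(root))
```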

For local UI development, mesh-ui's existing dev command now boots the real local Mesh UI API first and then starts Vite, so the dashboard talks to live Mesh state instead of a frontend-only server.

Model Assets

Every worker needs real model assets under ~/.meshnet/models/<model_id>/:

model.json
tokenizer.json
shard-<worker>-of-<total>.manifest.json
shard-<worker>-of-<total>.safetensors

model.json defines the real tensor-parallel dimension and total model size. The control plane uses it for shard assignment, and the workers use the tokenizer for output decoding. The shard loader validates safetensors payloads against their manifests in artifact_loader.rs.
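
Before starting a worker, it can help to confirm that the four files listed above are present. The sketch below checks only for their presence and does not validate safetensors payloads against manifests the way the shard loader does. It assumes <worker> and <total> are rendered as plain decimal integers; the README does not say whether worker numbering starts at 0 or 1, and expected_assets and missing_assets are hypothetical helpers.

```python
import tempfile
from pathlib import Path


def expected_assets(worker: int, total: int) -> list[str]:
    """File names a worker needs under ~/.meshnet/models/<model_id>/."""
    # Assumption: indices are written as plain decimal integers.
    stem = f"shard-{worker}-of-{total}"
    return [
        "model.json",
        "tokenizer.json",
        f"{stem}.manifest.json",
        f"{stem}.safetensors",
    ]


def missing_assets(model_dir: Path, worker: int, total: int) -> list[str]:
    """Return the expected file names that are absent from model_dir."""
    return [
        name
        for name in expected_assets(worker, total)
        if not (model_dir / name).is_file()
    ]


# Demo: a model directory that holds model.json but nothing else yet.
with tempfile.TemporaryDirectory() as tmp:
    model_dir = Path(tmp)
    (model_dir / "model.json").touch()
    print(missing_assets(model_dir, worker=1, total=2))
```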

The same canonical artifacts are used across providers. Provider choice changes execution, not model semantics.

For lower-memory bring-up on laptops or CPU-only machines, start with a smaller real model:

uv venv -p 3.12 /tmp/mesh-model-py312
source /tmp/mesh-model-py312/bin/activate
uv pip install numpy safetensors huggingface_hub torch
python scripts/fetch_hf_llama_to_meshnet.py --out-dir ~/.meshnet/models --workers 2
MESHNET_REAL_ARTIFACT_MODEL_ID=smollm2-135m-instruct bash scripts/test_real_artifact_loading.sh

The default fetch target is HuggingFaceTB/SmolLM2-135M-Instruct. With --workers 2 it converts into two Mesh shards of roughly 419 MB each, as measured on the machine used to write this README. The default smoke path, bash scripts/test_real_artifact_loading.sh, also prefers smollm2-135m-instruct when it is installed and falls back to generic artifact discovery only if that lower-memory model is absent.

Core Components

agent: worker runtime and CLI for device bring-up, pool participation, ring membership, shard loading, inference execution, and dataplane transport. It embeds zip inside the larger worker process. The native Mesh boundary for the engine is zip.rs, with the current internal engine implementation living under agent/src/inference. See main.rs and zip.rs.
control-plane: durable coordinator for registration, topology, distributed job dispatch, status polling, and ledger events. See inference.rs and ring_manager.rs.
relay-server: optional connectivity layer for environments that cannot keep workers directly connected. See relay-server/README.md.

Verification

Mesh validates the system with Fozzy first, then runs the host-backed artifact gate and the Cargo test suite:

fozzy doctor --deep --scenario tests/production_dispatch.fozzy.json --runs 5 --seed 424242 --json
fozzy test --det --strict tests/production_dispatch.fozzy.json tests/live_relay_runtime.fozzy.json tests/real_artifact_loading.fozzy.json --json
fozzy run tests/production_dispatch.fozzy.json --det --record /tmp/production_dispatch_trace.fozzy --json
fozzy trace verify /tmp/production_dispatch_trace.fozzy --strict --json
fozzy replay /tmp/production_dispatch_trace.fozzy --json
fozzy ci /tmp/production_dispatch_trace.fozzy --json
bash scripts/test_real_artifact_loading.sh
cargo test --workspace

bash scripts/test_real_artifact_loading.sh is the explicit host-backed artifact residency gate. It loads a real shard set from ~/.meshnet/models and can take minutes on multi-gigabyte artifacts. cargo test --workspace does not exercise that path on its own.

For provider work, validate both the runtime and the provider contract:

fozzy doctor --deep --scenario tests/production_dispatch.fozzy.json --runs 5 --seed 424242 --json
fozzy test --det --strict tests/production_dispatch.fozzy.json tests/live_relay_runtime.fozzy.json tests/real_artifact_loading.fozzy.json --json
fozzy run tests/production_dispatch.fozzy.json --det --record /tmp/production_dispatch_trace.fozzy --json
fozzy trace verify /tmp/production_dispatch_trace.fozzy --strict --json
fozzy replay /tmp/production_dispatch_trace.fozzy --json
fozzy ci /tmp/production_dispatch_trace.fozzy --json

More Docs

License

MIT
