Mesh includes zip, the distributed inference engine that powers its serving runtime.
Mesh is a distributed network for sharing model execution across machines on a local network, with a control plane coordinating device registration, ring membership, job dispatch, status, and accounting.
The core idea is simple:
- workers on the same LAN contribute compute
- workers join a model ring for the model they serve
- jobs are dispatched through the control plane
- tensors move directly between workers on the dataplane
- results and credits are recorded durably by the control plane
Mesh has one production execution path. There is no mock or synthetic executor in this repo.
zip is the inference engine embedded in Mesh and is being maintained as a
separate open-source sibling project alongside this repo.
zip owns:
- explicit serving sessions
- explicit prefill and decode phases
- backend abstraction for provider-specific execution
- checkpoint-backed KV handoff
- runtime decode queue and microbatch planning primitives
- tensor-plane transport and execution-facing types
Mesh still owns the broader product shell around zip:
- durable control-plane scheduling and accounting
- worker CLI and process lifecycle
- relay, UI, and operator workflows
Operator runbooks for the production engine live in runbooks/README.md.
Mesh is split into two layers:
- local worker mesh:
  - agents run on each device
  - devices discover peers, join pools, and participate in a model ring
  - workers load real shard artifacts from disk
  - workers exchange tensor data directly over the dataplane
- control plane:
  - registers devices
  - stores network, ring, job, and ledger state
  - assigns distributed jobs to the active ring
  - exposes topology, status, and accounting APIs
For constrained networks, Mesh can also use a relay for peer connectivity, but the intended fast path is direct local-network connectivity.
- local-network compute sharing across multiple workers
- explicit model-ring membership and shard ownership
- distributed inference job submission and tracking
- direct tensor transport between workers
- durable control-plane state for jobs, topology, and ledger events
- explicit execution providers: cpu, metal, cuda
- pool creation and LAN peer discovery
- credit accounting tied to real worker participation
Mesh ships one grouped CLI:
- mesh device
  - initialize device identity
  - start the agent
  - inspect local device status
- mesh resource
  - lock, unlock, and inspect committed resources
- mesh ring
  - join a model ring
  - leave a ring
  - inspect ring status, topology, and shard assignment
- mesh job
  - submit a distributed inference job
  - fetch job status
  - watch a job
  - inspect local runtime stats
- mesh ledger
  - inspect summary and event history for the current network
- mesh pool
  - create pools
  - join pools
  - list pools and peers
  - inspect LAN discovery state
- mesh doctor
  - verify local setup and control-plane reachability
- mesh ui
  - launch the local UI
Mesh now exposes one execution architecture with explicit provider selection underneath it:
- cpu: baseline runtime for broad compatibility, including Intel Macs and CPU-only Linux machines
- metal: native Apple path for Apple Silicon workers
- cuda: native Linux/NVIDIA path for datacenter and workstation GPUs
Provider choice is part of node configuration and capability reporting. Nodes advertise the providers they can actually run, the control plane stores that inventory, and the agent binds the tensor backend to the selected provider at startup. There is no silent provider fallback path.
Default provider selection is simple:
- prefer metal when available
- otherwise prefer cuda when available
- otherwise use cpu
To pin a node to a provider, set it in ~/.meshnet/device.toml:
[execution]
preferred_provider = "cpu"

This is useful for:
- running Intel Macs as CPU workers on a LAN mesh
- forcing CPU parity checks on an Apple Silicon machine
- forcing a known GPU backend during bring-up and debugging
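The selection rules above can be sketched in Rust as follows. This is a minimal illustration, not the agent's actual code: the names `Provider`, `ProviderUnavailable`, and `select_provider` are hypothetical, the sketch assumes cpu is always runnable, and returning an error for an unavailable pinned provider is one assumed reading of "no silent provider fallback".

```rust
use std::fmt;

/// Execution providers named in this README.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Provider {
    Cpu,
    Metal,
    Cuda,
}

/// Hypothetical error: the pinned provider cannot run on this node.
#[derive(Debug, PartialEq, Eq)]
pub struct ProviderUnavailable(pub Provider);

impl fmt::Display for ProviderUnavailable {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "pinned provider {:?} is not available on this node", self.0)
    }
}

/// Pick a provider from the set the node can actually run.
/// `preferred` models `preferred_provider` from device.toml.
pub fn select_provider(
    available: &[Provider],
    preferred: Option<Provider>,
) -> Result<Provider, ProviderUnavailable> {
    if let Some(pinned) = preferred {
        // Assumption: a pinned provider is either used or rejected,
        // never silently swapped for another one.
        return if available.contains(&pinned) {
            Ok(pinned)
        } else {
            Err(ProviderUnavailable(pinned))
        };
    }
    // Default order from the README: metal, then cuda, then cpu.
    for candidate in [Provider::Metal, Provider::Cuda] {
        if available.contains(&candidate) {
            return Ok(candidate);
        }
    }
    // Assumption: cpu is the baseline and always runnable.
    Ok(Provider::Cpu)
}
```

With this sketch, a node reporting cpu and cuda and no pin resolves to cuda, while a node pinned to cuda that only reports cpu gets an error instead of quietly running on cpu.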
git clone https://github.com/saint0x/mesh.git
cd mesh
./install.sh

This installs:
- mesh
- mesh-control-plane
- mesh-relay
Start infrastructure:
mesh-relay
mesh-control-plane

Start worker 1:
mesh device init --network-id demo --name "Worker 1"
mesh ring join --model-id tinyllama-1.1b
mesh device start

Start worker 2:
export MESHNET_HOME=~/.meshnet-worker2
mesh device init --network-id demo --name "Worker 2"
mesh ring join --model-id tinyllama-1.1b
mesh device start

Submit inference:
mesh job run --prompt "hello from mesh" --max-tokens 16 --model-id tinyllama-1.1b

Useful checks:
mesh doctor
mesh ring status
mesh ring topology
mesh ring shard
mesh pool list
mesh pool peers --pool-id <POOL_ID>
mesh ledger summary
mesh ledger events
mesh ui

mesh doctor now treats repo-local control-plane.db and mesh_control_plane.db files as ambiguous artifacts. The authoritative control-plane database path is ~/.meshnet/control-plane.db.
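A check of this kind might look like the sketch below. It is illustrative only and not the actual mesh doctor implementation: the function names are hypothetical, and it assumes the check looks only at the top level of the repository directory.

```rust
use std::path::{Path, PathBuf};

/// File names the README says are treated as ambiguous when found in the repo.
const AMBIGUOUS_DB_NAMES: [&str; 2] = ["control-plane.db", "mesh_control_plane.db"];

/// Authoritative database path, relative to the user's home directory.
pub fn authoritative_db_path(home: &Path) -> PathBuf {
    home.join(".meshnet").join("control-plane.db")
}

/// Return every ambiguous database file found directly under `repo_root`.
/// Assumption: only the top level of the repo is inspected.
pub fn ambiguous_db_artifacts(repo_root: &Path) -> Vec<PathBuf> {
    AMBIGUOUS_DB_NAMES
        .iter()
        .map(|name| repo_root.join(name))
        .filter(|path| path.is_file())
        .collect()
}
```

A caller could warn when `ambiguous_db_artifacts` returns a non-empty list and point the operator at the path from `authoritative_db_path`.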
For local UI development, mesh-ui's existing dev command now boots the real local Mesh UI API first and then starts Vite, so the dashboard talks to live Mesh state instead of a frontend-only server.
Every worker needs real model assets under ~/.meshnet/models/<model_id>/:
- model.json
- tokenizer.json
- shard-<worker>-of-<total>.manifest.json
- shard-<worker>-of-<total>.safetensors
model.json defines the real tensor-parallel dimension and total model size. The control plane uses it for shard assignment, and the workers use the tokenizer for output decoding. The shard loader validates safetensors payloads against their manifests in artifact_loader.rs.
The same canonical artifacts are used across providers. Provider choice changes execution, not model semantics.
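The layout above can be checked before a worker starts. The sketch below builds the expected file paths for one worker and reports which are missing. It is not the loader in artifact_loader.rs: the function names are hypothetical, and it assumes `<worker>` and `<total>` are rendered as plain decimal integers, without assuming whether worker numbering starts at 0 or 1.

```rust
use std::path::{Path, PathBuf};

/// Build the four artifact paths a worker needs for `model_id`.
/// Assumption: <worker> and <total> are plain decimal integers.
pub fn expected_artifacts(
    models_root: &Path,
    model_id: &str,
    worker: usize,
    total: usize,
) -> Vec<PathBuf> {
    let dir = models_root.join(model_id);
    let shard = format!("shard-{}-of-{}", worker, total);
    vec![
        dir.join("model.json"),
        dir.join("tokenizer.json"),
        dir.join(format!("{}.manifest.json", shard)),
        dir.join(format!("{}.safetensors", shard)),
    ]
}

/// Return the expected artifacts that do not exist on disk.
pub fn missing_artifacts(
    models_root: &Path,
    model_id: &str,
    worker: usize,
    total: usize,
) -> Vec<PathBuf> {
    expected_artifacts(models_root, model_id, worker, total)
        .into_iter()
        .filter(|path| !path.exists())
        .collect()
}
```

An empty result from `missing_artifacts` only means the files are present. It does not validate the safetensors payload against its manifest, which is the shard loader's job.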
For lower-memory bring-up on laptops or CPU-only machines, start with a smaller real model:
uv venv -p 3.12 /tmp/mesh-model-py312
source /tmp/mesh-model-py312/bin/activate
uv pip install numpy safetensors huggingface_hub torch
python scripts/fetch_hf_llama_to_meshnet.py --out-dir ~/.meshnet/models --workers 2
MESHNET_REAL_ARTIFACT_MODEL_ID=smollm2-135m-instruct bash scripts/test_real_artifact_loading.sh

The default fetch target is HuggingFaceTB/SmolLM2-135M-Instruct, which converts into two Mesh shards of roughly 419 MB each on this machine.
The default bash scripts/test_real_artifact_loading.sh smoke path also prefers smollm2-135m-instruct when it is installed, and only falls back to generic artifact discovery if that lower-memory model is absent.
- agent: worker runtime and CLI for device bring-up, pool participation, ring membership, shard loading, inference execution, and dataplane transport. It embeds zip inside the larger worker process. The native Mesh boundary for the engine is zip.rs, with the current internal engine implementation living under agent/src/inference. See main.rs and zip.rs.
- control-plane: durable coordinator for registration, topology, distributed job dispatch, status polling, and ledger events. See inference.rs and ring_manager.rs.
- relay-server: optional connectivity layer for environments that cannot keep workers directly connected. See relay-server/README.md.
Mesh uses Fozzy first for system validation.
fozzy doctor --deep --scenario tests/production_dispatch.fozzy.json --runs 5 --seed 424242 --json
fozzy test --det --strict tests/production_dispatch.fozzy.json tests/live_relay_runtime.fozzy.json tests/real_artifact_loading.fozzy.json --json
fozzy run tests/production_dispatch.fozzy.json --det --record /tmp/production_dispatch_trace.fozzy --json
fozzy trace verify /tmp/production_dispatch_trace.fozzy --strict --json
fozzy replay /tmp/production_dispatch_trace.fozzy --json
fozzy ci /tmp/production_dispatch_trace.fozzy --json
bash scripts/test_real_artifact_loading.sh
cargo test --workspace

bash scripts/test_real_artifact_loading.sh is the explicit host-backed artifact residency gate. It loads a real shard set from ~/.meshnet/models and can take minutes on multi-gigabyte artifacts; cargo test --workspace does not turn that path on by itself.
For provider work, validate both the runtime and the provider contract:
fozzy doctor --deep --scenario tests/production_dispatch.fozzy.json --runs 5 --seed 424242 --json
fozzy test --det --strict tests/production_dispatch.fozzy.json tests/live_relay_runtime.fozzy.json tests/real_artifact_loading.fozzy.json --json
fozzy run tests/production_dispatch.fozzy.json --det --record /tmp/production_dispatch_trace.fozzy --json
fozzy trace verify /tmp/production_dispatch_trace.fozzy --strict --json
fozzy replay /tmp/production_dispatch_trace.fozzy --json
fozzy ci /tmp/production_dispatch_trace.fozzy --json

License: MIT