The PIN Client Daemon connects your local LLM inference server to the AiAssist P2P Inference Network.
- Register as an operator at https://aiassist.net/pin/join
- Copy the config template and add your credentials
- Run the daemon:
./pin-clientd --config config.json
./pin-clientd [OPTIONS]
Options:
-c, --config <FILE> Config file path [default: config.json]
-l, --log-level <LEVEL> Log level (trace, debug, info, warn, error) [default: info]
-n, --threads <NUM> Number of concurrent inference threads [default: 1]
-h, --help Print help
-V, --version Print version

Use -n to enable parallel request processing:
# Process up to 4 requests concurrently
./pin-clientd -c config.json -n 4

Recommended values based on hardware:
- CPU-only: 1-2 threads
- Single GPU: 2-4 threads
- Multi-GPU: threads per GPU × number of GPUs (see the example after this list)
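For example, on a hypothetical host with two GPUs running about three threads per GPU (illustrative numbers, not a benchmark), the arithmetic gives six concurrent threads:

# 2 GPUs × 3 threads per GPU = 6 concurrent inference threads (illustrative)
./pin-clientd -c config.json -n 6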
Important: Ollama requires additional configuration for parallel requests.
By default, Ollama processes requests sequentially. To enable parallel inference:
# Set before starting Ollama
export OLLAMA_NUM_PARALLEL=4
ollama serve

Or add to your systemd service file:
[Service]
Environment="OLLAMA_NUM_PARALLEL=4"| Backend | Parallel Support |
| Backend | Parallel Support |
|---|---|
| Ollama | Requires OLLAMA_NUM_PARALLEL env var |
| vLLM | ✅ Native (no config needed) |
| TGI | ✅ Native (no config needed) |
| LMStudio | ✅ Native (no config needed) |
Match OLLAMA_NUM_PARALLEL to your daemon -n value for optimal throughput.
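Putting the two together (values are illustrative; pick what matches your hardware):

# Shell 1: let Ollama handle 4 requests in parallel
export OLLAMA_NUM_PARALLEL=4
ollama serve

# Shell 2: run the daemon with a matching thread count
./pin-clientd -c config.json -n 4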
To build and run the daemon you need:

- Rust (1.70+): Install via rustup, then load the environment (toolchain check shown after this list):
  curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
  source ~/.cargo/env
- LLM Backend (one of):
  - Ollama - Easiest for beginners
  - vLLM - Best for production GPU inference
  - text-generation-inference - HuggingFace's solution
  - LMStudio - Desktop app with API server
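A quick sanity check that the toolchain is in place (any version at or above 1.70 is fine):

# Confirm the Rust toolchain is installed and recent enough
rustc --version
cargo --version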
Build in MSYS2:
rustup target add x86_64-pc-windows-gnu
rustup default stable-x86_64-pc-windows-gnu
cargo clean
git clone https://github.com/aiassistsecure/pin-clientd.git
cd pin-clientd
./build.sh
# or simply
cargo build --release

Or manually with cargo:
cargo build --release
cp target/release/pin-clientd .
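A quick way to confirm the build produced a working binary (uses the -V/--version flag from the options above):

# Print the daemon version to verify the binary runs
./pin-clientd --version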
Visit https://aiassist.net/pin/operator to register and get your credentials:

- clientId (starts with op_)
- apiSecret (starts with pin_sk_)
cp config.example.json config.json

Edit config.json with your credentials:
{
"clientId": "op_your_id_here",
"apiSecret": "pin_sk_your_secret_here",
"nodes": [
{
"alias": "My-GPU",
"inferenceUri": "http://localhost:11434",
"apiMode": "ollama",
"region": "us-east",
"capacity": 5,
"pricePerThousandTokens": 0.001
}
]
}

# For Ollama
ollama serve
# Verify it's running
curl http://localhost:11434/api/tags

Then run the daemon:

./pin-clientd -c config.json

With multi-threading:
./pin-clientd -c config.json -n 4

Create /etc/systemd/system/pin-clientd.service:
[Unit]
Description=PIN Client Daemon
After=network.target ollama.service
[Service]
Type=simple
User=your-username
WorkingDirectory=/home/your-username/pin-clientd
ExecStart=/home/your-username/pin-clientd/pin-clientd -c config.json -n 4
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target

Enable and start:
sudo systemctl daemon-reload
sudo systemctl enable pin-clientd
sudo systemctl start pin-clientd
sudo journalctl -u pin-clientd -f  # View logs

A full config.json reference:

{
"clientId": "op_your_operator_id",
"apiSecret": "your_api_secret",
"nodes": [
{
"alias": "GPU-1",
"inferenceUri": "http://localhost:11434",
"apiMode": "ollama",
"region": "us-east",
"capacity": 10,
"pricePerThousandTokens": 0.001
}
]
}
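Before starting the daemon it can help to confirm the file is valid JSON; one option, assuming Python 3 is available, is:

# Parse the config; any syntax error is reported with its location
python3 -m json.tool config.json

The top-level fields are: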
| Field | Required | Description |
|---|---|---|
| `clientId` | Yes | Your operator ID (starts with op_) |
| `apiSecret` | Yes | Your API secret from registration |
| `nodes` | Yes | Array of node configurations (at least one required) |
Each entry in `nodes` accepts:

| Field | Required | Description |
|---|---|---|
| `alias` | Yes | Friendly name for this node |
| `inferenceUri` | Yes | LLM server URL (e.g., http://localhost:11434) |
| `apiMode` | Yes | API format: `ollama` or `openai` |
| `region` | Yes | Geographic region (see table below) |
| `capacity` | Yes | Max concurrent requests |
| `pricePerThousandTokens` | No | Your price per 1K tokens in USD (default: $0.001) |
Use "apiMode": "ollama" for standard Ollama installations.
- Model discovery: GET /api/tags
- Chat endpoint: POST /api/chat
- Default port: 11434
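You can exercise both endpoints by hand before registering the node (the model name llama3 is a placeholder; use one you have pulled locally):

# List locally available models
curl http://localhost:11434/api/tags
# Send a single non-streaming chat request
curl http://localhost:11434/api/chat -d '{"model": "llama3", "messages": [{"role": "user", "content": "Say hello"}], "stream": false}'

An example node entry for this mode: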
{
"alias": "ollama-node",
"inferenceUri": "http://localhost:11434",
"apiMode": "ollama",
"region": "us-east",
"capacity": 5,
"pricePerThousandTokens": 0.001
}Use "apiMode": "openai" for OpenAI-compatible APIs:
- vLLM
- text-generation-inference (TGI)
- LMStudio
- LocalAI
- Any OpenAI-compatible server

Endpoints used:

- Model discovery: GET /v1/models
- Chat endpoint: POST /v1/chat/completions
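The same manual check works against the OpenAI-style endpoints (port 8000 matches the vLLM example below; the model name is a placeholder for whatever /v1/models reports):

# List the models the server exposes
curl http://localhost:8000/v1/models
# Send a single chat completion request
curl http://localhost:8000/v1/chat/completions -H "Content-Type: application/json" -d '{"model": "your-model-name", "messages": [{"role": "user", "content": "Say hello"}]}'

An example node entry for this mode: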
{
"alias": "vllm-node",
"inferenceUri": "http://localhost:8000",
"apiMode": "openai",
"region": "us-west",
"capacity": 20,
"pricePerThousandTokens": 0.002
}

Choose the region closest to your server's physical location.
| Region ID | Name | Example Use Case |
|---|---|---|
| `us-east` | US East | AWS us-east-1, NYC, Virginia |
| `us-west` | US West | AWS us-west-2, California, Oregon |
| `eu-west` | EU West | AWS eu-west-1, Ireland, UK |
| `eu-central` | EU Central | AWS eu-central-1, Frankfurt, Amsterdam |
| `asia-pacific` | Asia Pacific | AWS ap-northeast-1, Tokyo, Singapore |
| `global` | Global (Any) | Multi-region or unknown location |
Register multiple nodes with different backends:
{
"clientId": "op_abc123",
"apiSecret": "secret_xyz",
"nodes": [
{
"alias": "ollama-node",
"inferenceUri": "http://localhost:11434",
"apiMode": "ollama",
"region": "us-east",
"capacity": 10,
"pricePerThousandTokens": 0.001
},
{
"alias": "vllm-node",
"inferenceUri": "http://localhost:8000",
"apiMode": "openai",
"region": "us-east",
"capacity": 20,
"pricePerThousandTokens": 0.002
},
{
"alias": "lmstudio-node",
"inferenceUri": "http://192.168.1.100:1234",
"apiMode": "openai",
"region": "us-west",
"capacity": 5,
"pricePerThousandTokens": 0.0005
}
]
}

# Basic usage
./pin-clientd --config config.json
# With debug logging
RUST_LOG=debug ./pin-clientd --config config.json
# Specify log level
./pin-clientd --config config.json --log-level info

When your daemon connects, the server sends interview prompts to verify LLM quality. The daemon automatically:
- Receives test prompts from the server
- Runs them against your local LLM
- Reports timing metrics (TTFT, tokens/sec)
- Gets assigned a quality tier
Quality tiers affect routing priority:
- `verified` - Highest priority (>90% accuracy, >20 tok/s)
- `standard` - Normal priority (>70% accuracy, >10 tok/s)
- `slow` - Budget tier (>70% accuracy, <10 tok/s)
- `failed` - Blocked from production (<70% accuracy)
sudo mkdir -p /opt/pin-clientd
sudo cp target/release/pin-clientd /opt/pin-clientd/
sudo cp config.json /opt/pin-clientd/
sudo cp pin-clientd.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable pin-clientd
sudo systemctl start pin-clientd

journalctl -u pin-clientd -f
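To confirm the service came up cleanly (standard systemd commands, nothing specific to the daemon):

# Should report "active (running)" once the daemon has connected
systemctl status pin-clientd

The repository also ships a few config templates to start from: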
| File | Description |
|---|---|
| `config.example.json` | Basic single-node template |
| `config.ollama.example.json` | Ollama-specific example |
| `config.openai.example.json` | OpenAI-compatible API example |
| `config.multi-node.example.json` | Multi-node with mixed backends |
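Any of these can serve as the starting point for a real config; for example (file names from the table above):

# Start from the multi-node template, then fill in your credentials
cp config.multi-node.example.json config.json
./pin-clientd -c config.json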
MIT License - AiAssist Secure