16 changes: 10 additions & 6 deletions .claude/skills/devnet-runner/SKILL.md
@@ -142,7 +142,7 @@ validators:
enrFields:
ip: "127.0.0.1" # Node IP (127.0.0.1 for local, real IP for ansible)
quic: 9001 # QUIC/UDP port for P2P communication
metricsPort: 8081 # Prometheus metrics endpoint port
metricsPort: 8081 # HTTP port exposed by the node (see note below)
count: 1 # Number of validator indices assigned to this node
```

@@ -158,7 +158,7 @@ validators:
| `privkey` | Yes | 32-byte hex string (64 chars). Used for P2P identity and ENR generation |
| `enrFields.ip` | Yes | IP address. Use `127.0.0.1` for local, real IPs for ansible |
| `enrFields.quic` | Yes | QUIC port. Must be unique per node in local mode |
| `metricsPort` | Yes | Prometheus metrics port. Must be unique per node in local mode |
| `metricsPort` | Yes | HTTP port exposed by the node. Must be unique per node in local mode. For ethlambda, this maps to `--metrics-port`; the API server uses a separate `--api-port` (default 5052) |
| `count` | Yes | Number of validator indices. Sum of all counts = total validators |

### Adding a New Validator Node
@@ -175,7 +175,8 @@ validators:

3. **Assign unique ports** (for local mode):
- QUIC: 9001, 9002, 9003... (increment for each node)
- Metrics: 8081, 8082, 8083... (increment for each node)
- Metrics/API: 8081, 8082, 8083... (increment for each node)
- **ethlambda note:** ethlambda uses separate API and metrics ports. The `metricsPort` in the config maps to `--metrics-port`. The API server binds to `--api-port` (default 5052), which must also be unique per node when running multiple ethlambda nodes.

4. **Add the entry to `lean-quickstart/local-devnet/genesis/validator-config.yaml`:**
```yaml
@@ -220,6 +221,8 @@ When running multiple nodes locally, each needs unique ports:
| grandine_0 | 9006 | 8086 |
| ethlambda_0 | 9007 | 8087 |

**ethlambda dual-port note:** ethlambda runs separate API and metrics HTTP servers. The `metricsPort` from `validator-config.yaml` maps to `--metrics-port`. The API server (`--api-port`, default 5052) must also be configured with a unique port if running multiple ethlambda nodes. Update `ethlambda-cmd.sh` in `lean-quickstart` to pass both `--api-port` and `--metrics-port` flags.

For **ansible mode**, all nodes can use the same ports (9001, 8081) since they run on different machines.
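
A minimal Rust sketch of the local port plan described above, assuming each node's ports are derived by adding its 0-based index to a base port. The `NodePorts` type, the `ports_for` helper, and the choice to offset the API port from ethlambda's default of 5052 are illustrative assumptions, not part of lean-quickstart or ethlambda.

```rust
/// Ports assigned to one local node.
/// Hypothetical type for illustration; not defined by lean-quickstart.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct NodePorts {
    quic: u16,
    metrics: u16,
    api: u16,
}

/// Derive the ports for the node at `index` (0-based) by offsetting each base.
/// Returns `None` if any port would overflow `u16`.
fn ports_for(index: u16) -> Option<NodePorts> {
    const QUIC_BASE: u16 = 9001; // first QUIC port used in the docs
    const METRICS_BASE: u16 = 8081; // first metrics port used in the docs
    const API_BASE: u16 = 5052; // assumption: offset from ethlambda's default

    Some(NodePorts {
        quic: QUIC_BASE.checked_add(index)?,
        metrics: METRICS_BASE.checked_add(index)?,
        api: API_BASE.checked_add(index)?,
    })
}
```

Under this scheme, `ports_for(6)` yields QUIC 9007 and metrics 8087, matching the `ethlambda_0` row in the table, plus an API port of 5058. For large indices the three ranges would eventually overlap, so a real allocator would also need a uniqueness check.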

### Local vs Ansible Deployment
@@ -345,6 +348,7 @@ Check if ports are in use:
```bash
lsof -i :9001 # Check QUIC port
lsof -i :8081 # Check metrics port
lsof -i :5052 # Check ethlambda API port (if using default)
```

Update ports in `lean-quickstart/local-devnet/genesis/validator-config.yaml` if needed.
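
The same check can be sketched in Rust using only the standard library, by attempting to bind the port. The helper names `tcp_port_free` and `udp_port_free` are hypothetical, and the sketch only probes the loopback address, so treat it as a rough pre-flight check rather than a replacement for `lsof`.

```rust
use std::net::{Ipv4Addr, SocketAddrV4, TcpListener, UdpSocket};

/// True if a TCP listener can currently bind 127.0.0.1:`port`.
/// The metrics and API servers are HTTP, so they use TCP.
fn tcp_port_free(port: u16) -> bool {
    // The listener is dropped immediately, releasing the port again.
    TcpListener::bind(SocketAddrV4::new(Ipv4Addr::LOCALHOST, port)).is_ok()
}

/// True if a UDP socket can currently bind 127.0.0.1:`port`.
/// QUIC runs over UDP, so the QUIC port needs a UDP check.
fn udp_port_free(port: u16) -> bool {
    UdpSocket::bind(SocketAddrV4::new(Ipv4Addr::LOCALHOST, port)).is_ok()
}
```

A result of `true` is only a snapshot: another process can take the port between the check and the node starting, so a bind failure at startup is still possible.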
@@ -397,7 +401,7 @@ To restart a single node mid-devnet (e.g., to test a new image or checkpoint syn
**Important:** Restart nodes one at a time, waiting for each to fully sync before restarting the next. If 1/3 or more validators are offline simultaneously, finalization stalls because 3SF-mini requires 2/3+ votes to justify checkpoints.

1. Choose a node to restart. If restarting the aggregator, finalization and attestation inclusion in blocks will stop until it catches back up to head.
2. Identify a healthy node's metrics port to use as checkpoint source
2. Identify a healthy node's API port to use as checkpoint source (ethlambda serves `/lean/v0/states/finalized` on `--api-port`, default 5052)
3. Update the Docker image tag in `client-cmds/<client>-cmd.sh` if needed
4. **Pull the new image before restarting** to minimize node downtime:
```bash
@@ -407,10 +411,10 @@ To restart a single node mid-devnet (e.g., to test a new image or checkpoint syn
```bash
cd lean-quickstart && NETWORK_DIR=local-devnet ./spin-node.sh \
--restart-client <node_name> \
--checkpoint-sync-url http://127.0.0.1:<source_metrics_port>/lean/v0/states/finalized
--checkpoint-sync-url http://127.0.0.1:<source_api_port>/lean/v0/states/finalized
```

**Important:** RPC and metrics share the same port (`--metrics-port`). There is no separate RPC port.
**Important:** ethlambda serves the API (including `/lean/v0/states/finalized`) on `--api-port` (default 5052) and Prometheus metrics on `--metrics-port` (default 5054). Use the API port for checkpoint sync URLs.

See `references/checkpoint-sync.md` for the full procedure, verification steps, and troubleshooting.

16 changes: 8 additions & 8 deletions .claude/skills/devnet-runner/references/checkpoint-sync.md
@@ -11,15 +11,15 @@ Restarting a node with checkpoint sync instead of replaying from genesis. Useful
## Prerequisites

- A running devnet with at least one healthy node to serve the checkpoint state
- The checkpoint source node's RPC must be reachable (same port as `--metrics-port`)
- The checkpoint source node's API must be reachable (`--api-port`, default 5052)

## Key Concepts

**RPC and metrics share the same port.** ethlambda serves both Prometheus metrics (`/metrics`) and the Lean API (`/lean/v0/...`) on the `--metrics-port`. There is no separate RPC port.
**ethlambda runs separate API and metrics servers.** The API (`/lean/v0/...`, including health and states) is served on `--api-port` (default 5052). Prometheus metrics (`/metrics`) and pprof are served on `--metrics-port` (default 5054). Both share the bind address `--http-address` (default `127.0.0.1`).

**Checkpoint sync URL format:**
**Checkpoint sync URL format (uses the API port):**
```
http://<host>:<metrics-port>/lean/v0/states/finalized
http://<host>:<api-port>/lean/v0/states/finalized
```

**The node must have the same genesis config.** Checkpoint sync verifies the downloaded state against the local genesis config (genesis time, validator pubkeys, validator count). The `--custom-network-config-dir` must point to the same genesis used by the rest of the devnet.
@@ -42,16 +42,16 @@ validators:

### Step 2: Identify a checkpoint source

Pick any other running node's metrics port as the checkpoint source. The port is configured as `metricsPort` in `validator-config.yaml`.
Pick any other running node's API port as the checkpoint source. For ethlambda, the API is served on `--api-port` (default 5052). For other clients, the API may share the `metricsPort` from `validator-config.yaml`.

For local devnets (host networking), the URL is:
```
http://127.0.0.1:<metrics-port>/lean/v0/states/finalized
http://127.0.0.1:<api-port>/lean/v0/states/finalized
```

Verify the endpoint is reachable:
```bash
curl -s http://127.0.0.1:<metrics-port>/lean/v0/health
curl -s http://127.0.0.1:<api-port>/lean/v0/health
# Should return: {"status":"healthy","service":"lean-spec-api"}
```

@@ -77,7 +77,7 @@ docker pull <image>:<new_tag>
```bash
cd lean-quickstart && NETWORK_DIR=local-devnet ./spin-node.sh \
--restart-client <node_name> \
--checkpoint-sync-url http://127.0.0.1:<source_metrics_port>/lean/v0/states/finalized
--checkpoint-sync-url http://127.0.0.1:<source_api_port>/lean/v0/states/finalized
```

This automatically:
9 changes: 8 additions & 1 deletion .claude/skills/devnet-runner/references/clients.md
@@ -40,6 +40,8 @@ Ports are configured per-node in `validator-config.yaml`. Typical port assignmen

**Note:** Adjust ports to avoid conflicts when running multiple nodes.

**ethlambda dual-port note:** ethlambda runs separate API (`--api-port`, default 5052) and metrics (`--metrics-port`, default 5054) HTTP servers. Both share a bind address (`--http-address`, default `127.0.0.1`). The `metricsPort` from `validator-config.yaml` maps to `--metrics-port`. The API port must be configured separately in `ethlambda-cmd.sh`.

## Client-Specific Configuration Notes

### zeam
@@ -83,6 +85,10 @@ Ports are configured per-node in `validator-config.yaml`. Typical port assignmen
- Image: `ghcr.io/lambdaclass/ethlambda:local`
- Rust implementation by LambdaClass
- Command file: `client-cmds/ethlambda-cmd.sh`
- **Dual HTTP servers:** Runs separate API and metrics servers on independent ports
- `--http-address` (default `127.0.0.1`): shared bind address
- `--api-port` (default `5052`): API server (health, states, checkpoints, fork choice)
- `--metrics-port` (default `5054`): metrics server (Prometheus, pprof)

## Changing Docker Images

@@ -112,6 +118,7 @@ To use a different image or tag:

| Issue | Image Tags Affected | Description |
|-------|---------------------|-------------|
| Separate API and metrics ports | PR #210+ | ethlambda now uses `--http-address`, `--api-port`, and `--metrics-port` instead of the old single `--metrics-address`/`--metrics-port`. `ethlambda-cmd.sh` in lean-quickstart must pass both `--api-port` and `--metrics-port` |
| Manifest unknown warning | local | Docker shows "manifest unknown" but falls back to local image - can be ignored |
| NoPeersSubscribedToTopic | all | Expected warning when no peers are connected to gossipsub topics |

@@ -125,5 +132,5 @@ These are set by `spin-node.sh` and available in client command scripts:
| `$configDir` | Genesis config directory path |
| `$dataDir` | Data directory path |
| `$quicPort` | QUIC port from config |
| `$metricsPort` | Metrics port from config |
| `$metricsPort` | Metrics port from config. For ethlambda, maps to `--metrics-port`; API server needs separate `--api-port` |
| `$privkey` | P2P private key |
3 changes: 1 addition & 2 deletions Dockerfile
@@ -59,10 +59,9 @@ COPY --from=builder /app/ethlambda /usr/local/bin
# Copy licenses
COPY LICENSE ./

# Lighthouse-compatible default ports:
# 9000/tcp, 9000/udp - P2P networking
# 9001/udp - QUIC connections
# 5052 - HTTP API
# 5052 - API RPC
# 5054 - Prometheus metrics
EXPOSE 9000/tcp 9000/udp 9001/udp 5052 5054
ENTRYPOINT ["/usr/local/bin/ethlambda"]
20 changes: 15 additions & 5 deletions bin/ethlambda/src/main.rs
@@ -49,7 +49,9 @@ struct CliOptions {
#[arg(long, default_value = "9000")]
gossipsub_port: u16,
#[arg(long, default_value = "127.0.0.1")]
metrics_address: IpAddr,
http_address: IpAddr,
#[arg(long, default_value = "5052")]
api_port: u16,
#[arg(long, default_value = "5054")]
metrics_port: u16,
#[arg(long)]
@@ -83,7 +85,8 @@ async fn main() -> eyre::Result<()> {
ethlambda_blockchain::metrics::set_node_info("ethlambda", version::CLIENT_VERSION);
ethlambda_blockchain::metrics::set_node_start_time();

let metrics_socket = SocketAddr::new(options.metrics_address, options.metrics_port);
let api_socket = SocketAddr::new(options.http_address, options.api_port);
let metrics_socket = SocketAddr::new(options.http_address, options.metrics_port);
let node_p2p_key = read_hex_file_bytes(&options.node_key);
let p2p_socket = SocketAddr::new(IpAddr::from([0, 0, 0, 0]), options.gossipsub_port);

@@ -164,9 +167,16 @@ async fn main() -> eyre::Result<()> {
})
.inspect_err(|err| error!(%err, "Failed to send InitBlockChain — actors not wired"))?;

ethlambda_rpc::start_rpc_server(metrics_socket, store)
.await
.unwrap();
tokio::spawn(async move {
let _ = ethlambda_rpc::start_metrics_server(metrics_socket)
.await
.inspect_err(|err| error!(%err, "Metrics server failed"));
});
tokio::spawn(async move {
let _ = ethlambda_rpc::start_api_server(api_socket, store)
.await
.inspect_err(|err| error!(%err, "API server failed"));
});

info!("Node initialized");

18 changes: 12 additions & 6 deletions crates/net/rpc/src/lib.rs
@@ -11,15 +11,20 @@ mod fork_choice;
mod heap_profiling;
pub mod metrics;

pub async fn start_rpc_server(address: SocketAddr, store: Store) -> Result<(), std::io::Error> {
let metrics_router = metrics::start_prometheus_metrics_api();
pub async fn start_api_server(address: SocketAddr, store: Store) -> Result<(), std::io::Error> {
let api_router = build_api_router(store);

let listener = tokio::net::TcpListener::bind(address).await?;
axum::serve(listener, api_router).await?;

Ok(())
}

pub async fn start_metrics_server(address: SocketAddr) -> Result<(), std::io::Error> {
let metrics_router = metrics::start_prometheus_metrics_api();
let debug_router = build_debug_router();

let app = Router::new()
.merge(metrics_router)
.merge(api_router)
.merge(debug_router);
let app = Router::new().merge(metrics_router).merge(debug_router);

let listener = tokio::net::TcpListener::bind(address).await?;
axum::serve(listener, app).await?;
@@ -30,6 +35,7 @@ pub async fn start_rpc_server(address: SocketAddr, store: Store) -> Result<(), s
/// Build the API router with the given store.
fn build_api_router(store: Store) -> Router {
Router::new()
.route("/lean/v0/health", get(metrics::get_health))
.route("/lean/v0/states/finalized", get(get_latest_finalized_state))
.route(
"/lean/v0/checkpoints/justified",
4 changes: 1 addition & 3 deletions crates/net/rpc/src/metrics.rs
@@ -3,9 +3,7 @@ use ethlambda_metrics::gather_default_metrics;
use tracing::warn;

pub fn start_prometheus_metrics_api() -> Router {
Router::new()
.route("/metrics", get(get_metrics))
.route("/lean/v0/health", get(get_health))
Router::new().route("/metrics", get(get_metrics))
}

pub(crate) async fn get_health() -> impl IntoResponse {
8 changes: 4 additions & 4 deletions docs/fork_choice_visualization.md
@@ -9,7 +9,7 @@ A browser-based real-time visualization of the LMD GHOST fork choice tree, serve
| `GET /lean/v0/fork_choice/ui` | Interactive D3.js visualization page |
| `GET /lean/v0/fork_choice` | JSON snapshot of the fork choice tree |

Both endpoints are served on the metrics port (`--metrics-port`, default `5054`).
Both endpoints are served on the API port (`--api-port`, default `5052`).

## Quick Start

@@ -32,10 +32,10 @@ cargo run --release -- \
--custom-network-config-dir ./config \
--node-key ./keys/node.key \
--node-id 0 \
--metrics-port 5054
--api-port 5052
```

Then open http://localhost:5054/lean/v0/fork_choice/ui.
Then open http://localhost:5052/lean/v0/fork_choice/ui.
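
As a rough illustration of how URLs on the API port are formed, the sketch below builds one from a host, an API port, and an endpoint path. The `api_url` helper is hypothetical and not part of ethlambda; the paths used with it are the ones documented here.

```rust
use std::net::{IpAddr, SocketAddr};

/// Build a URL for `path` on a node's API server.
/// `SocketAddr`'s `Display` impl wraps IPv6 hosts in brackets,
/// so the host:port part stays valid for both address families.
fn api_url(host: IpAddr, api_port: u16, path: &str) -> String {
    format!("http://{}{}", SocketAddr::new(host, api_port), path)
}
```

For example, calling `api_url` with `127.0.0.1`, port `5052`, and `/lean/v0/fork_choice/ui` gives `http://127.0.0.1:5052/lean/v0/fork_choice/ui`. The same helper would cover `/lean/v0/fork_choice` and other API paths, but not `/metrics`, which is served on the metrics port.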

## Visualization Guide

@@ -71,7 +71,7 @@ Then open http://localhost:5054/lean/v0/fork_choice/ui.
## JSON API

```bash
curl -s http://localhost:5054/lean/v0/fork_choice | jq .
curl -s http://localhost:5052/lean/v0/fork_choice | jq .
```

Response schema:
2 changes: 1 addition & 1 deletion docs/metrics.md
@@ -1,6 +1,6 @@
# Metrics

We collect various metrics and serve them via a Prometheus-compatible HTTP endpoint at `http://<metrics_address>:<metrics_port>/metrics` (default: `http://127.0.0.1:5054/metrics`).
We collect various metrics and serve them via a Prometheus-compatible HTTP endpoint at `http://<http_address>:<metrics_port>/metrics` (default: `http://127.0.0.1:5054/metrics`).

A ready-to-use Grafana + Prometheus monitoring stack with pre-configured [leanMetrics](https://github.com/leanEthereum/leanMetrics) dashboards is available in [lean-quickstart](https://github.com/blockblaz/lean-quickstart).

11 changes: 7 additions & 4 deletions preview-config.nix
@@ -15,7 +15,8 @@
#
# Ports:
# QUIC (P2P): 9001-9004 (UDP)
# Metrics/RPC: 8081-8084 (TCP) -- serves /lean/v0/* API and /metrics
# API RPC: 8081-8084 (TCP) -- serves /lean/v0/* API endpoints
# Metrics: 8085-8088 (TCP) -- serves /metrics and /debug/pprof/*
#
# Prerequisites:
# - Podman (rootless, with dockerCompat) for genesis generation tools:
Expand Down Expand Up @@ -74,11 +75,12 @@ let
let
name = "ethlambda_${toString idx}";
gossipPort = 9001 + idx;
metricsPort = 8081 + idx;
apiPort = 8081 + idx;
metricsPort = 8085 + idx;
aggregatorArgs = lib.optionals (idx == 0) [ "--is-aggregator" ];
in
{
description = "ethlambda node ${toString idx} (gossip=${toString gossipPort}, rpc=${toString metricsPort})";
description = "ethlambda node ${toString idx} (gossip=${toString gossipPort}, api=${toString apiPort}, metrics=${toString metricsPort})";
after = [ "setup-devnet.service" ];
requires = [ "setup-devnet.service" ];
path = with pkgs; [ bash coreutils ];
@@ -92,7 +94,8 @@ let
"--gossipsub-port" (toString gossipPort)
"--node-id" name
"--node-key" "${genesisDir}/${name}.key"
"--metrics-address" "0.0.0.0"
"--http-address" "0.0.0.0"
"--api-port" (toString apiPort)
"--metrics-port" (toString metricsPort)
] ++ aggregatorArgs);
Restart = "on-failure";