diff --git a/.claude/skills/devnet-runner/SKILL.md b/.claude/skills/devnet-runner/SKILL.md index f6d7517..3bc7ebf 100644 --- a/.claude/skills/devnet-runner/SKILL.md +++ b/.claude/skills/devnet-runner/SKILL.md @@ -142,7 +142,7 @@ validators: enrFields: ip: "127.0.0.1" # Node IP (127.0.0.1 for local, real IP for ansible) quic: 9001 # QUIC/UDP port for P2P communication - metricsPort: 8081 # Prometheus metrics endpoint port + metricsPort: 8081 # HTTP port exposed by the node (see note below) count: 1 # Number of validator indices assigned to this node ``` @@ -158,7 +158,7 @@ validators: | `privkey` | Yes | 32-byte hex string (64 chars). Used for P2P identity and ENR generation | | `enrFields.ip` | Yes | IP address. Use `127.0.0.1` for local, real IPs for ansible | | `enrFields.quic` | Yes | QUIC port. Must be unique per node in local mode | -| `metricsPort` | Yes | Prometheus metrics port. Must be unique per node in local mode | +| `metricsPort` | Yes | HTTP port exposed by the node. Must be unique per node in local mode. For ethlambda, this maps to `--metrics-port`; the API server uses a separate `--api-port` (default 5052) | | `count` | Yes | Number of validator indices. Sum of all counts = total validators | ### Adding a New Validator Node @@ -175,7 +175,8 @@ validators: 3. **Assign unique ports** (for local mode): - QUIC: 9001, 9002, 9003... (increment for each node) - - Metrics: 8081, 8082, 8083... (increment for each node) + - Metrics/API: 8081, 8082, 8083... (increment for each node) + - **ethlambda note:** ethlambda uses separate API and metrics ports. The `metricsPort` in the config maps to `--metrics-port`. The API server binds to `--api-port` (default 5052) which must also be unique if running multiple ethlambda nodes. 4. **Add the entry to `lean-quickstart/local-devnet/genesis/validator-config.yaml`:** ```yaml @@ -220,6 +221,8 @@ When running multiple nodes locally, each needs unique ports: | grandine_0 | 9006 | 8086 | | ethlambda_0 | 9007 | 8087 | +**ethlambda dual-port note:** ethlambda runs separate API and metrics HTTP servers. The `metricsPort` from `validator-config.yaml` maps to `--metrics-port`. The API server (`--api-port`, default 5052) must also be configured with a unique port if running multiple ethlambda nodes. Update `ethlambda-cmd.sh` in `lean-quickstart` to pass both `--api-port` and `--metrics-port` flags. + For **ansible mode**, all nodes can use the same ports (9001, 8081) since they run on different machines. ### Local vs Ansible Deployment @@ -345,6 +348,7 @@ Check if ports are in use: ```bash lsof -i :9001 # Check QUIC port lsof -i :8081 # Check metrics port +lsof -i :5052 # Check ethlambda API port (if using default) ``` Update ports in `lean-quickstart/local-devnet/genesis/validator-config.yaml` if needed. @@ -397,7 +401,7 @@ To restart a single node mid-devnet (e.g., to test a new image or checkpoint syn **Important:** Restart nodes one at a time, waiting for each to fully sync before restarting the next. If 1/3 or more validators are offline simultaneously, finalization stalls because 3SF-mini requires 2/3+ votes to justify checkpoints. 1. Choose a node to restart. If restarting the aggregator, finalization and attestation inclusion in blocks will stop until it catches back up to head. -2. Identify a healthy node's metrics port to use as checkpoint source +2. Identify a healthy node's API port to use as checkpoint source (ethlambda serves `/lean/v0/states/finalized` on `--api-port`, default 5052) 3. Update the Docker image tag in `client-cmds/-cmd.sh` if needed 4. **Pull the new image before restarting** to minimize node downtime: ```bash @@ -407,10 +411,10 @@ To restart a single node mid-devnet (e.g., to test a new image or checkpoint syn ```bash cd lean-quickstart && NETWORK_DIR=local-devnet ./spin-node.sh \ --restart-client \ - --checkpoint-sync-url http://127.0.0.1:/lean/v0/states/finalized + --checkpoint-sync-url http://127.0.0.1:/lean/v0/states/finalized ``` -**Important:** RPC and metrics share the same port (`--metrics-port`). There is no separate RPC port. +**Important:** ethlambda serves the API (including `/lean/v0/states/finalized`) on `--api-port` (default 5052) and Prometheus metrics on `--metrics-port` (default 5054). Use the API port for checkpoint sync URLs. See `references/checkpoint-sync.md` for the full procedure, verification steps, and troubleshooting. diff --git a/.claude/skills/devnet-runner/references/checkpoint-sync.md b/.claude/skills/devnet-runner/references/checkpoint-sync.md index f743df2..b3b640f 100644 --- a/.claude/skills/devnet-runner/references/checkpoint-sync.md +++ b/.claude/skills/devnet-runner/references/checkpoint-sync.md @@ -11,15 +11,15 @@ Restarting a node with checkpoint sync instead of replaying from genesis. Useful ## Prerequisites - A running devnet with at least one healthy node to serve the checkpoint state -- The checkpoint source node's RPC must be reachable (same port as `--metrics-port`) +- The checkpoint source node's API must be reachable (`--api-port`, default 5052) ## Key Concepts -**RPC and metrics share the same port.** ethlambda serves both Prometheus metrics (`/metrics`) and the Lean API (`/lean/v0/...`) on the `--metrics-port`. There is no separate RPC port. +**ethlambda runs separate API and metrics servers.** The API (`/lean/v0/...`, including health and states) is served on `--api-port` (default 5052). Prometheus metrics (`/metrics`) and pprof are served on `--metrics-port` (default 5054). Both share the bind address `--http-address` (default `127.0.0.1`). -**Checkpoint sync URL format:** +**Checkpoint sync URL format (uses the API port):** ``` -http://:/lean/v0/states/finalized +http://:/lean/v0/states/finalized ``` **The node must have the same genesis config.** Checkpoint sync verifies the downloaded state against the local genesis config (genesis time, validator pubkeys, validator count). The `--custom-network-config-dir` must point to the same genesis used by the rest of the devnet. @@ -42,16 +42,16 @@ validators: ### Step 2: Identify a checkpoint source -Pick any other running node's metrics port as the checkpoint source. The port is configured as `metricsPort` in `validator-config.yaml`. +Pick any other running node's API port as the checkpoint source. For ethlambda, the API is served on `--api-port` (default 5052). For other clients, the API may share the `metricsPort` from `validator-config.yaml`. For local devnets (host networking), the URL is: ``` -http://127.0.0.1:/lean/v0/states/finalized +http://127.0.0.1:/lean/v0/states/finalized ``` Verify the endpoint is reachable: ```bash -curl -s http://127.0.0.1:/lean/v0/health +curl -s http://127.0.0.1:/lean/v0/health # Should return: {"status":"healthy","service":"lean-spec-api"} ``` @@ -77,7 +77,7 @@ docker pull : ```bash cd lean-quickstart && NETWORK_DIR=local-devnet ./spin-node.sh \ --restart-client \ - --checkpoint-sync-url http://127.0.0.1:/lean/v0/states/finalized + --checkpoint-sync-url http://127.0.0.1:/lean/v0/states/finalized ``` This automatically: diff --git a/.claude/skills/devnet-runner/references/clients.md b/.claude/skills/devnet-runner/references/clients.md index bce5f3a..54dee03 100644 --- a/.claude/skills/devnet-runner/references/clients.md +++ b/.claude/skills/devnet-runner/references/clients.md @@ -40,6 +40,8 @@ Ports are configured per-node in `validator-config.yaml`. Typical port assignmen **Note:** Adjust ports to avoid conflicts when running multiple nodes. +**ethlambda dual-port note:** ethlambda runs separate API (`--api-port`, default 5052) and metrics (`--metrics-port`, default 5054) HTTP servers. Both share a bind address (`--http-address`, default `127.0.0.1`). The `metricsPort` from `validator-config.yaml` maps to `--metrics-port`. The API port must be configured separately in `ethlambda-cmd.sh`. + ## Client-Specific Configuration Notes ### zeam @@ -83,6 +85,10 @@ Ports are configured per-node in `validator-config.yaml`. Typical port assignmen - Image: `ghcr.io/lambdaclass/ethlambda:local` - Rust implementation by LambdaClass - Command file: `client-cmds/ethlambda-cmd.sh` +- **Dual HTTP servers:** Runs separate API and metrics servers on independent ports + - `--http-address` (default `127.0.0.1`): shared bind address + - `--api-port` (default `5052`): API server (health, states, checkpoints, fork choice) + - `--metrics-port` (default `5054`): metrics server (Prometheus, pprof) ## Changing Docker Images @@ -112,6 +118,7 @@ To use a different image or tag: | Issue | Image Tags Affected | Description | |-------|---------------------|-------------| +| Separate API and metrics ports | PR #210+ | ethlambda now uses `--http-address`, `--api-port`, and `--metrics-port` instead of the old single `--metrics-address`/`--metrics-port`. `ethlambda-cmd.sh` in lean-quickstart must pass both `--api-port` and `--metrics-port` | | Manifest unknown warning | local | Docker shows "manifest unknown" but falls back to local image - can be ignored | | NoPeersSubscribedToTopic | all | Expected warning when no peers are connected to gossipsub topics | @@ -125,5 +132,5 @@ These are set by `spin-node.sh` and available in client command scripts: | `$configDir` | Genesis config directory path | | `$dataDir` | Data directory path | | `$quicPort` | QUIC port from config | -| `$metricsPort` | Metrics port from config | +| `$metricsPort` | Metrics port from config. For ethlambda, maps to `--metrics-port`; API server needs separate `--api-port` | | `$privkey` | P2P private key | diff --git a/Dockerfile b/Dockerfile index 8a0bd5e..ca52fcb 100644 --- a/Dockerfile +++ b/Dockerfile @@ -59,10 +59,9 @@ COPY --from=builder /app/ethlambda /usr/local/bin # Copy licenses COPY LICENSE ./ -# Lighthouse-compatible default ports: # 9000/tcp, 9000/udp - P2P networking # 9001/udp - QUIC connections -# 5052 - HTTP API +# 5052 - API RPC # 5054 - Prometheus metrics EXPOSE 9000/tcp 9000/udp 9001/udp 5052 5054 ENTRYPOINT ["/usr/local/bin/ethlambda"] diff --git a/bin/ethlambda/src/main.rs b/bin/ethlambda/src/main.rs index 40ed12c..fabacb7 100644 --- a/bin/ethlambda/src/main.rs +++ b/bin/ethlambda/src/main.rs @@ -49,7 +49,9 @@ struct CliOptions { #[arg(long, default_value = "9000")] gossipsub_port: u16, #[arg(long, default_value = "127.0.0.1")] - metrics_address: IpAddr, + http_address: IpAddr, + #[arg(long, default_value = "5052")] + api_port: u16, #[arg(long, default_value = "5054")] metrics_port: u16, #[arg(long)] @@ -83,7 +85,8 @@ async fn main() -> eyre::Result<()> { ethlambda_blockchain::metrics::set_node_info("ethlambda", version::CLIENT_VERSION); ethlambda_blockchain::metrics::set_node_start_time(); - let metrics_socket = SocketAddr::new(options.metrics_address, options.metrics_port); + let api_socket = SocketAddr::new(options.http_address, options.api_port); + let metrics_socket = SocketAddr::new(options.http_address, options.metrics_port); let node_p2p_key = read_hex_file_bytes(&options.node_key); let p2p_socket = SocketAddr::new(IpAddr::from([0, 0, 0, 0]), options.gossipsub_port); @@ -164,9 +167,16 @@ async fn main() -> eyre::Result<()> { }) .inspect_err(|err| error!(%err, "Failed to send InitBlockChain — actors not wired"))?; - ethlambda_rpc::start_rpc_server(metrics_socket, store) - .await - .unwrap(); + tokio::spawn(async move { + let _ = ethlambda_rpc::start_metrics_server(metrics_socket) + .await + .inspect_err(|err| error!(%err, "Metrics server failed")); + }); + tokio::spawn(async move { + let _ = ethlambda_rpc::start_api_server(api_socket, store) + .await + .inspect_err(|err| error!(%err, "API server failed")); + }); info!("Node initialized"); diff --git a/crates/net/rpc/src/lib.rs b/crates/net/rpc/src/lib.rs index 27b3f16..bc85c38 100644 --- a/crates/net/rpc/src/lib.rs +++ b/crates/net/rpc/src/lib.rs @@ -11,15 +11,20 @@ mod fork_choice; mod heap_profiling; pub mod metrics; -pub async fn start_rpc_server(address: SocketAddr, store: Store) -> Result<(), std::io::Error> { - let metrics_router = metrics::start_prometheus_metrics_api(); +pub async fn start_api_server(address: SocketAddr, store: Store) -> Result<(), std::io::Error> { let api_router = build_api_router(store); + + let listener = tokio::net::TcpListener::bind(address).await?; + axum::serve(listener, api_router).await?; + + Ok(()) +} + +pub async fn start_metrics_server(address: SocketAddr) -> Result<(), std::io::Error> { + let metrics_router = metrics::start_prometheus_metrics_api(); let debug_router = build_debug_router(); - let app = Router::new() - .merge(metrics_router) - .merge(api_router) - .merge(debug_router); + let app = Router::new().merge(metrics_router).merge(debug_router); let listener = tokio::net::TcpListener::bind(address).await?; axum::serve(listener, app).await?; @@ -30,6 +35,7 @@ pub async fn start_rpc_server(address: SocketAddr, store: Store) -> Result<(), s /// Build the API router with the given store. fn build_api_router(store: Store) -> Router { Router::new() + .route("/lean/v0/health", get(metrics::get_health)) .route("/lean/v0/states/finalized", get(get_latest_finalized_state)) .route( "/lean/v0/checkpoints/justified", diff --git a/crates/net/rpc/src/metrics.rs b/crates/net/rpc/src/metrics.rs index aadb021..f0abbfd 100644 --- a/crates/net/rpc/src/metrics.rs +++ b/crates/net/rpc/src/metrics.rs @@ -3,9 +3,7 @@ use ethlambda_metrics::gather_default_metrics; use tracing::warn; pub fn start_prometheus_metrics_api() -> Router { - Router::new() - .route("/metrics", get(get_metrics)) - .route("/lean/v0/health", get(get_health)) + Router::new().route("/metrics", get(get_metrics)) } pub(crate) async fn get_health() -> impl IntoResponse { diff --git a/docs/fork_choice_visualization.md b/docs/fork_choice_visualization.md index ab50897..10b27fd 100644 --- a/docs/fork_choice_visualization.md +++ b/docs/fork_choice_visualization.md @@ -9,7 +9,7 @@ A browser-based real-time visualization of the LMD GHOST fork choice tree, serve | `GET /lean/v0/fork_choice/ui` | Interactive D3.js visualization page | | `GET /lean/v0/fork_choice` | JSON snapshot of the fork choice tree | -Both endpoints are served on the metrics port (`--metrics-port`, default `5054`). +Both endpoints are served on the API port (`--api-port`, default `5052`). ## Quick Start @@ -32,10 +32,10 @@ cargo run --release -- \ --custom-network-config-dir ./config \ --node-key ./keys/node.key \ --node-id 0 \ - --metrics-port 5054 + --api-port 5052 ``` -Then open http://localhost:5054/lean/v0/fork_choice/ui. +Then open http://localhost:5052/lean/v0/fork_choice/ui. ## Visualization Guide @@ -71,7 +71,7 @@ Then open http://localhost:5054/lean/v0/fork_choice/ui. ## JSON API ```bash -curl -s http://localhost:5054/lean/v0/fork_choice | jq . +curl -s http://localhost:5052/lean/v0/fork_choice | jq . ``` Response schema: diff --git a/docs/metrics.md b/docs/metrics.md index b562782..734beac 100644 --- a/docs/metrics.md +++ b/docs/metrics.md @@ -1,6 +1,6 @@ # Metrics -We collect various metrics and serve them via a Prometheus-compatible HTTP endpoint at `http://:/metrics` (default: `http://127.0.0.1:5054/metrics`). +We collect various metrics and serve them via a Prometheus-compatible HTTP endpoint at `http://:/metrics` (default: `http://127.0.0.1:5054/metrics`). A ready-to-use Grafana + Prometheus monitoring stack with pre-configured [leanMetrics](https://github.com/leanEthereum/leanMetrics) dashboards is available in [lean-quickstart](https://github.com/blockblaz/lean-quickstart). diff --git a/preview-config.nix b/preview-config.nix index 37acb7e..9f0576c 100644 --- a/preview-config.nix +++ b/preview-config.nix @@ -15,7 +15,8 @@ # # Ports: # QUIC (P2P): 9001-9004 (UDP) -# Metrics/RPC: 8081-8084 (TCP) -- serves /lean/v0/* API and /metrics +# API RPC: 8081-8084 (TCP) -- serves /lean/v0/* API endpoints +# Metrics: 8085-8088 (TCP) -- serves /metrics and /debug/pprof/* # # Prerequisites: # - Podman (rootless, with dockerCompat) for genesis generation tools: @@ -74,11 +75,12 @@ let let name = "ethlambda_${toString idx}"; gossipPort = 9001 + idx; - metricsPort = 8081 + idx; + apiPort = 8081 + idx; + metricsPort = 8085 + idx; aggregatorArgs = lib.optionals (idx == 0) [ "--is-aggregator" ]; in { - description = "ethlambda node ${toString idx} (gossip=${toString gossipPort}, rpc=${toString metricsPort})"; + description = "ethlambda node ${toString idx} (gossip=${toString gossipPort}, api=${toString apiPort}, metrics=${toString metricsPort})"; after = [ "setup-devnet.service" ]; requires = [ "setup-devnet.service" ]; path = with pkgs; [ bash coreutils ]; @@ -92,7 +94,8 @@ let "--gossipsub-port" (toString gossipPort) "--node-id" name "--node-key" "${genesisDir}/${name}.key" - "--metrics-address" "0.0.0.0" + "--http-address" "0.0.0.0" + "--api-port" (toString apiPort) "--metrics-port" (toString metricsPort) ] ++ aggregatorArgs); Restart = "on-failure";