This project demonstrates how to use Mercury for RPC communication in an HPC environment.
Requirements:
- CMake 3.10+
- Mercury HPC library
- C++17 compatible compiler
- spdlog library
Build and install:
mkdir build
cd build
cmake ..
make
make install
- Create a nodelist.conf file (optional - will be created automatically if not present):
hostname1:8080
hostname2:8080
hostname3:8080
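As an illustration of the file format above, here is a minimal sketch of reading `host:port` lines. The names `NodeEntry`, `parse_node_line`, and `parse_nodelist` are hypothetical and are not taken from this project's sources; the sketch also assumes that blank or malformed lines should simply be skipped.

```cpp
#include <exception>
#include <istream>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical type: one entry of nodelist.conf.
struct NodeEntry {
    std::string host;
    int port;
};

// Parse a single "host:port" line.
// Returns std::nullopt for blank or malformed lines (assumed behaviour).
std::optional<NodeEntry> parse_node_line(const std::string& line) {
    const auto colon = line.rfind(':');
    if (colon == std::string::npos || colon == 0 || colon + 1 >= line.size()) {
        return std::nullopt;
    }
    try {
        std::size_t used = 0;
        const std::string port_text = line.substr(colon + 1);
        const int port = std::stoi(port_text, &used);
        // Reject trailing junk and out-of-range port numbers.
        if (used != port_text.size() || port <= 0 || port > 65535) {
            return std::nullopt;
        }
        return NodeEntry{line.substr(0, colon), port};
    } catch (const std::exception&) {
        return std::nullopt;  // std::stoi could not parse the port
    }
}

// Read every valid entry from a stream such as an opened nodelist.conf.
std::vector<NodeEntry> parse_nodelist(std::istream& in) {
    std::vector<NodeEntry> nodes;
    std::string line;
    while (std::getline(in, line)) {
        if (auto entry = parse_node_line(line)) {
            nodes.push_back(*entry);
        }
    }
    return nodes;
}
```

To read the real file, pass a `std::ifstream` opened on `nodelist.conf` to `parse_nodelist`.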
- Run the server:
./mercury_server <role> <num_threads>
Examples:
# Start a worker with 4 threads
./mercury_server worker 4
# Start a master with 2 threads
./mercury_server master 2
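The command line above could be validated roughly as follows. This is a sketch only: `Role`, `ServerOptions`, and `parse_args` are hypothetical names, and it assumes that `master` and `worker` are the only accepted roles and that the thread count must be a positive integer.

```cpp
#include <cstddef>
#include <exception>
#include <optional>
#include <string>

// Hypothetical types; the real server may model roles differently.
enum class Role { Master, Worker };

struct ServerOptions {
    Role role;
    int num_threads;
};

// Expects exactly: <program> <role> <num_threads>.
// Returns std::nullopt on any invalid input.
std::optional<ServerOptions> parse_args(int argc, const char* const argv[]) {
    if (argc != 3) return std::nullopt;

    const std::string role_text = argv[1];
    Role role;
    if (role_text == "master") {
        role = Role::Master;
    } else if (role_text == "worker") {
        role = Role::Worker;
    } else {
        return std::nullopt;  // unknown role
    }

    try {
        std::size_t used = 0;
        const std::string threads_text = argv[2];
        const int threads = std::stoi(threads_text, &used);
        // Reject trailing junk ("4x") and non-positive counts.
        if (used != threads_text.size() || threads < 1) return std::nullopt;
        return ServerOptions{role, threads};
    } catch (const std::exception&) {
        return std::nullopt;  // not a number, or out of range for int
    }
}
```

From `main`, this would be called as `parse_args(argc, argv)`, printing the usage line and exiting when it returns `std::nullopt`.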
The system discovers nodes using the first of the following methods that applies:
- SLURM environment (automatic if running under SLURM)
- nodelist.conf file (fallback if no SLURM environment)
- Auto-registration (if no existing nodes found)
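The fallback order in the list above can be sketched as a small decision function. The names below are hypothetical. The sketch also assumes that a SLURM job is detected through the `SLURM_JOB_NODELIST` environment variable, which Slurm sets inside a job; the project may check something else.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical enum naming the three discovery methods.
enum class DiscoveryMethod { Slurm, NodelistFile, AutoRegistration };

// Assumption: a non-empty SLURM_JOB_NODELIST means we are inside a SLURM job.
bool running_under_slurm() {
    const char* nodelist = std::getenv("SLURM_JOB_NODELIST");
    return nodelist != nullptr && *nodelist != '\0';
}

// Pick a method in priority order:
// SLURM first, then nodelist.conf, then auto-registration.
DiscoveryMethod choose_discovery(bool under_slurm, std::size_t nodes_in_file) {
    if (under_slurm) return DiscoveryMethod::Slurm;
    if (nodes_in_file > 0) return DiscoveryMethod::NodelistFile;
    return DiscoveryMethod::AutoRegistration;  // no existing nodes found
}
```

Keeping the decision separate from the environment lookup makes the priority order easy to test without a SLURM allocation.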
Mercury supports multiple transport protocols. This example uses:
- na+sm: Shared memory protocol (default)
Other available protocols:
- ofi+tcp: TCP/IP over libfabric
- ofi+verbs: InfiniBand verbs over libfabric
- na+cxi: Cray/HPE Slingshot
To use a different protocol, modify the address string in main.cpp:
std::string address = "ofi+tcp://" + std::to_string(8080 + i);
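To avoid editing the string literal each time, the protocol could be passed in as a parameter. The helper below is a hypothetical sketch, not part of main.cpp. It reproduces the `protocol://port` shape of the line above and does not check whether the chosen Mercury plugin accepts an address in that form.

```cpp
#include <string>

// Hypothetical helper: builds "<protocol>://<base_port + index>",
// mirroring the address format used in main.cpp.
std::string make_address(const std::string& protocol, int base_port, int index) {
    return protocol + "://" + std::to_string(base_port + index);
}
```

For example, `make_address("ofi+tcp", 8080, i)` yields the same string as the line shown above.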
The application uses spdlog for detailed logging. Log levels can be controlled at runtime:
spdlog::set_level(spdlog::level::debug); // Most verbose
spdlog::set_level(spdlog::level::info); // Normal operation
spdlog::set_level(spdlog::level::warn); // Warnings and errors only
Log format includes:
- Timestamp
- Log level
- Thread ID
- Component name
- Message
Available RPCs:
- HEARTBEAT: Basic alive check
- GET_ROLE: Query node's assigned role
- ECHO: Echo back the received message
- SHUTDOWN: Graceful shutdown of the node
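The behaviour of these four RPCs can be illustrated with a plain dispatch function. This sketch deliberately leaves out Mercury itself: `RpcId`, `NodeState`, and `handle_rpc` are hypothetical names, and the reply strings are assumptions chosen for illustration.

```cpp
#include <atomic>
#include <string>

// Hypothetical identifiers for the four RPCs listed above.
enum class RpcId { Heartbeat, GetRole, Echo, Shutdown };

// Hypothetical per-node state shared by the handlers.
struct NodeState {
    std::string role;
    std::atomic<bool> running{true};
};

// Dispatch one request and return the reply payload.
std::string handle_rpc(RpcId id, const std::string& payload, NodeState& node) {
    switch (id) {
        case RpcId::Heartbeat:
            return "alive";          // basic alive check
        case RpcId::GetRole:
            return node.role;        // report the node's assigned role
        case RpcId::Echo:
            return payload;          // send the message back unchanged
        case RpcId::Shutdown:
            node.running.store(false);  // ask the server loop to stop
            return "shutting down";
    }
    return {};  // unreachable for valid RpcId values
}
```

In this sketch the server loop would poll `node.running` and exit once a SHUTDOWN request clears it; the atomic flag keeps that safe when handlers run on several threads.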
The system implements a multi-threaded RPC server with:
- Multiple RPC handlers per node
- Role-based request handling
- SLURM integration
- Automatic node discovery
- Comprehensive logging