DocRPC
This document describes the architecture of the networking and RPC layers in SLASH2.
Note: the terms "server" and "client" in this document follow the standard convention of one host sending a request to a peer and receiving a reply. The three components in a SLASH2 deployment (MDS, IOS, CLI) often play both server and client roles with other component types to provide their service.
SLASH2 leverages libraries that provide higher level networking facilities than the customary low level primitives offered by the kernel and its support libraries. These are LNET, the Lustre networking stack, and its accompanying libraries.
SLASH2 uses a forked version of the Lustre Networking (LNET) library with a few changes to fit into the SLASH2 code base. As SLASH2 runs entirely in user mode, much of the kernel-mode code in SLASH2's LNET has been removed.
The LNET fork is actually in PFL along with the accompanying RPC library PFLRPC.
Our version of LNET uses the same constructs for supporting the underlying transports, although SLASH2 primarily uses only usocklnd (the user-mode sockets Lustre networking device).
Please refer to the Lustre documentation for additional information on the workings of these libraries.
This module provides an API for higher level RPC operations. It is forked from an older version of the Lustre ptlrpc API.
Note: significant changes have been made to this library to support the different environments that SLASH2 deployments often consist of.
Note: this API's code is in the process of being renamed from the "PSC" prefix to the "PFL" prefix, so unfortunately some API names are still in transition.
The PFL RPC API is primarily implemented in pfl/rpcclient.c and
pfl/service.c which provide three constructs for applications built
upon PFL RPC:
- pscrpc_export - a handle to a connected client for a service provided by this operating daemon
- pscrpc_import - a handle for a connection to a remote service
- pscrpc_request (rq) - a single RPC exchange, which takes slightly different lifecycles in the client than in the server:
  - clients always initiate requests (filling in rq_reqmsg) and the PFL RPC API attaches the rq_repmsg reply message when it is received from the server;
  - servers invoke the corresponding service handling routine in the context of one of the service's worker threads when an incoming request is received. The thread processes the request and generates a response, which the PFL RPC module then transmits back to the client.
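The client side of this request lifecycle can be pictured with the following C-like pseudocode sketch (the helper names and argument lists here are illustrative assumptions, not the actual PFL RPC prototypes; only pscrpc_request, rq_reqmsg, and rq_repmsg come from the text above):

```
/* Hypothetical sketch of a client-side exchange; real prototypes differ. */
struct pscrpc_request *rq;

rq = pscrpc_request_alloc(imp, ...);  /* allocate a request on an import */
fill_request_body(rq->rq_reqmsg);     /* client fills in rq_reqmsg        */
send_and_wait(rq);                    /* PFL RPC transmits the request and
                                         attaches rq_repmsg when the
                                         server's reply arrives           */
process_reply(rq->rq_repmsg);
```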
An additional API layer defined in pfl/rsx.h (RPC simple exchange)
provides a higher level RPC send interface: pfl_rsx_newreq() and
pfl_rsx_waitrep().
RSX also contains the bulk data processing methods rsx_bulkclient()
and rsx_bulkserver(), which are used to transmit data larger than
the maximum message size defined when services are registered.
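A simple exchange through the RSX helpers might look roughly like the sketch below (the argument lists and the mq/mp message-body conventions are assumptions; only the function names come from pfl/rsx.h):

```
/* Sketch only: the real prototypes take additional parameters
 * (version, opcode, message structures, etc.). */
rc = pfl_rsx_newreq(csvc, OPCODE, &rq, &mq, &mp); /* request + bodies  */
if (rc)
    return rc;
mq->field = value;               /* fill in the request message        */
rc = pfl_rsx_waitrep(rq, &mp);   /* send; block for reply or timeout   */
/* ... inspect mp, then release the request ... */
```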
The SLASH2 networking API is primarily implemented in
share/rpc_common.c and include/slashrpc.h which provide an
additional construct for communication among SLASH2 daemons:
- slashrpc_cservice (csvc) - a higher level structure which uses a pscrpc_import to issue requests to servers and handle replies in a versatile fashion
Server mode operation in SLASH2 is pretty straightforward:
a PFL RPC service is registered during daemon initialization and
incoming requests are handled when received.
pscrpc_thread_spawn() registers a new service for clients and spawns
worker threads to handle the requests.
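This registration step can be sketched as follows (a C-like pseudocode sketch: the service-handle fields and the handler signature are assumptions; only pscrpc_thread_spawn() and pscrpc_request are named in the text):

```
/* Sketch: register a service and its worker threads at daemon startup. */
struct pscrpc_svc_handle *svh = /* ... service descriptor ... */;

svh->svh_handler  = my_service_handler; /* invoked per incoming request */
svh->svh_nthreads = NWORKERS;           /* worker threads to spawn      */
pscrpc_thread_spawn(svh, ...);          /* register service, spawn them */

/* Each worker runs handlers of roughly this form: */
int
my_service_handler(struct pscrpc_request *rq)
{
    /* parse rq->rq_reqmsg, do the work, fill in the reply; the
     * PFL RPC module then transmits the response to the client */
    return 0;
}
```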
Certain tie-ins are made when operations must be accommodated in a persistent fashion, e.g. bmap leases issued to clients are written to stable storage and also retained in memory after an RPC exchange has finished.
Many RPC operations are stateless and this mode is preferred when possible.
A higher level API "wrapper" is defined in slash2/include/slconn.h
which calls the PFL RPC and RSX primitives explained above but offers
additional functionality beyond that provided by the lower layers:
- piggybacking of extra fields onto RPCs to keep API usage simple while conveying additional information
- encryption/authentication/integrity of message contents
- automatic operation counting
- higher level event handling, such as cleanup during failure
Clients can create new RPC requests with MSL_RMC_NEWREQ().
This mount_slash API (with the msl prefix) is used to create a new
RPC structure which is intended to be received by the MDS and is issued
by CLI (RMC).
RPCs can be processed synchronously or asynchronously;
SL_RSX_WAITREP() performs a blocking wait until the server replies or
times out;
SL_NBRQSET_ADD() pushes the request out and arranges for a callback to
be invoked when the server replies or times out.
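The two modes can be contrasted with this sketch (macro arguments and the reply-callback hookup are assumptions based on the names above, not the actual definitions):

```
/* Sketch: synchronous vs. asynchronous issue of a CLI->MDS request. */

/* Synchronous: block until the server replies or the request times out. */
MSL_RMC_NEWREQ(csvc, OPCODE, rq, mq, mp, rc);
/* ... fill in mq ... */
rc = SL_RSX_WAITREP(csvc, rq, mp);

/* Asynchronous: push the request out; a callback runs on completion. */
MSL_RMC_NEWREQ(csvc, OPCODE, rq, mq, mp, rc);
/* ... fill in mq ... */
rq->rq_interpret_reply = my_reply_cb;   /* invoked on reply or timeout */
SL_NBRQSET_ADD(csvc, rq);
```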
Many RPC behavioral specifics can be tuned by setting appropriate fields
in pscrpc_request, e.g. rq_bulk_abortable, which prevents the entire
import (and thus connection) from failing if the server does not need a
bulk transmission to satisfy the request.
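For instance (a sketch; only rq_bulk_abortable itself is named in the text, and the flag value is an assumption):

```
rq->rq_bulk_abortable = 1;  /* the server may complete the request without
                               a bulk transfer, and the import will not be
                               torn down as a result */
```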
