
Commit e6e93ed

rsocket: Add datagram support
Add datagram support through the rsocket API.

Datagram support is handled through an entirely different protocol and
internal implementation than streaming sockets. Unlike connected rsockets,
datagram rsockets are not necessarily bound to a network (IP) address. A
datagram socket may use any number of network (IP) addresses, including
those which map to different RDMA devices. As a result, a single datagram
rsocket must support using multiple RDMA devices and ports, and a datagram
rsocket references a single UDP socket, plus zero or more UD QPs.

Rsockets uses headers inserted before user data sent over UDP sockets to
resolve remote UD QP numbers. When a user first attempts to send a datagram
to a remote address (IP and UDP port), rsockets will take the following
steps:

1. Store the destination address into a lookup table.
2. Resolve which local network address should be used when sending
   to the specified destination.
3. Allocate a UD QP on the RDMA device associated with the local address.
4. Send the user's datagram to the remote UDP socket.

A header is inserted before the user's datagram. The header specifies the
UD QP number associated with the local network address (IP and UDP port) of
the send.

A service thread is used to process messages received on the UDP socket.
This thread updates the rsocket lookup tables with the remote QPN and path
record data. The service thread forwards data received on the UDP socket to
an rsocket QP. After the remote QPN and path records have been resolved,
datagram communication between two nodes is done over the UD QP.

Signed-off-by: Sean Hefty <[email protected]>
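
The bookkeeping behind the first-send steps can be sketched roughly as follows. This is a minimal illustration and not the code from this commit: the fixed-size table and the names `dest_entry`, `dest_table`, `dest_lookup_or_add`, `dest_set_qpn`, and `dest_send_path` are hypothetical, and only IPv4 is handled. It shows a new destination starting out unresolved (so data goes over the UDP socket) and switching to the UD QP path once the remote QPN is recorded.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_DESTS 16	/* hypothetical table size, for illustration only */

enum send_path { PATH_UDP, PATH_UD_QP };

/* Hypothetical per-destination record keyed by (IP, UDP port). */
struct dest_entry {
	uint32_t ip;		/* IPv4 address */
	uint16_t port;		/* UDP port */
	uint32_t qpn;		/* remote UD QP number, valid once resolved */
	int resolved;
	int in_use;
};

struct dest_table {
	struct dest_entry entries[MAX_DESTS];
};

/* Find the destination, or store it if this is the first send to it.
 * Returns NULL when the table is full. */
static struct dest_entry *dest_lookup_or_add(struct dest_table *t,
					     uint32_t ip, uint16_t port)
{
	struct dest_entry *free_slot = NULL;
	int i;

	for (i = 0; i < MAX_DESTS; i++) {
		struct dest_entry *e = &t->entries[i];

		if (e->in_use && e->ip == ip && e->port == port)
			return e;
		if (!e->in_use && !free_slot)
			free_slot = e;
	}
	if (free_slot) {
		memset(free_slot, 0, sizeof(*free_slot));
		free_slot->ip = ip;
		free_slot->port = port;
		free_slot->in_use = 1;
	}
	return free_slot;
}

/* What a service thread would do once it learns the remote QPN. */
static void dest_set_qpn(struct dest_entry *e, uint32_t qpn)
{
	assert(e != NULL);
	e->qpn = qpn;
	e->resolved = 1;
}

/* Unresolved destinations use the UDP socket; resolved ones use the UD QP. */
static enum send_path dest_send_path(const struct dest_entry *e)
{
	return e->resolved ? PATH_UD_QP : PATH_UDP;
}
```

A caller would look up the destination on every send and branch on `dest_send_path()`. Path record storage, local address resolution, and QP allocation are left out of this sketch.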
1 parent c6bfc1c commit e6e93ed

4 files changed, +1598 -154 lines changed


docs/rsocket

+91 -3
@@ -1,7 +1,7 @@
-rsocket Protocol and Design Guide 9/10/2012
+rsocket Protocol and Design Guide 11/11/2012
 
-Overview
---------
+Data Streaming (TCP) Overview
+-----------------------------
 Rsockets is a protocol over RDMA that supports a socket-level API
 for applications. For details on the current state of the
 implementation, readers should refer to the rsocket man page. This
@@ -189,3 +189,91 @@ registered remote data buffer.
 From host A's perspective, the transfer appears as a normal send/write
 operation, with the data stream redirected directly into the receiving
 application's buffer.
+
+
+
+Datagram Overview
+-----------------
+The rsocket API supports datagram sockets. Datagram support is handled through an
+entirely different protocol and internal implementation. Unlike connected rsockets,
+datagram rsockets are not necessarily bound to a network (IP) address. A datagram
+socket may use any number of network (IP) addresses, including those which map to
+different RDMA devices. As a result, a single datagram rsocket must support
+using multiple RDMA devices and ports, and a datagram rsocket references a single
+UDP socket, plus zero or more UD QPs.
+
+Rsockets uses headers inserted before user data sent over UDP sockets to resolve
+remote UD QP numbers. When a user first attempts to send a datagram to a remote
+address (IP and UDP port), rsockets will take the following steps:
+
+1. Store the destination address into a lookup table.
+2. Resolve which local network address should be used when sending
+   to the specified destination.
+3. Allocate a UD QP on the RDMA device associated with the local address.
+4. Send the user's datagram to the remote UDP socket.
+
+A header is inserted before the user's datagram. The header specifies the
+UD QP number associated with the local network address (IP and UDP port) of
+the send.
+
+A service thread is used to process messages received on the UDP socket. This
+thread updates the rsocket lookup tables with the remote QPN and path record
+data. The service thread forwards data received on the UDP socket to an
+rsocket QP. After the remote QPN and path records have been resolved, datagram
+communication between two nodes is done over the UD QP.
+
+UDP Message Format
+------------------
+Rsockets uses messages exchanged over UDP sockets to resolve remote QP numbers.
+If a user sends a datagram to a remote service and the local rsocket is not
+yet configured to send directly to a remote UD QP, the user data is sent over
+a UDP socket with the following header inserted before the user data.
+
+struct ds_udp_header {
+	uint32_t tag;
+	uint8_t version;
+	uint8_t op;
+	uint8_t length;
+	uint8_t reserved;
+	uint32_t qpn; /* lower 8-bits reserved */
+	union {
+		uint32_t ipv4;
+		uint8_t ipv6[16];
+	} addr;
+};
+
+Tag - Marker used to help identify that the UDP header is present.
+#define DS_UDP_TAG 0x55555555
+
+Version - IP address version, either 4 or 6
+Op - Indicates message type, used to control the receiver's operation.
+     Valid operations are RS_OP_DATA and RS_OP_CTRL. Data messages
+     carry user data, while control messages are used to reply with the
+     local QP number.
+Length - Size of the UDP header.
+QPN - UD QP number associated with sender's IP address and port.
+      The sender's address and port are extracted from the received UDP
+      datagram.
+Addr - Target IP address of the sent datagram.
+
+Once the remote QP information has been resolved, data is sent directly
+between UD QPs. The following header is inserted before any user data that
+is transferred over a UD QP.
+
+struct ds_header {
+	uint8_t version;
+	uint8_t length;
+	uint16_t port;
+	union {
+		uint32_t ipv4;
+		struct {
+			uint32_t flowinfo;
+			uint8_t addr[16];
+		} ipv6;
+	} addr;
+};
+
+Version - IP address version
+Length - Size of the header
+Port - Associated source address UDP port
+Addr - Associated source IP address
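
A minimal sketch of how a receiver might recognize the UDP header documented above. It assumes that multi-byte fields travel in network byte order and that Length holds 16 for IPv4 and 28 for IPv6, which are the sizes implied by the struct layout (12 fixed bytes plus the address). The helpers `udp_hdr_fill_ipv4` and `udp_hdr_check` and the `UDP_HDR_LEN_*` names are hypothetical. The op value is passed through untouched because the numeric values of RS_OP_DATA and RS_OP_CTRL are not given here, and the qpn field is treated as opaque.

```c
#include <arpa/inet.h>	/* htonl, ntohl */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DS_UDP_TAG 0x55555555

struct ds_udp_header {
	uint32_t tag;
	uint8_t  version;
	uint8_t  op;
	uint8_t  length;
	uint8_t  reserved;
	uint32_t qpn;		/* lower 8-bits reserved */
	union {
		uint32_t ipv4;
		uint8_t  ipv6[16];
	} addr;
};

/* Hypothetical names; sizes follow from the field widths above. */
#define UDP_HDR_LEN_IPV4 16
#define UDP_HDR_LEN_IPV6 28

/* Fill an IPv4 header. qpn_wire and dst_wire are assumed to be in wire
 * format already, so this sketch takes no position on their encoding. */
static size_t udp_hdr_fill_ipv4(struct ds_udp_header *hdr, uint8_t op,
				uint32_t qpn_wire, uint32_t dst_wire)
{
	assert(hdr != NULL);
	memset(hdr, 0, sizeof(*hdr));
	hdr->tag = htonl(DS_UDP_TAG);
	hdr->version = 4;
	hdr->op = op;
	hdr->length = UDP_HDR_LEN_IPV4;
	hdr->qpn = qpn_wire;
	hdr->addr.ipv4 = dst_wire;
	return UDP_HDR_LEN_IPV4;
}

/* Return the header length if buf starts with a plausible header, else 0. */
static size_t udp_hdr_check(const void *buf, size_t len)
{
	struct ds_udp_header hdr;
	size_t want;

	if (len < UDP_HDR_LEN_IPV4)
		return 0;
	memset(&hdr, 0, sizeof(hdr));
	memcpy(&hdr, buf, len < sizeof(hdr) ? len : sizeof(hdr));
	if (ntohl(hdr.tag) != DS_UDP_TAG)
		return 0;
	if (hdr.version == 4)
		want = UDP_HDR_LEN_IPV4;
	else if (hdr.version == 6)
		want = UDP_HDR_LEN_IPV6;
	else
		return 0;
	return (hdr.length == want && len >= want) ? want : 0;
}
```

Under these assumptions, the bytes following the returned length would be the user's datagram for a data message. The tag value reads the same in either byte order, so the tag check does not depend on the byte-order assumption.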

src/cma.c

+12 -2
@@ -513,7 +513,7 @@ int rdma_destroy_id(struct rdma_cm_id *id)
 	return 0;
 }
 
-static int ucma_addrlen(struct sockaddr *addr)
+int ucma_addrlen(struct sockaddr *addr)
 {
 	if (!addr)
 		return 0;
@@ -2232,9 +2232,19 @@ void rdma_destroy_ep(struct rdma_cm_id *id)
 int ucma_max_qpsize(struct rdma_cm_id *id)
 {
 	struct cma_id_private *id_priv;
+	int i, max_size = 0;
 
 	id_priv = container_of(id, struct cma_id_private, id);
-	return id_priv->cma_dev->max_qpsize;
+	if (id && id_priv->cma_dev) {
+		max_size = id_priv->cma_dev->max_qpsize;
+	} else {
+		ucma_init();
+		for (i = 0; i < cma_dev_cnt; i++) {
+			if (!max_size || max_size > cma_dev_array[i].max_qpsize)
+				max_size = cma_dev_array[i].max_qpsize;
+		}
+	}
+	return max_size;
 }
 
 uint16_t ucma_get_port(struct sockaddr *addr)
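
The fallback branch added to `ucma_max_qpsize()` picks the smallest `max_qpsize` across all known devices when the id is not yet bound to one. That selection can be tried in isolation with the sketch below, where `struct dev_info` and `min_qpsize` are hypothetical stand-ins for the library's internal device array and are not part of this commit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-device record; only max_qpsize matters. */
struct dev_info {
	int max_qpsize;
};

/* Return the smallest max_qpsize among cnt devices, or 0 if cnt is 0.
 * Mirrors the loop in the hunk above: 0 doubles as "not set yet". */
static int min_qpsize(const struct dev_info *devs, int cnt)
{
	int i, max_size = 0;

	assert(cnt >= 0);
	for (i = 0; i < cnt; i++) {
		if (!max_size || max_size > devs[i].max_qpsize)
			max_size = devs[i].max_qpsize;
	}
	return max_size;
}
```

Taking the minimum gives a size that every device can satisfy, which fits a datagram rsocket that may later allocate QPs on any of them. One consequence of using 0 as the "not set" marker is that a device reporting 0 never lowers the result.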

src/cma.h

+2
@@ -145,10 +145,12 @@ typedef struct { volatile int val; } atomic_t;
 #define atomic_set(v, s) ((v)->val = s)
 
 uint16_t ucma_get_port(struct sockaddr *addr);
+int ucma_addrlen(struct sockaddr *addr);
 void ucma_set_sid(enum rdma_port_space ps, struct sockaddr *addr,
 		  struct sockaddr_ib *sib);
 int ucma_max_qpsize(struct rdma_cm_id *id);
 int ucma_complete(struct rdma_cm_id *id);
+
 static inline int ERR(int err)
 {
 	errno = err;
