InfinibandVerbs.jl Documentation

🚧 👷 Under construction 👷 🚧

This documentation is under construction.

Here are some of the more commonly used high-level API functions, presented in the sections below.

General functions

These structures and functions are useful for both sending and receiving packets.

InfinibandVerbs.Context - Type
Context(dev_name, port_num; <kwargs>)

Create a Context object to use Infiniband Verbs on port_num of dev_name.

Keyword arguments, described in the extended help, control various aspects of the created objects. Their default values request minimal resources, but non-trivial applications will need to request more resources depending on the application's needs. Upon successful return, the Context's queue pair will be in the IBV_QPS_INIT state.

Extended help

The Context structure manages the following fields (so you don't have to):

| Field name | Type | Description |
|---|---|---|
| context | Ptr{ibv_context} | Context from C library |
| pd | Ptr{ibv_pd} | Protection domain |
| send_comp_channel | Ptr{ibv_comp_channel} | Send completion channel |
| recv_comp_channel | Ptr{ibv_comp_channel} | Receive completion channel |
| send_cq | Ptr{ibv_cq} | Send completion queue |
| recv_cq | Ptr{ibv_cq} | Receive completion queue |
| qp | Ptr{ibv_qp} | Queue pair |
| send_wcs | Vector{ibv_wc} | Send work completions |
| recv_wcs | Vector{ibv_wc} | Recv work completions |

The following fields of the Context structure hold sizing information:

| Field name | Type | Description |
|---|---|---|
| send_cqe | UInt32 | Number of send completion queue events |
| recv_cqe | UInt32 | Number of recv completion queue events |
| max_send_wr | UInt32 | Max number of posted send work requests |
| max_recv_wr | UInt32 | Max number of posted recv work requests |
| max_send_sge | UInt32 | Max number of SGEs per send work request |
| max_recv_sge | UInt32 | Max number of SGEs per recv work request |
| (!) max_inline_data | UInt32 | Maximum amount of inline data |

The Context constructor supports the following keyword arguments:

| Keyword argument | Default | Description |
|---|---|---|
| force | false | Allow dev_name:port_num to be inactive |
| (-) send_cqe | 1 | Number of events for send completion queue |
| (-) recv_cqe | 1 | Number of events for recv completion queue |
| (-) max_send_wr | 1 | Maximum number of send work requests |
| (-) max_recv_wr | 1 | Maximum number of recv work requests |
| (-) max_send_sge | 1 | Maximum number of SGEs for send WR sg_lists |
| (-) max_recv_sge | 1 | Maximum number of SGEs for recv WR sg_lists |
| req_notify_send | true | Request send CQ notifications |
| req_notify_recv | true | Request recv CQ notifications |
| solicited_only_send | false | solicited_only for send CQ notifications |
| solicited_only_recv | false | solicited_only for recv CQ notifications |
| (!) comp_vector | 0 | Completion queue comp_vector |
| (!) max_inline_data | 0 | Maximum inline data for QP |
| (!) qp_type | IBV_QPT_RAW_PACKET | QP type |

Info

Keyword arguments marked with (-) can be passed as -1 to use the device's maximum supported value, which can be retrieved from the Context field of the same name once constructed. The actual values will be greater than or equal to the requested values.

Warning

Fields and keyword arguments marked with (!) are for expert use. Use and/or change at your own risk.

InfinibandVerbs.hascapability - Function
hascapability(ctx::Context, cap)

Return true if the device corresponding to ctx has capability cap.

cap may be any one (and only one) of the ibv_device_cap_flags or ibv_raw_packet_caps flags.

Extended help

ibv_device_cap_flags

  • IBV_DEVICE_RESIZE_MAX_WR
  • IBV_DEVICE_BAD_PKEY_CNTR
  • IBV_DEVICE_BAD_QKEY_CNTR
  • IBV_DEVICE_RAW_MULTI
  • IBV_DEVICE_AUTO_PATH_MIG
  • IBV_DEVICE_CHANGE_PHY_PORT
  • IBV_DEVICE_UD_AV_PORT_ENFORCE
  • IBV_DEVICE_CURR_QP_STATE_MOD
  • IBV_DEVICE_SHUTDOWN_PORT
  • IBV_DEVICE_INIT_TYPE
  • IBV_DEVICE_PORT_ACTIVE_EVENT
  • IBV_DEVICE_SYS_IMAGE_GUID
  • IBV_DEVICE_RC_RNR_NAK_GEN
  • IBV_DEVICE_SRQ_RESIZE
  • IBV_DEVICE_N_NOTIFY_CQ
  • IBV_DEVICE_MEM_WINDOW
  • IBV_DEVICE_UD_IP_CSUM
  • IBV_DEVICE_XRC
  • IBV_DEVICE_MEM_MGT_EXTENSIONS
  • IBV_DEVICE_MEM_WINDOW_TYPE_2A
  • IBV_DEVICE_MEM_WINDOW_TYPE_2B
  • IBV_DEVICE_RC_IP_CSUM
  • IBV_DEVICE_RAW_IP_CSUM
  • IBV_DEVICE_MANAGED_FLOW_STEERING

ibv_raw_packet_caps

  • IBV_RAW_PACKET_CAP_CVLAN_STRIPPING
  • IBV_RAW_PACKET_CAP_SCATTER_FCS
  • IBV_RAW_PACKET_CAP_IP_CSUM
  • IBV_RAW_PACKET_CAP_DELAY_DROP


InfinibandVerbs.post_wrs - Function
post_wrs(ctx::Context, wr::Ptr; modify_qp=false) -> nothing
post_wrs(ctx::Context, wrs::Vector, idx=1; modify_qp=false) -> nothing

Post linked WRs to the Context's QP.

wr may be a Ptr{ibv_send_wr}/Ptr{ibv_recv_wr} or Vector{ibv_send_wr}/Vector{ibv_recv_wr}. Passing a Vector will only post the WRs that are part of the linked list headed by the WR wrs[idx] of the Vector, which may not be the same as wrs[idx:end]. Throws a SystemError if the underlying library call fails, otherwise returns nothing.

If modify_qp is true, the Context's QP will be transitioned appropriately for the work request type:

  • For ibv_recv_wr, the QP will be transitioned to a "ready-to-receive" compatible state after posting the WRs.

  • For ibv_send_wr, the QP will be transitioned to the "ready-to-send" state before posting the WRs.
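
The difference between the linked list headed by wrs[idx] and the slice wrs[idx:end] can be illustrated with a small stand-in model. This is only a sketch: MockWR and chain_from are hypothetical names that are not part of InfinibandVerbs.jl, and next is modeled here as a Vector index (0 meaning end of list), whereas the real ibv_send_wr and ibv_recv_wr structs link through a next pointer.

```julia
# Hypothetical stand-in for a work request; the real structs use a `next` pointer.
struct MockWR
    wr_id::Int
    next::Int   # index of the next WR in the Vector, or 0 for end of list
end

# Collect the wr_id of every WR reachable from wrs[idx] by following `next`.
function chain_from(wrs::Vector{MockWR}, idx::Int=1)
    ids = Int[]
    while idx != 0
        push!(ids, wrs[idx].wr_id)
        idx = wrs[idx].next
    end
    return ids
end

# Two separate linked lists stored in one Vector: 1 -> 2 and 3 -> 4.
wrs = [MockWR(1, 2), MockWR(2, 0), MockWR(3, 4), MockWR(4, 0)]

posted_from_1 = chain_from(wrs)      # WRs a post starting at wrs[1] would cover
posted_from_3 = chain_from(wrs, 3)   # WRs a post starting at wrs[3] would cover
```

In this model, starting at index 1 covers only the first two WRs even though wrs[1:end] holds four, which is the situation the note about wrs[idx:end] describes.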

InfinibandVerbs.repost_loop - Function
repost_loop(callback, ctx::Context, wrs, timeout_ms, callback_args...)

Run a loop that processes work completions and reposts work requests.

The callback argument is a user supplied callback function described in detail below. ctx is a Context struct. wrs is a Vector{ibv_recv_wr} or Vector{ibv_send_wr} containing all the work requests being used. callback_args... are user supplied arguments that will be passed to the user supplied callback function. Typically callback_args will include the packet buffers (or some other means of getting packet (fragment) buffers) and a Ref{Int} for accumulating the number of packets posted (useful when finding the next location in the packet buffers to use), but the caller is free to decide how to make use of callback_args....

The user supplied callback function callback is called for every work completion event that has a non-zero number of work completions. It is also called if no work completions are received for timeout_ms milliseconds. The callback function is passed the following arguments:

callback(wcs, num_wc, wrs, callback_args...)
  • wcs: A Vector{ibv_wc} containing work completions (WCs)
  • num_wc: Number of valid entries in wcs (i.e. wcs[1:num_wc] are valid). If the callback function is called after a timeout, num_wc will be 0.
  • wrs: The same recv_wrs or send_wrs passed to repost_loop
  • callback_args...: The same callback_args... parameters that were passed to repost_loop (the callback function may specify the individual extra arguments by name rather than splatting callback_args)
Warning

This function can block waiting for work completion events. It should be run in a separate thread from the main Julia thread so that it will not interfere with the Julia scheduler. Usually this will be done using Threads.@spawn. For example (using repost_loop to receive):

recvtask = Threads.@spawn repost_loop(recv_callback, ctx, recv_wrs, timeout, user1, user2)

Note that Threads.@spawn does not create new threads. To use multiple threads, Julia must be started with more than one thread (e.g. julia -t 4). Additional threads cannot be created after startup.
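
The threading advice above can be tried out without any InfiniBand hardware. The following is a minimal sketch in which wait_then_finish is a hypothetical stand-in for a long-running call such as repost_loop; it only shows the spawn/fetch pattern and the thread-count check, not actual packet processing.

```julia
# Hypothetical stand-in for a long-running call such as repost_loop.
# sleep only simulates waiting; unlike a blocking C call it yields to the scheduler.
function wait_then_finish(seconds)
    sleep(seconds)
    return :finished
end

# Threads.@spawn cannot add threads, so check how many Julia was started with.
if Threads.nthreads() == 1
    @warn "Only one thread available; start Julia with e.g. `julia -t 4`"
end

task = Threads.@spawn wait_then_finish(0.01)

# The spawning task stays free to do other work, then collects the result.
result = fetch(task)
```

fetch waits for the task and returns the task's return value, so the caller can tell when the loop has ended.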

Extended help

The main purpose of the callback function is to update the SG lists of the completed work requests (WRs) with pointers to the next available packet (fragment) buffer location(s). The callback function should return the number of work requests for repost_loop to repost. Returning 0 from the callback will keep repost_loop running but no WRs will be reposted. Returning a negative value will cause repost_loop to return (i.e. end). When num_wc is 0 (i.e. after a timeout), the callback should return a value less than 0 to end the loop or 0 to keep waiting for more work completions (i.e. packets).

For streaming applications that run "forever", the callback function should update the SG lists of the WRs for all num_wc WCs and return num_wc.

Applications that want to process exactly N WRs should pass N as part of callback_args... as well as two Ref{Int} parameters so the callback function can accumulate the number of WRs posted and the number of WRs done/processed. As the number of WRs posted approaches N, the callback may return a number less than num_wc. Such applications should also pay attention to the number of done/processed WRs and return a negative value when it equals N (or exceeds N, but that should never happen if exactly N WRs are posted).
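
The bookkeeping for the "exactly N" case can be sketched as follows. This is an illustration under stated assumptions, not the package's own code: counted_callback is a hypothetical name, the extra arguments n, nposted and ndone are the callback_args... chosen for this sketch, nposted is assumed to already include the WRs posted before the loop started, and the step that updates the SG lists is left as a comment.

```julia
# Callback for an application that wants exactly `n` work requests processed.
# `nposted` and `ndone` are Ref{Int} counters passed through callback_args....
function counted_callback(wcs, num_wc, wrs, n, nposted, ndone)
    # Timeout (num_wc == 0): return 0 to keep waiting.
    # Returning a negative value here would end the loop instead.
    num_wc == 0 && return 0

    ndone[] += num_wc
    # All n WRs have completed: a negative return value ends the loop.
    ndone[] >= n && return -1

    # Repost only as many WRs as are still needed to reach n in total.
    nrepost = min(num_wc, n - nposted[])
    nposted[] += nrepost
    # A real callback would update the SG lists of those nrepost WRs here.
    return nrepost
end

# Drive the callback by hand: 10 WRs wanted, 4 posted before the loop starts.
n, nposted, ndone = 10, Ref(4), Ref(0)
r_timeout = counted_callback(nothing, 0, nothing, n, nposted, ndone)
r1 = counted_callback(nothing, 4, nothing, n, nposted, ndone)
r2 = counted_callback(nothing, 4, nothing, n, nposted, ndone)
r3 = counted_callback(nothing, 2, nothing, n, nposted, ndone)
```

The second call returns fewer than num_wc because only two more WRs are needed to reach N, and the last call returns a negative value because all N WRs are done.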


Sending functions

These functions are useful when sending packets.

InfinibandVerbs.create_send_wrs - Function
create_send_wrs(ctx::Context, bufs, num_wr[, npad]; offload=false, post=false) -> send_wrs, sges, mrs
create_send_wrs(mrs, bufs, num_wr[, npad]) -> send_wrs, sges

Return num_wr ibv_send_wr work requests (WRs) and their associated scatter-gather (SG) lists.

bufs and mrs, if given, must be Vectors. If the form with ctx::Context is called, the memory region(s) of bufs will be registered before calling the mrs form and the resulting MRs will be returned in addition to the WRs and SG lists. If memory regions mrs are given, their lkey values will be used when populating the SG lists so mrs must correspond to bufs. All of the Arrays in bufs must hold the same number of packets (i.e. packet fragments). See the doc string for create_sges for details about the npad parameter.

The Context form also accepts keyword arguments offload and post, both of which default to false. If offload is true, the WRs will be set up to offload the IP checksum calculation to the NIC. You can use hascapability to check whether your device supports this. See the man page of ibv_post_send for more details. If post is true, the Context's QP will be transitioned to the "ready-to-send" (RTS) state and the work requests will be posted.


Receiving functions

These functions are useful when receiving packets.

InfinibandVerbs.create_recv_wrs - Function
create_recv_wrs(ctx::Context, bufs, num_wr[, npad]; post=false) -> recv_wrs, sges, mrs
create_recv_wrs(mrs, bufs, num_wr[, npad]) -> recv_wrs, sges

Return num_wr ibv_recv_wr work requests (WRs) and their associated scatter-gather (SG) lists.

bufs and mrs, if given, must be Vectors. If the form with ctx::Context is called, the memory region(s) of bufs will be registered before calling the mrs form and the resulting MRs will be returned in addition to the WRs and SG lists. If memory regions mrs are given, their lkey values will be used when populating the SG lists so mrs must correspond to bufs. All of the Arrays in bufs must hold the same number of packets (i.e. packet fragments). See the doc string for create_sges for details about the npad parameter.

The Context form also accepts keyword argument post (default false). If post is true, the work requests will be posted and the Context's QP will be transitioned to a "ready-to-receive" (RTR) compatible state.
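
A receive setup built from the functions documented on this page might look like the sketch below. It needs InfiniBand Verbs capable hardware and has not been verified here. The device name, port number, buffer sizes, IP address and UDP port are all placeholders, and the buffer layout (first dimension spanning one packet, second dimension indexing packets) is an assumption based on the description of bufs. Names are qualified with the module name so the sketch does not depend on what the package exports.

```julia
using InfinibandVerbs
using Sockets   # for the ip"..." string macro

# Placeholders: adjust to your device, port and traffic.
dev_name, port_num = "mlx5_0", 1
pkt_bytes, num_pkts, num_wr = 8192, 1024, 256

# Request enough receive resources for num_wr outstanding work requests.
ctx = InfinibandVerbs.Context(dev_name, port_num; max_recv_wr=num_wr, recv_cqe=num_wr)

# One buffer Array; its first dimension is assumed to span one packet.
bufs = [zeros(UInt8, pkt_bytes, num_pkts)]

# Register bufs, build the WRs and SG lists, and post them (post=true).
recv_wrs, sges, mrs = InfinibandVerbs.create_recv_wrs(ctx, bufs, num_wr; post=true)

# Receive only UDP packets addressed to 10.0.1.35, port 50000.
flow = InfinibandVerbs.create_flow(ctx; dip=ip"10.0.1.35", dport=50000)

# ... process packets here, e.g. with repost_loop ...

InfinibandVerbs.destroy_flow(flow)
```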

InfinibandVerbs.create_flow - Function
create_flow(qp, port_num; <kwargs>) -> Ptr{ibv_flow}
create_flow(ctx::Context; <kwargs>) -> Ptr{ibv_flow}

Create a flow rule to specify which packets to receive.

The flow rule will be created for queue pair qp, port number port_num (or ctx.qp and ctx.port_num).

Various attributes and selectors of the flow rule can be specified by keyword arguments as described in the extended help. The returned Ptr{ibv_flow} can be passed to ibv_destroy_flow to remove the flow rule.

Extended help

Attributes

These keyword arguments set general attributes of the flow rule. See the Linux manual page for ibv_create_flow for more details.

  • General flow rule attributes:
    • flow_type one of :normal (default), :all, :mc, or :sniffer
    • priority defaults to 0
    • flow_flags defaults to 0

Selectors

Keyword arguments that are unspecified (or zero) are not used for matching. Integer values will be converted to network byte order as necessary.

  • Ethernet layer selectors
    • dmac::NTuple{6, UInt8}: destination MAC address to match
    • smac::NTuple{6, UInt8}: source MAC address to match
    • ethtype: EtherType value to match (e.g. 0x0800 for IPv4 packets)
    • vlan: VLAN tag to match
  • IPv4 layer selectors (can be UInt32 or IPv4)
    • sip: source IPv4 address (e.g. 0x0a000123 or ip"10.0.1.35")
    • dip: destination IPv4 address
    • proto: IP protocol
    • tos: Type of service (aka DSCP/ECN)
    • ttl: Time to live
    • flags: IP flags
  • TCP/UDP layer selectors
    • dport: destination TCP/UDP port to match
    • sport: source TCP/UDP port to match
    • tcpudp: one of :udp (default) or :tcp. Specifies whether sport/dport should match UDP or TCP ports. To match all UDP or all TCP packets, use the IPv4 layer proto selector instead.
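
The selector value forms described above can be prepared with the Julia standard library alone. In this sketch, parse_mac is a hypothetical helper (not part of InfinibandVerbs.jl) and the MAC address, IP address and port are placeholders; the create_flow call itself is left as a comment because it needs a real Context.

```julia
using Sockets   # standard library; provides IPv4 and the ip"..." string macro

# Hypothetical helper: parse "aa:bb:cc:dd:ee:ff" into the NTuple{6, UInt8}
# form used by the dmac and smac selectors.
function parse_mac(s::AbstractString)
    parts = split(s, ':')
    length(parts) == 6 || throw(ArgumentError("expected 6 octets in $s"))
    return ntuple(i -> parse(UInt8, parts[i]; base=16), 6)
end

dmac = parse_mac("02:00:0a:00:01:23")   # placeholder MAC address

# The two forms of the same IPv4 address given in the sip example.
dip_as_ipv4 = ip"10.0.1.35"
dip_as_uint = 0x0a000123

# Selectors collected as a NamedTuple so they can be splatted as keywords.
# Integers are given in host byte order; the docs say conversion is automatic.
selectors = (dmac=dmac, ethtype=0x0800, dip=dip_as_ipv4, dport=50000, tcpudp=:udp)

# With a real Context `ctx` (not created here) the call would look like:
# flow = InfinibandVerbs.create_flow(ctx; selectors...)
```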

InfinibandVerbs.destroy_flow - Function
destroy_flow(flow) -> nothing

Destroy flow that was created by create_flow. Packets specified by flow will no longer be received from the flow's device and port.


Less frequently used functions

These functions are rarely called directly because the more commonly used functions call them. Some of the more commonly used functions expose parameters that are passed on to these functions, so they are documented here.

InfinibandVerbs.create_sges - Function
create_sges(bufs, lkeys, num_wr[, npad]) -> Matrix{ibv_sge}

Return a Matrix{ibv_sge} of size (length(bufs), num_wr).

Each column of the returned Matrix is an SG list initialized to point to the first num_wr packets of bufs. Both bufs and lkeys must be Vectors of the same length. The elements of bufs are Arrays that must be sized for the same number of packets and num_wr must not exceed this number. The elements of lkeys must be the lkey values corresponding to the registered memory regions for bufs.

The Arrays in bufs may include a number of padding bytes along their first dimension to improve alignment of the packet data. The number of padding bytes may be specified by passing npad. npad may be a single Int to apply the same value to all bufs Arrays or a Vector{Int} to specify a value for each of the bufs Arrays. The default value of npad is 0 (i.e. no padding). The value(s) of npad will be subtracted from the number of bytes in the first dimension of the corresponding bufs Arrays.
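
The sizing rules can be worked through with plain Arrays. The sketch below is one interpretation of the text, not the package's implementation: sge_bytes and num_packets are hypothetical helpers, and it assumes that each Array's first dimension spans one packet (fragment) while the remaining dimensions index packets. The buffer sizes and padding values are made up for illustration.

```julia
# Bytes each SGE would describe: first-dimension bytes minus the padding bytes.
sge_bytes(buf::AbstractArray, npad::Integer) = size(buf, 1) * sizeof(eltype(buf)) - npad

# Packets a buffer is sized for: everything beyond the first dimension.
num_packets(buf::AbstractArray) = length(buf) ÷ size(buf, 1)

# A header buffer with 22 padding bytes per packet and a payload buffer with none.
hdr_buf = zeros(UInt8, 64, 16)
payload_buf = zeros(UInt8, 8192, 16)
bufs = [hdr_buf, payload_buf]
npad = [22, 0]   # one padding value per Array in bufs
num_wr = 8       # must not exceed the number of packets in each Array

lengths = [sge_bytes(b, p) for (b, p) in zip(bufs, npad)]

# Size of the Matrix{ibv_sge} that create_sges is documented to return.
expected_size = (length(bufs), num_wr)
```

Under these assumptions each SG list (one column) has two entries, one per buffer Array, describing 42 and 8192 bytes respectively.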

