Generated from sdp_ti.h with ROBODoc vunknown on Thu Aug 01 16:45:08 2002
TABLE OF CONTENTS
- SDP/Transport Interface Overview
- SDP/Pseudo-code Examples
- TI/Opaque Handles
- TI/net_device_t
- Opaque Handle/ti_handle_t
- Opaque Handle/ti_cq_handle_t
- Opaque Handle/ti_endpoint_handle_t
- Opaque Handle/ti_pd_handle_t
- Opaque Handle/ti_conn_handle_t
- Opaque Handle/ti_mem_handle_t
- Opaque Handle/ti_res_pool_handle_t
- TI/ti_status_t
- TI/ti_err_rec_t
- TI/ti_event_t
- TI/ti_event_cb_t
- TI/ti_addr_fmt_t
- TI/ti_inet_addr_t
- TI/ti_port_t
- TI/ti_ioctl_t
- TI/ti_atomic_op_t
- TI/ti_access_t
- TI/ti_res_pool_type_t
- TI/ti_region_key_t
- TI/ti_rdma_desc_t
- TI/ti_local_data_desc_t
- TI/ti_msg_opt_t
- TI/ti_data_element_t
- TI/ti_conn_request_t
- TI/ti_conn_req_cb_t
- TI/ti_conn_info_t
- TI/ti_conn_attr_t
- TI/ti_pool_response_t
- TI/ti_xfer_cb_t
- TI/ti_context_cb_t
- TI/ti_err_cb_t
- TI/ti_res_pool_cb_t
- TI/ti_comp_info_t
- TI/ti_ops_t
- TI/ti_status_2_str
- TI/events_register
- TI/events_deregister
- TI/create_ti
- TI/destroy_ti
- TI/debug_svc
- TI/create_endpoint
- TI/destroy_endpoint
- TI/create_pd
- TI/destroy_pd
- TI/create_cq
- TI/poll_cq
- TI/rearm_cq
- TI/destroy_cq
- TI/connect
- TI/accept
- TI/reject
- TI/listen
- TI/disconnect
- TI/reg_virt_mem
- TI/reg_phys_mem
- TI/dereg_mem
- TI/res_pool_create
- TI/res_pool_destroy
- TI/res_pool_get
- TI/res_pool_put
- TI/msg_send
- TI/msg_recv
- TI/rdma_read
- TI/rdma_write
- TI/rdma_read_send
- TI/rdma_write_send
- TI/atomic_op
- TI/io_ctl
- TI/ti_transport_desc_t
- TI/ti_inet_route_event_t
- PP/SDP_register_transport
- PP/SDP_deregister_transport
- TI/ti_event_to_str
- TI/ti_status_to_str
- TI/mk_inet_addr_p
- TI/strdup
- TI/SAMPLE_CODE/sample_TI_bind
- TI/SAMPLE_CODE/sample_TI_callback
- SDP/SAMPLE_CODE/sample_sock_init
- SDP/SAMPLE_CODE/sample_bind
- SDP/SAMPLE_CODE/sample_connect
- SDP/SAMPLE_CODE/sample_listen
- SDP/SAMPLE_CODE/sample_connect_request_callback
- SDP/SAMPLE_CODE/sample_accept
- SDP/SAMPLE_CODE/sample_sendmsg
- SDP/SAMPLE_CODE/sample_msg_send_complete
- SDP/SAMPLE_CODE/sample_msg_recv
- SDP/SAMPLE_CODE/sample_msg_recv_complete
- SDP/SAMPLE_CODE/sample_disconnect
NAME
Sockets Direct Protocol -- Transport Interface Overview
COPYRIGHT
Intel Corporation 2002
PURPOSE
Define data types and APIs for the SDP Transport Interface (TI).
HISTORY
NOTES
DESCRIPTION
The Sockets Direct Protocol (SDP) Transport Interface (TI) is designed to
abstract reliable hardware transport interface specifics in such a fashion
that an INET aware protocol module, Sockets Direct Protocol (SDP) in our
case, can interface to a consistent transport API over a potentially wide
range of differing reliable hardware transports. The initial reliable
hardware transport supported is InfiniBand, accessed via the software
layer known as the InfiniBand Access Layer (AL). The combination of AL
and TI interfaces is known as the module IBT.
Each transport provider (TP) has at least one Linux network link device
associated with it, represented by a Linux net_device struct. Once the TI
initializes and registers with SDP, the TI will provide link device
configuration event notifications to both SDP and the InfiniBand transport
(IBT) layers. In this fashion, Internet link device configuration changes
(IP address assignment, address change or device shutdown) events will
be propagated to all interested agents (SDP & IB transport layers).
The TI defines the Internet Protocol Address as its base level transport
endpoint addressing mechanism. SDP will establish connections, send and
receive datagrams based on the IPv4 address format. TPs are expected to convert
an IP address into the transport/link specific addressing format required
for data transmission or reception.
Protocol provider modules, SDP in our case, are expected to export an
interface to register and de-register the TI layer. This registration
process enables a protocol provider module to discover the capabilities
of a TP (Transport Provider) via a TI operations vector.
See the Pseudo-code Examples for operational details.
VARIABLE PREFIX NAMING CONVENTIONS
'p_' indicates a pointer.
'h_' indicates a pointer to an opaque handle. All handles are pointers.
'ph_' indicates a pointer to a handle; '*ph_' is a '**' construct.
'pfn_' indicates a pointer to a function.
NAME
Sockets Direct Protocol (Transport Interface) pseudo-code examples
PURPOSE
The SDP-TI pseudo-code examples are part of an example SD Protocol
implementation within the Linux kernel. Each example demonstrates how
the TI (Transport Interface) is utilized by SDP to implement user-mode
Linux socket semantics.
DESCRIPTION
A client-side connection scenario would invoke the following:
sample_sock_init
sample_connect
sample_connect_request_callback
sample_sendmsg
sample_msg_send_complete
sample_disconnect
A server-side connection scenario would invoke the following:
sample_sock_init
sample_listen
sample_connect_request_callback
sample_msg_recv
sample_msg_recv_complete
sample_disconnect
Protocol Provider Exported Interfaces
SDP_register_transport
SDP_deregister_transport
Exported Transport Interfaces - file: ti_exports.h
These TI interfaces are provided for the convenience of those who are
developing Transport Provider modules.
ti_event_to_str
ti_status_to_str
mk_inet_addr_p
strdup
NAME
Opaque Handles -- Transport Interface opaque handles
PURPOSE
Opaque handles are transport defined and allocated resource pointers.
TI clients are not required to understand the internal structure of these
resources, hence 'opaque'.
net_device_t, ti_handle_t, ti_cq_handle_t
ti_endpoint_handle_t, ti_pd_handle_t, ti_conn_handle_t, ti_mem_win_handle_t
ti_mem_handle_t, ti_res_pool_handle_t
NAME
net_device_t
DESCRIPTION
Shorthand name for the standard Linux network device structure pointer
'struct net_device'.
SYNOPSIS
typedef struct net_device net_device_t;
NAME
ti_handle_t -- Transport interface handle
DESCRIPTION
An opaque handle that identifies a TI service handle. A transport
interface service handle is created [create_ti()] by SDP after the
InfiniBand transport has registered with SDP. The destroy_ti() routine
will release all resources created by the create_ti call.
The transport interface service handle usage model is envisioned to be a
single handle per transport. With a single transport service handle,
multiple simultaneous connections, multiple connections over time and
multiple resource pools may all be created using a single transport service
handle. In most cases, only a single transport handle is required.
SYNOPSIS
typedef struct _ti_handle *ti_handle_t;
NAME
ti_cq_handle_t -- Completion Queue handle
DESCRIPTION
Opaque handle that identifies a completion queue. Completion Queues (CQs)
are allocated [create_cq], destroyed [destroy_cq], polled [poll_cq] and
rearmed for operation [rearm_cq] by SDP. CQs are never directly accessed
at the SDP layer. TI management routines extract raw completion queue
information into resource pool data elements [ti_data_element_t] which are
then returned in the SDP specified completion callback routine or polled
for, see [ti_xfer_cb_t, poll_cq].
SYNOPSIS
typedef struct _ti_cq_handle *ti_cq_handle_t;
SEE ALSO
create_cq, destroy_cq, poll_cq, rearm_cq, ti_xfer_cb_t, ti_conn_attr_t
NAME
ti_endpoint_handle_t - Transport Endpoint handle
DESCRIPTION
Identifies a local or remote transport endpoint. Endpoints are managed
by the [create_endpoint() & destroy_endpoint()] calls.
SEE ALSO
ti_conn_info_t, connect, listen, create_pd
SYNOPSIS
typedef struct _ti_endpoint_handle *ti_endpoint_handle_t;
NAME
ti_pd_handle_t -- Protection Domain handle
DESCRIPTION
Protection domains are created [create_pd] to limit the scope of transport
operations.
SEE ALSO
create_endpoint, ti_conn_info_t, connect, listen, reg_virt_mem
SYNOPSIS
typedef struct _ti_pd_handle *ti_pd_handle_t;
NAME
ti_conn_handle_t -- connection handle
DESCRIPTION
An opaque connection handle that uniquely identifies an end-to-end reliable
connection. A connection handle is created as an output parameter of a
connect or listen call. A connection handle is used in data transfer
requests after the connection has become 'established'. Connection
establishment occurs after an accept() call has been invoked.
SEE ALSO
connect, listen, accept, reject, disconnect
msg_send, msg_recv, rdma_read, rdma_write, rdma_read_send,
rdma_write_send, atomic_op
SYNOPSIS
typedef struct _ti_conn_handle *ti_conn_handle_t;
NAME
ti_mem_handle_t -- Registered Memory Region handle
DESCRIPTION
Handle that identifies a registered memory region. Data buffer memory must
be registered with the transport before a send or receive operation can
complete successfully.
SEE ALSO
reg_virt_mem, reg_phys_mem
SYNOPSIS
typedef struct _ti_mem_handle *ti_mem_handle_t;
NAME
ti_res_pool_handle_t -- Resource pool handle
DESCRIPTION
Opaque handle that identifies a resource pool. Utilized to get and put
resource data elements from the pool.
SEE ALSO
res_pool_create, res_pool_get, res_pool_put, res_pool_destroy
msg_send, msg_recv
SYNOPSIS
typedef struct _ti_res_pool_handle *ti_res_pool_handle_t;
NAME
ti_status_t -- Transport Interface status code definitions
DESCRIPTION
TI routines either return void or return one of the status codes listed
here. TI_SUCCESS indicates success, all others indicate some type of
failure.
typedef enum
{
/*
* Generic Errors
*/
TI_NO_STATUS = -1,
TI_SUCCESS = 0,
TI_PENDING,
TI_INSUFFICIENT_RESOURCES,
TI_INSUFFICIENT_MEMORY,
TI_INVALID_PARAMETER,
TI_INVALID_PERMISSION,
TI_INVALID_ADDRESS,
TI_INVALID_HANDLE,
TI_INVALID_SETTING,
TI_INVALID_OPERATION,
TI_UNSUPPORTED,
TI_RESOURCE_BUSY,
TI_REJECTED,
TI_CANCELED,
TI_TIMEOUT,
TI_CQ_ERROR,
TI_CQ_OVERRUN_ERROR,
TI_NOT_FOUND,
TI_ERROR
} ti_status_t;
NAME
ti_err_rec_t
DESCRIPTION
Information returned to a user when an error occurs on an allocated
resource.
SYNOPSIS
typedef struct _ti_err_rec
{
ti_status_t code;
void *context;
} ti_err_rec_t;
FIELDS
code
An error code that identifies the type of error being reported.
context
client context information associated with the resource on which
the error occurred.
SEE ALSO
ti_status_t, ti_err_cb_t
NAME
ti_event_t -- enumerate TI events
DESCRIPTION
Linux events are propagated to Transports and registered Protocols (SDP).
Linux events (network device, system and IP routing) all have overlapping
namespaces. The TI is responsible for remapping the Linux-specific event
names into a single, unique TI event namespace. This remapping works because
the Linux-specific events arrive on different notification chains, so the
TI can delineate the namespaces. What a pain-in-the-backside.
SYNOPSIS
typedef enum {
TI_EVENT_NONE = 0,
// Network device - linux/notifier.h
TI_EVENT_NETDEV_ONLINE,
TI_EVENT_NETDEV_UP,
TI_EVENT_NETDEV_DOWN,
TI_EVENT_NETDEV_REBOOT, // Tell a protocol stack a network interface
// detected a hardware crash and restarted - we can
// use this e.g. to kick tcp sessions once done.
TI_EVENT_NETDEV_CHANGE, // Notification of a link device state change
TI_EVENT_NETDEV_REGISTER,
TI_EVENT_NETDEV_UNREGISTER,
TI_EVENT_NETDEV_CHANGEMTU,
TI_EVENT_NETDEV_CHANGEADDR,
TI_EVENT_NETDEV_GOING_DOWN,
TI_EVENT_NETDEV_CHANGENAME,
// system events - linux/notifier.h
TI_EVENT_SYS_DOWN,
TI_EVENT_SYS_RESTART,
TI_EVENT_SYS_HALT,
TI_EVENT_SYS_POWER_OFF,
// IPv4 route events - linux/route.h
TI_EVENT_RTF_UP, // route usable
TI_EVENT_RTF_GATEWAY, // destination is a gateway
TI_EVENT_RTF_HOST, // host entry (net otherwise)
TI_EVENT_RTF_REINSTATE, // reinstate route after timeout
TI_EVENT_RTF_DYNAMIC, // created dynamically (by redirect )
TI_EVENT_RTF_MODIFIED, // modified dynamically (by redirect )
TI_EVENT_RTF_MTU, // specific MTU for the route
TI_EVENT_RTF_MSS, // compatibility
TI_EVENT_RTF_WINDOW, // per route window clamping
TI_EVENT_RTF_IRTT, // Initial Round trip time
TI_EVENT_RTF_REJECT, // Reject route
// IPv6 route events - linux/ipv6_route.h
TI_EVENT_RTF_DEFAULT, // default - learned via ND
TI_EVENT_RTF_ALLONLINK, // fallback, no routes on link
TI_EVENT_RTF_ADDRCONF, // addrconf route - RA
TI_EVENT_RTF_NONEXTHOP, // route with no nexthop
TI_EVENT_RTF_EXPIRES, //
TI_EVENT_RTF_CACHE, // cached entry
TI_EVENT_RTF_FLOW, // flow significant route
TI_EVENT_RTF_POLICY, // policy route
TI_EVENT_RTF_LOCAL,
TI_EVENT_RTMSG_NEWDEVICE,
TI_EVENT_RTMSG_DELDEVICE,
TI_EVENT_RTMSG_NEWROUTE,
TI_EVENT_RTMSG_DELROUTE,
// TI (Transport Interface) Events
TI_EVENT_TI_SHUTDOWN, // data == char* transport_name
TI_EVENT_PROTO_SHUTDOWN // data == char* SDP shutdown reason
} ti_event_t;
SEE ALSO
events_register, events_deregister, ti_event_cb_t
#include <linux/notifier.h>
#include <net/route.h>
NAME
ti_event_cb_t -- transport event notification callback
DESCRIPTION
Upon return of the events_register() or tp_enables_events() calls,
network transport event notifications are enabled and will continue to be
delivered via the event callback routine until the events_deregister() routine
is called. When the Event_callback routine is invoked by the TI, a
transport event has occurred; only the 'event' argument is always defined.
The '*p_data' input parameter is defined in the context of the 'event'
input parameter value.
Transport events encompass:
network link device events
link device shutdown, startup
a change in a link's IP address assignment, including broadcast and
netmask changes.
system shutdown events
changes to the IP routing database (via RIP or /sbin/route).
SYNOPSIS
typedef void (*ti_event_cb_t)( IN const ti_event_t event,
IN const void *p_data,
IN const void *p_context );
PARAMETERS
event
TI event code, see [ti_event_t].
*p_data
Event data pointer. Format is dependent on the event code.
See ti_transport_desc_t description.
*p_context
User defined context pointer which was originally supplied in the
events_register call.
NOTES
Event processing contexts:
Event 'TI_EVENT_NETDEV_ONLINE'
Link device interface is considered UP and able to send/recv packets.
The controlling transport state is not initialized.
callback argument definitions:
'data' argument points to an (ti_transport_desc_t) that describes
transport attributes:
transport_caps
transport supported capabilities
TI_operations_vector
defines transport operation entry points
net_device structure pointer
will be valid on each callback until the next
TI_EVENT_NETDEV_DOWN event for this device instance.
IP_net_addr
pointer to the assigned (ala ifconfig) IP address.
IP_netmask
pointer to the IP network mask for the IP subnet.
first_hop_IP_route_addr
pointer to the first-hop router's IP address.
Event 'TI_EVENT_NETDEV_GOING_DOWN'
Heads up: the link interface instance is shutting down.
The link device interface instance is going down; shortly it will
be unable to send/recv packets. IP address assignment is still valid.
Event_callback argument status:
'data' represents a (net_device_t structure *).
*IP_addr, *IP_netmask & *first_hop_IP_route_addr args are all valid.
Event 'TI_EVENT_NETDEV_DOWN'
The Link interface is gone...
'data' is NULL; the device is history.
Event 'TI_EVENT_NETDEV_CHANGE'
'data' represents a (net_device_t structure *).
Link device has changed its configuration
Event 'TI_EVENT_NETDEV_CHANGENAME'
'data' represents a (net_device_t structure *).
Link device has changed its name
Event 'TI_EVENT_NETDEV_CHANGEMTU'
'data' represents a (net_device_t structure *).
Link device has changed its MTU
Event 'TI_EVENT_NETDEV_REGISTER'
'data' represents a (net_device_t structure *).
Link device 'data' has registered with the network device subsystem.
Device not online yet.
Event 'TI_EVENT_NETDEV_UNREGISTER'
'data' represents a (net_device_t structure *).
Link device 'data' has unregistered with the network device subsystem.
Device no longer available.
Event 'TI_EVENT_RTF_UP'
'data' represents an IP address (ti_inet_addr_t *)
route is usable
Event 'TI_EVENT_RTF_GATEWAY'
'data' represents an IP address (ti_inet_addr_t *)
address is a gateway
SEE ALSO
events_register, events_deregister
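EXAMPLE
A minimal sketch of an event callback a protocol module might supply to
events_register(); the 'sample_' helper functions are hypothetical and not
part of this header.
  static void
  sample_event_cb( const ti_event_t event,
                   const void *p_data,
                   const void *p_context )
  {
      switch ( event ) {
      case TI_EVENT_NETDEV_ONLINE:
          /* p_data points at a ti_transport_desc_t: capture the ops
           * vector, IP address, netmask and first-hop router address. */
          sample_save_transport_desc( (ti_transport_desc_t *)p_data,
                                      (void *)p_context );
          break;
      case TI_EVENT_NETDEV_DOWN:
          /* p_data is NULL; the link device instance is gone. */
          sample_link_down( (void *)p_context );
          break;
      default:
          break;    /* ignore events this module does not care about */
      }
  }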
NAME
ti_addr_fmt_t -- IP address format identifier
DESCRIPTION
Identify types of Internet Protocol (IP) address formats.
SYNOPSIS
typedef enum
{
TI_ADDR_IPV4,
TI_ADDR_IPV6
} ti_addr_fmt_t;
NAME
ti_inet_addr_t -- Internet Protocol address: IPv4 or IPv6 format
DESCRIPTION
Define an Internet Protocol (IP) address structure. An address may be an
IP version 4 or an IP version 6 format. IP type is identified by the
'IP_addr_type' field.
SYNOPSIS
typedef struct _ti_addr_ {
union {
struct in6_addr in6;
struct in_addr in4;
} inet_u;
#define ipv4_addr inet_u.in4.s_addr
#define ipv6_saddr32 inet_u.in6.s6_addr32
ti_addr_fmt_t IP_addr_type; // IPv4 or IPv6 ?
} ti_inet_addr_t;
#define SET_IPV4_ADDR(w,h) w.ipv4_addr = (h); w.IP_addr_type=TI_ADDR_IPV4
FIELDS
Union inet_u
inet_u.in6 - IPv6 address structure
inet_u.in4 - IPv4 address structure
Union-end
IP_addr_type - Identifies which type of IP address: V4 or V6.
SEE ALSO
ti_addr_fmt_t
#include <linux/in.h>
#include <linux/in6.h>
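EXAMPLE
A minimal sketch of filling an IPv4 ti_inet_addr_t with the SET_IPV4_ADDR
helper macro; the address value is arbitrary and the byte ordering shown
(host order) follows the create_endpoint() parameter description.
  ti_inet_addr_t addr;

  /* 192.168.1.1 expressed in host order. */
  SET_IPV4_ADDR( addr, 0xC0A80101 );

  if ( addr.IP_addr_type == TI_ADDR_IPV4 )
      printk( "IPv4 address 0x%x\n", addr.ipv4_addr );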
NAME
ti_port_t -- Protocol port space identifier
DESCRIPTION
Unique Protocol port space identifier.
SEE ALSO
create_endpoint, ti_endpoint_handle_t
SYNOPSIS
typedef uint16_t ti_port_t;
NAME
ti_ioctl_t -- Transport I/O control commands
DESCRIPTION
I/O control commands defined by the SDP Transport.
SYNOPSIS
typedef enum
{
TI_GET_CONN_PROFILE,
} ti_ioctl_t;
SEE ALSO
io_ctl
NAME
ti_atomic_op_t -- Remote Atomic operations
PURPOSE
SDP Transport supported remote atomic operations
DESCRIPTION
Compare and swap two remote 32-bit values
Fetch and add a single remote 32-bit value
SYNOPSIS
typedef enum
{
TI_ATOMIC_COMPARE_SWAP,
TI_ATOMIC_FETCH_ADD
} ti_atomic_op_t;
SEE ALSO
atomic_op
NAME
ti_access_t -- Transport registered memory access permissions.
DESCRIPTION
Define types of IB registered memory access permissions.
SYNOPSIS
typedef enum
{
TI_MEM_READ=0,
TI_MEM_WRITE,
TI_MEM_READ_WRITE
} ti_access_t;
SEE ALSO
reg_virt_mem, reg_phys_mem
NAME
ti_res_pool_type_t -- classes of resource pools
PURPOSE
Identify resource pool data_elements as to what type of pool they were
allocated in. Also utilized by the data transfer completion callback
routine (ti_xfer_cb_t) to identify what type of data transfer completed
(RDMA or Buffered).
SYNOPSIS
typedef enum
{
TI_RES_POOL_MSG,
TI_RES_POOL_RDMA,
} ti_res_pool_type_t;
SEE ALSO
ti_xfer_cb_t, res_pool_create, msg_send, rdma_write
NAME
ti_region_key_t -- Memory region access key
DESCRIPTION
A memory region key that grants specific memory access rights.
SYNOPSIS
typedef uint32_t ti_region_key_t;
SEE ALSO
ti_rdma_desc_t, reg_virt_mem, reg_phys_mem
NAME
ti_rdma_desc_t -- Remote Direct Memory Access descriptor
DESCRIPTION
An RDMA descriptor contains the Virtual Address of the RDMA accessible
memory plus the Region key which grants RDMA hardware access. In order to
support newer processor architectures, the Virtual Address is 64-bits in
length.
SYNOPSIS
typedef struct _ti_rdma_desc
{
uint64_t vaddr;
ti_region_key_t rkey;
} ti_rdma_desc_t;
FIELDS
vaddr
Virtual Address of RDMA accessible memory
rkey
Region key that grants RDMA memory access permissions
SEE ALSO
reg_virt_mem, create_pd
NAME
ti_local_data_desc_t -- local data descriptor
DESCRIPTION
Each data element [ti_data_element_t] contains a linked list of data buffer
structures [<NULL> link-pointer terminated list]. These data buffers
describe local transport registered memory from which data is sent or
into which data is received.
SYNOPSIS
typedef struct _local_data_desc
{
void *vaddr;
ti_region_key_t lkey;
uint64_t byte_count;
} ti_local_data_desc_t;
FIELDS
*vaddr
Virtual Address of the transport registered memory data buffer. The range
(vaddr ... vaddr + (byte_count-1)) must be TI registered in order
for send/recv to complete successfully.
lkey
local memory region key associated with memory 'vaddr' points to.
byte_count
Send/write - number of bytes to send from this data buffer
Recv - max size of the buffer pointed to by 'vaddr'.
SEE ALSO
ti_data_element_t, reg_virt_mem, msg_send, msg_recv
NAME
ti_msg_opt_t
DESCRIPTION
Indicates the attributes of a data element.
SYNOPSIS
typedef uint32_t ti_msg_opt_t;
#define TI_OPT_IMMEDIATE 0x00000001
#define TI_OPT_FENCE 0x00000002
#define TI_OPT_SIGNALED 0x00000004
#define TI_OPT_SOLICITED 0x00000008
VALUES
TI_OPT_SIGNALED
Indicates that message completion will generate an event
if CQ is armed.
TI_OPT_SOLICITED
Indicates that a solicited event will be set for this send message.
Expected to cause a message completion event on the remote endpoint,
if the remote receive CQ is armed.
TI_OPT_FENCE
The send operation is fenced. Complete all pending send operations
before processing this request. Typically used to fence a send operation
behind a DMA operation to avoid a send message completion indication
passing the DMA operation that it is referencing.
TI_OPT_IMMEDIATE
Send immediate data with the given request.
SEE ALSO
ti_data_element_t
NAME
ti_data_element_t -- data element structure
DESCRIPTION
The data element structure describes message data that is referenced in
message send and receive operations. A data element references the transport
registered data buffers accessed during the data transfer. Data buffers
MUST be transport registered in order to achieve a successful transfer.
SYNOPSIS
typedef struct _ti_data_element
{
struct _ti_data_element *p_next_msg;
uint32_t num_io_bufs;
union {
ti_local_data_desc_t *ldata;
ti_rdma_desc_t *rdata;
} io_buffer_array;
#define local_data_desc io_buffer_array.ldata
#define rdma_data_desc io_buffer_array.rdata
ti_res_pool_type_t pool_type;
void *p_context;
// Message options, SEND and RDMA Write
ti_msg_opt_t msg_opt;
// Out-Of-Band Data & length.
uint32_t oob_data_len;
uint32_t oob_data;
// Completion information
ti_status_t status;
uint32_t recv_byte_count;
uint32_t ud_offset;
boolean_t iodone;
// opaque transport data - do not touch!
void *otd;
} ti_data_element_t;
FIELDS
p_next_msg
The library employs a list of elements to optimize requesting and
completing data transfers. Elements in a list are linked together, with
a NULL-terminated pointer. This parameter references the next element
in a chain of elements or NULL to signify the end of the list.
num_io_bufs
Number of io buffer descriptors in the io_buffer_array.
io_buffer_array
A pointer to an array of IO buffer descriptors: either local data or
RDMA data descriptors. Field 'pool_type' indicates which type of IO
buffers.
pool_type
Which type of resource pool is this element from: RDMA or Buffered.
Also indicates pointer type of the elements of the io_buffer_array.
p_context
A reference to a user-controlled data buffer provided to store context
information with a message, datagram, or RDMA. The library ignores this
pointer and any memory location that it references throughout the life
of the element. Users may store context information directly into this
pointer or can request that the library set it to a user-controlled
memory region. By default, this pointer or the area it references
contains all zeros.
msg_opt
attributes of message transfer. TI_OPT_SOLICITED event for remote
completion notification and TI_OPT_SIGNALED for local notification.
oob_data_len
Number of bytes of Out-Of-Band (OOB) data represented by 'oob_data'.
Current OOB data size is limited to the 4 bytes represented by the
'oob_data' field.
oob_data
The OOB data bytes to send with an outbound transfer or any OOB data
received from an incoming transfer. OOB data is sent before any data
elements if 'oob_data_len' is > 0. Consider oob_data as immediate data
if your transport supports immediate data.
status
contains the operation result of a data transfer operation. Status
values are listed in the [ti_status_t] discussion.
recv_byte_count
Indicates the total number of bytes received into all the data buffers
referenced by the io_buffer_array. For example, this field would be set to
100 if a 100-byte message were received from a remote node. If each
data buffer in the io_buffer_array was sized at 25 bytes, then the total
100 bytes would span 4 data buffers of this element. Received data fills
the referenced buffers in the order they appear in the array.
This parameter is valid only when processing completed receives.
ud_offset
user-defined offset, not utilized in TI processing and included
solely as a convenience to the user. One envisioned use would be to
record how much of this data element has been consumed by a higher level
protocol.
iodone
user-defined boolean, not utilized in TI processing and included
solely as a convenience to the user. One envisioned use would be to
use this field as the IO done condition field.
*otd
Opaque transport data, allocated when a resource pool is created.
Data is not used or modified by TI clients. The implication is that data
elements can only be created by the res_pool_create call.
SEE ALSO
msg_send, msg_recv, rdma_write, rdma_read, ti_status_t, res_pool_create
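EXAMPLE
A minimal sketch of preparing a single data element, previously obtained
from a message resource pool, for a signaled send. The 'p_elem', 'p_buf',
'buf_lkey', 'len', 'h_conn', 'p_ops' and 'my_sock' names are assumptions of
this sketch, as is the idea that the client fills in the first local data
descriptor itself (a pool may instead pre-populate its descriptors).
  /* One registered buffer holding 'len' bytes to transmit. */
  p_elem->p_next_msg  = NULL;                 /* single-element list      */
  p_elem->num_io_bufs = 1;
  p_elem->local_data_desc[0].vaddr      = p_buf;      /* TI registered    */
  p_elem->local_data_desc[0].lkey       = buf_lkey;
  p_elem->local_data_desc[0].byte_count = len;
  p_elem->msg_opt      = TI_OPT_SIGNALED;     /* local completion event   */
  p_elem->oob_data_len = 0;                   /* no out-of-band data      */
  p_elem->p_context    = my_sock;             /* hypothetical client state */

  status = p_ops->msg_send( h_conn, p_elem );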
NAME
ti_conn_request_t -- connection request information
DESCRIPTION
Provides information about a connection request to a remote endpoint or from
a remote endpoint. A ti_conn_request_t structure pointer is the sole input
parameter to the connection request callback routine [ti_conn_req_cb_t].
The callback routine is required to accept(), reject() or save the
ti_conn_request_t information for later accept or reject processing.
The ti_conn_request_t information must be copied for later processing as
the ti_conn_request_t struct is only valid for the duration of the connection
request callback routine.
SYNOPSIS
typedef struct _ti_conn_request
{
ti_conn_handle_t conn_handle;
ti_status_t status;
uint16_t caller; // 0 == connect(), 1 == listen()
uint16_t cb_type; // CN_{REQUEST,REPLY,REJECT,MRA}
void *context; // from conn_info->context
uint32_t private_data_len;
void *private_data;
uint16_t private_data_offset;
uint16_t private_data_offset_len;
} ti_conn_request_t;
#define CN_REQUEST 0x3232 // connection request message
#define CN_REPLY 0x3233 // connection request 'reply' message
#define CN_REJECT 0x3234 // connection rejection message
#define CN_MRA 0x3235 // message received acknowledgement
FIELDS
cb_type
Type of connection request callback:
CN_REQUEST - connection request message
CN_REPLY - connection request 'reply' message
CN_REJECT - connection rejection message
CN_MRA - message received acknowledgement
status
The status of the connection request attempt. If TI_SUCCESS upon entry
to the connection request callback routine, then the connection must be
either accept()'ed or reject()'ed to complete the connection request
cycle. The accept or reject can occur within the callback routine or
later, provided the ti_conn_request_t information is saved as the
ti_conn_request_t struct is only valid for the duration of the connection
request callback routine.
conn_handle
opaque connection handle, setup by the transport's connection callback
handler routine. If the connection request is in response to a connect()
call, 'conn_handle' is the same connection handle as was returned from
the connect() call. If the connection callback routine was invoked in
response to a listen() call, 'conn_handle' is a new connection handle
associated with this listen connection request. If the connection
request is accept()'ed, then 'conn_handle', from the ti_conn_request_t
is used in subsequent data transfer requests, not the connection handle
returned by the listen() call. If the connection request is rejected,
then the ti_conn_request_t 'conn_handle' will be invalidated by the
reject() call.
*context
client-context information that was specified when initiating the
connection request.
private_data_len
Length of private data in bytes.
*private_data
consumer supplied private data transferred as part of the reject
message.
private_data_offset
Number of bytes offset from 'private_data', where additional protocol
specific data starts. Zero for none.
private_data_offset_len
Length, in bytes, of additional protocol specific data starting at
(private_data+private_data_offset). Zero for none.
SEE ALSO
ti_conn_info_t, accept, reject, ti_conn_req_cb_t, listen, connect
NAME
ti_conn_req_cb_t -- connection request notification callback
DESCRIPTION
A connect or listen call specifies a Connection Request (CR) callback
routine via the ti_conn_info_t* argument. The CR callback routine is
invoked, in a system thread context, when a connection request attempt
arrives from a remote endpoint. The connection request handler can complete
the connection request cycle by either accepting or rejecting the connection
request. The connection request can also be deferred by simply providing a
connection request notification to an external agent. The connection
request cycle must be completed [accept(), reject()] within the
ti_conn_attr_t* specified timeout value, otherwise the connection will never
become established and a late accept() or reject() will fail with TI_TIMEOUT.
If the accept() or reject() connection request processing is to be deferred,
then the ti_conn_request_t structure must be copied to stable storage as
the structure pointed to by the input parameter is only valid for the
duration of the callback routine.
SYNOPSIS
typedef void ( *ti_conn_req_cb_t ) (ti_conn_request_t *p_conn_req);
PARAMETERS
p_conn_req
[in] Connection request information is returned to the connection
request callback routine in the form of a '*p_conn_req' structure
pointer. The ti_conn_request_t struct includes request status, private
data and the caller-supplied context. The memory this pointer references
is only valid for the duration of the callback routine. If the
connection request processing is deferred, then the *p_conn_req contents
must be copied to another ti_conn_request_t struct in stable storage.
SEE ALSO
reject, accept, ti_conn_request_t, connect, listen, ti_conn_info_t,
ti_conn_attr_t
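EXAMPLE
A minimal sketch of a connection request callback that defers the accept to
a worker; the copy helper and worker queue are hypothetical. Note that any
private_data would also need to be copied before the callback returns.
  static void
  sample_conn_req_cb( ti_conn_request_t *p_conn_req )
  {
      ti_conn_request_t *p_copy;

      if ( p_conn_req->status != TI_SUCCESS ) {
          sample_conn_failed( p_conn_req );  /* e.g. TI_TIMEOUT, TI_REJECTED */
          return;
      }
      /* The request struct is only valid for the duration of this
       * callback, so copy it before deferring to a worker thread. */
      p_copy = kmalloc( sizeof( *p_copy ), GFP_KERNEL );
      if ( !p_copy )
          return;
      *p_copy = *p_conn_req;
      sample_queue_for_accept( p_copy );     /* later calls accept()/reject() */
  }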
NAME
ti_conn_info_t -- connection setup information structure
DESCRIPTION
Specifies connection attributes describing local and remote endpoints
provided by the transport layer. Consumers provide this information during
a connect or listen request. Endpoint information is derived from IP
address and port pairs via the create_endpoint() call. See connect() and
listen() services. The Protection Domain handle is returned from
create_pd() call.
SYNOPSIS
typedef struct _ti_conn_info
{
ti_endpoint_handle_t src_ep;
ti_endpoint_handle_t dst_ep;
ti_pd_handle_t pdh;
ti_cq_handle_t send_cqh;
ti_cq_handle_t recv_cqh;
void *private_data;
uint32_t private_data_len;
uint16_t private_data_offset;
uint16_t private_data_offset_len;
ti_conn_req_cb_t conn_request_cb;
void *context;
} ti_conn_info_t;
FIELDS
src_ep
handle to the transport's local endpoint
dst_ep
handle to the transport's remote endpoint
pdh
handle to the protection domain for the connection
send_cqh
Completion Queue (handle) used for send completions.
recv_cqh
Completion Queue (handle) used for receive completions.
*private_data
private data sent with req/reply
private_data_len
length in bytes of *private_data (user connect data); can be zero.
private_data_offset
Number of bytes offset from 'private_data' where additional protocol
specific information can be located. Zero for none.
private_data_offset_len
Length, in bytes, starting at 'private_data_offset' of additional
protocol specific data. Zero for none.
conn_request_cb
Connection request notification callback routine
context
context pointer provided by consumer, returned in the connection
request callback routine as a member of ti_conn_request_t structure.
SEE ALSO
connect, listen, ti_conn_req_cb_t, ti_conn_request_t
NAME
ti_conn_attr_t -- reliable data connection attributes
DESCRIPTION
Specifies attributes for a newly created connection and the local
work queues associated with this connection. Consumers specify maximum
sizes for transfers and receive messages so the transport can properly
size the transfer and receive work queues for the hardware interface.
SYNOPSIS
typedef struct _ti_conn_attr
{
uint32_t connect_request_timeout;
uint32_t connect_request_retries;
uint32_t max_transfer_depth;
uint32_t max_transfer_buffers;
uint32_t max_transfer_size;
uint32_t max_recv_depth;
uint32_t max_recv_buffers;
uint32_t max_recv_size;
uint32_t max_rdma_depth;
uint32_t max_rdma_size;
uint32_t max_send_retries;
boolean_t reliable;
boolean_t msg_flow_control;
boolean_t auto_path_migration;
boolean_t auto_cq_rearm;
} ti_conn_attr_t;
FIELDS
connect_request_timeout
Specified in milliseconds, the elapsed time before a connection request
is considered to have timed-out. TI_TIMEOUT will be returned as the
connection request status [ti_conn_request_t] in the connection request
callback routine. Accept() or reject() can also return TI_TIMEOUT if
the timeout expired after the return of the connection request callback
routine and before the accept or reject call.
A value of (0xFFFFFFFF) indicates the connect request should never
timeout.
connect_request_retries
Maximum number of connect_request_timeout periods allowed before a
connect request is declared timed out.
See 'connect_request_timeout'.
max_transfer_depth
The maximum number of outstanding outbound operations that can be active
on the connection. This includes all message send, RDMA read, and RDMA
write operations. Typically, this value corresponds with the
element_count parameter specified through a resource pool creation call
[res_pool_create].
max_transfer_buffers
The maximum number of data buffers allocated for any data element posted
as an outbound data transfer. This includes message send, RDMA read,
and RDMA write operations. Typically, this value corresponds with the
largest buffers_per_element parameter specified through a resource pool
creation call.
max_transfer_size
The maximum number of message bytes which will be transmitted.
max_recv_depth
The maximum number of outstanding operations posted to receive incoming
data on the connection. Typically, this value corresponds with the
buffers_per_element parameter specified through the res_pool_create()
call.
max_recv_buffers
The maximum number of buffers allocated for any element posted as an
inbound data transfer. Typically, this value corresponds with the
largest bufs_per_element parameter specified through the
res_pool_create() call.
max_recv_size
The maximum number of bytes a single receive operation can accept into
a single buffer.
max_rdma_depth
Maximum number of outstanding RDMA requests.
max_rdma_size
Maximum size in bytes of a single RDMA operation. This value may be
zero if the connection does not support RDMA requests.
max_send_retries
Maximum times a send is retried before an error is declared.
reliable
If set to TRUE, signals that the connection will use hardware
implemented reliable data transfers for reliable communication. If set
to FALSE, the connection will use unreliable hardware data transfers.
msg_flow_control
Indicates whether the connection uses the hardware based flow
control mechanism. Flow control guarantees that a buffer is available
on the remote side of a connection to receive incoming data. If set to
FALSE, the user of the connection is responsible for flow control
activities. Flow control requires a reliable connection, inbound
message, and outbound messaging or RDMA write operations.
auto_path_migration
For connections and links supporting APM (Automatic Path Migration).
The transport will determine the multiple best possible paths (least
overlap across networks) between the endpoints and manage the fail-over
transparently.
auto_cq_rearm
Automatic CQ (Completion Queue) rearm:
TRUE
Implies the transport interface will rearm the CQ and poll before
calling the data transfer callback routine [ti_xfer_cb_t].
FALSE
Implies the transport interface will NOT rearm the CQ before calling
the data transfer callback routine. Therefore the data transfer
callback routine will receive a <null> data element list as its
input parameter. Additionally, it is the responsibility of the data
transfer callback routine [ti_xfer_cb_t] to rearm the CQ and poll
for completed data transfers.
NOTES:
Do we want to expose VL configurations? For InfiniBand,
the SM will automatically provide the proper SL (based on some congestion
management service) with the pathrecord when establishing connections.
SEE ALSO
connect, listen, ti_xfer_cb_t, ti_conn_attr_t, rearm_cq, poll_cq
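EXAMPLE
A minimal sketch of filling connection attributes and connection
information before issuing a connect. The sizes are arbitrary; the handles
(h_ti, endpoints, PD, CQs), 'p_ops', 'sample_conn_req_cb' and 'my_sock'
are assumed to have been created already.
  ti_conn_attr_t    attr;
  ti_conn_info_t    info;
  ti_conn_request_t req;
  ti_conn_handle_t  h_conn;
  ti_status_t       status;

  memset( &attr, 0, sizeof( attr ) );
  attr.connect_request_timeout = 2000;       /* 2 seconds per attempt     */
  attr.connect_request_retries = 4;
  attr.max_transfer_depth      = 32;         /* match res_pool element_cnt */
  attr.max_transfer_buffers    = 4;
  attr.max_transfer_size       = 32 * 1024;
  attr.max_recv_depth          = 32;
  attr.max_recv_buffers        = 4;
  attr.max_recv_size           = 32 * 1024;
  attr.reliable                = TRUE;
  attr.msg_flow_control        = TRUE;
  attr.auto_cq_rearm           = TRUE;

  memset( &info, 0, sizeof( info ) );
  info.src_ep          = h_local_ep;         /* from create_endpoint()    */
  info.dst_ep          = h_remote_ep;
  info.pdh             = h_pd;               /* from create_pd()          */
  info.send_cqh        = h_send_cq;          /* from create_cq()          */
  info.recv_cqh        = h_recv_cq;
  info.conn_request_cb = sample_conn_req_cb;
  info.context         = my_sock;            /* hypothetical client state */

  status = p_ops->connect( h_ti, &attr, &info, &req, &h_conn );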
NAME
ti_pool_response_t -- Response to a resource pool get request
DESCRIPTION
A response structure filled out by a resource pool when notifying the caller
of a completed request. Elements requested by the caller are referenced by this
response structure.
SYNOPSIS
typedef struct _ti_pool_response
{
void *p_context1;
void *p_context2;
ti_data_element_t *p_element_list;
uint32_t element_count;
ti_status_t status;
} ti_pool_response_t;
FIELDS
status
operation status:
TI_SUCCESS - elements were retrieved from the pool.
TI_ERROR - no elements were retrieved from the pool.
*p_context1
res_pool_get() caller supplied context #1
*p_context2
res_pool_get() caller supplied context #2
*p_element_list
Pointer to first element in the list.
element_count
number of elements in the list. If 'partial_allocation_OK' was set in the
res_pool_get() call, then element_count can be less than requested,
although it is always greater than zero.
SEE ALSO
res_pool_get, res_pool_create
NAME
ti_xfer_cb_t -- Data transfer completion Call Back routine
DESCRIPTION
A data transfer callback routine is invoked, at device interrupt level,
after the completion of any data transfers (RDMA, send or receive) for an
armed Completion Queue (CQ). Accept initially arms the specified CQ.
If the completion queue is not armed, no callback will occur.
Due to the elevated priority of the data transfer callback execution
environment, blocking for any reason is prohibited and will jeopardize
system integrity.
SYNOPSIS
typedef void ( *ti_xfer_cb_t )
( ti_cq_handle_t cq_handle,
ti_data_element_t *de_list );
PARAMETERS
cq_handle
The handle to the CQ (Completion Queue) that had a new entry inserted.
'cq_handle' is a suitable input argument for the 'poll_cq' call which
will return a list of completed data elements. Useful when
'ti_conn_attr_t.auto_cq_rearm' is FALSE, see following discussion.
de_list
Based on the connection attributes [ti_conn_attr_t] field
'auto_cq_rearm', the parameter 'de_list' may be <NULL>.
If 'ti_conn_attr_t.auto_cq_rearm' is FALSE, then the callback parameter
'de_list' will be <NULL>, 'cq_handle' is valid. The connection
initiator has elected to be responsible for rearming and polling the
CQ (Completion Queue) for data transfer completion events.
The callback routine must be aware of the elevated processor priority
at which it is executing, as it has complete control of the processor
until it returns from the callback routine. Therefore, while repeatedly
polling the CQ until there are no more completion events available may
seem like a good idea, it must be carefully weighed against letting
other threads execute.
If 'ti_conn_attr_t.auto_cq_rearm' is TRUE, then the callback parameter
'de_list' will be a pointer to a list of completed data elements. Each
data element contains its own data transfer operation status word used
to determine the success of the transfer operation. The CQ will be
rearmed and polled prior to the data transfer callback routine being
invoked. The transport interface will rearm, poll and invoke the
callback routine once per CQ completion signal.
SEE ALSO
ti_cq_handle_t, ti_comp_info_t, ti_data_element_t, msg_send, rdma_write
rearm_cq, poll_cq, ti_conn_attr_t
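EXAMPLE
A minimal sketch of a transfer completion callback written for the
'auto_cq_rearm == FALSE' case, where the callback must rearm and poll the
CQ itself; 'p_ops' and the per-element handlers are assumptions of this
sketch. Remember the callback runs at device interrupt level and must not
block.
  static void
  sample_xfer_cb( ti_cq_handle_t cq_handle, ti_data_element_t *de_list )
  {
      /* de_list is NULL in this mode; rearm first so that completions
       * arriving during the poll still generate a callback. */
      p_ops->rearm_cq( cq_handle );

      if ( p_ops->poll_cq( cq_handle, &de_list ) != TI_SUCCESS )
          return;

      for ( ; de_list != NULL; de_list = de_list->p_next_msg ) {
          if ( de_list->status != TI_SUCCESS )
              sample_xfer_error( de_list );
          else
              sample_xfer_done( de_list );
      }
  }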
NAME
ti_context_cb_t
DESCRIPTION
A client supplied context pointer is passed to the asynchronous
callback routine as its sole argument. The Callback is a result of
a client invoked transport interface routine such as destroy_ti or
disconnect. The callback routine executes in a thread context.
SYNOPSIS
typedef void ( *ti_context_cb_t ) (void *p_context) ;
PARAMETERS
p_context
[in] A caller supplied async context pointer
NOTES
SEE ALSO
destroy_ti, disconnect
NAME
ti_err_cb_t
DESCRIPTION
A user-specified callback that is invoked after an error has occurred on
an allocated connection. Error callback execution context is that of a
system thread.
SYNOPSIS
typedef void ( *ti_err_cb_t ) ( ti_err_rec_t *p_err_rec );
PARAMETERS
p_err_rec
Error information returned to the client, indicating the reason for the
error and associated context information.
SEE ALSO
create_cq, ti_comp_info_t
NAME
ti_res_pool_cb_t
DESCRIPTION
A Protocol provider optionally provides this callback routine when
requesting elements from a resource pool. The callback is invoked by the
transport interface when the requested pool data_elements are available.
Callback execution context is system thread or tasklet.
SYNOPSIS
typedef void ( *ti_res_pool_cb_t ) ( ti_pool_response_t *p_resp );
PARAMETERS
p_resp
[in] Pointer to a pool response structure [ti_pool_response_t] describing
the completed get request.
NOTES
A resource pool get callback is invoked after an asynchronous get
operation completes. A response to the get request is returned via
the callback function's input parameter.
SEE ALSO
res_pool_get
NAME
ti_comp_info_t -- send/recv/rdma/error completion handler information
DESCRIPTION
Provides information used to process completed data transfer requests.
Clients use this structure to specify which completion queue to use when
processing completions and which callbacks to invoke after a data transfer
has completed.
SYNOPSIS
typedef struct _ti_comp_info
{
ti_cq_handle_t h_send_cq;
ti_cq_handle_t h_recv_cq;
ti_xfer_cb_t pfn_send_cb;
ti_xfer_cb_t pfn_recv_cb;
ti_xfer_cb_t pfn_rdma_cb;
ti_err_cb_t pfn_err_cb;
void *p_err_context;
} ti_comp_info_t;
FIELDS
h_send_cq
A handle to an existing send completion queue.
The completion service monitors all message sends and
RDMA operation completions. Sends and receives may be monitored via
the same completion queue.
h_recv_cq
A handle to an existing receive completion queue.
The completion service monitors all completions for receiving messages.
pfn_send_cb
A pointer to a function that will be invoked after a message send
operation has completed. For additional information about data transfer
completion callback functions, see the Data Transfer Callback.
Set to NULL if client will always poll connection for send completions.
pfn_recv_cb
A pointer to a function that will be invoked after a message receive
operation has completed. For additional information about data
transfer completion callback functions, see the Data Transfer Callback.
Set to NULL if client will always poll connection for recv completions.
pfn_rdma_cb
A pointer to a function that will be invoked after a RDMA
operation has completed. For additional information about RDMA
transfer completion callback functions, see the RDMA Transfer Callback.
Set to NULL if client will always poll connection for RDMA completions.
pfn_err_cb
A pointer to a function that will be invoked when an asynchronous error
occurs on the connection. For additional information about error
callback functions, see the ti_err_cb_t function description.
p_err_context
consumer-specified context returned through the error callback
function.
SEE ALSO
accept
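EXAMPLE
A minimal sketch of completion information as it might be passed to
accept(); a single CQ is shared for send and receive completions, and the
callback names and 'my_sock' context are hypothetical.
  ti_comp_info_t comp;

  memset( &comp, 0, sizeof( comp ) );
  comp.h_send_cq     = h_cq;                 /* one CQ for sends & RDMA    */
  comp.h_recv_cq     = h_cq;                 /* ...and for receives        */
  comp.pfn_send_cb   = sample_send_done;
  comp.pfn_recv_cb   = sample_recv_done;
  comp.pfn_rdma_cb   = NULL;                 /* RDMA completions polled    */
  comp.pfn_err_cb    = sample_conn_error;
  comp.p_err_context = my_sock;              /* returned in error callback */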
NAME
ti_ops_t -- Transport Operations Vector
DESCRIPTION
A vector of transport service entry-points exported by the Transport.
All entry-points will contain a valid function pointer which may be the TI
supplied nop function (which always returns failure: TI_UNSUPPORTED).
All AF_INET_OFFLOAD protocol providers are expected to export transport
registration and deregistration entry points which enable the protocol
provider to retrieve the transport operations vector (ti_ops_t).
FIELDS
*transport_name
descriptive name for this transport. String memory is managed by
the transport layer.
*link_dev_name
link device name utilized by this transport. String memory is managed
by the transport layer.
SEE ALSO
All TI entry points are documented under the function name
(e.g. events_register).
SDP_register_transport & SDP_deregister_transport
for protocol provider exported interfaces.
SYNOPSIS
typedef struct _ti_ops
{
char *transport_name; // descriptive name for this transport.
char *link_dev_name; // link device utilized by this transport.
// Transport Interface services
char * (*ti_status_2_str) (
IN ti_status_t status );
ti_status_t (*events_register) (
IN const void *p_Context,
IN ti_event_cb_t Event_callback );
ti_status_t (*events_deregister) (
IN const void *p_Context,
IN ti_event_cb_t Event_callback );
ti_status_t (*create_ti) (
OUT ti_handle_t *ph_ti );
ti_status_t (*destroy_ti) (
IN ti_handle_t h_ti,
IN ti_context_cb_t pfn_cb,
IN void *p_context );
ti_status_t (*debug_svc) (
IN ti_handle_t h_ti,
IN ti_debug_level_t debug_level );
// INET address mapping service: Local and remote endpoints, blocking call
ti_status_t (*create_endpoint) (
IN ti_handle_t h_ti,
IN ti_inet_addr_t *p_addr,
IN ti_port_t port,
OUT ti_endpoint_handle_t *ph_ep );
// INET address mapping service: release transport endpoint information
ti_status_t (*destroy_endpoint) (
IN ti_endpoint_handle_t h_ep );
// Transport protection domain services.
ti_status_t (*create_pd) (
IN ti_handle_t h_ti,
IN ti_endpoint_handle_t h_local_ep,
OUT ti_pd_handle_t *ph_pd );
ti_status_t (*destroy_pd) (
IN ti_pd_handle_t h_pd );
// Transport Completion Queue management
ti_status_t (*create_cq) (
IN ti_handle_t h_ti,
IN ti_pd_handle_t h_pd,
IN uint32_t max_cq_entries,
OUT ti_cq_handle_t *ph_cq );
ti_status_t (*poll_cq) (
IN ti_cq_handle_t h_cq,
OUT ti_data_element_t **p_list);
ti_status_t (*rearm_cq) (
IN ti_cq_handle_t h_cq );
ti_status_t (*destroy_cq) (
IN ti_cq_handle_t h_cq );
// Transport connection services
ti_status_t (*connect) (
IN ti_handle_t h_ti,
IN ti_conn_attr_t *p_conn_attr,
IN ti_conn_info_t *p_conn_info,
OUT ti_conn_request_t *p_conn_req,
OUT ti_conn_handle_t *ph_conn );
ti_status_t (*accept) (
IN ti_conn_request_t *p_conn_req,
IN uint32_t max_rdma_size,
IN ti_comp_info_t *p_comp,
IN ti_data_element_t *p_recv_mlist);
ti_status_t (*reject) (
IN ti_conn_request_t *p_conn_req );
ti_status_t (*listen) (
IN ti_handle_t h_ti,
IN ti_conn_attr_t *p_conn_attr,
IN ti_conn_info_t *p_conn_info,
OUT ti_conn_request_t *p_conn_req,
OUT ti_conn_handle_t *ph_conn );
ti_status_t (*disconnect) (
IN ti_conn_handle_t h_conn,
IN ti_context_cb_t pfn_cb,
IN void *p_context );
// Memory registration services
ti_status_t (*reg_virt_mem) (
IN ti_handle_t h_ti,
IN ti_pd_handle_t h_pd,
IN void *p_vaddr,
IN uint32_t byte_count,
IN ti_access_t access_rights,
OUT ti_rdma_desc_t *p_rdma_desc,
OUT ti_mem_handle_t *ph_vmem );
ti_status_t (*reg_phys_mem) (
IN ti_handle_t h_ti,
IN ti_pd_handle_t h_pd,
IN uint64_t *p_phys_page_addr_array,
IN uint32_t num_pages,
IN ti_access_t access_rights,
IN OUT ti_rdma_desc_t *p_rdma_desc,
OUT ti_mem_handle_t *ph_pmem );
ti_status_t (*dereg_mem) (
IN ti_mem_handle_t h_mem );
// Resource Pool services
ti_status_t (*res_pool_create) (
IN ti_handle_t h_ti,
IN ti_res_pool_type_t pool_type,
IN uint32_t element_cnt,
IN uint32_t bufs_per_element,
IN uint32_t *buf_size_array,
IN uint32_t context_size,
OUT ti_res_pool_handle_t *ph_pool );
// destroy either msg or rdma pool
ti_status_t (*res_pool_destroy) (
IN ti_res_pool_handle_t h_pool);
// get or put elements based on pool handle
ti_status_t (*res_pool_get) (
IN ti_res_pool_handle_t h_pool,
IN uint32_t element_count,
IN boolean_t partial_allocation_OK,
IN ti_res_pool_cb_t res_cb,
IN void *p_context1,
IN void *p_context2,
IN OUT ti_pool_response_t *p_res_pool_resp );
ti_status_t (*res_pool_put) (
IN ti_res_pool_handle_t h_pool,
IN ti_data_element_t *p_delist );
// Message send & receive
ti_status_t (*msg_send) (
IN ti_conn_handle_t h_conn,
IN ti_data_element_t *p_delist );
ti_status_t (*msg_recv) (
IN ti_conn_handle_t h_conn,
IN ti_data_element_t *p_delist );
// Remote DMA operations
ti_status_t (*rdma_read) (
IN ti_conn_handle_t h_conn,
IN ti_data_element_t *p_melist );
ti_status_t (*rdma_write) (
IN ti_conn_handle_t h_conn,
IN ti_data_element_t *p_melist );
// Remote DMA operation list with bundled message (completion indication)
ti_status_t (*rdma_read_send) (
IN ti_conn_handle_t h_conn,
IN ti_data_element_t *p_rdma_read_list,
IN ti_data_element_t *p_msg);
ti_status_t (*rdma_write_send) (
IN ti_conn_handle_t h_conn,
IN ti_data_element_t *p_rdma_write_list,
IN ti_data_element_t *p_msg);
// Atomic operation
ti_status_t (*atomic_op) (
IN ti_conn_handle_t h_conn,
IN ti_atomic_op_t operation,
IN OUT ti_data_element_t *p_data );
ti_status_t (*io_ctl) (
IN ti_handle_t h_ti,
IN ti_ioctl_t cmd,
IN OUT uint32_t *p_bytes,
IN OUT void *p_data);
} ti_ops_t;
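EXAMPLE
A minimal sketch of a protocol provider using an operations vector it has
previously saved (for instance from the TI_EVENT_NETDEV_ONLINE transport
descriptor); the wrapper function is hypothetical.
  static ti_status_t
  sample_ti_init( ti_ops_t *p_ops, ti_handle_t *ph_ti )
  {
      ti_status_t status;

      status = p_ops->create_ti( ph_ti );
      if ( status != TI_SUCCESS )
          printk( "create_ti failed: %s\n",
                  p_ops->ti_status_2_str( status ) );
      return status;
  }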
NAME
ti_status_2_str - generate a status message given a TI status return value.
DESCRIPTION
Given a ti_status_t return value, return a C-string which identifies
the given status value. Example: TI_INVALID_HANDLE would generate
the string "Invalid Handle".
SYNOPSIS
char *
ti_status_2_str ( IN const ti_status_t status );
PARAMETERS
status
[in] a ti_status_t return value.
RETURN VALUE
C-string pointer to a status message.
NAME
events_register -- register for transport event notifications.
DESCRIPTION
Upon a successful return from events_register, the caller (SDP for instance)
will begin to receive transport event notifications via the input parameter
'Event_callback' routine. Transport event notifications will continue until
such a time that events_deregister() is called. The TI will deliver event
notifications for all instances of the network link device name registered
by the transport [ti_events_enable]. For events definitions see
[ti_event_t]. The event callback routine and its behavior are documented under
[ti_event_cb_t].
The Event_callback routine delivers transport event notifications.
The 'event' input parameter is always defined. Based on the type of event,
the (void *) pointer 'data' will actually be an event specific structure
pointer or event raw data.
SYNOPSIS
ti_status_t
events_register ( IN const void *p_Context,
IN ti_event_cb_t Event_callback );
PARAMETERS
*p_Context
[in] optional caller supplied context pointer which is returned as an
input parameter to the Event callback routine.
Event_callback
[in] event notification callback function pointer.
RETURN VALUE
TI_SUCCESS
The caller has successfully registered with the Transport Interface
(TI) layer to receive transport event notification.
TI_INVALID_PARAMETER
Missing event callback parameter.
SEE ALSO
ti_event_t, ti_event_cb_t, ti_ops_t, events_deregister
ti_transport_desc_t
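EXAMPLE
A minimal sketch of registering for transport event notifications through
the operations vector; 'p_ops', 'my_module_state' and 'sample_event_cb' are
assumed to exist.
  ti_status_t status;

  status = p_ops->events_register( my_module_state, sample_event_cb );
  if ( status != TI_SUCCESS )
      printk( "events_register failed: %s\n",
              p_ops->ti_status_2_str( status ) );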
NAME
events_deregister -- stop transport event notifications
DESCRIPTION
Request the TI to stop delivery of transport events; the event
callback routine will no longer be invoked. The p_Context and Event_callback
input parameters must match those which were supplied to the
events_register() call.
SYNOPSIS
ti_status_t
events_deregister ( IN const void *p_Context,
IN ti_event_cb_t Event_callback );
PARAMETERS
*p_Context
[in] caller supplied context pointer which is returned as an input
parameter to the Event callback routine.
Event_callback
[in] event notification callback function pointer.
RETURN VALUES
TI_SUCCESS
The transport provider instance successfully deregistered from the TI.
TI_FAILURE
No events_register() call with matching input parameters.
SEE ALSO
events_register, ti_event_t, ti_event_cb_t
NAME
create_ti -- Create a Transport Interface service handle
DESCRIPTION
Create a new transport interface service handle. A TI service handle defines
the scope maintained for connections, data transfers and resource pools.
A transport service handle is required to communicate with the transport
operations interfaces (ti_ops).
The transport service handle usage model is envisioned to be one service
handle per transport client. The service handle allows multiple simultaneous
connections along with multiple connections over time, multiple simultaneous
data transfers and resource pools.
SYNOPSIS
ti_status_t
create_ti ( OUT ti_handle_t *ph_ti );
PARAMETERS
ph_ti
[out] address where the transport instance handle is written
RETURN VALUE
TI_SUCCESS
The transport interface instance handle is successfully created
TI_INSUFFICIENT_RESOURCES
Insufficient resources to complete request.
SEE ALSO
destroy_ti
NAME
destroy_ti -- Destroy a TI service handle
DESCRIPTION
The consumer must call destroy_ti to release all reserved transport
resources including connections, pools, and completion queues.
When a service is destroyed, processing halts on the specified service.
This call is asynchronous, hence the required context callback routine.
When the context callback routine is invoked in a system thread context,
the transport service along with allocated resources have been destroyed.
Additional usage of the handle will generate TI_INVALID_HANDLE errors.
SYNOPSIS
ti_status_t
destroy_ti (
IN ti_handle_t h_ti,
IN ti_context_cb_t pfn_cb,
IN void *p_context );
PARAMETERS
h_ti
[in] Transport handle pointer
pfn_cb
[in] A callback function (pointer) that is invoked once the requested
service has been destroyed. NULL if no callback is desired; destroy_ti
still runs asynchronously.
*p_context
[in] User-defined & supplied context information returned through the
destroy callback function as its only argument.
RETURN VALUE
TI_SUCCESS
The operation completed successfully.
TI_INVALID_STATE
A transport service instance is not in a valid state on which to
perform the specified operation.
SEE ALSO
create_ti
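EXAMPLE
A minimal sketch of the asynchronous destroy_ti call using a Linux
completion to wait for the context callback; 'p_ops' and the function names
are assumptions of this sketch.
  static void
  sample_ti_destroyed( void *p_context )
  {
      /* Invoked from a system thread once all TI resources are released. */
      complete( (struct completion *)p_context );
  }

  static void
  sample_shutdown( ti_handle_t h_ti )
  {
      struct completion done;

      init_completion( &done );
      if ( p_ops->destroy_ti( h_ti, sample_ti_destroyed, &done ) == TI_SUCCESS )
          wait_for_completion( &done );
  }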
NAME
debug_svc -- Set the current debug level of the specified library instance
DESCRIPTION
This function sets the current debug level for the specified service when
running a debug version of the library. If the executing code is a released
(non-debug) version of the library, the function returns TI_INVALID_OPERATION.
A successful call to this function sets the debug output for the specified
library instance, resource pool or data transfer service.
Setting the debug level of the transport instance sets the debug level of
all library services, including the default debug level for newly created
services. In addition, it enables debug messages for the connection,
service creation, and memory registration function calls.
SYNOPSIS
ti_status_t
debug_svc (
IN ti_handle_t h_ti,
IN ti_debug_level_t debug_level );
PARAMETERS
h_ti
[in] Transport handle pointer
debug_level
[in] New debug level to use with the library service:
TI_DBG_OFF - output fatal messages
TI_DBG_STD - output fatal+error messages
TI_DBG_VERBOSE - output fatal+error+warnings+info messages
TI_DBG_DEBUG - output all:
fatal+error+warnings+info+routine-entry-exit
RETURN VALUE
TI_SUCCESS
The operation completed successfully.
TI_INVALID_OPERATION
The operation is not valid for the current service state.
TI_INVALID_PARAMETER
Invalid input parameter.
NAME
create_endpoint -- maps IP address/port to transport endpoint information
DESCRIPTION
This function provides a transport endpoint mapping to the IP address
and IP port pair. Supports IPv4 and IPv6 address formats. Returns an opaque
handle that is used in connection creation. May be a blocking call.
SYNOPSIS
ti_status_t
create_endpoint(
IN ti_handle_t h_ti,
IN ti_inet_addr_t *p_addr,
IN ti_port_t port,
OUT ti_endpoint_handle_t *ph_ep );
PARAMETERS
h_ti
[in] Transport handle pointer
*p_addr
[in] IP address in host-order format
port
[in] protocol allocated port space identifier
ph_ep
[out] If no mapping could be resolved then *ph_ep is set to NULL.
Otherwise, the opaque handle represents transport endpoint information
for the specified IP addr/port pair. Opaque memory, which the handle
points to, is allocated on behalf of the caller; it is therefore the
caller's responsibility to release the memory when the endpoint
information is no longer required, see the destroy_endpoint() call.
RETURN VALUE
TI_SUCCESS
The operation completed successfully.
TI_ERROR
Could not provide a mapping for this address/port pair.
TI_INVALID_PARAMETER
Invalid input parameter.
TI_INSUFFICIENT_MEMORY
Unable to allocate endpoint handle dynamic memory
NOTES
For local endpoint information the call will not block since the information
is readily available to transports. However, for remote endpoints,
an address resolution process may need to be invoked if the remote IP address
is not in the local ARP cache. In this case, the call would block waiting
for the ARP response. The transport is responsible for setting reasonable
timeout values for non-responsive ARPs since it understands the underlying
architecture of its ARP mechanism and link attributes.
Remember to release endpoint information to prevent memory leaks.
SEE ALSO
destroy_endpoint, ti_conn_info_t
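EXAMPLE
A minimal sketch of resolving local and remote endpoints from IP
address/port pairs; 'local_ipv4', 'remote_ipv4', 'h_ti' and 'p_ops' are
assumed to exist, and error handling is reduced to a single check.
  ti_inet_addr_t       local_ip, remote_ip;
  ti_endpoint_handle_t h_local_ep, h_remote_ep;
  ti_port_t            port = 5001;          /* arbitrary port            */
  ti_status_t          status;

  SET_IPV4_ADDR( local_ip,  local_ipv4 );    /* from the link's config    */
  SET_IPV4_ADDR( remote_ip, remote_ipv4 );   /* peer supplied by caller   */

  status = p_ops->create_endpoint( h_ti, &local_ip, port, &h_local_ep );
  if ( status == TI_SUCCESS )
      /* May block on ARP resolution for the remote address. */
      status = p_ops->create_endpoint( h_ti, &remote_ip, port, &h_remote_ep );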
NAME
destroy_endpoint -- destroy transport endpoint information
DESCRIPTION
Release the specified transport endpoint information.
SYNOPSIS
ti_status_t
destroy_endpoint( IN ti_endpoint_handle_t h_ep );
PARAMETERS
h_ep
[in] valid transport endpoint information handle.
RETURN VALUE
TI_SUCCESS
The operation completed successfully.
TI_INVALID_PARAMETER
Invalid input parameter.
NOTES
Transport may release cached information. Opaque memory, which the handle
points to, is allocated on behalf of the caller and is therefore the caller's
responsibility to release the memory when the endpoint information is no
longer required.
SEE ALSO
create_endpoint
NAME
create_pd -- Create a protection domain for data transfers.
DESCRIPTION
A protection domain is created for each transport instance (h_ti).
All connections must be associated with a protection domain. A protection
domain can be shared across multiple connections but cannot be shared
across transport service instances.
A protection domain handle is utilized in memory registration and
connection establishment.
SYNOPSIS
ti_status_t
create_pd (
IN ti_handle_t h_ti,
IN ti_endpoint_handle_t h_local_ep,
OUT ti_pd_handle_t *ph_pd );
PARAMETERS
h_ti
[in] Transport instance handle
h_local_ep
[in] The handle to the local endpoint for scope of protection domain
ph_pd
[out] address of where the protection domain handle is written
RETURN VALUE
TI_SUCCESS
The operation completed successfully.
TI_INVALID_HANDLE
Invalid TI handle
TI_INVALID_PARAMETER
Invalid input parameter.
TI_INSUFFICIENT_MEMORY
unable to kmalloc handle memory
TI_INSUFFICIENT_RESOURCES
cl event init or IB CA open failed.
TI_ERROR
Could not allocate a protection domain for this interface.
SEE ALSO
destroy_pd, reg_virt_mem, ti_conn_info_t
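EXAMPLE
A minimal sketch of creating a protection domain scoped by a local
endpoint, assuming the sdp_ti.h declarations; the endpoint comes from a
prior create_endpoint() call.
#include "sdp_ti.h"
ti_status_t make_pd(
    ti_handle_t          h_ti,
    ti_endpoint_handle_t h_local_ep,
    ti_pd_handle_t       *ph_pd )
{
    ti_status_t status = create_pd( h_ti, h_local_ep, ph_pd );
    if ( status != TI_SUCCESS )
        return status;
    /* *ph_pd is now usable for reg_virt_mem()/reg_phys_mem() and for
     * connection establishment; release it with destroy_pd() when done. */
    return TI_SUCCESS;
}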
NAME
destroy_pd -- destroy a protection domain
DESCRIPTION
A protection domain is destroyed with PD allocated resources released.
SYNOPSIS
ti_status_t
destroy_pd ( IN ti_pd_handle_t h_pd );
PARAMETERS
h_pd
[in] Protection domain handle pointer
RETURN VALUE
TI_SUCCESS
The operation completed successfully.
TI_INVALID_PARAMETER
Invalid protection domain handle
SEE ALSO
create_pd
NAME
create_cq -- Create a Completion Queue for data transfers.
DESCRIPTION
A Completion Queue (CQ) is created under the scope of a transport service
instance (ti_handle_t) and a protection domain (ti_pd_handle_t). A CQ
will be bound to a connection during the connection acceptance phase
[accept]. A CQ can be bound as a send CQ, receive CQ or a CQ configured to
handle both send and receive completions; see the h_send_cq and h_recv_cq
fields in the ti_comp_info_t structure. Two separate completion queues
(send/recv) can be allocated and bound to a connection if clients want to
separate send and receive processing.
A completion queue can also be bound to more than one connection. It
is up to the CQ creator to allocate sufficient CQ entries to accommodate
all connections using the CQ.
The 'max_cq_entries' (CQ depth) parameter MUST be greater than or equal to
the connection specified max send or receive queue depth, see
ti_conn_attr_t structure fields [max_transfer_depth, max_recv_depth] to
prevent CQ over-run errors which destroy an established connection.
If a CQ is configured for send completions only, then 'max_cq_entries' must
be >= max_transfer_depth. If a CQ is receive only, then 'max_cq_entries'
must be >= max_recv_depth. If a CQ is configured to handle both send and
receive completions, then 'max_cq_entries' must be >= (max_recv_depth +
max_transfer_depth). If a CQ is full and the hardware attempts to write a CQ
entry, the connection error callback routine will be called with an
TI_CQ_OVERRUN error. All other CQ errors will generate TI_CQ_ERROR at the
connection error callback routine [ti_comp_info_t, accept].
A CQ is created in the unarmed state and armed as a side-effect of the
transport accept call.
SYNOPSIS
ti_status_t
create_cq (
IN ti_handle_t h_ti,
IN ti_pd_handle_t h_pd,
IN uint32_t max_cq_entries,
OUT ti_cq_handle_t *ph_cq );
PARAMETERS
h_ti
[in] transport handle pointer
h_pd
[in] protection domain handle
max_cq_entries
[in] The maximum number of entries that can be inserted into a CQ, in
essence, the CQ depth.
ph_cq
[out] address of where the completion queue handle is written
RETURN VALUE
TI_SUCCESS
The operation completed successfully.
TI_INVALID_HANDLE
Transport Interface handle is invalid.
TI_INVALID_PARAMETER
Output handle pointer is NULL, Protection Domain handle is NULL or
max_cq_entries > max limit (64K).
TI_INSUFFICIENT_RESOURCES
Unable to create Completion Queue.
SEE ALSO
ti_comp_info_t, destroy_cq, poll_cq, rearm_cq, ti_conn_req_cb_t
ti_err_cb_t, ti_conn_attr_t
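EXAMPLE
A minimal sketch of sizing a single send+receive completion queue for one
connection, assuming the sdp_ti.h declarations; the ti_conn_attr_t fields
max_transfer_depth and max_recv_depth are those named above.
#include "sdp_ti.h"
ti_status_t make_shared_cq(
    ti_handle_t    h_ti,
    ti_pd_handle_t h_pd,
    ti_conn_attr_t *p_attr,
    ti_cq_handle_t *ph_cq )
{
    /* A CQ handling both send and receive completions must be at least
     * (max_transfer_depth + max_recv_depth) deep to avoid TI_CQ_OVERRUN,
     * which destroys the established connection. */
    uint32_t max_cq_entries =
        p_attr->max_transfer_depth + p_attr->max_recv_depth;
    return create_cq( h_ti, h_pd, max_cq_entries, ph_cq );
}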
NAME
poll_cq -- poll a completion queue and return completed data elements.
DESCRIPTION
The specified completion queue is polled for the existence of completion
queue entries. Returns a pointer to a <NULL> terminated list of data
elements containing the status of the completed transfer(s). The completion
queue should not have been created with a signaled completion queue
callback routine as race conditions could occur.
SYNOPSIS
ti_status_t
poll_cq(
IN ti_cq_handle_t h_cq,
OUT ti_data_element_t **p_list );
PARAMETERS
h_cq
[in] completion queue handle
p_list
[out] address to write the pointer to first element in the list of
completed transfers. Set to NULL if completion queue is empty.
RETURN VALUE
TI_SUCCESS
The operation completed successfully.
TI_INVALID_OPERATION
The operation is not valid for the current service state.
TI_INVALID_PARAMETER
Invalid input parameter.
SEE ALSO
create_cq, destroy_cq, rearm_cq, ti_xfer_cb_t
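EXAMPLE
A minimal sketch of draining a completion queue, assuming the sdp_ti.h
declarations; the 'p_next' link field chaining the NULL-terminated list of
ti_data_element_t entries is an assumed name.
#include "sdp_ti.h"
ti_status_t drain_cq( ti_cq_handle_t h_cq )
{
    ti_data_element_t *p_list;
    ti_status_t       status = poll_cq( h_cq, &p_list );
    if ( status != TI_SUCCESS )
        return status;
    /* p_list is NULL when the completion queue is empty. */
    while ( p_list != NULL )
    {
        /* ... examine the per-element transfer status, then return the
         * element to its resource pool [res_pool_put] ... */
        p_list = p_list->p_next;        /* link field name assumed */
    }
    return TI_SUCCESS;
}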
NAME
rearm_cq -- rearm a completion queue.
DESCRIPTION
Indicates to the transport that it should notify the client when the next
completion is added to the completion queue.
The completion queue is rearmed upon return from the IO completion callback
routine.
SYNOPSIS
ti_status_t
rearm_cq( IN ti_cq_handle_t h_cq );
PARAMETERS
h_cq
[in] completion queue handle
RETURN VALUE
TI_SUCCESS
The operation completed successfully.
TI_INVALID_OPERATION
The operation is not valid for the current service state.
TI_INVALID_PARAMETER
Invalid input parameter.
SEE ALSO
create_cq, destroy_cq, poll_cq, ti_conn_attr_t(auto_cq_rearm)
NAME
destroy_cq -- destroy a completion queue.
DESCRIPTION
Transport should disarm the completion queue, then destroy it releasing
all allocated resources.
SYNOPSIS
ti_status_t
destroy_cq(
IN ti_cq_handle_t h_cq );
PARAMETERS
h_cq
[in] completion queue handle
RETURN VALUE
TI_SUCCESS
The operation completed successfully.
TI_INVALID_OPERATION
The operation is not valid for the current service state.
TI_INVALID_PARAMETER
Invalid input parameter.
PORTABILITY
Kernel only.
SEE ALSO
create_cq, rearm_cq, poll_cq
NAME
connect -- schedule a reliable transport connection creation request.
DESCRIPTION
Schedule a reliable transport connection request to the specified remote
endpoint. A connection is a communications channel used to send, receive
messages and perform RDMA operations on. A connection provides optional
message-level flow control and supports RDMA operations larger than that
supported by the underlying hardware. Multiple connections may be created
for a single transport service instance.
If the connect() return value was not TI_SUCCESS, then no connect request
has been queued, hence further processing is not required; the connection
callback routine will not be invoked. If the connect() call returns
TI_SUCCESS, then the connection request has been scheduled and possibly not
yet completed. A handle to the newly scheduled connection request is
returned to the connect() caller through the OUT parameter of the
connect call. At this juncture the connection has NOT been established,
hence the connection handle is invalid for data transfer operations.
A connection handle will only become valid once an accept() call has been
invoked using the handle.
Connect is a non-blocking call, control may return before the Connection
Request (CR) cycle has been completed. The connect() caller must
synchronize with the CR callback routine to determine the status of the
connection request: success[accept() called], failure[reject() called] or
simple notification of the connection request (neither accept or reject has
been called from the callback routine). The ti_conn_request_t.status field
must be set from the connection request callback routine
(conn_info.conn_request_cb) to indicate the final status of the connection
request.
The transport interface invokes the connection request callback routine in
a system thread context, with a pointer to a ti_conn_request_t structure
which describes the connection request. The ti_conn_request_t 'status'
code is not valid at this juncture; it is the responsibility of the
connection request callback routine to set the status field.
The connection request cycle can be completed within the CR callback
routine by calling accept() or reject(). The connection request cycle can
also be completed outside the CR callback routine, provided the elapsed
time between the connect() call and the invocation of accept() or reject()
does not exceed the connection timeout specified in the structure
ti_conn_attr_t.connect_request_timeout.
The Connection request structure pointer supplied to the connection request
callback routine as an input parameter, is populated by the transport
interface layer before the connect request callback is invoked.
If the Connection Request callback routine determines private data
comparisons do not meet the caller pre-defined values, a reject() call
can be invoked to 'reject' the connection attempt and record the reject
status in the ti_conn_request_t.status input parameter field.
A connection can not become ESTABLISHED until an accept() call has been
issued. The accept call may block while establishing the connection. Accept
status must be recorded in the ti_conn_request_t.status input parameter
field.
It is the connect() caller's responsibility to synchronize with the
connection request callback routine and to determine the connection request
status: success (accept called), failure (reject called) or simple connect
request notification. The connection request callback routine will record
the final connection status in the ti_conn_request_t.status input parameter
field.
Connect calls specify connection information through the use of two input
structures. The ti_conn_attr_t parameter specifies negotiable attributes
about the connection itself, while the ti_conn_info_t parameter specifies
how the connection will be established including local and remote endpoint
information.
The connection information structure, ti_conn_info_t, exposes user-defined
connection properties that may be used by connecting entities. Data
submitted through the p_private_data parameter is passed to the remote
endpoint as part of the connection request and is supplied to the local side
through its connection request callback. The p_private_data may also
be used to match connection requests on the remote system by having the
remote entity specify a private_data compare buffer as part of its
ti_conn_info_t structure. Private data authentication is optional as far as
the transport is concerned.
Once connect() has returned success and the CR callback routine has set
ti_conn_request_t.status == TI_SUCCESS, the disconnect() call must be
used to cancel a connection request or shutdown an existing connection.
SYNOPSIS
ti_status_t
connect(
IN ti_handle_t h_ti,
IN ti_conn_attr_t *p_conn_attr,
IN ti_conn_info_t *p_conn_info,
OUT ti_conn_request_t *p_conn_req,
OUT ti_conn_handle_t *ph_conn );
PARAMETERS
h_ti
[in] transport service instance handle
*p_conn_attr
[in] data connection attributes
*p_conn_info
[in] connection information, end-points, private data and connection
request callback routine.
*p_conn_req
[out] Connection request structure pointer. Structure is initialized by
the transport layer with connection request information and passed,
as the input parameter, to the user specified [p_conn_info] connect
request callback routine.
*ph_conn
[out] where the Connection handle pointer is written
RETURN VALUE
TI_SUCCESS
The operation was scheduled successfully.
TI_INVALID_PARAMETER
One of the input parameters is considered invalid.
TI_INVALID_OPERATION
The operation is not valid for the current service for this version of
the library.
TI_ERROR
Unable to establish a connection.
SEE ALSO
ti_conn_req_cb_t, accept, reject, ti_conn_info_t, ti_conn_attr_t
disconnect, listen
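EXAMPLE
A minimal sketch of scheduling a connection request, assuming the sdp_ti.h
declarations; the conn_request_cb field name is taken from the text above,
and the callback signature is assumed from ti_conn_req_cb_t.
#include "sdp_ti.h"
/* Signature assumed: the CR callback receives the ti_conn_request_t that
 * the transport populated (see the accept() example). */
extern void my_conn_request_cb( ti_conn_request_t *p_conn_req );
ti_status_t start_connect(
    ti_handle_t       h_ti,
    ti_conn_attr_t    *p_attr,
    ti_conn_info_t    *p_info,
    ti_conn_request_t *p_req,
    ti_conn_handle_t  *ph_conn )
{
    ti_status_t status;
    p_info->conn_request_cb = my_conn_request_cb;
    status = connect( h_ti, p_attr, p_info, p_req, ph_conn );
    if ( status != TI_SUCCESS )
        return status;          /* nothing queued; callback will not fire */
    /* The connection is NOT established yet: synchronize with the CR
     * callback, then inspect p_req->status for the final result. */
    return TI_SUCCESS;
}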
NAME
accept -- Accept a connection request and establish a reliable connection.
DESCRIPTION
The transport interface will invoke the (ti_conn_info_t) specified
connection request callback routine to provide connection request (CR)
notification. Within the CR callback routine a connection can be accepted,
rejected or deferred until a later time. The connection request cycle must
be completed within the timeout specified by the
ti_conn_attr_t.connect_request_timeout interval, otherwise the connection
will never reach the established state.
Once the connection callback routine has optionally validated the remote
connection side attributes (see private data payload) either 'accept' or
'reject' must be called to complete the connection request cycle.
Once the 'accept' call returns success, the connection is considered
established.
accept() will modify the ti_conn_handle_t, referenced in the
ti_conn_request_t structure, to record the established connection state to
enable the 'ti_conn_handle_t' as valid for subsequent data transfer
operations.
accept() also arms the Completion Queue(s) specified in the ti_comp_info_t
structure. Therefore, upon successful return from accept, all specified
completion queue(s) are armed and ready for data transmissions.
The underlying connection protocol can cause accept() to block until
the remote side acknowledges the connection. As a result, users calling
accept() may not hold locks, raise interrupt priority, or perform other
activities that might result in a deadlock condition when the accepting
thread blocks.
It is the connect() or listen() caller's responsibility to synchronize with
the connection request callback to determine the success(accept) or failure
(reject) of the connect or listen request.
SYNOPSIS
ti_status_t
accept (
IN ti_conn_request_t *p_conn_req,
IN uint32_t max_rdma_size,
IN ti_comp_info_t *p_comp_info,
IN ti_data_element_t *p_recv_mlist );
PARAMETERS
*p_conn_req
[in] A connection request structure pointer. This parameter is returned
through the connection request callback routine as an input parameter
[ti_conn_req_cb_t]. The *p_conn_req structure is initialized by the
calling TI. The opaque ti_conn_handle_t is a member of this structure
and modified by the accept routine to indicate successful acceptance
of this established connection.
max_rdma_size
[in] The maximum size of a single RDMA request supported by the remote
side of the connection. This value may be zero if the connection does
not support RDMA requests.
*p_comp_info
[in] data transfer completion information used by the newly connected
data transfer service.
*p_recv_mlist
[in] A reference to a list of data elements [ti_data_element_t] which
will become posted receive buffers on the data transfer service to
receive incoming data. The buffers will be posted before the connection
completes to ensure that inbound messages are not dropped.
RETURN VALUE
TI_SUCCESS
The operation completed successfully.
TI_INVALID_PARAMETER
One of the parameters has an invalid or unsupported setting.
This status usually signals that a specified value is outside of a
given range.
TI_INSUFFICIENT_RESOURCES
There are not enough internal library resources available to complete
the request. Also used to indicate that a remote endpoint lacks the
resources necessary to complete a request.
TI_TIMEOUT
An operation did not complete within the specified time period. Likely
the ti_conn_attr_t.connect_request_timeout value expired.
TI_ERROR
A catastrophic error has occurred that could not be handled by the
library.
SEE ALSO
reject, connect, listen, ti_conn_req_cb_t, ti_conn_info_t
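EXAMPLE
A minimal sketch of completing the CR cycle inside the connection request
callback, assuming the sdp_ti.h declarations; the callback signature is
assumed from ti_conn_req_cb_t, and the completion/receive objects are
prepared before connect()/listen() is issued.
#include "sdp_ti.h"
static ti_comp_info_t    g_comp_info;    /* h_send_cq/h_recv_cq set earlier */
static ti_data_element_t *g_recv_mlist;  /* receive elements from res_pool_get */
void my_conn_request_cb( ti_conn_request_t *p_conn_req )
{
    ti_status_t status;
    /* ... optionally compare private data here and call reject()
     * instead of accept() on a mismatch ... */
    /* May block until the remote side acknowledges: no locks may be held
     * and interrupt priority must not be raised by this thread. */
    status = accept( p_conn_req,
                     0,             /* max_rdma_size: no RDMA on this conn */
                     &g_comp_info,
                     g_recv_mlist );
    /* Record the final result for the connect()/listen() caller. */
    p_conn_req->status = status;
}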
NAME
reject -- Reject a request to establish a connection from a remote endpoint.
DESCRIPTION
This call rejects a connection request attempt from a remote endpoint thus
completing a connection request cycle. reject() can, but is not required
to, be invoked from within the connection request callback routine of
connect() or listen() to refuse a connection request.
The ti_conn_handle_t, referenced in the ti_conn_request_t structure, is
modified by the reject() call to record the connection request rejection
and mark it as an invalid handle.
SYNOPSIS
ti_status_t
reject( IN ti_conn_request_t *p_conn_req );
PARAMETERS
*p_conn_req
[in] connection request pointer. The connection request structure
pointer is supplied to the connection request callback routine as an
input parameter [ti_conn_req_cb_t].
RETURN VALUE
TI_SUCCESS
The operation completed successfully.
TI_TIMEOUT
An operation did not complete within the specified time period. Likely
the ti_conn_attr_t.connect_request_timeout value expired.
SEE ALSO
accept, connect, listen, ti_conn_req_cb_t, ti_conn_request_t
ti_conn_attr_t
NAME
listen -- listen for incoming connection establishment requests
DESCRIPTION
This call schedules a connection listen request for incoming connection
establishment requests. If the listen() return value is not TI_SUCCESS,
then no connection request was scheduled hence no further processing is
required; the callback routine will not be invoked. If the call returned
TI_SUCCESS, a listen connection handle is returned as an output parameter.
This listen connection handle is only used to cancel [disconnect()] the
listen request, it is not valid for data transfers on an established
connection.
Actual connection establishment is accomplished by the invocation of an
accept() call. The Connection Request (CR) notification callback routine
[ti_conn_req_cb_t], specified in the ti_conn_info_t parameter, can complete
the connection listen request cycle by calling accept(), reject() or defer
the connection completion cycle until a later date. The connection request
cycle must be completed within the ti_conn_attr_t.connect_request_timeout
interval, otherwise the connection can never become established; both
accept and reject will fail with TI_TIMEOUT status. If the accept or reject
processing is to be deferred, the callback routine input argument must be
copied to stable storage as the ti_conn_request_t struct is only valid for
the duration of the callback routine.
When the CR callback is invoked, the ti_conn_request_t* input parameter
contains a CR 'status' indicator (ti_status_t). If the status is not
TI_SUCCESS, then the callback routine needs only to synchronize with the
listen() caller and return. No further processing is required. If the CR
status is TI_SUCCESS, then the connection listen request cycle can be
completed within the CR callback routine or deferred to a later time bounded
by the ti_conn_attr_t.connect_request_timeout interval.
In order for the connection listen request cycle to be completed, either an
accept() or reject() call must be invoked. Accept() will establish a
connection with the remote endpoint or reject() will inform the remote
endpoint the connection is not accepted. Either way, the ti_conn_handle_t
specified in the ti_conn_request_t callback routine input parameter, is
modified to reflect the connection state. Please note, the ti_conn_handle_t
specified in the ti_conn_request_t is a new connection handle which
represents the current connection request. Accept() validates this handle
for subsequent data transfer requests while reject() invalidates the handle.
It is the listen() caller's responsibility to synchronize with the CR
callback routine to determine connection status: !TI_SUCCESS,
success[accept() with a new ti_conn_handle_t], failure[reject] or deferral
of the connection listen request.
Once listen() has returned success, the disconnect() call is used to cancel
a connection listen request or shutdown an accepted/established connection.
SYNOPSIS
ti_status_t
listen(
IN ti_handle_t h_ti,
IN ti_conn_attr_t *p_conn_attr,
IN ti_conn_info_t *p_conn_info,
OUT ti_conn_request_t *p_conn_req,
OUT ti_conn_handle_t *ph_conn );
PARAMETERS
h_ti
[in] transport service instance handle
*p_conn_attr
[in] specify data connection attributes
*p_conn_info
[in] Specify connection information, end-points, private data and
callback routines.
*p_conn_req
[out] Connection request structure pointer. Structure is initialized by
the transport layer with connection request information and passed,
as the input parameter, to the user specified [p_conn_info] connect
request callback routine.
*ph_conn
[out] writeable listen connection handle pointer. This handle is only
used to cancel the connection listen request. See ti_conn_request_t
structure for the connection handle (ti_conn_handle_t) to be used in
established connection data requests.
RETURN VALUE
TI_SUCCESS
The operation was scheduled successfully.
TI_INVALID_PARAMETER
One of the input parameters is considered invalid or out of range.
TI_INVALID_OPERATION
The operation is not valid for the current service for this version of
the transport provider.
SEE ALSO
ti_conn_req_cb_t, ti_conn_request_t, accept, reject, connect, disconnect
NAME
disconnect -- disconnect a transport endpoint connection
DESCRIPTION
Schedule a connection shutdown request. When the specified callback is
invoked, in a system thread context with 'p_context' as its input
parameter, the connection has been disconnected; the local ti_conn_handle_t
is considered invalid.
If the specified connection was established, then the remote endpoint is
notified of the pending connection shutdown. If the connection has a
pending connection request, the request is cancelled (connection request
callback fires with TI_CANCELLED status). In either case, allocated
connection resources such as Completion Queue(s), resource pools, etc. may
still be allocated and must be released.
SYNOPSIS
ti_status_t
disconnect(
IN ti_conn_handle_t h_conn,
IN ti_context_cb_t pf_callback,
IN void *p_context );
PARAMETERS
h_conn
[in] valid connection handle, connection may or may not be connected.
pf_callback
[in] Context callback routine, which if non-null, is invoked by the TI
upon the completion of the connection shutdown.
*p_context
[in] Optional - User created & defined context pointer which is passed
to the callback routine.
RETURN VALUE
TI_SUCCESS
The operation scheduled successfully.
TI_INVALID_HANDLE
Connection handle is invalid
TI_INVALID_OPERATION
The operation is not valid for the current service for this version of
the library.
SEE ALSO
connect, listen, ti_conn_handle_t, ti_conn_request_t, ti_conn_req_cb_t
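EXAMPLE
A minimal sketch of scheduling a connection shutdown, assuming the
sdp_ti.h declarations; the context callback signature is assumed from
ti_context_cb_t (invoked with the caller's context pointer).
#include "sdp_ti.h"
static void my_shutdown_done( void *p_context )
{
    (void)p_context;    /* caller-defined state, unused in this sketch */
    /* The connection handle is now invalid; CQs, resource pools and other
     * connection resources may still be allocated and must be released. */
}
ti_status_t shutdown_conn( ti_conn_handle_t h_conn, void *p_my_state )
{
    /* Non-blocking: completion is signalled through my_shutdown_done(). */
    return disconnect( h_conn, my_shutdown_done, p_my_state );
}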
NAME
reg_virt_mem -- Register Virtual memory with the transport hardware
DESCRIPTION
Register allocated virtual memory with the transport hardware. In order to
perform send, receive or RDMA operations, data memory must be registered
with the transport hardware. The memory range may be of an arbitrary size
and alignment; however, the actual area of memory registered will be on a
VM page-size granularity. The p_vaddr parameter does not need to be VM page
aligned.
SYNOPSIS
ti_status_t
reg_virt_mem (
IN ti_handle_t h_ti,
IN ti_pd_handle_t h_pd,
IN void *p_vaddr,
IN uint32_t byte_count,
IN ti_access_t access_rights,
OUT ti_rdma_desc_t *p_rdma_desc,
OUT ti_mem_handle_t *ph_vmem );
PARAMETERS
h_ti
[in] transport service instance handle pointer
h_pd
[in] transport protection domain handle
*p_vaddr
[in] Virtual memory address where registration starts.
byte_count
[in] Number of memory bytes starting at p_vaddr to register.
access_rights
[in] Transport access permissions granted the virtual address range
being registered.
*p_rdma_desc
[out] valid, ready to use RDMA descriptor.
ph_vmem
[out] Location of memory handle assigned to registered memory range.
RETURN VALUE
TI_SUCCESS
The operation completed successfully.
TI_INSUFFICIENT_RESOURCES
There are not enough internal library resources available to complete
the request. The request may be retried after resources become
available.
TI_INSUFFICIENT_MEMORY
There was insufficient system memory available to allocate required
resources to perform an operation. Additional system memory must be
made available before the operation is retried.
TI_ERROR
A catastrophic error has occurred that could not be handled by the
library.
NOTES
Upon a TI_SUCCESS return value, the registered virtual memory range VM
translations have been locked (pinned/wired) by the operating system so
the physical pages will not change underneath the VM pages. Similar to
preparing memory for DMA device operations.
SEE ALSO
create_pd, msg_send, rdma_write, ti_access_t
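EXAMPLE
A minimal sketch of registering a virtual buffer, assuming the sdp_ti.h
declarations; TI_ACCESS_READ_WRITE is an assumed ti_access_t value used
only for illustration.
#include "sdp_ti.h"
ti_status_t register_buffer(
    ti_handle_t     h_ti,
    ti_pd_handle_t  h_pd,
    void            *p_buf,
    uint32_t        len,
    ti_rdma_desc_t  *p_desc,    /* receives a ready-to-use RDMA descriptor */
    ti_mem_handle_t *ph_mem )
{
    /* Pins (locks) the underlying pages; p_buf need not be page aligned. */
    return reg_virt_mem( h_ti, h_pd, p_buf, len,
                         TI_ACCESS_READ_WRITE,  /* assumed flag name */
                         p_desc, ph_mem );
}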
NAME
reg_phys_mem -- Register physical memory for data transfer operations.
DESCRIPTION
This function allows the user to register a list of physical addresses that
comprise a single Virtual memory address range/region. The memory region
can be of an arbitrary size and alignment; however, all physical addresses
in the list must be page aligned and the actual area of memory registered
is on a VM page-size granularity. Registered VM pages are locked into
physical memory, and the contents of the memory region are not altered.
Memory is registered for use only with the protection domain specified
through the 'h_pd' handle parameter. The access_rights parameter specifies
the type of access permitted to the memory region using the returned
RDMA descriptor. Memory access rights may be read-only, write-only, or
read-write. ph_pmem parameter will be set if the memory can be registered.
SYNOPSIS
ti_status_t
reg_phys_mem(
IN ti_handle_t h_ti,
IN ti_pd_handle_t h_pd,
IN uint64_t *p_phys_page_addr_array,
IN uint32_t num_pages,
IN ti_access_t access_rights,
IN OUT ti_rdma_desc_t *p_rdma_desc,
OUT ti_mem_handle_t *ph_pmem );
PARAMETERS
h_ti
[in] transport service instance handle
h_pd
[in] A protection domain handle for this transport service instance.
*p_phys_page_addr_array
[in] Location of an array of physical page addresses to register. Each
entry (physical page address) in the array should be a uint64_t
(IA-64 platform support) containing the starting address of a physical
page in memory. Array entries are not required to be physically
contiguous pages.
num_pages
[in] Number of physical page addresses in the array.
access_rights
[in] Access permission to the Physical address being registered.
*p_rdma_desc
[in] RDMA descriptor: p_rdma_desc.vaddr is the requested starting
virtual address assigned to the physical page range, possibly
page aligned (round-down) in the case of a page offset.
[out] RDMA descriptor:
p_rdma_desc.vaddr is the OS assigned virtual address including any
page offset from the requested VM start.
p_rdma_desc.rkey is the assigned memory region key.
ph_pmem
[out] Location of memory handle assigned to registered physical
address range.
RETURN VALUE
TI_SUCCESS
The operation completed successfully.
TI_INSUFFICIENT_RESOURCES
There are not enough internal library resources available to complete
the request. The request may be retried after resources become
available.
TI_INSUFFICIENT_MEMORY
There was insufficient system memory available to allocate required
resources to perform an operation. Additional system memory must be
made available before the operation is retried.
TI_ERROR
A catastrophic error has occurred that could not be handled by the
library.
NOTES
VM memory translations for the VM address range are locked until the
range is deregistered.
SEE ALSO
dereg_mem, ti_rdma_desc_t, msg_send, rdma_write
NAME
dereg_mem -- Deregister memory previously registered
DESCRIPTION
This function deregisters a memory region previously registered through the
reg_phys_mem() or reg_virt_mem calls. The memory handle assigned to the
region is invalid following a successful return from this function.
Unregistered memory may not be used in data transfers over the fabric. The
caller of this function should ensure that there are no pending operations
on the memory region that require it to be registered, such as an
outstanding data transfer.
SYNOPSIS
ti_status_t
dereg_mem( IN ti_mem_handle_t h_pmem );
PARAMETERS
h_pmem
[in] Valid registered physical/virtual memory handle
RETURN VALUE
TI_SUCCESS
The operation completed successfully.
TI_INVALID_HANDLE
Invalid memory handle.
TI_ERROR
A catastrophic error has occurred that could not be handled by the
library.
SEE ALSO
reg_phys_mem, reg_virt_mem
NAME
res_pool_create -- Create a message resource pool filled with data elements
DESCRIPTION
This function creates a message resource pool filled with data elements.
A data element (ti_data_element_t) is required to perform data transfer
operations using transport services (send/recv/rdma_read/rdma_write). The
resource pool manages the allocation, registration, and distribution of
data elements and buffers. If this call is successful, a handle to
the newly created resource pool is returned. Multiple pools may be created
for a single transport service instance.
The resource pool type identifies all data elements to be either RDMA,
or buffered. RDMA implies the underlying transport hardware will access data
buffer memory utilizing RDMA read or write operations. Buffered type
indicates the underlying transport hardware will access data buffer memory
in a non-RDMA fashion (local DMA or processor). The main reason for
specifying RDMA or buffered access type is to provide the underlying
accelerated transport layer with advanced knowledge of the access method,
hence allowing the transport to setup transport specific access attributes
or allocations.
If a context_size greater than zero is specified, then the pool manager
allocates a user-controlled context buffer for each element. A context
buffer may be used to store context information with each element and is
referenced by the p_context field of the element structure. If context_size
is set to zero, then the p_context field is set to NULL and may be
overridden by the user. The context pointer is always a part of the element
structure, the value of the pointer is not dereferenced by anyone but the
resource pool user. Therefore, the context pointer itself can be used as data in
the case where 'context_size == 0'.
SYNOPSIS
ti_status_t
res_pool_create(
IN ti_handle_t ph_ti,
IN ti_res_pool_type_t pool_type,
IN uint32_t element_cnt,
IN uint32_t bufs_per_element,
IN uint32_t *buf_size_array,
IN uint32_t context_size,
OUT ti_res_pool_handle_t *ph_pool );
PARAMETERS
*ph_ti,
[in] address of the TI service instance handle
pool_type
[in] Type of elements in this pool: RDMA or Buffered. Type RDMA implies
the memory specified in the element data buffer memory will be RDMA
read or written. Type 'Buffered' indicates element data buffer memory
will be read, written or both by the underlying transport hardware in
a non-RDMA fashion (local DMA or processor access).
element_cnt
[in] Number of elements allocated and managed by the resource pool.
Parameter must be > 0 or a TI_INVALID_PARAMETER is returned.
bufs_per_element,
[in] The number of buffers to allocate per element. Multiple buffers per
element are used to support scatter-gather data transfers. Parameter
must be greater than zero.
*buf_size_array,
[in] An array of buffer sizes, in bytes, of each data buffer allocated
per element by the pool. Allocated buffers are transport registered
[reg_virt_mem] and ready to use on a connected transport service. The
'buf_size_array' must contain 'bufs_per_element' entries.
A 'buf_size_array' entry may be zero, which allows users to provide
their own transport registered (reg_virt_mem, reg_phys_mem) data
buffers at a later time.
context_size,
[in] Size, in bytes, of a user-defined data section allocated for each
data element in the resource pool. This context memory can be used to
store user-defined context information. If no context information is
required, then specify zero as the size. Each pool element always has
the context pointer field 'p_context' allocated. The context pointer is
not dereferenced by any TI routines, therefore you always have
sizeof(void*) bytes of context in the context pointer itself. When the
'context_size' is > 0, then 'p_context' is a pointer to allocated
context memory. Allocated context memory is released in the
res_pool_destroy call.
*ph_pool
[out] Location of the handle for the newly created resource pool.
RETURN VALUE
TI_SUCCESS
The operation completed successfully.
TI_INVALID_PARAMETER
out of range parameter.
TI_INSUFFICIENT_RESOURCES
There was insufficient system memory available to allocate required
resources to perform an operation. Additional system memory must be
made available before the operation is retried.
TI_FERROR
A catastrophic error has occurred that could not be handled by the
library.
NOTES
When sizing buffers, be aware of optimal buffer alignment w.r.t. cache line
size (BIG performance win). When pool buffers are allocated by this call,
all data buffers are initially allocated from one large memory allocation
then divided into smaller chunks of 'buffer_size' which are assigned to
each data buffer descriptor.
SEE ALSO
msg_send, rdma_write, res_pool_get, res_pool_destroy
reg_virt_mem, reg_phys_mem
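EXAMPLE
A minimal sketch of creating a buffered resource pool with two registered
data buffers per element, assuming the sdp_ti.h declarations;
TI_RES_POOL_BUFFERED is an assumed ti_res_pool_type_t enumerator name.
#include "sdp_ti.h"
ti_status_t make_msg_pool( ti_handle_t h_ti, ti_res_pool_handle_t *ph_pool )
{
    /* Two buffers per element: a small header plus a payload buffer. */
    uint32_t buf_sizes[2] = { 64, 8192 };
    return res_pool_create( h_ti,
                            TI_RES_POOL_BUFFERED,   /* assumed name */
                            128,            /* element_cnt      */
                            2,              /* bufs_per_element */
                            buf_sizes,      /* buf_size_array   */
                            0,              /* context_size: rely on the
                                               p_context pointer itself */
                            ph_pool );
}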
NAME
res_pool_destroy -- Destroy a resource pool
DESCRIPTION
res_pool_destroy will destroy all elements in a resource pool, including
the release of all allocated data buffers and contexts.
All elements MUST BE RETURNED to the pool in order for this operation to
be successful otherwise TI_RESOURCE_BUSY is returned. This is not a FATAL
error, although you have just leaked system memory!
If any data buffers are allocated to a pool, then any data buffer memory
referenced by returned pool elements will be deregistered and released
before this call returns. Be careful of what you put back in the pool before
you call res_pool_destroy.
SYNOPSIS
ti_status_t
res_pool_destroy ( IN ti_res_pool_handle_t ph_pool);
PARAMETERS
ph_pool
[in] valid resource pool handle
RETURN VALUE
TI_SUCCESS
The operation completed successfully.
TI_INVALID_HANDLE
ph_pool is NOT a resource pool handle.
TI_RESOURCE_BUSY
Specified resource pool had outstanding elements. Not all elements have
been returned to the pool, therefore the pool cannot be destroyed.
TI_FERROR
A catastrophic error has occurred that could not be handled by the
library.
SEE ALSO
res_pool_create, res_pool_get, res_pool_put
NAME
res_pool_get -- retrieve a list of message elements from a resource pool
DESCRIPTION
This function retrieves a list of elements from a resource pool. The
function call interface allows both synchronous and asynchronous modes of
operation. Selection of the operating mode is controlled through the
pf_res_callback input parameter. Whether the get function operates
asynchronously (with a callback) or synchronously, it completes a
ti_pool_response_t structure when fulfilling a request.
The response parameter may be provided by the user for synchronous
operation, or is passed as the single input parameter to the callback
function for asynchronous operation. For information pertaining to the
contents of the response structure, see 'ti_pool_response_t' definition.
Synchronous operation is requested with a valid *p_res_pool_resp
parameter and a NULL Callback function pointer in the request. In this mode,
an attempt is made to retrieve elements from the free pool. If sufficient
elements are available, they are returned as part of the response,
TI_SUCCESS status is returned. If the free pool is empty or unable to
complete the request, TI_INSUFFICIENT_RESOURCES is returned.
Note that with synchronous operation, if partial_allocation_OK is specified,
the number of elements returned may be less than the number specified.
The user is responsible for checking the number of elements returned
against the number specified and for taking any appropriate action.
In this mode, a request for the remaining elements will not be queued by
the resource pool.
Asynchronous operation is performed by specifying a NULL p_res_pool_resp
parameter and a valid pf_res_callback function pointer in the request.
In this async mode, an attempt is made to retrieve elements from the free
pool. If sufficient elements are available, they are immediately returned
to the user through the callback function. In this case, the function will
return TI_SUCCESS if all elements requested were returned or pending status
if only part of the elements were returned. Queued requests are serviced
in first-in, first-out order by the resource pool. When additional elements
become available, the request at the head of the queue is examined and its
callback function is invoked when the request can be fulfilled. The request
remains at the head of the queue until the user has received all of the
elements for that request. Note that this implies that requests with
partial_allocation_OK set to TRUE can receive multiple responses.
The p_res_pool_resp and callback input parameters are mutually exclusive;
one or the other must be specified, but not both. A TI_INVALID_SETTING
error is returned if both or neither are provided. Note that if any
asynchronous requests are pending, synchronous function calls will always
return with a TI_INSUFFICIENT_RESOURCES status.
SYNOPSIS
ti_status_t
res_pool_get(
IN ti_res_pool_handle_t h_pool,
IN uint32_t element_count,
IN boolean_t partial_allocation_OK,
IN ti_res_pool_cb_t pf_res_callback,
IN void *p_context1,
IN void *p_context2,
IN OUT ti_pool_response_t *p_res_pool_resp );
PARAMETERS
h_pool
[in] pointer to a resource pool handle
element_count
[in] Number of free elements to retrieve from the pool.
partial_allocation_OK
[in] (TRUE) signals whether all requested elements must be returned as a
single list or (FALSE) if multiple responses are allowed.
pf_res_callback
[in] Optional callback routine pointer. If supplied, then the callback
routine is executed when message elements are available. If <NULL>,
then the caller blocks until message elements are available with
parameter 'p_res_pool_resp' expected to be non-NULL.
*p_context1
[in] caller supplied context #1, returned in resource callback response
structure.
*p_context2
[in] caller supplied context #2, returned in resource callback response.
*p_res_pool_resp
[in]
NULL if pf_res_callback function pointer is non-NULL which indicates
the pf_res_callback function will return the res_pool_response.
Otherwise, a valid res_pool_response pointer and pf_res_callback
must be NULL. pf_res_callback & p_res_pool_resp are mutually exclusive.
[out]
NULL if pf_res_callback function pointer is non-NULL which indicates
the pf_res_callback function will return the res_pool_response.
Otherwise, 'p_res_pool_resp' points to a resource pool response
struct which contains an operation status indicator, along with a
list of message elements allocated on an TI_SUCCESS status return.
RETURN VALUE
TI_SUCCESS
The operation successfully queued. See res_pool_response 'status' for
the actual get operation status.
TI_INVALID_SETTING
Out of range parameters or specification of pf_res_callback with
non-null p_res_pool_resp.
TI_INSUFFICIENT_RESOURCES
There was insufficient system memory available to allocate required
resources to perform an operation. Additional system memory must be
made available before the operation is retried.
TI_FERROR
A catastrophic error has occurred that could not be handled by the
library.
SEE ALSO
res_pool_create, ti_pool_response_t, res_pool_destroy
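EXAMPLE
A minimal sketch of a synchronous element get, assuming the sdp_ti.h
declarations and a FALSE boolean macro; the list of returned elements and
the operation status are read from the ti_pool_response_t fields.
#include "sdp_ti.h"
ti_status_t get_elements_sync(
    ti_res_pool_handle_t h_pool,
    uint32_t             count,
    ti_pool_response_t   *p_resp )
{
    /* A NULL callback with a non-NULL response pointer selects synchronous
     * operation; TI_INSUFFICIENT_RESOURCES is returned if the free pool
     * cannot satisfy the request right now. */
    return res_pool_get( h_pool,
                         count,
                         FALSE,     /* partial_allocation_OK */
                         NULL,      /* pf_res_callback       */
                         NULL,      /* p_context1            */
                         NULL,      /* p_context2            */
                         p_resp );
}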
NAME
res_pool_put -- return free message elements back to a resource pool
DESCRIPTION
This function returns a list of elements and associated data buffers back
to its respective resource pool. The head of the element list is referenced
by the p_element_list parameter. All elements returned in the list must
belong to a single resource pool. Multiple res_pool_put() calls are
required to return elements to different resource pools. When elements are
returned to a resource pool, a check is performed to determine if there are
any pending res_pool_get() requests. If pending requests exist, and
sufficient resources were returned to satisfy one or more of them, then the
pending requests are fulfilled using the same thread context. The elements
are delivered to the requester by invoking the callback function specified
in the original request. If there are no pending requests, the elements are
simply returned to the free pool.
SYNOPSIS
ti_status_t
res_pool_put(
IN ti_res_pool_handle_t h_pool,
IN ti_data_element_t *p_delist );
PARAMETERS
h_pool
[in] resource pool handle
*p_delist
[in] data element list, >= 1 in length.
RETURN VALUE
TI_SUCCESS
The operation successfully completed.
TI_INSUFFICIENT_RESOURCES
There was insufficient system memory available to allocate required
resources to perform an operation. Additional system memory must be
made available before the operation is retried.
TI_FERROR
A catastrophic error has occurred that could not be handled by the
library.
SEE ALSO
res_pool_create, res_pool_get, res_pool_destroy
NAME
msg_send -- Post messages to the send queue of a connected transport service
DESCRIPTION
This function posts a list of messages on the send queue of a connected
transport service. If one or more requests are queued ahead of the
send waiting to be transferred, TI_PENDING is returned to the user.
Otherwise, TI_SUCCESS is returned, indicating that the request has been
queued to the hardware for processing. Before a message can be posted,
data element(s) must be obtained from a message resource pool, and all data
buffers referenced by the request must be registered with the transport
[reg_virt_mem]. For data buffers created by the message resource pool,
[res_pool_create] this is done automatically. User-managed buffers require
that the buffer memory must be registered [reg_virt_mem] before sending any
messages.
The message send operation is strictly asynchronous. A callback will be
invoked after the operation completes. The format of the callback function
is defined under [ti_xfer_cb_t]. To provide lower latency per request,
the completion callback is invoked immediately after any completion. This
immediacy might result in multiple completion callbacks per msg_send()
operation, versus providing a single callback only after all messages
referenced by p_delist have completed.
SYNOPSIS
ti_status_t
msg_send(
IN ti_conn_handle_t h_conn,
IN ti_data_element_t *p_delist );
PARAMETERS
h_conn
[in] pointer to a Handle for an existing data transfer service.
*p_delist
[in] Location of a list of data elements comprising message data
transfer requests. This parameter must point to valid data elements
obtained from a transport resource pool. All data buffers referenced
by the data elements must have already been transport registered
[reg_virt_mem, reg_phys_mem].
RETURN VALUE
TI_SUCCESS
The operation successfully queued or completed.
TI_PENDING
A request to the transport cannot be completed immediately and has been
queued for later processing.
TI_INSUFFICIENT_RESOURCES
There was insufficient system memory available to allocate required
resources to perform an operation. Additional system memory must be
made available before the operation is retried.
TI_FERROR
A catastrophic error has occurred that could not be handled by the
library.
SEE ALSO
reg_virt_mem, reg_phys_mem, ti_data_element_t, ti_xfer_cb_t
res_pool_get, ti_conn_handle_t
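EXAMPLE
A minimal sketch of posting a message list, assuming the sdp_ti.h
declarations; TI_PENDING simply means the request was queued behind
earlier transfers and is not an error.
#include "sdp_ti.h"
ti_status_t post_send( ti_conn_handle_t h_conn, ti_data_element_t *p_delist )
{
    ti_status_t status;
    /* p_delist elements must come from a resource pool and reference
     * transport registered buffers [reg_virt_mem, reg_phys_mem]. */
    status = msg_send( h_conn, p_delist );
    if ( status == TI_SUCCESS || status == TI_PENDING )
        return TI_SUCCESS;      /* completion arrives via ti_xfer_cb_t */
    return status;
}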
NAME
msg_recv -- Post receive messages to the receive queue of a connection
DESCRIPTION
This function posts messages (data elements) to the receive queue of a
connected transport service [ti_conn_handle_t]. Before a message can be
posted, data element(s) must be obtained from a message resource pool, and
all data buffers referenced by this request must be transport registered
with the transport service. For data buffers created by the transport
service [res_pool_create], this is done automatically when the resource
pool was created. User-managed buffers require that the user first transport
register the memory before sending the message.
The message receive operation is strictly asynchronous. A callback is
invoked after the operation completes. The format of the callback function
is defined in [ti_xfer_cb_t]. To provide lower latency per request,
the completion callback is invoked immediately after any completion. This
immediacy may result in multiple completion callbacks per msg_recv()
operation, versus providing a single callback only after all messages
referenced by p_delist have completed.
SYNOPSIS
ti_status_t
msg_recv(
IN ti_conn_handle_t h_conn,
IN ti_data_element_t *p_delist );
PARAMETERS
h_conn
[in] pointer to a Handle for an existing data transfer connection.
*p_delist
[in] Location of a list of data elements comprising message receive
requests. This parameter must point to valid data elements obtained
from a transport resource pool. All data buffers referenced by the
data elements must have already been transport registered.
RETURN VALUE
TI_SUCCESS
The operation successfully queued.
TI_FERROR
A catastrophic error has occurred that could not be handled by the
library.
SEE ALSO
reg_virt_mem, reg_phys_mem, ti_data_element_t, ti_xfer_cb_t, res_pool_get
NAME
rdma_read -- Initiate an RDMA read data transfer operation
DESCRIPTION
This function initiates RDMA read data transfer operations over a connected
transport connection. Multiple data buffers may be submitted as a single
operation. An RDMA read request copies data from a remote memory location
into local memory. rdma_read() may be issued only on a reliable,
connection-oriented data transfer service. Before an RDMA request can be
posted to a transport service, RDMA data elements must be obtained from a
transport resource pool [res_pool_get]. All data buffers referenced by the
data element must be registered [reg_virt_mem, reg_phys_mem] with the
transport. Once a request has been submitted, its completion is tracked by
the connection and associated RDMA completion callback routine. The
request is considered completed only after all data buffers have been
successfully read. RDMA operations are strictly asynchronous. A callback
routine is invoked after the operation completes. The format of the
callback function is defined in [ti_xfer_cb_t]. Each data element returned
to the callback routine has an operation status (RDMA read in this case).
The success or failure of an RDMA read operation is determined by the
data element status.
To provide lower latency per request, the completion callback is invoked
immediately after any completion. This immediacy might result in multiple
completion callbacks per rdma_read() operation, versus providing a single
callback only after all messages referenced by p_rdma_list have completed.
SYNOPSIS
ti_status_t
rdma_read(
IN ti_conn_handle_t h_conn,
IN ti_data_element_t *p_rdma_list );
PARAMETERS
h_conn
[in] Handle to an existing established/connected transport connection.
*p_rdma_list
[in] Location of a list of RDMA data elements comprising message data
read requests. This parameter must point to valid data elements
obtained from a transport resource pool. All data buffers referenced by
the data elements must have already been transport registered
[reg_virt_mem, res_pool_get].
RETURN VALUE
TI_SUCCESS
The operation successfully queued.
TI_FERROR
A catastrophic error has occurred that could not be handled by the
library.
SEE ALSO
ti_xfer_cb_t, ti_data_element_t, res_pool_create, res_pool_get
NAME
rdma_write -- Initiates an RDMA write data transfer operation
DESCRIPTION
This function initiates RDMA write data transfer operations over a
connected transport connection. Multiple data buffers may be submitted as a
single operation. An RDMA write request copies data from transport
registered local memory locations into remote memory. RDMA writes may be
issued only over connection-oriented data transfer services. Before an RDMA
request can be posted to a transport data service, RDMA data elements must
be obtained from a RDMA resource pool, and all data buffers referenced by a
request must be transport registered [reg_virt_mem, res_pool_get].
For an
RDMA operation to successfully transfer immediate data over the fabric, a
message buffer must be waiting on the remote side of a connection to receive
the incoming data. To remove the restriction of having every RDMA operation
transfer immediate data over a connection exposing immediate data, the
library recognizes the 32-bit value 0xFFFFFFFF as meaning that no immediate
data should be sent with a given RDMA operation.
Once a request has been submitted, its completion is tracked by the
data transfer completion service. The request is considered complete only
after all data buffers have been successfully written. RDMA operations are
strictly asynchronous. A callback will be invoked after the operation
completes. The format of the callback function is defined in
[ti_xfer_cb_t].
To provide lower latency per request, the RDMA completion callback routine
is invoked immediately after any completion. This immediacy might result in
multiple completion callbacks per rdma_write() operation, versus providing
a single callback only after all messages referenced by p_rdma_list have
completed.
SYNOPSIS
ti_status_t
rdma_write(
IN ti_conn_handle_t h_conn,
IN ti_data_element_t *p_rdma_list );
PARAMETERS
h_conn
[in] Handle to an existing connected data transfer connection.
*p_rdma_list
[in] Location of a list of RDMA data elements comprising message data
write requests. This parameter must point to valid data elements
obtained from a transport resource pool [res_pool_get]. All data
buffers referenced by the data elements must have already been transport
registered [reg_virt_mem, res_pool_get].
RETURN VALUE
TI_SUCCESS
The operation successfully queued.
TI_PENDING
A request to the library cannot be completed immediately and has been
queued for later processing.
TI_FERROR
A catastrophic error has occurred that could not be handled by the
library.
SEE ALSO
ti_xfer_cb_t, ti_data_element_t, res_pool_create, res_pool_get
reg_virt_mem, reg_phys_mem
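EXAMPLE
A minimal sketch of posting an RDMA write list, assuming the sdp_ti.h
declarations; how the remote ti_rdma_desc_t (vaddr/rkey obtained from the
peer) is attached to each element depends on the ti_data_element_t layout
and is not shown here.
#include "sdp_ti.h"
ti_status_t post_rdma_write(
    ti_conn_handle_t  h_conn,
    ti_data_element_t *p_rdma_list )
{
    ti_status_t status = rdma_write( h_conn, p_rdma_list );
    /* TI_PENDING: queued behind earlier requests; completion and the
     * per-element status are reported through ti_xfer_cb_t. */
    return ( status == TI_PENDING ) ? TI_SUCCESS : status;
}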
NAME
rdma_read_send -- RDMA read with message completion indication
DESCRIPTION
Initiates a list of RDMA read operations followed by a single message send
that is posted only after all of the RDMA read requests have completed
successfully, providing the remote endpoint with a completion indication.
SYNOPSIS
ti_status_t
rdma_read_send(
IN ti_conn_handle_t h_conn,
IN ti_data_element_t *p_rdma_read_list,
IN ti_data_element_t *p_msg);
PARAMETERS
h_conn
[in] Handle to an existing connected transport service.
*p_rdma_read_list
[in] Location of a list of RDMA message elements comprising RDMA read
requests. This parameter must point to valid message elements
obtained from a library resource pool.
*p_msg
[in] Message sent after RDMA requests have ALL completed successfully.
RETURN VALUE
TI_SUCCESS
The operation successfully queued.
TI_PENDING
A request to the library cannot be completed immediately and has been
queued for later processing.
TI_FERROR
A catastrophic error has occurred that could not be handled by the
library.
SEE ALSO
rdma_write_send, ti_data_element_t, rdma_read, ti_conn_handle_t
res_pool_get
NAME
rdma_write_send -- RDMA write with message completion indication
DESCRIPTION
Initiates a list of RDMA write operations followed by a single message send
that is posted only after all of the RDMA write requests have completed
successfully, providing the remote endpoint with a completion indication.
SYNOPSIS
ti_status_t
rdma_write_send(
IN ti_conn_handle_t h_conn,
IN ti_data_element_t *p_rdma_write_list,
IN ti_data_element_t *p_msg);
PARAMETERS
h_conn
[in] Handle to an existing connected transport service.
*p_rdma_write_list
[in] Location of a list of RDMA message elements comprising RDMA write
requests. This parameter must point to valid data elements obtained
from a transport resource pool [res_pool_get].
*p_msg
[in] Message sent after RDMA requests have ALL completed successfully.
RETURN VALUE
TI_SUCCESS
The operation successfully queued.
TI_PENDING
A request to the library cannot be completed immediately and has been
queued for later processing.
TI_FERROR
A catastrophic error has occurred that could not be handled by the
library.
SEE ALSO
rdma_read_send, ti_data_element_t, rdma_read, ti_conn_handle_t
res_pool_get
NAME
atomic_op -- perform remote atomic operations
DESCRIPTION
Performs the specified atomic operation [ti_atomic_op_t] on remote memory
over a connected transport service. The supplied data element describes the
remote data (RDMA descriptor) on input and the local registered data on
output.
SYNOPSIS
ti_status_t
atomic_op( IN ti_conn_handle_t h_conn,
IN ti_atomic_op_t operation,
IN OUT ti_data_element_t *p_data );
PARAMETERS
h_conn
[in] Handle to an existing connected transport connection.
operation
[in] atomic operation identifier [ti_atomic_op_t]
*p_data
[in] pointer to a data element that describes the remote data (RDMA
descriptor, if valid).
[out] pointer to a data element which describes the local registered data.
RETURN VALUE
TI_SUCCESS
The operation successfully queued.
TI_INVALID_HANDLE
Bad handle to a connected transport service.
TI_INVALID_PARAMETER
Bad atomic operation type
TI_FERROR
A catastrophic error has occurred that could not be handled by the
library.
SEE ALSO
ti_atomic_op_t, ti_data_element_t
NAME
io_ctl -- Transport I/O control
DESCRIPTION
Transport I/O control interface designed to support those transport
functions not covered by the existing operation definitions.
SYNOPSIS
ti_status_t
io_ctl(
IN ti_handle_t h_ti,
IN ti_ioctl_t cmd,
IN void *p_data_in,
IN OUT void *p_data_out );
PARAMETERS
h_ti
[in] address of a transport service instance handle
cmd
[in] ioctl command value: see 'ti_ioctl_t' enum.
*p_data_in
[in] ioctl command data
*p_data_out
[out] ioctl command outputs
RETURN VALUE
TI_SUCCESS
The operation completed successfully.
TI_INVALID_HANDLE
Bad transport service handle.
TI_INSUFFICIENT_RESOURCES
There are not enough internal library resources available to complete
the request. The request may be retried after resources become
available.
TI_INVALID_STATE
A service of the library is not in a valid state on which to perform the
specified operation.
TI_INVALID_SETTING
One of the parameters has an invalid or unsupported setting. This status
usually signals that a specified value is outside of a given range.
SEE ALSO
ti_ioctl_t
NAME
ti_transport_desc_t
PURPOSE
Transport attributes description
DESCRIPTION
This structure is returned as an event ID specific data argument to the
events_register() event notification callback routine when the event ID is
defined to be 'TI_NETDEV_ONLINE'.
SYNOPSIS
typedef struct _transport_desc
{
net_device_t *dev;
ti_inet_addr_t *IP_addr;
ti_inet_addr_t *IP_netmask;
ti_inet_addr_t *first_hop_IP_route_addr;
} ti_transport_desc_t;
FIELDS
dev
transport link device pointer
IP_addr
Primary IP address assigned to the link device
IP_netmask
IP network mask
first_hop_IP_route_addr
IP address of the first hop when routing
SEE ALSO
events_register, events_deregister, ti_event_t
NAME
ti_inet_route_event_t
PURPOSE
IP address referenced when the Inet route database changed.
DESCRIPTION
This structure is returned as an argument to the events_register() callback
routine when the event is defined to be any of the events that involve
changes to the Internet routing database (TI_EVENT_RTF_*).
SYNOPSIS
typedef struct _inet_route_event_
{
ti_inet_addr_t ipaddr;
net_device_t *dev;
} ti_inet_route_event_t;
FIELDS
ipaddr
IP address associated with the IP route database change
dev
network interface device instance pointer where IP address is assigned.
SEE ALSO
events_register, ti_event_t
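The sketch below shows how an events_register() callback might consume this
structure. The event name TI_EVENT_RTF_ADD is an assumed member of the
TI_EVENT_RTF_* family, and the ip4_addr field follows the usage in the sample
code later in this document.
#if 0
#include <linux/net.h>
#include "sdp_ti.h"
// Hypothetical sketch: react to an Internet routing database change.
void
sample_route_change( IN ti_event_t event, IN void *data, IN void *context )
{
ti_inet_route_event_t *p_rt = (ti_inet_route_event_t *)data;
char buf[16];
if ( event == TI_EVENT_RTF_ADD )  // assumed TI_EVENT_RTF_* value
{
// note which link device and IP address the route change refers to
printk( "route change on %s (%s)\n",
p_rt->dev->name,
mk_inet_addr_p( (unchar *)&p_rt->ipaddr.ip4_addr, buf ) );
}
}
#endif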
NAME
SDP_register_transport -- Register the Transport Interface with a protocol.
PURPOSE
A Protocol provider exported interface which provides the Transport
Interface layer the ability to register with a protocol provider.
The registration process includes a TI event notification routine,
which delivers Transport Interface (TI) specific events to the protocol
provider (SDP).
DESCRIPTION
All AF_INET_OFFLOAD protocol providers are expected to export transport
registration and deregistration entry points which enable the protocol
provider to retrieve the transport operations vector (ti_ops_t).
TI (Transport Interface) events will be enabled at this juncture, hence the
supplied ti_event_notify routine will be called when TI events occur.
TI events encompass TI shutdown, restart, etc.
NOTES
Do not confuse TI events with transport events. Transport events
[events_register] are transport provider specific (link up/down etc.). Here
we deal with TI (Transport Interface) layer events only (TI shutdown, etc).
SYNOPSIS
void
SDP_register_transport( ti_ops_t *ops, ti_event_cb_t ti_event_notify );
PARAMETERS
[in] *ops
Transport provider supported operations vector.
[in] ti_event_notify
Function which is called by the Transport Interface to notify a protocol
of the occurrence of a TI event.
SEE ALSO
ti_ops_t, ti_event_cb_t, ti_event_t
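A sketch of how a transport provider module might register. 'my_ops' and
'my_ti_event_cb' are hypothetical names, and the callback signature is assumed
to follow the ti_event_cb_t form used by the sample callbacks later in this
document.
#if 0
#include <linux/init.h>
#include "sdp_ti.h"
// Hypothetical sketch: transport provider registration at module init time.
static ti_ops_t my_ops;   // populated with the provider's entry points
static void
my_ti_event_cb( ti_event_t event, void *data, void *context )
{
// handle TI layer events (TI shutdown, restart, ...)
}
static int __init
my_transport_init( void )
{
SDP_register_transport( &my_ops, my_ti_event_cb );
return 0;
}
#endif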
NAME
SDP_deregister_transport -- de-register the TI from a protocol provider.
PURPOSE
Protocol provider exported interface which allows the Transport Interface
to de-register itself from a protocol provider.
DESCRIPTION
TI event notification is canceled along with de-registering the Transport
Interface from the protocol provider.
SYNOPSIS
void
SDP_deregister_transport( ti_ops_t *ops, ti_event_cb_t ti_event_notify );
NOTES
Include TI/ti_exports.h
PARAMETERS
Exact same arguments as used in SDP_register_transport().
[in] *ops
Transport provider supported operations vector.
[in] ti_event_notify
Function which is called by the Transport Interface to notify a protocol
of the occurrence of a TI event.
SEE ALSO
ti_ops_t, ti_event_cb_t, ti_event_t
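Continuing the registration sketch above, the matching de-registration would
typically happen at module unload time.
#if 0
#include <linux/init.h>
#include "sdp_ti.h"
// Hypothetical sketch: undo the registration shown in the previous example.
static void __exit
my_transport_exit( void )
{
SDP_deregister_transport( &my_ops, my_ti_event_cb );
}
#endif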
NAME
ti_event_to_str -- translate a TI event code to a descriptive string.
DESCRIPTION
Generate a human readable descriptive string which represents the TI event.
SYNOPSIS
char *
ti_event_to_str( ti_event_t event, void *data, char *buf, u32 buf_len );
PARAMETERS
[in] event
TI event code.
[in] data
Event specific data.
[in/out] *buf
Address of where the descriptive string is written
[in] buf_len
Max length, in bytes, of the memory 'buf' points at.
NOTES
Include TI/ti_exports.h
SEE ALSO
ti_event_t
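Minimal usage sketch; the 128-byte buffer size is an arbitrary choice.
#if 0
#include <linux/kernel.h>
#include "TI/ti_exports.h"
// Hypothetical sketch: log an incoming transport event in readable form.
void
sample_log_event( ti_event_t event, void *data )
{
char buf[128];
printk( "TI event: %s\n",
ti_event_to_str( event, data, buf, sizeof(buf) ) );
}
#endif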
NAME
ti_status_to_str -- translate a TI status code to a descriptive string.
DESCRIPTION
Return the address of a string which describes the TI status code.
SYNOPSIS
char *
ti_status_to_str( const ti_status_t status );
PARAMETERS
[in] status
TI status code.
NOTES
Include TI/ti_exports.h
SEE ALSO
ti_status_t
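Minimal usage sketch for reporting a failed TI call; the caller-supplied
'who' string is illustrative only.
#if 0
#include <linux/kernel.h>
#include "TI/ti_exports.h"
// Hypothetical sketch: report a failed TI call with its descriptive string.
void
sample_log_status( const char *who, ti_status_t status )
{
printk( "%s failed: %s\n", who, ti_status_to_str( status ) );
}
#endif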
NAME
mk_inet_addr_p -- translate an INET (ipV4) address to a printable string.
DESCRIPTION
Return a string address which represents the IPv4 INET address.
SYNOPSIS
char *
mk_inet_addr_p( unchar *IPv4_addr, char *buf );
PARAMETERS
[in] *IPv4_addr
struct in_addr address
[in] *buf
Output buffer where the IP address string is written.
NOTES
It is assumed the supplied output buffer is large enough to accommodate a
broadcast IP address '255.255.255.255' (16 chars: 15 chars + <null>).
Include TI/ti_exports.h
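Minimal usage sketch; the 16-byte buffer matches the broadcast-address
requirement noted above, and the ip4_addr field follows the usage in the
sample code later in this document.
#if 0
#include <linux/kernel.h>
#include "TI/ti_exports.h"
// Hypothetical sketch: print the IPv4 address held in a ti_inet_addr_t.
void
sample_print_addr( ti_inet_addr_t *p_addr )
{
char buf[16];  // '255.255.255.255' plus terminating null
printk( "IP address %s\n",
mk_inet_addr_p( (unchar *)&p_addr->ip4_addr, buf ) );
}
#endif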
NAME
strdup -- create an exact copy of a null byte terminated string.
DESCRIPTION
Return a string address which is a new copy of the input string.
SYNOPSIS
char *
strdup( const char *string );
PARAMETERS
[in] *string
Address of the null-terminated string to copy.
NOTES
kmalloc() is used to allocate memory for the new string.
Include TI/ti_exports.h
NAME
sample_TI_bind -- bind a protocol to a transport
PURPOSE
A protocol requests the Transport Interface to bind the specified link
device to the calling protocol.
DESCRIPTION
Transport binding equates to the delivery of transport events for the
specified link device. Event delivery is enabled via the event notification
callback routine parameter.
A protocol will receive transport events for those link devices which match
the input parameter 'pd->link_dev_name'. A link device name can be generic,
in that it contains no device instance number (e.g., 'eth'). A generic link
device name will match any instance of the device.
Transport events continue to be delivered to the protocol, via the callback
routine, until the events_deregister() routine is called. A protocol is
responsible for filtering out unwanted transport events.
PARAMETERS
INPUTS
pd protocol data pointer
OUTPUT
none.
SIDE EFFECTS
Transport event notification remains enabled until the TI interface
routine events_deregister() is called.
RETURN VALUE
TRUE - Accepted a bind request to the specified transport
FALSE - Rejected the transport specific bind request.
SEE ALSO
sample_TI_callback, events_deregister, events_register
SOURCE
#if 0
#include <linux/net.h>
#include "sdp_ti.h"
boolean_t
sample_TI_bind( IN proto_data_t *pd )
{
ti_status_t status;
pd->tp.bound = FALSE; // no transport bound at this juncture
pd->link_dev_name = "IBoIP";// absence of device instance # == all devices
// request the TI bind a Transport to this protocol.
status = ti_ops->events_register( "Sockets Direct Protocol",
pd->link_dev_name,
sample_TI_callback,
pd->event_context );
return (status == TI_SUCCESS ? TRUE : FALSE);
}
#endif // SOURCE
NAME
sample_TI_callback -- transport event notification callback routine
PURPOSE
Handle transport event notifications for a protocol provider.
DESCRIPTION
A protocol provider has specified this routine in an events_register() call.
The transport will invoke this routine when transport events occur.
PARAMETERS
INPUTS
event
Transport event identifier.
*data
Pointer to transport event specific data.
*context
events_register() supplied context input parameter.
OUTPUT
none.
SEE ALSO
sample_TI_bind, ti_event_t, ti_transport_desc_t, ti_inet_route_event_t
SOURCE
#if 0
#include <linux/net.h>
#include "sdp_ti.h"
void
sample_TI_callback( IN ti_event_t event,
IN void *data,
IN void *context )
{
proto_data_t *pd = context; // protocol data
ti_transport_desc_t *tpd;
switch ( event )
{
case TI_EVENT_NETDEV_ONLINE:
tpd = (ti_transport_desc_t *)data; // event specific data
// copy online Transport info to proto specific area.
*(ti_transport_desc_t *)&pd->tp.tpd = *tpd;
pd->tp.online = TRUE;
break;
case TI_*: // handle other transport events.
<...>
break;
default:
break;
}
}
#endif // SOURCE
NAME
sample_sock_init -- SAMPLE CODE
PURPOSE
Main focus here is to provide the global framework in which all of the other
examples operate; primarily the global data definitions required by the
examples.
DESCRIPTION
The operating environment is assumed to be the Linux kernel underneath
the socket layer at the AF_INET_OFFLOAD protocol level. The common input
arg is the socket and sock structures.
PARAMETERS
INPUTS
sock socket struct pointer
OUTPUT
none.
RETURN VALUE (boolean)
TRUE Successful initialization
FALSE Failed initialization
SIDE EFFECTS
sock struct member struct ap_opt initialized.
SOURCE
#if 0
#include <linux/net.h>
#include "sdp_ti.h"
// Transport specific
boolean_t transport_initialized = FALSE;
ti_ops_t *ti_ops = NULL; // set during transport initialization
// transport definitions
typedef struct _transport_data_ {
boolean_t online; // set from event callback SDP_ti_event()
ti_transport_desc_t tpd; // INET address information
} transport_data_t;
typedef struct _Protocol_Data_ {
char *link_dev_name; // set when transport registered with
// this protocol.
void *transport_context;
transport_data_t tp;
} proto_data_t;
proto_data_t ProtocolData; // Protocol specific data.
// ap_opt structure has been added to the 'sock' structure; see linux/sock.h
// SDP Connection states
#define SDP_CLOSED 1
#define SDP_PENDING 2
#define SDP_CONNECTING 3
#define SDP_CANCELED 4
#define SDP_CONNECTED 5
typedef struct _ap_private
{
int desired_rmsg_size;
} ap_private_t;
struct ap_opt
{
// transport exported operations vector
ti_ops_t *ti_ops;
// Connection states
int state;
// Handles to the TI resources provided by bound transport
ti_handle_t h_ti;
ti_conn_handle_t h_conn; // set in conn request callback routine
ti_conn_handle_t h_listen_conn; // set by listen()
ti_cq_handle_t h_cq;
ti_res_pool_handle_t h_msg_pool;
ti_res_pool_handle_t h_rdma_pool;
// Reference back to socket
struct socket *sk;
// Connect request information from incoming connections
ti_conn_request_t conn_req;
// Connection Information: endpoints, pd, private data, cb
ti_conn_info_t conn_info;
// Connection and framing attributes: Max tx/rx buffer resources,
// h/w flow control and automatic path migration
ti_conn_attr_t conn_attr;
// Message and RDMA attributes
int rmsg_size;
int rmsg_depth;
int smsg_size;
int smsg_depth;
int rdma_depth;
int max_msg_size;
int max_rdma_size;
// Recv Message queue
ti_data_element_t rmsgs;
atomic_t recv_q_lock;
// Private-data sent as part of connection request/reply.
ap_private_t private_data;
// wait queue heads
wait_queue_head_t send_waitq;
wait_queue_head_t recv_waitq;
};
// Data element send context, one of these is attached to each message element
// created in the resource pool (res_pool_create)
typedef struct _send_context
{
struct ap_opt *p_conn;
} send_context_t;
// Global Data Declarations
atomic_t ap_sockets_allocated;
//
// Initialize socket state and SDP protocol state
//
int sample_sock_init( IN struct sock *sk )
{
struct ap_opt *p_conn = &sk->tp_pinfo.af_ap;
ti_status_t status;
if ( transport_initialized == FALSE )
{
proto_data_t *pd = &ProtocolData;
// wait for transport to register with SDP
// global 'ti_ops' is set.
// Enable transport event delivery
status = ti_ops->events_register( (void *)pd, sample_TI_callback);
if ( status != TI_SUCCESS )
{
return -EPERM;
}
transport_initialized = TRUE;
}
p_conn->ti_ops = ti_ops; // transport interface operations dispatch vector
p_conn->h_conn = NULL;
p_conn->h_listen_conn = NULL;
p_conn->state = SDP_CLOSED;
init_waitqueue_head( &p_conn->send_waitq );
init_waitqueue_head( &p_conn->recv_waitq );
sk->state = TCP_CLOSE;
sk->use_write_queue = 0; // no write space callback
// Use new SDP defaults to size up private buffer pool
p_conn->rmsg_size = p_conn->smsg_size = sysctl_ap_msg_size;
p_conn->rmsg_depth = p_conn->smsg_depth = sysctl_ap_msg_depth;
sk->sndbuf = sk->rcvbuf = sysctl_ap_msg_size * sysctl_ap_msg_depth;
// MTU size based on RDMA threshold and Transport's message MTU
// Defaults can be reset at the bind based on the endpoint
atomic_inc(&ap_sockets_allocated);
return SUCCESS;
}
#endif // SOURCE
NAME
sample_bind -- SAMPLE CODE
PURPOSE
Create local endpoint information given an IPv4 address.
DESCRIPTION
The semantics of 'bind' here are the user-mode socket bind() flavor.
A transport endpoint descriptor is created to be used later in protection
domain and connection creation.
PARAMETERS
INPUTS
sk sock struct pointer
uaddr sockaddr struct pointer
addr_len length, in bytes, of the address
OUTPUT
p_conn local transport endpoint information updated.
RETURN VALUE
0 == SUCCESS
else negative errno code.
SOURCE
#if 0
#include <linux/net.h>
#include "sdp_ti.h"
int sample_bind(IN struct sock *sk,
IN struct sockaddr *uaddr,
IN int addr_len )
{
struct sockaddr_in *addr = (struct sockaddr_in *)uaddr;
struct ap_opt *p_conn = &sk->tp_pinfo.af_ap;
ti_inet_addr_t ti_addr;
unsigned short snum;
if (addr_len < sizeof(struct sockaddr_in))
return -EINVAL;
lock_sock(sk);
if ((sk->state != TCP_CLOSE) || (sk->num != 0))
{
release_sock(sk);
return (-EINVAL);
}
// need to create a transport service instance handle?
if ( !p_conn->h_ti )
{
// create a transport interface handle to communicate with the transport
if ( p_conn->ti_ops->create_ti( &p_conn->h_ti ) != TI_SUCCESS )
{
release_sock(sk);
return (-EINVAL);
}
}
// Initialize local addresses for socket and transport, AF_INET is IPv4
ti_addr.IP_addr_type = TI_ADDR_IPV4;
ti_addr.ip4_addr = addr->sin_addr.s_addr;
sk->rcv_saddr = sk->saddr = addr->sin_addr.s_addr;
// Create local endpoint for the transport
if ( p_conn->ti_ops->create_endpoint(
p_conn->h_ti,
&ti_addr,
ntohs(addr->sin_port),
&p_conn->conn_info.h_src) != TI_SUCCESS )
{
release_sock(sk);
return (-EADDRINUSE); // endpoint conflict, in use
}
// Create the protection domain for this local interface
if ( p_conn->ti_ops->create_pd( p_conn->h_ti,
p_conn->conn_info.h_src,
&p_conn->conn_info.pd ) != TI_SUCCESS )
{
// destroy endpoint and return
p_conn->ti_ops->destroy_endpoint( p_conn->conn_info.h_src );
release_sock(sk);
return (-ENOSR); // out of streams resources
}
// Set the max message and RDMA sizes here
// do we need a new get_attributes call or get them
// returned during create_ti????
// p_conn->max_msg_size;
// p_conn->max_rdma_size;
// set socket state and return
sk->userlocks |= SOCK_BINDADDR_LOCK;
sk->userlocks |= SOCK_BINDPORT_LOCK;
sk->sport = addr->sin_port;
sk->daddr = 0;
sk->dport = 0;
sk_dst_reset(sk);
release_sock(sk);
return SUCCESS;
}
#endif // SOURCE
NAME
sample_connect -- SAMPLE CODE
PURPOSE
Request a reliable transport connection to the specified destination
protocol port and IP address.
DESCRIPTION
Follows the semantics of a user-mode connect() request. Upon completion
of an accept() from within the connection request callback routine, the
transport connection is ready to send/recv data.
PARAMETERS
INPUTS
sk sock struct pointer
uaddr sockaddr struct pointer
addr_len length, in bytes, of the address
RETURN VALUE
0 == SUCCESS
else negative errno code.
SOURCE
#if 0
#include <linux/net.h>
#include "sdp_ti.h"
//
// Open a connection to a peer agent at given IP address/port
// This routine calls transport to resolve peer IP
// endpoint information and to send the connect request.
// Local endpoint already created in a previous sample_bind call
// for this socket.
//
int sample_connect( IN struct sock *sk,
IN struct sockaddr *uaddr,
IN int addr_len )
{
struct ap_opt *p_conn = sk->ap_opt; // setup in init call
ti_status_t status;
struct sockaddr_in *addr = (struct sockaddr_in *)uaddr;
ti_inet_addr_t ti_addr;
if (addr_len < sizeof(struct sockaddr_in))
return(-EINVAL);
if (addr->sin_family != AF_INET)
return(-EAFNOSUPPORT);
lock_sock(sk);
// If no bind has been done then fail (an autobind could be performed here)
if (!p_conn->conn_info.h_src)
{
release_sock(sk);
return(-EINVAL);
}
// Setup address struct for transport, AF_INET is IPv4
ti_addr.IP_addr_type = TI_ADDR_IPV4;
ti_addr.ip4_addr = addr->sin_addr.s_addr;
// Create remote endpoint for the transport
if ( p_conn->ti_ops->create_endpoint(
p_conn->h_ti,
&ti_addr,
ntohs(addr->sin_port),
&p_conn->conn_info.h_dst ) != TI_SUCCESS )
{
release_sock(sk);
return (-ENETUNREACH); // Network unreachable
}
// Initialize connection information:
// Local endpoint and protection domain created
// during previous socket bind. Remote endpoint just created.
p_conn->conn_info.private_data_len = sizeof(ap_private_t);
p_conn->conn_info.p_private_data = &p_conn->private_data;
p_conn->conn_info.private_data_offset = 0;
p_conn->conn_info.private_data_offset_len = 0;
p_conn->conn_info.pfn_conn_request_cb = sample_connect_request_callback;
p_conn->conn_info.p_context = (void *) sk;
// Initialize the connection and framing attributes
p_conn->conn_attr.max_transfer_depth = p_conn->smsg_depth +
p_conn->rdma_depth;
p_conn->conn_attr.max_transfer_buffers = 1;
p_conn->conn_attr.max_recv_depth = p_conn->rmsg_depth;
p_conn->conn_attr.max_recv_buffers = 1;
p_conn->conn_attr.reliable = TRUE;
p_conn->conn_attr.msg_flow_control = TRUE;
p_conn->conn_attr.auto_path_migration = TRUE;
// Set private data, desired recv message size
p_conn->private_data.desired_rmsg_size = p_conn->rmsg_size;
// We now have created local and remote endpoints
// and set the connection attributes and information.
// Ready to call the transport's connect request function
if ( p_conn->ti_ops->connect ( p_conn->h_ti,
&p_conn->conn_attr,
&p_conn->conn_info,
&p_conn->h_conn ) != TI_SUCCESS )
{
p_conn->ti_ops->destroy_endpoint( p_conn->conn_info.h_dst );
p_conn->conn_info.h_dst = NULL;
release_sock(sk);
return -EADDRNOTAVAIL; // Cannot assign requested address
}
// Update socket fields appropriately
sk->dport = addr->sin_port;
sk->daddr = addr->sin_addr.s_addr;
sk->err = 0;
sk->done = 0;
sk->state = TCP_SYN_SENT; // see linux/tcp.h
// Update sock->ap_opt fields
p_conn->state = SDP_PENDING;
release_sock(sk);
return SUCCESS;
}
#endif // SOURCE
NAME
sample_listen -- SAMPLE CODE
PURPOSE
Sample code which schedules a transport listen request.
DESCRIPTION
A transport listen request is scheduled. A connection request message will
eventually arrive, from a remote endpoint connect() request, and trigger
the listen connection request CallBack Routine (CBR). The CBR can
accept() the connection request thus 'establishing a connection',
reject() the request or save connection request information and
notify an external agent. In this listen example the connection request
CBR will save the volatile ti_conn_request_t information in stable storage
and wake the waiting socket accept thread.
See sample_connect_request_callback example.
PARAMETERS
INPUTS
sk sock structure pointer
OUTPUT
none.
SIDE EFFECTS
sock struct marked as 'listening' for a connection.
RETURN VALUE
0 == SUCCESS
else negative errno code.
SOURCE
#if 0
#include <linux/net.h>
#include "sdp_ti.h"
//
// Listen on local address/port endpoint
//
int sample_listen( IN struct sock *sk )
{
struct ap_opt *p_conn = sk->ap_opt; // setup in init call
ti_status_t status;
// say we are listen()ing & allocate a local port.
sk->state = TCP_LISTEN; // see linux/tcp.h
if (sk->prot->get_port(sk,sk->num) != 0)
{
sk->state = TCP_CLOSE;
return (-EADDRINUSE);
}
sk->sport = htons(sk->num);
sk_dst_reset(sk);
sk->prot->hash(sk);
// Initialize connection information:
// Local endpoint and protection domain created
// during previous socket bind. Remote endpoint just created.
p_conn->conn_info.private_data_len = sizeof(ap_private_t);
p_conn->conn_info.p_private_data = &p_conn->private_data;
p_conn->conn_info.private_data_offset = 0;
p_conn->conn_info.private_data_offset_len = 0;
p_conn->conn_info.pfn_conn_cb = sample_connect_request_callback;
p_conn->conn_info.p_context = (void *) sk;
// Initialize the connection and framing attributes
p_conn->conn_attr.max_transfer_depth = p_conn->smsg_depth +
p_conn->rdma_depth;
p_conn->conn_attr.max_transfer_buffers = 1;
p_conn->conn_attr.max_recv_depth = p_conn->rmsg_depth;
p_conn->conn_attr.max_recv_buffers = 1;
p_conn->conn_attr.reliable = TRUE;
p_conn->conn_attr.msg_flow_control = TRUE;
p_conn->conn_attr.auto_path_migration = TRUE;
// We now have created local endpoints
// and set the connection attributes and information.
// Ready to call the transport's listen request function.
// The listen connection handle 'h_listen_conn' is used only to cancel this
// listen() request. Handle 'h_conn' specified in the connection request
// callback routine input parameter 'ti_conn_request_t.h_conn', after an
// accept, is the correct handle to use in data transfer requests.
if ( p_conn->ti_ops->listen ( p_conn->h_ti,
&p_conn->conn_attr,
&p_conn->conn_info,
&p_conn->h_listen_conn ) != TI_SUCCESS )
{
p_conn->ti_ops->destroy_endpoint( p_conn->conn_info.h_dst );
p_conn->conn_info.h_dst = NULL;
sk->state = TCP_CLOSE;
return -EADDRNOTAVAIL; // Cannot assign requested address
}
// Update socket fields appropriately
sk->err = 0;
sk->done = 0;
// Update sock->ap_opt fields
p_conn->state = SDP_PENDING;
return SUCCESS;
}
#endif // SOURCE
NAME
sample_connect_request_callback -- SAMPLE CODE
PURPOSE
Example connection callback routine which will process a connection request.
DESCRIPTION
This Connection Request Callback routine is invoked, within a system thread
context, in response to the arrival of a connection request message. The
connection is established via an accept() call or rejected with the reject()
call. Accept and reject both modify the connection handle (ti_conn_handle_t)
specified in the input arg (ti_conn_request_t). This connection handle is
the same one returned at the connect() call site or a new connection handle
in the case of a listen() call. Therefore, in the case of a listen()
callback, it is the responsibility of the callback routine to save the new
connection handle.
If the connection reaches the 'established' state via an accept() call, then
the connection handle from ti_conn_request_t is referenced in subsequent
data transfer requests [msg_send, rdma_read].
It is the responsibility of the connect request callback routine to
synchronize and communicate connection establishment with the connect() or
listen() caller.
Completing the connection request cycle [accept, reject] within the callback
routine is not mandatory. The ti_conn_request_t struct must be copied to
stable storage as the input struct memory is only valid for the duration
of the connection request callback routine.
PARAMETERS
INPUTS
p_conn_req A connection request (ti_conn_request_t).
OUTPUT
none.
SIDE EFFECTS
Connection handle (ti_conn_handle_t) referenced in the input arg
p_conn_req (ti_conn_request_t) is modified to reflect the connection
state by the call to accept() or reject().
RETURN VALUE
none - void function.
NOTE
An easy way to distinguish connect() or listen() from the connection request
callback routine is the ti_conn_handle_t output parameter returned at the
connect() call site will be the same as the connection handle passed into
the connection request callback routine as 'pConnReq.h_conn'. For a listen()
connection request callback, 'pConnReq.h_conn' will be a new connection
handle, hence different than the handle returned at the listen() call site.
SOURCE
#if 0
#include "sdp_ti.h"
// Forward declarations.
void InitializeMsgs( IN ti_pool_response_t *pPoolResp );
void CqErrorCallback( IN ti_err_rec_t *pErrorRec );
void ConnErrorCallback( IN ti_err_rec_t *pErrorRec );
void sample_msg_recv_complete(
IN ti_cq_handle_t h_cq, IN ti_data_element_t *pMsgList);
void sample_msg_send_complete(
IN ti_cq_handle_t h_cq, IN ti_data_element_t *pMsgList);
// callback routine to process successful, failed, and canceled
// connection attempts. This routine completes listen requests.
void sample_connect_request_callback( IN ti_conn_request_t *pConnReq )
{
struct sock *sk = (struct sock *)pConnReq->p_context;
struct ap_opt *p_conn = (struct ap_opt *)&sk->tp_pinfo.af_ap;
ap_private_t *p_private;
uint32_t msg_count, MsgBufferSize;
uint32_t MaxRdmaSize = p_conn->max_rdma_size;
ti_comp_info_t CompInfo;
ti_pool_response_t MsgPoolResp;
ti_status_t Status;
lock_sock(sk);
p_private = pConnReq->p_private_data;
// The connection is being established. Check remote rmsg
// size and compare with ours. Agree on smaller of the two
// and build the send and recv buffer pools accordingly
if( p_conn->rmsg_size != p_private->desired_rmsg_size )
{
p_private->desired_rmsg_size = min( p_conn->rmsg_size,
p_private->desired_rmsg_size );
p_conn->rmsg_size = p_conn->smsg_size = p_private->desired_rmsg_size;
}
// If this callback is from a listen then the application must
// accept otherwise we can go ahead and accept since this is
// a callback from a connect request and there will be no accept.
// See sample_listen at the listen() call site, output parameter.
if ( p_conn->state == SDP_PENDING && p_conn->h_listen_conn )
{
// change connection state, signal for select, poll or socket_accept
p_conn->state = SDP_CONNECTING;
SignalConnectRequest( p_conn );
release_sock(sk);
return;
}
// If required - Verify listen() caller supplied additional private Data
// @ private_data_offset. Normally utilized by a listen() callback to
// further qualify connection request. See
// pConnReq->private_data_offset & pConnReq->private_data_offset_len
// Create a new message pool and
// completion services. We will poll for completed sends, but setup
// a callback for notification of completed receives.
// Create a message pool.
// Set context for each of the messages to reference one of our own
// request data structures. This will call InitMsgs directly before
// returning. InitMsgs will put the message elements back into the
// message pool after initializing their context fields.
//
msg_count = p_conn->rmsg_depth + p_conn->smsg_depth;
MsgBufferSize = pConnReq->pConnAttr->MaxMsgSize;
Status = p_conn->ti_ops->res_pool_create( p_conn->h_ti,
TI_RES_POOL_MSG,
msg_count,
1,
&MsgBufferSize,
sizeof(send_context_t),
&p_conn->h_msg_pool );
if( Status != TI_SUCCESS )
{
// We failed to create the message pool. Reject the connection.
// Destroy CQ, Reject connection.
p_conn->ti_ops->destroy_cq( p_conn->h_cq );
p_conn->ti_ops->reject( pConnReq );
release_sock(sk);
RequestCanceled( p_conn, TI_INSUFFICIENT_MEMORY );
return;
}
// We need to accept the connection.
// Set the channel's completion information.
// one CQ shared for sends and recvs
CompInfo.h_send_cq = p_conn->h_cq;
CompInfo.h_recv_cq = p_conn->h_cq;
CompInfo.pfn_send_cb = sample_msg_send_complete;
CompInfo.pfn_recv_cb = sample_msg_recv_complete;
CompInfo.pfn_err_cb = ConnErrorCallback;
CompInfo.p_err_context = p_conn;
// Get half of the messages to post as initial receives to ensure first
// message is not dropped.
Status = p_conn->ti_ops->res_pool_get( p_conn->h_msg_pool,
msg_count/2,
FALSE,
NULL,
0,
&MsgPoolResp );
if( Status != TI_SUCCESS )
{
// The message elements posted through the accept call will
// have failed. They were returned by the library through the receive
// callback function. The callback will have returned each element
// to the message pool. NOTE: If no receive callback is defined
// a poll_cq in a while loop would be required to get and put all
// elements back on the message pool.
//
// Destroy CQ, message pool, Reject connection.
p_conn->ti_ops->res_pool_destroy( p_conn->h_msg_pool );
p_conn->ti_ops->destroy_cq( p_conn->h_cq );
p_conn->ti_ops->reject( pConnReq );
release_sock(sk);
RequestCanceled( pConnReq, Status );
return;
}
// Accept the connection using the request with no changes to private
// data provided by remote side. If successful then pConnReq->h_conn will
// contain the connection handle to be used for all data transfers on this
// connection. 'pConnReq->h_conn' is returned as an OUT parameter for
// listen & connect.
Status = p_conn->ti_ops->accept(pConnReq,
MaxRdmaSize,
&CompInfo,
MsgPoolResp->pElementList );
if( Status != TI_SUCCESS )
{
// The message elements posted through the accept call will
// have failed. They were returned by the library through the receive
// callback function. The callback will have returned each element
// to the message pool. NOTE: If no receive callback is defined
// a poll_cq in a while loop would be required to get and put all
// elements back on the message pool.
//
// Destroy CQ, message pool, Reject connection.
p_conn->ti_ops->res_pool_destroy( p_conn->h_msg_pool );
p_conn->ti_ops->destroy_cq( p_conn->h_cq );
p_conn->ti_ops->reject( pConnReq );
release_sock(sk);
RequestCanceled( pConnReq, Status );
return;
}
// Connection was accepted, ready to send and receive data.
// At this juncture, an easy way to distinguish connect() or listen() is
// the ti_conn_handle_t returned at the connect() call site will be the
// same as the connection handle passed into the callback routine as
// 'pConnReq.h_conn'. For a listen() callback, 'pConnReq.h_conn' will be a
// new connection handle, hence different than the handle returned from the
// listen() call.
// change connection state
p_conn->state = SDP_CONNECTED;
sk->state = TCP_ESTABLISHED;
release_sock(sk);
// Signal socket for connect complete state
SignalConnectDone( p_conn );
}
#endif // SOURCE
NAME
sample_accept -- SAMPLE CODE
PURPOSE
Example code which implements socket accept() semantics using a transport
connection accept() call.
DESCRIPTION
This routine was called in response to a user issuing a socket accept system
call. Here we retrieve the saved connection request callback routine
argument (ti_conn_request_t) and complete the connection request cycle by
transport 'accept()ing' the connection. It is assumed that a listen
connection request callback routine has executed in the past and saved its
input parameter (ti_conn_request_t) in the sock structure (*sk) so this
routine can retrieve it and complete the connection request cycle via
accept() or reject() call.
Note the connection handle 'ti_conn_request_t.h_conn' is the handle which
will be modified by reject or accept, and in the accept case become a valid
connection handle to be used in subsequent data transfer requests.
PARAMETERS
INPUTS
sk sock struct pointer
flags socket flags
OUTPUT
*err where to return an error status
SIDE EFFECTS
new sock struct created.
RETURN VALUE
NULL indicates an error encountered with *err containing the error status.
Otherwise, a new sock struct pointer which represents a connected socket.
SOURCE
#if 0
#include <linux/net.h>
#include "sdp_ti.h"
//
// accept connection requests,
// wait for a PENDING or accept a CONNECTING socket
//
struct sock *
sample_accept( IN struct sock *sk,
IN int flags,
OUT int *err )
{
struct ap_opt *p_conn = sk->ap_opt; // setup in init call
struct sock *newsk;
ti_status_t Status;
uint32_t num_cq_entries, msg_count, MsgBufferSize;
uint32_t MaxRdmaSize = p_conn->max_rdma_size;
ti_comp_info_t CompInfo;
ti_pool_response_t MsgPoolResp;
lock_sock(sk);
// Block, waiting for connection callback for the listen
if ( p_conn->state == SDP_PENDING ) {
release_sock(sk);
WaitOnSignalConnectRequest();
lock_sock(sk);
}
// Should be in connecting state ready to respond to
// connect request from remote
if ( p_conn->state != SDP_CONNECTING )
{
release_sock(sk);
*err = -EINVAL;
return NULL;
}
// inbound connection request is valid, ready to be accepted,
// Get a new sock structure using existing sock structure
if ((newsk = ap_alloc_sock(sk, GFP_KERNEL)) == NULL) {
release_sock(sk);
*err = -ENOBUFS;
return NULL;
}
// Set proper fields in newsk...and work from the new context
// Here is where we could check the backlog and re-issue a listen
// on the original socket. Assume no backlog for now to keep sample
// simple.
release_sock(sk);
p_conn = newsk->ap_opt;
lock_sock(newsk);
// If required - Verify listen() caller supplied additional private Data
// @ private_data_offset. Normally utilized by a listen() callback to
// further qualify connection request. See
// pConnReq->private_data_offset & pConnReq->private_data_offset_len
// Create a new message pool and completion services. We will poll for
// completed sends, but setup a callback for notification of completed
// receives.
// Create the completion queue. One per connection to handle both send and
// receive completions:
num_cq_entries = p_conn->conn_attr.max_recv_depth
+ p_conn->conn_attr.max_transfer_depth;
Status = p_conn->ti_ops->create_cq( p_conn->h_ti,
p_conn->conn_info.pd,
num_cq_entries,
&p_conn->h_cq );
if( Status != TI_SUCCESS )
{
// We failed to create the completion queue.
// Reject connection.
p_conn->ti_ops->reject( &p_conn->conn_req );
release_sock(newsk);
RequestCanceled( p_conn, TI_INSUFFICIENT_MEMORY );
*err = -ENOSR;
return NULL;
}
// Create a message pool.
// Set context for each of the messages to reference one of our own
// request data structures. This will call InitMsgs directly before
// returning. InitMsgs will put the message elements back into the
// message pool after initializing their context fields.
//
msg_count = p_conn->rmsg_depth + p_conn->smsg_depth;
MsgBufferSize = p_conn->conn_req.pConnAttr->MaxMsgSize;
Status = p_conn->ti_ops->res_pool_create( p_conn->h_ti,
TI_RES_POOL_MSG,
msg_count,
1,
&MsgBufferSize,
sizeof(send_context_t),
&p_conn->h_msg_pool );
if( Status != TI_SUCCESS )
{
// We failed to create the message pool. Reject the connection.
// Destroy CQ, Reject connection.
p_conn->ti_ops->destroy_cq( p_conn->h_cq );
p_conn->ti_ops->reject( &p_conn->conn_req );
release_sock(newsk);
RequestCanceled( p_conn, TI_INSUFFICIENT_MEMORY );
*err = -ENOSR;
return NULL;
}
// We need to accept the connection.
// Set the channel's completion information.
// one CQ shared for sends and recvs
CompInfo.h_send_cq = p_conn->h_cq;
CompInfo.h_recv_cq = p_conn->h_cq;
CompInfo.pfn_send_cb = sample_msg_send_complete;
CompInfo.pfn_recv_cb = sample_msg_recv_complete;
CompInfo.pfn_err_cb = ConnErrorCallback;
CompInfo.p_err_context = p_conn;
// Get half of the messages to post as initial receives to ensure first
// message is not dropped.
Status = p_conn->ti_ops->res_pool_get( p_conn->h_msg_pool,
msg_count/2,
FALSE,
NULL,
0,
&MsgPoolResp );
if( Status != TI_SUCCESS )
{
// The message elements posted through the accept call will
// have failed. They were returned by the library through the receive
// callback function. The callback will have returned each element
// to the message pool. NOTE: If no receive callback is defined
// a poll_cq in a while loop would be required to get and put all
// elements back on the message pool.
//
// Destroy CQ, message pool, Reject connection.
p_conn->ti_ops->res_pool_destroy( p_conn->h_msg_pool );
p_conn->ti_ops->destroy_cq( p_conn->h_cq );
p_conn->ti_ops->reject( &p_conn->conn_req );
release_sock(newsk);
RequestCanceled( p_conn, Status );
*err = -ENOSR;
return NULL;
}
// Accept the connection using the request with no changes to private
// data provided by remote side. If successful then conn_req.h_conn will
// contain the connection handle to be used for all data transfers on this
// connection.
Status = p_conn->ti_ops->accept( &p_conn->conn_req,
MaxRdmaSize,
&CompInfo,
MsgPoolResp->pElementList );
if( Status != TI_SUCCESS )
{
// The message elements posted through the accept call will
// have failed. They were returned by the library through the receive
// callback function. The callback will have returned each element
// to the message pool. NOTE: If no receive callback is defined
// a poll_cq in a while loop would be required to get and put all
// elements back on the message pool.
//
// Destroy CQ, message pool, Reject connection.
p_conn->ti_ops->res_pool_destroy( p_conn->h_msg_pool );
p_conn->ti_ops->destroy_cq( p_conn->h_cq );
p_conn->ti_ops->reject( &p_conn->conn_req );
release_sock(newsk);
RequestCanceled( p_conn, Status );
*err = -ENOSR;
return NULL;
}
// Connection accepted at the transport, ready to send and receive messages.
// Change connection state. Save the 'new' established connection handle to
// be used in future data transfer requests.
p_conn->h_conn = p_conn->conn_req.h_conn;
p_conn->state = SDP_CONNECTED;
newsk->state = TCP_ESTABLISHED;
release_sock(newsk);
return newsk;
}
#endif // SOURCE
NAME
sample_sendmsg -- SAMPLE CODE
PURPOSE
Demonstrate how to send data over an established transport connection.
DESCRIPTION
Copy user-mode data into a transport resource pool data element buffer, then
transmit the data. Wait for the send completion callback routine to wake
the sleeping send thread. Once the send completion routine has awakened this
thread, return the send status.
PARAMETERS
INPUTS
*sk struct sock pointer
*msg struct msghdr pointer
size total send size in bytes
OUTPUT
none.
RETURN VALUE
0 == SUCCESS, else a negative errno code.
SOURCE
#if 0
#include <linux/net.h>
#include "sdp_ti.h"
//
// Send data on connected socket
//
int sample_sendmsg( IN struct sock *sk,
IN struct msghdr *msg,
IN int size)
{
struct ap_opt *p_conn = sk->ap_opt; // setup in init call
ti_status_t status;
ti_data_element_t *p_msg;
ti_pool_response_t res_pool_resp;
send_context_t *send_context;
int mss, rc;
lock_sock(sk);
// Check for connection,
if ( p_conn->state != SDP_CONNECTED )
{
release_sock(sk);
return -EAGAIN;
}
// get the current maximum segment size and start sending
mss = p_conn->smsg_size;
// If size is greater than mss then process via RDMA
// This sample code only shows message send example
// to keep it simple
if ( size > mss )
{
release_sock(sk);
return ( sample_rdma_write ( sk, msg, size ));
}
// Get a message from the free pool
if ( p_conn->ti_ops->res_pool_get( p_conn->h_msg_pool,
1, // elements
FALSE, // no partial
NULL, // block
NULL, // cb context 1
NULL, // cb context 2
&res_pool_resp ) != TI_SUCCESS )
{
// This is a blocking call and if no messages are available
// will wait for a res_pool_put() to refill the pool. Failure indicates
// a connection problem
release_sock(sk);
return -EPIPE; // broken pipe
}
p_msg = res_pool_resp->p_list;
// copy user data into message buffer for the send
if ( memcpy_fromiovec( p_msg->p_buf_list->pdata,
msg->msg_iov,
size ))
{
// Put single message back in the free pool and exit with error
p_conn->ti_ops->res_pool_put( p_conn->h_msg_pool,
p_msg );
release_sock(sk);
return -EFAULT;
}
// Update the message element for the transfer
p_msg->msg_opt = TI_OPT_SOLICITED | TI_OPT_SIGNALED;
p_msg->p_buf_list->byte_count = size;
// set a context for the send completion callback routine
send_context = (send_context_t *)p_msg->p_context;
send_context->p_conn = p_conn;
p_msg->iodone = FALSE; // initialize wait condition
// We are now ready to send the message
// Set the indication bit in the element to enable
// completion notification (callback) on this side
// and on the remote side of the connection.
if ( (p_conn->ti_ops->msg_send( p_conn->h_conn, p_msg )) != TI_SUCCESS )
{
// Put message back on pool and return error
p_conn->ti_ops->res_pool_put( p_conn->h_msg_pool, p_msg );
release_sock(sk);
return -EPIPE; // broken pipe
}
release_sock(sk);
// suspend this sending thread until send completion callback routine
// awakes us. The send complete callback routine will set p_msg->iodone
// TRUE, then wake this thread.
rc = wait_event_interruptible( p_conn->send_waitq, (p_msg->iodone != FALSE) );
// retrieve send operation status and return data element back to
// free pool
status = p_msg->status; // only valid if rc != -ERESTARTSYS
p_conn->ti_ops->res_pool_put( p_conn->h_msg_pool, p_msg );
// interrupted wait or completed send request? Convert transport error
// returns into Linux negative errno code -ENOTCONN because if there was
// a transport send error then the connection has been destroyed or is in
// the process of going away.
if ( rc == -ERESTARTSYS )
return -EINTR;
else if ( status == TI_SUCCESS )
return 0;
else
return -ENOTCONN;
}
#endif // SOURCE
NAME
sample_msg_send_complete -- SAMPLE CODE
PURPOSE
Demonstrate how a message send completion callback routine might function.
DESCRIPTION
When a transport send request is completed by the accelerated transport
layer, the specified send completion callback routine is invoked in a
device interrupt context [accept, ti_comp_info_t]. The task here is to wake
sleeping send threads, one thread per data element in this example. Each
sending thread is sleeping on the 'iodone' field in the data element
structure.
It is assumed the transport connection is established with the
ti_conn_attr_t field 'auto_cq_rearm' set to TRUE. The lower transport
layer will rearm the Completion Queue and then call this routine.
PARAMETERS
INPUTS
cq_handle completion queue handle
*de_list pointer to a list of completed send requests.
OUTPUT
none.
SIDE EFFECTS
sleeping send threads, one per data element, are awakened.
RETURN VALUE
none.
SOURCE
#if 0
void
sample_msg_send_complete( ti_cq_handle_t cq_handle,
ti_data_element_t *de_list )
{
struct ap_opt *p_conn; // connection context
ti_data_element_t *p_msg = de_list;
send_context_t *send_context;
// process the list of completed data elements
while( p_msg )
{
// collect send context and wake the sleeper
send_context = (send_context_t *)p_msg->p_context;
p_conn = send_context->p_conn; // retreive connection context
p_msg->iodone = TRUE; // release wait condition
// advance to next element before any wait related items are touched.
p_msg = p_msg->p_next_msg;
wake_up_interruptible( &p_conn->send_waitq );
// ************
// it's the sender's responsibility to return the data elements back
// to the free pool.
// ************
}
}
#endif // SOURCE
NAME
sample_msg_recv -- SAMPLE CODE
PURPOSE
Demonstrate how the socket layer retrieves transport received data and
delivers it to the user.
DESCRIPTION
The message receive completion callback routine 'sample_msg_recv_complete'
is called in an interrupt context to process transport completed receive
operations. The transport receive completion callback routine will queue
successfully completed recv data elements to the socket recv_queue 'rmsgs'.
This routine, operating in a system thread context, copies the received data
from the rmsgs queued data elements into the user-specified socket receive
buffers. When the received data element is completely drained, the data
element is removed from the rmsgs queue and reposted back to the transport
connection receive queue.
PARAMETERS
INPUTS
*sk struct sock pointer
*msg struct msghdr pointer which contains data buffer adrs & size
size total receive request byte size
nonblock blocking socket IO or not, boolean.
flags socket/protocol flags
OUTPUT
*addr_len total # of bytes received
RETURN VALUE
0 == SUCCESS, otherwise a negative errno code.
SOURCE
#if 0
#include <linux/net.h>
#include "sdp_ti.h"
//
// receive data on connected socket
//
int sample_msg_recv( IN struct sock *sk,
IN struct msghdr *msg,
IN int size,
IN int nonblock,
IN int flags,
OUT int *addr_len )
{
struct ap_opt *p_conn = sk->ap_opt; // setup in init call
ti_status_t status;
ti_data_element_t *p_msg;
ti_pool_response_t res_pool_resp;
int mss, rbytes, bytes_copied;
lock_sock(sk);
// Check for connection,
if ( p_conn->state != SDP_CONNECTED )
{
release_sock(sk);
return -EAGAIN;
}
// get the current maximum segment size and start receiving
mss = p_conn->rmsg_size;
// If size is greater than mss then get data via RDMA
// This sample code only shows message receive example
// to keep it simple. receive completion code queues up
// the message on p_conn->rmsg list
if ( size > mss )
{
release_sock(sk);
return ( sample_rdma_read( sk, msg, size ));
}
// Any data available? get next buffer from the received data list
p_msg = p_conn->rmsgs;
// if socket is not NON_BLOCKING, wait for data to arrive
if (!p_msg)
{
WaitOnSignalRecvData(sk);
p_msg = p_conn->rmsgs;
}
// Calculate data left in receive buffer and amount to copy
rbytes = p_msg->recv_byte_count - p_msg->recv_byte_offset;
bytes_copied = min(rbytes,size);
// copy data until user buffer full or end of rmsg buffer
if ( memcpy_toiovec( msg->msg_iov,
p_msg->p_buf_list->pdata + p_msg->recv_byte_offset,
bytes_copied ) )
{
release_sock(sk);
return -EFAULT;
}
// if we did not copy out all of the remaining msg buffer then
// adjust the recv_byte_offset of the rmsg and return bytes copied
// otherwise pull off drained recv buffer and repost buffer on connection
// receive queue
if (bytes_copied < rbytes)
{
p_msg->recv_byte_offset += bytes_copied;
}
else
{
p_conn->rmsgs = p_msg->p_next_msg;
p_msg->p_next_msg = NULL;
p_msg->recv_byte_offset = 0;
if (p_conn->ti_ops->msg_recv( p_conn->h_conn, p_msg ) != TI_SUCCESS)
{
// Put message back on free pool and return error
p_conn->ti_ops->res_pool_put( p_conn->h_msg_pool, p_msg );
release_sock(sk);
return -EPIPE; // broken pipe
}
}
// update bytes copied
*addr_len = bytes_copied;
release_sock(sk);
return SUCCESS;
}
#endif // SOURCE
NAME
sample_msg_recv_complete -- SAMPLE CODE
PURPOSE
Demonstrate how a message receive completion callback routine works.
DESCRIPTION
This receive completion callback routine, executing in a device interrupt
context, will process the list of completed receive data elements. In this
InfiniBand example, if data element receive errors are detected,
then return the receive data element back to the resource pool as the
connection has failed. Otherwise, queue the data element on the socket
received data queue 'rmsgs' and wake any waiting socket receive thread(s).
It is assumed the connection was established with the ti_conn_attr_t field
'auto_cq_rearm' set to TRUE. The lower transport layer will rearm the
Completion Queue, poll for completed recv data elements and then call this
routine with a list of completed data elements.
PARAMETERS
INPUTS
h_cq completion queue handle
*pMsgList pointer to a list of completed receive requests.
OUTPUT
none.
SIDE EFFECTS
sleeping socket receive threads are awakened.
RETURN VALUE
none.
SOURCE
#if 0
void
sample_msg_recv_complete( IN ti_cq_handle_t h_cq,
IN ti_data_element_t *pMsgList )
{
struct ap_opt *p_conn; // connection context
ti_data_element_t *p_msg = pMsgList, *p_next;
while( p_msg )
{
// save next data element pointer as processing will rewrite the
// data element's list pointer.
p_next = p_msg->p_next_msg;
// recover the connection context stored in the element's context field
// when the message pool was initialized
p_conn = ((send_context_t *)p_msg->p_context)->p_conn;
// on error the connection is lost; return the recv buffer to the free pool
if ( p_msg->status != TI_SUCCESS )
{
p_conn->ti_ops->res_pool_put( p_conn->h_msg_pool, p_msg );
}
else
{
// setup for socket receive processing
p_msg->ud_offset = p_msg->recv_byte_count;
// valid receive data, queue to the socket
insert_queue_tail_locked( &p_conn->recv_q_lock,
&p_conn->rmsgs, p_msg );
// wake waiting socket receive thread(s)
// TBD.
}
// next recv data element pointer
p_msg = p_next;
}
}
#endif // SOURCE
NAME
sample_disconnect -- SAMPLE CODE
PURPOSE
Demonstrate how an established connection is disconnected.
DESCRIPTION
Follows the semantics of a user-mode close(): an established connection is
disconnected and the transport service instance handle is destroyed.
PARAMETERS
INPUTS
*sk sock struct pointer
OUTPUT
none.
SIDE EFFECTS
connection, if established, will be shut down; otherwise the connection
handle will be invalidated with resources released.
RETURN VALUE
0 == SUCCESS, otherwise negative errno code.
SOURCE
#if 0
#include <linux/net.h>
#include "sdp_ti.h"
//
// close the current connection
//
int
sample_disconnect( IN struct sock *sk )
{
struct ap_opt *p_conn = sk->ap_opt; // setup in init call
ti_status_t status;
int rc = SUCCESS;
lock_sock(sk);
// already been here?
if ( sk->state == TCP_CLOSE )
{
release_sock(sk);
return SUCCESS;
}
// schedule a disconnect with no callback(routine or context) for an
// established connection; disconnect runs asynchronously with the connection
// shutdown. We do not care how long the transport takes to shutdown the
// connection.
if (p_conn->h_conn)
{
status = p_conn->ti_ops->disconnect(p_conn->h_conn, 0, 0);
if ( status != TI_SUCCESS )
{
rc = -EPIPE; // broken pipe
}
p_conn->h_conn = NULL;
}
sk->state = TCP_CLOSE; // see linux/tcp.h
release_sock(sk);
// destroy the TI handle - we don't want a callback, hence the NULL arguments
p_conn->ti_ops->destroy_ti( p_conn->h_ti, NULL, NULL );
return rc;
}
#endif // SOURCE