onload--extensions api

Onload Extensions API

The Onload Extensions API allows the user 
to customize an application using advanced features to improve performance.
The Extensions API does not create any runtime dependency on Onload 
and an application using the API can run without Onload. 
The license for the API and associated libraries is a BSD 2‐Clause License.
This section covers the following topics:
•Common Components
•Stacks API
•Zero‐Copy API
•Templated Sends
•Delegated Sends API

Source Code

Java Native Interface ‐ Wrapper

Common Components

For all applications 
employing the Extensions API the following components are provided:
•#include <onload/extensions.h>
An application should include the header file 
containing function prototypes and constant values required when using the API.
•libonload_ext.a, libonload_ext.so
This library provides stub implementations of the extended API. 
An application that wishes to use the extensions API should link against this library.
When Onload is not present, the application will continue to function, 
but calls to the extensions API will have no effect (unless documented otherwise).
‐ To link dynamically to this library, 
include the ‘-l’ linker option on the compiler command line, i.e.
-lonload_ext
‐ You can instead link against the libonload_ext.a static library. 
This is required to run the application on servers 
that do not have the dynamic libraries installed. 
When doing so, it is also necessary to link with the dynamic linking library (libdl) 
by adding the ‘-ldl’ option to the compiler command line:
-ldl -l:libonload_ext.a

onload_is_present

Description
If the application is linked with libonload_ext, 
but not running with Onload this will return 0. 
If the application is running with Onload this will return 1.
Definition
int onload_is_present (void)
Formal Parameters
None
Return Value
1 from libonload.so library, or 0 from libonload_ext.a library
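The return value can be used to report at startup whether the process is accelerated. The following is a minimal sketch, not taken from the Onload sources: USE_ONLOAD_EXT is a hypothetical build flag, report_transport() is a hypothetical helper, and the stand-in onload_is_present() exists only so the sketch builds without the Onload headers.

```c
#include <assert.h>
#include <stdio.h>

#ifdef USE_ONLOAD_EXT              /* hypothetical build flag */
#include <onload/extensions.h>
#else
/* Stand-in so the sketch builds without the Onload headers. It mirrors the
 * documented result when Onload is absent: always 0. */
int onload_is_present(void) { return 0; }
#endif

/* Hypothetical helper: log which transport the process is using and
 * return the value reported by onload_is_present(). */
int report_transport(FILE *out)
{
    assert(out != NULL);
    int accelerated = onload_is_present();
    fprintf(out, "transport: %s\n", accelerated ? "onload" : "kernel");
    return accelerated;
}
```

When built with the real header and run under Onload, the same helper would be expected to report the accelerated case, since onload_is_present() then returns 1.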

onload_fd_stat

struct onload_stat
{
	int32_t stack_id;
	char* stack_name;
	int32_t endpoint_id;
	int32_t endpoint_state;
};
extern int onload_fd_stat(int fd, struct onload_stat* stat);
Description
Retrieves internal details about an accelerated socket.
Definition
See above
Formal Parameters
See above
Return Value
0 socket is not accelerated
1 socket is accelerated
-ENOMEM when memory cannot be allocated
Notes
The memory for stack_name is allocated using malloc(), 
so the caller should release it by calling free() on the (char *) pointer.
This function will call malloc() 
and so should never be called from any other function 
requiring a malloc lock.
NOTE: Can be used to check 
if a fd is accelerated without allocating memory 
if stat is passed as NULL.
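The two calling styles described above (a NULL stat for a cheap check, and a full query followed by free()) might be used as in the sketch below. The helpers fd_is_accelerated() and describe_fd() and the USE_ONLOAD_EXT build flag are hypothetical, and the stand-in onload_fd_stat() simply reports every fd as not accelerated so that the sketch builds without Onload.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#ifdef USE_ONLOAD_EXT              /* hypothetical build flag */
#include <onload/extensions.h>
#else
struct onload_stat {
    int32_t stack_id;
    char *stack_name;
    int32_t endpoint_id;
    int32_t endpoint_state;
};
/* Stand-in (assumption for this sketch): no fd is accelerated. */
int onload_fd_stat(int fd, struct onload_stat *stat)
{
    (void)fd;
    (void)stat;
    return 0;
}
#endif

/* Cheap check: passing NULL avoids the malloc() for stack_name. */
int fd_is_accelerated(int fd)
{
    return onload_fd_stat(fd, NULL) > 0;
}

/* Full query: format a one-line description and free stack_name. */
int describe_fd(int fd, char *out, size_t out_len)
{
    assert(out != NULL && out_len > 0);
    struct onload_stat st;
    int rc = onload_fd_stat(fd, &st);
    if (rc > 0) {
        snprintf(out, out_len, "fd %d: stack %d (%s)", fd, (int)st.stack_id,
                 st.stack_name ? st.stack_name : "");
        free(st.stack_name);       /* allocated with malloc() by the call */
    } else if (rc == 0) {
        snprintf(out, out_len, "fd %d: not accelerated", fd);
    } else {
        snprintf(out, out_len, "fd %d: error %d", fd, -rc);
    }
    return rc;
}
```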

onload_fd_check_feature

int onload_fd_check_feature (int fd, enum onload_fd_feature feature);
enum onload_fd_feature {
	/* Check whether this fd supports ONLOAD_MSG_WARM or not */
	ONLOAD_FD_FEAT_MSG_WARM,
	/* see Notes for details */
	ONLOAD_FD_FEAT_UDP_TX_TS_HDR
};
Description
Used to check whether the Onload file descriptor supports a feature or not.
Definition
See above
Formal Parameters
See above
Return Value
0 if the feature is supported by Onload but not on this fd
>0 if the feature is supported both by Onload and this fd
<0 if the feature is not supported:
‐ -ENOSYS if onload_fd_check_feature() itself is not supported.
‐ -EOPNOTSUPP if the feature is not supported by Onload.
Notes
Onload‐201509 and later versions support the ONLOAD_FD_FEAT_UDP_TX_TS_HDR option. 
For this option, onload_fd_check_feature will return 1 to indicate 
that a recvmsg used to retrieve TX timestamps for UDP packets 
will return the entire Ethernet header.
NOTE: When run on older versions of Onload this will return -EOPNOTSUPP.

onload_thread_set_spin

Description
For the thread calling this function, 
onload_thread_set_spin() sets the per‐thread spinning actions. 
The setting is not per‐stack and not per‐socket.
Definition
int onload_thread_set_spin(
	enum onload_spin_type type,
	unsigned spin)
Formal Parameters
type
Which operation to change the spin status of. 
The type must be one of the following:
enum onload_spin_type {
	ONLOAD_SPIN_ALL, /* enable or disable all spin options */
	ONLOAD_SPIN_UDP_RECV,
	ONLOAD_SPIN_UDP_SEND,
	ONLOAD_SPIN_TCP_RECV,
	ONLOAD_SPIN_TCP_SEND,
	ONLOAD_SPIN_TCP_ACCEPT,
	ONLOAD_SPIN_PIPE_RECV,
	ONLOAD_SPIN_PIPE_SEND,
	ONLOAD_SPIN_SELECT,
	ONLOAD_SPIN_POLL,
	ONLOAD_SPIN_PKT_WAIT,
	ONLOAD_SPIN_EPOLL_WAIT,
	ONLOAD_SPIN_STACK_LOCK,
	ONLOAD_SPIN_SOCK_LOCK,
	ONLOAD_SPIN_SO_BUSY_POLL,
	ONLOAD_SPIN_TCP_CONNECT,
	ONLOAD_SPIN_MIMIC_EF_POLL, 
	/* thread spin configuration which mimics
	* spin settings in EF_POLL_USEC. Note
	* that this has no effect on the
	* usec‐setting part of EF_POLL_USEC.
	* This needs to be set separately
	*/
	ONLOAD_SPIN_MAX /* special value to mark largest valid input */
};
spin
A boolean which indicates whether the operation should spin or not.
Return Value
0 on success
‐EINVAL if unsupported type is specified.
Notes
Spin time (for all threads) is set using the EF_SPIN_USEC parameter.
Examples
The onload_thread_set_spin API can be used 
to control spinning on a per‐thread or per‐API basis. 
The existing spin‐related configuration options set the default behavior for threads, 
and the onload_thread_set_spin API overrides the default for the thread 
calling this function.
Disable all sorts of spinning:
onload_thread_set_spin(ONLOAD_SPIN_ALL, 0);
Enable all sorts of spinning:
onload_thread_set_spin(ONLOAD_SPIN_ALL, 1);
Enable spinning only for certain threads:
1 Set the spin timeout by setting EF_SPIN_USEC, 
and disable spinning by default
by setting EF_POLL_USEC=0.
2 In each thread that should spin, 
invoke onload_thread_set_spin().
Disable spinning only in certain threads:
1 Enable spinning by setting EF_POLL_USEC=<timeout>.
2 In each thread that should not spin, invoke onload_thread_set_spin().
WARNING: 
If a thread is set to NOT spin and then blocks 
this may invoke an interrupt for the whole stack. 
Interrupts occurring on moderately busy threads may cause unintended 
and undesirable consequences.
Enable spinning for UDP traffic, but not TCP traffic:
1 Set the spin timeout by setting EF_SPIN_USEC, 
and disable spinning by default
by setting EF_POLL_USEC=0.
2 In each thread that should spin (UDP only), do:
onload_thread_set_spin(ONLOAD_SPIN_UDP_RECV, 1);
onload_thread_set_spin(ONLOAD_SPIN_UDP_SEND, 1);
Enable spinning for TCP traffic, but not UDP traffic:
1 Set the spin timeout by setting EF_SPIN_USEC, 
and disable spinning by default
by setting EF_POLL_USEC=0.
2 In each thread that should spin (TCP only), do:
onload_thread_set_spin(ONLOAD_SPIN_TCP_RECV, 1);
onload_thread_set_spin(ONLOAD_SPIN_TCP_SEND, 1);
onload_thread_set_spin(ONLOAD_SPIN_TCP_ACCEPT, 1);
Spinning and sockets:
When a thread calls onload_thread_set_spin() 
it sets the spinning actions 
applied when the thread accesses any socket 
‐ irrespective of whether the socket is created by this thread.
If a socket is created by thread‐A and is accessed by thread‐B, 
calling onload_thread_set_spin(ONLOAD_SPIN_ALL, 1) only from thread‐B 
will enable spinning for thread‐B, 
but not for thread‐A. 
In the same scenario, 
if onload_thread_set_spin(ONLOAD_SPIN_ALL, 1) is called only from thread‐A, 
then spinning is enabled only for thread‐A, but not for thread‐B.
The onload_thread_set_spin() function sets the per‐thread spinning action.
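The UDP‐only recipe above can be wrapped in one function that stops at the first error. This is a sketch under stated assumptions: spin_udp_only() and the USE_ONLOAD_EXT build flag are hypothetical, and the stand-in functions are a toy per-thread bitmask model (one bit per spin type, with ONLOAD_SPIN_ALL setting or clearing every bit) written only so the sketch builds without Onload. The toy model is not a description of how Onload implements spinning.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifdef USE_ONLOAD_EXT              /* hypothetical build flag */
#include <onload/extensions.h>
#else
enum onload_spin_type {            /* order copied from the enum above */
    ONLOAD_SPIN_ALL, ONLOAD_SPIN_UDP_RECV, ONLOAD_SPIN_UDP_SEND,
    ONLOAD_SPIN_TCP_RECV, ONLOAD_SPIN_TCP_SEND, ONLOAD_SPIN_TCP_ACCEPT,
    ONLOAD_SPIN_PIPE_RECV, ONLOAD_SPIN_PIPE_SEND, ONLOAD_SPIN_SELECT,
    ONLOAD_SPIN_POLL, ONLOAD_SPIN_PKT_WAIT, ONLOAD_SPIN_EPOLL_WAIT,
    ONLOAD_SPIN_STACK_LOCK, ONLOAD_SPIN_SOCK_LOCK, ONLOAD_SPIN_SO_BUSY_POLL,
    ONLOAD_SPIN_TCP_CONNECT, ONLOAD_SPIN_MIMIC_EF_POLL, ONLOAD_SPIN_MAX
};

/* Toy model: one bit per spin type, private to each thread. */
static _Thread_local unsigned toy_spin_mask;

int onload_thread_set_spin(enum onload_spin_type type, unsigned spin)
{
    if ((unsigned)type >= (unsigned)ONLOAD_SPIN_MAX)
        return -EINVAL;
    if (type == ONLOAD_SPIN_ALL)
        toy_spin_mask = spin ? (1u << ONLOAD_SPIN_MAX) - 1u : 0u;
    else if (spin)
        toy_spin_mask |= 1u << type;
    else
        toy_spin_mask &= ~(1u << type);
    return 0;
}

int onload_thread_get_spin(unsigned *state)
{
    assert(state != NULL);
    *state = toy_spin_mask;
    return 0;
}
#endif

/* Hypothetical helper: the calling thread spins for UDP send/receive only.
 * Returns 0 on success or the first negative error code. */
int spin_udp_only(void)
{
    int rc = onload_thread_set_spin(ONLOAD_SPIN_ALL, 0);
    if (rc == 0)
        rc = onload_thread_set_spin(ONLOAD_SPIN_UDP_RECV, 1);
    if (rc == 0)
        rc = onload_thread_set_spin(ONLOAD_SPIN_UDP_SEND, 1);
    return rc;
}
```

Clearing everything first makes the result independent of whatever defaults the thread inherited, at the cost of one extra call.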

onload_thread_get_spin

Description
For the current thread, identify which operations should spin.
Definition
int onload_thread_get_spin(
	unsigned *state)
Formal Parameters
state
Location at which to write the spin status as a bitmask. 
Bit n of the mask is set 
if spinning has been enabled for spin type n 
(see onload_thread_set_spin).
Return Value
0 on success
Notes
Spin time (for all threads) is set using the EF_SPIN_USEC parameter.
Examples
Determine if spinning is enabled for UDP receive:
unsigned state;
onload_thread_get_spin(&state);
if (state & (1 << ONLOAD_SPIN_UDP_RECV)) {
	// spinning is enabled for UDP receive
}

onload_socket_nonaccel

Description
Create a socket 
which is not accelerated by Onload. 
This function is useful 
when attempting to reserve a port for an ephemeral ef_vi instance 
without installing Onload filters. 
It is also possible 
to use the stackname API to disable acceleration for specific socket(s).
Definition
int onload_socket_nonaccel(
	int domain,
	int type,
	int protocol)
Formal Parameters
This function takes arguments and returns values 
that correspond exactly to the standard socket() function call.
Return Value
Return the file descriptor that refers to the created endpoint.
‐1 with errno ENOSYS if the Onload extensions library is not in use.

onload_socket_unicast_nonaccel

Description
Create a socket 
that will only accelerate multicast traffic. 
If this socket is not able to receive multicast, for example, 
because it is bound to a unicast local address, 
or it is a TCP socket, 
then it will be handed over to the kernel.
This function is useful for cases 
where a socket will be used solely for multicast traffic 
to avoid consuming limited filter table resource. 
This does not prevent unicast traffic 
from arriving at the socket, 
and if appropriate traffic is received, 
it will still be delivered via the un‐accelerated path. 
It is most useful for sockets that are bound to INADDR_ANY, 
because for these Onload must install a filter per IP address 
that is configured on an accelerated interface, 
on each accelerated hardware port.
If a socket is bound to a multicast local address, 
then no unicast filters will be installed, so there is no need for this function.
Definition
int onload_socket_unicast_nonaccel(
	int domain,
	int type,
	int protocol)
Formal Parameters
This function takes arguments and returns values 
that correspond exactly to the standard socket() function call.
Return Value
Return the file descriptor that refers to the created endpoint.
‐1 with errno ENOSYS if the Onload extensions library is not in use.

Stacks API

Assign sockets to an Onload stack.
Set the properties of different Onload stacks.
Using the Onload Extensions API 
an application can bind selected sockets 
to specific Onload stacks 
and in this way ensure that time‐critical sockets are not starved of resources 
by other non‐critical sockets. 

The API allows an application 
to select sockets 
which are to be accelerated 
thus reserving Onload resources for performance critical paths. 
This also prevents non‐critical paths from creating jitter for critical paths.

onload_set_stackname

Description
Select the Onload stack 
that new sockets are placed in. 
A socket can exist only in a single stack. 
A socket can be moved to a different stack 
‐ see onload_move_fd() below.
Moving a socket to a different stack 
does not create a copy of the socket 
in originator and target stacks.
Definition
int onload_set_stackname(
	int who,
	int scope,
	const char *name)
Formal Parameters
who
Must be one of the following:
‐ ONLOAD_THIS_THREAD 
‐ to modify the stack name in which all subsequent sockets are created by this thread.
‐ ONLOAD_ALL_THREADS 
‐ to modify the stack name in which all subsequent sockets are created by all threads 		
in the current process. 
ONLOAD_THIS_THREAD takes precedence over ONLOAD_ALL_THREADS.
scope
Must be one of the following:
‐ ONLOAD_SCOPE_THREAD 
‐ name is scoped with current thread
‐ ONLOAD_SCOPE_PROCESS 
‐ name is scoped with current process
‐ ONLOAD_SCOPE_USER 
‐ name is scoped with current user
‐ ONLOAD_SCOPE_GLOBAL 
‐ name is global across all threads, users and processes.
‐ ONLOAD_SCOPE_NOCHANGE 
‐ undo effect of a previous call 
to onload_set_stackname(ONLOAD_THIS_THREAD, …), 
see Note 4 below.
name
One of the following:
‐ the stack name up to 8 characters.
‐ an empty string to set no stackname
‐ the special value ONLOAD_DONT_ACCELERATE to prevent sockets 
created in this thread, user, process from being accelerated.
Sockets identified by the options above 
will belong to the Onload stack 
until a subsequent call using onload_set_stackname identifies a different stack 
or the ONLOAD_SCOPE_NOCHANGE option is used.
Return Value
0 on success
‐1 with errno set to ENAMETOOLONG if the name exceeds permitted length
‐1 with errno set to EINVAL if other parameters are invalid.
Notes
Note 1
This applies to stacks 
selected for sockets created by socket() and for pipe(). 
It has no effect on accept(). 
Passively opened sockets created via accept() will always be in the same stack 
as the listening socket that they are linked to. 
This means that the following two sequences are functionally identical:
onload_set_stackname(foo)
socket
listen
onload_set_stackname(bar)
accept
and:
onload_set_stackname(foo)
socket
listen
accept
onload_set_stackname(bar)
In both cases the listening socket and the accepted socket will be in stack foo.
Note 2
Scope defines the namespace in which a stack belongs. 
A stackname of foo in scope user is not the same 
as a stackname of foo in scope thread. 
Scope restricts the visibility of a stack 
to either the current thread, current process, current user or is unrestricted (global). 
This has the property that, with process‐based scoping for example, 
two processes can have the same stackname 
without sharing a stack, 
as the stack for each process has a different namespace.
Note 3
Scoping can be thought of as adding a suffix to the supplied name e.g.
ONLOAD_SCOPE_THREAD: <stackname>‐t<thread_id>
ONLOAD_SCOPE_PROCESS: <stackname>‐p<process_id>
ONLOAD_SCOPE_USER: <stackname>‐u<user_id>
ONLOAD_SCOPE_GLOBAL: <stackname>
This is an example only and the implementation is free to do something different 
such as maintaining different lists for different scopes.
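The suffix idea in Note 3 can be made concrete with a small sketch. Everything here is hypothetical (the toy_scope enum, toy_scoped_name(), and the id argument standing for a thread, process or user ID); as the note says, the real implementation is free to work differently.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical scopes mirroring the four cases listed in Note 3. */
enum toy_scope {
    TOY_SCOPE_THREAD,
    TOY_SCOPE_PROCESS,
    TOY_SCOPE_USER,
    TOY_SCOPE_GLOBAL
};

/* Build the conceptual namespaced name, e.g. "foo-p100" for process 100.
 * Returns the snprintf() result, or -1 for an unknown scope. */
int toy_scoped_name(char *out, size_t out_len, enum toy_scope scope,
                    const char *name, long id)
{
    assert(out != NULL && name != NULL);
    switch (scope) {
    case TOY_SCOPE_THREAD:
        return snprintf(out, out_len, "%s-t%ld", name, id);
    case TOY_SCOPE_PROCESS:
        return snprintf(out, out_len, "%s-p%ld", name, id);
    case TOY_SCOPE_USER:
        return snprintf(out, out_len, "%s-u%ld", name, id);
    case TOY_SCOPE_GLOBAL:
        return snprintf(out, out_len, "%s", name);   /* id is ignored */
    }
    return -1;
}
```

With process scope, two processes that both ask for foo end up with different conceptual names and so do not share a stack, while with global scope both resolve to the same name.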
Note 4
ONLOAD_SCOPE_NOCHANGE will undo the effect 
of a previous call to onload_set_stackname(ONLOAD_THIS_THREAD, …).
If you have previously used onload_set_stackname(ONLOAD_THIS_THREAD, …) 
and want to revert to the behavior of threads 
that are using the ONLOAD_ALL_THREADS configuration, 
without changing that configuration, you can do the following:
onload_set_stackname(
	ONLOAD_ALL_THREADS, 
	ONLOAD_SCOPE_NOCHANGE, 
	"");
Related environment variables
Related environment variables are:
EF_DONT_ACCELERATE
Default: 0
Minimum: 0
Maximum: 1
Scope: Per‐process
If this environment variable is set 
then acceleration is disabled for ALL sockets, 
which are handed off to the kernel stack, 
until the application overrides this state with a call to onload_set_stackname().
EF_STACK_PER_THREAD
Default: 0
Minimum: 0
Maximum: 1
Scope: Per‐process
If this environment variable is set 
each socket created by the application will be placed in a stack 
depending on the thread in which it is created. 
Stacks could, for example, be named 
using the thread ID of the thread 
that creates the stack, but this should not be relied upon.
A call to onload_set_stackname overrides this variable. 
EF_DONT_ACCELERATE takes precedence over this variable.
EF_NAME
Default: none
Minimum: 0 chars
Maximum: 8 chars
Scope: per‐stack
The environment variable EF_NAME will be honored 
to control Onload stack sharing. 
However, a call to onload_set_stackname overrides this variable, 
and EF_DONT_ACCELERATE and EF_STACK_PER_THREAD 
both take precedence over EF_NAME.

onload_move_fd

Description
Move the file descriptor to the current stack. 
The target stack can be specified with onload_set_stackname(),
then use onload_move_fd() to put the socket into the target stack.
A socket can exist only in a single stack. 
Moving a socket to a different stack does not create a copy of the socket 
in originator and target stacks. 
Limited to TCP closed or accepted sockets only.
Definition
int onload_move_fd (int fd)
Formal Parameters
fd ‐ the file descriptor to be moved to the current stack.
Return Value
0 on success
non‐zero otherwise.
Notes
•Useful to move fds obtained by accept() 
to a different Onload stack from the listening socket.
•Cannot be used on actively opened connections, 
although it is possible to use onload_set_stackname() 
before calling connect() to achieve the same result.
•The socket must have empty send 
and retransmit queues (i.e. send not called on this socket)
•The socket must have a simple receive queue (no loss, reordering, etc)
•The fd must not yet be in an epoll set.
•The onload_move_fd function should not be used 
if SO_TIMESTAMPING is set to a non‐zero value for the originating socket.
•Should not be used simultaneously 
with other I/O multiplex actions i.e. poll(), select(), recv() etc on the file descriptor.
•This function is not async‐safe 
and should never be called from any process function handling signals.
•This function cannot be used to hand sockets over to the kernel. 
It is not possible to use onload_set_stackname (ONLOAD_DONT_ACCELERATE) 
and then onload_move_fd().

NOTE: 
The onload_move_fd function does not check 
whether a destination stack has either RX or TX timestamping enabled.

onload_stackname_save

Description
Save the state of the current Onload stack 
identified by the previous call to onload_set_stackname().
Definition
int onload_stackname_save (void)
Formal Parameters
none
Return Value
0 on success
‐ENOMEM when memory cannot be allocated.

onload_stackname_restore

Description
Restore stack state 
saved with a previous call to onload_stackname_save(). 
All updates/changes to state of the current stack will be deleted 
and all state previously saved will be restored. 
To avoid unexpected results, 
the stack should be restored in the same thread 
as used to call onload_stackname_save().
Definition
int onload_stackname_restore (void)
Formal Parameters
none
Return Value
0 on success
non‐zero if an error occurs.
Notes
The API stackname save and restore functions provide flexibility 
when binding sockets to an Onload stack.
Using a combination of onload_set_stackname(), 
onload_stackname_save() and onload_stackname_restore(), 
the user is able to create default stack settings 
which apply to one or more sockets, 
save this state 
and then create changed stack settings 
which are applied to other sockets. 
The original default settings can then be restored to apply to subsequent sockets.
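The save, change, restore sequence described above might be packaged as follows. This is a sketch, not the library's implementation: run_in_stack() and the USE_ONLOAD_EXT build flag are hypothetical, the constant values in the stand-in branch are placeholders, and the stand-in functions are a toy model that only tracks the current name so the sketch builds without Onload.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#ifdef USE_ONLOAD_EXT              /* hypothetical build flag */
#include <onload/extensions.h>
#else
#define ONLOAD_ALL_THREADS   1     /* placeholder value */
#define ONLOAD_SCOPE_GLOBAL  4     /* placeholder value */

static char toy_current[9];        /* up to 8 characters plus terminator */
static char toy_saved[9];

int onload_set_stackname(int who, int scope, const char *name)
{
    (void)who;
    (void)scope;
    if (strlen(name) > 8) {
        errno = ENAMETOOLONG;
        return -1;
    }
    strcpy(toy_current, name);
    return 0;
}

int onload_stackname_save(void)
{
    strcpy(toy_saved, toy_current);
    return 0;
}

int onload_stackname_restore(void)
{
    strcpy(toy_current, toy_saved);
    return 0;
}
#endif

/* Hypothetical helper: run create_sockets() with `name` as the stack for
 * new sockets, then put the previous stackname settings back. */
int run_in_stack(const char *name, void (*create_sockets)(void))
{
    assert(name != NULL);
    int rc = onload_stackname_save();
    if (rc != 0)
        return rc;
    rc = onload_set_stackname(ONLOAD_ALL_THREADS, ONLOAD_SCOPE_GLOBAL, name);
    if (rc == 0 && create_sockets != NULL)
        create_sockets();          /* sockets created here use `name` */
    int restore_rc = onload_stackname_restore();
    return rc != 0 ? rc : restore_rc;
}
```

Restoring even when onload_set_stackname() fails keeps the caller's settings unchanged on the error path. As noted above, the restore should happen in the same thread that performed the save.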

Stacks API Usage

Using a combination of the EF_DONT_ACCELERATE environment variable 
and the function onload_set_stackname(), 
the user is able to control/select sockets 
which are to be accelerated and isolate these performance critical sockets 
and threads from the rest of the system.

onload_stack_opt_set_int

Description
Set/modify per stack options 
that all subsequently created stacks will use 
instead of using the existing global stack options.
Definition
int onload_stack_opt_set_int(
	const char* name,
	int64_t value)
Formal Parameters
name
Stack option to modify
value
New value for the stack option.
Example
onload_stack_opt_set_int("EF_SCALABLE_FILTERS_ENABLE", 1);
Return Value
0 on success
errno set to EINVAL if the requested option is not found, or to ENOMEM if memory cannot be allocated.
Notes
Cannot be used to modify options on existing stacks 
‐ only for new stacks.
Cannot be used to modify process options 
‐ only stack options.
Modified options will be used for all newly created stacks 
until onload_stack_opt_reset() is called.

onload_stack_opt_reset

Description
Revert to using global stack options for newly created stacks.
Definition
int onload_stack_opt_reset(void)
Formal Parameters
None.
Return Value
0 always
Notes
Should be called following a call to onload_stack_opt_set_int() 
to revert to using global stack options for all newly created stacks.

Stacks API ‐ Examples

•This thread will use stack foo, 
other threads in the process will continue as before.
onload_set_stackname(
	ONLOAD_THIS_THREAD, 
	ONLOAD_SCOPE_GLOBAL, 
	"foo");
•All threads in this process will get their own stack called foo. 
This is equivalent to the EF_STACK_PER_THREAD environment variable.
onload_set_stackname(
	ONLOAD_ALL_THREADS, 
	ONLOAD_SCOPE_THREAD, 
	"foo");
•All threads in this process 
will share a stack called foo. 
If another process did the same function call it will get its own stack.
onload_set_stackname(
	ONLOAD_ALL_THREADS, 
	ONLOAD_SCOPE_PROCESS, 
	"foo");
•All threads in this process will share a stack called foo. 
If another process run by the same user did the same, 
it would share the same stack as the first process. 
If another process run by a different user did the same it would get its own stack.
onload_set_stackname(
	ONLOAD_ALL_THREADS, 
	ONLOAD_SCOPE_USER, 
	"foo");
•Equivalent to EF_NAME. 
All threads will use a stack called foo 
which is shared by any other process which does the same.
onload_set_stackname(
	ONLOAD_ALL_THREADS, 
	ONLOAD_SCOPE_GLOBAL, 
	"foo");
•Equivalent to EF_DONT_ACCELERATE. 
New sockets/pipes will not be accelerated 
until another call to onload_set_stackname().
onload_set_stackname(
	ONLOAD_ALL_THREADS, 
	ONLOAD_SCOPE_GLOBAL, 
	ONLOAD_DONT_ACCELERATE);

onload_ordered_epoll_wait

For details of the Wire Order Delivery feature, 
refer to Wire Order Delivery.
Description
If the epoll set contains accelerated sockets 
in only one stack 
this function can be used instead of epoll_wait() 
to return events in the order 
in which they were received from the wire. 
There is no explicit check on sockets, 
so applications must ensure that the rules are applied to avoid mis‐ordering of packets.
Definition
int onload_ordered_epoll_wait (
	int epfd,
	struct epoll_event *events,
	struct onload_ordered_epoll_event *oo_events,
	int maxevents,
	int timeout);
Formal Parameters
See definition epoll_wait().
Return Value
•A positive value identifies the number of epoll_evs / ordered_evs to process.
•A zero value indicates there are no events
 which can be processed 
 while maintaining ordering 
 i.e. there may be no data or only unordered data.
•A negative return value identifies an error condition.
Notes
Any file descriptors returned as ready 
without a valid timestamp 
i.e. tv_sec = 0, should be considered un‐ordered with respect to the rest of the set. 
This can occur for data received via the kernel 
or data returned without a hardware timestamp 
i.e. from an interface that does not support hardware timestamping.
The environment variable EF_UL_EPOLL=1 must be set. 
Hardware timestamps are required. 
This feature is only available on the SFN7000, SFN8000 and X2 series adapters.
struct onload_ordered_epoll_event {
	/* The hardware timestamp of the first readable data */
	struct timespec ts;
	/* Number of bytes that may be read to maintain wire order */
	int bytes;
};
ONLOAD_MSG_ONEPKT and EF_TCP_RCVBUF_STRICT are incompatible 
with the wire order delivery feature.
Refer to Wire Order Delivery for details.
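When consuming the results, the bytes field and the timestamp rule from the Notes can be combined into a per-event read limit. The sketch below is one possible policy and not part of the API: ordered_read_limit() and the USE_ONLOAD_EXT build flag are hypothetical, and the struct in the stand-in branch is copied from the definition above so the sketch builds without Onload.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#ifdef USE_ONLOAD_EXT              /* hypothetical build flag */
#include <onload/extensions.h>
#else
struct onload_ordered_epoll_event {
    /* The hardware timestamp of the first readable data */
    struct timespec ts;
    /* Number of bytes that may be read to maintain wire order */
    int bytes;
};
#endif

/* Hypothetical helper: how many bytes to read from a ready fd without
 * breaking wire order. Returns 0 for an event with no valid timestamp
 * (tv_sec == 0), which the caller should handle as unordered data. */
size_t ordered_read_limit(const struct onload_ordered_epoll_event *ev,
                          size_t buf_len)
{
    assert(ev != NULL);
    if (ev->ts.tv_sec == 0 || ev->bytes <= 0)
        return 0;
    return (size_t)ev->bytes < buf_len ? (size_t)ev->bytes : buf_len;
}
```

Returning 0 for unordered events is a design choice made for this sketch. An application may prefer to read such data on a separate path, since it cannot be ordered against the timestamped events.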