Zero copy explained: an analysis through sendfile

To understand the impact of sendfile, it is important to understand the common data path for transferring data from a file to a socket:

1. The operating system reads data from the disk into the pagecache in kernel space.
2. The application reads the data from kernel space into a user-space buffer.
3. The application writes the data back into kernel space, into a socket buffer.
4. The operating system copies the data from the socket buffer to the NIC buffer, from which it is sent over the network.

This is clearly inefficient: there are four copies and two system calls. Using sendfile, this re-copying is avoided by allowing the OS to send the data from the pagecache to the network directly. So in this optimized path, only the final copy to the NIC buffer is needed.
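For comparison, the conventional path described above can be sketched in C roughly as follows. This is only an illustration: the function name `copy_read_write` and the 64 KiB buffer size are assumptions made for this example, not something taken from the original text.

```c
#include <errno.h>
#include <unistd.h>

/* Copy everything from in_fd to out_fd through a user-space buffer.
 * Returns the total number of bytes copied, or -1 on error (errno set).
 * Hypothetical helper, shown only to illustrate the read()/write() path. */
ssize_t copy_read_write(int in_fd, int out_fd)
{
    char buf[65536];            /* user-space staging buffer (size is arbitrary) */
    ssize_t total = 0;

    for (;;) {
        /* System call 1: the kernel copies from the pagecache into buf. */
        ssize_t n = read(in_fd, buf, sizeof buf);
        if (n == 0)
            return total;       /* end of file */
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal, retry */
            return -1;
        }

        /* System call 2: the kernel copies from buf into the socket buffer.
         * write() may accept fewer bytes than requested, so loop. */
        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        total += n;
    }
}
```

Every chunk crosses the user/kernel boundary twice here, which is the overhead sendfile is meant to remove.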

SENDFILE(2)               Linux Programmer's Manual              SENDFILE(2)


NAME

sendfile - transfer data between file descriptors


SYNOPSIS

#include <sys/sendfile.h>

ssize_t sendfile(int out_fd, int in_fd, off_t *offset, size_t count);


DESCRIPTION

sendfile() copies data between one file descriptor and another.
Because this copying is done within the kernel, sendfile() is more
efficient than the combination of read(2) and write(2), which would
require transferring data to and from user space.

in_fd should be a file descriptor opened for reading and out_fd
should be a descriptor opened for writing.

If offset is not NULL, then it points to a variable holding the file
offset from which sendfile() will start reading data from in_fd.
When sendfile() returns, this variable will be set to the offset of
the byte following the last byte that was read.  If offset is not
NULL, then sendfile() does not modify the current file offset of
in_fd; otherwise the current file offset is adjusted to reflect the
number of bytes read from in_fd.

If offset is NULL, then data will be read from in_fd starting at the
current file offset, and the file offset will be updated by the call.
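The two offset modes can be illustrated with a small C sketch. The helper name `send_range` and its `next` output parameter are hypothetical, introduced only for this example; the sketch makes a single sendfile() call and does not retry partial transfers.

```c
#include <sys/sendfile.h>
#include <sys/types.h>

/* Send up to `len` bytes of in_fd, starting at byte `start`, to out_fd
 * without moving in_fd's own file offset (non-NULL offset mode).
 * On success, returns the number of bytes sent and stores the offset of
 * the byte following the last byte read in *next. Returns -1 on error.
 * Hypothetical helper for illustration only. */
ssize_t send_range(int out_fd, int in_fd, off_t start, size_t len, off_t *next)
{
    off_t off = start;                    /* sendfile() starts reading here */
    ssize_t n = sendfile(out_fd, in_fd, &off, len);
    if (n >= 0)
        *next = off;                      /* updated by the kernel to start + n */
    return n;
}
```

Passing NULL instead of `&off` would make the call read from the current file offset of in_fd and advance it, as described above.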

count is the number of bytes to copy between the file descriptors.

The in_fd argument must correspond to a file which supports
mmap(2)-like operations (i.e., it cannot be a socket).

In Linux kernels before 2.6.33, out_fd must refer to a socket.  Since
Linux 2.6.33 it can be any file.  If it is a regular file, then
sendfile() changes the file offset appropriately.


RETURN VALUE

If the transfer was successful, the number of bytes written to out_fd
is returned.  On error, -1 is returned, and errno is set
appropriately.
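Because the return value is the number of bytes actually written, which can be smaller than count, callers usually loop until everything has been sent. A minimal sketch, assuming a blocking out_fd; the name `send_all` is hypothetical, and the EINTR retry is a defensive assumption rather than something listed on this page:

```c
#include <errno.h>
#include <sys/sendfile.h>
#include <sys/types.h>

/* Keep calling sendfile() until `count` bytes have been sent or in_fd
 * reaches end of file. *offset is advanced by the kernel on each call.
 * Returns the number of bytes sent (less than count only if in_fd ended
 * early), or -1 on error with errno set. Hypothetical helper. */
ssize_t send_all(int out_fd, int in_fd, off_t *offset, size_t count)
{
    size_t sent = 0;

    while (sent < count) {
        ssize_t n = sendfile(out_fd, in_fd, offset, count - sent);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* assumed: retry if a signal interrupted us */
            return -1;          /* includes EAGAIN on nonblocking sockets */
        }
        if (n == 0)
            break;              /* nothing more to read from in_fd */
        sent += (size_t)n;
    }
    return (ssize_t)sent;
}
```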


ERRORS

EAGAIN Nonblocking I/O has been selected using O_NONBLOCK and the
write would block.

EBADF  The input file was not opened for reading or the output file
was not opened for writing.

EINVAL Descriptor is not valid or locked, or an mmap(2)-like
operation is not available for in_fd.

EIO    Unspecified error while reading from in_fd.

ENOMEM Insufficient memory to read from in_fd.


VERSIONS

sendfile() is a new feature in Linux 2.2.  The include file
<sys/sendfile.h> is present since glibc 2.1.


CONFORMING TO

Not specified in POSIX.1-2001, or other standards.

Other UNIX systems implement sendfile() with different semantics and
prototypes.  It should not be used in portable programs.


NOTES

If you plan to use sendfile() for sending files to a TCP socket, but
need to send some header data in front of the file contents, you will
find it useful to employ the TCP_CORK option, described in tcp(7), to
minimize the number of packets and to tune performance.
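The TCP_CORK pattern mentioned above might look like the following sketch. The function name `send_header_and_file` and its parameters are assumptions for illustration; `sock` is assumed to be a connected, blocking TCP socket, and the file is sent from offset 0.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Send a header followed by up to `count` bytes of file_fd over TCP
 * socket `sock`, corking the socket so the header and the first file
 * bytes can share a packet. Stops early if the file ends.
 * Returns 0 on success, -1 on error. Hypothetical helper. */
int send_header_and_file(int sock, const void *hdr, size_t hdr_len,
                         int file_fd, size_t count)
{
    int cork_on = 1, cork_off = 0;
    const char *p = hdr;
    off_t pos = 0;
    int rc = -1;

    /* Hold back partial frames until the cork is removed. */
    if (setsockopt(sock, IPPROTO_TCP, TCP_CORK, &cork_on, sizeof cork_on) < 0)
        return -1;

    /* Header goes through ordinary write(). */
    while (hdr_len > 0) {
        ssize_t w = write(sock, p, hdr_len);
        if (w < 0)
            goto uncork;
        p += w;
        hdr_len -= (size_t)w;
    }

    /* File body goes through sendfile(). */
    while (count > 0) {
        ssize_t n = sendfile(sock, file_fd, &pos, count);
        if (n < 0)
            goto uncork;
        if (n == 0)
            break;              /* file shorter than count */
        count -= (size_t)n;
    }
    rc = 0;

uncork:
    /* Clearing the option lets any queued partial frame go out. */
    if (setsockopt(sock, IPPROTO_TCP, TCP_CORK, &cork_off, sizeof cork_off) < 0)
        rc = -1;
    return rc;
}
```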

In Linux 2.4 and earlier, out_fd could also refer to a regular file,
and sendfile() changed the current offset of that file.

The original Linux sendfile() system call was not designed to handle
large file offsets.  Consequently, Linux 2.4 added sendfile64(), with
a wider type for the offset argument.  The glibc sendfile() wrapper
function transparently deals with the kernel differences.

Applications may wish to fall back to read(2)/write(2) in the case
where sendfile() fails with EINVAL or ENOSYS.
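One possible shape for such a fallback is sketched below. The names `copy_fd` and `copy_fallback` and the 8 KiB buffer are assumptions for this example. Because the sketch passes a NULL offset, in_fd's file offset already reflects whatever sendfile() managed to send, so the fallback can simply continue from the current position.

```c
#include <errno.h>
#include <sys/sendfile.h>
#include <sys/types.h>
#include <unistd.h>

/* Plain read()/write() copy of up to `count` bytes (hypothetical helper). */
static ssize_t copy_fallback(int out_fd, int in_fd, size_t count)
{
    char buf[8192];
    size_t done = 0;

    while (done < count) {
        size_t want = count - done < sizeof buf ? count - done : sizeof buf;
        ssize_t n = read(in_fd, buf, want);
        if (n < 0)
            return -1;
        if (n == 0)
            break;                              /* end of input */
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0)
                return -1;
            off += w;
        }
        done += (size_t)n;
    }
    return (ssize_t)done;
}

/* Try sendfile() first; if it reports EINVAL or ENOSYS, finish the job
 * with read()/write(). Returns bytes copied, or -1 on error. */
ssize_t copy_fd(int out_fd, int in_fd, size_t count)
{
    size_t done = 0;

    while (done < count) {
        ssize_t n = sendfile(out_fd, in_fd, NULL, count - done);
        if (n < 0) {
            if (errno == EINVAL || errno == ENOSYS) {
                ssize_t r = copy_fallback(out_fd, in_fd, count - done);
                return r < 0 ? -1 : (ssize_t)done + r;
            }
            return -1;
        }
        if (n == 0)
            break;                              /* end of input */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```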

The Linux-specific splice(2) call supports transferring data between
arbitrary files (e.g., a pair of sockets).


SEE ALSO

mmap(2), open(2), socket(2), splice(2)


COLOPHON

This page is part of release 3.54 of the Linux man-pages project.  A
description of the project, and information about reporting bugs, can
be found at http://www.kernel.org/doc/man-pages/.

