(Standard C library translation) fcntl

fcntl - manipulate file descriptor

Required header files
#include <unistd.h>
#include <fcntl.h>

int fcntl(int fd, int cmd, ... /* arg */ );


fcntl() performs one of the operations described below on the open file descriptor fd. The operation is determined by cmd.

fcntl() can take an optional third argument. Whether or not this argument is required is determined by cmd. The required argument type is indicated in parentheses after each cmd name (in most cases, the required type is int, and we identify the argument using the name arg), or void is specified if the argument is not required.

Duplicating a file descriptor
F_DUPFD (int)
Find the lowest-numbered available file descriptor greater than or equal to arg and make it a copy of fd. This is different from dup2(2), which uses exactly the descriptor specified.

On success, the new descriptor is returned. See dup(2) for further details.

F_DUPFD_CLOEXEC (int; since Linux 2.6.24)
As for F_DUPFD, but additionally set the close-on-exec flag for the duplicate descriptor. Specifying this flag permits a program to avoid an additional fcntl() F_SETFD operation to set the FD_CLOEXEC flag. For an explanation of why this flag is useful, see the description of O_CLOEXEC in open(2).


File descriptor flags
The following commands manipulate the flags associated with a file descriptor. Currently, only one such flag is defined: FD_CLOEXEC, the close-on-exec flag. If the FD_CLOEXEC bit is 0, the file descriptor will remain open across an execve(2); otherwise it will be closed.

F_GETFD (void)
Read the file descriptor flags; arg is ignored.

F_SETFD (int)
Set the file descriptor flags to the value specified by arg.
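
A minimal sketch of the usual read-modify-write pattern for the descriptor flags follows. The helper name set_cloexec is an assumption made for this example; reading the flags first is a defensive habit in case more descriptor flags are ever defined, not something the text above requires.

```c
#include <fcntl.h>
#include <unistd.h>

/* Turn the close-on-exec flag on (enable != 0) or off (enable == 0)
 * for fd, preserving any other descriptor flags.
 * Returns 0 on success, -1 with errno set on failure. */
int set_cloexec(int fd, int enable)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;

    if (enable)
        flags |= FD_CLOEXEC;
    else
        flags &= ~FD_CLOEXEC;

    return fcntl(fd, F_SETFD, flags);
}
```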


File status flags
Each open file description has certain associated status flags, initialized by open(2) and possibly modified by fcntl(). Duplicated file descriptors (made with dup(2), fcntl(F_DUPFD), fork(2), etc.) refer to the same open file description, and thus share the same file status flags. The file status flags and their semantics are described in open(2).

F_GETFL (void)
Get the file access mode and the file status flags; arg is ignored.

F_SETFL (int)
Set the file status flags to the value specified by arg. File access mode (O_RDONLY, O_WRONLY, O_RDWR) and file creation flags (i.e., O_CREAT, O_EXCL, O_NOCTTY, O_TRUNC) in arg are ignored. On Linux this command can change only the O_APPEND, O_ASYNC, O_DIRECT, O_NOATIME, and O_NONBLOCK flags.
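
The status-flag commands are typically combined as below. This is a sketch with hypothetical helper names (set_nonblocking, access_mode). O_ACCMODE is the standard mask for extracting the access mode; it is not mentioned in the text above.

```c
#include <fcntl.h>
#include <unistd.h>

/* Add O_NONBLOCK to the status flags of the open file description
 * behind fd, leaving the other status flags as they were.
 * Returns 0 on success, -1 with errno set on failure. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Return the access mode of fd (O_RDONLY, O_WRONLY or O_RDWR),
 * or -1 on error. */
int access_mode(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return flags & O_ACCMODE;
}
```

Because the flags live in the open file description, calling set_nonblocking() on one descriptor also affects every duplicate of it.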

Advisory locking
F_GETLK, F_SETLK and F_SETLKW are used to acquire, release, and test for the existence of record locks (also known as file-segment or file-region locks). The third argument, lock, is a pointer to a structure that has at least the following fields (in unspecified order).
       struct flock {
           ...
           short l_type;    /* Type of lock: F_RDLCK,
                               F_WRLCK, F_UNLCK */
           short l_whence;  /* How to interpret l_start:
                               SEEK_SET, SEEK_CUR, SEEK_END */
           off_t l_start;   /* Starting offset for lock */
           off_t l_len;     /* Number of bytes to lock */
           pid_t l_pid;     /* PID of process blocking our lock
                               (F_GETLK only) */
           ...
       };
The l_whence, l_start, and l_len fields of this structure specify the range of bytes we wish to lock. Bytes past the end of the file may be locked, but not bytes before the start of the file.

l_start is the starting offset for the lock, and is interpreted relative to either: the start of the file (if l_whence is SEEK_SET); the current file offset (if l_whence is SEEK_CUR); or the end of the file (if l_whence is SEEK_END). In the final two cases, l_start can be a negative number provided the offset does not lie before the start of the file.

l_len specifies the number of bytes to be locked. If l_len is positive, then the range to be locked covers bytes l_start up to and including l_start+l_len-1. Specifying 0 for l_len has the special meaning: lock all bytes starting at the location specified by l_whence and l_start through to the end of file, no matter how large the file grows.

POSIX.1-2001 allows (but does not require) an implementation to support a negative l_len value; if l_len is negative, the interval described by lock covers bytes l_start+l_len up to and including l_start-1. This is supported by Linux since kernel versions 2.4.21 and 2.5.49.
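
The three cases for l_len can be worked through with a small pure function. This is only an illustration of the arithmetic described above: lock_byte_range is a hypothetical name, start stands for the absolute offset that l_whence and l_start resolve to, and the value -1 for *last is an arbitrary marker chosen here to mean "through end of file".

```c
#include <sys/types.h>

/* Compute the first and last byte covered by a lock whose resolved
 * starting offset is start and whose length field is len.
 *   len > 0 : bytes start .. start+len-1
 *   len < 0 : bytes start+len .. start-1
 *   len == 0: bytes start .. end of file (*last is set to -1)
 * Returns 0 on success, or -1 if the range would begin before the
 * start of the file. */
int lock_byte_range(off_t start, off_t len, off_t *first, off_t *last)
{
    if (len > 0) {
        *first = start;
        *last = start + len - 1;
    } else if (len < 0) {
        *first = start + len;
        *last = start - 1;
    } else {
        *first = start;
        *last = -1;
    }
    return (*first < 0) ? -1 : 0;
}
```

For example, a resolved start of 100 with l_len 10 covers bytes 100 to 109, while the same start with l_len -10 covers bytes 90 to 99.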

The l_type field can be used to place a read (F_RDLCK) or a write (F_WRLCK) lock on a file. Any number of processes may hold a read lock (shared lock) on a file region, but only one process may hold a write lock (exclusive lock). An exclusive lock excludes all other locks, both shared and exclusive.

A single process can hold only one type of lock on a file region; if a new lock is applied to an already-locked region, then the existing lock is converted to the new lock type. (Such conversions may involve splitting, shrinking, or coalescing with an existing lock if the byte range specified by the new lock does not precisely coincide with the range of the existing lock.)

F_SETLK (struct flock *)
Acquire a lock (when l_type is F_RDLCK or F_WRLCK) or release a lock (when l_type is F_UNLCK) on the bytes specified by the l_whence, l_start, and l_len fields of lock. If a conflicting lock is held by another process, this call returns -1 and sets errno to EACCES or EAGAIN.

F_SETLKW (struct flock *)
As for F_SETLK, but if a conflicting lock is held on the file, then wait for that lock to be released. If a signal is caught while waiting, then the call is interrupted and (after the signal handler has returned) returns immediately (with return value -1 and errno set to EINTR; see signal(7)).
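
A hedged sketch of both commands follows. The helper names make_lock, try_lock and wait_lock are invented for this example, offsets are always taken from the start of the file (SEEK_SET), and restarting F_SETLKW after EINTR is one possible policy rather than something the text above prescribes.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Build a struct flock for len bytes starting at offset start,
 * measured from the beginning of the file. */
static struct flock make_lock(short type, off_t start, off_t len)
{
    struct flock fl;
    memset(&fl, 0, sizeof fl);
    fl.l_type = type;          /* F_RDLCK, F_WRLCK or F_UNLCK */
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = len;            /* 0 means "through end of file" */
    return fl;
}

/* Non-blocking attempt. Returns 1 if the lock was placed (or released),
 * 0 if another process holds a conflicting lock, -1 on any other error. */
int try_lock(int fd, short type, off_t start, off_t len)
{
    struct flock fl = make_lock(type, start, len);
    if (fcntl(fd, F_SETLK, &fl) == 0)
        return 1;
    return (errno == EACCES || errno == EAGAIN) ? 0 : -1;
}

/* Blocking variant: waits for conflicting locks to go away and restarts
 * the call if a signal handler interrupts it. Returns 0 or -1. */
int wait_lock(int fd, short type, off_t start, off_t len)
{
    struct flock fl = make_lock(type, start, len);
    int r;
    do {
        r = fcntl(fd, F_SETLKW, &fl);
    } while (r == -1 && errno == EINTR);
    return r;
}
```

Checking for both EACCES and EAGAIN matters because either may be reported for a conflict. A caller that wants a timeout would drop the retry loop in wait_lock() and let an alarm interrupt the call instead.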

F_GETLK (struct flock *)
On input to this call, lock describes a lock we would like to place on the file. If the lock could be placed, fcntl() does not actually place it, but returns F_UNLCK in the l_type field of lock and leaves the other fields of the structure unchanged. If one or more incompatible locks would prevent this lock being placed, then fcntl() returns details about one of these locks in the l_type, l_whence, l_start, and l_len fields of lock and sets l_pid to be the PID of the process holding that lock.
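
The F_GETLK probe can be wrapped as follows. The name lock_holder and its return convention (0 for "no conflict") are assumptions made for this sketch; the range is again measured from the start of the file.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Ask whether a lock of the given type could be placed on len bytes
 * starting at offset start. Nothing is actually locked.
 * Returns 0 if the lock could be placed, the PID of one process holding
 * an incompatible lock otherwise, or -1 on error. */
pid_t lock_holder(int fd, short type, off_t start, off_t len)
{
    struct flock fl;
    memset(&fl, 0, sizeof fl);
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = len;

    if (fcntl(fd, F_GETLK, &fl) == -1)
        return -1;

    /* The kernel rewrites l_type to F_UNLCK when there is no conflict. */
    return (fl.l_type == F_UNLCK) ? 0 : fl.l_pid;
}
```

The answer is only a snapshot: another process may take or drop a lock between this call and a later F_SETLK, so the result is best used for diagnostics rather than as a guarantee.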

In order to place a read lock, fd must be open for reading. In order to place a write lock, fd must be open for writing. To place both types of lock, open a file read-write.

As well as being removed by an explicit F_UNLCK, record locks are automatically released when the process terminates or if it closes any file descriptor referring to a file on which locks are held. This is bad: it means that a process can lose the locks on a file like /etc/passwd or /etc/mtab when for some reason a library function decides to open, read and close it.

Record locks are not inherited by a child created via fork(2), but are preserved across an execve(2).

Because of the buffering performed by the stdio(3) library, the use of record locking with routines in that package should be avoided; use read(2) and write(2) instead.

Mandatory locking
(Non-POSIX.) The above record locks may be either advisory or mandatory, and are advisory by default.

Advisory locks are not enforced and are useful only between cooperating processes.

Mandatory locks are enforced for all processes. If a process tries to perform an incompatible access (e.g., read(2) or write(2)) on a file region that has an incompatible mandatory lock, then the result depends upon whether the O_NONBLOCK flag is enabled for its open file description. If the O_NONBLOCK flag is not enabled, then the system call is blocked until the lock is removed or converted to a mode that is compatible with the access. If the O_NONBLOCK flag is enabled, then the system call fails with the error EAGAIN.

To make use of mandatory locks, mandatory locking must be enabled both on the file system that contains the file to be locked, and on the file itself. Mandatory locking is enabled on a file system using the "-o mand" option to mount(8), or the MS_MANDLOCK flag for mount(2). Mandatory locking is enabled on a file by disabling group execute permission on the file and enabling the set-group-ID permission bit (see chmod(1) and chmod(2)).

The Linux implementation of mandatory locking is unreliable. See BUGS below.

Managing signals
F_GETOWN, F_SETOWN, F_GETOWN_EX, F_SETOWN_EX, F_GETSIG and F_SETSIG are used to manage I/O availability signals:

F_GETOWN (void)
Return (as the function result) the process ID or process group currently receiving SIGIO and SIGURG signals for events on file descriptor fd. Process IDs are returned as positive values; process group IDs are returned as negative values (but see BUGS below). arg is ignored.

F_SETOWN (int)
Set the process ID or process group ID that will receive SIGIO and SIGURG signals for events on file descriptor fd to the ID given in arg. A process ID is specified as a positive value; a process group ID is specified as a negative value. Most commonly, the calling process specifies itself as the owner (that is, arg is specified as getpid(2)).

If you set the O_ASYNC status flag on a file descriptor by using the F_SETFL command of fcntl(), a SIGIO signal is sent whenever input or output becomes possible on that file descriptor. F_SETSIG can be used to obtain delivery of a signal other than SIGIO.

Sending a signal to the owner process (group) specified by F_SETOWN is subject to the same permissions checks as are described for kill(2), where the sending process is the one that employs F_SETOWN (but see BUGS below). If this permission check fails, then the signal is silently discarded.

If the file descriptor fd refers to a socket, F_SETOWN also selects the recipient of SIGURG signals that are delivered when out-of-band data arrives on that socket. (SIGURG is sent in any situation where select(2) would report the socket as having an "exceptional condition".)

The following was true in 2.6.x kernels up to and including kernel 2.6.11:
If a nonzero value is given to F_SETSIG in a multithreaded process running with a threading library that supports thread groups (e.g., NPTL), then a positive value given to F_SETOWN has a different meaning: instead of being a process ID identifying a whole process, it is a thread ID identifying a specific thread within a process. Consequently, it may be necessary to pass F_SETOWN the result of gettid(2) instead of getpid(2) to get sensible results when F_SETSIG is used. (In current Linux threading implementations, a main thread's thread ID is the same as its process ID. This means that a single-threaded program can equally use gettid(2) or getpid(2) in this scenario.) Note, however, that the statements in this paragraph do not apply to the SIGURG signal generated for out-of-band data on a socket: this signal is always sent to either a process or a process group, depending on the value given to F_SETOWN.

The above behavior was accidentally dropped in Linux 2.6.12, and won't be restored. From Linux 2.6.32 onward, use F_SETOWN_EX to target SIGIO and SIGURG signals at a particular thread.

F_GETOWN_EX (struct f_owner_ex *) (since Linux 2.6.32)
Return the current file descriptor owner settings as defined by a previous F_SETOWN_EX operation. The information is returned in the structure pointed to by arg, which has the following form:
          struct f_owner_ex {
              int   type;
              pid_t pid;
          };

The type field will have one of the values F_OWNER_TID, F_OWNER_PID, or F_OWNER_PGRP. The pid field is a positive integer representing a thread ID, process ID, or process group ID. See F_SETOWN_EX for more details.

F_SETOWN_EX (struct f_owner_ex *) (since Linux 2.6.32)
This operation performs a similar task to F_SETOWN. It allows the caller to direct I/O availability signals to a specific thread, process, or process group. The caller specifies the target of signals via arg, which is a pointer to a f_owner_ex structure. The type field has one of the following values, which define how pid is interpreted:
F_OWNER_TID   Send the signal to the thread whose thread ID (the value returned by a call to clone(2) or gettid(2)) is specified in pid.
F_OWNER_PID   Send the signal to the process whose ID is specified in pid.
F_OWNER_PGRP  Send the signal to the process group whose ID is specified in pid. (Note that, unlike with F_SETOWN, a process group ID is specified as a positive value here.)
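
A sketch of thread-directed ownership, assuming Linux with glibc and _GNU_SOURCE defined so that struct f_owner_ex is visible. The function names are hypothetical. The thread ID is obtained through syscall(SYS_gettid) here only because older glibc versions do not provide a gettid() wrapper.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* exposes F_SETOWN_EX, F_GETOWN_EX, struct f_owner_ex */
#endif
#include <fcntl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Direct I/O availability signals for fd at the calling thread only.
 * Returns 0 on success, -1 with errno set on failure. */
int own_by_this_thread(int fd)
{
    struct f_owner_ex owner;
    owner.type = F_OWNER_TID;
    owner.pid = (pid_t)syscall(SYS_gettid);
    return fcntl(fd, F_SETOWN_EX, &owner);
}

/* Read back the current owner settings into *out.
 * Returns 0 on success, -1 with errno set on failure. */
int get_owner(int fd, struct f_owner_ex *out)
{
    return fcntl(fd, F_GETOWN_EX, out);
}
```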

F_GETSIG (void)
Return (as the function result) the signal sent when input or output becomes possible. A value of zero means SIGIO is sent. Any other value (including SIGIO) is the signal sent instead, and in this case additional info is available to the signal handler if installed with SA_SIGINFO. arg is ignored.

F_SETSIG (int)
Set the signal sent when input or output becomes possible to the value given in arg. A value of zero means to send the default SIGIO signal. Any other value (including SIGIO) is the signal to send instead, and in this case additional info is available to the signal handler if installed with SA_SIGINFO.

By using F_SETSIG with a nonzero value, and setting SA_SIGINFO for the signal handler (see sigaction(2)), extra information about I/O events is passed to the handler in a siginfo_t structure. If the si_code field indicates the source is SI_SIGIO, the si_fd field gives the file descriptor associated with the event. Otherwise, there is no indication which file descriptors are pending, and you should use the usual mechanisms (select(2), poll(2), read(2) with O_NONBLOCK set etc.) to determine which file descriptors are available for I/O.

By selecting a real time signal (value >= SIGRTMIN), multiple I/O events may be queued using the same signal numbers. (Queuing is dependent on available memory.) Extra information is available if SA_SIGINFO is set for the signal handler, as above.

Note that Linux imposes a limit on the number of real-time signals that may be queued to a process (see getrlimit(2) and signal(7)) and if this limit is reached, then the kernel reverts to delivering SIGIO, and this signal is delivered to the entire process rather than to a specific thread.
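
The pieces above might be combined as in the following sketch, which assumes Linux with _GNU_SOURCE defined. The names last_ready_fd, on_io and watch_with_rt_signal are invented for the example, and the handler records si_fd without inspecting si_code, which is a simplification. Because the kernel may fall back to plain SIGIO when the queue limit is reached, a real program would also need a handler for SIGIO.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* exposes F_SETSIG and F_GETSIG */
#endif
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Descriptor reported by the most recent notification; -1 until the
 * first one arrives. */
static volatile sig_atomic_t last_ready_fd = -1;

/* SA_SIGINFO-style handler: the second argument carries si_fd. */
static void on_io(int signo, siginfo_t *info, void *context)
{
    (void)signo;
    (void)context;
    last_ready_fd = info->si_fd;
}

/* Deliver signo (for example SIGRTMIN) to this process when I/O becomes
 * possible on fd. Returns 0 on success, -1 with errno set on failure. */
int watch_with_rt_signal(int fd, int signo)
{
    struct sigaction sa;
    int flags;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_io;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(signo, &sa, NULL) == -1)
        return -1;

    if (fcntl(fd, F_SETSIG, signo) == -1)      /* replace SIGIO */
        return -1;
    if (fcntl(fd, F_SETOWN, getpid()) == -1)   /* choose the recipient */
        return -1;

    flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_ASYNC);
}
```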

Using these mechanisms, a program can implement fully asynchronous I/O without using select(2) or poll(2) most of the time.

The use of O_ASYNC, F_GETOWN, F_SETOWN is specific to BSD and Linux. F_GETOWN_EX, F_SETOWN_EX, F_GETSIG, and F_SETSIG are Linux-specific. POSIX has asynchronous I/O and the aio_sigevent structure to achieve similar things; these are also available in Linux as part of the GNU C Library (Glibc).

Leases
F_SETLEASE and F_GETLEASE (Linux 2.4 onward) are used (respectively) to establish a new lease, and retrieve the current lease, on the open file description referred to by the file descriptor fd. A file lease provides a mechanism whereby the process holding the lease (the "lease holder") is notified (via delivery of a signal) when a process (the "lease breaker") tries to open(2) or truncate(2) the file referred to by that file descriptor.

F_SETLEASE (int)
Set or remove a file lease according to which of the following values is specified in the integer arg:

F_RDLCK
Take out a read lease. This will cause the calling process to be notified when the file is opened for writing or is truncated. A read lease can only be placed on a file descriptor that is opened read-only.

F_WRLCK
Take out a write lease. This will cause the caller to be notified when the file is opened for reading or writing or is truncated. A write lease may be placed on a file only if there are no other open file descriptors for the file.

F_UNLCK
Remove our lease from the file.

Leases are associated with an open file description (see open(2)). This means that duplicate file descriptors (created by, for example, fork(2) or dup(2)) refer to the same lease, and this lease may be modified or released using any of these descriptors. Furthermore, the lease is released by either an explicit F_UNLCK operation on any of these duplicate descriptors, or when all such descriptors have been closed.

Leases may only be taken out on regular files. An unprivileged process may only take out a lease on a file whose UID (owner) matches the file system UID of the process. A process with the CAP_LEASE capability may take out leases on arbitrary files.

F_GETLEASE (void)
Indicates what type of lease is associated with the file descriptor fd by returning either F_RDLCK, F_WRLCK, or F_UNLCK, indicating, respectively, a read lease, a write lease, or no lease. arg is ignored.

When a process (the "lease breaker") performs an open(2) or truncate(2) that conflicts with a lease established via F_SETLEASE, the system call is blocked by the kernel and the kernel notifies the lease holder by sending it a signal (SIGIO by default). The lease holder should respond to receipt of this signal by doing whatever cleanup is required in preparation for the file to be accessed by another process (e.g., flushing cached buffers) and then either remove or downgrade its lease. A lease is removed by performing an F_SETLEASE command specifying arg as F_UNLCK. If the lease holder currently holds a write lease on the file, and the lease breaker is opening the file for reading, then it is sufficient for the lease holder to downgrade the lease to a read lease. This is done by performing an F_SETLEASE command specifying arg as F_RDLCK.

If the lease holder fails to downgrade or remove the lease within the number of seconds specified in /proc/sys/fs/lease-break-time then the kernel forcibly removes or downgrades the lease holder's lease.

Once a lease break has been initiated, F_GETLEASE returns the target lease type (either F_RDLCK or F_UNLCK, depending on what would be compatible with the lease breaker) until the lease holder voluntarily downgrades or removes the lease or the kernel forcibly does so after the lease break timer expires.

Once the lease has been voluntarily or forcibly removed or downgraded, and assuming the lease breaker has not unblocked its system call, the kernel permits the lease breaker's system call to proceed.

If the lease breaker's blocked open(2) or truncate(2) is interrupted by a signal handler, then the system call fails with the error EINTR, but the other steps still occur as described above. If the lease breaker is killed by a signal while blocked in open(2) or truncate(2), then the other steps still occur as described above. If the lease breaker specifies the O_NONBLOCK flag when calling open(2), then the call immediately fails with the error EWOULDBLOCK, but the other steps still occur as described above.

The default signal used to notify the lease holder is SIGIO, but this can be changed using the F_SETSIG command to fcntl(). If a F_SETSIG command is performed (even one specifying SIGIO), and the signal handler is established using SA_SIGINFO, then the handler will receive a siginfo_t structure as its second argument, and the si_fd field of this argument will hold the descriptor of the leased file that has been accessed by another process. (This is useful if the caller holds leases against multiple files.)
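
The lease commands can be sketched as below, assuming Linux with _GNU_SOURCE defined. The helper names are hypothetical. respond_to_lease_break() relies on the behavior described above, namely that F_GETLEASE reports the target lease type while a break is in progress; any cleanup such as flushing cached data is left to the caller and is not shown.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* exposes F_SETLEASE and F_GETLEASE */
#endif
#include <fcntl.h>
#include <unistd.h>

/* Take out, change or remove a lease. type is F_RDLCK, F_WRLCK or
 * F_UNLCK. Returns 0 on success, -1 with errno set on failure. */
int set_lease(int fd, int type)
{
    return fcntl(fd, F_SETLEASE, type);
}

/* Return the lease type currently associated with fd
 * (F_RDLCK, F_WRLCK or F_UNLCK), or -1 on error. */
int current_lease(int fd)
{
    return fcntl(fd, F_GETLEASE);
}

/* Intended to be called after the lease-break signal arrives and the
 * caller has finished its cleanup: ask the kernel which lease type
 * would satisfy the breaker, then move to exactly that type.
 * Returns 0 on success, -1 with errno set on failure. */
int respond_to_lease_break(int fd)
{
    int target = fcntl(fd, F_GETLEASE);
    if (target == -1)
        return -1;
    return fcntl(fd, F_SETLEASE, target);
}
```

A lease holder should install a handler for the notification signal before calling set_lease(), since the default action for SIGIO terminates the process.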

File and directory change notification (dnotify)
F_NOTIFY (int)
(Linux 2.4 onward) Provide notification when the directory referred to by fd or any of the files that it contains is changed. The events to be notified are specified in arg, which is a bit mask specified by ORing together zero or more of the following bits:
DN_ACCESS   A file was accessed (read, pread, readv).
DN_MODIFY   A file was modified (write, pwrite, writev, truncate, ftruncate).
DN_CREATE   A file was created (open, creat, mknod, mkdir, link, symlink, rename).
DN_DELETE   A file was unlinked (unlink, rename to another directory, rmdir).
DN_RENAME   A file was renamed within this directory (rename).
DN_ATTRIB   The attributes of a file were changed (chown, chmod, utime[s]).
(In order to obtain these definitions, the _GNU_SOURCE feature test macro must be defined before including any header files.)

Directory notifications are normally "one-shot", and the application must reregister to receive further notifications. Alternatively, if DN_MULTISHOT is included in arg, then notification will remain in effect until explicitly removed.

A series of F_NOTIFY requests is cumulative, with the events in arg being added to the set already monitored. To disable notification of all events, make an F_NOTIFY call specifying arg as 0.

Notification occurs via delivery of a signal. The default signal is SIGIO, but this can be changed using the F_SETSIG command to fcntl(). In the latter case, the signal handler receives a siginfo_t structure as its second argument (if the handler was established using SA_SIGINFO) and the si_fd field of this structure contains the file descriptor which generated the notification (useful when establishing notification on multiple directories).

Especially when using DN_MULTISHOT, a real time signal should be used for notification, so that multiple notifications can be queued.

NOTE: New applications should use the inotify interface (available since kernel 2.6.13), which provides a much superior interface for obtaining notifications of file system events. See inotify(7).

Changing the capacity of a pipe
F_SETPIPE_SZ (int; since Linux 2.6.35)
Change the capacity of the pipe referred to by fd to be at least arg bytes. An unprivileged process can adjust the pipe capacity to any value between the system page size and the limit defined in /proc/sys/fs/pipe-max-size (see proc(5)). Attempts to set the pipe capacity below the page size are silently rounded up to the page size. Attempts by an unprivileged process to set the pipe capacity above the limit in /proc/sys/fs/pipe-max-size yield the error EPERM; a privileged process (CAP_SYS_RESOURCE) can override the limit. When allocating the buffer for the pipe, the kernel may use a capacity larger than arg, if that is convenient for the implementation. The F_GETPIPE_SZ operation returns the actual size used. Attempting to set the pipe capacity smaller than the amount of buffer space currently used to store data produces the error EBUSY.

F_GETPIPE_SZ (void; since Linux 2.6.35)
Return (as the function result) the capacity of the pipe referred to by fd.
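
Because the kernel may round the request up, a caller usually wants to read the capacity back after setting it. A minimal sketch, assuming Linux with _GNU_SOURCE defined; resize_pipe is a hypothetical helper name.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* exposes F_SETPIPE_SZ and F_GETPIPE_SZ */
#endif
#include <fcntl.h>
#include <unistd.h>

/* Ask for a pipe capacity of at least bytes and report what the kernel
 * actually allocated. Returns the capacity in bytes, or -1 with errno
 * set (for example EPERM above the limit, EBUSY if too much data is
 * already buffered). */
int resize_pipe(int fd, int bytes)
{
    if (fcntl(fd, F_SETPIPE_SZ, bytes) == -1)
        return -1;
    return fcntl(fd, F_GETPIPE_SZ);
}
```

Either end of the pipe can be passed as fd, since both descriptors refer to the same pipe.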

RETURN VALUE
For a successful call, the return value depends on the operation:
F_DUPFD       The new descriptor.
F_GETFD       Value of file descriptor flags.
F_GETFL       Value of file status flags.
F_GETLEASE    Type of lease held on file descriptor.
F_GETOWN      Value of descriptor owner.
F_GETSIG      Value of signal sent when read or write becomes possible, or zero for traditional SIGIO behavior.
F_GETPIPE_SZ  The pipe capacity.
All other commands  Zero.

On error, -1 is returned, and errno is set appropriately.

ERRORS
EACCES or EAGAIN  Operation is prohibited by locks held by other processes.
EAGAIN   The operation is prohibited because the file has been memory-mapped by another process.
EBADF    fd is not an open file descriptor, or the command was F_SETLK or F_SETLKW and the file descriptor open mode doesn't match with the type of lock requested.
EDEADLK  It was detected that the specified F_SETLKW command would cause a deadlock.
EFAULT   lock is outside your accessible address space.
EINTR    For F_SETLKW, the command was interrupted by a signal; see signal(7). For F_GETLK and F_SETLK, the command was interrupted by a signal before the lock was checked or acquired. Most likely when locking a remote file (e.g., locking over NFS), but can sometimes happen locally.
EINVAL   For F_DUPFD, arg is negative or is greater than the maximum allowable value. For F_SETSIG, arg is not an allowable signal number.
EMFILE   For F_DUPFD, the process already has the maximum number of file descriptors open.
ENOLCK   Too many segment locks open, lock table is full, or a remote locking protocol failed (e.g., locking over NFS).
EPERM    Attempted to clear the O_APPEND flag on a file that has the append-only attribute set.

NOTES
The original Linux fcntl() system call was not designed to handle large file offsets (in the flock structure). Consequently, an fcntl64() system call was added in Linux 2.4. The newer system call employs a different structure for file locking, flock64, and corresponding commands, F_GETLK64, F_SETLK64, and F_SETLKW64. However, these details can be ignored by applications using glibc, whose fcntl() wrapper function transparently employs the more recent system call where it is available.

The errors returned by dup2(2) are different from those returned by F_DUPFD.

Since kernel 2.0, there is no interaction between the types of lock placed by flock(2) and fcntl().

Several systems have more fields in struct flock such as, for example, l_sysid. Clearly, l_pid alone is not going to be very useful if the process holding the lock may live on a different machine.

BUGS
A limitation of the Linux system call conventions on some architectures (notably i386) means that if a (negative) process group ID to be returned by F_GETOWN falls in the range -1 to -4095, then the return value is wrongly interpreted by glibc as an error in the system call; that is, the return value of fcntl() will be -1, and errno will contain the (positive) process group ID. The Linux-specific F_GETOWN_EX operation avoids this problem. Since glibc version 2.11, glibc makes the kernel F_GETOWN problem invisible by implementing F_GETOWN using F_GETOWN_EX.

In Linux 2.4 and earlier, there is a bug that can occur when an unprivileged process uses F_SETOWN to specify the owner of a socket file descriptor as a process (group) other than the caller. In this case, fcntl() can return -1 with errno set to EPERM, even when the owner process (group) is one that the caller has permission to send signals to. Despite this error return, the file descriptor owner is set, and signals will be sent to the owner.

The implementation of mandatory locking in all known versions of Linux is subject to race conditions which render it unreliable: a write(2) call that overlaps with a lock may modify data after the mandatory lock is acquired; a read(2) call that overlaps with a lock may detect changes to data that were made only after a write lock was acquired. Similar races exist between mandatory locks and mmap(2). It is therefore inadvisable to rely on mandatory locking.
