从2.6.36 以后的内核已经废弃了 file_operations 中的 ioctl 函数指针,取而代之的是 unlocked_ioctl, 用户空间对应的系统调用没有发生变化。
struct file_operations
----------------------
This describes how the VFS can manipulate an open file. As of kernel
2.6.22, the following members are defined:
struct file_operations {
struct module *owner;
loff_t (*llseek) (struct file *, loff_t, int);
ssize_t (*read) (struct file *, char __user *, size_t, loff_t *);
ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
ssize_t (*aio_read) (struct kiocb *, const struct iovec *, unsigned long, loff_t);
ssize_t (*aio_write) (struct kiocb *, const struct iovec *, unsigned long, loff_t);
int (*readdir) (struct file *, void *, filldir_t);
unsigned int (*poll) (struct file *, struct poll_table_struct *);
long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
int (*mmap) (struct file *, struct vm_area_struct *);
int (*open) (struct inode *, struct file *);
int (*flush) (struct file *);
int (*release) (struct inode *, struct file *);
int (*fsync) (struct file *, int datasync);
int (*aio_fsync) (struct kiocb *, int datasync);
int (*fasync) (int, struct file *, int);
int (*lock) (struct file *, int, struct file_lock *);
ssize_t (*readv) (struct file *, const struct iovec *, unsigned long, loff_t *);
ssize_t (*writev) (struct file *, const struct iovec *, unsigned long, loff_t *);
ssize_t (*sendfile) (struct file *, loff_t *, size_t, read_actor_t, void *);
ssize_t (*sendpage) (struct file *, struct page *, int, size_t, loff_t *, int);
unsigned long (*get_unmapped_area)(struct file *, unsigned long, unsigned long, unsigned long, unsigned long);
int (*check_flags)(int);
int (*flock) (struct file *, int, struct file_lock *);
ssize_t (*splice_write)(struct pipe_inode_info *, struct file *, size_t, unsigned int);
ssize_t (*splice_read)(struct file *, struct pipe_inode_info *, size_t, unsigned int);
};
Again, all methods are called without any locks being held, unless
otherwise noted.
llseek: called when the VFS needs to move the file position index
read: called by read(2) and related system calls
aio_read: called by io_submit(2) and other asynchronous I/O operations
write: called by write(2) and related system calls
aio_write: called by io_submit(2) and other asynchronous I/O operations
readdir: called when the VFS needs to read the directory contents
poll: called by the VFS when a process wants to check if there is
activity on this file and (optionally) go to sleep until there
is activity. Called by the select(2) and poll(2) system calls
unlocked_ioctl: called by the ioctl(2) system call.
compat_ioctl: called by the ioctl(2) system call when 32 bit system calls
are used on 64 bit kernels.
mmap: called by the mmap(2) system call
open: called by the VFS when an inode should be opened. When the VFS
opens a file, it creates a new "struct file". It then calls the
open method for the newly allocated file structure. You might
think that the open method really belongs in
"struct inode_operations", and you may be right. I think it's
done the way it is because it makes filesystems simpler to
implement. The open() method is a good place to initialize the
"private_data" member in the file structure if you want to point
to a device structure
flush: called by the close(2) system call to flush a file
release: called when the last reference to an open file is closed
fsync: called by the fsync(2) system call
fasync: called by the fcntl(2) system call when asynchronous
(non-blocking) mode is enabled for a file
lock: called by the fcntl(2) system call for F_GETLK, F_SETLK, and F_SETLKW
commands
readv: called by the readv(2) system call
writev: called by the writev(2) system call
sendfile: called by the sendfile(2) system call
get_unmapped_area: called by the mmap(2) system call
check_flags: called by the fcntl(2) system call for F_SETFL command
flock: called by the flock(2) system call
splice_write: called by the VFS to splice data from a pipe to a file. This
method is used by the splice(2) system call
splice_read: called by the VFS to splice data from file to a pipe. This
method is used by the splice(2) system call
Note that the file operations are implemented by the specific
filesystem in which the inode resides. When opening a device node
(character or block special) most filesystems will call special
support routines in the VFS which will locate the required device
driver information. These support routines replace the filesystem file
operations with those for the device driver, and then proceed to call
the new open() method for the file. This is how opening a device file
in the filesystem eventually ends up calling the device driver open()
method.
内核使用 unlocked_ioctl 替换 ioctl 的背景:
参考:
http://www.cnblogs.com/hfww/archive/2012/03/31/2426717.html
The ioctl() system call has long been out of favor among the kernel developers, who see it as a completely uncontrolled entry point into the kernel. Given the vast number of applications which expect ioctl() to be present, however, it will not go away anytime soon. So it is worth the trouble to ensure that ioctl()calls are performed quickly and correctly - and that they do not unnecessarily impact the rest of the system.
ioctl() is one of the remaining parts of the kernel which runs under the Big Kernel Lock (BKL). In the past, the usage of the BKL has made it possible for long-running ioctl() methods to create long latencies for unrelated processes. Recent changes, which have made BKL-covered code preemptible, have mitigated that problem somewhat. Even so, the desire to eventually get rid of the BKL altogether suggests that ioctl() should move out from under its protection.
Simply removing the lock_kernel() call before calling ioctl() methods is not an option, however. Each one of those methods must first be audited to see what other locking may be necessary for it to run safely outside of the BKL. That is a huge job, one which would be hard to do in a single "flag day" operation. So a migration path must be provided. As of 2.6.11, that path will exist.
The patch (by Michael s. Tsirkin) adds a new member to the file_operations structure:
long (*unlocked_ioctl) (struct file *filp, unsigned int cmd, unsigned long arg);
If a driver or filesystem provides an unlocked_ioctl() method, it will be called in preference to the older ioctl(). The differences are that the inode argument is not provided (it's available as filp->f_dentry->d_inode) and the BKL is not taken prior to the call. All new code should be written with its own locking, and should use unlocked_ioctl(). Old code should be converted as time allows. For code which must run on multiple kernels, there is a new HAVE_UNLOCKED_IOCTL macro which can be tested to see if the newer method is available or not.
Michael's patch adds one other operation:
long (*compat_ioctl) (struct file *filp, unsigned int cmd, unsigned long arg);
If this method exists, it will be called (without the BKL) whenever a 32-bit process calls ioctl() on a 64-bit system. It should then do whatever is required to convert the argument to native data types and carry out the request. If compat_ioctl() is not provided, the older conversion mechanism will be used, as before. The HAVE_COMPAT_IOCTL macro can be tested to see if this mechanism is available on any given kernel.
The compat_ioctl() method will probably filter down into a few subsystems. Andi Kleen has posted patches adding new compat_ioctl() methods to the block_device_operations and scsi_host_template structures, for example, though those patches have not been merged as of this writing.
unlocked_ioctl 和 ioctl 的区别:
参考:
http://blog.sina.com.cn/s/blog_693301190100vyhh.html
在kernel 2.6.36 中已经完全删除了struct file_operations 中的ioctl函数指针,取而代之的是unlocked_ioctl 。
compat_ioctl少了inode参数,可以通过 filp->f_dentry->d_inode方法获得。