linux unlocked_ioctl

从2.6.36 以后的内核已经废弃了 file_operations 中的 ioctl 函数指针,取而代之的是 unlocked_ioctl, 用户空间对应的系统调用没有发生变化。


struct file_operations
----------------------

This describes how the VFS can manipulate an open file. As of kernel
2.6.22, the following members are defined:

struct file_operations {
    struct module *owner;
    loff_t (*llseek) (struct file *, loff_t, int);
    ssize_t (*read) (struct file *, char __user *, size_t, loff_t *);
    ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
    ssize_t (*aio_read) (struct kiocb *, const struct iovec *, unsigned long, loff_t);
    ssize_t (*aio_write) (struct kiocb *, const struct iovec *, unsigned long, loff_t);
    int (*readdir) (struct file *, void *, filldir_t);
    unsigned int (*poll) (struct file *, struct poll_table_struct *);
    long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
    long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
    int (*mmap) (struct file *, struct vm_area_struct *);
    int (*open) (struct inode *, struct file *);
    int (*flush) (struct file *);
    int (*release) (struct inode *, struct file *);
    int (*fsync) (struct file *, int datasync);
    int (*aio_fsync) (struct kiocb *, int datasync);
    int (*fasync) (int, struct file *, int);
    int (*lock) (struct file *, int, struct file_lock *);
    ssize_t (*readv) (struct file *, const struct iovec *, unsigned long, loff_t *);
    ssize_t (*writev) (struct file *, const struct iovec *, unsigned long, loff_t *);
    ssize_t (*sendfile) (struct file *, loff_t *, size_t, read_actor_t, void *);
    ssize_t (*sendpage) (struct file *, struct page *, int, size_t, loff_t *, int);
    unsigned long (*get_unmapped_area)(struct file *, unsigned long, unsigned long, unsigned long, unsigned long);
    int (*check_flags)(int);
    int (*flock) (struct file *, int, struct file_lock *);
    ssize_t (*splice_write)(struct pipe_inode_info *, struct file *, size_t, unsigned int);
    ssize_t (*splice_read)(struct file *, struct pipe_inode_info *, size_t, unsigned int);
};


Again, all methods are called without any locks being held, unless
otherwise noted.

  llseek: called when the VFS needs to move the file position index

  read: called by read(2) and related system calls

  aio_read: called by io_submit(2) and other asynchronous I/O operations

  write: called by write(2) and related system calls

  aio_write: called by io_submit(2) and other asynchronous I/O operations

  readdir: called when the VFS needs to read the directory contents

  poll: called by the VFS when a process wants to check if there is
    activity on this file and (optionally) go to sleep until there
    is activity. Called by the select(2) and poll(2) system calls

  unlocked_ioctl: called by the ioctl(2) system call.

  compat_ioctl: called by the ioctl(2) system call when 32 bit system calls
      are used on 64 bit kernels.

  mmap: called by the mmap(2) system call

  open: called by the VFS when an inode should be opened. When the VFS
    opens a file, it creates a new "struct file". It then calls the
    open method for the newly allocated file structure. You might
    think that the open method really belongs in
    "struct inode_operations", and you may be right. I think it's
    done the way it is because it makes filesystems simpler to
    implement. The open() method is a good place to initialize the
    "private_data" member in the file structure if you want to point
    to a device structure

  flush: called by the close(2) system call to flush a file

  release: called when the last reference to an open file is closed

  fsync: called by the fsync(2) system call

  fasync: called by the fcntl(2) system call when asynchronous
    (non-blocking) mode is enabled for a file

  lock: called by the fcntl(2) system call for F_GETLK, F_SETLK, and F_SETLKW
      commands

  readv: called by the readv(2) system call

  writev: called by the writev(2) system call

  sendfile: called by the sendfile(2) system call

  get_unmapped_area: called by the mmap(2) system call

  check_flags: called by the fcntl(2) system call for F_SETFL command

  flock: called by the flock(2) system call

  splice_write: called by the VFS to splice data from a pipe to a file. This
        method is used by the splice(2) system call

  splice_read: called by the VFS to splice data from file to a pipe. This
           method is used by the splice(2) system call

Note that the file operations are implemented by the specific
filesystem in which the inode resides. When opening a device node
(character or block special) most filesystems will call special
support routines in the VFS which will locate the required device
driver information. These support routines replace the filesystem file
operations with those for the device driver, and then proceed to call
the new open() method for the file. This is how opening a device file
in the filesystem eventually ends up calling the device driver open()
method.


内核使用 unlocked_ioctl 替换 ioctl 的背景:

参考:

http://www.cnblogs.com/hfww/archive/2012/03/31/2426717.html

The ioctl() system call has long been out of favor among the kernel developers, who see it as a completely uncontrolled entry point into the kernel. Given the vast number of applications which expect ioctl() to be present, however, it will not go away anytime soon. So it is worth the trouble to ensure that ioctl()calls are performed quickly and correctly - and that they do not unnecessarily impact the rest of the system.

ioctl() is one of the remaining parts of the kernel which runs under the Big Kernel Lock (BKL). In the past, the usage of the BKL has made it possible for long-running ioctl() methods to create long latencies for unrelated processes. Recent changes, which have made BKL-covered code preemptible, have mitigated that problem somewhat. Even so, the desire to eventually get rid of the BKL altogether suggests that ioctl() should move out from under its protection.

Simply removing the lock_kernel() call before calling ioctl() methods is not an option, however. Each one of those methods must first be audited to see what other locking may be necessary for it to run safely outside of the BKL. That is a huge job, one which would be hard to do in a single "flag day" operation. So a migration path must be provided. As of 2.6.11, that path will exist.

The patch (by Michael s. Tsirkin) adds a new member to the file_operations structure:

 

    long (*unlocked_ioctl) (struct file *filp, unsigned int cmd, 
                            unsigned long arg);

If a driver or filesystem provides an unlocked_ioctl() method, it will be called in preference to the older ioctl(). The differences are that the inode argument is not provided (it's available as filp->f_dentry->d_inode) and the BKL is not taken prior to the call. All new code should be written with its own locking, and should use unlocked_ioctl(). Old code should be converted as time allows. For code which must run on multiple kernels, there is a new HAVE_UNLOCKED_IOCTL macro which can be tested to see if the newer method is available or not.

Michael's patch adds one other operation:

 

    long (*compat_ioctl) (struct file *filp, unsigned int cmd, 
                          unsigned long arg);

If this method exists, it will be called (without the BKL) whenever a 32-bit process calls ioctl() on a 64-bit system. It should then do whatever is required to convert the argument to native data types and carry out the request. If compat_ioctl() is not provided, the older conversion mechanism will be used, as before. The HAVE_COMPAT_IOCTL macro can be tested to see if this mechanism is available on any given kernel.

The compat_ioctl() method will probably filter down into a few subsystems. Andi Kleen has posted patches adding new compat_ioctl() methods to the block_device_operations and scsi_host_template structures, for example, though those patches have not been merged as of this writing.


unlocked_ioctl 和 ioctl 的区别:

参考:

http://blog.sina.com.cn/s/blog_693301190100vyhh.html

kernel 2.6.35 及之前的版本中struct file_operations 一共有3个ioctl :
ioctl,unlocked_ioctl和compat_ioctl
现在只有unlocked_ioctl和compat_ioctl 了

在kernel 2.6.36 中已经完全删除了struct file_operations 中的ioctl函数指针,取而代之的是unlocked_ioctl 。

这个指针函数变了之后最大的影响是参数中 少了inode ,不过这个不是问题,因为用户程序中的ioctl对应的系统调用接口没有变化,所以用户程序不需要改变,一切都交给内核处理了,如果想在unlocked_ioctl中获得inode等信息可以用如下方法:
struct inode *inode =file->f_mapping->host;
struct block_device *bdev =inode->i_bdev;
struct gendisk *disk =bdev->bd_disk;
fmode_t mode =file->f_mode;

和int (*ioctl) (struct inode *inode, struct file*filp, unsigned int cmd, unsigned long arg)相比
compat_ioctl少了inode参数,可以通过 filp->f_dentry->d_inode方法获得。
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值